DE102021124203A1

DE102021124203A1 - Method and device for training a neural network with quantized parameters

Info

Publication number: DE102021124203A1
Application number: DE102021124203.7A
Authority: DE
Inventors: Alexander Frickenstein; Manoj Rohit Vemparala; Nael Fasfous; Lukas Frickenstein; Anmol Singh; Christian Unger; Naveen Shankar NAGARAJA; Walter Stechele
Original assignee: Bayerische Motoren Werke AG
Current assignee: Bayerische Motoren Werke AG
Priority date: 2021-09-20
Filing date: 2021-09-20
Publication date: 2023-03-23

Abstract

A device for training a neural network for a target function is described. The device is configured to quantize one or more activations of the one or more layers of the neural network using an activation quantization function in order to determine one or more corresponding quantized activations, and to quantize the one or more sets of parameters of the one or more layers using a parameter quantization function in order to determine one or more corresponding sets of quantized parameters. The device is further configured to determine a predicted output of the neural network based on the one or more quantized activations and on the one or more sets of quantized parameters. Furthermore, the device is configured to adapt the activation quantization function, the parameter quantization function and the one or more sets of parameters on the basis of the predicted output and on the basis of an optimization function.

Description

The invention relates to a device and a corresponding method for training a neural network that has quantized parameters and/or quantized activations.

An at least partially automated (motor) vehicle (e.g. a passenger car, a truck or a bus) can be configured to evaluate sensor data from one or more environment sensors of the vehicle (e.g. an image camera, a lidar sensor, a radar sensor, etc.), e.g. in order to detect one or more objects in the environment of the vehicle. The vehicle can then be guided longitudinally and/or laterally in an automated manner as a function of the one or more detected objects.

Objects can be detected on the basis of the sensor data from one or more environment sensors using specifically trained neural networks, in particular so-called deep neural networks and/or convolutional neural networks. A neural network typically has a large number of parameters (also referred to in this document as neuron parameters), which must be stored and which must be processed together with sensor data in order to carry out object detection on the basis of the sensor data. Storing and processing the large number of neuron parameters typically requires a relatively large amount of memory and computing resources (in particular on an embedded system, such as a control unit in a vehicle).

The present document addresses the technical problem of training a neural network, in particular the large number of (neuron) parameters of the neural network, in such a way that the resource expenditure of the neural network is reduced without (substantially) impairing the quality (e.g. the object detection quality) of the neural network. Alternatively or in addition, the present document addresses determining the resource expenditure of a neural network in an efficient and precise manner.

The problem is solved in each case by the independent claims. Advantageous embodiments are described, inter alia, in the dependent claims. It is pointed out that additional features of a claim that depends on an independent claim can, without the features of the independent claim or in combination with only a subset of the features of the independent claim, form a separate invention that is independent of the combination of all features of the independent claim, and that this invention can be made the subject of an independent claim, a divisional application or a subsequent application. This applies equally to technical teachings described in the description, which can form an invention that is independent of the features of the independent claims.

According to one aspect, a device (e.g. a computer or a server) for training a neural network for a target function is described. The neural network can be trained on the basis of training data, the training data comprising a plurality of training data sets. A training data set can comprise an activation for an input layer of the neural network (also referred to in this document as the input value or input data of the neural network). Furthermore, the training data set can comprise a target output (i.e. a target output value or target output data) that the neural network should provide at its output layer for the corresponding activation. The plurality of pairs of activation (at the input layer) and target output (at the output layer) describes the target function (e.g. object detection or classification) for which the neural network is to be trained.

The neural network has a set of one or more layers, each with a set of parameters (in particular weights). In particular, the neural network can comprise L consecutive layers, with L greater than 10, in particular greater than 100. The first layer (l = 1) can be referred to as the input layer and the last layer (l = L) can be referred to as the output layer.

A layer can be a convolutional layer that is designed to perform a convolution between the respective set of (possibly quantized) parameters and the respective (possibly quantized) activation in order to determine an output of the respective layer. In particular, the neural network can be a convolutional neural network (CNN).

For example, the activation for the input layer can comprise one or more data matrices, in particular a data tensor (which comprises, e.g., one or more images from a camera). This activation can be processed sequentially by the different layers of the neural network in order to provide an output (i.e. output data) at the output layer. The output can likewise comprise one or more data matrices, in particular a data tensor. For example, the output can indicate frames (i.e. bounding boxes) around one or more detected objects in a camera image.

The device can be designed to use the output of a layer l as the activation for the directly following layer l + 1. Furthermore, the device can be configured to use the output of the output layer of the neural network as the predicted output of the neural network.

The device is configured, in an iteration k of a learning algorithm, to quantize one or more activations of the one or more layers using an activation quantization function in order to determine one or more corresponding quantized activations. The activation quantization function can indicate, for each individual layer, how the activation of the respective layer is to be quantized. In particular, it can indicate the respective number of (possibly uniformly distributed) quantization values. The quantization, in particular the number of quantization values, can differ between layers. The activation quantization function can have been determined in the directly preceding iteration k - 1.

Furthermore, the device can be configured to quantize the one or more sets of (non-quantized) parameters of the one or more layers using a parameter quantization function in order to determine one or more corresponding sets of quantized parameters. The parameter quantization function can indicate, for each individual layer, how the set of (non-quantized) parameters of the respective layer is to be quantized. In particular, it can indicate the respective number of (possibly uniformly distributed) quantization values. The quantization, in particular the number of quantization values, can differ between layers. The parameter quantization function can have been determined in the directly preceding iteration k - 1.

Furthermore, the device can be configured to determine a predicted output of the neural network (at the output layer of the neural network) based on the one or more quantized activations and on the one or more sets of quantized parameters. For this purpose, a (non-quantized) output of the respective layer l can be determined sequentially for the L layers, starting with layer l = 1, on the basis of the quantized activation and the set of quantized parameters of the respective layer l. The (non-quantized) output of layer l can correspond to the (non-quantized) activation of the directly following layer l + 1. The (non-quantized) output of layer l can then be quantized using the activation quantization function (for the directly following layer l + 1) in order to provide the quantized activation for the directly following layer l + 1.
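The layer-by-layer processing described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the method as claimed: the helper names `quantize` and `forward`, the dictionary keys, the fixed value range [-1, 1], and the use of a plain weighted sum in place of a convolution are all hypothetical choices made for brevity.

```python
def quantize(values, num_levels):
    """Snap each value to the nearest of num_levels evenly spaced points in [-1, 1].

    The fixed range [-1, 1] is an assumption made for this sketch.
    """
    step = 2.0 / (num_levels - 1)
    clipped = [min(max(v, -1.0), 1.0) for v in values]
    return [-1.0 + round((v + 1.0) / step) * step for v in clipped]


def forward(x, layers):
    """Propagate x through the layers, quantizing activations and weights per layer."""
    activation = x  # non-quantized activation of layer l = 1
    for layer in layers:
        q_act = quantize(activation, layer["act_levels"])
        q_weights = [quantize(row, layer["weight_levels"]) for row in layer["weights"]]
        # Non-quantized output of layer l; it becomes the activation of layer l + 1.
        activation = [sum(w * a for w, a in zip(row, q_act)) for row in q_weights]
    return activation  # output of the last layer = predicted output


# Hypothetical single-layer network with layer-specific numbers of quantization values.
layers = [
    {"weights": [[0.9, -0.2]], "act_levels": 5, "weight_levels": 3},
]
predicted = forward([0.4, 0.7], layers)
```

Each layer carries its own number of quantization values for activations and for weights, which mirrors the statement that the quantization can differ between layers.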

The device is further configured to adapt the activation quantization function, the parameter quantization function and/or the one or more sets of (non-quantized) parameters on the basis of the predicted output and on the basis of an optimization function. The number of quantization values can be adapted individually (and possibly differently) for each individual layer.

In particular, the device can be configured to determine a respective predicted output of the neural network for each of a plurality of training activations (i.e. for a plurality of training data sets). The training activations can be directed to one or more detection and/or classification tasks in an autonomous vehicle. The activation quantization function, the parameter quantization function and/or the one or more sets of (non-quantized) parameters can then be adapted on the basis of the plurality of predicted outputs (of the neural network) and on the basis of the optimization function.

The device can further be configured to carry out a plurality of iterations of the above-mentioned steps (e.g. until a specific termination criterion, in particular a convergence criterion, is reached).

In particular, the device can be configured to carry out a plurality of iterations of the above-mentioned steps within one epoch of the learning algorithm. Within one iteration, a specific batch of training data, i.e. a specific subset of the training data, can be used. In an iteration, a forward propagation and a backward propagation through the neural network can be carried out in order to adapt the one or more sets of parameters and the quantization functions. A further batch of training data, i.e. a further subset of the training data, can then be used for a subsequent iteration.

An epoch of the learning algorithm is typically complete when all of the training data has been run through. A further epoch of the learning algorithm (with the same training data and again with a plurality of iterations) can then be carried out. In a corresponding manner, a plurality of epochs of the learning algorithm can be carried out, e.g. until a termination criterion, in particular a convergence criterion, is reached.
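The relationship between batches, iterations and epochs can be illustrated with a short sketch. The function names `iter_batches` and `train`, the `step` callback (standing in for one forward and backward propagation) and the `converged` callback are hypothetical and not taken from the document; the convergence check is placed at the end of each epoch purely for simplicity.

```python
def iter_batches(data, batch_size):
    """Yield consecutive subsets (batches) of the training data."""
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size]


def train(data, batch_size, max_epochs, step, converged):
    """Run epochs of batch iterations until converged() is true or max_epochs is hit."""
    iterations = 0
    epochs_done = 0
    for _ in range(max_epochs):
        for batch in iter_batches(data, batch_size):
            step(batch)  # hypothetical: forward and backward propagation on one batch
            iterations += 1
        epochs_done += 1  # all of the training data has been run through once
        if converged():
            break
    return epochs_done, iterations


# Ten training data sets in batches of four give three iterations per epoch.
seen = []
epochs, iterations = train(list(range(10)), 4, 2, seen.append, lambda: False)
```

With these inputs, each epoch revisits the same training data in three batches, so two epochs amount to six iterations.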

A device is thus described that is designed to train the quantization functions and the parameters of the neural network simultaneously (individually for each individual layer) within an iterative learning algorithm (using a combined optimization function). In this way, an optimized trade-off between resource expenditure and quality of the neural network can be achieved.

The device can be configured to compare the predicted output (of the neural network) with a target output (from the respective training data set). The activation quantization function, the parameter quantization function and/or the one or more sets of parameters can then be adapted in a particularly precise manner on the basis of the comparison (in particular in order to train the neural network with respect to the target function).

In particular, the device can be configured to determine a deviation of the predicted output from the target output. Typically, a plurality of deviations (for the plurality of training data sets) can be determined. A gradient of the optimization function for the deviation (in particular for the plurality of deviations) can then be determined. The activation quantization function, the parameter quantization function and/or the one or more sets of parameters can then be adapted on the basis of the gradient (in particular using a backpropagation algorithm). In this way, the neural network (including the quantization functions) can be trained in a particularly precise and robust manner.

The optimization function can comprise a first sub-function that is directed at adapting the neural network, in particular the one or more sets of parameters, with respect to the target function. The first sub-function can depend, for example, on a deviation of the predicted output from the target output (for the plurality of training data sets).

Furthermore, the optimization function can comprise a second sub-function that is directed at adapting (in particular reducing) the resource expenditure of the neural network. In particular, the resource expenditure considered can be the expenditure associated with the quantization of the one or more activations and of the one or more sets of parameters that is brought about by the activation quantization function and by the parameter quantization function. A second sub-function can thus be taken into account within the optimization function that indicates how resource-intensive the neural network is (in terms of the number of arithmetic operations and/or the required memory and/or the computing time (latency) required to determine an output of the neural network).

The optimization function (loss function) can thus take into account both the target function and the resource expenditure. In this way, a particularly advantageous trade-off between quality and resource expenditure can be achieved.
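A combined optimization function of this kind can be sketched as the sum of a task term and a resource term. The document does not specify the form of either sub-function, so everything here is an assumption for illustration: the mean squared deviation as first sub-function, a memory estimate of parameter count times log2 of the number of quantization values as second sub-function, the weighting factor `trade_off`, and all function and key names.

```python
import math


def task_loss(predicted, target):
    """First sub-function (assumed): mean squared deviation from the target output."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def resource_cost(layers):
    """Second sub-function (assumed): weight memory in bits, summed over all layers."""
    return sum(
        layer["num_params"] * math.log2(layer["weight_levels"]) for layer in layers
    )


def combined_loss(predicted, target, layers, trade_off=0.01):
    """Optimization function: task term plus weighted resource term."""
    return task_loss(predicted, target) + trade_off * resource_cost(layers)


# Hypothetical layer: 100 weights, each quantized to one of 16 values (4 bits).
layers = [{"num_params": 100, "weight_levels": 16}]
loss = combined_loss([1.0, 2.0], [1.0, 4.0], layers)
```

Lowering the number of quantization values of a layer reduces the resource term, while the task term penalizes any resulting loss of prediction quality; the weighting factor sets the balance between the two.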

As already explained above, the activation quantization function and/or the parameter quantization function for a layer can each indicate the number |U| of (possibly uniformly distributed) quantization values with which the activation of this layer or the set of parameters of this layer is quantized. The available quantization values are preferably unique. The device can be configured to adapt the number of quantization values (individually for each individual layer) in order to adapt the activation quantization function and/or the parameter quantization function. The number |U| of available quantization values for quantizing the data in the different layers can thus be adapted, and in particular learned, iteratively. In this way, the resource expenditure of the neural network can be adapted in an efficient and precise manner.

The device can be configured to select the number |U| of quantization values (for the activation quantization function and/or for the parameter quantization function in a respective layer) from a value set. The value set defines the possible values for the number |U|. It is typically advantageous to provide, at the end of the iterative learning algorithm, a quantization function that uses a specific number of bits for quantization. In other words, the value set for the number |U| of quantization values of a trained quantization function is preferably {2^b}, i.e. powers of two, where b is a natural number (greater than zero) (e.g. b = 1, 2, 3, 4, 5, ...).

The device can be configured to reduce the cardinality of the value set (for selecting the number |U| of quantization values) as the number of iterations of the learning algorithm increases. For example, the set of real numbers can initially be used as the value set. This initial value set can then be reduced (from a specific iteration onward), if applicable, to a set of real numbers with a limited number of decimal places. Furthermore, the value set can be restricted (from a specific iteration onward), if applicable, to the set of natural numbers. Finally, the value set {2^b} can be used (from a specific iteration onward), if applicable.
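The progressive restriction of the value set can be sketched as a projection that depends on the current iteration. The iteration thresholds, the choice of two decimal places, the lower bound of two quantization values and the rounding in the log2 domain are all assumptions made for this illustration; the document only states the sequence of value sets.

```python
import math


def project_levels(u, iteration, schedule=(100, 200, 300)):
    """Project a real-valued number of quantization values onto the value set
    permitted at the given iteration (thresholds in `schedule` are hypothetical)."""
    decimals_from, naturals_from, pow2_from = schedule
    u = max(u, 2.0)  # assumption: at least two quantization values (b >= 1)
    if iteration < decimals_from:
        return u                    # value set: real numbers
    if iteration < naturals_from:
        return round(u, 2)          # value set: reals with two decimal places
    if iteration < pow2_from:
        return float(round(u))      # value set: natural numbers
    b = round(math.log2(u))         # value set: {2^b}, nearest exponent
    return float(2 ** b)


# The same learned value under the four successive value sets.
stages = [project_levels(6.283, k) for k in (0, 150, 250, 350)]
```

Keeping |U| real-valued in the early iterations leaves room for small gradient-based updates, while the final projection ensures that the trained quantization function corresponds to a whole number of bits.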

By progressively reducing the available value set over the course of the iterative learning algorithm, the quantization functions can be adapted, in particular optimized, in a particularly precise and robust manner. In particular, this makes it possible to determine a gradient of the optimization function and to use it for the adaptation.

As already explained above, the device can be configured to determine different numbers of quantization values of the activation quantization function and/or of the parameter quantization function for different layers of the neural network. In this way, a particularly advantageous trade-off between quality and resource efficiency can be achieved.

The activation quantization function and/or the parameter quantization function can each have a scaling value c, which defines the value range of the quantization values used to quantize an activation or a set of parameters. The scaling value can differ between layers. In particular, an individual scaling value can be trained for each layer (separately for the quantization of the activation and for the quantization of the parameters). The device can be configured to adapt the scaling value c (for the respective layer and/or for the respective quantization function) on the basis of the predicted output (of the neural network) and on the basis of the optimization function. In this way, the quality of the neural network, in particular of the quantization functions, can be further increased.
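The role of the scaling value c can be illustrated with a minimal Python sketch. It assumes a symmetric uniform quantizer with evenly spaced quantization values in [-c, c]; this section does not define the actual form of the quantization functions, so the function `quantize` and its clipping behaviour are illustrative assumptions only.

```python
import numpy as np


def quantize(x: np.ndarray, c: float, num_levels: int) -> np.ndarray:
    """Map x onto num_levels evenly spaced values in [-c, c].

    c plays the role of the scaling value, num_levels the role of |U|.
    The uniform, symmetric layout is an assumption for illustration.
    """
    if num_levels < 2:
        raise ValueError("at least two quantization values are required")
    step = 2.0 * c / (num_levels - 1)      # distance between neighbouring values
    clipped = np.clip(x, -c, c)            # c limits the representable range
    return np.round((clipped + c) / step) * step - c
```

In such a sketch, a layer with a larger c covers a wider value range at a coarser resolution for the same |U|, which is why a separately trained c per layer and per quantization function can be beneficial.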

As already explained above, the optimization function can include a sub-function that is designed to take the resource cost of the neural network into account. In particular, the optimization function can include a sub-function (for example an estimation function) that is designed to predict a value of the resource cost of the neural network on the basis of one or more characteristic quantities describing the neural network, the activation quantization function and/or the parameter quantization function. The estimation function can have been trained in advance (on the basis of resource training data). In a preferred example, the estimation function comprises a trained Gaussian process. In this way, the resource cost of the neural network can be optimized in a particularly precise manner.

According to one aspect, a device (e.g. a computer or a server) for determining an estimate of the resource cost of a neural network is described. The aspects described in connection with this device are also applicable, alone or in combination, to the one or more other devices described in this document. Conversely, the aspects of the one or more other devices are also applicable, alone or in combination, to this device. Furthermore, the aspects described in relation to a neural network are also applicable, individually or in combination, in connection with this device.

As already explained above, the neural network can have a set of one or more layers, each with a set of (possibly quantized) parameters. Furthermore, the neural network can be designed to quantize one or more activations of the one or more layers using an activation quantization function in order to determine one or more corresponding quantized activations. In addition, the neural network can be designed to determine a predicted output of the neural network based on the one or more quantized activations and based on the one or more sets of quantized parameters.

The device can be configured to determine values of one or more characteristic quantities describing the neural network (in particular the structure of the neural network and/or the number of arithmetic operations of the neural network), the activation quantization function and/or the parameter quantization function with which the one or more sets of quantized parameters were quantized.

The one or more characteristic quantities can include:

• at least one quantity relating to the number of arithmetic operations, in particular multiplications, to be carried out in the one or more layers of the neural network; for example, the size of the convolution kernel of a layer can be described; and/or
• at least one quantity relating to the number |U| of different quantization values with which the one or more quantized activations are represented and/or with which the one or more activations are quantized by the activation quantization function; and/or
• at least one quantity relating to the number |U| of different quantization values with which the one or more sets of quantized parameters are represented and/or with which the one or more sets of quantized parameters were quantized by the parameter quantization function.

The device can further be configured to predict the estimate of the resource cost of the neural network using an estimation function and the values of the one or more characteristic quantities. The estimation function can have been trained in advance on the basis of resource training data. The resource training data can comprise a plurality of resource data records. Each resource data record can indicate, on the one hand, a combination of values of the one or more characteristic quantities and, on the other hand, a target value of the resource cost.

A machine-learned estimation function can thus be provided and used to determine a precise estimate of the resource cost of a neural network in an efficient manner. This estimate can then advantageously be used within a learning algorithm for training a neural network for a specific target function, in order to provide a neural network with a high target function quality and a low resource cost.

In a preferred example, the estimation function comprises a Gaussian process that was trained, for example, using a regression method on the basis of the resource training data. The use of such an estimation function makes it possible to determine a gradient of the estimation function, which can then be used for a robust and precise optimization of the parameters and/or of the quantization functions of a neural network.

The estimation function can be designed to determine an estimate of the resource cost that is proportional to

$\mathcal{GP}(m(\rho), K(\rho, \rho'))$

Here, m(ρ) can be a trained mean of the Gaussian process $\mathcal{GP}()$, and K(ρ, ρ') can be a trained covariance of the Gaussian process $\mathcal{GP}()$. Furthermore, ρ can be a scalar or a vector comprising the values of the one or more characteristic quantities describing the neural network, the activation quantization function and/or the parameter quantization function.

In a preferred example, the covariance K(ρ, ρ') is given by

$K(\rho, \rho') = \sigma^{2} \exp\left(-\frac{\|\rho - \rho'\|^{2}}{2 \ell^{2}}\right)$

with a trained amplitude σ and a trained scaling factor ℓ.

In this way, estimates of the resource cost can be provided in a particularly efficient and precise manner.
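How such a Gaussian-process-based estimate could be computed can be sketched in Python with NumPy. The sketch uses the squared-exponential covariance with amplitude `sigma` and scaling factor `length`, and assumes standard Gaussian process regression with a zero prior mean and a small noise term; the document instead mentions a trained mean m(ρ), and the function names and the `noise` parameter are hypothetical.

```python
import numpy as np


def rbf_kernel(a, b, sigma, length):
    """K(rho, rho') = sigma^2 * exp(-||rho - rho'||^2 / (2 * length^2))."""
    a = np.atleast_2d(a)
    b = np.atleast_2d(b)
    # Pairwise squared Euclidean distances between rows of a and rows of b.
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return sigma ** 2 * np.exp(-sq_dist / (2.0 * length ** 2))


def gp_posterior_mean(rho_train, y_train, rho_query, sigma, length, noise=1e-6):
    """Posterior mean of a zero-prior-mean GP at rho_query (an assumption;
    a trained mean m(rho) could be added on top)."""
    n = len(rho_train)
    K = rbf_kernel(rho_train, rho_train, sigma, length) + noise * np.eye(n)
    alpha = np.linalg.solve(K, y_train)  # weights of the training records
    return rbf_kernel(rho_query, rho_train, sigma, length) @ alpha
```

In such a setup, each row of `rho_train` would hold the characteristic quantities of one resource data record (for example a number of multiplications and the two values of |U|), and `y_train` would hold the corresponding target values of the resource cost.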

The device can be configured to train the activation quantization function, the parameter quantization function and/or the one or more sets of (possibly quantized) parameters within a learning algorithm using the estimation function and/or on the basis of the determined estimate. In particular, the device can be configured to train the activation quantization function, the parameter quantization function and/or the one or more sets of (possibly quantized) parameters iteratively, using the estimation function as part of an optimization function.

The estimation function is preferably designed to be differentiable, in particular differentiable with respect to the one or more characteristic quantities. This is the case, for example, for the Gaussian-process-based estimation function mentioned above. Differentiability enables particularly efficient and reliable use within an optimization method (e.g. for training a neural network).
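The differentiability of a Gaussian-process-based estimate with respect to the characteristic quantities can be made concrete with a short Python sketch. It assumes that the estimate has the form of a weighted sum of squared-exponential kernel values, sum_i alpha_i K(ρ, ρ_i), as in standard Gaussian process regression; the weights `alpha` and the function names are hypothetical and not taken from the document.

```python
import numpy as np


def rbf(rho, rho_train, sigma, length):
    """Kernel values K(rho, rho_i) between one query and all training points."""
    sq_dist = ((rho_train - rho) ** 2).sum(axis=1)
    return sigma ** 2 * np.exp(-sq_dist / (2.0 * length ** 2))


def mean_and_grad(rho, rho_train, alpha, sigma, length):
    """Return sum_i alpha_i K(rho, rho_i) and its gradient with respect to rho."""
    k = rbf(rho, rho_train, sigma, length)
    mean = k @ alpha
    # dK(rho, rho_i)/drho = K(rho, rho_i) * (rho_i - rho) / length^2
    grad = ((alpha * k)[:, None] * (rho_train - rho)).sum(axis=0) / length ** 2
    return mean, grad
```

Because the gradient is available in closed form, the resource estimate could be included in a gradient-based optimization alongside the error of the target function.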

According to a further aspect, a method for training a neural network for a target function is described. The neural network comprises a set of one or more layers, each with a set of parameters. Each layer can have 1000 or more parameters. The method comprises, in an iteration of a learning algorithm, quantizing one or more activations of the one or more layers using an activation quantization function in order to determine one or more corresponding quantized activations. Furthermore, the method comprises quantizing the one or more sets of parameters of the one or more layers using a parameter quantization function in order to determine one or more corresponding sets of quantized parameters.

The method further comprises determining a predicted output of the neural network based on the one or more quantized activations and based on the one or more sets of quantized parameters. In addition, the method comprises adapting the activation quantization function, the parameter quantization function and the one or more sets of parameters on the basis of the predicted output and on the basis of an optimization function.

According to a further aspect, a method for determining an estimate of the resource cost of a neural network is described. The method comprises determining values of one or more characteristic quantities describing the neural network (in particular the arithmetic operations and/or the structure of the neural network), the activation quantization function (for quantizing the activations) and/or the parameter quantization function (for quantizing the parameters).

The method further comprises determining the estimate of the resource cost of the neural network using an estimation function and the values of the one or more characteristic quantities. The estimation function preferably comprises a Gaussian process that was trained using a regression method on the basis of resource training data.

According to a further aspect, a software (SW) program is described. The SW program can be configured to be executed on a processor and thereby to carry out one or more of the methods described in this document.

According to a further aspect, a storage medium is described. The storage medium can comprise a SW program that is configured to be executed on a processor and thereby to carry out one or more of the methods described in this document.

It should be noted that the methods, devices and systems described in this document can be used both alone and in combination with other methods, devices and systems described in this document. Furthermore, any aspects of the methods, devices and systems described in this document can be combined with one another in a variety of ways. In particular, the features of the claims can be combined with one another in a variety of ways. Furthermore, features listed in parentheses are to be understood as optional features.

The invention is described in more detail below with reference to exemplary embodiments. The figures show:

Fig. 1 exemplary components of a vehicle;
Fig. 2a an exemplary neural network;
Fig. 2b an exemplary neuron;
Fig. 3 an exemplary method for training a neural network;
Fig. 4 a flowchart of an exemplary method for training a neural network, taking the resource cost into account; and
Fig. 5 a flowchart of an exemplary method for determining an estimate of the resource cost of a neural network.

As explained at the outset, the present document is concerned with training a neural network for a specific target function in such a way that an optimized trade-off between the quality of the neural network and the resource cost of the neural network is provided. Exemplary target functions of a neural network are classification, object recognition and/or segmentation. Furthermore, the present document is concerned with determining the resource cost of a neural network in an efficient and precise manner.

Fig. 1 shows exemplary components of a vehicle 100. The vehicle 100 comprises one or more environment sensors 102, which are configured to capture environment data (i.e. sensor data) relating to the surroundings of the vehicle 100. Exemplary environment sensors 102 are an image camera, a radar sensor, a lidar sensor and/or an ultrasonic sensor. Furthermore, the vehicle 100 typically comprises one or more vehicle sensors 103, which are configured to capture state data (i.e. sensor data) relating to a state variable of the vehicle 100. Exemplary state variables are the driving speed and/or the (longitudinal) acceleration of the vehicle 100. The vehicle 100 further comprises one or more actuators 104 for automated longitudinal and/or lateral guidance of the vehicle 100. Exemplary actuators 104 are a drive motor, a braking device and/or a steering device.

A control unit 101 of the vehicle 100 can be configured to operate one or more actuators 104 automatically on the basis of the environment data and/or on the basis of the state data. In particular, the control unit 101 can be configured to detect an object in the surroundings of the vehicle 100 on the basis of the sensor data of the one or more environment sensors 102. The one or more actuators 104 can then be operated as a function of the detected object, in particular in order to guide the vehicle 100 in an at least partially automated manner.

To detect an object on the basis of the sensor data of one or more environment sensors 102, at least one neural network can be used that was trained on the basis of training data for the task of object recognition. Figs. 2a and 2b show exemplary components of a neural network 200, in particular of a feedforward network. In the example shown, the neural network 200 comprises two input neurons or input nodes 202, each of which receives, at a specific point in time t, a current value of an input variable as input value 201 (i.e. an activation). The one or more input nodes 202 are part of an input layer 211. In general, the neural network 200 can be designed to receive input data (i.e. an activation) with one or more input values 201 (e.g. the pixels of an image) from a data sample.

The neural network 200 further comprises neurons 220 in one or more hidden layers 212 of the neural network 200. Each of the neurons 220 can have as input values the individual output values (i.e. the output) of the neurons of the preceding layer 212, 211 (or at least a part thereof). In each of the neurons 220, processing takes place in order to determine an output value of the neuron 220 as a function of the input values. The output values of the neurons 220 of the last hidden layer 212 can be processed in an output neuron or output node 220 of an output layer 213 in order to determine the one or more output values 203 (i.e. the predicted output) of the neural network 200. In general, the network 200 can be designed to provide output data (i.e. an output) with one or more output values 203.

Fig. 2b illustrates the exemplary signal processing within a neuron 220, in particular within the neurons 202 of the one or more hidden layers 212 and/or of the output layer 213. The one or more input values 221 of the neuron 220 are weighted with individual weights 222 in order to determine, in a summation unit 223, a weighted sum 224 of the input values 221 (where appropriate taking into account a bias or offset 227). An activation function 225 can map the weighted sum 224 to an output value 226 of the neuron 220. The activation function 225 can, for example, limit the value range. For a neuron 220, for example, a sigmoid function, a hyperbolic tangent (tanh) function or a rectified linear unit (ReLU), e.g. f(x) = max(0, x), can be used as activation function 225. Where appropriate, the value of the weighted sum 224 can be shifted by an offset 227.
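The signal processing within a single neuron can be sketched in Python as follows. The function name `neuron_output` and its argument names are chosen for illustration; the computation follows the description above (weighted sum of the input values, optional offset, then one of the named activation functions).

```python
import math

import numpy as np


def neuron_output(inputs, weights, offset=0.0, activation="relu"):
    """Weighted sum of the input values plus offset, mapped by an activation function."""
    weighted_sum = float(np.dot(weights, inputs)) + offset
    if activation == "relu":
        return max(0.0, weighted_sum)           # f(x) = max(0, x)
    if activation == "sigmoid":
        return 1.0 / (1.0 + math.exp(-weighted_sum))
    if activation == "tanh":
        return math.tanh(weighted_sum)
    raise ValueError(f"unknown activation: {activation}")
```

For example, input values (1.0, 2.0) with weights (0.5, 0.5) and an offset of 0.25 give a weighted sum of 1.75, which the ReLU passes through unchanged, whereas a negative weighted sum would be mapped to 0.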

A neuron 220 thus has weights 222 and/or possibly an offset 227 as neuron parameters. The neuron parameters of the neurons 220 of a neural network 200 can be learned in a training phase in order to cause the neural network 200 to approximate a specific target function and/or to model a specific behavior.

A neural network 200 can be trained using a backpropagation algorithm, for example. For this purpose, in a first phase of a k-th iteration and/or epoch of a learning algorithm, output values 203 (i.e. an output) of the one or more output neurons 220 can be determined for the input values 201 (i.e. for the activation) at the one or more input nodes 202 of the neural network 200, the input values 201 being taken from a specific training set of data samples. The error value of an optimization or error function can be determined on the basis of the output values 203. In particular, the deviations between the output values 203 calculated by the network 200 (i.e. the predicted output) and the target output values (i.e. a target output) from the data samples can be calculated as error values.

In a second phase of the k-th iteration and/or epoch of the learning algorithm, the error or error value is backpropagated from the output layer 213 to the input layer 211 of the neural network 200 in order to change the neuron parameters of the neurons 220 layer by layer. The error function determined at the output can be partially differentiated with respect to each individual neuron parameter of the neural network 200 in order to determine a magnitude and/or a direction for adjusting the individual neuron parameters. This learning algorithm can be repeated iteratively for a large number of epochs or iterations until a predefined convergence and/or termination criterion is reached.
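The two phases can be illustrated with a deliberately small example: a single neuron with one weight, one offset and an identity activation, trained by gradient descent on a mean squared error. The function name `train_neuron`, the learning rate, the tolerance and the three data samples are assumptions made for this sketch and are not taken from the description.

```python
def train_neuron(samples, learning_rate=0.1, max_epochs=1000, tolerance=1e-10):
    """Train pred = w * x + offset by gradient descent on the mean squared error."""
    w, offset = 0.0, 0.0
    loss = float("inf")
    n = len(samples)
    for _ in range(max_epochs):
        grad_w = grad_offset = loss = 0.0
        for x, target in samples:
            # Phase 1: forward pass and error between predicted and target output.
            error = (w * x + offset) - target
            loss += error ** 2 / n
            # Phase 2: partial derivatives of the error function w.r.t. each parameter.
            grad_w += 2.0 * error * x / n
            grad_offset += 2.0 * error / n
        if loss < tolerance:  # predefined termination criterion
            break
        # Adjust each parameter against the direction of its gradient.
        w -= learning_rate * grad_w
        offset -= learning_rate * grad_offset
    return w, offset, loss


# Hypothetical samples generated from target = 2 * x + 1.
w, offset, loss = train_neuron([(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)])
```

In a multi-layer network the same partial derivatives are obtained layer by layer through the chain rule, which is what the backpropagation from the output layer to the input layer provides.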

A neural network 200 can thus have L different layers 211, 212, 213, each with a set of (neuron) parameters 222, 227, in particular weights. The input value 201 of a layer 211, 212, 213 can be a matrix or a tensor, for example. The input value 201 can also be referred to as the activation of the respective layer 211, 212, 213. The activation $A^{l-1}$ of a layer l ∈ {1, ..., L} can be processed with the parameters $W^{l}$ of layer l in order to determine the output $A^{l}$ of layer l. The output $A^{l}$ of layer l then represents the activation for the subsequent layer l + 1. $A^{0}$ thus corresponds to the input value 201 of the neural network 200, and $A^{L}$ corresponds to the output value 203 of the neural network 200. In each of the layers l ∈ {1, ..., L}, a convolution can be carried out between the activation $A^{l-1}$ (i.e. the activation tensor) and the parameters $W^{l}$ (i.e. the parameter tensor).

The precision with which the arithmetic operations are carried out can differ between layers. In particular, the number of bits used to represent the activation $A^{l-1}$ and/or the parameters $W^{l}$ can differ between layers. The number of bits per tensor entry can be denoted $b_{w}^{l}$ (for the parameter tensor $W^{l}$) and/or $b_{a}^{l}$ (for the activation tensor $A^{l}$).

The number of bits for a tensor entry x (e.g. for a parameter or for an activation) can be b, for example. The tensor entry x can then be represented by $2^{b}$ different quantization values. The different quantization values can lie in a value range [-c, c] (e.g. for parameters) or [0, c] (e.g. for activations), where c is a scaling value that can be learned for the individual layers. The quantization of a tensor entry x can then be carried out using the following formula:

$x_{q} = \mathrm{Round}\left(\mathrm{Clip}(x, 0, c) \cdot \frac{2^{b} - 1}{c}\right) \cdot \frac{c}{2^{b} - 1}$

where $x_{q}$ is the quantized tensor entry.
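The quantization formula can be written out as a short Python sketch. The function names `clip` and `quantize` are assumptions made for illustration. Note also that Python's built-in `round` rounds halves to the nearest even integer, whereas the description does not specify the tie-breaking rule of Round.

```python
def clip(x: float, lower: float, upper: float) -> float:
    """Limit x to the closed interval [lower, upper]."""
    return min(max(x, lower), upper)


def quantize(x: float, b: int, c: float) -> float:
    """Quantize x to one of 2**b values in [0, c] with b bits and scaling value c."""
    steps = 2 ** b - 1  # number of intervals between the 2**b quantization values
    return round(clip(x, 0.0, c) * steps / c) * c / steps


# With b = 2 and c = 1.0 the quantization values are 0, 1/3, 2/3 and 1.
distinct_values = {quantize(i / 100, b=2, c=1.0) for i in range(101)}
```

Values below 0 are mapped to 0 and values above c are mapped to c, so the clipping range is determined entirely by the scaling value c.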

The number of different quantization values that can be taken into account in the above quantization formula can only assume the values $2^{b}$, with b = 1, 2, 3, 4, .... This discontinuity in the possible number of different quantization values can lead to convergence problems in the iterative learning algorithm. For training the parameters of a neural network, it can therefore advantageously be assumed that an arbitrary number $|U|$ of different quantization values can be used for quantization, where $|U|$ is a real number (and can thus take any value from the set of real numbers). The quantization formula for a tensor entry x is then:

$\tilde{x}_{q} = \mathrm{Round}\left(\mathrm{Clip}(x, 0, c) \cdot \frac{|U|}{c}\right) \cdot \frac{c}{|U|}$

where $\tilde{x}_{q}$ is the quantized tensor entry.
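A minimal sketch of this relaxed quantization, in which the number of quantization values is a real number, could look as follows. The names `clip`, `quantize_relaxed` and `num_values` (standing for $|U|$) are assumptions made for illustration, and Python's `round` is used as a stand-in for Round.

```python
def clip(x: float, lower: float, upper: float) -> float:
    """Limit x to the closed interval [lower, upper]."""
    return min(max(x, lower), upper)


def quantize_relaxed(x: float, num_values: float, c: float) -> float:
    """Quantize x in [0, c] using a real-valued number |U| of quantization values."""
    return round(clip(x, 0.0, c) * num_values / c) * c / num_values


# |U| no longer has to be a power of two, or even an integer.
x_q = quantize_relaxed(0.5, num_values=2.5, c=1.0)  # round(1.25) * 1.0 / 2.5
```

Because `num_values` is a float, it can be treated as a trainable quantity during training and constrained to a power of two only later.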

As part of the backpropagation algorithm, the following formula can be used as the gradient $g_{|U|}$ of the quantization operation:

$g_{|U|} \overset{\mathrm{ISTE}}{=} g_{\tilde{x}_{q}} \cdot \left(\frac{-c}{|U|^{2}} \cdot \mathrm{Round}\left(\mathrm{Clip}(x, 0, c) \cdot \frac{|U|}{c}\right) + \frac{\mathrm{Clip}(x, 0, c)}{|U|}\right)$
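The gradient formula can be evaluated directly, as in the following sketch. The names `grad_num_values`, `num_values` (for $|U|$) and `upstream_grad` (for $g_{\tilde{x}_{q}}$) are assumptions made for illustration. Writing s = Clip(x, 0, c) · |U| / c, the bracketed term simplifies algebraically to c · (s − Round(s)) / |U|², which the example uses as a cross-check.

```python
def clip(x: float, lower: float, upper: float) -> float:
    """Limit x to the closed interval [lower, upper]."""
    return min(max(x, lower), upper)


def grad_num_values(x: float, num_values: float, c: float,
                    upstream_grad: float = 1.0) -> float:
    """Gradient of the relaxed quantization w.r.t. |U|, following the formula above."""
    clipped = clip(x, 0.0, c)
    rounded = round(clipped * num_values / c)
    return upstream_grad * (-c / num_values ** 2 * rounded + clipped / num_values)


# Example: x = 0.5, |U| = 2.5, c = 1.0, so s = 1.25 and Round(s) = 1.
g = grad_num_values(0.5, num_values=2.5, c=1.0)

# Cross-check with the simplified form c * (s - Round(s)) / |U|**2.
s = 0.5 * 2.5 / 1.0
g_simplified = 1.0 * (s - round(s)) / 2.5 ** 2
```

The simplified form shows that the gradient vanishes whenever the scaled entry s already lies exactly on a quantization value.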

To ensure that, after convergence of the learning algorithm, the number $|U|$ of different quantization values only assumes values $2^{b}$, e.g. with b = 1, 2, 3, 4, ..., the cardinality of the set of possible values for $|U|$ can be reduced as the number of completed epochs or iterations increases. For example, the set of possible values can first be reduced from the real numbers to the natural numbers and finally to $2^{b}$, with b = 1, 2, 3, 4, .... For this purpose, the exponent b can be calculated as $\mathrm{Round}(\log_{2}(|U|))$.
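The stepwise restriction of $|U|$ can be sketched as a projection that depends on the training stage. The stage names, the function names and in particular the split of the training run into thirds are assumptions made for this illustration; the description does not state when the transitions take place.

```python
import math


def project_num_values(num_values: float, stage: str) -> float:
    """Restrict |U| to the value set that is permitted in the given training stage."""
    if stage == "real":
        return float(num_values)
    if stage == "natural":
        return float(max(1, round(num_values)))
    if stage == "power_of_two":
        b = max(1, round(math.log2(num_values)))  # b = Round(log2(|U|)), b >= 1
        return float(2 ** b)
    raise ValueError(f"unknown stage: {stage}")


def stage_for_epoch(epoch: int, total_epochs: int) -> str:
    """Hypothetical schedule: one third of the epochs per stage."""
    if epoch < total_epochs / 3:
        return "real"
    if epoch < 2 * total_epochs / 3:
        return "natural"
    return "power_of_two"


# |U| = 5.7 becomes 6 in the natural-number stage and 2**3 = 8 in the final stage.
projected = [project_num_values(5.7, s) for s in ("real", "natural", "power_of_two")]
```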

As explained above, an optimization function (in particular an error function) can be used to train a neural network 200. The optimization function can have several components in order to take different aspects into account when training the neural network 200. In particular, the optimization function can comprise

• a first component or sub-function aimed at increasing the quality with which the neural network 200 performs the target function;
• a second component or sub-function aimed at reducing the complexity of the neural network 200 (e.g. in terms of memory and/or computing resources); and/or
• a third component or sub-function aimed at reducing the information loss caused by the quantization of the tensor entries.
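One simple way to combine the three components is a weighted sum, sketched below. The weighted-sum form, the weights `lambda_hw` and `lambda_q`, and the use of a mean squared quantization error for the third component are assumptions made for illustration; the description only states which aspects the components address.

```python
def quantization_error(values, quantized_values) -> float:
    """Mean squared difference between tensor entries and their quantized versions."""
    pairs = list(zip(values, quantized_values))
    return sum((v - q) ** 2 for v, q in pairs) / len(pairs)


def total_loss(task_loss: float, resource_cost: float, quantization_loss: float,
               lambda_hw: float = 0.1, lambda_q: float = 0.01) -> float:
    """Hypothetical weighted sum of the three components of the optimization function."""
    return task_loss + lambda_hw * resource_cost + lambda_q * quantization_loss


# Example with illustrative numbers only.
q_err = quantization_error([0.1, 0.9], [0.0, 1.0])
loss = total_loss(task_loss=1.0, resource_cost=2.0, quantization_loss=q_err,
                  lambda_hw=0.5, lambda_q=0.1)
```

The weights control the compromise between target-function quality, resource expenditure and quantization loss.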

Fig. 3 illustrates an exemplary method 300 for training a neural network 200. The method 300 can be run iteratively in order to repeatedly adjust the parameter tensors and the quantization functions in the different layers. In a particular iteration of the method 300, current parameter tensors $W^{l}$ 310, current activation quantization functions $b_{a}^{l}$ 311 and current parameter quantization functions $b_{w}^{l}$ 312 are available (for all layers of the neural network 200). The parameter tensors $W^{l}$ 310 can be stored in non-quantized form. The quantization functions 311, 312 can each indicate the number $|U|$ of different quantization values and the scaling value c (individually for each layer). The parameter tensors 310 and the quantization functions 311, 312 can be provided for all L layers 211, 212, 213 of the neural network 200.

The block 325 comprises steps that are carried out sequentially for all layers 211, 212, 213 of the neural network 200, from the input layer 211, l = 1, to the output layer 213, l = L. The output 303 (also referred to as the prediction) of the block 325 for a layer l is used as the input or activation 301 for the subsequent layer l + 1. The output 303 of the output layer l = L is used to evaluate the optimization function (i.e. the loss function) and thus to adjust the parameter tensors 310 and/or the quantization functions 311, 312 (in block 330).

The arithmetic operations for a layer l are described below. Based on the current activation quantization function 311 for layer l, a set 308 of quantization values for the activation 301 of layer l (i.e. for $A^{l-1}$) is determined. The activation 301 is quantized in the quantization block 321 using the set 308 of quantization values, so that a quantized activation 302 is provided for layer l.

The parameters 305 for layer l are extracted from the current parameter tensor 310 for the entire neural network 200. Furthermore, based on the current parameter quantization function 312 for layer l, a set 306 of quantization values for the parameters 305 of layer l (i.e. for $W^{l}$) is determined. The parameters 305 are then quantized in the quantization block 323 using the set 306 of quantization values, so that quantized parameters 307 are provided for layer l.

The quantized activation 302 for layer l is processed using the quantized parameters 307 for layer l (e.g. by means of a convolution) in order to determine the output 303 of layer l. As already explained above, the output 303 of layer l is used as the activation 301 for the subsequent layer l + 1 (for l = 1, ..., L - 1). For the output layer l = L, the output 303 is used to adjust the quantization functions 311, 312 and/or the parameter tensor 310 for the neural network 200.
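The layer-wise processing in block 325 can be sketched as a loop over the layers. Several simplifications in this sketch are assumptions and not part of the description: a matrix-vector product stands in for the convolution, each layer is a plain dictionary, and parameters are quantized with a sign-symmetric variant of the quantization formula so that negative weights in [-c, c] are preserved.

```python
def clip(x, lower, upper):
    return min(max(x, lower), upper)


def quantize_unsigned(x, num_values, c):
    """Quantization to the range [0, c], as used here for activations."""
    return round(clip(x, 0.0, c) * num_values / c) * c / num_values


def quantize_signed(x, num_values, c):
    """Assumed symmetric variant for parameters in [-c, c]."""
    sign = -1.0 if x < 0 else 1.0
    return sign * quantize_unsigned(abs(x), num_values, c)


def forward(activation, layers):
    """Run the quantized forward pass layer by layer and return the predicted output."""
    for layer in layers:
        n_a, c_a = layer["act_q"]  # activation quantization function of this layer
        n_w, c_w = layer["par_q"]  # parameter quantization function of this layer
        q_act = [quantize_unsigned(a, n_a, c_a) for a in activation]
        q_par = [[quantize_signed(w, n_w, c_w) for w in row]
                 for row in layer["weights"]]  # stored weights stay non-quantized
        # The output of this layer becomes the activation of the next layer.
        activation = [sum(w * a for w, a in zip(row, q_act)) for row in q_par]
    return activation


# Two hypothetical layers with different numbers of quantization values.
layers = [
    {"weights": [[0.5, -0.5]], "act_q": (1, 1.0), "par_q": (2, 1.0)},
    {"weights": [[1.0]], "act_q": (2, 1.0), "par_q": (1, 1.0)},
]
prediction = forward([0.8, 0.2], layers)
```

Quantized parameters are recomputed in every pass from the non-quantized stored weights, which mirrors the statement that the parameter tensors can be stored in non-quantized form.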

The adjustment block 330 receives the output 303 predicted by the neural network 200. The adjustment block 330 also receives a corresponding target output 304. A value of the optimization function can be determined based on the predicted output 303 and the target output 304. Furthermore, a gradient of the optimization function can be backpropagated through the neural network 200 in order to adjust the quantization functions 311, 312 and/or the parameter tensor 310 (as described above for the backpropagation algorithm). The optimization function has at least one component relating to the target function to be learned by the neural network 200 and at least one component relating to the resource efficiency or the resource expenditure of the neural network 200.

As part of the gradient descent optimization (using the backpropagation algorithm), the gradient of the optimization function (i.e. the loss function) is determined. To make this possible, the optimization function, in particular the component relating to the resource expenditure, must be differentiable. An optimization function for the hardware resource expenditure is described below that allows a gradient descent optimization to be carried out efficiently and precisely.

A Gaussian process can be trained that describes the relationship between the quantization functions 311, 312 and the hardware load, the hardware efficiency or the resource expenditure. The Gaussian process can be trained using a regression method. In other words, Gaussian process regression can be used to provide a differentiable component of the optimization function that describes the resource expenditure of the neural network 200 to be trained.

The resource expenditure of a layer 211, 212, 213 of the neural network 200 typically depends on the following characteristic quantities:

• the number of arithmetic operations to be carried out in the respective layer 211, 212, 213; for a convolutional layer, the number of arithmetic operations in turn depends on the number of rows and columns of the activation 301 and on the depth of the inner product;
• the number $|U|$ of different quantization values of the activation quantization function, in particular the exponent $b_{a}$ that determines the number $2^{b_{a}}$ of different quantization values; and/or
• the number $|U|$ of different quantization values of the parameter quantization function, in particular the exponent $b_{w}$ that determines the number $2^{b_{w}}$ of different quantization values.

The covariance of the Gaussian process to be trained can be formulated as follows:

$K(\rho, \rho') = \sigma^{2} \exp\left(-\frac{\|\rho - \rho'\|^{2}}{2 \ell^{2}}\right), \quad \text{where } \rho = (\mathrm{row}, \mathrm{depth}, \mathrm{col}, b_{w}, b_{a})$

$\varphi_{HW} \sim \mathcal{GP}(m(\rho), K(\rho, \rho'))$

Here, ρ is a vector that characterizes the resource expenditure of the neural network 200 by means of one or more of the above characteristic quantities of the neural network 200. The amplitude σ and the scaling factor ℓ can be learned. The predicted resource expenditure $\varphi_{HW}$ for a neural network 200 described by the vector ρ is proportional to a value provided by the Gaussian process with the mean m(ρ) and the covariance K(ρ, ρ').
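The covariance function can be computed as in the following sketch. The function name `rbf_kernel` and the example vectors are assumptions made for illustration; the argument order of the vectors follows ρ = (row, depth, col, b_w, b_a).

```python
import math


def rbf_kernel(rho, rho_prime, sigma: float, length_scale: float) -> float:
    """Covariance K(rho, rho') with amplitude sigma and scaling factor length_scale."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(rho, rho_prime))
    return sigma ** 2 * math.exp(-squared_distance / (2.0 * length_scale ** 2))


# Two hypothetical layer descriptions that differ only in the bit widths.
rho_1 = (32.0, 64.0, 32.0, 4.0, 4.0)
rho_2 = (32.0, 64.0, 32.0, 8.0, 4.0)
k_same = rbf_kernel(rho_1, rho_1, sigma=2.0, length_scale=4.0)
k_diff = rbf_kernel(rho_1, rho_2, sigma=2.0, length_scale=4.0)
```

The covariance is largest for identical vectors, where it equals σ², and decays smoothly with the distance between the two vectors, which is why the resulting estimate is differentiable with respect to the bit widths.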

The different parameters of the Gaussian process can be learned using resource training data, the resource training data comprising a large number of pairs of vectors ρ and associated resource expenditure values.
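A minimal Gaussian process regression on such pairs can be sketched with the standard library alone. Several points in this sketch are assumptions: the amplitude and scaling factor are fixed rather than learned, the mean m(ρ) is taken as the constant average of the training values, a small noise term is added for numerical stability, and the three training pairs are illustrative toy values, not measurements.

```python
import math


def rbf_kernel(rho, rho_prime, sigma, length_scale):
    squared_distance = sum((a - b) ** 2 for a, b in zip(rho, rho_prime))
    return sigma ** 2 * math.exp(-squared_distance / (2.0 * length_scale ** 2))


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x


def fit_gp(rhos, costs, sigma=1.0, length_scale=2.0, noise=1e-6):
    """Return a function that predicts the posterior mean of the resource cost."""
    mean = sum(costs) / len(costs)  # assumed constant mean m(rho)
    gram = [[rbf_kernel(p, q, sigma, length_scale) + (noise if i == j else 0.0)
             for j, q in enumerate(rhos)] for i, p in enumerate(rhos)]
    alpha = solve(gram, [y - mean for y in costs])

    def predict(rho):
        return mean + sum(a * rbf_kernel(rho, p, sigma, length_scale)
                          for a, p in zip(alpha, rhos))
    return predict


# Toy resource training data: (row, depth, col, b_w, b_a) and a cost value.
train_rhos = [(0.0, 0.0, 0.0, 2.0, 2.0), (0.0, 0.0, 0.0, 4.0, 4.0),
              (0.0, 0.0, 0.0, 8.0, 8.0)]
train_costs = [1.0, 2.0, 4.0]
predict_cost = fit_gp(train_rhos, train_costs)
estimate = predict_cost((0.0, 0.0, 0.0, 3.0, 3.0))
```

In practice the amplitude and scaling factor would be fitted to the resource training data as well, for example by maximizing the marginal likelihood; that step is omitted here to keep the sketch short.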

The above function for describing the resource expenditure of a neural network 200 is differentiable and can therefore be used in a gradient descent optimization.

Fig. 4 shows a flowchart of a (possibly computer-implemented) method 400 for training a neural network 200 for a target function (e.g. for detecting objects based on the sensor data of one or more environment sensors 102 of a vehicle 100). The target function can be described by training data for training the neural network 200 (in particular by the target outputs 304 of the neural network 200 indicated in the training data).

The neural network 200 comprises a set of one or more layers 211, 212, 213 (in particular convolutional layers), each with a set 305 of parameters 222. Typically, the neural network 200 comprises 10 or more, or 100 or more, layers 211, 212, 213 that are arranged in series, so that the output 303 of one layer serves as the activation 301 of the subsequent layer.

In one iteration of a learning algorithm, the method 400 comprises quantizing 401 one or more activations 301 of the one or more layers 211, 212, 213 using an activation quantization function 311 in order to determine one or more corresponding quantized activations 302. The activation quantization function 311 can indicate, for each layer, the number of quantization values to be used.

Furthermore, the method 400 comprises quantizing 402 the one or more sets 305 of parameters 222 of the one or more layers 211, 212, 213 using a parameter quantization function 312 in order to determine one or more corresponding sets 307 of quantized parameters 222. The parameter quantization function 312 can indicate, for each layer, the number of quantization values to be used.

The method 400 further comprises determining 403 a predicted output 303 of the neural network 200 based on the one or more quantized activations 302 and based on the one or more sets 307 of quantized parameters 222. The respective output 303 can be determined sequentially, layer by layer, based on the respective activation 302, until the output layer 213 finally provides the predicted output 303 of the neural network 200.

In addition, the method 400 comprises adjusting 404 the activation quantization function, the parameter quantization function and/or the one or more sets 305 of parameters 222 based on the predicted output 303 and based on an optimization function. The optimization function can be designed to optimize the quality of the neural network 200 with respect to the target function and to reduce the resource expenditure of the neural network 200. In this way, a neural network 200 that offers an optimized compromise between target-function quality and resource expenditure can be provided efficiently and reliably.

Fig. 5 shows a flowchart of a (possibly computer-implemented) method 500 for determining an estimated value of the resource expenditure of a neural network 200. The aspects described in relation to the methods 300, 400 are also applicable to the method 500 (and vice versa). In particular, the method 500 can be used as part of the methods 300, 400.

As already explained, the neural network 200 comprises a set of one or more layers 211, 212, 213, each with a set 307 of (possibly quantized) parameters 222. The neural network 200 can be designed to quantize one or more activations 301 of the one or more layers 211, 212, 213 using an activation quantization function 311 in order to determine one or more corresponding quantized activations 302. Furthermore, the neural network 200 can be designed to determine a predicted output 303 of the neural network 200 on the basis of the one or more quantized activations 302 and on the basis of the one or more sets 307 of (possibly quantized) parameters 222.
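As a rough illustration of what an activation quantization function might look like, the following Python sketch maps activations onto a fixed number of evenly spaced values within a range set by a scaling value `c`. The uniform spacing, the non-negative range [0, c] (as would arise after a ReLU) and the function name `quantize_activation` are assumptions and are not taken from the application.

```python
import numpy as np


def quantize_activation(x, num_levels, c):
    """Clip x to [0, c] and round to one of num_levels evenly spaced values."""
    if num_levels < 2:
        raise ValueError("num_levels must be at least 2")
    scaled = np.clip(x, 0.0, c) / c * (num_levels - 1)
    return np.round(scaled) * c / (num_levels - 1)


activations = np.array([-0.3, 0.1, 0.49, 0.8, 1.7])

# Different layers may use different numbers of quantization values.
layer_a = quantize_activation(activations, num_levels=4, c=1.0)
layer_b = quantize_activation(activations, num_levels=16, c=2.0)
print(layer_a)
print(layer_b)
```

The two calls use different level counts and ranges to hint at how the quantization could differ from layer to layer.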

The method 500 includes determining 501 values of one or more characteristic variables for describing the neural network 200, the activation quantization function 311 and/or a parameter quantization function 312 with which the one or more sets 307 of quantized parameters 222 were quantized.

Furthermore, the method 500 includes determining 502 the estimated value of the resource expenditure of the neural network 200 on the basis of an estimation function and on the basis of the values of the one or more characteristic variables. The estimation function preferably comprises a Gaussian process that was trained using a regression method on the basis of resource training data. In this way, a precise estimated value of the resource expenditure of the neural network 200 can be determined efficiently.
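A Gaussian-process estimate of this kind can be sketched in a few lines of Python. The sketch below is a simplified illustration, not the estimation function of the application: it assumes a constant mean, a squared-exponential covariance with hand-picked hyperparameters `sigma` and `length` (the application speaks of learned values), two characteristic variables (bit widths of parameters and activations), and a made-up toy cost in place of measured resource training data.

```python
import numpy as np


def rbf_kernel(a, b, sigma, length):
    """Squared-exponential covariance sigma^2 * exp(-||a - b||^2 / (2 * length^2))."""
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return sigma ** 2 * np.exp(-sq_dist / (2.0 * length ** 2))


def gp_estimate(train_rho, train_cost, query_rho, sigma=1.0, length=1.5, noise=1e-8):
    """Posterior mean of a GP with constant mean (assumed) at the query points."""
    mean = train_cost.mean()
    k_tt = rbf_kernel(train_rho, train_rho, sigma, length)
    k_tt = k_tt + noise * np.eye(len(train_rho))  # small jitter for numerical stability
    k_qt = rbf_kernel(query_rho, train_rho, sigma, length)
    alpha = np.linalg.solve(k_tt, train_cost - mean)
    return mean + k_qt @ alpha


# Hypothetical characteristic variables: (bit width of parameters, bit width of activations).
bits = np.array([2.0, 4.0, 6.0, 8.0])
train_rho = np.array([[bw, ba] for bw in bits for ba in bits])
# Toy stand-in for measured resource expenditure; not real measurement data.
train_cost = train_rho[:, 0] * train_rho[:, 1]

query_rho = np.array([[3.0, 5.0]])
estimate = gp_estimate(train_rho, train_cost, query_rho)
print(estimate)
```

In practice the training pairs would come from measurements or simulations of the target hardware, and the kernel hyperparameters would be fitted to those data.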

The present invention is not limited to the exemplary embodiments shown. In particular, it should be noted that the description and the figures are intended only to illustrate, by way of example, the principle of the proposed methods, devices and systems.

Claims

Device for training a neural network (200) for a target function; wherein the neural network (200) comprises a set of one or more layers (211, 212, 213) each having a set (305) of parameters (222); wherein the device is set up, in an iteration of a learning algorithm, - to quantize one or more activations (301) of the one or more layers (211, 212, 213) using an activation quantization function (311) in order to determine one or more corresponding quantized activations (302); - to quantize the one or more sets (305) of parameters (222) of the one or more layers (211, 212, 213) using a parameter quantization function (312) in order to determine one or more corresponding sets (307) of quantized parameters (222); - to determine a predicted output (303) of the neural network (200) on the basis of the one or more quantized activations (302) and on the basis of the one or more sets (307) of quantized parameters (222); and - to adapt the activation quantization function, the parameter quantization function and the one or more sets (305) of parameters (222) on the basis of the predicted output (303) and on the basis of an optimization function.

Device according to claim 1, wherein the optimization function comprises - a first sub-function aimed at adapting the neural network (200), in particular the one or more sets (305) of parameters (222), with respect to the target function; and - a second sub-function aimed at calculating a resource cost of the neural network (200), in particular a resource cost associated with the quantization of the one or more activations (301) and of the one or more sets (205) of parameters (222) effected by the activation quantization function (311) and by the parameter quantization function (312).

Device according to one of the preceding claims, wherein the device is set up - to compare the predicted output (303) with a target output (304); and - to adjust the activation quantization function, the parameter quantization function and/or the one or more sets (305) of parameters (222) on the basis of the comparison.

Device according to claim 3, wherein the device is set up - to determine a deviation of the predicted output (303) from the target output (304); - to determine a gradient of the optimization function for the deviation; and - to adjust the activation quantization function, the parameter quantization function and/or the one or more sets (305) of parameters (222) on the basis of the gradient.

Device according to one of the preceding claims, wherein - the activation quantization function and/or the parameter quantization function for a layer (211, 212, 213) each indicate a number of quantization values with which the activation (301) of this layer (211, 212, 213) or the set (205) of parameters (222) of this layer (211, 212, 213) is quantized; and - the device is set up to adapt the number of quantization values in order to adapt the activation quantization function and/or the parameter quantization function.

Device according to claim 5, wherein the device is set up - to select the number of quantization values from a set of values; and - to reduce a cardinality of the set of values with an increasing number of iterations of the learning algorithm, in particular - from the set of real numbers; - optionally via a set of real numbers with a limited number of decimal places; - optionally via the set of natural numbers; - to a set of values {2^b} of powers of two, where b is a natural number.

Device according to one of claims 5 to 6, wherein the device is set up to determine different numbers of quantization values of the activation quantization function and/or the parameter quantization function for different layers (211, 212, 213) of the neural network (200).

Device according to one of the preceding claims, wherein - the activation quantization function and/or the parameter quantization function each have a scaling value c, which defines a value range of quantization values for quantization of an activation (301) or a set (205) of parameters (222); and - the device is set up to adapt the scaling value c on the basis of the predicted output (303) and on the basis of the optimization function.

Device according to one of the preceding claims, wherein the optimization function comprises a sub-function which is designed to predict a value of a resource cost of the neural network (200) on the basis of one or more characteristic variables for describing the neural network (200), the activation quantization function (311) and/or the parameter quantization function (312).

Device according to claim 9, wherein the one or more characteristic variables comprise - at least one variable relating to a number of arithmetic operations, in particular multiplications, to be carried out in the one or more layers (211, 212, 213); - at least one variable relating to a number of different quantization values with which the one or more quantized activations (302) are represented and/or with which the one or more activations (301) are quantized by the activation quantization function (311); and/or - at least one variable relating to a number of different quantization values with which the one or more sets (307) of quantized parameters (222) are represented and/or with which the one or more sets (305) of parameters (222) are quantized by the parameter quantization function (312).

Device according to one of claims 9 to 10, wherein - the sub-function for predicting a value of the resource cost of the neural network (200) comprises a Gaussian process; and - the Gaussian process was trained using a regression method on the basis of resource training data.

Device according to claim 11, wherein - the sub-function is designed to determine an estimated value of the resource expenditure which is proportional to $\mathcal{GP}(m(\rho), K(\rho, \rho'))$; - $m(\rho)$ is a learned mean of the Gaussian process $\mathcal{GP}(\cdot)$; - $K(\rho, \rho')$ is a learned covariance of the Gaussian process $\mathcal{GP}(\cdot)$; and - $\rho$ is a scalar or a vector comprising values of one or more characteristic variables describing the neural network (200), the activation quantization function (311) and/or the parameter quantization function (312).

Device according to claim 12, wherein the covariance $K(\rho, \rho')$ is given by

$$K(\rho, \rho') = \sigma^{2} \exp\left(-\frac{\lVert \rho - \rho' \rVert^{2}}{2 \ell^{2}}\right)$$

with a learned amplitude $\sigma$ and a learned scaling factor $\ell$.

Device according to one of the preceding claims, wherein the device is set up - to determine a respective predicted output (303) of the neural network (200) for each of a plurality of training activations (301); and - to adapt the activation quantization function, the parameter quantization function and the one or more sets (305) of parameters (222) on the basis of the plurality of predicted outputs (303) and on the basis of the optimization function.

Device according to one of the preceding claims, wherein - the neural network (200) comprises L consecutive layers (211, 212, 213), with L greater than 10, in particular greater than 100; and/or - a layer (211, 212, 213) is a convolution layer designed to perform a convolution between the respective set (307) of quantized parameters (222) and the respective quantized activation (302) in order to produce an output (303) of the respective layer (211, 212, 213); and/or - the device is designed - to use an output (303) of a layer l as an activation (301) for a directly following layer l+1; and/or - to use an output (303) of an output layer (213) of the neural network (200) as the output (303) of the neural network (200).

Method (400) for training a neural network (200) for a target function; wherein the neural network (200) comprises a set of one or more layers (211, 212, 213) each having a set (305) of parameters (222); wherein the method (400) comprises, in an iteration of a learning algorithm, - quantizing (401) one or more activations (301) of the one or more layers (211, 212, 213) using an activation quantization function (311) in order to determine one or more corresponding quantized activations (302); - quantizing (402) the one or more sets (305) of parameters (222) of the one or more layers (211, 212, 213) using a parameter quantization function (312) in order to determine one or more corresponding sets (307) of quantized parameters (222); - determining (403), on the basis of the one or more quantized activations (302) and on the basis of the one or more sets (307) of quantized parameters (222), a predicted output (303) of the neural network (200); and - adjusting (404) the activation quantization function, the parameter quantization function and the one or more sets (305) of parameters (222) on the basis of the predicted output (303) and on the basis of an optimization function.