DE102019134156A1

DE102019134156A1 - Training a multitask neural network

Info

Publication number: DE102019134156A1
Application number: DE102019134156.6A
Authority: DE
Inventors: Isabelle Leang; Fabian Burger; Diego Mendoza Barrenechea
Original assignee: Valeo Schalter und Sensoren GmbH
Current assignee: Valeo Schalter und Sensoren GmbH
Priority date: 2019-12-12
Filing date: 2019-12-12
Publication date: 2021-06-17

Abstract

The present invention relates to a method for training a neural network (10), in particular a convolutional neural network for use in a driving support system of a vehicle, having an encoder (12) for encoding provided input data (14) and a plurality of decoders (16), each of which performs a decoding task (18), using annotated training data (20) as input data (14), wherein each of the plurality of decoding tasks (18) has a target application context, the method comprising the steps of: providing a plurality of sets (22) of differently annotated training data (20) for training the plurality of decoding tasks (18); arranging the plurality of sets (22) of differently annotated training data (20) in data streams (24, 26), wherein, for training each of the plurality of decoding tasks (18), a first set (38) of training data (20) that correspond to a target application context of the neural network (10) and additionally a second set (40) of training data (20) that do not correspond to the target application context of the neural network (10) are provided; generating mini-batches (32) of training data (20) that comprise a plurality of sub-mini-batches (34, 36) of training data (20) for training each of the decoding tasks (18); generating the sub-mini-batches (34, 36) of training data (20) for training each of the decoding tasks (18) on the basis of the respective first set (38) in combination with the respective second set (40), wherein the training data (20) for the jointly trained decoding tasks (18) are combined according to a combination scheme in order to provide the respective sub-mini-batch (34, 36); training the neural network (10) using the generated mini-batches (32); and updating the combination scheme for combining the sub-mini-batches (34, 36) of training data (20) as a function of a learning development of the neural network (10).

Description

The present invention relates to a method for training a neural network, in particular a convolutional neural network for use in a driving support system of a vehicle, having an encoder for encoding provided input data and a plurality of decoders, each of which performs a decoding task, using annotated training data as input data, wherein each of the decoding tasks has a target application context.

A multitask neural network consists of a shared encoder and several task decoders. This enables efficient data processing with reduced resources compared to several parallel single-task neural networks, each of which handles only one task. The decoders are able to decode the provided information in parallel under different aspects, e.g. by performing segmentation or object detection, to name just a few. The input data used for training the neural network are input images (or may be inputs from other sensors such as LiDAR, radar, etc.), and the output of the neural network is the respective decoding information for each decoder, e.g. detection or segmentation annotations. Owing to their scalability, multitask neural networks are particularly well suited for building autonomous driving systems, since further decoding tasks can be added as required.
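The shared-encoder layout described above can be illustrated with a minimal sketch. The class name `MultiTaskNet`, the toy encoder and the two toy decoders are assumptions made purely for illustration and are not taken from the patent; in practice the encoder and decoders would be convolutional network modules.

```python
from typing import Callable, Dict, List

Features = List[float]


class MultiTaskNet:
    """Hypothetical container: one shared encoder, one decoder per task."""

    def __init__(self,
                 encoder: Callable[[List[float]], Features],
                 decoders: Dict[str, Callable[[Features], object]]) -> None:
        self.encoder = encoder
        self.decoders = decoders
        self.encoder_calls = 0  # counts how often the shared encoder runs

    def forward(self, x: List[float]) -> Dict[str, object]:
        # The input is encoded once; every decoder reuses the same features.
        features = self.encoder(x)
        self.encoder_calls += 1
        return {task: decode(features) for task, decode in self.decoders.items()}


def toy_encoder(x: List[float]) -> Features:
    # Stand-in for a convolutional encoder: mean and maximum of the input.
    return [sum(x) / len(x), max(x)]


net = MultiTaskNet(
    encoder=toy_encoder,
    decoders={
        # Stand-ins for a detection decoder and a segmentation decoder.
        "detection": lambda f: f[1] > 0.5,
        "segmentation": lambda f: [int(f[0] > 0.25)],
    },
)
outputs = net.forward([0.1, 0.9, 0.2])
```

The encoder runs once per input regardless of how many decoders are attached, which is where the resource saving over parallel single-task networks comes from.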

Multitask learning (MTL), i.e. the training of such multitask neural networks, is an emerging field within deep learning that focuses on designing and training a common neural network to solve several tasks. MTL is therefore particularly important for designing and training autonomous driving systems that use such multitask neural networks. However, training such a multitask neural network requires a large amount of training data (hundreds of thousands of data instances) annotated for the target decoding tasks. These annotated training data are also referred to as ground truth. In the ideal case, when the input data, i.e. the image data sets for training the multitask neural network, are fully annotated for the various tasks represented by the multitask decoders, the network uses the same batches of input data sets to train all tasks simultaneously. This makes it possible to train the encoder of the neural network and to train each decoder with each of the data sets.

In practice, it is difficult to find such a fully annotated data set for all target tasks. Typically, several data sets from different sources or projects are available that are partially annotated for one or more of the target decoding tasks within a particular application context. Such application contexts relate, for example, to pedestrian detection during parking or lane detection in a motorway driving scenario. The application context is decisive for any system that is to be trained appropriately. However, input data from other application contexts can additionally be used for generalization. For example, when considering pedestrian detection in a parking scenario, car parks are the target application context. However, input data with annotated pedestrians in other urban environments can also be used.

Because large amounts of training data are needed, a sufficient amount of such training data is particularly important for training neural networks, in particular for deep learning systems. Acquiring a new data set with training data that has all the annotations required for training all decoding tasks is very costly, e.g. in the range of millions of euros, which makes the reuse of available training data significant. Mixing strategies are therefore required for training such multitask neural networks, so that these networks can be trained efficiently using available training data that are not fully annotated for all decoding tasks and/or cover different application contexts.

However, several technical problems can arise when attempting to combine different kinds of differently annotated data sets. One question is how annotated and non-annotated data are to be managed for each decoding task during training. Another question is how to deal with the different sizes of the available data sets in order to avoid a bias of the training towards the largest data set. It is equally important to know how data sets with different application contexts are to be managed, e.g. motorway lane detection vs. lane detection in a parking scenario. Random mixing strategies, which are currently being investigated, do not show sufficient results. These random mixing strategies can lead to unbalanced training of the decoding tasks, with one decoding task being updated more frequently than another. This also affects the encoder, whose update depends on the contribution of the training data from the tasks present in a batch of training data. Using training data sets that differ in size leads to unbalanced training of the decoding tasks and to a bias towards the larger data set. The use of training data sets with different application contexts is not addressed.
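The bias towards the larger data set can be made concrete with a short calculation. The data set sizes, the batch size and the function name below are invented for illustration. The sketch assumes that samples are drawn uniformly at random from the pooled data, in which case the expected number of samples per task in a batch is proportional to the size of that task's data set.

```python
from typing import Dict


def expected_samples_per_batch(dataset_sizes: Dict[str, int],
                               batch_size: int) -> Dict[str, float]:
    """Expected per-task sample count when drawing uniformly from the pool."""
    total = sum(dataset_sizes.values())
    return {task: batch_size * size / total
            for task, size in dataset_sizes.items()}


# Made-up sizes: one data set is nine times larger than the other.
sizes = {"lane_detection": 90_000, "pedestrian_detection": 10_000}
expected = expected_samples_per_batch(sizes, batch_size=32)
```

With these made-up numbers, roughly 28.8 of the 32 samples in a batch would be expected to carry lane annotations and only about 3.2 pedestrian annotations, so the pedestrian decoder would receive far fewer updates.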

It is an object of the present invention to provide a method of the above-mentioned type that overcomes at least some of the above-mentioned problems and disadvantages. In particular, it is an object of the present invention to provide a method for training a neural network, in particular a convolutional neural network for use in a driving support system of a vehicle, which has an encoder for encoding provided input data and a plurality of decoders, each of which performs a decoding task, the method enabling efficient and balanced training of the encoder and the plurality of decoders.

This object is achieved by the independent claim. Advantageous embodiments are given in the dependent claims.

In particular, the present invention provides a method for training a neural network, in particular a convolutional neural network for use in a driving support system of a vehicle, which has an encoder for encoding provided input data and a plurality of decoders, each of which performs a decoding task, using annotated training data as input data, wherein each of the plurality of decoding tasks has a target application context, comprising the steps of: providing a plurality of sets of differently annotated training data for training the plurality of decoding tasks; arranging the plurality of sets of differently annotated training data in data streams, wherein, for training each of the plurality of decoding tasks, a data stream with training data that correspond to the target application context and training data that do not correspond to the target application context is provided for each task (i.e. one data generator per task); generating mini-batches of training data that comprise a plurality of sub-mini-batches of training data for training each decoding task, wherein, when generating the sub-mini-batches of training data for training each of the decoding tasks, the training data that correspond to the target application context and those that do not are combined according to a combination scheme in order to provide the respective sub-mini-batch; training the neural network using the generated mini-batches; and updating the combination scheme for combining the sub-mini-batches of training data as a function of a learning development of the neural network.

The basic idea of the present invention is to apply an adaptive scheme for mixing the provided training data for each of the decoding tasks. This is achieved by determining the learning development of the neural network as feedback. The learning development makes it possible to adapt the generation of the sub-mini-batches of training data for training each of the decoding tasks, i.e. the combining of the training data that correspond to the target application context and those that do not, for each of the jointly trained decoding tasks. Generating the sub-mini-batches and adapting the combination scheme can be done in different ways. Some options for generating the sub-mini-batches and for adapting the combination scheme are explained in more detail below. The proposed method provides a simple solution for efficiently supplying the multitask neural network with partially and/or fully annotated data by using separate data streams for each decoding task. In addition, balanced training of the decoding tasks is achieved by using mini-batches that contain training data for each of the decoding tasks. Furthermore, different training data sets covering different application contexts can be processed. For example, a training data set can be helpful for training the respective decoding task even if it relates to a different application context, in order to achieve better generalization with more training data; e.g. an object such as a car retains the same visual properties regardless of whether it is in an urban or a rural application context. The proposed method can easily be scaled to as many training data sets as possible. A dynamic mixture of the training data that correspond to the target application context with those that do not is achieved.
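The feedback loop behind the adaptive scheme (generate mini-batch, train, observe the learning development, update the combination scheme) can be sketched as a generic loop. All names below are hypothetical, the loss value stands in for whatever measure of learning development is used, and updating after every mini-batch is an assumption of this sketch rather than something the text prescribes.

```python
from typing import Any, Callable, List, Tuple


def adaptive_training_loop(
    build_mini_batch: Callable[[Any], Any],            # scheme -> mini-batch
    train_step: Callable[[Any], float],                # mini-batch -> loss
    update_scheme: Callable[[Any, List[float]], Any],  # (scheme, history) -> scheme
    scheme: Any,
    num_steps: int,
) -> Tuple[Any, List[float]]:
    """Run num_steps iterations and feed the loss history back into the scheme."""
    history: List[float] = []
    for _ in range(num_steps):
        mini_batch = build_mini_batch(scheme)     # mix data according to the scheme
        history.append(train_step(mini_batch))    # train and record the feedback
        scheme = update_scheme(scheme, history)   # adapt the mixing for the next step
    return scheme, history
```

The three callables are deliberately left open, since the text states that generating the sub-mini-batches and adapting the combination scheme can be done in different ways.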

Each of the decoders decodes the provided input data under different aspects, for example by performing, in parallel, a segmentation or a detection of objects, to name just a few. The input data used for training the neural network are input images that are fed to the encoder. The output of the neural network is corresponding decoding information for each decoder, e.g. detection or segmentation annotations.

The training data comprise at least one annotated training data set for each of the decoding tasks in order to carry out efficient and balanced training of the entire neural network.

Each set of training data that is annotated for training the same decoding task is added to the respective data streams. For each of the decoding tasks, the training data comprise at least annotated data from the target application context or annotated data that do not correspond to the target application context. However, the training data can contain, for all decoding tasks, both training data that correspond to the target application context and training data that do not. For each decoding task, all target classes must be annotated in the training data of the data stream annotated for this decoding task, i.e. the respective data stream of training data is fully annotated for the respective decoding task. In addition, at least some training data are annotated for each decoding task. In particular, training data relating to the target application context of the neural network exist for each decoding task.
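One way to picture the arrangement of data sets into per-task data streams is the following sketch. The `AnnotatedSet` record, the category labels `"target"` and `"other"`, and the example data set names are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class AnnotatedSet:
    """Hypothetical description of one available training data set."""
    name: str
    tasks: Tuple[str, ...]  # decoding tasks the set is fully annotated for
    context: str            # application context the data were recorded in


def arrange_streams(sets: List[AnnotatedSet],
                    target_context: str) -> Dict[str, Dict[str, List[str]]]:
    """Group data sets per decoding task, split by target / non-target context."""
    streams: Dict[str, Dict[str, List[str]]] = {}
    for data_set in sets:
        category = "target" if data_set.context == target_context else "other"
        for task in data_set.tasks:
            stream = streams.setdefault(task, {"target": [], "other": []})
            stream[category].append(data_set.name)
    return streams


available = [
    AnnotatedSet("parking_pedestrians", ("pedestrian",), "parking"),
    AnnotatedSet("city_pedestrians", ("pedestrian",), "urban"),
    AnnotatedSet("motorway_lanes", ("lane",), "motorway"),
    AnnotatedSet("parking_full", ("pedestrian", "lane"), "parking"),
]
streams = arrange_streams(available, target_context="parking")
```

A data set annotated for several tasks, such as the made-up `parking_full`, feeds the stream of each of those tasks.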

In general, the target application context relates to the neural network. In other cases, however, the target application context can be defined independently for each decoding task.

The mini-batches are small subsets of training data that are used in each training step of an iterative training process of the neural network. During training, the training data are split into N mini-batches. Each mini-batch has a batch size m, i.e. the mini-batch contains m samples of training data. Typical values for the batch size m are, for example, 8, 16 or 32. The entire set of training data corresponds to one epoch, which has a data size of N*m samples. The batch size is chosen, for example, depending on the capacity of the GPU used to train the neural network. The mini-batches enable efficient processing of large amounts of training data. Owing to the memory limitations of GPUs, it is generally not possible to backpropagate the entire training data in a single step.
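The relationship between the batch size m, the number of mini-batches N and the epoch size N*m can be sketched as follows. The function name is hypothetical, and dropping a trailing partial mini-batch is an assumption of this sketch.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_into_mini_batches(samples: Sequence[T], m: int) -> List[List[T]]:
    """Split samples into N mini-batches of exactly m samples each."""
    if m <= 0:
        raise ValueError("batch size m must be positive")
    n = len(samples) // m  # N; assumption: a trailing partial batch is dropped
    return [list(samples[i * m:(i + 1) * m]) for i in range(n)]


samples = list(range(100))                            # 100 made-up training samples
mini_batches = split_into_mini_batches(samples, m=32)
epoch_size = sum(len(b) for b in mini_batches)        # N * m samples per epoch
```

With 100 samples and m = 32, this yields N = 3 mini-batches and an epoch of 96 samples.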

Each of the mini-batches of training data preferably comprises one sub-mini-batch for each decoding task. Each of the sub-mini-batches preferably comprises training data for one of the decoding tasks. All decoding tasks are therefore trained in every iteration, i.e. with every mini-batch. The sub-mini-batches are generally formed from unseen training data, i.e. from training data that have not yet been used during an epoch. An epoch generally refers to presenting the complete training data set to the neural network for training. However, if the training data differ in size between the various training tasks, this cannot be applied completely; instead, some of the training data may, for example, remain unused. When training the neural network, several training epochs are typically carried out.
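The composition of a mini-batch from one sub-mini-batch per decoding task, drawn from data not yet seen in the current epoch, might look like the sketch below. Ending the epoch when the smallest stream is exhausted is an assumption made here to show how part of a larger stream can remain unused; the function and sample names are hypothetical.

```python
import random
from typing import Dict, Iterator, List


def epoch_mini_batches(streams: Dict[str, List[str]],
                       sub_batch_size: int,
                       seed: int = 0) -> Iterator[Dict[str, List[str]]]:
    """Yield mini-batches that hold one sub-mini-batch per decoding task."""
    rng = random.Random(seed)
    # Shuffle a copy of each stream; slicing it then uses each sample at most once.
    shuffled = {task: rng.sample(data, len(data)) for task, data in streams.items()}
    # Assumption: the epoch ends when the smallest stream has no full
    # sub-mini-batch left, so larger streams keep some unused samples.
    steps = min(len(data) for data in shuffled.values()) // sub_batch_size
    for step in range(steps):
        lo, hi = step * sub_batch_size, (step + 1) * sub_batch_size
        yield {task: data[lo:hi] for task, data in shuffled.items()}


streams = {
    "detection": [f"det_{i}" for i in range(10)],
    "segmentation": [f"seg_{i}" for i in range(4)],
}
batches = list(epoch_mini_batches(streams, sub_batch_size=2))
```

In this made-up example every mini-batch trains both tasks, the segmentation stream is used completely, and six of the ten detection samples remain unused in the epoch.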

According to a modified embodiment of the invention, combining the training data that correspond and do not correspond to the target application context for the jointly trained decoding tasks according to a combination scheme in order to provide the respective sub-mini-batch comprises providing an individual combination scheme for each of the jointly trained decoding tasks, and updating the combination scheme for combining the sub-mini-batches of training data as a function of a learning development of the neural network comprises individually updating the combination scheme as a function of the learning development of the neural network for each of the jointly trained decoding tasks. The learning development is therefore determined individually for each of the decoding tasks, so that the individual learning development can be used to update the combination scheme for each of the decoding tasks. Each of the sub-mini training data sets for training the various decoding tasks can thus be composed individually, taking the respective learning development into account. The individual combination scheme enables improved training of the neural network, since the training of each of the decoding tasks is adapted independently of the others.
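An individual combination scheme per decoding task could be kept as one mixing parameter per task and updated from that task's own loss history. The update rule below (raise the share of target-context data by a fixed step when the task's loss did not improve) is purely an assumption for illustration; the text does not prescribe a particular rule, and all names are hypothetical.

```python
from typing import Dict, List


def update_task_ratios(ratios: Dict[str, float],
                       loss_history: Dict[str, List[float]],
                       step: float = 0.1) -> Dict[str, float]:
    """Update each task's target-context share from its own learning development."""
    updated: Dict[str, float] = {}
    for task, ratio in ratios.items():
        losses = loss_history[task]
        # Hypothetical rule: if this task's loss did not decrease, use more
        # target-context data for this task; other tasks are unaffected.
        if len(losses) >= 2 and losses[-1] >= losses[-2]:
            ratio = min(1.0, ratio + step)
        updated[task] = ratio
    return updated


ratios = {"detection": 0.5, "segmentation": 0.5}
history = {"detection": [1.0, 0.8], "segmentation": [1.0, 1.1]}
new_ratios = update_task_ratios(ratios, history)
```

Here only the segmentation share changes, because only its loss failed to improve; the detection scheme is left as it was.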

According to a modified embodiment of the invention, combining the training data that correspond and do not correspond to the target application context for the jointly trained decoding tasks according to a combination scheme to provide the respective sub-mini-batch comprises selecting the training data that correspond and do not correspond to the target application context according to a ratio, and updating the combination scheme for combining the sub-mini-batches of training data as a function of a learning development of the neural network comprises updating the ratio. Samples of training data are therefore drawn from each of the two categories (corresponding or not corresponding to the target application context) according to the ratio. The presentation of the training data to the neural network is thus controlled with regard to its composition from the two data categories. The ratio is preferably a value within a numerical range between zero and one. However, any other kind of value from any numerical range can also be used, provided that it can express a ratio between the training data drawn from the two data categories.

According to a modified embodiment of the invention, selecting the training data from the first and the second data stream according to a ratio comprises generating a random number, in particular between zero and one, for each training data sample that is to be added to the respective sub-mini-batch, and taking the training data sample to be added from the first or the second data stream depending on whether or not the random number is greater than the ratio. The random numbers are generated within the same numerical range in which the ratio is specified, typically as real numbers between zero and one. Although the ratio defines how the training data from the first and second data streams are presented to the neural network, the random numbers allow a more flexible selection of the training data from the first and the second data stream. Individual sub-mini-batches can therefore contain training data from the two data streams in percentages that differ from those given by the ratio. Considered over several sub-mini-batches, however, the sub-mini-batches can contain the training data from the two data streams in the defined ratio. Since the sub-mini-batches typically contain only a small number of training data samples, the ratio cannot be applied with high accuracy within a single sub-mini-batch. For example, if the sub-mini-batch contains ten samples of training data, only ten ratios can be applied. With the random numbers, the method achieves a correct application of the training data from the two data streams in accordance with the ratio over several sub-mini-batches of training data. The same principles apply even if the ratio changes during the training of the neural network.
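The per-sample random selection described above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the function name `build_sub_mini_batch`, the placeholder streams, and the reading of the ratio as the share of samples taken from the first data stream are not taken from the description, which only states that the choice depends on whether the random number is greater than the ratio.

```python
import itertools
import random


def build_sub_mini_batch(first_stream, second_stream, ratio, size, rng):
    """Assemble one sub-mini-batch of `size` samples.

    Assumption: `ratio` is read as the share of samples taken from the first
    (target-context) stream. A sample comes from the second stream when the
    random number is greater than `ratio`, otherwise from the first stream.
    """
    batch = []
    for _ in range(size):
        r = rng.random()  # uniform real number in [0, 1)
        source = second_stream if r > ratio else first_stream
        batch.append(next(source))
    return batch


# Hypothetical endless streams that only label where a sample came from.
first = itertools.cycle(["in_context"])
second = itertools.cycle(["out_of_context"])
rng = random.Random(0)

batches = [build_sub_mini_batch(first, second, 0.7, 10, rng) for _ in range(1000)]
share = sum(b.count("in_context") for b in batches) / (10 * len(batches))
print(f"share of in-context samples over all batches: {share:.3f}")
```

A single batch of ten samples may deviate noticeably from the ratio, whereas the share over many batches settles close to it, which is the behaviour the paragraph above describes.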

According to a modified embodiment of the invention, updating the combination scheme for combining the sub-mini-batches of training data as a function of a learning development of the neural network comprises applying a learning rate. The learning rate is a measure of how strongly the composition of the sub-mini-batches changes. It is typically a numerical value. The learning rate can be chosen and adjusted in order to tune the updating of the combination scheme. A higher learning rate generally allows faster convergence, but increases the risk of overfitting to a certain type of training data. The learning rate can vary during the training of the neural network.

According to a modified embodiment of the invention, combining the training data of the first and the second data stream for the jointly trained decoding tasks according to a combination scheme to provide the respective sub-mini-batch comprises reusing training data of the first and/or the second data stream of training data for the respective decoding task. If several independently provided training data sets are used, there is a high probability that one or more of the data streams of training data contain more training data than other data streams of training data. As a result, part of the training data may remain unused during an epoch if the training is stopped as soon as no unused training data are left in one or more of the data streams of training data. By reusing part of the already used training data during an epoch, the use of the available training data can be improved. This is particularly important when most of the data streams contain more training data than a single data stream of training data or only a few data streams of training data. Reusing training data is therefore particularly useful in the case of unbalanced data streams of training data for the respective decoding task. If the training data for the different tasks are unbalanced, the training data of the first and the second data stream of training data of one or more complete decoding task(s) can also be reused.
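One simple way to picture the reuse of training data is a stream that starts over once its samples are exhausted, so that a short stream does not end the epoch early. The following is a minimal sketch under assumptions: the helper name `reusing_stream` and the reshuffling on every pass are illustrative choices and are not prescribed by the description.

```python
import itertools
import random


def reusing_stream(samples, rng):
    """Yield samples endlessly, starting a new shuffled pass when exhausted.

    Assumption: reshuffling before each pass is an illustrative choice.
    """
    samples = list(samples)
    if not samples:
        raise ValueError("a data stream needs at least one sample")
    while True:
        rng.shuffle(samples)
        yield from samples


# A short hypothetical stream of three samples serves six requests by reuse.
short_stream = reusing_stream(["a", "b", "c"], random.Random(0))
taken = list(itertools.islice(short_stream, 6))
print(taken)
```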

According to a modified embodiment of the invention, arranging the plurality of differently annotated training data sets in data streams, wherein a first data stream with training data that correspond to a target application context of the neural network is provided for training each of the plurality of decoding tasks, and a second data stream with training data that do not correspond to the target application context of the neural network is provided for additionally training at least one jointly trained decoding task out of the plurality of decoding tasks, comprises combining all training data sets that correspond to a particular decoding task and to the respective target application context in the respective first data stream, and combining all training data sets that correspond to the particular decoding task but not to the respective target application context in the respective second data stream. Thus, all available training data that are annotated for training one of the decoding tasks are combined in the respective first and/or second data stream. In particular, if several training data sets are provided that are suitable for training a particular decoding task and correspond to the target application context, the individual samples from these training data sets are combined in the respective first data stream. Similarly, if several training data sets are provided for training a particular decoding task that do not correspond to the target application context, the samples from these training data sets are combined in the respective second data stream. Mixing strategies can be applied to provide the training data from several training data sets in one data stream. Alternatively, the training data of the data streams can be accessed at random.
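The arrangement of training data sets into a first and a second data stream per decoding task can be sketched as a simple grouping step. The function name `arrange_streams`, the tuple layout of a data set, and the example data are hypothetical and serve only to illustrate the paragraph above.

```python
from collections import defaultdict


def arrange_streams(datasets):
    """Group samples into first/second streams per decoding task.

    `datasets` is an iterable of (task, in_target_context, samples) tuples
    (an assumed layout). Samples matching the target application context go
    to the first stream of their task, all others to the second stream.
    """
    first = defaultdict(list)
    second = defaultdict(list)
    for task, in_target_context, samples in datasets:
        target = first if in_target_context else second
        target[task].extend(samples)
    return dict(first), dict(second)


# Hypothetical data sets annotated for two decoding tasks.
datasets = [
    ("task1", True, ["a1", "a2"]),
    ("task1", False, ["b1"]),
    ("task1", True, ["c1"]),
    ("task2", False, ["d1", "d2"]),
]
first_streams, second_streams = arrange_streams(datasets)
print(first_streams)
print(second_streams)
```

A mixing strategy, such as shuffling each combined list, could be applied to the grouped samples afterwards.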

According to a modified embodiment of the invention, training the neural network using the generated mini-batches comprises training the neural network in a synchronous backpropagation manner, and updating the combination scheme for the sub-mini-batches of training data as a function of a learning development of the neural network comprises dynamically updating the combination scheme for subsequent mini-batches of training data as a function of a learning development of the neural network. Accordingly, the neural network is updated after each training sequence with a mini-batch, wherein, for each decoding task, task losses, for example, are determined from each sub-mini-batch and an error is backpropagated simultaneously in the respective decoder of this particular decoding task and additionally in the encoder. This ensures a good balance of the trained tasks, in particular with regard to the training of the encoder, e.g. the training of encoder features.

According to a modified embodiment of the invention, generating mini-batches with several sub-mini-batches of training data for training each of the decoding tasks comprises balancing the sub-mini-batches for each of the mini-batches according to a balancing scheme. Various balancing schemes for balancing the parallel training of the several decoding tasks are known in the art.

According to a modified embodiment of the invention, the method comprises a step of determining a learning development of the neural network as a function of a validation loss and/or of a validation key performance indicator of the neural network. The validation loss is a quantity that is frequently used in deep learning to obtain a good estimate of how the neural network is learning, e.g. a training speed and the like. In particular, the validation loss indicates whether the neural network is still learning or whether overfitting is occurring instead. In order to obtain a system that generalizes well, in particular a system that works well for all training data sets, overfitting must be avoided. Overfitting can be detected when the validation loss begins to rise instead of falling. In fact, any other parameter that gives a good estimate of the overfitting of the neural network can be used. In particular, the validation key performance indicator (validation KPI) can be used instead of the validation loss if it is available.

According to a modified embodiment of the invention, determining a learning development of the neural network as a function of a validation loss and/or of a validation key performance indicator of the neural network comprises independently determining the learning development of the neural network for each of the decoding tasks. This provides a good and detailed insight into the learning process of the neural network. The individually determined learning development in particular enables an individual updating of the combination scheme for the training data from the first and the second data stream.

According to a modified embodiment of the invention, determining a learning development of the neural network as a function of a validation loss and/or of a validation key performance indicator of the neural network comprises independently determining the learning development of the neural network for each of the application contexts. Accordingly, the learning development is determined individually for the training carried out with the respective training data from the first and the second data stream. On the basis of the learning development determined in this way, it can be reliably determined whether the neural network as a whole and/or the respective decoding tasks benefit more from training with training data having the corresponding target application context or with training data not having the corresponding target application context. Thus, by updating the combination scheme, the training of the neural network as a whole and/or of the respective decoding tasks can be adapted in the most efficient way.

According to a modified embodiment of the invention, determining a learning development of the neural network as a function of a validation loss and/or of a validation key performance indicator of the neural network comprises applying a learning rate. The learning development can thus be determined generically on the basis of a learning rate. The learning rate is typically a numerical value. The learning development can be used as a basis for updating the combination scheme. A higher learning rate generally allows faster convergence, but increases the risk of overfitting to a certain type of training data. For example, the ratio for selecting the training data from the first and the second data stream can be updated by multiplying a gradient of the validation loss by the learning rate. This product can be subtracted from the current ratio.
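The ratio update mentioned above (current ratio minus learning rate times gradient of the validation loss) can be sketched as below. Two points are assumptions made only for illustration and are not stated in the description: the gradient is approximated by the difference between two successive validation losses, and the result is clamped to the range between zero and one. The function name `update_ratio` is likewise hypothetical.

```python
def update_ratio(ratio, previous_val_loss, current_val_loss, learning_rate):
    """Return the updated ratio: ratio - learning_rate * gradient.

    Assumptions: the gradient of the validation loss is approximated by a
    finite difference between successive evaluations, and the new ratio is
    clamped to [0, 1] so that it remains usable as a sampling ratio.
    """
    gradient = current_val_loss - previous_val_loss
    new_ratio = ratio - learning_rate * gradient
    return min(1.0, max(0.0, new_ratio))


# Hypothetical values: the validation loss fell from 1.0 to 0.8.
new_ratio = update_ratio(0.5, 1.0, 0.8, learning_rate=0.5)
print(new_ratio)
```

With a falling validation loss the gradient is negative, so the ratio increases; with a rising validation loss it decreases.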

According to a modified embodiment of the invention, the method comprises a step of stopping the training if the learning development indicates that the neural network is no longer improving, in particular if the learning development indicates that the neural network is no longer improving with respect to at least one of the decoding tasks. The training is thus stopped so that overfitting of the neural network can be avoided. For example, the neural network can be regarded as no longer improving if the gradient of the validation loss for training the neural network with training data from the first data stream and/or for training the neural network with training data from the second data stream is greater than zero.
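The stopping criterion can be illustrated with a small check on recorded validation losses. This is a sketch under assumptions: the gradient is approximated by the difference between the last two recorded losses, and the "and/or" of the description is implemented here as a logical "or". The name `should_stop` and the example histories are hypothetical.

```python
def should_stop(val_losses_first, val_losses_second):
    """Return True if the validation loss is rising for either data stream.

    Assumptions: each argument is a list of validation losses recorded over
    time for training with the first or second data stream, and the gradient
    is approximated by the difference between the last two entries.
    """
    def rising(history):
        return len(history) >= 2 and history[-1] - history[-2] > 0

    return rising(val_losses_first) or rising(val_losses_second)


# Hypothetical histories: the loss for the second stream starts to rise.
stop = should_stop([0.9, 0.7, 0.6], [0.8, 0.75, 0.78])
print(stop)
```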

These and other aspects of the invention will be apparent from and elucidated with reference to the embodiments described below. Individual features shown in the embodiments can form an aspect of the present invention on their own or in combination. Features of the various embodiments can be transferred from one embodiment to another.

In the drawings:

Fig. 1 shows a schematic representation of a neural network having a single encoder and two decoders, the encoder being supplied with input data with two data streams and each of the decoders providing a single output corresponding to its decoding task, according to a first preferred embodiment;
Fig. 2 shows a table indicating the applicability of various training data sets provided for different decoding tasks and also indicating whether the training data set belongs to the target application context, in accordance with the first embodiment;
Fig. 3 shows a schematic drawing of a sub-mini-batch of training data for training a neural network with two decoders for two decoding tasks, in accordance with the first embodiment;
Fig. 4 shows a schematic drawing of two of the sub-mini-batches of training data shown in Fig. 3 for training a decoding task of the neural network, which correspond to the target application context or, respectively, do not correspond to the target application context of the mini training data sets for the other decoding tasks;
Fig. 5 shows an exemplary diagram illustrating a composition of the sub-mini-batches of training data from training data that correspond to the application context and training data that do not correspond to the application context over different epochs, in accordance with the first embodiment; and
Fig. 6 shows a flow chart of a method for training the aforementioned neural network with an encoder and several decoders with annotated training data, each of the several decoding tasks having a target application context, in accordance with the first embodiment.

Fig. 6 shows a flow chart of a method for training a neural network 10, in particular a convolutional neural network for use in a driving support system of a vehicle, according to a first preferred embodiment.

The neural network 10 of the first embodiment has an encoder 12 for encoding provided input data 14 and several decoders 16, each of which performs a decoding task 18, with annotated training data 20 as input data 14, each of the several decoding tasks 18 having an application context. The input data 14, which are used as input for training the neural network 10, comprise several samples of input images that are fed to the encoder 12. Each of the decoders 16 decodes the provided and encoded input data 14 with regard to a different decoding task, e.g. performing, in parallel, a segmentation or a detection of objects, to name just a few. An output 28 of the neural network 10 is, for each decoder 16, corresponding decoding information, e.g. detection annotations or segmentation annotations.

To carry out the training, the training data 20 are fed to the encoder 12 of the neural network 10 as input data 14. The training data 20 comprise several sets 22 of training data 20, each set 22 of training data 20 being annotated differently, as shown by way of example in Fig. 2. Fig. 2 shows training data 20 provided in four sets 22 of training data 20, designated data set A to data set D. By way of example, Fig. 2 assumes that the neural network 10 has four decoding tasks 18, designated task 1 to task 4. As shown in Fig. 2, the four sets 22 of training data 20 are annotated differently, and each set 22 of training data 20 is annotated, by way of example, for training one of the decoding tasks 18. In addition, the table in Fig. 2 indicates for each set 22 of training data 20 whether or not it corresponds to the target application context. In the example given, two sets 22 of training data 20 correspond to the target application context and two do not.

The sets 22 of training data 20 are therefore provided independently of one another. In general, each set 22 of training data 20 can correspond to one or more of the decoding tasks 18, and each set 22 of training data 20 may or may not, independently of the others, correspond to the target application context of the neural network 10.

The method begins with step S100, which relates to providing several sets 22 of differently annotated training data 20 for training the several decoding tasks 18. The training data 20, as provided in the several sets 22 of differently annotated training data 20, serve as the basis for training the neural network 10. The neural network 10 in Fig. 1 has only two decoders 16 for two different decoding tasks 18, which correspond, by way of example, to tasks 1 and 2 of the table in Fig. 2.

Step S110 relates to arranging the several sets 22 of differently annotated training data 20 in data streams 24, 26. For training the two decoding tasks 18, a first data stream 24 with training data 20 corresponding to the first decoding task (detection decoder) and a second data stream 26 with training data 20 corresponding to the second decoding task (segmentation decoder) of the neural network 10 are provided. Accordingly, for the decoding task 18 designated task 1, data set D is added to the first data stream 24, and for the decoding task 18 designated task 2, data set C is added to the second data stream 26, because these sets 22 of training data 20 are annotated for the respective decoding task 18 and correspond to the target application context of the neural network 10. Furthermore, the two decoding tasks 18 designated task 1 and task 2 are jointly trained decoding tasks 18. The training data 20 therefore additionally comprise, for each of the two decoding tasks 18, a set 22 of training data 20 that is annotated for the respective decoding task 18 but does not correspond to the target application context of the neural network 10, and that is added to the respective data stream 24, 26. In particular, for the decoding task 18 designated task 1, data set A is added to the first data stream 24, and for the decoding task 18 designated task 2, data set A is likewise added to the second data stream 26, because this set 22 of training data 20 is annotated for both decoding tasks 18 and does not correspond to the target application context of the neural network 10.

In another example, in which several of the provided sets 22 of training data 20 satisfy the above conditions for being added to the first or the second data stream 24, 26, all sets 22 of training data 20 that correspond to a particular decoding task 18 and to the respective target application context are combined and added to the respective first or second data stream 24, 26, and all sets 22 of training data 20 that correspond to the particular decoding task 18 but not to the respective target application context are likewise combined and added to the respective first or second data stream 24, 26. Mixing strategies can be applied to provide the training data 20 from several sets 22 of training data 20 in a respective data stream 24, 26.
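The arrangement of step S110 can be illustrated with a short sketch that sorts data sets into one stream per decoding task and, within each stream, into a target-context group and an other-context group. The class and function names are hypothetical. Only data sets A, C and D are listed because the description states their task and context; data set B is left out since its task assignment is not given in the text.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class DataSet:
    name: str
    tasks: FrozenSet[str]        # decoding tasks the set is annotated for
    in_target_context: bool      # whether it matches the target application context


# Assignments taken from the description of step S110.
DATA_SETS = [
    DataSet("A", frozenset({"task1", "task2"}), False),
    DataSet("C", frozenset({"task2"}), True),
    DataSet("D", frozenset({"task1"}), True),
]


def arrange_streams(data_sets: List[DataSet], tasks: List[str]) -> Dict[str, Dict[str, List[str]]]:
    """Build one stream per task, split into target-context and other-context sets."""
    streams: Dict[str, Dict[str, List[str]]] = {
        task: {"target": [], "other": []} for task in tasks
    }
    for data_set in data_sets:
        for task in data_set.tasks:
            if task in streams:
                group = "target" if data_set.in_target_context else "other"
                streams[task][group].append(data_set.name)
    return streams


streams = arrange_streams(DATA_SETS, ["task1", "task2"])
```

If several data sets qualify for the same group, they simply accumulate in the same list, which corresponds to the combining described in the preceding paragraph. How the combined sets are then mixed is not specified here.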

Step S120 relates to generating mini-batches 32 of training data 20, each comprising several sub-mini-batches 34, 36 of training data 20 for training each of the decoding tasks 18. A mini-batch 32 of training data 20 is shown by way of example in Fig. 3. Each mini-batch 32 contains m samples of training data 20, of which n samples correspond to one decoding task 18 and (m - n) samples correspond to another decoding task 18. The neural network 10 described here has two decoding tasks 18, so the mini-batch 32 contains two sub-mini-batches 34, 36, one for each of the two decoding tasks 18. A constraint on providing the mini-batches 32 is that each of the sub-mini-batches 34, 36 contains at least one sample of training data 20.

The training data 20 are thus divided into N mini-batches 32 for the training of each epoch. Each mini-batch 32 has a size m, i.e. the mini-batch 32 contains m samples of training data 20. Typical values for the batch size are, for example, 8, 16 or 32. The entirety of the training data 20 thus forms the epoch, which has a data size of N*m samples.
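The size relations of step S120 can be checked with a small sketch. The helper names are hypothetical; the constraint that each sub-mini-batch holds at least one sample and the epoch size of N*m samples are taken from the description.

```python
from typing import Tuple


def split_mini_batch(m: int, n: int) -> Tuple[int, int]:
    """Return the sizes (n, m - n) of the two sub-mini-batches of a mini-batch of size m."""
    if not 1 <= n <= m - 1:
        raise ValueError("each sub-mini-batch needs at least one sample")
    return n, m - n


def epoch_size(num_mini_batches: int, m: int) -> int:
    """An epoch of N mini-batches of size m covers N * m samples."""
    return num_mini_batches * m


sizes = split_mini_batch(m=16, n=6)          # 6 samples for one task, 10 for the other
samples_per_epoch = epoch_size(num_mini_batches=20, m=16)
```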

Step S130 relates to generating the sub-mini-batches 34, 36 of training data 20 for training each of the two decoding tasks 18 on the basis of the respective data stream 24, 26. Each of the two decoding tasks 18 is a jointly trained decoding task 18. The sub-mini-batches 34, 36 therefore each contain training data 20 from the first and second data stream 24, 26, respectively. The training data 20 for each of the decoding tasks 18 are combined according to a combination scheme, explained in more detail below, which yields a set 38 of training data 20 that corresponds to the target application context and a set 40 of training data 20 that does not correspond to the target application context, as shown in Fig. 4.

The combination scheme involves providing a ratio, and the samples of training data 20 are drawn from the sets 38, 40 according to that ratio. In the first embodiment, the ratio is a real value between zero and one. Furthermore, when a sample of training data 20 is selected from the first or second data stream 24, 26, a random number in the same range, i.e. between zero and one, is generated. Each sample of training data 20 to be added to the respective sub-mini-batch 34, 36 is drawn from the set 38 or the set 40 depending on whether or not the random number is greater than the ratio.

In this embodiment, the combination scheme is an individual combination scheme for each of the jointly trained decoding tasks 18. Accordingly, each of the sub-mini-batches 34, 36 of training data 20 for training the different decoding tasks 18 is composed individually.

Typically, the samples of training data 20 from the first and the second data stream 24, 26 are samples that have not yet been used for training during the current epoch. However, training data 20 of the first and/or the second data stream 24, 26 can be reused for the respective decoding task 18. Reusing training data 20 is particularly useful when one or more of the data streams 24, 26 contain more training data 20 than the other data streams 24, 26.
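One way to picture the reuse of training data is a stream that hands out every sample once per pass and starts a new pass when it runs empty. The class below is a hypothetical sketch; the reshuffling between passes is an assumption, as the description does not say how reused samples are ordered.

```python
import random
from typing import List, Sequence


class ReusableStream:
    """Serves each sample once per pass and begins a new pass when exhausted."""

    def __init__(self, samples: Sequence[str], rng: random.Random) -> None:
        if not samples:
            raise ValueError("a data stream needs at least one sample")
        self._samples = list(samples)
        self._rng = rng
        self._pending: List[str] = []
        self.passes_started = 0

    def next_sample(self) -> str:
        if not self._pending:
            # Every sample has been used in this pass: reshuffle and reuse them.
            self._pending = list(self._samples)
            self._rng.shuffle(self._pending)
            self.passes_started += 1
        return self._pending.pop()


small_stream = ReusableStream(["a1", "a2", "a3"], random.Random(0))
drawn = [small_stream.next_sample() for _ in range(6)]
```

With three samples and six draws, the stream goes through two full passes, so a short stream can keep supplying samples while a longer stream is still in its first pass.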

The training data 20 for training each decoding task 18 from the first set 38 correspond to D_target, and the training data 20 for training each decoding task 18 from the second set 40 correspond to D_other. Accordingly, the ratio defines a probability P_target that a sample of training data 20 is drawn from the first set 38. P_target directly determines the probability P_other that a sample of training data 20 is drawn from the second set 40 for each decoding task 18. The condition P_target = 1 - P_other is always satisfied.

When step S130 is carried out for the first time, the values of P_target and P_other must be initialized. The initialization can be, for example,
P_target = P_other = 0.5, or
P_target = 1 and P_other = 0, or
P_target = 0 and P_other = 1.
The initialization can be chosen arbitrarily.

On this basis, a random floating-point number f between zero and one is generated for each sample. If f is less than P_target, a sample of training data 20 is drawn from the first set 38, which relates to D_target, and is then added to the respective sub-mini-batch 34, 36. Otherwise, i.e. if f is greater than P_target, the sample of training data 20 is drawn from the second set 40, which relates to D_other.
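The sampling rule just described can be sketched as follows. The function and variable names are hypothetical, and the sketch simplifies one point: it draws samples with replacement, whereas the description typically uses samples that have not yet been used in the current epoch.

```python
import random
from typing import List, Sequence


def build_sub_mini_batch(
    target_set: Sequence[str],
    other_set: Sequence[str],
    size: int,
    p_target: float,
    rng: random.Random,
) -> List[str]:
    """Fill one sub-mini-batch by drawing each sample from D_target or D_other."""
    batch: List[str] = []
    for _ in range(size):
        f = rng.random()                      # random float in [0, 1)
        source = target_set if f < p_target else other_set
        batch.append(rng.choice(source))      # simplification: drawn with replacement
    return batch


target_set = ["t1", "t2", "t3"]
other_set = ["o1", "o2"]
mixed = build_sub_mini_batch(target_set, other_set, size=8, p_target=0.5, rng=random.Random(0))
```

With P_target = 1 every sample comes from the first set, and with P_target = 0 every sample comes from the second set, which matches the two extreme initializations listed above.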

Step S140 relates to training the neural network 10 using the generated mini-batches 32. The training is carried out individually for each of the mini-batches 32 using synchronous backpropagation. An error is thus backpropagated simultaneously through the respective decoder 16 of the respective decoding task 18 and, in addition, through the encoder 12. That is, in a backward pass the decoder losses are backpropagated to the respective decoder 16 and additionally to the encoder 12 at the same time. Since each mini-batch 32 contains training data 20 for both decoding tasks 18, the weights of the encoder 12 are updated by gradient descent using the errors of both decoders 16 simultaneously. Accordingly, the entire neural network 10 is updated after training with each of the mini-batches 32.
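The joint update of step S140 can be made concrete with a deliberately tiny model. In this sketch the encoder and each decoder are a single scalar weight and the loss is a squared error; both are assumptions chosen so that the gradients can be written by hand. A real implementation would rely on the automatic differentiation of a deep learning framework.

```python
from typing import Dict, List, Tuple

Sample = Tuple[str, float, float]  # (task name, input value, annotated target)


def train_step(weights: Dict[str, float], mini_batch: List[Sample], lr: float) -> Dict[str, float]:
    """One gradient-descent step over a mini-batch that mixes samples of both tasks."""
    grads = {name: 0.0 for name in weights}
    for task, x, target in mini_batch:
        feature = weights["encoder"] * x             # shared encoding
        error = weights[task] * feature - target     # prediction error of this task's decoder
        # Each sample updates only its own decoder ...
        grads[task] += 2.0 * error * feature
        # ... but every sample contributes to the shared encoder gradient.
        grads["encoder"] += 2.0 * error * weights[task] * x
    return {name: weights[name] - lr * grads[name] for name in weights}


weights = {"encoder": 1.0, "detection": 1.0, "segmentation": 1.0}
mini_batch = [("detection", 1.0, 2.0), ("segmentation", 2.0, 1.0)]
updated = train_step(weights, mini_batch, lr=0.1)
```

The encoder gradient is the sum of the contributions from both decoders, so a single step moves the encoder according to the errors of both tasks at once, while each decoder weight moves only according to its own task.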

Step S150 relates to determining a learning development of the neural network 10 as a function of a validation loss and/or a validation key performance indicator of the neural network 10. The validation loss is a quantity frequently used in deep learning to obtain a good estimate of how the neural network 10 is learning, e.g. its training speed. In particular, the validation loss indicates whether the neural network 10 is still learning or whether overfitting is setting in.

The learning development of the neural network 10 is determined independently for each of the decoding tasks 18. It is also determined independently for each of the application contexts, i.e. for training data 20 that correspond to the target application context and for training data 20 that do not. The learning development is thus determined for each decoding task 18 and for the training data 20 from the first and the second data stream 24, 26.

Determining the learning development of the neural network 10 involves applying a learning rate. The learning rate is a numerical value. Further details are explained below.

Step S160 relates to updating the combination scheme for combining the sub-mini-batches 34, 36 of training data 20 as a function of the learning development of the neural network 10. The step of updating the neural network 10 is carried out after each training sequence with a mini-batch 32, with, for example, the task losses from each sub-mini-batch 34, 36 being determined for each decoding task 18. The ratio for subsequent mini-batches 32 of training data 20 is thus updated as a function of the learning development of the neural network 10.

The ratio, i.e. P_target, increases when the neural network 10 learns positively, i.e. when L_target decreases, and decreases when it learns negatively, i.e. when L_target increases. At the same time, the neural network 10 should generalize well to D_other. L_target denotes a validation loss with respect to D_target, smoothed over a certain number of epochs, and L_other denotes a smoothed validation loss with respect to D_other.
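The description does not say how the validation loss is smoothed. As one possible reading, the sketch below uses a moving average over the last few epochs and takes the change of that average from one epoch to the next as the trend. Both choices, and the function names, are assumptions.

```python
from typing import Sequence


def smoothed_loss(history: Sequence[float], window: int) -> float:
    """Average of the last `window` validation losses (assumed smoothing method)."""
    if window < 1 or not history:
        raise ValueError("need a positive window and at least one loss value")
    recent = list(history[-window:])
    return sum(recent) / len(recent)


def loss_trend(history: Sequence[float], window: int) -> float:
    """Change of the smoothed loss from the previous epoch to the latest one."""
    if len(history) < 2:
        raise ValueError("need at least two epochs to compute a trend")
    return smoothed_loss(history, window) - smoothed_loss(history[:-1], window)


falling = loss_trend([4.0, 3.0, 2.0, 1.0], window=2)   # negative: the network is learning
rising = loss_trend([4.0, 2.0, 1.0, 3.0], window=2)    # positive: the loss is going up
```

A negative trend of L_target corresponds to the case in which P_target is increased, and a positive trend to the case in which it is decreased.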

Updating the combination scheme means updating the ratio, and it is carried out individually for each decoding task 18 as a function of the learning development of the neural network 10. The combination scheme is updated using a learning rate. The learning rate is a numerical value and is a measure of the change in the composition of the sub-mini-batches 34, 36. A higher learning rate generally enables faster convergence but increases the risk of overfitting to a particular type of training data 20. In this embodiment, the ratio for selecting the training data 20 from the sets 38, 40 is updated by multiplying a gradient of the validation loss by the learning rate. This product is then subtracted from the current ratio, e.g.

$P_{target, new} = P_{target, current} - \text{learning rate} \cdot \text{gradient}(L_{target})$
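The update rule can be written as a one-line function. The sketch below adds one thing that the description does not state: the result is clipped to the range from zero to one so that it remains a valid probability. The function names are hypothetical.

```python
def update_ratio(p_target: float, loss_gradient: float, learning_rate: float) -> float:
    """Apply P_target,new = P_target,current - learning_rate * gradient(L_target)."""
    new_p = p_target - learning_rate * loss_gradient
    # Assumption: keep the ratio a valid probability by clipping to [0, 1].
    return min(1.0, max(0.0, new_p))


def complementary_probability(p_target: float) -> float:
    """P_other follows from the condition P_target = 1 - P_other."""
    return 1.0 - p_target


p_when_learning = update_ratio(0.5, loss_gradient=-0.2, learning_rate=0.5)   # loss falls, ratio rises
p_when_degrading = update_ratio(0.5, loss_gradient=0.2, learning_rate=0.5)   # loss rises, ratio falls
```

A negative gradient of L_target raises P_target and a positive gradient lowers it, in line with the behaviour described for step S160.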

A possible evolution of the ratio, i.e. of the probabilities P_target and P_other, over several training epochs is shown by way of example in Fig. 5.

Step S170 relates to ending the training if the learning development indicates that the neural network 10 is no longer improving, in particular if it indicates that the neural network 10 is no longer improving with respect to at least one of the decoding tasks 18. The training is then stopped so that overfitting of the neural network 10 is avoided. The neural network 10 is considered to be no longer improving when the gradient of the validation loss is greater than zero both for training the neural network 10 with training data 20 from the first set 38 and for training it with training data 20 from the second set 40. Overfitting can be detected, for example, when the validation loss starts to rise instead of fall.
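The stopping condition of step S170 can be sketched as a simple check. The function names and the dictionary layout are hypothetical; the rule itself (both validation-loss gradients greater than zero, for at least one decoding task) follows the description.

```python
from typing import Dict, Tuple


def task_stagnates(grad_target: float, grad_other: float) -> bool:
    """A task no longer improves when both validation-loss gradients are positive."""
    return grad_target > 0.0 and grad_other > 0.0


def should_stop(gradients: Dict[str, Tuple[float, float]]) -> bool:
    """Stop when at least one decoding task no longer improves."""
    return any(task_stagnates(g_target, g_other) for g_target, g_other in gradients.values())


still_learning = should_stop({"detection": (-0.1, 0.2), "segmentation": (-0.3, -0.1)})
overfitting = should_stop({"detection": (0.1, 0.2), "segmentation": (-0.3, -0.1)})
```

In the first call no task has both gradients above zero, so training would continue; in the second call the detection task does, so training would end.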

Otherwise, the training continues as discussed above.

List of reference symbols

10: neural network
12: encoder
14: input data
16: decoder
18: decoding task
20: training data
22: set of training data
24: first (detection) data stream
26: second (segmentation) data stream
28: output
32: mini-batch
34: sub-mini-batch
36: sub-mini-batch
38: set of training data that correspond to the target application context
40: set of training data that do not correspond to the target application context

Claims

Method for training a neural network (10), in particular a convolutional neural network for use in a driving support system of a vehicle, with an encoder (12) for encoding provided input data (14) and several decoders (16), each of which performs a decoding task (18), with annotated training data (20) as input data (14), each of the several decoding tasks (18) having a target application context, with the steps: providing several sets (22) of differently annotated training data (20) for training the several decoding tasks (18); arranging the several sets (22) of differently annotated training data (20) in data streams (24, 26), wherein, for training each of the several decoding tasks (18), a first set (38) of training data (20) that correspond to a target application context of the neural network (10) and additionally a second set (40) of training data (20) that do not correspond to the target application context of the neural network (10) are provided for a data stream (24, 26); generating mini-batches (32) of training data (20) comprising several sub-mini-batches (34, 36) of training data (20) for training each of the decoding tasks (18); generating the sub-mini-batches (34, 36) of training data (20) for training each of the decoding tasks (18) on the basis of the set (38) in combination with the set (40), wherein the training data (20) for the jointly trained decoding tasks (18) are combined according to a combination scheme in order to provide the respective sub-mini-batch (34, 36); training the neural network (10) using the generated mini-batches (32); and updating the combination scheme for combining the sub-mini-batches (34, 36) of training data (20) as a function of a learning development of the neural network (10).

Method according to the preceding Claim 1, characterized in that combining the training data (20) of the data stream (24, 26) for the jointly trained decoding tasks (18) according to a combination scheme for providing the respective sub-mini-batch (34, 36) comprises an individual combination scheme for each of the jointly trained decoding tasks (18); and updating the combination scheme for combining the sub-mini-batches (34, 36) of training data (20) as a function of a learning development of the neural network (10) comprises individually updating the combination scheme as a function of the learning development of the neural network (10) for each of the jointly trained decoding tasks (18).

Method according to one of the preceding Claims 1 or 2, characterized in that combining the training data (20) of the data stream (24, 26) for the jointly trained decoding tasks (18) according to a combination scheme for providing the respective sub-mini-batch (34, 36) comprises selecting the training data (20) from the first and the second set (38, 40) according to a ratio; and updating the combination scheme for combining the sub-mini-batches (34, 36) of training data (20) as a function of a learning development of the neural network (10) comprises updating the ratio.

Method according to the preceding Claim 3, characterized in that selecting the training data (20) from the first and the second set (38, 40) according to a ratio comprises generating a random number, in particular between zero and one, for each sample of training data (20) to be added to the respective sub-mini-batch (34, 36), and sampling the training data (20) to be added from the first or the second set (38, 40) depending on whether or not the random number is greater than the ratio.
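The ratio-based selection described in this claim can be illustrated with a minimal Python sketch. The claim does not state which set is chosen when the random number exceeds the ratio, so the direction used here (the ratio is read as the probability of drawing from the first, target-context set) is an assumption, as are the function name `build_sub_mini_batch`, drawing with replacement, and all parameter names.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def build_sub_mini_batch(
    first_set: Sequence[T],
    second_set: Sequence[T],
    ratio: float,
    size: int,
    rng: random.Random,
) -> List[T]:
    """Fill one sub-mini-batch by drawing each sample from one of two sets.

    Assumption: `ratio` is the probability of drawing from `first_set`
    (target application context); the claim leaves the direction open.
    """
    batch: List[T] = []
    for _ in range(size):
        r = rng.random()  # one uniform random number in [0, 1) per sample
        # Random number greater than the ratio: take the second set.
        source = second_set if r > ratio else first_set
        # Samples are drawn with replacement (an assumption of this sketch).
        batch.append(rng.choice(source))
    return batch
```

With `ratio=1.0` every sample comes from the first set, because `random.Random.random()` never returns a value greater than or equal to 1.0; lowering the ratio shifts the mix toward the second set.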

Method according to one of the preceding claims, characterized in that the updating of the combination scheme for combining the sub-mini-batches (34, 36) of training data (20) as a function of a learning development of the neural network (10) comprises applying a learning rate.
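The claim does not specify how the learning rate enters the update. One possible reading, sketched below in Python under stated assumptions, treats the combination scheme as a single mixing ratio and moves it by a step proportional to the learning rate. The function `update_ratio`, the improvement signals it receives, and the clipping bounds are all hypothetical and are not taken from the claims.

```python
def update_ratio(
    ratio: float,
    target_ctx_improvement: float,
    other_ctx_improvement: float,
    learning_rate: float,
    lower: float = 0.0,
    upper: float = 1.0,
) -> float:
    """Shift the mixing ratio toward the context that is improving less.

    Assumptions of this sketch: `ratio` is the share of target-context
    samples, and each improvement value is the recent decrease in
    validation loss for that context (larger means faster learning).
    """
    # Positive step when the target context lags behind the other context.
    step = learning_rate * (other_ctx_improvement - target_ctx_improvement)
    # Keep the result a valid ratio.
    return min(upper, max(lower, ratio + step))
```

A small learning rate makes the ratio change gradually between mini-batches; a large one lets it react quickly but may cause it to oscillate.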

Method according to one of the preceding claims, characterized in that the combining of the training data (20) of the first and second sets (38, 40) for the jointly trained decoding tasks (18) according to a combination scheme for providing the respective sub-mini-batch (34, 36) comprises reusing training data (20) of the first and/or the second data stream (24, 26) of training data (20) for the respective decoding task (18).

Method according to one of the preceding claims, characterized in that the arranging of the plurality of sets (22) of differently annotated training data (20) in data streams (24, 26), wherein a first set (38) of training data (20) which correspond to the target application context of the neural network (10) and additionally a second set (40) of training data (20) which do not correspond to the target application context of the neural network (10) are provided for training each of the plurality of decoding tasks (18), comprises combining all sets (22) of training data (20) which correspond to a specific decoding task (18) and to the respective target application context in the respective first set (38), and combining all sets (22) of training data (20) which correspond to the specific decoding task (18) and do not correspond to the respective target application context in the respective second set (40).

Method according to one of the preceding claims, characterized in that the training of the neural network (10) using the generated mini-batches (32) comprises training the neural network (10) in a synchronous back-propagation manner; and the updating of the combination scheme for the sub-mini-batches (34, 36) of training data (20) as a function of a learning development of the neural network (10) comprises a dynamic update of the combination scheme for subsequent mini-batches (32) of training data (20) as a function of a learning development of the neural network (10).

Method according to one of the preceding claims, characterized in that the generation of mini-batches (32) of training data (20) which comprise a plurality of sub-mini-batches (34, 36) of training data (20) for training each of the decoding tasks (18) comprises balancing the sub-mini-batches (34, 36) for each of the mini-batches (32) according to a balancing scheme.
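The balancing scheme itself is not defined in this claim. As one illustrative possibility, the Python sketch below splits a fixed mini-batch size across the decoding tasks in proportion to per-task weights, using largest-remainder rounding so that the sub-mini-batch sizes always sum to the mini-batch size. The function `balance_sizes`, the weights, and the task names in the usage note are hypothetical.

```python
import math
from typing import Dict


def balance_sizes(mini_batch_size: int, weights: Dict[str, float]) -> Dict[str, int]:
    """Split a mini-batch into per-task sub-mini-batch sizes.

    Assumption: balancing means proportional allocation by task weight;
    the claims only require that some balancing scheme is applied.
    """
    total = sum(weights.values())
    # Ideal (fractional) share of the mini-batch for each task.
    quotas = {task: mini_batch_size * w / total for task, w in weights.items()}
    sizes = {task: math.floor(q) for task, q in quotas.items()}
    # Hand the remaining slots to the tasks with the largest fractional parts.
    leftover = mini_batch_size - sum(sizes.values())
    by_fraction = sorted(quotas, key=lambda t: quotas[t] - sizes[t], reverse=True)
    for task in by_fraction[:leftover]:
        sizes[task] += 1
    return sizes
```

For example, `balance_sizes(8, {"segmentation": 1.0, "detection": 1.0})` gives each of the two tasks a sub-mini-batch of four samples.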

Method according to one of the preceding claims, characterized in that the method comprises a step of determining a learning development of the neural network (10) as a function of a validation loss and/or of a validation key performance indicator of the neural network (10).

Method according to the preceding Claim 10, characterized in that the determination of a learning development of the neural network (10) as a function of a validation loss and/or of a validation key performance indicator of the neural network (10) comprises the independent determination of the learning development of the neural network (10) for each of the decoding tasks (18).

Method according to one of the preceding Claims 10 or 11, characterized in that the determination of a learning development of the neural network (10) as a function of a validation loss and/or of a validation key performance indicator of the neural network (10) comprises the independent determination of the learning development of the neural network (10) for each of the application contexts.

Method according to one of the preceding Claims 10 to 12, characterized in that the determination of a learning development of the neural network (10) as a function of a validation loss and/or of a validation key performance indicator of the neural network (10) comprises the application of a learning rate.
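The determination of the learning development per decoding task and per application context can be sketched in Python as follows. The claims do not say how the learning rate is applied; this sketch assumes it acts as a smoothing factor on the validation loss, and it reports the change of the smoothed loss as the learning development. The class `LearningDevelopment`, its key layout, and the task and context names are hypothetical.

```python
from typing import Dict, Tuple

Key = Tuple[str, str]  # (decoding task, application context)


class LearningDevelopment:
    """Track a smoothed validation loss independently for each key."""

    def __init__(self, rate: float) -> None:
        # Assumption: the "learning rate" is used as a smoothing factor.
        self.rate = rate
        self.smoothed: Dict[Key, float] = {}
        self.trend: Dict[Key, float] = {}

    def update(self, key: Key, val_loss: float) -> float:
        """Record a new validation loss and return the trend for this key.

        A positive trend means the smoothed loss decreased (improvement).
        """
        previous = self.smoothed.get(key)
        if previous is None:
            # First observation: no trend can be computed yet.
            self.smoothed[key] = val_loss
            self.trend[key] = 0.0
        else:
            current = previous + self.rate * (val_loss - previous)
            self.trend[key] = previous - current
            self.smoothed[key] = current
        return self.trend[key]
```

Each `(task, context)` pair keeps its own state, so a stalled task in one context does not mask progress in another. A validation key performance indicator could be tracked the same way with the sign of the trend reversed.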

Method according to one of the preceding claims, characterized in that the method comprises a step of terminating the training if the learning development indicates that the neural network (10) is no longer improving, in particular if the learning development indicates that the neural network (10) is no longer improving with regard to at least one of the decoding tasks (18).
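The termination criterion can be illustrated with a minimal Python sketch in the style of patience-based early stopping. The claim does not define when the network counts as "no longer improving"; the `patience` window, the `min_delta` threshold, and the function names `has_stalled` and `should_stop` are assumptions of this sketch.

```python
from typing import Dict, Sequence


def has_stalled(val_losses: Sequence[float], patience: int, min_delta: float = 0.0) -> bool:
    """Return True if the last `patience` evaluations brought no improvement.

    Improvement means the best recent loss is lower than the best earlier
    loss by more than `min_delta` (both parameters are assumptions).
    """
    if len(val_losses) <= patience:
        return False  # not enough history to decide
    best_before = min(val_losses[:-patience])
    best_recent = min(val_losses[-patience:])
    return best_recent >= best_before - min_delta


def should_stop(val_losses_per_task: Dict[str, Sequence[float]], patience: int,
                min_delta: float = 0.0) -> bool:
    """Stop when at least one decoding task has stalled."""
    return any(
        has_stalled(losses, patience, min_delta)
        for losses in val_losses_per_task.values()
    )
```

Replacing `any` with `all` would give the stricter variant in which training continues until every decoding task has stopped improving.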