DE102021100940A1

DE102021100940A1 - Reconstructing the environment of an environment sensor system

Info

Publication number: DE102021100940A1
Application number: DE102021100940.5A
Authority: DE
Inventors: Arindam Das; Sankaralingam Madasamy; Senthil Kumar Yogamani
Original assignee: Connaught Electronics Ltd
Current assignee: Connaught Electronics Ltd
Priority date: 2021-01-19
Filing date: 2021-01-19
Publication date: 2022-07-21

Abstract

A computer-implemented method for training an artificial neural network (10') to reconstruct a generic state of an environment of an environment sensor system (4) includes providing an artificial neural network (10') with an encoder module (6') and a decoder module (7'). For a plurality of consecutive training frames, respective sensor data sets representing the environment are obtained from the environment sensor system (4). For each of the training frames, a first feature map (11a, 11b) is generated by applying the encoder module (6') to the respective sensor data set (8a, 8b). A reconstructed data set (13) is generated by the decoder module (7') depending on the first feature maps (11a, 11b). The artificial neural network (10') is trained in an unsupervised manner, depending on the reconstructed data set (13), to remove a dynamic object (14) from an input data set representing the environment.

Description

The present invention is directed to a computer-implemented method for training an artificial neural network to reconstruct a generic state of an environment of an environment sensor system. The invention also relates to a method for at least partially automatically guiding a motor vehicle, to an electronic vehicle guidance system and to a computer program product.

Previously generated and stored trajectories can be used for automatic or semi-automatic driving functions of a motor vehicle. For example, in visual simultaneous localization and mapping, VSLAM, trajectories are created and stored; afterwards, when the same situation or environment is recognized, the stored trajectory can be used for at least partially automatic control of the motor vehicle. Possible use cases include automatic or semi-automatic parking applications.

VSLAM as such is a known technique for building a map of an environment and identifying the physical location of the vehicle. During a training phase, the VSLAM algorithm can extract key points or key features from images captured by one or more cameras of the motor vehicle and/or from other environment sensor data, such as lidar point clouds, while simultaneously creating a map of the sensed environment. If it is determined during a playback phase that the vehicle is back in the same environment, the VSLAM algorithm tries to recognize the previously extracted key points. A trajectory generated during the training phase can then be played back.

A problem with this and similar approaches based on previously trained trajectories is that the key points extracted during the playback phase must be identical to the key points extracted during the training phase. Over time, however, it may become necessary to change the initially trained trajectories, for example when new objects appear in the environment or previously existing objects are no longer present. With current approaches, the VSLAM algorithm may fail during the playback phase in such cases.

In addition, similar problems can occur when the appearance of the environment is affected by weather conditions that differ from those during the training phase. In this case, too, it may not be possible to determine the key points with sufficient reliability. In other words, the VSLAM approach and similar approaches only work reliably if the environment of the motor vehicle remains exactly the same.

It is therefore an object of the present invention to provide an improved concept for processing data from an environment sensor system that enables more reliable detection of relevant features in the environment of the environment sensor system.

This object is achieved by the respective subject matter of the independent claims. Further implementations and preferred embodiments are the subject matter of the dependent claims.

The improved concept is based on the idea of using an artificial neural network to reconstruct a generic state of the environment of an environment sensor system.

According to the improved concept, a computer-implemented method for training an artificial neural network to reconstruct a generic state of an environment of an environment sensor system is provided. For this purpose, an artificial neural network, in particular an untrained or partially trained artificial neural network, is provided, for example stored on a storage unit. The artificial neural network is designed as a variational autoencoder that includes an encoder module and a decoder module. For each of several consecutive training frames, a sensor data set representing the environment is received from the environment sensor system, in particular by a computing unit. For each of the training frames, a first feature representation is generated, in particular by the computing unit, by applying the encoder module to the sensor data set of the respective training frame. A reconstructed data set is generated, in particular by the computing unit, by the decoder module depending on the first feature representations, in particular depending on all first feature representations of the training frames. The artificial neural network is trained in an unsupervised manner, in particular by the computing unit, depending on the reconstructed data set, to remove a dynamic object from an input data set that represents the environment.
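
The data flow of the method can be sketched as follows. This is only a structural illustration: the function names `reconstruct`, `toy_encode` and `toy_decode` are hypothetical, and the toy encoder and decoder are simple stand-ins rather than the neural network modules of the described method.

```python
from typing import Callable, List, Sequence

Vector = List[float]

def reconstruct(frames: Sequence[Vector],
                encode: Callable[[Vector], Vector],
                decode: Callable[[Sequence[Vector]], Vector]) -> Vector:
    """Apply the same encoder to every frame, then decode all codes jointly."""
    # One first feature representation per training frame.
    first_features = [encode(frame) for frame in frames]
    # A single reconstructed data set that depends on all of them.
    return decode(first_features)

# Hypothetical stand-ins for the encoder and decoder modules (assumptions).
def toy_encode(frame: Vector) -> Vector:
    # Halve the dimension by averaging neighbouring values.
    return [(frame[i] + frame[i + 1]) / 2.0 for i in range(0, len(frame) - 1, 2)]

def toy_decode(features: Sequence[Vector]) -> Vector:
    # Fuse the per-frame codes by averaging, then upsample to the input size.
    n = len(features)
    fused = [sum(f[i] for f in features) / n for i in range(len(features[0]))]
    return [value for value in fused for _ in range(2)]

frames = [
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [4.0, 4.0, 1.0, 1.0],  # a transient object disturbs the first two values
]
reconstruction = reconstruct(frames, toy_encode, toy_decode)
```

In an actual training setup, the reconstruction would feed a loss function whose gradient updates the encoder and decoder parameters; that step is omitted here.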

After the training of the artificial neural network has been completed or has reached a target state, the trained artificial neural network can be stored on a storage unit, for example as part of a corresponding inference model.

The generic state of the environment of the environment sensor system can be regarded as a version of the environment without dynamic objects. The generic state may correspond to an actual prior state of the environment or to a hypothetical prior state of the environment. In particular, the generic state can be regarded as a state of the environment in which all, or at least approximately all, dynamic objects are removed or absent.

A dynamic object can be regarded as an object that is not static in the environment. In other words, the dynamic object moves or changes its position and/or orientation within a static environment coordinate system. Put differently, when the environment sensor system is not moving with respect to the environment, a dynamic object has a time-dependent position and/or orientation in a sensor coordinate system that is rigidly connected to the environment sensor system.

The environment of the environment sensor system corresponds to a given scene in the world, which is defined by a pose, that is, position and orientation, of the environment sensor system in the world, that is, in the environment coordinate system, and, where applicable, by the field of view of the environment sensor system. Over the course of the consecutive training frames, the pose of the environment sensor system does not change significantly or, in other words, changes in the pose are negligible.

An environment sensor system can be understood as a sensor system that is able to generate sensor data or sensor signals that depict or represent the environment of the environment sensor system. For example, cameras, lidar systems, radar systems and ultrasonic sensor systems can be regarded as environment sensor systems.

According to the improved concept, the environment sensor system preferably includes one or more cameras and/or one or more lidar systems. Consequently, here and in the following, a sensor data set can include one or more camera images and/or one or more lidar images or lidar point clouds. For example, the training frames can correspond to camera frames, video frames or lidar scan frames.

In particular, a sensor data set can be generated by one or more sensors of the environment sensor system. In other words, a sensor data set can correspond to a fused sensor data set.

The encoder module and the decoder module can themselves be regarded as artificial neural networks or as sub-networks of the artificial neural network. In addition to the encoder module and the decoder module, the artificial neural network can also include one or more further sub-networks or modules.

In general, the encoder module of an autoencoder is designed to generate a code that represents the data provided as input, the code having a lower dimension than the input data. The decoder module is designed to reconstruct the input data from the low-dimensional code. In this context, however, reconstruction does not necessarily mean that the input data is reproduced identically. In particular, the autoencoder can modify or correct the input data in a defined way. According to the improved concept, this modification can include the removal of the dynamic object. In contrast to a regular autoencoder, the encoder module of a variational autoencoder does not generate a single code but a probability distribution over codes. The decoder module of a variational autoencoder can sample a code from this probability distribution and attempt to reconstruct the input on the basis of the sampled code. In the context of the improved concept, the first feature representations can represent corresponding parts of the codes.
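
Sampling a code from the encoder's distribution might look as follows. The sketch assumes, as is common for variational autoencoders but not stated in the text, that the distribution is a diagonal Gaussian described by a mean and a log-variance per code dimension; the function name `sample_code` is hypothetical.

```python
import math
import random
from typing import List, Sequence

def sample_code(mean: Sequence[float], log_var: Sequence[float],
                rng: random.Random) -> List[float]:
    """Draw one code z from N(mean, diag(exp(log_var))).

    Uses z = mean + sigma * eps with eps ~ N(0, 1), so the randomness is
    kept separate from the parameters predicted by the encoder.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mean, log_var)]

rng = random.Random(0)
mean = [0.5, -1.0, 2.0]
log_var = [-50.0, -50.0, -50.0]  # almost zero variance: samples stay near the mean
z = sample_code(mean, log_var, rng)
```

With larger variances, repeated calls return different codes for the same input, which is what distinguishes this from the single deterministic code of a regular autoencoder.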

For the training of the artificial neural network, the encoder module can also be regarded as including several encoder sub-modules, one for each of the training frames. The sensor data sets of the respective training frames are provided to the corresponding encoder sub-modules independently of one another.

In particular, the encoder module can generate multiple feature representations for each of the training frames, including the first feature representation. For each of the training frames, each of the respective feature representations represents a specific feature in the environment or in the respective sensor data set. In particular, each encoder sub-module can generate the different feature representations for the corresponding training frame. The first feature representations of the different training frames correspond to, or represent, the same feature in the environment. The same applies analogously to all other feature representations. In particular, each feature representation for a given training frame represents a corresponding feature in the environment, and for each training frame a corresponding feature representation is generated that represents the same feature.

That the artificial neural network is trained in an unsupervised manner to remove the dynamic object from the input data set can be understood to mean that the artificial neural network is trained with the aim of being able to remove the dynamic object from the input data set. The input data set is not one of the sensor data sets corresponding to the training frames, but rather an input data set, in particular a hypothetical one, that can be fed to the artificial neural network after training and that represents the same environment as the sensor data sets of the training frames.

The unsupervised training can, for example, be carried out iteratively, with network parameters of the artificial neural network, in particular of the encoder module and/or the decoder module, being adjusted in each iteration or training epoch such that a predefined loss function is minimized. The network parameters can include, for example, corresponding weights or bias parameters. The loss function can be a standard loss function for training variational autoencoders, such as a cross-entropy loss function.
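
As a rough illustration of such a loss function, the following sketch combines a binary cross-entropy reconstruction term with a Kullback-Leibler term. The text only names a cross-entropy loss; the added KL term towards a standard normal prior is the usual variational autoencoder formulation and is an assumption here, as are all function names.

```python
import math
from typing import Sequence

def binary_cross_entropy(target: Sequence[float], predicted: Sequence[float],
                         eps: float = 1e-7) -> float:
    """Summed cross-entropy between target values and predictions in [0, 1]."""
    total = 0.0
    for t, p in zip(target, predicted):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total

def kl_to_standard_normal(mean: Sequence[float], log_var: Sequence[float]) -> float:
    """KL divergence of N(mean, diag(exp(log_var))) from N(0, I)."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mean, log_var))

def vae_loss(target: Sequence[float], predicted: Sequence[float],
             mean: Sequence[float], log_var: Sequence[float]) -> float:
    """Reconstruction term plus regularization term (assumed equal weighting)."""
    return binary_cross_entropy(target, predicted) + kl_to_standard_normal(mean, log_var)

good = vae_loss([1.0, 0.0], [0.9, 0.1], [0.0, 0.0], [0.0, 0.0])
bad = vae_loss([1.0, 0.0], [0.1, 0.9], [0.0, 0.0], [0.0, 0.0])
```

In each training iteration, an optimizer would adjust the weights and biases in the direction that lowers this value; the optimizer itself is outside the scope of the sketch.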

If the input data set, which represents the environment and includes the dynamic object, is fed to the artificial neural network, in particular to the encoder module, after the training has been completed, the input data set is processed by the encoder module and the decoder module to generate a corresponding reconstructed data set that does not contain the dynamic object.

Here and in the following, all method steps can be carried out by the computing unit, unless stated otherwise. The computing unit can, for example, be a computing unit of the environment sensor system or can be coupled to the environment sensor system. The environment sensor system can, for example, be mounted on a motor vehicle or be comprised by a motor vehicle. The computing unit can then be a computing unit of the motor vehicle and can, for example, be comprised by an electronic control unit, ECU, of the motor vehicle.

In addition to the trained artificial neural network or parts thereof, further information about the actual environment for which the training was carried out can be stored in order to generate the inference model. The further information can include, for example, a pose of the environment sensor system and/or of the motor vehicle, calibration data for the environment sensor system, in particular a pose of the environment sensor system with respect to the motor vehicle, a time stamp or other time information, and so on.
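
One possible way to bundle this further information with the trained network is a simple record type. The class name, the field names, the six-value pose layout and the example path below are all hypothetical and serve only to illustrate the kind of data listed above.

```python
from dataclasses import dataclass, asdict
from typing import Tuple

# Assumed pose layout: x, y, z, roll, pitch, yaw.
PoseTuple = Tuple[float, float, float, float, float, float]

@dataclass(frozen=True)
class InferenceModelRecord:
    """Hypothetical container for a trained network and its context."""
    weights_path: str               # location of the stored trained network
    sensor_pose_world: PoseTuple    # pose of the sensor system in the environment
    sensor_pose_vehicle: PoseTuple  # calibration: sensor pose relative to the vehicle
    timestamp_utc: str              # time information for the training frames

record = InferenceModelRecord(
    weights_path="models/example_environment.bin",  # illustrative path only
    sensor_pose_world=(12.0, 3.5, 0.0, 0.0, 0.0, 1.57),
    sensor_pose_vehicle=(1.2, 0.0, 0.9, 0.0, 0.0, 0.0),
    timestamp_utc="2021-01-19T08:00:00Z",
)
metadata = asdict(record)  # plain dictionary, convenient for serialization
```

Storing the pose alongside the weights would allow a later lookup of the model that matches the environment in which the vehicle currently is.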

The computer-implemented method for training the artificial neural network according to the improved concept uses the time-dependent information represented by the consecutive training frames to force the variational autoencoder to learn what the dynamic objects in the actual environment of the environment sensor system are and, consequently, how to reconstruct the generic state of the environment, that is, the environment without dynamic objects.

After the training of the artificial neural network has been completed in the described manner, the trained artificial neural network is able to provide a generic template of the environment when the vehicle or the environment sensor system is in the same environment again, regardless of whether the dynamic situation has changed, additional objects have appeared or previously present objects have disappeared.

The correspondingly reconstructed data set can therefore be used for subsequent trajectory planning algorithms, for example in the context of VSLAM. The reconstruction of the environment, or of the generic state of the environment, can be regarded as a pre-processing step that prepares the actual data received from the environment sensor system in a generic way in order to ensure reproducible results and a more reliable detection of key features in the environment. Previously trained trajectories can therefore also be used in later situations in which the weather conditions have changed or other dynamic objects affect the sensor data.

In other words, it can be ensured that changes in external or environmental conditions between the time at which a trajectory is initially generated and stored and the time at which the trajectory is played back or is to be used do not impair the reliability, or even prevent the use, of a corresponding trajectory-based function. For the same reasons, the frequency with which trajectories need to be retrained can be reduced.

In addition, using the variational autoencoder allows unsupervised training to be employed, which is particularly useful in situations where the typical objects are difficult to annotate for supervised training. The variational autoencoder is used to analyze static objects in the several consecutive training frames and to remove the dynamic objects by effectively treating them as part of the noise. Generally known architectures can be used for the design of the encoder module, or the encoder sub-modules, and of the decoder module. In particular, encoder modules or decoder modules based on convolutional neural networks, CNN, for example VGG networks, can be used.
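
The intuition of treating dynamic objects as noise across frames can be illustrated with a much simpler, classical technique: a per-position temporal median over frames taken from an approximately fixed pose. This is explicitly not the variational autoencoder of the described method, only a baseline that shows why content which changes from frame to frame can be separated from content that stays constant. The function name `temporal_median` and the toy data are assumptions.

```python
from statistics import median
from typing import List, Sequence

def temporal_median(frames: Sequence[Sequence[float]]) -> List[float]:
    """Per-position median over frames captured from an almost fixed pose."""
    return [median(values) for values in zip(*frames)]

background = [1.0, 2.0, 3.0, 4.0, 5.0]  # static scene, one value per position

frames = []
for position in range(5):
    frame = list(background)
    frame[position] = 9.0  # a moving object covers a different position each frame
    frames.append(frame)

static_estimate = temporal_median(frames)
```

Because the object occupies each position in only one of the five frames, the median at every position falls back to the static value. A learned model is needed where this simple assumption breaks down, for example when appearance changes with weather or lighting.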

According to several embodiments, the environment sensor system is an environment sensor system of a motor vehicle, and the environment of the environment sensor system is an environment of the motor vehicle.

According to several embodiments, a feature prioritization module of the artificial neural network is applied to an output of the encoder module, in particular to a joint output of all encoder submodules. By applying the feature prioritization module, a respective first deviation between the first feature representations of each pair is determined for each pair of the first feature representations. The first feature representations, in particular each of the first feature representations, are modified depending on the determined first deviations. The reconstructed data set is generated by applying the decoder module to an output of the feature prioritization module.

The pairs of the first feature representations comprise all possible pairs of the first feature representations, that is, N*(N-1)/2 pairs for N training frames or, equivalently, N first feature representations.
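The pair count can be checked with a minimal sketch that enumerates all unordered pairs of frame indices. The function name `frame_pairs` is hypothetical and introduced only for illustration; it is not part of the described method.

```python
from itertools import combinations


def frame_pairs(n_frames):
    """Return all unordered index pairs (i, j) with i < j for n_frames training frames."""
    return list(combinations(range(n_frames), 2))


# For N = 4 training frames there are 4 * 3 / 2 = 6 pairs.
pairs = frame_pairs(4)
assert len(pairs) == 4 * (4 - 1) // 2
```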

The feature prioritization module can be regarded as a part of the artificial neural network that is arranged between the encoder module and the decoder module. While conventional variational autoencoders feed the output of the encoder directly to the decoder, in the respective embodiments the feature prioritization module is arranged between them, at least for the training phase of the artificial neural network. In particular, the feature prioritization module is designed to reinforce certain features, or feature representations, relative to others. For this purpose, the respective first deviations and, where applicable, corresponding further deviations between pairs of further feature representations that correspond to the same features are calculated and evaluated in order to understand how certain features change over consecutive training frames.

In other words, the smaller the first deviation for a given pair of first feature representations, the less the respective feature has changed between the corresponding frames. A mean value of the first deviations can therefore be regarded as a measure of how dynamically the respective feature behaves over the consecutive training frames. This information can be used to reinforce the first feature representations or, in other words, to prioritize them, in particular relative to further feature representations. As a result, during the training of the artificial neural network, the importance or priority of static objects that are consistent across the consecutive training frames is emphasized, and objects that are inconsistent or dynamic are given a greatly reduced priority in the encoded features in the respective latent space. Dynamic objects can therefore be removed completely in the course of training.

The output of the encoder module to which the feature prioritization module is applied comprises all first feature representations and, where applicable, all further feature representations of each of the consecutive training frames. The output of the prioritization module comprises the first feature representations modified depending on the first deviations.

The feature prioritization module can be designed, for example, as a self-attention module or, in other words, as a module for applying a self-attention algorithm to the output of the encoder module. In particular, the self-attention module is designed to reinforce the features with the lowest average deviation in their respective feature representations. The self-attention algorithm can, for example, comprise a global average pooling (GAP) algorithm. The GAP algorithm can determine an average neuron value for each of the feature representations and then multiply this average neuron value with the neurons having the smallest distance.

According to various embodiments, a Euclidean distance between the first feature representations of the respective pair is calculated to determine the first deviation of the respective pair. In other words, the first deviation corresponds to a first Euclidean distance.
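The deviation computation can be sketched as follows, under the assumption that each feature representation is available as a flat list of numbers of equal length. The function names `euclidean_distance` and `pairwise_deviations` are hypothetical; the source does not prescribe a data layout or an implementation.

```python
import math
from itertools import combinations


def euclidean_distance(a, b):
    """Euclidean distance between two flattened feature representations."""
    if len(a) != len(b):
        raise ValueError("feature representations must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pairwise_deviations(representations):
    """Map each frame index pair (i, j), i < j, to the deviation of that pair.

    `representations` holds one representation of the same feature per training frame.
    """
    return {
        (i, j): euclidean_distance(representations[i], representations[j])
        for i, j in combinations(range(len(representations)), 2)
    }
```

A feature that barely changes across frames yields small values for all pairs, while a feature affected by a moving object yields larger ones.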

According to several embodiments, a second feature representation is generated for each of the training frames by applying the encoder module to the sensor data set of the respective training frame.

In particular, for each of the training frames, a plurality of feature representations, including the first and the second feature representation, is generated by applying the encoder module to the sensor data set of the respective training frame. For each of the feature representations of a given training frame, there is a corresponding feature representation in each of the remaining training frames, where corresponding feature representations represent the same features or types of features. In other words, all first feature representations correspond to the same first feature, all second feature representations correspond to the same second feature, and so on.

According to several embodiments, by applying the feature prioritization module to the output of the encoder module, a respective second deviation between the second feature representations of each pair is determined for each pair of the second feature representations, and the first feature representations and/or the second feature representations are modified depending on the determined first deviations and the determined second deviations.

In particular, the first and second feature representations and, where applicable, the further feature representations result from a single application of the encoder module to the sensor data set of the respective training frame.

According to several embodiments, by applying the feature prioritization module to the output of the encoder module, features corresponding to the first feature representations are prioritized higher than features corresponding to the second feature representations if a mean value of the first deviations is smaller than a mean value of the second deviations.

Analogously, the features corresponding to the first feature representations are prioritized lower than the features corresponding to the second feature representations if the mean value of the first deviations is greater than the mean value of the second deviations.

This principle can be extended analogously to all further feature representations that each represent the same features. In particular, the features corresponding to the feature representations with the lowest mean value of the respective deviations are assigned the highest priority.
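One way to turn mean deviations into priorities is sketched below. The source only states that the lowest mean deviation receives the highest priority; the softmax over negated mean deviations used here is an assumption chosen for illustration, and the names `mean_deviation` and `priority_weights` are hypothetical.

```python
import math


def mean_deviation(deviations):
    """Mean of the pairwise deviations of one feature across all frame pairs."""
    return sum(deviations) / len(deviations)


def priority_weights(deviations_per_feature):
    """Return one weight per feature; a lower mean deviation gives a higher weight.

    Assumption: weights are a softmax over the negated mean deviations.
    The source does not specify the weighting function.
    """
    means = [mean_deviation(d) for d in deviations_per_feature]
    smallest = min(means)  # shift for numerical stability
    scores = [math.exp(-(m - smallest)) for m in means]
    total = sum(scores)
    return [s / total for s in scores]


# Feature 0 is nearly static across frames, feature 1 changes strongly.
weights = priority_weights([[0.1, 0.2, 0.1], [2.0, 3.0, 2.5]])
assert weights[0] > weights[1]
```

Multiplying each feature representation by its weight would emphasize static content and suppress dynamic content before decoding, which matches the stated intent of the prioritization module.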

According to several embodiments, by applying the feature prioritization module to the output of the encoder module, the self-attention algorithm is applied to the output of the encoder module depending on the first deviations and the second deviations in order to modify the first feature representations and/or the second feature representations.

According to several embodiments, the self-attention algorithm comprises a GAP algorithm.

According to various embodiments, the Euclidean distance between the first feature representations of the respective pair is calculated to determine the first deviation for the respective pair. The same applies analogously to the second deviation for the pairs of the second feature representations and, where applicable, to further deviations for pairs of further feature representations.

According to several embodiments, a further artificial neural network, in particular a further untrained or partially trained artificial neural network, is provided, the further artificial neural network being designed as a variational autoencoder with a further encoder module and a further decoder module. For each of a plurality of consecutive training frames, a respective further sensor data set representing a further environment of the environment sensor system is received, in particular from the environment sensor system. For each of the training frames, a further first feature representation is generated by applying the further encoder module to the further sensor data set of the respective training frame. A further reconstructed data set is generated by the further decoder module depending on the further first feature representations. The further artificial neural network is trained in an unsupervised manner depending on the further reconstructed data set in order to remove a dynamic object in a further input data set representing the further environment.

All explanations relating to the artificial neural network and the training frames apply analogously to the further artificial neural network and the further training frames, respectively.

The further environment of the environment sensor system differs from the environment of the environment sensor system. In particular, the pose of the environment sensor system associated with the further environment differs from the pose of the environment sensor system associated with the environment.

Consequently, a further inference model can be generated on the basis of the trained further artificial neural network. Such implementations yield two differently trained artificial neural networks that are able to remove the dynamic objects for different poses of the environment sensor system. In the same way, additional further artificial neural networks can be trained for additional further environments.

According to various embodiments, the pose of the environment sensor system corresponding to the environment of the environment sensor system is received by the computing unit, and a further pose of the environment sensor system corresponding to the further environment of the environment sensor system is received by the computing unit. After the artificial neural network has been trained, an inference model comprising the pose, the encoder module and the decoder module of the trained artificial neural network is generated and stored. After the further artificial neural network has been trained, a further inference model comprising the further pose, the further encoder module and the further decoder module of the trained further artificial neural network is generated and stored.

The pose and the further pose can be calculated, for example, by the computing system based on the output of one or more position sensors of the environment sensor system or of the motor vehicle. The pose and the further pose comprise, in particular, the position and orientation of the environment sensor system and the further position and further orientation, respectively. The positions can be determined, for example, from a signal of a global navigation satellite system (GNSS) received by a GNSS receiver coupled to the environment sensor system. The orientation can be calculated from sensor readings of acceleration sensors and/or yaw rate sensors using an odometric calculation scheme.

For example, the inference model can comprise the trained artificial neural network in its entirety or the artificial neural network without the prioritization module. The same applies analogously to the further inference model and the further artificial neural network.

Analogously to the inference model and the further inference model, one or more additional further inference models can be generated for one or more additional further poses and environments. In this way, after the inference models have been generated, the computing unit can, during a replay phase, select the one of the inference models that best matches the current pose of the environment sensor system or of the motor vehicle, and reconstruct the generic state of the current environment in the described manner using the correspondingly trained encoder and decoder modules.

According to the improved concept, a method for training an artificial neural network to reconstruct a generic state of an environment of an environment sensor system is also provided. It comprises a computer-implemented method for training an artificial neural network to reconstruct a generic state of an environment of an environment sensor system according to the improved concept, as well as the step of generating the sensor data sets for the consecutive training frames by the environment sensor system.

According to the improved concept, a method for reconstructing a generic state of an environment sensor system is provided. A sensor data set representing a current environment of the environment sensor system is generated by the environment sensor system. A trained artificial neural network is applied to the sensor data set by a computing unit in order to remove a dynamic object in the sensor data set, the trained artificial neural network being designed as a variational autoencoder that includes an encoder module and a decoder module. To remove the dynamic object in the sensor data set, a first feature representation is generated by applying the encoder module to the sensor data set, and a reconstructed data set is generated by the decoder module depending on the first feature representation.

Consequently, the reconstructed data set does not include the dynamic object. The method for reconstructing a generic state according to the improved concept can be regarded as an inference phase or a replay phase for the trained artificial neural network, while the computer-implemented method for training the artificial neural network to reconstruct a generic state corresponds to the training phase for the artificial neural network.

In particular, a computer-implemented method for training an artificial neural network to reconstruct a generic state of an environment of an environment sensor system is carried out in order to generate the trained artificial neural network for the method for reconstructing a generic state of an environment of the environment sensor system.

In particular, the trained artificial neural network includes, at least in part, the artificial neural network of the inference model or the further artificial neural network of the further inference model.

According to various embodiments, a current pose of the environment sensor system is determined. The current pose is compared with the pose of the inference model and with the further pose of the further inference model. Depending on the respective results of the comparisons, the artificial neural network of the inference model or the further artificial neural network of the further inference model is selected as the trained artificial neural network.

In alternative implementations, depending on the respective results of the comparisons, either the encoder module and the decoder module of the artificial neural network of the inference model are selected, or the further encoder module and the further decoder module of the further artificial neural network of the further inference model are selected.

In other words, if the current pose differs from the pose of the inference model by a predefined tolerance or less, the encoder module and the decoder module of the inference model can be selected for the trained artificial neural network. If, on the other hand, the current pose differs from the further pose of the further inference model by the predefined tolerance or less, the further encoder module and the further decoder module of the further inference model can be selected.
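The tolerance-based selection can be sketched as follows. Several details are assumptions made for illustration and are not specified in the source: the pose is reduced to a planar position and a yaw angle, separate tolerances are used for position and orientation, and the closest model is preferred when several lie within tolerance. The names `Pose`, `InferenceModel`, `pose_difference` and `select_model` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Pose:
    x: float    # position in metres (assumed planar)
    y: float
    yaw: float  # orientation in radians


@dataclass(frozen=True)
class InferenceModel:
    name: str   # stands in for the stored encoder and decoder modules
    pose: Pose  # pose stored together with the trained modules


def pose_difference(a, b):
    """Return (position error, absolute yaw error wrapped to [0, pi])."""
    position_error = math.hypot(a.x - b.x, a.y - b.y)
    delta = a.yaw - b.yaw
    yaw_error = abs(math.atan2(math.sin(delta), math.cos(delta)))
    return position_error, yaw_error


def select_model(current, models, position_tol, yaw_tol):
    """Return the closest stored model within both tolerances, or None."""
    candidates = []
    for model in models:
        pos_err, yaw_err = pose_difference(current, model.pose)
        if pos_err <= position_tol and yaw_err <= yaw_tol:
            candidates.append((pos_err, yaw_err, model))
    if not candidates:
        return None
    return min(candidates, key=lambda c: (c[0], c[1]))[2]
```

Returning `None` when no stored pose is close enough leaves the fallback behaviour to the caller; the source does not describe what happens in that case.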

According to the improved concept, a method for guiding a motor vehicle at least partially automatically is also provided. The method comprises carrying out a method for reconstructing a generic state of an environment of an environment sensor system according to the improved concept, wherein the environment sensor system is mounted on the motor vehicle. A trajectory for the motor vehicle is planned on the basis of the reconstructed data set, in particular by the computing unit or by a further computing unit of the motor vehicle. The motor vehicle is guided at least partially automatically according to the planned trajectory.

For example, the trajectory can be planned on the basis of a VSLAM method. For example, the computing unit can select one of several previously trained and stored trajectories depending on the reconstructed data set in order to plan the trajectory.

The motor vehicle can be controlled at least partially automatically, for example, by an electronic vehicle guidance system of the motor vehicle.

An electronic vehicle guidance system can be understood as an electronic system that is configured to guide a vehicle fully automatically or fully autonomously, in particular without manual intervention or control by a driver or user of the vehicle. The vehicle automatically carries out all required functions, such as steering maneuvers, deceleration and/or acceleration maneuvers, as well as monitoring and recording the road traffic and the corresponding reactions. In particular, the electronic vehicle guidance system can implement a fully automatic or fully autonomous driving mode according to level 5 of the SAE J3016 classification. An electronic vehicle guidance system can also be implemented as an advanced driver assistance system (ADAS), which assists the driver in partially automatic or partially autonomous driving. In particular, the electronic vehicle guidance system can implement a partially automatic or partially autonomous driving mode according to levels 1 to 4 of the SAE J3016 classification. Here and in the following, SAE J3016 refers to the respective standard dated June 2018.

According to the improved concept, a computer system is also provided which is configured to carry out a computer-implemented method for training an artificial neural network to reconstruct a generic state of an environment according to the improved concept. The computer system includes a computing unit and a memory unit. The memory unit stores an artificial neural network designed as a variational autoencoder with an encoder module and a decoder module. The storage medium further stores, for each of a plurality of consecutive training frames, a corresponding data set representing an environment. The computing unit is configured to generate a first feature representation for each of the training frames by applying the encoder module to the sensor data set of the respective training frame, and to generate a reconstructed data set using the decoder module depending on the first feature representations. The computing unit is configured to train the artificial neural network in an unsupervised manner depending on the reconstructed data set in order to remove a dynamic object in an input data set representing the environment.

Further embodiments of the computer system follow directly from the various embodiments of the computer-implemented method for training an artificial neural network according to the improved concept and vice versa. In particular, the computer system can be configured to carry out a computer-implemented method according to the improved concept, or it carries out such a method.

According to the improved concept, a sensor arrangement comprising an environment sensor system and a computing unit is also provided. The sensor arrangement is configured to carry out a method for reconstructing a generic state of an environment of the environment sensor system according to the improved concept. In particular, the sensor arrangement can comprise a computer system according to the improved concept and the environment sensor system.

According to the improved concept, an electronic vehicle guidance system is provided. The vehicle guidance system includes a computing unit and a memory unit in which a trained artificial neural network is stored, and the vehicle guidance system comprises an environment sensor system. The environment sensor system is configured to generate a sensor data set that represents a current environment of the environment sensor system. The trained artificial neural network is designed as a variational autoencoder comprising an encoder module and a decoder module. The computing unit is configured to apply the trained artificial neural network to the sensor data set in order to remove a dynamic object in the sensor data set, wherein a first feature representation is generated by applying the encoder module to the sensor data set and a reconstructed data set is generated by the decoder module depending on the first feature representation. The computing unit is configured to plan a trajectory for the motor vehicle on the basis of the reconstructed data set and to generate one or more control signals for at least partially automatically guiding a motor vehicle according to the planned trajectory.

The electronic vehicle guidance system can, for example, comprise a sensor arrangement according to the improved concept, wherein the sensor arrangement comprises the computing unit and the environment sensor system of the electronic vehicle guidance system.

According to several embodiments of the vehicle guidance system, the vehicle guidance system comprises one or more actuator units for guiding the motor vehicle at least partially automatically according to the trajectory depending on the one or more control signals.

Further embodiments of the vehicle guidance system according to the improved concept follow directly from the various embodiments of the method for reconstructing a generic state of an environment according to the improved concept and from the various embodiments of the method for at least partially automatically guiding a motor vehicle according to the improved concept, respectively, and vice versa.

In particular, an electronic vehicle guidance system according to the improved concept can be configured or programmed to carry out a method for reconstructing a generic state of an environment or a method for at least partially automatically guiding a motor vehicle according to the improved concept, or it carries out such a method.

According to the improved concept, a first computer program comprising first instructions is provided. When the first computer program or the first instructions, respectively, are executed by a computer system, in particular by a computer system according to the improved concept, the first instructions cause the computer system to carry out a computer-implemented method for training an artificial neural network according to the improved concept.

According to the improved concept, a second computer program comprising second instructions is provided. When the second instructions or the second computer program, respectively, are executed by an electronic vehicle guidance system according to the improved concept, the second instructions cause the electronic vehicle guidance system to carry out a computer-implemented method for training an artificial neural network according to the improved concept, a method for reconstructing a generic state of an environment according to the improved concept, or a method for at least partially automatically guiding a motor vehicle according to the improved concept.

According to the improved concept, a computer-readable storage medium is also provided, which stores a first computer program and/or a second computer program according to the improved concept.

The first and the second computer program and the computer-readable storage medium according to the improved concept can each be referred to as computer program products comprising the first and/or the second instructions.

Further features of the invention are apparent from the claims, the figures and the description of the figures. The features and combinations of features mentioned above in the description, as well as the features and combinations of features mentioned below in the description of the figures and/or shown in the figures alone, can be covered by the improved concept not only in the respectively specified combination but also in other combinations. Embodiments of the improved concept are thus also covered and disclosed which are not explicitly shown or explained in the figures, but which arise from the explained implementations through separate combinations of features and can be generated from them. Embodiments and combinations of features that do not have all the features of an originally formulated claim can also be covered by the improved concept. Moreover, embodiments and combinations of features that go beyond or deviate from the combinations of features set out in the relations of the claims can be covered by the improved concept.

In the figures:

FIG. 1 shows a schematic representation of a motor vehicle with an exemplary embodiment of an electronic vehicle guidance system according to the improved concept;
FIG. 2 shows a schematic flow diagram of an exemplary embodiment of a method for reconstructing a generic state of an environment according to the improved concept;
FIG. 3 schematically shows the result of reconstructions obtained with exemplary embodiments of methods for reconstructing a generic state of an environment according to the improved concept;
FIG. 4 schematically shows an architecture of an artificial neural network that is trained according to an exemplary embodiment of a computer-implemented method for training an artificial neural network according to the improved concept;
FIG. 5 shows a schematic flow diagram of an exemplary embodiment of a computer-implemented method for training an artificial neural network according to the improved concept; and
FIG. 6 shows a schematic flow diagram of an exemplary embodiment of a method for at least partially automatically guiding a motor vehicle according to the improved concept.

FIG. 1 schematically shows a motor vehicle 1 with an exemplary embodiment of an electronic vehicle guidance system 2 according to the improved concept. The vehicle guidance system 2 includes a computing unit 3 and a memory unit 5, which stores a trained artificial neural network 10 as shown schematically in the lower part of FIG. 2. In addition, the vehicle guidance system 2 includes an environment sensor system 4, which can comprise one or more cameras and/or lidar systems, mounted at different positions on the vehicle 1 where applicable.

The vehicle guidance system 2 is configured to carry out a method for at least partially automatically guiding the motor vehicle 1 according to the improved concept. In addition, the vehicle guidance system 2 is configured to carry out a computer-implemented method for training the artificial neural network 10 according to the improved concept. The function of the vehicle guidance system 2 is explained in more detail below with reference to such methods, in particular with reference to FIGS. 2 to 6.

FIG. 2 shows a flow diagram of an exemplary embodiment of the method for at least partially automatically guiding the motor vehicle 1, together with a schematic representation of the trained artificial neural network 10.

The vehicle guidance system 2 can, for example, be configured to guide the motor vehicle 1 at least partially automatically depending on a trajectory-based function, for example using VSLAM, wherein a plurality of trajectories for different environments of the vehicle 1 were planned during a preceding training phase and stored in the memory unit 5. If the vehicle guidance system 2 determines that the vehicle 1 is again located in one of these environments, it can select the corresponding stored trajectory and control the motor vehicle 1 accordingly. In order to reliably recognize the environment of the motor vehicle 1, the method shown in FIG. 2 can be carried out.

In step S1 of the method, the environment sensor system 4 generates a sensor data set 8 that represents the current environment of the environment sensor system 4 or of the motor vehicle 1, respectively. The sensor data set 8 can, for example, correspond to a camera image, as shown schematically for two exemplary situations in the left column of FIG. 3. It can be seen that the sensor data set 8 depicts dynamic objects 14, for example cyclists in the upper row of FIG. 3 or a pedestrian in the lower row of FIG. 3.

The presence of the dynamic objects 14 can make it difficult or impossible to reliably recognize the current environment based on certain predefined key features in the environment. The vehicle guidance system 2 therefore carries out an exemplary embodiment of the method for reconstructing a generic state of an environment of the environment sensor system 4 according to the improved concept in step S4 and, where applicable, in the optional steps S2 and S3.

In step S4, the computing unit 3 applies the trained artificial neural network 10 to the sensor data set 8 in order to remove the dynamic objects 14 from the sensor data set 8. For this purpose, an encoder module 6 followed by a decoder module 7 of the trained artificial neural network 10, which is implemented in particular as a variational autoencoder, is applied to the input sensor data set 8 in order to generate a reconstructed data set 9. The reconstructed data set 9 corresponds to the sensor data set 8 without the dynamic objects 14, as shown schematically in the right column of FIG. 3. More precisely, the encoder module 6 generates, based on the sensor data set 8, an output that includes a plurality of feature representations, each corresponding to different features in the sensor data set 8 or in the environment, respectively. The decoder module 7 uses the output of the encoder module 6 as input to generate the reconstructed data set 9.

In order to be able to remove the dynamic objects 14 from the sensor data set 8, the artificial neural network 10 was trained in advance, in particular according to a computer-implemented method for training an artificial neural network according to the improved concept. Such a method is explained in more detail below with reference to FIGS. 4 to 6.

In step S5, the computing unit 3 plans a trajectory for the motor vehicle 1 on the basis of the reconstructed data set 9 and generates one or more control signals in order to guide the motor vehicle 1 at least partially automatically according to the planned trajectory. One or more actuator units (not shown) of the motor vehicle 1 are controlled according to the one or more control signals.

In particular, the computing unit 3 can extract one or more key points in the environment from the reconstructed data set 9 and match the extracted key points against previously identified key points in order to be able to replay a suitable trajectory. With the improved concept, the key point extraction in step S5 is based on the reconstructed data set 9 without the dynamic objects 14 rather than directly on the sensor data set 8. The dynamic objects 14 therefore do not interfere with the key point extraction, which leads to more reliable automatic or partially automatic driving.
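The matching of extracted key points against previously identified key points can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: key points are represented by plain numeric descriptor tuples, matching uses a Euclidean nearest-neighbor test with a hypothetical threshold max_dist, and the stored trajectory with the most matches is selected. The description does not prescribe any of these choices.

```python
import math


def match_count(extracted, stored, max_dist=0.25):
    """Count extracted descriptors that have a stored descriptor within max_dist."""
    count = 0
    for descriptor in extracted:
        nearest = min((math.dist(descriptor, s) for s in stored), default=math.inf)
        if nearest <= max_dist:
            count += 1
    return count


def select_trajectory(extracted, stored_maps, min_matches=2):
    """Return the id of the stored trajectory whose key points match best, else None."""
    best_id, best_count = None, 0
    for trajectory_id, descriptors in stored_maps.items():
        count = match_count(extracted, descriptors)
        if count > best_count:
            best_id, best_count = trajectory_id, count
    return best_id if best_count >= min_matches else None


# Hypothetical descriptors stored during an earlier training phase.
stored_maps = {
    "home": [(0.0, 0.0), (1.0, 1.0), (2.0, 0.5)],
    "office": [(5.0, 5.0), (6.0, 5.0)],
}
# Descriptors extracted from the reconstructed data set (dynamic objects removed).
extracted = [(0.05, 0.0), (1.0, 1.1), (9.0, 9.0)]
chosen = select_trajectory(extracted, stored_maps)
```

Requiring a minimum number of matches is one simple way to avoid selecting a stored trajectory when the current environment has not been seen before.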

For the reconstruction in step S4, the computing unit 3 can select a suitable trained artificial neural network 10, or its encoder module 6 and decoder module 7, respectively, from a plurality of previously stored inference models. The different inference models each correspond to different poses of the motor vehicle 1 or of the environment sensor system 4, respectively, and thus to different environments. In the optional step S2, the computing unit 3 can therefore determine a current pose of the motor vehicle 1 or of the environment sensor system 4, respectively. For this purpose, the computing unit 3 can use sensor outputs from yaw rate sensors, acceleration sensors and so forth to carry out a corresponding pose estimation, for example based on odometric models.
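As an illustration of a pose estimation based on an odometric model, the following sketch performs simple planar dead reckoning from yaw rate and acceleration samples. It is a sketch under assumptions that the description does not state: the acceleration is taken to be purely longitudinal, the motion is planar, the sampling interval dt is fixed, and explicit Euler integration is used. The function name dead_reckon is hypothetical.

```python
import math


def dead_reckon(pose, speed, samples, dt):
    """Integrate (yaw_rate, longitudinal_accel) samples into a new pose.

    pose is (x, y, yaw) in metres and radians, speed is in m/s,
    yaw_rate is in rad/s, acceleration is in m/s^2, dt is in seconds.
    Returns the updated pose and the updated speed.
    """
    x, y, yaw = pose
    for yaw_rate, accel in samples:
        speed += accel * dt               # acceleration sensor -> speed
        yaw += yaw_rate * dt              # yaw rate sensor -> heading
        x += speed * math.cos(yaw) * dt   # advance along the current heading
        y += speed * math.sin(yaw) * dt
    return (x, y, yaw), speed


# Driving straight ahead at 2 m/s for 1 s (10 samples at 0.1 s).
pose, speed = dead_reckon((0.0, 0.0, 0.0), 2.0, [(0.0, 0.0)] * 10, 0.1)
```

Dead reckoning of this kind accumulates drift over time, so in practice it would typically be combined with other localization sources; the sketch only shows how the named sensor outputs can feed a pose estimate.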

In the optional step S3, the computing unit 3 can select the best-matching inference model stored on the memory unit 5, depending on the current pose determined in step S2. For example, each inference model stored on the memory unit 5 contains a corresponding pose as well as the corresponding encoder module 6 and the corresponding decoder module 7.
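The selection in step S3 can be sketched as a nearest-pose lookup. The `InferenceModel` record, the function name `select_inference_model` and the plain Euclidean distance on the pose tuple are assumptions for illustration; the document only states that the best-matching stored model is chosen depending on the current pose.

```python
import math
from dataclasses import dataclass


@dataclass
class InferenceModel:
    """Hypothetical record for one stored inference model."""
    pose: tuple      # pose recorded during training, e.g. (x, y, heading)
    encoder: object  # trained encoder module
    decoder: object  # trained decoder module


def select_inference_model(models, current_pose):
    """Return the stored model whose training pose is closest to current_pose."""
    # Assumption: all pose components are comparable on a common scale.
    return min(models, key=lambda m: math.dist(m.pose, current_pose))
```

In practice, positional and angular components would probably need separate weighting before they are combined into one distance.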

FIG. 4 shows an exemplary representation of an artificial neural network 10' to be trained by a computer-implemented method according to the improved concept. An exemplary flow chart of such a computer-implemented method is shown in FIG. 5. In steps T1, T1', T1'', respective poses of the motor vehicle 1 or of the environment sensor system 4 are determined at different times or in different training runs. In step T2, a first inference model is generated by carrying out a computer-implemented method for training an artificial neural network 10' according to the improved concept for the first training run. The same is done for a second inference model and a third inference model in steps T2' and T2'', respectively. In step T3, the inference models obtained and the corresponding poses are stored in the memory unit 5.

For each training run, several training sensor data sets 8a, 8b are generated by the environment sensor system 4 for consecutive training frames and provided to the computing unit 3. The artificial neural network 10' to be trained is designed as a variational autoencoder, which contains an encoder module 6' to be trained and a decoder module 7' to be trained. The encoder module 6' can be regarded as consisting of a plurality of encoder sub-modules 6a, 6b, one for each training frame. The sensor data set 8a, 8b of each training frame is fed as input to the corresponding encoder sub-module 6a, 6b. The encoder sub-modules 6a, 6b then generate a first feature representation 11a, 11b and a plurality of further feature representations (not shown) for each of the training frames.

The artificial neural network 10' to be trained may also include a feature prioritization module 12, which is designed to calculate a respective Euclidean distance between all pairs of the first feature representations 11a, 11b and a corresponding mean value of these Euclidean distances for the first feature representations 11a, 11b. In the same way, the feature prioritization module 12 calculates a corresponding mean Euclidean distance for each group of corresponding further feature representations. In particular, the feature prioritization module 12 calculates pairwise Euclidean distances for all pairs of second feature representations generated by the different encoder sub-modules 6a, 6b, for all pairs of third feature representations, and so on. A corresponding mean Euclidean distance is determined for each group of feature representations. The feature prioritization module 12 amplifies or prioritizes the features corresponding to the respective feature representations according to their respective mean Euclidean distance. In particular, the smaller the mean Euclidean distance of a particular group of feature representations, the more the corresponding features are amplified or prioritized.
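The prioritization rule described above can be sketched as follows. Each group holds the k-th feature representation of every training frame, flattened to a list of numbers. The function names and the softmax over negative mean distances are assumptions for illustration; the document only states that a smaller mean Euclidean distance leads to stronger prioritization.

```python
import math
from itertools import combinations


def mean_pairwise_distance(feature_maps):
    """Mean Euclidean distance over all pairs of per-frame feature maps.

    feature_maps: list of flattened feature maps, one per training frame
    (at least two frames are required).
    """
    pairs = list(combinations(feature_maps, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def prioritization_weights(groups):
    """Return one weight per feature group; smaller mean distance gives a larger weight.

    groups[k] holds the k-th feature map of every training frame.
    Assumption: the weights are a softmax over the negative mean distances.
    """
    means = [mean_pairwise_distance(g) for g in groups]
    exps = [math.exp(-m) for m in means]
    total = sum(exps)
    return [e / total for e in exps]
```

A group whose feature maps are identical across frames (a static structure) receives the largest weight, while a group that changes strongly between frames is weighted down.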

In this way, the importance of static objects that are consistent across the consecutive training frames is increased, while objects that are inconsistent or dynamic across the consecutive training frames receive a much lower priority in the encoded features in latent space. Dynamic objects are therefore removed completely once the encoder sub-modules 6a, 6b have received enough input over several training epochs to build up confidence regarding the static objects.

In particular, the feature prioritization module 12 may be designed to apply a self-attention algorithm to the output of the encoder module 6'. For this purpose, standard techniques such as global average pooling can be applied to the neurons whose mean Euclidean distance lies below a certain threshold.
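One way to read this step is as a GAP-based channel gating that is applied only to channels below the distance threshold. The sketch below follows that reading. The function names, the sigmoid gain and the decision to leave channels at or above the threshold unchanged are assumptions; the document does not define the attention mechanism in this detail.

```python
import math


def global_average_pool(channel):
    """Mean over all spatial positions of a 2D channel (list of rows)."""
    values = [v for row in channel for v in row]
    return sum(values) / len(values)


def gate_channels(channels, mean_distances, threshold):
    """Amplify channels whose mean Euclidean distance is below the threshold.

    Assumption: the gain is 1 + sigmoid(GAP(channel)); channels at or above
    the threshold are passed through unchanged.
    """
    gated = []
    for channel, dist in zip(channels, mean_distances):
        if dist < threshold:
            gain = 1.0 + 1.0 / (1.0 + math.exp(-global_average_pool(channel)))
        else:
            gain = 1.0
        gated.append([[v * gain for v in row] for row in channel])
    return gated
```

With this gating, channels that are consistent across frames are amplified before decoding, so the decoder relies more on static scene content.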

The decoder module 7' is applied to an output of the feature prioritization module 12 and generates a reconstructed data set 13 based on it. The artificial neural network 10' is trained by the computing unit 3 in an unsupervised manner over several consecutive training epochs, using a standard loss function, for example a cross-entropy loss function. After convergence, the resulting trained encoder module 6' and decoder module 7' can be stored in the memory unit 5 together with the corresponding pose of the motor vehicle 1 or of the environment sensor system 4. In the same way, the described steps are repeated for the different poses corresponding to the different training runs in T1, T1', T1''.

FIG. 6 shows a flow chart of a further embodiment of the method for at least partially automatic driving of the motor vehicle 1 according to the improved concept.

In step R1, the method is triggered manually or based on a position of the motor vehicle 1, which is obtained, for example, via a GNSS receiver of the motor vehicle 1. For example, the motor vehicle 1 can automatically enter a parking area when the GNSS receiver identifies the location or when a user triggers the application accordingly. In step R2, the computing unit 3 determines whether or not training is to be performed. For example, the computing unit 3 can be prompted manually by a user to perform the training, or it can start the training automatically based on predefined conditions. If the training is to be performed, the computing unit 3 and the vehicle guidance system 2 carry out, in step R3, a computer-implemented method for training the artificial neural network 10' according to the improved concept, as described with reference to FIG. 4 and FIG. 5.

If there is no explicit trigger for performing a training, the replay phase for automatic driving is started in step R4. In step R5, the computing unit 3 determines whether a suitable inference model is already stored on the memory unit 5. If this is the case, the computing unit 3 carries out a method for reconstructing a generic state of the environment according to the improved concept, as described above with reference to FIG. 2 and FIG. 3. A trajectory is then planned, and the motor vehicle 1 is driven automatically or partially automatically as described.

Optionally, a local adaptation can be carried out in step R7 in order to improve the existing inference model. In this case, the training is triggered again in step R3. Likewise, if it is determined in step R5 that no suitable inference model is available, a training can be proposed in step R8 and carried out in step R3.
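The decision flow of steps R1 to R8 can be summarized in a short sketch. The function name `replay_or_train`, its boolean parameters and the label `reconstruct_plan_drive` for the unnumbered reconstruction and driving stage are assumptions for illustration. The sketch also assumes that a training proposed in step R8 is accepted.

```python
def replay_or_train(training_requested, model_available, local_adaptation=False):
    """Return the sequence of steps visited after the trigger in step R1."""
    steps = ["R1", "R2"]
    if training_requested:
        # Explicit trigger for training: train directly in step R3.
        return steps + ["R3"]
    # No explicit trigger: start the replay phase and look for a stored model.
    steps += ["R4", "R5"]
    if not model_available:
        # No suitable inference model: propose training (R8), then train (R3).
        return steps + ["R8", "R3"]
    # Suitable model found: reconstruct the environment, plan and drive.
    steps.append("reconstruct_plan_drive")
    if local_adaptation:
        # Optional local adaptation retriggers the training.
        steps += ["R7", "R3"]
    return steps
```

For example, `replay_or_train(False, True)` follows the pure replay path without any training step.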

As described, in particular with reference to the figures, the improved concept enables a more reliable detection of relevant features in the environment of the environment sensor system.

The improved concept is particularly useful in the context of VSLAM. With VSLAM, the environment is mapped and the precise physical location of the vehicle is identified when the vehicle is on or near a previously trained trajectory. During VSLAM training, key points or features are extracted, for example, from the captured images of several cameras, and a map of the captured environment is generated at the same time. During the playback mode of VSLAM, when the vehicle is in the same environment again, the current key points are matched against those used during training, and the actual trajectory is compared with the trained trajectories. With the improved concept, exactly the same trajectory as during training can be maintained. For this purpose, the environment is reconstructed in such a way that the key points extracted during replay are essentially identical to those extracted during training.

In this way, current limitations of VSLAM can be overcome. For example, if the scene along the trajectory changes during playback mode, for instance because objects have been added or removed or because structures have changed slightly, conventional VSLAM cannot maintain the trajectory. Determining the location of a camera and capturing the environment without knowing the data points in advance is also very difficult when the environment is impaired, for example by weather conditions. Furthermore, if the map contains many features in training mode because many objects were present, but fewer features are available in playback mode, feature matching can be impeded. These problems can be overcome with the help of the improved concept.

Instead of feeding the camera images directly to the corresponding VSLAM module for key point extraction, a generic state of the current environment can be reconstructed according to the improved concept. The key points are then extracted from the reconstructed environment. The planned trajectory therefore automatically falls onto a trained trajectory during playback mode. For example, before the reconstruction is performed, the pose of the vehicle can be estimated and compared with the poses stored for the trained inference models at the same location. The inference model corresponding to the closest pose can then be selected.

With the help of the improved concept, the current environment of the vehicle can be reconstructed back into the trained environment based on the vehicle's actual pose. Moving or dynamic objects, which can cause problems in feature point matching, can be removed. In particular, because the variational autoencoder architecture is used, minor and moderate changes in the environment do not necessarily require retraining of the inference model.

In some implementations, the training supports multiple training runs, where for each run and for each timestamp or frame, several earlier frames are processed by the variational autoencoder. The variational autoencoder can therefore learn temporal consistency. For the training, a plurality of encoder sub-modules can receive inputs from different consecutive training frames. Once the corresponding features have been encoded separately by the individual encoder sub-modules, the Euclidean distance between the individual feature representations of the different frames can be calculated in latent space. In the resulting set of feature representations, self-attention can be applied to the neurons below a certain threshold using a standard technique, for example global average pooling (GAP). In this way, the importance of static objects that are consistent across the frames is increased, while all objects that are inconsistent or dynamic receive a reduced priority in the encoded features in latent space. When the algorithm is executed, the variational autoencoder can align the current scene with a previously built template and eliminate the parts that do not match the template. In other words, a template of the scene is built in a data-driven manner, and the scene is then aligned with this template at inference time or runtime. In this way, moving objects, different weather conditions or small additional objects placed in the scene can be handled.

For example, during the inference or playback phase, only one frame at a time is considered and fed into the trained encoder. The trained variational autoencoder captures the environment well, which makes the approach robust in removing any objects that were not seen during training. It has also been found that different weather conditions are generally treated as noise, so that the weather conditions during training are not a potentially limiting factor.

Standard visual odometry can be used to estimate the pose of the vehicle in corresponding implementations. It can provide a total of six values, namely the translation along the x, y and z axes and the rotation about the x, y and z axes. During VSLAM training, the pose is stored, and during playback, the position and pose of the vehicle are determined. An inference model that was trained at a very similar position and orientation is then selected for playback.
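To compare such six-value poses, the translational and rotational parts can be handled separately, with angle differences wrapped so that orientations close to plus and minus 180 degrees count as similar. The following sketch illustrates this. The `Pose6D` class, the function names and the `rotation_weight` parameter are assumptions; the document does not prescribe a particular similarity measure.

```python
import math
from dataclasses import dataclass


@dataclass
class Pose6D:
    """Hypothetical six-value pose: translations in metres, rotations in radians."""
    tx: float
    ty: float
    tz: float
    rx: float
    ry: float
    rz: float


def wrap_angle(angle):
    """Map an angle in radians to the interval [-pi, pi)."""
    return (angle + math.pi) % (2 * math.pi) - math.pi


def pose_distance(a, b, rotation_weight=1.0):
    """Combine translational and wrapped rotational differences into one score."""
    translation = math.dist((a.tx, a.ty, a.tz), (b.tx, b.ty, b.tz))
    rotation = math.sqrt(
        sum(wrap_angle(p - q) ** 2
            for p, q in ((a.rx, b.rx), (a.ry, b.ry), (a.rz, b.rz)))
    )
    # Assumption: a single scalar weight trades off metres against radians.
    return translation + rotation_weight * rotation
```

The stored inference model with the smallest `pose_distance` to the current pose would then be the candidate for playback.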

Claims

Computer-implemented method for training an artificial neural network (10') to reconstruct a generic state of an environment of an environment sensor system (4), characterized in that - an artificial neural network (10') is provided, the artificial neural network (10') being designed as a variational autoencoder, which includes an encoder module (6') and a decoder module (7'); - for each of a plurality of consecutive training frames, a respective sensor data set (8a, 8b), which represents the environment, is obtained from the environment sensor system (4); - a first feature representation (11a, 11b) is generated for each of the training frames by applying the encoder module (6') to the sensor data set (8a, 8b) of the respective training frame; - a reconstructed data set (13) is generated by the decoder module (7') as a function of the first feature representations (11a, 11b); and - the artificial neural network (10') is trained unsupervised depending on the reconstructed data set (13) in order to remove a dynamic object (14) in an input data set representing the environment.

Computer-implemented method according to claim 1, characterized in that - by applying a feature prioritization module (12) of the artificial neural network (10') to an output of the encoder module (6'), for each pair of the first feature representations (11a, 11b), a respective first deviation between the first feature representations (11a, 11b) of the respective pair is determined and the first feature representations (11a, 11b) are modified depending on the determined first deviations; and - the reconstructed data set (13) is generated by applying the decoder module (7') to an output of the feature prioritization module (12).

Computer-implemented method according to claim 2, characterized in that - for each of the training frames, a second feature representation is generated by applying the encoder module (6') to the sensor data set (8a, 8b) of the respective training frame; - by applying the feature prioritization module (12) to the output of the encoder module (6'), for each pair of second feature representations, a respective second deviation between the second feature representations of the respective pair is determined; and - the first feature representations (11a, 11b) and/or the second feature representations are modified as a function of the determined first deviations and the determined second deviations.

Computer-implemented method according to claim 3, characterized in that by applying the feature prioritization module (12) to the output of the encoder module (6'), - features corresponding to the first feature representations (11a, 11b) are prioritized higher than features corresponding to the second feature representations when a mean of the first deviations is greater than a mean of the second deviations; and - the features that correspond to the first feature representations (11a, 11b) are prioritized lower than the features that correspond to the second feature representations if the mean value of the first deviations is smaller than the mean value of the second deviations.

Computer-implemented method according to one of claims 3 or 4, characterized in that by applying the feature prioritization module (12) to the output of the encoder module (6'), a self-attention algorithm depending on the first deviations and the second deviations is applied to the output of the encoder module (6') in order to modify the first feature representations (11a, 11b).

Computer-implemented method according to one of claims 2 to 5, characterized in that a Euclidean distance between the first feature representations (11a, 11b) of the respective pair is calculated in order to determine the first deviation for the respective pair.

Computer-implemented method according to one of the preceding claims, characterized in that - a further artificial neural network is provided, the further artificial neural network being designed as a variational autoencoder which contains a further encoder module and a further decoder module; - a respective further sensor data set, which represents a further environment of the environment sensor system (4), is obtained for each of a plurality of consecutive further training frames; - for each of the further training frames, a further first feature representation is generated by applying the further encoder module to the further sensor data set of the respective further training frame; - a further reconstructed data set is generated by the further decoder module as a function of the further first feature representations; and - the further artificial neural network is trained unsupervised as a function of the further reconstructed data set in order to remove a dynamic object (14) in a further input data set which represents the further environment.

Computer-implemented method according to claim 7, characterized in that - a pose of the environment sensor system (4) which corresponds to the environment is obtained and a further pose of the environment sensor system (4) which corresponds to the further environment is obtained; - after training the artificial neural network (10'), an inference model is generated which includes the pose, the encoder module (6') and the decoder module (7'); and - after the further artificial neural network (10') has been trained, a further inference model is generated which contains the further pose, the further encoder module (6') and the further decoder module (7').

Method for reconstructing a generic state of an environment of an environment sensor system (4), characterized in that - a sensor data set (8), which represents a current environment of the environment sensor system (4), is generated by the environment sensor system (4); - a trained artificial neural network (10) is applied to the sensor data set (8) in order to remove a dynamic object (14) in the sensor data set (8), the trained artificial neural network (10) being designed as a variational autoencoder, which includes an encoder module (6) and a decoder module (7); and - in order to remove the dynamic object (14) in the sensor data set (8), a first feature representation (11a, 11b) is generated by applying the encoder module (6) to the sensor data set (8), and a reconstructed data set (9) is generated by the decoder module (7) depending on the first feature representation (11a, 11b).

Method according to claim 9, characterized in that a computer-implemented method according to one of claims 1 to 8 is carried out in order to generate the trained artificial neural network (10).

Method according to claim 9, characterized in that - a computer-implemented method according to claim 8 is carried out for training the artificial neural network (10); and - the encoder module (6) and the decoder module (7) of the trained artificial neural network (10) correspond to the encoder module (6) and the decoder module (7) of the inference model or to the further encoder module (6) and the further decoder module (7) of the further inference model.

Method according to claim 11, characterized in that - a current pose of the environment sensor system (4) is determined; - the current pose is compared with the pose of the inference model and with the further pose of the further inference model; and - depending on the respective results of the comparisons, either the encoder module (6) and the decoder module (7) of the inference model or the further encoder module (6) and the further decoder module of the further inference model are selected as the encoder module (6) and the decoder module (7) of the trained artificial neural network (10).
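
The pose-dependent selection recited in the preceding claim can be illustrated with a minimal Python sketch. The claim does not specify how poses are represented or compared, so everything concrete here is an assumption made for illustration only: the `(x, y, yaw)` pose tuple, the `pose_distance` metric with its `yaw_weight` parameter, the `InferenceModel` container and the nearest-pose selection rule are all hypothetical and are not taken from the application.

```python
import math
from dataclasses import dataclass
from typing import Sequence, Tuple

Pose = Tuple[float, float, float]  # ASSUMPTION: (x, y, yaw) in metres and radians


@dataclass(frozen=True)
class InferenceModel:
    """Hypothetical container: a stored pose plus its encoder/decoder pair."""
    name: str
    pose: Pose
    encoder: str  # placeholder for the stored encoder module
    decoder: str  # placeholder for the stored decoder module


def pose_distance(a: Pose, b: Pose, yaw_weight: float = 1.0) -> float:
    """ASSUMED comparison metric: planar distance plus weighted yaw difference."""
    planar = math.hypot(a[0] - b[0], a[1] - b[1])
    # Wrap the yaw difference into [-pi, pi] so that +pi and -pi compare as equal.
    dyaw = math.atan2(math.sin(a[2] - b[2]), math.cos(a[2] - b[2]))
    return planar + yaw_weight * abs(dyaw)


def select_inference_model(current: Pose,
                           models: Sequence[InferenceModel]) -> InferenceModel:
    """Compare the current pose with every stored pose and pick the closest model."""
    return min(models, key=lambda m: pose_distance(current, m.pose))


# Two stored inference models, mirroring "inference model" and "further inference model".
models = [
    InferenceModel("inference_model", (0.0, 0.0, 0.0), "encoder_6", "decoder_7"),
    InferenceModel("further_inference_model", (10.0, 0.0, 0.0),
                   "further_encoder_6", "further_decoder_7"),
]

chosen = select_inference_model((9.0, 0.0, 0.0), models)
print(chosen.name, chosen.encoder, chosen.decoder)
```

In this sketch the selected model's encoder and decoder would then be used as the encoder module (6) and decoder module (7) of the trained network. A thresholded or learned comparison would fit the claim wording equally well.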

Method for at least partially automatic driving of a motor vehicle (1), characterized in that - a method according to one of claims 9 to 12 is carried out, in which the environment sensor system (4) is mounted on the motor vehicle (1); - a trajectory for the motor vehicle (1) is planned on the basis of the reconstructed data set (9); and - the motor vehicle (1) is driven at least partially automatically according to the trajectory.

Electronic vehicle guidance system containing a computing unit (3), a memory unit (5) that stores a trained artificial neural network (10), and an environment sensor system (4) that is set up to generate a sensor data set (8) that represents a current environment of the environment sensor system (4), characterized in that - the trained artificial neural network (10) is designed as a variational autoencoder which contains an encoder module (6) and a decoder module (7); and - the computing unit (3) is set up to - apply the trained artificial neural network (10) to the sensor data set (8) in order to remove a dynamic object (14) from the sensor data set (8), wherein a first feature representation (11a, 11b) is generated by applying the encoder module (6) to the sensor data set (8) and a reconstructed data set (9) is generated by the decoder module (7) depending on the first feature representation (11a, 11b); - plan a trajectory for a motor vehicle (1) based on the reconstructed data set (9); and - generate one or more control signals for driving the motor vehicle (1) at least partially automatically according to the trajectory.

Computer program product comprising - first instructions which, when executed by a computer system, cause the computer system to carry out a computer-implemented method according to one of claims 1 to 8; and/or - second instructions which, when executed by an electronic vehicle guidance system (2) according to claim 14, cause the electronic vehicle guidance system (2) to carry out a computer-implemented method according to one of claims 1 to 8 or a method according to one of claims 9 to 13.