DE102020117271A1

DE102020117271A1 - Method and device for determining object data relating to an object

Info

Publication number: DE102020117271A1
Application number: DE102020117271.0A
Authority: DE
Inventors: Alvaro Marcos-Ramiro; Johannes Niedermayer; Mohammad-Ali Nikouei Mahani; Barbara Hilsenbeck; Vinzenz Dallabetta; Naveen Shankar NAGARAJA; Michael Schmidt; Stefano Gasperini
Original assignee: Bayerische Motoren Werke AG
Current assignee: Bayerische Motoren Werke AG
Priority date: 2020-07-01
Filing date: 2020-07-01
Publication date: 2022-01-05

Abstract

A device for determining object data relating to one or more objects in the environment of one or more environment sensors is described. The device is set up to determine, on the basis of the sensor data from the one or more environment sensors, data of an environment grid for the environment captured by the sensor data. The device is also set up to determine a first feature matrix on the basis of the data of the environment grid by means of a first neural encoder network. In addition, the device is set up to determine object data relating to one or more objects in the environment on the basis of the first feature matrix by means of a neural evaluation network.

Description

The invention relates to a method and a corresponding device that enable a vehicle, for example, to determine object data relating to one or more objects in the environment of one or more environment sensors on the basis of sensor data from those environment sensors.

A vehicle typically comprises a plurality of different environment sensors which are set up to capture different sensor data relating to the environment of the vehicle. Exemplary environment sensors are lidar sensors, image sensors or image cameras, radar sensors, ultrasonic sensors, etc. On the basis of the sensor data from the one or more environment sensors of the vehicle, one or more surrounding objects (e.g. one or more other vehicles) in the environment of the vehicle can be detected and, where applicable, tracked.

The present document addresses the technical problem of enabling a particularly reliable and/or precise detection of objects on the basis of sensor data from one or more environment sensors.

The problem is solved by each of the independent claims. Advantageous embodiments are described, inter alia, in the dependent claims. It is pointed out that additional features of a claim that depends on an independent claim can, without the features of the independent claim or in combination with only a subset of the features of the independent claim, form a separate invention that is independent of the combination of all features of the independent claim and that can be made the subject of an independent claim, a divisional application or a subsequent application. This applies equally to technical teachings described in the description, which can form an invention that is independent of the features of the independent claims.

According to one aspect, a device for determining object data relating to one or more objects in the environment of one or more environment sensors is described. The device can in particular be set up to determine object data on the basis of the sensor data from at least one 3D (three-dimensional) environment sensor (such as a lidar sensor and/or radar sensor) and the sensor data from at least one 2D (two-dimensional) sensor (such as a camera). The one or more (3D and 2D) environment sensors can be part of a vehicle. The vehicle can be designed to be operated in an at least partially automated manner on the basis of the determined object data.

The device is set up to determine, on the basis of the sensor data from the one or more (3D and 2D) environment sensors, data of an environment grid for the environment captured by the sensor data. The environment grid can comprise a plurality of cells in a grid plane. The cells can be arranged at least partially at different distances from the one or more environment sensors. The data of the environment grid can be determined in particular on the basis of the sensor data from one or more lidar sensors and/or radar sensors and, where applicable, from one or more image sensors or image cameras.

A Cartesian coordinate system with a longitudinal axis, a transverse axis perpendicular to it and a vertical axis perpendicular to both can be considered. These axes can correspond to the longitudinal, transverse and vertical axes of a vehicle, respectively. The grid plane can correspond to the plane spanned by the longitudinal axis and the transverse axis. The one or more environment sensors (in particular the one or more lidar and/or radar sensors) can be designed to emit a sensor signal parallel to the grid plane.

For a cell (in particular for each individual cell) of the plurality of cells of the environment grid, the data of the environment grid can indicate: information relating to the probability that the cell is occupied by a dynamic and/or a static object; and/or information relating to the probability that the cell is free space; and/or information relating to the height of an object located at the cell (along the vertical axis). Furthermore, the data of the environment grid can comprise, for a cell (in particular for each individual cell), information from a digital map relating to the environment.
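The per-cell data listed above can be pictured as a small record per cell that is stacked into one channel per quantity before it is handed to an encoder. The following is a minimal sketch under stated assumptions: the field names, the `[channel][row][col]` layout and the helper names `make_grid` and `to_channels` are illustrative and do not come from the description.

```python
from dataclasses import dataclass


@dataclass
class CellData:
    """Per-cell data of the environment grid (field names are illustrative)."""
    p_static: float = 0.0   # probability that a static object occupies the cell
    p_dynamic: float = 0.0  # probability that a dynamic object occupies the cell
    p_free: float = 0.0     # probability that the cell is free space
    height_m: float = 0.0   # height of an object at the cell, along the vertical axis


def make_grid(rows: int, cols: int) -> list[list[CellData]]:
    """Create a rows x cols grid in which every cell is an independent default record."""
    return [[CellData() for _ in range(cols)] for _ in range(rows)]


def to_channels(grid: list[list[CellData]]) -> list[list[list[float]]]:
    """Stack the per-cell fields into a [channel][row][col] array (assumed layout)."""
    fields = ("p_static", "p_dynamic", "p_free", "height_m")
    return [[[getattr(cell, f) for cell in row] for row in grid] for f in fields]


grid = make_grid(rows=4, cols=6)
grid[1][2] = CellData(p_static=0.1, p_dynamic=0.8, p_free=0.05, height_m=1.5)
channels = to_channels(grid)
```

Each channel then has the same rows and columns as the grid plane, so a cell index addresses the same location in every channel.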

The device is also set up to determine a first feature matrix on the basis of the data of the environment grid by means of a first neural encoder network. The data of the environment grid can be passed to the first neural encoder network as input values. The first feature matrix can then be provided as the output value of the first neural encoder network. The first neural encoder network can comprise a convolutional neural network (CNN). Furthermore, the first neural encoder network can have been trained in advance on the basis of labeled training data.

Furthermore, the device can be set up to determine object data relating to one or more objects in the environment on the basis of the first feature matrix by means of a neural evaluation network. The neural evaluation network can have been trained in advance on the basis of labeled training data. The object data for an object can indicate an outline, in particular a bounding box, of the object in the grid plane of the environment grid.

By using different neural networks on the basis of the data of an environment grid, object data relating to one or more objects in the environment can be determined in a reliable and precise manner.

The device can be set up to determine repeatedly, at successive points in time, current data of the environment grid for the environment captured by the current sensor data, on the basis of current sensor data from the one or more environment sensors for the respective point in time. On the basis of the respective current data of the environment grid, a respective current first feature matrix for the respective point in time can then be determined by means of the first neural encoder network. Furthermore, current object data relating to the one or more objects in the environment can be determined on the basis of the respective current first feature matrix by means of the neural evaluation network. Current object data can thus be determined at each of the successive points in time. In this way, the one or more objects can be tracked reliably.

The device can be set up to determine a reference point map on the basis of the first feature matrix by means of a neural decoder network. The reference point map can indicate a spatial probability distribution of one or more reference points (e.g. of one or more center points and/or centroids) for the one or more objects.

Furthermore, the device can be set up to determine, by means of the neural decoder network and on the basis of the first feature matrix, at least one feature map for at least one feature of the one or more objects, in particular for a dimension (such as the length and/or width) and/or for the orientation.

The object data relating to the one or more objects in the environment can then be determined in a particularly precise manner on the basis of the reference point map and on the basis of the at least one feature map by means of the neural evaluation network.
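To illustrate how a reference point map and feature maps can jointly describe objects, the sketch below picks local maxima of the reference point map and reads length, width and orientation at those cells. This is a hand-written stand-in in the spirit of center-point detectors, not the trained neural evaluation network of the description; the function name `decode_objects`, the threshold and the 3x3 peak rule are assumptions.

```python
def decode_objects(ref_map, length_map, width_map, yaw_map, threshold=0.5):
    """Turn a reference point map plus feature maps into oriented boxes.

    All maps are [row][col] lists of equal size. A cell counts as a reference
    point if its score is at least `threshold` and not smaller than any of its
    up to eight neighbours (assumed peak rule).
    """
    rows, cols = len(ref_map), len(ref_map[0])
    boxes = []
    for r in range(rows):
        for c in range(cols):
            score = ref_map[r][c]
            if score < threshold:
                continue
            neighbours = [
                ref_map[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(score >= n for n in neighbours):
                boxes.append({
                    "row": r, "col": c, "score": score,
                    "length": length_map[r][c],
                    "width": width_map[r][c],
                    "yaw": yaw_map[r][c],
                })
    return boxes


ref_map = [
    [0.0, 0.1, 0.2, 0.1, 0.0],
    [0.0, 0.3, 0.9, 0.3, 0.0],
    [0.1, 0.1, 0.2, 0.1, 0.0],
    [0.7, 0.1, 0.0, 0.0, 0.0],
]
length_map = [[4.5] * 5 for _ in range(4)]  # metres, constant for the example
width_map = [[1.8] * 5 for _ in range(4)]   # metres, constant for the example
yaw_map = [[0.0] * 5 for _ in range(4)]     # radians, constant for the example
boxes = decode_objects(ref_map, length_map, width_map, yaw_map)
```

Each returned entry combines a reference point in the grid plane with the dimensions and orientation read from the feature maps, which is enough to draw a bounding box around that point.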

A neural network structure is described in the document Zhou, Xingyi, Dequan Wang, and Philipp Krähenbühl, "Objects as points", arXiv preprint arXiv:1904.07850 (2019). In particular, that document describes an exemplary implementation for an encoder network and/or for a decoder network and/or for an evaluation network. The content of that document is incorporated into the present description by reference.

The device can be set up to determine at least one image relating to the environment on the basis of the sensor data from the one or more environment sensors. The at least one image can be captured by at least one camera. The image typically has image points or pixels in an image plane that differs from the grid plane. The image plane can be essentially perpendicular to the grid plane. The individual pixels of the image can, however, relate at least partially to different cells within the grid plane (depending on how far an object visible in the image is from the camera).

The device can also be set up to determine a second feature matrix on the basis of the image by means of a second neural encoder network. The first feature matrix and the second feature matrix can have the same dimensions. The second neural encoder network can comprise a CNN. Furthermore, the second neural encoder network can have been trained in advance on the basis of labeled training data.

Furthermore, the device can be set up to determine a fused feature matrix on the basis of the first feature matrix and on the basis of the second feature matrix, in particular by concatenation and/or by addition. The object data relating to the one or more objects in the environment can then be determined in a particularly reliable and precise manner on the basis of the fused feature matrix by means of the neural evaluation network.
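The two fusion options named here, concatenation and addition, can be sketched on plain nested lists. The `[channel][row][col]` layout and the function names `fuse_concat` and `fuse_add` are assumptions made for illustration; the description does not prescribe a data layout.

```python
def _check_same_grid(first, second):
    """Both feature matrices must cover the same rows and columns."""
    if (len(first[0]), len(first[0][0])) != (len(second[0]), len(second[0][0])):
        raise ValueError("feature matrices must share the same rows and columns")


def fuse_concat(first, second):
    """Concatenate two feature matrices along the channel axis."""
    _check_same_grid(first, second)
    return first + second


def fuse_add(first, second):
    """Add two feature matrices element-wise; channel counts must match as well."""
    _check_same_grid(first, second)
    if len(first) != len(second):
        raise ValueError("element-wise addition needs equal channel counts")
    return [
        [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(ch_a, ch_b)]
        for ch_a, ch_b in zip(first, second)
    ]


# Two feature matrices with 2 channels on a 2 x 3 grid (values are arbitrary).
first = [[[1.0] * 3 for _ in range(2)] for _ in range(2)]
second = [[[0.5] * 3 for _ in range(2)] for _ in range(2)]
fused_by_concat = fuse_concat(first, second)  # 4 channels
fused_by_add = fuse_add(first, second)        # 2 channels, every value 1.5
```

Concatenation keeps both sources separate and leaves the weighting to the downstream network, at the cost of more channels; addition keeps the channel count but requires identical dimensions.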

The device can be set up to transform and/or project the second feature matrix from the image plane of the image onto the grid plane of the environment grid in order to determine a transformed feature matrix. An exemplary transformation and/or projection method is described in Roddick, Thomas, Alex Kendall, and Roberto Cipolla, "Orthographic feature transform for monocular 3d object detection", arXiv preprint arXiv:1811.08188 (2018). The content of that document is incorporated into the present description by reference.

The fused feature matrix can then be determined in a particularly precise manner, in particular by concatenation and/or by addition, on the basis of the first feature matrix and on the basis of the transformed feature matrix. In this way, the quality of the determined object data can be further increased.

The device can be set up to determine a plurality of images relating to the environment on the basis of the sensor data from the one or more environment sensors. In particular, different cameras can each capture a different image of the environment at a given point in time.

A corresponding plurality of second feature matrices (each for a specific point in time) can then be determined on the basis of the plurality of images by means of a corresponding plurality of second neural encoder networks. Furthermore, the fused feature matrix can be determined, in particular by concatenation and/or by addition, on the basis of the first feature matrix and on the basis of the plurality of second feature matrices (in particular on the basis of the respective transformed feature matrices). In this way, the quality of the determined object data can be further increased.

According to a further aspect, a (road) motor vehicle (in particular a passenger car, a truck, a bus or a motorcycle) is described which comprises the device described in this document.

According to a further aspect, a method for determining object data relating to one or more objects in the environment of one or more environment sensors is described. The method comprises determining, on the basis of the sensor data from the one or more environment sensors, data of an environment grid for the environment captured by the sensor data. Furthermore, the method comprises determining a first feature matrix on the basis of the data of the environment grid by means of a first neural encoder network. The method further comprises determining, on the basis of the first feature matrix and by means of a neural evaluation network, object data relating to one or more objects in the environment.
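The three method steps form a simple chain: sensor data to grid data, grid data to feature matrix, feature matrix to object data. The sketch below shows only this data flow. The three stages are trivial hand-written stand-ins (hit counting, scaling, thresholding), not neural networks, and all names are hypothetical.

```python
def build_grid(hits, rows=3, cols=3):
    """Stand-in for step 1: count sensor hits per grid cell."""
    grid = [[0] * cols for _ in range(rows)]
    for r, c in hits:
        grid[r][c] += 1
    return grid


def toy_encoder(grid):
    """Stand-in for the first encoder network: scale counts to [0, 1]."""
    peak = max(max(row) for row in grid) or 1  # avoid division by zero on an empty grid
    return [[value / peak for value in row] for row in grid]


def toy_evaluator(features, threshold=0.5):
    """Stand-in for the evaluation network: report strongly activated cells."""
    return [
        (r, c)
        for r, row in enumerate(features)
        for c, value in enumerate(row)
        if value >= threshold
    ]


def determine_object_data(sensor_data, grid_builder, encoder, evaluator):
    """Run the three method steps in order and return the object data."""
    grid = grid_builder(sensor_data)   # step 1: data of the environment grid
    feature_matrix = encoder(grid)     # step 2: first feature matrix
    return evaluator(feature_matrix)   # step 3: object data


hits = [(0, 0), (1, 1), (1, 1), (1, 1), (2, 2), (2, 2)]
objects = determine_object_data(hits, build_grid, toy_encoder, toy_evaluator)
```

Passing the stages in as callables keeps the chain unchanged when a stand-in is replaced by a trained network, or when the call is repeated at successive points in time with current sensor data.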

According to a further aspect, a software (SW) program is described. The SW program can be set up to be executed on a processor (e.g. on a control unit of a vehicle) and thereby to carry out the method described in this document.

According to a further aspect, a storage medium is described. The storage medium can comprise a SW program which is set up to be executed on a processor and thereby to carry out the method described in this document.

It should be noted that the methods, devices and systems described in this document can be used both alone and in combination with other methods, devices and systems described in this document. Furthermore, any aspects of the methods, devices and systems described in this document can be combined with one another in a variety of ways. In particular, the features of the claims can be combined with one another in a variety of ways.

The invention is described in more detail below on the basis of exemplary embodiments. The figures show:

Fig. 1 an exemplary vehicle with one or more environment sensors;
Fig. 2 an exemplary environment grid relating to the surroundings or environment of a vehicle;
Fig. 3 an exemplary device for detecting an object on the basis of the data of an environment grid;
Fig. 4a exemplary input data that can be used to detect an object;
Fig. 4b an exemplary device for detecting an object on the basis of an environment grid and on the basis of image data; and
Fig. 5 a flowchart of an exemplary method for determining object data relating to an object on the basis of an environment grid.

As stated at the outset, the present document deals with the reliable and precise detection of objects on the basis of the sensor data from one or more environment sensors. In this context, Fig. 1 shows a vehicle 100 with one or more environment sensors 111, 112 for capturing sensor data. Exemplary environment sensors 111, 112 are one or more lidar sensors, one or more radar sensors, one or more image cameras, etc.

The vehicle 100 comprises a device (or processing unit) 101 which is set up to detect an object 150 in the environment of the vehicle 100 on the basis of the sensor data. A detected object 150, in particular object data relating to an object 150, can be taken into account in a driving function 102 (e.g. for the partially automated or highly automated driving of the vehicle 100).

The local surroundings of a vehicle 100 can be estimated or represented as an occupancy grid map or (occupancy) grid 200 (see Fig. 2). Fig. 2 shows an exemplary grid 200 of the surroundings or environment of the vehicle 100 with a plurality of grid cells, or cells 201 for short. The grid 200 can divide the surroundings or environment of the vehicle 100 into the plurality of two-dimensional (2D) or three-dimensional (3D) cells 201. A two-dimensional cell 201 can have a rectangular shape (for example with an edge length of 10 cm, 5 cm, 2 cm, 1 cm or less).
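Dividing the environment into cells of fixed edge length amounts to mapping a position in the grid plane to a cell index. The following minimal sketch assumes a square grid centred on the vehicle, a 10 cm edge length and a 50 m half extent; only the edge length is taken from the examples in the text, the extent and the function name `point_to_cell` are assumptions.

```python
import math


def point_to_cell(x_m, y_m, cell_size_m=0.1, half_extent_m=50.0):
    """Map a point in the grid plane to a (row, col) cell index.

    x_m runs along the longitudinal axis, y_m along the transverse axis, and
    the vehicle sits at the origin. Returns None for points outside the grid.
    """
    inside_x = -half_extent_m <= x_m < half_extent_m
    inside_y = -half_extent_m <= y_m < half_extent_m
    if not (inside_x and inside_y):
        return None
    row = math.floor((x_m + half_extent_m) / cell_size_m)
    col = math.floor((y_m + half_extent_m) / cell_size_m)
    return row, col


cell = point_to_cell(12.34, -3.21)
```

With these values the grid has 1000 x 1000 cells; halving the edge length quadruples the number of cells, which is the main cost of the finer resolutions mentioned above.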

The processing unit 101 of the vehicle 100 can be set up to determine, on the basis of the sensor data, data for one or more of the cells 201 (in particular for each cell 201) which indicate whether or not the cell 201 is occupied at a specific point in time t. In particular, the data for a cell c 201 can indicate

$z_{c} = (m(O), m(F)),$

where m(O) is an evidence or evidence mass that the cell c 201 is occupied by an object 150 (e.g. a static or a dynamic object), and where m(F) is an evidence that the cell c 201 is free and is thus not occupied by an object 150. The evidence that the cell 201 is occupied by an object 150 can be regarded as an object probability that the cell 201 is occupied by an object 150 (in particular in the sense of Dempster-Shafer theory).
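The per-cell evidence described above can be illustrated with a small data structure. This is a minimal sketch, not the patented implementation: the class name `CellEvidence` and the derived `m_unknown` value are assumptions. The description only names the masses for "occupied" and "free"; treating the remainder as mass for "unknown" is the usual Dempster-Shafer convention that the masses sum to one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CellEvidence:
    """Evidence masses z_c = (m(O), m(F)) for one grid cell (hypothetical type)."""

    m_occupied: float  # m(O): evidence that the cell is occupied by an object
    m_free: float      # m(F): evidence that the cell is free

    def __post_init__(self):
        if self.m_occupied < 0.0 or self.m_free < 0.0:
            raise ValueError("evidence masses must be non-negative")
        if self.m_occupied + self.m_free > 1.0 + 1e-9:
            raise ValueError("m(O) + m(F) must not exceed 1")

    @property
    def m_unknown(self) -> float:
        # Assumption: the remaining mass is assigned to "occupied or free",
        # i.e. the cell state is not yet known.
        return max(0.0, 1.0 - self.m_occupied - self.m_free)


cell = CellEvidence(m_occupied=0.6, m_free=0.1)
```

A cell that has never been observed would then carry `CellEvidence(0.0, 0.0)`, with all mass on "unknown".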

A grid 200 with a plurality of cells 201 can thus be determined on the basis of the sensor data from one or more environment sensors 111, wherein the individual cells 201 can indicate information or data about

• whether or not the respective cell 201 is occupied by an object; and/or
• whether the respective cell 201 is occupied by a dynamic or by a static object; and/or
• how high an object in the respective cell 201 is.

The grid 200 can in particular be determined on the basis of the sensor data of a lidar sensor and/or a radar sensor 111. The data of an (environment) grid 200 can also be referred to as bird's-eye view (BEV) data relating to the environment, since the grid 200 describes the environment in a top view from above.
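How point measurements might be binned into such a BEV grid can be sketched as follows. This is only an illustration under stated assumptions: the function name `rasterize_points`, the square grid centred on the sensor, and the use of a plain occupied flag plus maximum height per cell are not taken from the description, which leaves the grid construction open.

```python
import math


def rasterize_points(points, cell_size=0.1, extent=2.0):
    """Bin (x, y, z) points in metres into a square top-view grid.

    The grid covers [-extent, extent) in x and y around the sensor
    (hypothetical layout). Returns two 2D lists: an occupied flag and
    the largest height seen per cell.
    """
    n = int(round(2 * extent / cell_size))
    occupied = [[False] * n for _ in range(n)]
    height = [[0.0] * n for _ in range(n)]
    for x, y, z in points:
        row = math.floor((x + extent) / cell_size)
        col = math.floor((y + extent) / cell_size)
        if 0 <= row < n and 0 <= col < n:  # drop points outside the grid
            occupied[row][col] = True
            height[row][col] = max(height[row][col], z)
    return occupied, height


points = [(0.05, 0.05, 1.2), (0.07, 0.02, 0.4), (5.0, 0.0, 1.0)]
occupied, height = rasterize_points(points)
```

The first two points fall into the same cell, so that cell keeps the larger height; the third point lies outside the covered area and is ignored.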

FIG. 3 shows an exemplary device 300 for recognizing an object 150 on the basis of the data of an environment grid 200. The device 300 comprises one or more neural networks with which the data 301 of an environment grid 200 can be evaluated. In particular, the device 300 comprises an encoder network 302 (with a convolutional neural network, CNN) which is designed to determine a plurality of features (arranged, for example, in a feature matrix) on the basis of the input data 301. Furthermore, the device 300 can comprise a decoder network 303 (with a CNN) which is set up to determine, on the basis of the plurality of features, one or more maps 304, 305 for one or more features and/or for one or more reference points. The one or more maps 304, 305 can each cover an area of the environment corresponding to the environment grid 200 (possibly with a different spatial resolution than the environment grid 200).

In particular, at least one reference point map 304 can be provided which indicates, for each of one or more objects 150, the position of a reference point (e.g. the center of gravity) of the respective object 150. The reference point map 304 can indicate the spatial probability distribution of the reference points of the one or more objects 150. If necessary, a separate reference point map 304 can be provided for each of different classes of objects 150 (e.g. pedestrians, vehicles, cyclists, etc.).
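To make the idea of a reference point map concrete, the following sketch picks candidate reference points as local maxima of such a map. Note that this is a swapped-in, hand-written technique for illustration: the description derives candidates with a neural evaluation network, not with a fixed rule. The function name `find_reference_points` and the threshold value are assumptions.

```python
def find_reference_points(heatmap, threshold=0.5):
    """Return (row, col, score) for cells that exceed `threshold` and are
    strictly larger than all of their up to eight neighbours."""
    rows, cols = len(heatmap), len(heatmap[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            value = heatmap[r][c]
            if value < threshold:
                continue
            neighbours = [
                heatmap[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            # Strict comparison: flat plateaus yield no peak in this sketch.
            if all(value > n for n in neighbours):
                peaks.append((r, c, value))
    return peaks


heatmap = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.3, 0.0],
    [0.1, 0.3, 0.2, 0.6],
    [0.0, 0.0, 0.1, 0.2],
]
peaks = find_reference_points(heatmap)
```

With one map per object class, the same routine could be run per class so that each peak also carries a class label.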

Furthermore, one or more feature maps 305 can be provided for features such as the orientation of the individual objects 150, the dimensions of the individual objects 150, a position offset of the individual objects 150, etc. These one or more feature maps 305 can, if necessary, each be provided jointly for all classes of objects 150.

The objects 150 in the environment can then be detected on the basis of the maps 304, 305. One or more neural evaluation networks 306, 308, 309 can be used for this purpose if necessary. For example, an evaluation network 306 can be provided which is designed to provide a candidate map 307 with object candidates on the basis of the one or more reference point maps 304. A further evaluation network 308 can be used to determine a further candidate map 311 on the basis of the one or more feature maps 305 and on the basis of the candidate map 307. The candidate maps 307, 311 can then be evaluated jointly in a further evaluation network 309 in order to determine object data 310 relating to one or more objects 150. The object data 310 relating to an object 150 can indicate: the position of the object 150; the dimensions of the object 150; the class or type of the object 150; and/or the orientation of the object 150. In particular, the object data 310 for an object 150 can indicate a bounding box around the object 150.
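The object data listed above (position, dimensions, class, orientation, bounding box) can be pictured as one record per object. The sketch below assembles such a record from a reference point cell and the values read from the feature maps at that cell. All names (`ObjectData`, `decode_object`), the cell size and the convention that the offset is given in fractions of a cell are assumptions for illustration only.

```python
import math
from dataclasses import dataclass


@dataclass
class ObjectData:
    """Hypothetical container for the object data of one object."""

    x: float       # position on the grid plane in metres
    y: float
    length: float  # dimensions in metres
    width: float
    yaw: float     # orientation in radians
    label: str     # class or type of the object

    def corners(self):
        """Corner points of the oriented bounding box on the grid plane."""
        c, s = math.cos(self.yaw), math.sin(self.yaw)
        hl, hw = self.length / 2.0, self.width / 2.0
        local = ((hl, hw), (hl, -hw), (-hl, -hw), (-hl, hw))
        return [(self.x + dx * c - dy * s, self.y + dx * s + dy * c)
                for dx, dy in local]


def decode_object(row, col, offset, size, yaw, label, cell_size=0.1):
    """Combine a reference point cell with per-cell feature values."""
    x = (row + offset[0]) * cell_size  # sub-cell position offset
    y = (col + offset[1]) * cell_size
    return ObjectData(x, y, size[0], size[1], yaw, label)


obj = decode_object(10, 20, (0.5, 0.5), (4.0, 2.0), 0.0, "vehicle")
box = obj.corners()
```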

The one or more neural networks of the detection device 300 can be trained on the basis of (labeled) training data. The training data can comprise a plurality of data sets. A data set can comprise input data 301 of an environment grid 200. Furthermore, the data set can comprise one or more target reference point maps 304 and one or more target feature maps 305. The encoder network 302 and the decoder network 303 can then be trained on the basis of the training data using a backpropagation algorithm.

Alternatively or additionally, the individual training data sets can indicate the target object data 310 for one or more objects 150. This makes it possible to train the evaluation networks 306, 308, 309.

As already explained above, a vehicle 100 can have different types of environment sensors 111, 112. In particular, a vehicle 100 can comprise one or more environment sensors 111 (for example a lidar sensor and/or a radar sensor) with which data 301 for a BEV environment grid 200 can be determined (as shown by way of example in FIG. 4a). Furthermore, a vehicle 100 can comprise one or more environment sensors 112 (in particular one or more cameras) with which two-dimensional (2D) images 400 of the environment can be captured. The images 400 have a perspective of the environment that deviates from the perspective of the BEV environment grid 200 (as shown in FIG. 4a, right-hand side).

FIG. 4b shows an exemplary detection device 410 which is set up to fuse the sensor data and/or the information from the different types of environment sensors 111, 112 in order to determine object data 310 relating to one or more objects 150 with increased accuracy.

The device 410 comprises a first neural encoder network 411 (e.g. the encoder network 302) which is set up to determine a first feature matrix 413 on the basis of the data 301 of the environment grid 200. Furthermore, the device 410 comprises one or more second neural encoder networks 412, each of which is set up to determine a respective second feature matrix 414 on the basis of the one or more images 400 from one or more cameras 112.

The one or more second feature matrices 414 can be projected onto the grid 200 by means of a transformation 415 in order to provide one or more corresponding transformed feature matrices 419.
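The description does not specify the transformation 415. One simple possibility, shown here purely as a sketch, is to project the centre of each grid cell into the image with a pinhole camera model and copy the image feature found there. The flat-ground assumption, the forward-looking camera, and all parameter names (`fx`, `fy`, `cx`, `cy`, `cam_height`) are hypothetical.

```python
def project_features_to_grid(image_features, grid_shape, cell_size,
                             fx, fy, cx, cy, cam_height):
    """Sample image-plane features at the pixel where each ground cell appears.

    Assumptions: flat ground, camera looking straight ahead, grid rows
    increase with forward distance, grid columns with lateral position
    (left positive). Cells that fall outside the image stay None.
    """
    img_rows, img_cols = len(image_features), len(image_features[0])
    n_rows, n_cols = grid_shape
    grid = [[None] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        x = (r + 0.5) * cell_size  # forward distance of the cell centre
        for c in range(n_cols):
            y = (c + 0.5 - n_cols / 2.0) * cell_size  # lateral position
            u = int(round(cx - fx * y / x))           # pixel column
            v = int(round(cy + fy * cam_height / x))  # pixel row
            if 0 <= u < img_cols and 0 <= v < img_rows:
                grid[r][c] = image_features[v][u]
    return grid


# Toy "feature matrix": each pixel simply stores its own (row, col) index.
image_features = [[(v, u) for u in range(4)] for v in range(4)]
transformed = project_features_to_grid(
    image_features, grid_shape=(2, 2), cell_size=1.0,
    fx=2.0, fy=2.0, cx=2.0, cy=2.0, cam_height=1.0)
```

Cells close to the camera project below the image and remain empty; a real system would need a policy for such cells, for example zero features.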

The first feature matrix 413 can then be fused with the one or more transformed feature matrices 419 in a fusion unit 416, e.g. by concatenation and/or by addition, in order to provide a fused feature matrix 417. The object data 310 for one or more objects 150 can then be determined from the fused feature matrix 417 by means of an evaluation network 418.
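The two fusion options named above, concatenation and addition, can be sketched per grid cell as follows. The representation of a feature matrix as nested lists with one feature vector per cell, and the function name `fuse`, are assumptions made for the sake of a self-contained example.

```python
def fuse(first, transformed, mode="concat"):
    """Fuse two feature matrices of equal spatial size cell by cell.

    mode="concat" appends the channels of both inputs;
    mode="add" sums them and therefore needs equal channel counts.
    """
    fused = []
    for row_a, row_b in zip(first, transformed):
        fused_row = []
        for a, b in zip(row_a, row_b):
            if mode == "concat":
                fused_row.append(list(a) + list(b))
            elif mode == "add":
                if len(a) != len(b):
                    raise ValueError("addition needs equal channel counts")
                fused_row.append([p + q for p, q in zip(a, b)])
            else:
                raise ValueError("unknown fusion mode: " + mode)
        fused.append(fused_row)
    return fused


first = [[[1.0, 2.0]]]        # 1 x 1 grid, two channels
transformed = [[[0.5, 0.5]]]  # same grid, two channels
by_concat = fuse(first, transformed, mode="concat")
by_add = fuse(first, transformed, mode="add")
```

Concatenation keeps the two sources separable for the evaluation network at the cost of more channels, whereas addition keeps the channel count fixed.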

As already explained above, the neural networks 411, 412, 418 of the device 410 can be trained on the basis of labeled training data and, if necessary, using the backpropagation algorithm.

FIG. 5 shows a flowchart of an exemplary (possibly computer-implemented) method 500 for determining object data 310 relating to one or more objects 150 in the surroundings of one or more environment sensors 111, 112.

The method 500 comprises determining 501, on the basis of the sensor data of the one or more environment sensors 111, 112 (in particular on the basis of the sensor data of one or more lidar and/or radar sensors), data 301 of an environment grid 200 for the environment captured by the sensor data. The environment grid 200 can have cells 201 on a grid plane, wherein, in the case of a vehicle 100, the grid plane can be arranged parallel to the roadway on which the vehicle 100 is traveling.

The method 500 further comprises determining 502 a first feature matrix 413 on the basis of the data 301 of the environment grid 200 and by means of a first neural encoder network 302, 411 (in particular by means of a CNN). In addition, the method 500 comprises determining 503, on the basis of the first feature matrix 413 and by means of a neural evaluation network 309, 418, object data 310 relating to one or more objects 150 in the environment. The encoder network and the evaluation network can have been trained in advance on the basis of labeled training data.
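The three steps 501, 502 and 503 form a simple chain, which the following sketch expresses as function composition. The function `determine_object_data` and the stand-in callables are hypothetical placeholders; in a real system the encoder and the evaluator would be trained neural networks.

```python
def determine_object_data(sensor_data, build_grid, encoder, evaluator):
    """Run the three steps of the method in order.

    build_grid, encoder and evaluator are placeholders supplied by the
    caller (assumed interfaces, not defined by the description).
    """
    grid_data = build_grid(sensor_data)  # step 501: environment grid data
    feature_matrix = encoder(grid_data)  # step 502: first feature matrix
    return evaluator(feature_matrix)     # step 503: object data


# Stand-in callables that only record what they were given.
result = determine_object_data(
    "scan",
    build_grid=lambda s: ("grid", s),
    encoder=lambda g: ("features", g),
    evaluator=lambda f: [{"source": f}],
)
```

Calling this function once per sensor frame would give the repeated evaluation at successive points in time.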

The measures described in this document make it possible to determine object data 310 relating to one or more objects 150 in a reliable and precise manner on the basis of sensor data from one or more environment sensors 111, 112.

The present invention is not restricted to the exemplary embodiments shown. In particular, it should be noted that the description and the figures are only intended to illustrate the principle of the proposed methods, devices and systems by way of example.

REFERENCES CITED IN THE DESCRIPTION

This list of the documents cited by the applicant was generated automatically and is included solely for the better information of the reader. The list is not part of the German patent or utility model application. The DPMA assumes no liability for any errors or omissions.

Non-patent literature cited

Zhou, Xingyi, Dequan Wang, and Philipp Krähenbühl, "Objects as points", arXiv preprint arXiv:1904.07850 (2019) [0016]

Claims

Device (101, 300, 410) for determining object data (310) in relation to one or more objects (150) in the vicinity of one or more environment sensors (111, 112); wherein the device (101, 300, 410) is set up, - to determine, on the basis of the sensor data of the one or more environment sensors (111, 112), data (301) of an environment grid (200) for the environment detected by the sensor data; - to determine a first feature matrix (413) on the basis of the data (301) of the environment grid (200) by means of a first neural encoder network (302, 411); and - to determine object data (310) in relation to one or more objects (150) in the environment on the basis of the first feature matrix (413) by means of a neural evaluation network (309, 418).

Device (101, 300, 410) according to Claim 1, wherein the device (101, 300, 410) is set up, - to determine at least one image (400) in relation to the environment on the basis of the sensor data of the one or more environment sensors (111, 112); - to determine a second feature matrix (414) on the basis of the image (400) by means of a second neural encoder network (412); - to determine a fused feature matrix (417) on the basis of the first feature matrix (413) and on the basis of the second feature matrix (414), in particular by concatenation and/or by addition; and - to determine the object data (310) in relation to the one or more objects (150) in the environment on the basis of the fused feature matrix (417) by means of the neural evaluation network (309, 418).

Device (101, 300, 410) according to Claim 2, wherein the device (101, 300, 410) is set up, - to transform and/or project the second feature matrix (414) from an image plane of the image (400) onto a grid plane of the environment grid (200) in order to determine a transformed feature matrix (419); and - to determine the fused feature matrix (417), in particular by concatenation and/or by addition, on the basis of the first feature matrix (413) and on the basis of the transformed feature matrix (419).

Device (101, 300, 410) according to one of Claims 2 to 3, wherein - the first neural encoder network (302, 411) and/or the second neural encoder network (412) each comprise a convolutional neural network; and/or - the first neural encoder network (302, 411) and/or the second neural encoder network (412) have been trained in advance on the basis of labeled training data.

Device (101, 300, 410) according to one of Claims 2 to 4, wherein the device (101, 300, 410) is set up, - to determine a plurality of images (400) in relation to the environment on the basis of the sensor data of the one or more environment sensors (111, 112); - to determine a corresponding plurality of second feature matrices (414) on the basis of the plurality of images (400) by means of a corresponding plurality of second neural encoder networks (412); and - to determine the fused feature matrix (417), in particular by concatenation and/or by addition, on the basis of the first feature matrix (413) and on the basis of the plurality of second feature matrices (414).

Device (101, 300, 410) according to one of the preceding claims, wherein the data (301) of the environment grid (200) are determined on the basis of the sensor data of at least one lidar sensor and/or at least one radar sensor.

Device (101, 300, 410) according to one of the preceding claims, wherein the data (301) of the environment grid (200) indicate, for a cell (201) of a plurality of cells (201) of the environment grid (200), - information relating to a probability that the cell (201) is occupied by a dynamic and/or a static object (150); and/or - information relating to a probability that the cell (201) is free; and/or - information relating to a height of an object (150) arranged on the cell (201).

Device (101, 300, 410) according to one of the preceding claims, wherein the device (101, 300, 410) is set up, - to determine, by means of a neural decoder network (303) and on the basis of the first feature matrix (413), a reference point map (304) which indicates a spatial probability distribution of one or more reference points for the one or more objects (150); - to determine, by means of the neural decoder network (303) and on the basis of the first feature matrix (413), a feature map (305) for at least one feature, in particular a dimension and/or an orientation, of the one or more objects (150); and - to determine the object data (310) in relation to the one or more objects (150) in the environment on the basis of the reference point map (304) and on the basis of the feature map (305) by means of the neural evaluation network (309, 418).

Device (101, 300, 410) according to one of the preceding claims, wherein the device (101, 300, 410) is set up, repeatedly at successive points in time, - to determine, on the basis of current sensor data from the one or more environment sensors (111, 112) for the respective point in time, current data (301) of the environment grid (200) for the environment detected by the current sensor data; - to determine a current first feature matrix (413) for the respective point in time on the basis of the current data (301) of the environment grid (200) by means of the first neural encoder network (302, 411); and - to determine current object data (310) in relation to the one or more objects (150) in the environment on the basis of the current first feature matrix (413) by means of the neural evaluation network (309, 418).

Device (101, 300, 410) according to one of the preceding claims, wherein the object data (310) for an object (150) indicate a border of the object (150) on a grid plane of the environment grid (200).

Method (500) for determining object data (310) in relation to one or more objects (150) in the vicinity of one or more environment sensors (111, 112); wherein the method (500) comprises, - determining (501), on the basis of the sensor data of the one or more environment sensors (111, 112), data (301) of an environment grid (200) for the environment detected by the sensor data; - determining (502), on the basis of the data (301) of the environment grid (200) and by means of a first neural encoder network (302, 411), a first feature matrix (413); and - determining (503), on the basis of the first feature matrix (413) and by means of a neural evaluation network (309, 418), object data (310) in relation to one or more objects (150) in the environment.