WO2022228809A1 - Verfahren und vorrichtung zur vorhersage von objektdaten zu einem objekt - Google Patents
Verfahren und vorrichtung zur vorhersage von objektdaten zu einem objekt
- Publication number
- WO2022228809A1 (PCT/EP2022/058363)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- camera
- image
- time
- feature tensor
- basis
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
- G06T7/74—Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/56—Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
- G06V20/58—Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/262—Analysis of motion using transform domain methods, e.g. Fourier domain methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V10/443—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
- G06V10/449—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
- G06V10/451—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
- G06V10/454—Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30248—Vehicle exterior or interior
Definitions
- The invention relates to a method and a corresponding device that enable a vehicle, for example, to use image data from one or more cameras to determine a prediction of object data in relation to one or more objects in the environment of the one or more cameras.
- A vehicle typically includes a number of different surroundings sensors that are set up to record different sensor data relating to the surroundings of the vehicle.
- Exemplary surroundings sensors are lidar sensors, image sensors or image cameras, radar sensors, ultrasonic sensors, etc.
- On the basis of the sensor data, one or more objects (e.g. one or more other vehicles) in the surroundings of the vehicle can be detected and, if necessary, tracked.
- The present document addresses the technical problem of enabling particularly reliable and/or precise tracking of one or more objects on the basis of image data from one or more image cameras.
- The object is achieved by each of the independent claims.
- Advantageous embodiments are described, inter alia, in the dependent claims. It is pointed out that additional features of a claim dependent on an independent claim can, without the features of the independent claim or in combination with only a subset of the features of the independent claim, form a separate invention that is independent of the combination of all features of the independent claim and that can be made the subject of an independent claim, a divisional application or a subsequent application. This applies equally to the technical teachings described in the description, which can form an invention independent of the features of the independent claims.
- A device for determining object data in relation to (at least) one object in the environment of at least one image camera is described.
- Exemplary objects are obstacles and/or other road users in the environment of a vehicle.
- The image camera can be designed to capture images, in particular a temporal sequence of images, of the environment in front of the image camera.
- the individual images can be arranged in a (two-dimensional, 2D) image plane.
- the individual images can have pixels, for example a matrix of pixels, in a specific image plane.
- the image camera can be installed in a vehicle (e.g. as a front camera of the vehicle).
- the vehicle can be designed to move on a roadway.
- the image plane of the image camera can (possibly essentially or at least partially) be arranged perpendicularly to the roadway.
- The device is set up to determine, by means of a neural encoder network (e.g. a convolutional neural network that has been trained in advance), a camera-based feature tensor on the basis of at least one image from the image camera for a first point in time.
- The camera-based feature tensor may include one or more features within the image plane of the image camera.
- a feature tensor described in this document can have two or more dimensions.
- a feature tensor can be or comprise a feature matrix.
- a feature tensor can have multiple levels of feature matrices. In such a case, the feature tensor can be three-dimensional. Each level can have a different type of feature.
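- The following is a minimal, illustrative sketch (assuming PyTorch; the class name and tensor shapes are assumptions, not taken from the patent) of a convolutional encoder network that maps a camera image to a camera-based feature tensor whose channels correspond to several planes of feature matrices in the image plane.

```python
import torch
import torch.nn as nn

class ImageEncoder(nn.Module):
    """Sketch of a neural encoder network for camera images (assumed architecture)."""
    def __init__(self, out_channels: int = 64):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1),   # downsample the image plane
            nn.ReLU(inplace=True),
            nn.Conv2d(32, out_channels, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        # image: (batch, 3, H, W) -> camera-based feature tensor: (batch, C, H/4, W/4);
        # each of the C channels is one plane of feature matrices in the image plane
        return self.backbone(image)

features = ImageEncoder()(torch.rand(1, 3, 384, 640))   # e.g. shape (1, 64, 96, 160)
```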
- The device is also set up to transform and/or project the camera-based feature tensor from the image plane of the image onto a raster plane of an environment grid of the surroundings of the image camera in order to determine a transformed feature tensor (with one or more features in the raster plane).
- the grid plane can be arranged parallel to the roadway.
- the raster plane may correspond to a bird's eye view (BEV) of the environment.
- The camera-based feature tensor can be transformed and/or projected from the image plane of the image onto the raster plane of the environment grid of the surroundings of the image camera by means of a transformation that is invariant over time and/or that is defined in advance.
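- A minimal sketch of such a fixed, time-invariant image-plane-to-raster-plane projection is given below (assumptions: a pinhole camera with intrinsics K, a known camera height above the road, and the encoder sketch above; this is an illustration in the spirit of an orthographic feature transform, not the patent's exact transformation).

```python
import torch
import torch.nn.functional as F

def image_to_bev(features, K, cam_height, grid_x, grid_z, stride=4):
    """Project image-plane features onto a BEV raster plane on the road (sketch)."""
    B, C, Hf, Wf = features.shape
    # one 3D point on the road plane per BEV cell (camera looks along +z, y points down)
    zs, xs = torch.meshgrid(grid_z, grid_x, indexing="ij")                 # (Z, X)
    pts = torch.stack([xs, torch.full_like(xs, cam_height), zs], dim=-1)   # (Z, X, 3)
    uvw = pts @ K.T                                                        # pinhole projection
    u = uvw[..., 0] / uvw[..., 2] / stride                                 # feature-map column
    v = uvw[..., 1] / uvw[..., 2] / stride                                 # feature-map row
    grid = torch.stack([2 * u / (Wf - 1) - 1, 2 * v / (Hf - 1) - 1], dim=-1)
    grid = grid.unsqueeze(0).expand(B, -1, -1, -1)                         # (B, Z, X, 2)
    # cells that project outside the image receive zero features
    return F.grid_sample(features, grid, align_corners=True)

K = torch.tensor([[500.0, 0.0, 320.0], [0.0, 500.0, 192.0], [0.0, 0.0, 1.0]])  # assumed intrinsics
bev = image_to_bev(features, K, cam_height=1.5,
                   grid_x=torch.linspace(-25.0, 25.0, 100),
                   grid_z=torch.linspace(1.0, 51.0, 100))   # transformed feature tensor, e.g. (1, 64, 100, 100)
```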
- The device is set up to determine object data relating to the object in the surroundings of the image camera on the basis of the transformed feature tensor using a neural evaluation network.
- the object data can include one or more predicted properties of the object at a point in time subsequent to the first point in time.
- the first point in time can be a point in time n, for example, and the following point in time can be a point in time n+1.
- the device can be set up to determine the object data repeatedly, in particular periodically, for a sequence of times n, n+1, n+2, etc.
- the one or more predicted properties of the object can include the position and/or the orientation of the object, in particular the position and/or the orientation within the environment grid, at the subsequent point in time.
- the one or more predicted properties of the object can include one or more cells of the environment grid, which are occupied by the object at the subsequent point in time (in order to thereby describe the position and/or the orientation of the object).
- the one or more predicted properties of the object can include an occupancy probability and/or an evidence level of the object at the subsequent point in time for one or more cells of the surrounding grid.
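- As an illustration (assumed head architecture, reusing the BEV feature tensor from the sketches above), a small convolutional evaluation network can output, per cell of the environment grid, the predicted occupancy probability for the subsequent point in time n+1:

```python
import torch.nn as nn

class EvaluationHead(nn.Module):
    """Sketch of a neural evaluation network predicting per-cell occupancy (assumed design)."""
    def __init__(self, in_channels: int = 64):
        super().__init__()
        self.head = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 1, kernel_size=1),   # one logit per grid cell
            nn.Sigmoid(),                      # occupancy probability in [0, 1]
        )

    def forward(self, bev_features):
        # (B, C, Z, X) -> (B, 1, Z, X): predicted occupancy grid for time n+1
        return self.head(bev_features)

occupancy_pred = EvaluationHead()(bev)
```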
- A device is thus described which, by using a neural encoder network, a (fixed) transformation and a neural evaluation network, makes it possible to predict (three-dimensional, 3D) object data in relation to one or more objects at the bird's-eye-view level in a precise and robust manner on the basis of images from an image camera.
- In particular, the 3D position and/or the 3D orientation of an object can be predicted for a subsequent point in time n+1 within the bird's-eye-view plane (and not (only) in the image plane).
- the prediction can possibly be made solely on the basis of the images from one or more cameras.
- The predicted object data, in particular the predicted positions and/or orientations, can be used, for example, for tracking the one or more objects.
- The neural encoder network and the neural evaluation network have typically been trained in advance using labeled training data, the training data comprising a large number of training data sets.
- The individual training data sets can each comprise a training image from the image camera with one or more training objects (shown therein) for a training point in time, and object data with one or more actual properties of the one or more training objects for the point in time following the respective training point in time.
- the individual parameters of the networks can be learned using a learning method and an error function based on the training data.
- The device can be used to determine one or more predicted properties of the one or more training objects for a training data set, which are then compared with the one or more actual properties from the training data set in order to determine the error function.
- the error function can then be used to adapt the individual parameters of the neural networks of the device in order to gradually increase the quality of the device when determining the object data.
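- A minimal training sketch, reusing the hypothetical building blocks above (assumptions: labeled pairs of an image at time n and the actual occupancy grid at time n+1; binary cross-entropy as the error function):

```python
import torch

encoder, head = ImageEncoder(), EvaluationHead()
optimizer = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()), lr=1e-4)
criterion = torch.nn.BCELoss()   # error function comparing predicted and actual properties

def training_step(image_n, occupancy_n_plus_1):
    bev_feat = image_to_bev(encoder(image_n), K, 1.5,
                            torch.linspace(-25.0, 25.0, 100), torch.linspace(1.0, 51.0, 100))
    predicted = head(bev_feat)                        # predicted properties for time n+1
    loss = criterion(predicted, occupancy_n_plus_1)   # compare with actual properties
    optimizer.zero_grad()
    loss.backward()                                   # backpropagation adapts both networks
    optimizer.step()
    return loss.item()
```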
- The device can be set up to combine, in particular to superimpose or line up, a plurality of images from the image camera that follow one another in time to form an overall image for the first point in time.
- The plurality of chronologically consecutive images can have been captured by the image camera before or at the latest at the first point in time.
- A temporal sequence of images from the image camera can thus be considered and combined.
- the camera-based feature tensor can then be determined based on the overall image using the neural encoder network.
- In this way, the one or more predicted properties of an object (shown in an image) can be determined with increased accuracy (in particular with regard to depth information perpendicular to the image plane of the camera).
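- A minimal sketch of such a combination (assumption: three consecutive images are stacked along the channel dimension to form the overall image, so the encoder's first convolution would have to accept 9 instead of 3 input channels):

```python
import torch

def combine_images(images):
    """images: list of (B, 3, H, W) tensors for times n-2, n-1, n."""
    return torch.cat(images, dim=1)   # overall image of shape (B, 9, H, W)

overall = combine_images([torch.rand(1, 3, 384, 640) for _ in range(3)])
```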
- The device can be set up to determine a corresponding plurality of camera-based feature tensors for a plurality of temporally consecutive images of the image camera by means of the neural encoder network.
- the plurality of images that follow one another in time can cover a detection time period that extends in time before and/or up to the first point in time.
- the individual images of a temporal sequence of images can thus be analyzed individually in order to determine a camera-based feature tensor with features in the image plane of the image camera.
- the device can also be set up to determine a corresponding plurality of transformed feature tensors on the basis of the plurality of camera-based feature tensors.
- the transformation mentioned above can be used for this.
- The device can be set up to determine odometry data in relation to a movement of the image camera during the acquisition period.
- the image camera can be installed in a vehicle. The movement of the imaging camera can then correspond to the movement of the vehicle and odometry data relating to the movement of the vehicle can be determined, e.g. on the basis of a wheel sensor, an inertial measurement unit, a speed sensor, an acceleration sensor, etc.
- the plurality of transformed feature tensors can then be combined, in particular fused, taking into account the odometry data, in order to determine a combined, transformed feature tensor.
- Corresponding features in the individual transformed feature tensors can be identified (and fused) on the basis of the odometry data.
- the object data relating to the object in the area surrounding the image camera can then be determined in a particularly precise manner on the basis of the combined, transformed feature tensor using the neural evaluation network.
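- A minimal sketch of such an odometry-based combination (strong simplifying assumption: the ego-motion between frames is a pure translation already converted into whole grid cells; wrap-around at the grid border is ignored):

```python
import torch

def fuse_with_odometry(bev_feats, offsets_cells):
    """bev_feats: list of (B, C, Z, X) transformed feature tensors for times n-2, n-1, n.
    offsets_cells: per tensor, the (dz, dx) ego displacement in grid cells up to time n."""
    aligned = [torch.roll(f, shifts=(-dz, -dx), dims=(2, 3))   # shift into the current ego frame
               for f, (dz, dx) in zip(bev_feats, offsets_cells)]
    return torch.stack(aligned).mean(dim=0)                    # combined, transformed feature tensor

fused = fuse_with_odometry([bev, bev, bev], [(4, 0), (2, 0), (0, 0)])
```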
- The device can be set up to determine, on the basis of at least one image from the image camera for the subsequent point in time, one or more updated properties of the object at the subsequent point in time that correspond to the one or more predicted properties.
- The object can then be tracked in a precise and robust manner over consecutive points in time on the basis of the one or more predicted properties and the one or more updated properties, in particular on the basis of a comparison of the one or more updated properties with the corresponding one or more predicted properties.
- the device can be set up to determine a grid-based feature tensor on the basis of grid-based sensor data from one or more environment sensors (e.g. a lidar sensor and/or a radar sensor) for the first point in time using a further neural encoder network. It is thus possible to use the sensor data from one or more surroundings sensors which are designed to sense information relating to the object directly within the raster plane.
- a merged feature tensor can then be determined on the basis of the transformed feature tensor and on the basis of the grid-based feature tensor, in particular by concatenation and/or by addition.
- the object data relating to the object in the area surrounding the image camera can then be determined in a particularly precise and robust manner on the basis of the merged feature tensor using the neural evaluation network.
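- A minimal sketch of this merging step (both tensors are assumed to live in the same raster plane; concatenation along the channel dimension, or addition when the channel counts match):

```python
import torch

def merge_feature_tensors(grid_based, transformed, mode="concat"):
    if mode == "concat":
        return torch.cat([grid_based, transformed], dim=1)   # (B, C1 + C2, Z, X)
    return grid_based + transformed                           # requires C1 == C2

merged = merge_feature_tensors(torch.rand(1, 64, 100, 100), bev)   # lidar/radar features assumed
```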
- A (road) motor vehicle (in particular a passenger car or a truck or a bus or a motorcycle) is described which comprises the device described in this document.
- A method for determining object data in relation to an object in the environment of at least one image camera includes determining, by means of a neural encoder network, a camera-based feature tensor on the basis of at least one image from the image camera for a first point in time.
- The method also includes transforming and/or projecting the camera-based feature tensor from an image plane of the image onto a raster plane of an environment grid of the surroundings of the image camera in order to determine a transformed feature tensor.
- the method also includes determining, by means of a neural evaluation network, object data relating to the object in the area surrounding the image camera on the basis of the transformed feature tensor.
- the object data can include one or more predicted properties of the object at a point in time subsequent to the first point in time.
- Furthermore, a software (SW) program is described.
- the SW program can be set up to be executed on a processor (e.g. on a vehicle's control unit) and thereby to carry out the method described in this document.
- a storage medium can comprise a SW program which is set up to be executed on a processor and thereby to carry out the method described in this document.
- FIG. 1 shows an exemplary vehicle with one or more surroundings sensors
- FIG. 2 shows an exemplary environment grid in relation to an environment or surroundings of a vehicle
- FIG. 3a shows exemplary input data that can be used to identify and/or track an object
- FIG. 3b shows an exemplary device for detecting and/or tracking an object on the basis of image data
- FIG. 4 shows an exemplary consideration of a sequence of images when tracking an object
- FIG. 5 shows a flowchart of an exemplary method for predicting object data in relation to an object on the basis of image data.
- FIG. 1 shows a vehicle 100 with one or more surroundings sensors 111, 112 for acquiring sensor data.
- Exemplary environment sensors 111, 112 are one or more lidar sensors, one or more radar sensors, one or more image cameras, etc.
- The vehicle 100 includes a device (or a processing unit) 101 that is set up to detect one or more objects 150 in the surroundings of the vehicle 100 on the basis of the sensor data.
- A detected object 150, in particular object data relating to an object 150, can be taken into account in a driving function 102 (e.g. for partially automated or highly automated driving of the vehicle 100).
- the local environment of a vehicle 100 can be estimated or represented as an occupancy grid map or (occupancy) grid 200 (see FIG. 2).
- FIG. 2 shows an exemplary grid 200 of an environment or surroundings of the vehicle 100 with a multiplicity of grid cells, or cells 201 for short. The grid 200 can divide the environment into two-dimensional (2D) or three-dimensional (3D) cells 201.
- a two-dimensional cell 201 can have a rectangular shape (for example with an edge length of 10 cm, 5 cm, 2 cm, 1 cm or less).
- the processing unit 101 of the vehicle 100 can be set up, based on the sensor data for one or more of the cells 201 (in particular for each cell 201), to determine data which indicates whether a cell 201 is occupied at a specific point in time t or not.
- The evidence that a cell 201 is occupied by an object 150 can be regarded as the probability that the cell 201 is occupied by an object 150 (in particular in the sense of Dempster-Shafer theory).
- A grid 200 with a large number of cells 201 can thus be determined on the basis of the sensor data from one or more surroundings sensors 111, with the individual cells 201 being able to indicate information or data about the occupancy of the respective cell 201 by an object 150.
- The grid 200 can be determined in particular on the basis of the sensor data from a lidar sensor and/or a radar sensor 111.
- the data of a (environment) grid 200 can also be referred to as Bird's Eye View (BEV) data in relation to the environment, since the grid 200 describes the environment in a plan view from above.
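- For illustration, a minimal occupancy-grid sketch (assumed parameters: a 50 m × 50 m area with 0.1 m cells; each cell stores the probability that it is occupied by an object at a given time):

```python
import numpy as np

RESOLUTION = 0.1                                   # cell edge length in metres (assumed)
grid = np.zeros((500, 500), dtype=np.float32)      # occupancy probability per cell

def cell_index(x_m, z_m):
    """Map a point on the road plane (lateral x, longitudinal z) to its grid cell."""
    return int(z_m / RESOLUTION), int((x_m + 25.0) / RESOLUTION)

grid[cell_index(2.0, 12.5)] = 0.9                  # e.g. a detection 12.5 m ahead, 2 m to the right
```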
- a vehicle 100 can have different types of surroundings sensors 111, 112.
- A vehicle 100 can include one or more environment sensors 111 (such as a lidar sensor and/or a radar sensor) with which data for a BEV environment grid 200 can be determined directly (as shown by way of example in FIG. 3a).
- a vehicle 100 can include one or more environment sensors 112 (in particular one or more cameras) with which two-dimensional (2D) images 300 of the environment can be captured.
- the images 300 have a perspective of the environment that deviates from the perspective of the BEV environment grid 200 (as shown in FIG. 3a, right-hand side).
- FIG. 3b shows an exemplary detection and/or prediction device 310 which is set up to merge the sensor data and/or the information from the different types of environment sensors 111, 112 in order to determine, with increased accuracy, object data 330 in relation to one or more objects 150, in particular to predict them for a future point in time.
- The device 310 includes a first neural encoder network 311 which is set up to determine a first (raster-based) feature tensor 313 on the basis of the sensor data 320 from an environment sensor 111 (which is provided within the environment grid 200). Furthermore, the device 310 includes one or more second neural encoder networks 312, each of which is set up to determine a second (camera-based) feature tensor 314 on the basis of the one or more images 300 from one or more cameras 112.
- The one or more second (camera-based) feature tensors 314 can be projected onto the grid 200 using a transformation 315 to provide one or more corresponding transformed feature tensors 319.
- An exemplary transformation 315 is described in Roddick, Thomas, Alex Kendall, and Roberto Cipolla, "Orthographic Feature Transform for Monocular 3D Object Detection", arXiv preprint arXiv:1811.08188 (2018), or British Machine Vision Conference (2019). The content of this document is incorporated into this specification by reference.
- The first (raster-based) feature tensor 313 can then be fused in a fusion unit 316 with the one or more transformed feature tensors 319, e.g. by concatenation and/or by addition, to provide a fused feature tensor 317.
- The object data 330 for one or more objects 150 can then be determined using an evaluation network 318 on the basis of the merged feature tensor 317.
- The neural networks 311, 312, 318 of the device 310 can be trained on the basis of labeled training data and possibly using the backpropagation algorithm.
- the processing of grid-based environment data 320 is optional.
- The device 310 can be set up to determine object data 330 in relation to one or more objects 150 solely on the basis of camera-based data 300.
- The object data 330 determined by the device 310 can include a prediction of one or more properties of an object 150 that has already been detected.
- the one or more properties can be predicted for a subsequent point in time from a sequence of points in time.
- The device 310 can be set up to repeatedly, in particular periodically, determine current object data 330 on the basis of the respective current input data 300, 320.
- object data 330 can be determined for a sequence of times n.
- the device 310 can be set up to predict one or more properties of an object 150 at a subsequent point in time n+1 on the basis of the input data 300, 320 for a point in time n.
- The one or more predicted properties can then be used to track the object 150.
- Exemplary properties of an object 150 are the position and/or the orientation of the object 150 within the grid 200.
- The object data 330 can in particular include an occupancy grid 200 predicted for the subsequent time n+1 on the basis of the input data 300, 320 for the time n. Furthermore, the object data 330 can indicate an association between occupied grid cells 201 and individual objects 150.
- the occupancy grid 200 predicted for the subsequent time n+1 can then be overlaid with an occupancy grid 200 determined for the subsequent time n+1 on the basis of the input data 300, 320 in order to enable particularly precise and robust tracking of detected objects 150.
- The allocation of the individual grid cells 201 to the individual objects 150, which is known from the predicted occupancy grid 200, can be used in the occupancy grid 200 determined for the subsequent point in time n+1 in order to be able to localize the individual objects 150 therein.
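- A minimal sketch of such a tracking step by overlap (assumption: each known object is re-identified in the cluster of currently occupied cells that overlaps most with the cells predicted for it):

```python
import numpy as np

def associate(predicted_cells_per_object, occupied_clusters_now):
    """predicted_cells_per_object: dict object_id -> set of (row, col) grid cells predicted for n+1.
    occupied_clusters_now: list of sets of (row, col) cells occupied at n+1, one set per cluster."""
    assignment = {}
    for obj_id, predicted in predicted_cells_per_object.items():
        overlaps = [len(predicted & cluster) for cluster in occupied_clusters_now]
        if overlaps and max(overlaps) > 0:
            assignment[obj_id] = int(np.argmax(overlaps))   # index of the matching cluster
    return assignment
```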
- The camera-based input data 300 can comprise a temporal sequence of images 401, 402, 403 of a camera 112, as shown in FIG. 4.
- the temporal sequence of images 401, 402, 403 can be superimposed and/or lined up in order to determine a camera-based feature tensor 314 using a (neural) encoder network 312.
- The object data 330 can then be determined with increased accuracy using a processing module 410, which includes, for example, the transformation unit 315 and the evaluation network 318.
- A camera-based feature tensor 314 can be determined for the individual images 401, 402, 403 using the encoder network 312.
- The individual camera-based feature tensors 314 can then each be transformed into a transformed feature tensor 319 in the transformation unit 315.
- The individual transformed feature tensors 319 each show corresponding features which, however, can be arranged at different positions within the grid 200 due to a movement of the image camera 112, in particular of the vehicle 100. On the basis of odometry data in relation to the movement of the image camera 112, in particular of the vehicle 100, a precise assignment of corresponding features in the individual transformed feature tensors 319 can be carried out in order to merge the transformed feature tensors 319 and, based on this, to determine the object data 330 with increased accuracy.
- FIG. 5 shows a flowchart of an exemplary (possibly computer-implemented) method 500 for determining object data 330 in relation to one or more objects 150 in the vicinity of one or more cameras 112.
- the one or more cameras 112 can be arranged in a vehicle 100.
- The method 500 can be executed by a control unit 101 of the vehicle 100.
- the method 500 includes determining 501, by means of a neural encoder network 312, a camera-based feature tensor 314 on the basis of at least one image 300 from at least one image camera 112 for a first point in time.
- the encoder network 312 may include a convolutional neural network (CNN).
- the image 300 can display the surroundings of the image camera 112 on a 2D image plane.
- the camera-based feature tensor 314 can display features in a 2D plane (corresponding to the 2D image plane).
- The method 500 includes the transformation and/or projection 502 of the camera-based feature tensor 314 (by means of a predefined and/or fixed transformation) from the (2D) image plane of the image 300 onto the raster plane of an environment grid 200 of the surroundings of the image camera 112, in order to determine a transformed feature tensor 319.
- The raster plane can correspond to the plane of a BEV of the area in front of the image camera 112.
- the transformation mentioned above can be used as the transformation.
- the transformation can depend (possibly solely) on the geometric arrangement of the image plane and the raster plane relative to one another.
- the method 500 also includes determining 503, by means of a neural evaluation network 318, object data 330 in relation to the object 150 in the area surrounding the image camera 112 on the basis of the transformed feature tensor 319.
- The object data 330 can include one or more predicted properties of the object 150 at a point in time subsequent to the first point in time. A prediction of one or more properties of an object 150 represented in the image 300 can thus take place. A particularly precise and robust tracking of the object 150 can thus be made possible.
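- Chaining the hypothetical sketches from above, the three steps of method 500 can be illustrated as follows (step 501: encoder network, step 502: fixed image-to-raster transformation, step 503: evaluation network):

```python
import torch

def predict_object_data(image_n):
    camera_feat = ImageEncoder()(image_n)                       # 501: camera-based feature tensor
    transformed = image_to_bev(camera_feat, K, 1.5,
                               torch.linspace(-25.0, 25.0, 100),
                               torch.linspace(1.0, 51.0, 100))  # 502: transformed feature tensor
    return EvaluationHead()(transformed)                        # 503: predicted object data (occupancy for n+1)

prediction = predict_object_data(torch.rand(1, 3, 384, 640))
```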
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Mathematical Physics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Molecular Biology (AREA)
- General Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biodiversity & Conservation Biology (AREA)
- Health & Medical Sciences (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Image Analysis (AREA)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202280031643.5A CN117280390A (zh) | 2021-04-28 | 2022-03-30 | 用于预测对象的对象数据的方法和装置 |
US18/288,631 US20240212206A1 (en) | 2021-04-28 | 2022-03-30 | Method and Device for Predicting Object Data Concerning an Object |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE102021110824.1A DE102021110824A1 (de) | 2021-04-28 | 2021-04-28 | Verfahren und Vorrichtung zur Vorhersage von Objektdaten zu einem Objekt |
DE102021110824.1 | 2021-04-28 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022228809A1 true WO2022228809A1 (de) | 2022-11-03 |
Family
ID=81393077
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/EP2022/058363 WO2022228809A1 (de) | 2021-04-28 | 2022-03-30 | Verfahren und vorrichtung zur vorhersage von objektdaten zu einem objekt |
Country Status (4)
Country | Link |
---|---|
US (1) | US20240212206A1 (de) |
CN (1) | CN117280390A (de) |
DE (1) | DE102021110824A1 (de) |
WO (1) | WO2022228809A1 (de) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20230326215A1 (en) * | 2022-04-07 | 2023-10-12 | Waymo Llc | End-to-end object tracking using neural networks with attention |
-
2021
- 2021-04-28 DE DE102021110824.1A patent/DE102021110824A1/de active Pending
-
2022
- 2022-03-30 WO PCT/EP2022/058363 patent/WO2022228809A1/de active Application Filing
- 2022-03-30 US US18/288,631 patent/US20240212206A1/en active Pending
- 2022-03-30 CN CN202280031643.5A patent/CN117280390A/zh active Pending
Non-Patent Citations (3)
Title |
---|
HU ANTHONY ET AL: "FIERY: Future Instance Prediction in Bird's-Eye View from Surround Monocular Cameras", 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 21 April 2021 (2021-04-21), pages 15253 - 15262, XP055946513, ISBN: 978-1-6654-2812-5, Retrieved from the Internet <URL:https://arxiv.org/pdf/2104.10490v1.pdf> DOI: 10.1109/ICCV48922.2021.01499 * |
SWAPNIL DAGA ET AL: "BirdSLAM: Monocular Multibody SLAM in Bird's-Eye View", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 15 November 2020 (2020-11-15), XP081814817 * |
THOMAS RODDICK ET AL: "Orthographic Feature Transform for Monocular 3D Object Detection", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 20 November 2018 (2018-11-20), XP081052453 * |
Also Published As
Publication number | Publication date |
---|---|
DE102021110824A1 (de) | 2022-11-03 |
CN117280390A (zh) | 2023-12-22 |
US20240212206A1 (en) | 2024-06-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
DE102019115874B4 (de) | Systeme und verfahren zur verbesserten entfernungsschätzung durch eine monokamera unter verwendung von radar- und bewegungsdaten | |
DE102014222617B4 (de) | Fahrzeugerfassungsverfahren und Fahrzeugerfassungssytem | |
EP2043045B1 (de) | Verfahren zur Objektverfolgung | |
EP3038011B1 (de) | Verfahren zum Bestimmen des Abstands eines Objekts von einem Kraftfahrzeug mittels einer monokularen Bilderfassungseinrichtung | |
DE102011078615B4 (de) | Objekterfassungsvorrichtung und objekterfassungsprogramm | |
DE102018205879A1 (de) | Verfahren, Vorrichtung und computerlesbares Speichermedium mit Instruktionen zur Verarbeitung von Sensordaten | |
WO2020094170A1 (de) | Verfahren und verarbeitungseinheit zur ermittlung von information in bezug auf ein objekt in einem umfeld eines fahrzeugs | |
DE102019109333A1 (de) | Verfahren und Verarbeitungseinheit zur Ermittlung der Größe eines Objektes | |
DE102018200683A1 (de) | Verfahren zur Detektion eines Objektes | |
DE102020102823A1 (de) | Fahrzeugkapselnetzwerke | |
DE102016003261A1 (de) | Verfahren zur Selbstlokalisierung eines Fahrzeugs in einer Fahrzeugumgebung | |
DE112021004200T5 (de) | Objekterkennungsvorrichtung | |
DE102022100545A1 (de) | Verbesserte objekterkennung | |
EP4088224A1 (de) | Verfahren zur zusammenführung mehrerer datensätze für die erzeugung eines aktuellen spurmodells einer fahrbahn und vorrichtung zur datenverarbeitung | |
WO2022228809A1 (de) | Verfahren und vorrichtung zur vorhersage von objektdaten zu einem objekt | |
EP3637311A1 (de) | Vorrichtung und verfahren zum ermitteln von höheninformationen eines objekts in einer umgebung eines fahrzeugs | |
DE102019109332A1 (de) | Verfahren und Verarbeitungseinheit zur Ermittlung eines Objekt-Zustands eines Objektes | |
DE102011118171A1 (de) | Verfahren und Vorrichtung zur Schätzung einer Fahrbahnebene und zur Klassifikation von 3D-Punkten | |
DE102020117271A1 (de) | Verfahren und Vorrichtung zur Ermittlung von Objektdaten in Bezug auf ein Objekt | |
DE102019218479A1 (de) | Verfahren und Vorrichtung zur Klassifikation von Objekten auf einer Fahrbahn in einem Umfeld eines Fahrzeugs | |
DE102018215288A1 (de) | Verfahren und Verarbeitungseinheit zur Verfolgung eines Objektes | |
DE102022126080A1 (de) | Raumüberwachungssystem | |
DE112022002046T5 (de) | Fahrvorrichtung, fahrzeug und verfahren zum automatisierten fahren und/oder assistierten fahren | |
DE102021114734A1 (de) | Verbesserte infrastruktur | |
DE102011111856B4 (de) | Verfahren und Vorrichtung zur Detektion mindestens einer Fahrspur in einem Fahrzeugumfeld |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 22719544 Country of ref document: EP Kind code of ref document: A1 |
|
WWE | Wipo information: entry into national phase |
Ref document number: 202280031643.5 Country of ref document: CN Ref document number: 18288631 Country of ref document: US |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 22719544 Country of ref document: EP Kind code of ref document: A1 |