GB2606339A - Motion prediction with ego motion compensation and consideration of occluded objects - Google Patents

Motion prediction with ego motion compensation and consideration of occluded objects

Info

Publication number
GB2606339A
GB2606339A, GB2105220.4A, GB202105220A
Authority
GB
United Kingdom
Prior art keywords
vehicle
motion
data
ego
environment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
GB2105220.4A
Other versions
GB202105220D0 (en)
Inventor
Meng Yan
Lippe Phillip
Dao David
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Mercedes Benz Group AG
Original Assignee
Mercedes Benz Group AG
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mercedes Benz Group AG filed Critical Mercedes Benz Group AG
Priority to GB2105220.4A priority Critical patent/GB2606339A/en
Publication of GB202105220D0 publication Critical patent/GB202105220D0/en
Publication of GB2606339A publication Critical patent/GB2606339A/en
Withdrawn legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/50 - Context or environment of the image
    • G06V20/56 - Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58 - Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • B - PERFORMING OPERATIONS; TRANSPORTING
    • B60 - VEHICLES IN GENERAL
    • B60W - CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W30/00 - Purposes of road vehicle drive control systems not related to the control of a particular sub-unit, e.g. of systems using conjoint control of vehicle sub-units, or advanced driver assistance systems for ensuring comfort, stability and safety or drive control systems for propelling or retarding the vehicle
    • B60W30/08 - Active safety systems predicting or avoiding probable or impending collision or attempting to minimise its consequences
    • B60W30/095 - Predicting travel path or likelihood of collision
    • G - PHYSICS
    • G01 - MEASURING; TESTING
    • G01C - MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00 - Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/26 - Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network
    • G01C21/28 - Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network with correlation of data from several navigational instruments
    • G01C21/30 - Map- or contour-matching
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24133 - Distances to prototypes
    • G06F18/24137 - Distances to cluster centroïds
    • G06F18/2414 - Smoothing the distance, e.g. radial basis function networks [RBFN]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G06N3/0455 - Auto-encoder networks; Encoder-decoder networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/0464 - Convolutional networks [CNN, ConvNet]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G06N3/0895 - Weakly supervised learning, e.g. semi-supervised or self-supervised learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 - Character recognition
    • G06V30/19 - Recognition using electronic means
    • B - PERFORMING OPERATIONS; TRANSPORTING
    • B60 - VEHICLES IN GENERAL
    • B60W - CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W60/00 - Drive control systems specially adapted for autonomous road vehicles
    • B60W60/001 - Planning or execution of driving tasks
    • B60W60/0027 - Planning or execution of driving tasks using trajectory prediction for other traffic participants
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/044 - Recurrent networks, e.g. Hopfield networks

Abstract

The invention relates to a method for prediction of a motion of an object (7) in the environment of a vehicle (1), the method comprising: collecting sensor data of the object (7) in the environment of the vehicle (1) by at least one vehicle sensor and predicting the motion of the object (7) on the basis of the sensor data by a self-learning system (6), characterized by inputting occlusion data (8, 13) into the self-learning system (6), wherein the occlusion data (8, 13) refer to occluded objects of the environment, which are hidden for the at least one vehicle sensor and/or inputting ego motion data (16) into the self-learning system (6), wherein the ego motion data (16) refer to an ego motion of the vehicle (1), so that the motion of the object (7) is predicted taking into account the occluded objects and/or the ego motion of the vehicle (1). Also provided is a method of controlling an autonomous vehicle based on the prediction.

Description

MOTION PREDICTION WITH EGO MOTION COMPENSATION AND CONSIDERATION OF OCCLUDED OBJECTS
FIELD OF THE INVENTION
[0001] The invention relates to a method for prediction of a motion of an object in the environment of a vehicle, the method comprising collecting sensor data of the object in the environment of the vehicle by at least one vehicle sensor and predicting the motion of the object on the basis of the sensor data by a self-learning system. Furthermore, the present invention relates to a method of controlling an autonomous driving vehicle on the basis of such prediction. Additionally, the present invention relates to a device for prediction of a motion of an object in the environment of a vehicle, wherein the device comprises at least one vehicle sensor for collecting sensor data of the object in the environment of the vehicle and a self-learning system for predicting the motion of the object on the basis of the sensor data.
BACKGROUND INFORMATION
[0002] An autonomous driving vehicle has to predict the future trajectories of other traffic participants to plan its own safe and comfortable route. For this purpose a deep learning-based system with sensor inputs can be used to tackle this motion prediction challenge. More specifically, the present invention focuses on solving two challenging problems in motion prediction. First, on-board sensors on autonomous driving vehicles, such as cameras, lidars and radars, can only partially observe the surrounding environment due to the limitation of the sensor ranges and occlusions from other objects. Therefore, a first problem to solve is how to predict the motion of those fully or partially occluded objects. Since motion prediction has to be conducted while the ego vehicle is moving, the second problem is how to compensate for the ego motion while predicting the motion of surrounding objects.
[0003] Document US 2019/0049970 A1 discloses object motion prediction and autonomous vehicle control. A computer-implemented method includes obtaining state data indicative of at least a current or a past state of an object that is within a surrounding environment of an autonomous vehicle. The method includes obtaining data associated with a geographic area in which the object is located. The method includes generating a combined data set associated with the object based at least in part on a fusion of the state data and the data associated with the geographic area in which the object is located. The method includes obtaining data indicative of a machine-learned model. Furthermore, the method includes inputting the combined data set into the machine-learned model. An output from the machine-learned model is received, wherein the output can be indicative of a plurality of predicted trajectories of the object.
[0004] Moreover, document US 2019/0025841 A1 discloses systems and methods for predicting the future locations of objects that are perceived by autonomous vehicles. An autonomous vehicle can include a prediction system that, for each object perceived by the autonomous vehicle, generates one or more potential goals, selects one or more of the potential goals, and develops one or more trajectories by which the object can achieve the one or more selected goals. The prediction systems and methods described can include or leverage one or more machine-learned models that assist in predicting the future locations of the objects. The prediction system may include a machine-learned static object classifier, a machine-learned goal-scoring model, a machine-learned trajectory development model, a machine-learned ballistic quality classifier, and/or other machine-learned models. The use of machine-learned models can improve the speed, quality, and/or accuracy of the generated predictions.
[0005] The above disclosures just mention that in general machine-learning systems and methods can be applied for predicting the future locations of objects that are perceived by autonomous vehicles. Neither of them includes any detailed machine-learning design models or methods that can solve the specific multi-agent multi-modal problems in a highly dynamic environment.
SUMMARY OF THE INVENTION
[0006] The object of the present invention is to provide a method and a device for motion prediction that deal with the movement of the ego vehicle and with the full or partial occlusion of other objects from the view of the current sensors on the autonomous driving vehicle.
[0007] This object is solved by a method according to claim 1 and a device according to claim 7. Further favorable developments are defined in the subclaims.
[0008] Accordingly, there is provided a method for prediction of a motion of an object in the environment of a vehicle, the method comprising: collecting sensor data of the object in the environment of the vehicle by at least one vehicle sensor and predicting the motion of the object on the basis of the sensor data by a self-learning system, characterized by inputting occlusion data into the self-learning system, wherein the occlusion data refer to occluded objects of the environment, which are hidden for the at least one vehicle sensor, and/or inputting ego motion data into the self-learning system, wherein the ego motion data refer to an ego motion of the vehicle, so that the motion of the object is predicted taking into account the occluded objects and/or the ego motion of the vehicle.
[0009] In other words, in a first step sensor data of the object are collected by the sensors of the vehicle, which is preferably an autonomous driving vehicle. In a very simple embodiment only one vehicle sensor is used for collecting the sensor data. However, usually a plurality of vehicle sensors is used, based for example on video, radar and ultrasonic techniques. The entire sensor data may be processed by a specific processor of the vehicle. In one embodiment the collected sensor data of the different techniques are correlated in order to improve the quality of the entirety of the data.
[0010] In a following step the motion of the object is predicted by using the sensor data. Thus, the sensor data may be used in a raw form or in a preprocessed form in order to predict the motion of the object. Specifically, the prediction can be calculated by a self-learning system. This means that the results of the system improve with each learned data set. Specifically, the prediction of the motion can be optimized individually by the learned data sets.
[0011] Occlusion data are input into the self-learning system. Occlusion data describe objects in an area in the environment of the vehicle, wherefrom the sensors of the vehicle do not receive direct signals. For instance, a car in the foreground can obscure a sidewalk in the background. In this case occlusion data of the sidewalk should be gathered. For instance, a top view of the environment is generated by the prediction system. This top view may show a part of a sidewalk being hidden by the car. Therefore, if a pedestrian walks on the sidewalk and disappears behind the car, there is a high probability that he will follow the sidewalk and appear again after passing the car. In this case the occlusion data refer to the sidewalk behind the car which is not observable by the vehicle sensors. Thus, the occlusion data refer to occluded objects of the environment which are hidden for at least one of the sensors.
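By way of illustration only, the following sketch shows one possible way such an occlusion map could be derived from an ego-centred occupancy grid by casting rays from the ego position; the function name, grid layout and the Python/numpy formulation are assumptions made for this example and are not prescribed by the embodiment.

```python
import numpy as np

def occlusion_map(occupancy: np.ndarray, ego_rc: tuple, n_rays: int = 720) -> np.ndarray:
    """Mark grid cells hidden from the ego position by occupied cells.

    occupancy : 2D array, 1 where a detected object occupies the cell.
    ego_rc    : (row, col) of the ego vehicle in the grid.
    Returns a 2D array, 1 where the cell lies in the 'shadow' of an object.
    """
    h, w = occupancy.shape
    occluded = np.zeros_like(occupancy)
    r0, c0 = ego_rc
    max_range = int(np.hypot(h, w))
    for angle in np.linspace(0.0, 2 * np.pi, n_rays, endpoint=False):
        blocked = False
        for step in range(1, max_range):
            r = int(round(r0 + step * np.sin(angle)))
            c = int(round(c0 + step * np.cos(angle)))
            if not (0 <= r < h and 0 <= c < w):
                break
            if blocked:
                occluded[r, c] = 1          # cell lies behind an obstacle
            elif occupancy[r, c]:
                blocked = True              # first hit starts the shadow
    return occluded
```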
[0012] Alternatively or additionally, ego motion data may be input into the self-learning system. This means that motion data of the own vehicle are provided and input into the self-learning system. Such ego motion data may be obtained from the control systems of the vehicle. The ego motion data may be updated continuously, so that current motion data can be provided in a specific memory, for instance.
[0013] As a result, the motion of the object is predicted, wherein the output of the prediction may consist of tracked and recovered objects including respective proposed trajectories. The prediction takes into account occluded objects in one alternative. According to another alternative the ego motion of the vehicle is considered for the prediction. In a preferred embodiment both alternatives are combined.
[0014] In a preferred embodiment the system may comprise a deep learning neural network. However, the self-learning system may also include other learning algorithms.
[0015] In another preferred embodiment the occlusion data is input into a first layer of the neural network and the ego motion data is input into a second layer of the neural network, wherein the second layer is scaled lower than the first layer. Specifically, the first layer may be the very first layer of the neural network, and the second layer may be a deeper layer, for example the third layer of the neural network. Thus, the prediction may be conditioned on an input action. Specifically, the ego motion (state and action) can be added to a predefined layer or feature map of the neural network.
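As a non-binding sketch, the following PyTorch-style fragment illustrates how occlusion/occupancy channels could enter the very first layer while the ego state and action are broadcast as extra channels onto a deeper, lower-scale feature map; the layer sizes, module names and the choice of framework are illustrative assumptions.

```python
import torch
import torch.nn as nn

class EncoderWithEgoInjection(nn.Module):
    """Occlusion/occupancy maps enter at the first layer; the ego state and
    action are broadcast as extra channels onto a deeper, lower-scale feature map."""

    def __init__(self, in_ch: int = 2, ego_dim: int = 3):
        super().__init__()
        self.block1 = nn.Sequential(nn.Conv2d(in_ch, 16, 3, stride=2, padding=1), nn.ReLU())
        self.block2 = nn.Sequential(nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
        # the third block receives the feature map plus the ego channels
        self.block3 = nn.Sequential(nn.Conv2d(32 + ego_dim, 64, 3, stride=2, padding=1), nn.ReLU())

    def forward(self, grid: torch.Tensor, ego: torch.Tensor) -> torch.Tensor:
        # grid: (B, in_ch, H, W) occupancy + occlusion channels
        # ego:  (B, ego_dim), e.g. velocity, yaw rate, planned acceleration
        f = self.block2(self.block1(grid))
        b, _, h, w = f.shape
        ego_maps = ego.view(b, -1, 1, 1).expand(b, ego.shape[1], h, w)
        return self.block3(torch.cat([f, ego_maps], dim=1))
```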
[0016] In a further embodiment a long-term prediction is performed, wherein a keep-alive function is used to reduce the input noise.
[0017] According to a still further embodiment the ego motion is compensated by transforming output images with regard to an ego vehicle velocity and yaw rate.
[0018] The above object is also solved by a method of controlling an autonomous driving vehicle on the basis of predicting a motion of an object in the environment of the vehicle according to the above-described methods. Specifically, the motion prediction and the prediction of future trajectories of other traffic participants can be used to plan a safe and comfortable route of the autonomous vehicle.
[0019] Additionally, the above object is also solved by a device for prediction of a motion of an object in the environment of a vehicle, the device comprising: at least one vehicle sensor for collecting sensor data of the object in the environment of the vehicle and a self-learning system for predicting the motion of the object on the basis of the sensor data, characterized by input means for inputting occlusion data into the self-learning system, wherein the occlusion data refer to occluded objects of the environment, which are hidden for at least one vehicle sensor and/or for inputting ego motion data into the self-learning system, wherein the ego motion data refer to an ego motion of the vehicle, so that the motion of the object is predictable taking into account the occluded objects and/or the ego motion of the vehicle.
[0020] The above passages describe advantages and modifications of the inventive method. These advantages and modifications may also apply to the inventive device.
[0021] Further advantages, features, and details of the invention derive from the following description of preferred embodiments as well as from the drawings. The features and feature combinations previously mentioned in the description as well as the features and feature combinations mentioned in the following description of the figures and/or shown in the figures alone can be employed not only in the respectively indicated combination but also in any other combination or taken alone without leaving the scope of the invention.
BRIEF DESCRIPTION OF THE DRAWINGS
[0022] The novel features and characteristic of the disclosure are set forth in the appended claims. The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate exemplary embodiments and together with the description, serve to explain the disclosed principles. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. The same numbers are used throughout the figures to reference like features and components. Some embodiments of system and/or methods in accordance with embodiments of the present subject matter are now described below, by way of example only, and with reference to the accompanying figures.
[0023] The drawings show in: [0024] Fig. 1 a mapping of a 3D-view to a 2D-top-view grid map-based image.
[0025] Fig. 2 a velocity map; [0026] Fig. 3 a semantic map; [0027] Fig. 4 a horizon map; [0028] Fig. 5 an overall prediction system architecture; [0029] Fig. 6 a proposed deep neural network architecture; [0030] Fig. 7 a structure of a dense block with four delated convolutions increasing their rate exponentially over depth.
[0031] Fig. 8 a training sequence; and [0032] Fig. 9 a generator predicting the next frame yt+i based on the previous frame xt.
[0033] In the figures the same elements or elements having the same function are indicated by the same reference signs.
DETAILED DESCRIPTION
[0034] In the present document, the word "exemplary" is used herein to mean "serving as an example, instance, or illustration". Any embodiment or implementation of the present subject matter described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
[0035] While the disclosure is susceptible to various modifications and alternative forms, specific embodiments thereof have been shown by way of example in the drawing and will be described in detail below. It should be understood, however, that it is not intended to limit the disclosure to the particular forms disclosed, but on the contrary, the disclosure is to cover all modifications, equivalents, and alternatives falling within the scope of the disclosure.
[0036] The terms "comprises", "comprising", or any other variations thereof, are intended to cover a non-exclusive inclusion so that a setup, device or method that comprises a list of components or steps does not include only those components or steps but may include other components or steps not expressly listed or inherent to such setup or device or method. In other words, one or more elements in a system or apparatus preceded by "comprises" or "comprise" does not or do not, without more constraints, preclude the existence of other elements or additional elements in the system or method.
[0037] In the following detailed description of the embodiment of the disclosure, reference is made to the accompanying drawing that forms part hereof, and in which is shown by way of illustration a specific embodiment in which the disclosure may be practiced. This embodiment is described in sufficient detail to enable those skilled in the art to practice the disclosure, and it is to be understood that other embodiments may be utilized and that changes may be made without departing from the scope of the present disclosure. The following description is, therefore, not to be taken in a limiting sense.
[0038] In a specific embodiment of the present invention a deep learning-based prediction system may be used for an autonomous driving vehicle. In order to plan a safe, comfortable and efficient future behavior, the ego autonomous driving vehicle needs to predict the future behaviors of other traffic participants around it. The prediction system may also be used for other, non-autonomous driving vehicle systems, such as a driver assistance system. The traffic participants are also called objects in the following description.
[0039] In a preprocessing step, for better handling of the overall environment around the ego vehicle, all raw sensor inputs may be mapped from the 3D world to 2D top-view images as shown in Fig. 1. The left drawing in Fig. 1 shows the 3D world and the right drawing is the corresponding 2D top-view image. The autonomous car, i.e., the ego vehicle 1, is mapped to a rectangle in the 2D view. The oncoming truck 2 and the car 3 in front of the ego vehicle 1 are also mapped to rectangles in the 2D view. The rectangles are positioned at the respective locations on the road 4.
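A minimal sketch of such a mapping is given below, assuming objects are available as 2D outline points in a world frame and the grid is centred on the ego vehicle; the function name, resolution and grid size are illustrative assumptions.

```python
import numpy as np

def world_to_grid(points_xy: np.ndarray, ego_pose: tuple,
                  resolution: float = 0.1, grid_size: int = 400) -> np.ndarray:
    """Rasterise 2D object outline points (world frame, metres) into an
    ego-centred top-view grid. ego_pose = (x, y, yaw) of the ego vehicle."""
    x0, y0, yaw = ego_pose
    c, s = np.cos(-yaw), np.sin(-yaw)
    grid = np.zeros((grid_size, grid_size), dtype=np.uint8)
    for x, y in points_xy:
        # express the point in the ego frame, then in grid cells (ego at the centre)
        dx, dy = x - x0, y - y0
        ex, ey = c * dx - s * dy, s * dx + c * dy
        row = int(grid_size / 2 - ex / resolution)
        col = int(grid_size / 2 + ey / resolution)
        if 0 <= row < grid_size and 0 <= col < grid_size:
            grid[row, col] = 1
    return grid
```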
[0040] The autonomous driving vehicle 1 may capture input images with on-board sensors such as cameras, lidars and radars. The on-board sensors deliver raw data in multiple channels as shown in Figures 2 to 4. Specifically, Fig. 2 shows a velocity map, Fig. 3 a semantic map and Fig. 4 a horizon map. Each map shows the ego vehicle 1 in the center. The raw images from the sensors may be preprocessed and fused together into a top-view grid-map-based image with multiple channels.
[0041] As shown in Fig. 2, the dynamic objects may be encoded by their velocity and orientation using a color wheel. In this figure the static objects are represented by black color, so that all detected objects are summarized. The semantic map of Fig. 3 classifies the pixels into a plurality of object classes, for example using different colors. The horizon map of Fig. 4 defines the structure of the road to give an overview of where the vehicle can drive.
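As an illustrative sketch (not part of the original disclosure), such a colour-wheel encoding of per-cell velocity might be realised as follows, assuming the heading drives the hue and the speed the brightness, so that static detections remain black:

```python
import colorsys
import numpy as np

def velocity_map(speed: np.ndarray, heading: np.ndarray, occupied: np.ndarray,
                 v_max: float = 20.0) -> np.ndarray:
    """Encode per-cell velocity as RGB: hue = heading, brightness = speed.
    speed, heading, occupied are HxW grids; static detections stay black."""
    h, w = speed.shape
    img = np.zeros((h, w, 3), dtype=np.float32)
    for r in range(h):
        for c in range(w):
            if occupied[r, c]:
                hue = (heading[r, c] % (2 * np.pi)) / (2 * np.pi)  # orientation on the colour wheel
                val = min(speed[r, c] / v_max, 1.0)                # speed ~ 0 stays black
                img[r, c] = colorsys.hsv_to_rgb(hue, 1.0, val)
    return img
```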
[0042] Fig. 5 shows the overall network-based prediction system architecture. The network system proposed here is built to perform both tracking and prediction of surrounding agents or objects. An input frame 5 for the prediction system 6 shows the ego vehicle 1. Furthermore, it shows a plurality of objects and their occlusions in the form of shadows from the view of the ego vehicle. Specifically, there is an object 7 in the center of the frame 5, which may be a car. The area behind the car from the view of the ego vehicle 1 is an occlusion area 8. Sensors of the ego vehicle 1 cannot detect objects in this occlusion area 8, since they are obscured by the object 7.
[0043] To predict more than one future frame, the output frame 9 may be fed back into the network to obtain further predictions. Past sequences of the grid-map-based image frames 5 are given to the prediction system 6 as inputs. These frames contain occlusion areas 8 as black areas, occupancies (e.g. objects like cars 7) as white areas and the horizon map (compare Fig. 4) as background, for instance. The output of the prediction system 6 may consist of the tracked and recovered objects 7 including the proposed trajectories 10 the objects may take. In the complex intersection of the output frame 9 in Fig. 5 the pedestrian (trajectories 11) will cross the street while the cars 7 (trajectories 10) have to stop. However, car 12 at the top will continue driving.
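A minimal sketch of this closed-loop rollout is given below, assuming the network is wrapped as a callable mapping (frame, ego action, recurrent state) to (next frame, new state); the names and the PyTorch framing are assumptions for this example.

```python
import torch

@torch.no_grad()
def rollout(model, past_frames, past_actions, future_actions):
    """model: callable (frame, action, state) -> (next_frame, state).

    past_frames / past_actions  : observed grid-map frames and the ego actions taken.
    future_actions              : planned ego actions for the prediction horizon.
    """
    state, frame = None, None
    for obs, act in zip(past_frames, past_actions):   # warm up the recurrent state on observations
        frame, state = model(obs, act, state)
    predictions = []
    for act in future_actions:                        # feed each prediction back as the next input
        frame, state = model(frame, act, state)
        predictions.append(frame)
    return predictions
```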
[0044] Fig. 6 shows an example of a proposed network architecture. The input to the network consists of an occupancy and occlusion map 13. In addition, all objects in the traffic environment are constrained to a road structure including lanes. In the specific example the network consists of an encoder-decoder structure with four ConvLSTM (convolutional long short-term memory) layers. The feature maps are represented by rectangles in Fig. 6, on which different layers are applied. A dense block 15 with dilated convolutions as shown in Fig. 7 is used on the lowest scale to maximize the receptive field and to get better recognition of interactions over long distances. Since the predictions are conditioned on a certain input action, the ego vehicle's state and action are added as an additional channel 16 to the third feature map 14. In addition, residual connections and deconvolutions are used in the decoder, which ends up in a one-channel prediction 17 (e.g. related to occupancy) of the input resolution.
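The ConvLSTM layers are not given in code form in the text; a minimal ConvLSTM cell as commonly described in the literature might look like the following sketch, with illustrative channel counts and kernel size.

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """Minimal convolutional LSTM cell: the LSTM gates are computed with
    convolutions so that the hidden state keeps its spatial layout."""

    def __init__(self, in_ch: int, hid_ch: int, k: int = 3):
        super().__init__()
        self.hid_ch = hid_ch
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, state=None):
        b, _, h, w = x.shape
        if state is None:
            state = (x.new_zeros(b, self.hid_ch, h, w),
                     x.new_zeros(b, self.hid_ch, h, w))
        h_prev, c_prev = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h_prev], dim=1)), 4, dim=1)
        c = torch.sigmoid(f) * c_prev + torch.sigmoid(i) * torch.tanh(g)
        h_new = torch.sigmoid(o) * torch.tanh(c)
        return h_new, (h_new, c)
```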
[0045] The dense block 15 shown in Fig. 7 may comprise the following convolutional steps:
- Step 151: dilated convolution 3x3, rate 1
- Step 152: convolution 1x1, stride 1
- Step 153: dilated convolution 3x3, rate 2
- Step 154: convolution 1x1, stride 1
- Step 155: dilated convolution 3x3, rate 4
- Step 156: convolution 1x1, stride 1
- Step 157: dilated convolution 3x3, rate 8
- Step 158: convolution 1x1, stride 1
[0046] In the present example the dense block 15 has four dilated convolutions increasing their rate exponentially over depth. All layers are connected with each other in a feed-forward manner, while 1x1 convolutions are used for reducing the channel size.
[0047] The convolutional steps of Fig. 7 should be understood as an example only. The number and kind of convolutional steps may vary.
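Under the assumption of the step sequence listed above, one possible realisation of such a dilated dense block is sketched below; the growth and channel sizes are illustrative and not taken from the embodiment.

```python
import torch
import torch.nn as nn

class DilatedDenseBlock(nn.Module):
    """Four 3x3 dilated convolutions with rates 1, 2, 4, 8; each stage sees the
    concatenation of all previous outputs, and 1x1 convolutions keep channels small."""

    def __init__(self, channels: int, growth: int = 32):
        super().__init__()
        self.stages = nn.ModuleList()
        in_ch = channels
        for rate in (1, 2, 4, 8):
            self.stages.append(nn.Sequential(
                nn.Conv2d(in_ch, growth, 3, padding=rate, dilation=rate),
                nn.ReLU(),
                nn.Conv2d(growth, growth, 1),   # 1x1 convolution reduces/limits channel size
                nn.ReLU(),
            ))
            in_ch += growth                      # dense (feed-forward) connectivity
        self.out = nn.Conv2d(in_ch, channels, 1)

    def forward(self, x):
        feats = [x]
        for stage in self.stages:
            feats.append(stage(torch.cat(feats, dim=1)))
        return self.out(torch.cat(feats, dim=1))
```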
[0048] In the following, the functions of ego motion compensation and long-term prediction will be explained. When the ego vehicle moves, the predicted object motions have to account for the motion of the ego vehicle, because the ego vehicle is centered in every frame. Even though the actions of the ego vehicle are an additional input to the network (as shown in Fig. 6), it is still hard to predict this correlation. For example, assume two cars driving next to the ego vehicle. The ego vehicle slows down and stops. The output would then look as if the two cars were speeding up strongly, although they are not changing their velocities. In addition, all static objects around the ego vehicle would appear to move inversely to its motion, so that the model would have to track many more objects.
[0049] Another important feature of the prediction system is the ability to conduct long-term prediction (compare Fig. 8). The training sequence alternates between giving n=2 ground-truth frames xt+l:t+l+2 and m=2 prediction frames yt+l+2:t+l+4 as input. While predicting further time steps, a keep-alive function 18 is used to reduce the input noise. The output and all hidden states of the network are transformed by a Spatial Transformer Module (STM) based on the ego vehicle motion. A strategy for compensating the ego motion is to transform the output images with regard to the ego vehicle velocity and yaw rate. Standard 2D affine transformation functions are not differentiable, so that a Spatial Transformer Module (STM) is the better solution. The STM applies a point-wise transformation on the image: the sampling position (x, y) in the source image is obtained from the output-grid coordinates (x', y') via (x, y)T = Aθ · (x', y', 1)T. Usually the transformation matrix Aθ is learned by a localization network, but in this case the matrix is already defined. Given Δt as the period duration of a frame, ω as the yaw rate and v as the velocity of the ego vehicle, the transformation matrix is approximately determined by a rotation by ω·Δt combined with a translation of v·Δt (expressed in grid units) along the driving direction, i.e. Aθ ≈ [cos(ω·Δt), -sin(ω·Δt), 0; sin(ω·Δt), cos(ω·Δt), v·Δt]. Not only the output depends on spatial information; the hidden states also have to be transformed with regard to the ego motion so that the model can still track objects. In summary, the STM is applied to the output and to all hidden states of the network.
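A sketch of this ego-motion compensation using a differentiable spatial transformer (torch affine_grid/grid_sample) is given below; the sign conventions, the assumption that the ego vehicle drives "upwards" in the image, and the metre-per-pixel scaling are illustrative assumptions rather than part of the disclosure.

```python
import math
import torch
import torch.nn.functional as F

def ego_motion_warp(frame: torch.Tensor, v: float, yaw_rate: float,
                    dt: float, m_per_px: float) -> torch.Tensor:
    """Differentiably warp an ego-centred frame (B, C, H, W) by the ego motion
    over one frame period dt, using an affine spatial transformer."""
    b, _, h, w = frame.shape
    ang = yaw_rate * dt                        # rotation of the scene by the ego yaw over dt
    # translation in normalised image coordinates (ego assumed to drive 'up' in the image)
    ty = (v * dt) / (m_per_px * h / 2)
    theta = torch.tensor([[math.cos(ang), -math.sin(ang), 0.0],
                          [math.sin(ang),  math.cos(ang), ty]],
                         dtype=frame.dtype, device=frame.device)
    theta = theta.unsqueeze(0).expand(b, 2, 3)
    grid = F.affine_grid(theta, frame.shape, align_corners=False)
    return F.grid_sample(frame, grid, align_corners=False)
```

In the same spirit, the hidden states of the recurrent layers can be passed through the same warp so that the model keeps tracking objects across ego motion.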
[0050] To summarize the loss calculation, Fig. 9 gives an overview of the combination and application of all losses. The input frame xt is given to the generator, which predicts the next frame. To compensate the ego vehicle motion, a Spatial Transformer Module (STM) is applied to it to obtain the final generated frame yt+1. The prediction is masked by the ground-truth occlusion map 19 from xt+1 and provided as input (occupancy map 20) to the binary cross-entropy (BCE) loss and to the recurrent discriminator D. The sharpening loss SHARP needs neither the masking nor the ground truth and therefore only takes the predicted frame. All of them are weighted and summed up to obtain the overall loss.
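By way of example only, the weighted combination of the masked BCE loss, the adversarial term of the recurrent discriminator D and a sharpening term could be sketched as follows; the concrete form of the sharpening loss is not specified in the text, so a simple stand-in that pushes predictions towards 0 or 1 is used, and all weights are illustrative.

```python
import torch
import torch.nn.functional as F

def total_loss(pred, target, occl_mask, disc, w_bce=1.0, w_adv=0.1, w_sharp=0.05):
    """pred, target: (B, 1, H, W) occupancy values in [0, 1];
    occl_mask: 1 where ground truth is observable (0 in occluded areas);
    disc: discriminator mapping a frame to a realism score in (0, 1)."""
    # occupancy loss only where ground truth exists (occluded cells are masked out)
    bce = F.binary_cross_entropy(pred, target, weight=occl_mask)
    # adversarial term: the generator tries to make the discriminator output 1
    d_out = disc(pred)
    adv = F.binary_cross_entropy(d_out, torch.ones_like(d_out))
    # stand-in sharpening term: push predictions towards 0 or 1 (uses only the prediction)
    sharp = (pred * (1.0 - pred)).mean()
    return w_bce * bce + w_adv * adv + w_sharp * sharp
```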
Reference Signs
1 vehicle
2 object
3 car
4 road
5 input frame
6 self-learning system / prediction system
7 objects
8 occlusion areas
9 output frame
10 trajectories
11 trajectories
12 car
13 occlusion data
14 feature map
15 dense block
16 ego motion data
17 one-channel prediction
18 keep-alive function
19 occlusion map
20 occupancy map

Claims (7)

  1. A method for prediction of a motion of an object (7) in the environment of a vehicle (1), the method comprising: - collecting sensor data of the object (7) in the environment of the vehicle (1) by at least one vehicle sensor and - predicting the motion of the object (7) on the basis of the sensor data by a self-learning system (6), characterized by - inputting occlusion data (8, 13) into the self-learning system (6), wherein the occlusion data (8, 13) refer to occluded objects of the environment, which are hidden for the at least one vehicle sensor and/or - inputting ego motion data (16) into the self-learning system (6), wherein the ego motion data (16) refer to an ego motion of the vehicle (1), - so that the motion of the object (7) is predicted taking into account the occluded objects and/or the ego motion of the vehicle (1).
  2. The method according to claim 1, characterized in that the self-learning system (6) comprises a deep learning neural network (14, 15).
  3. The method according to claim 2, characterized in that the occlusion data (8, 13) is input into a first layer of the neural network (14, 15) and the ego motion data is input into a second layer of the neural network (14, 15), wherein the second layer is scaled lower than the first layer.
  4. The method according to any one of claims 1 to 3, characterized in that a long-term prediction is performed, wherein a keep-alive function (18) is used to reduce the input noise.
  5. The method according to any one of claims 1 to 4, characterized in that the ego motion is compensated by transforming output images with regard to an ego vehicle velocity and yaw rate.
  6. A method of controlling an autonomous driving vehicle (1) on the basis of predicting a motion of an object (7) in the environment of a vehicle (1) according to any one of claims 1 to 4.
  7. A device for prediction of a motion of an object (7) in the environment of a vehicle (1), the device comprising: - at least one vehicle sensor for collecting sensor data of the object (7) in the environment of the vehicle (1) and - a self-learning system (6) for predicting the motion of the object (7) on the basis of the sensor data, characterized by - input means for inputting occlusion data into the self-learning system (6), wherein the occlusion data refer to occluded objects of the environment, which are hidden for the at least one vehicle sensor and/or for inputting ego motion data into the self-learning system (6), wherein the ego motion data refer to an ego motion of the vehicle (1), - so that the motion of the object (7) is predictable taking into account the occluded objects and/or the ego motion of the vehicle (1).
GB2105220.4A 2021-04-13 2021-04-13 Motion prediction with ego motion compensation and consideration of occluded objects Withdrawn GB2606339A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB2105220.4A GB2606339A (en) 2021-04-13 2021-04-13 Motion prediction with ego motion compensation and consideration of occluded objects

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB2105220.4A GB2606339A (en) 2021-04-13 2021-04-13 Motion prediction with ego motion compensation and consideration of occluded objects

Publications (2)

Publication Number Publication Date
GB202105220D0 GB202105220D0 (en) 2021-05-26
GB2606339A true GB2606339A (en) 2022-11-09

Family

ID=75949414

Family Applications (1)

Application Number Title Priority Date Filing Date
GB2105220.4A Withdrawn GB2606339A (en) 2021-04-13 2021-04-13 Motion prediction with ego motion compensation and consideration of occluded objects

Country Status (1)

Country Link
GB (1) GB2606339A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190025841A1 (en) 2017-07-21 2019-01-24 Uber Technologies, Inc. Machine Learning for Predicting Locations of Objects Perceived by Autonomous Vehicles
US20190049970A1 (en) 2017-08-08 2019-02-14 Uber Technologies, Inc. Object Motion Prediction and Autonomous Vehicle Control
US20210073997A1 (en) * 2019-09-06 2021-03-11 Google Llc Future semantic segmentation prediction using 3d structure

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
JULIE DEQUAIRE ET AL: "Deep tracking in the wild: End-to-end tracking using recurrent neural networks", INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH., vol. 37, no. 4-5, 22 June 2017 (2017-06-22), US, pages 492 - 512, XP055512751, ISSN: 0278-3649, DOI: 10.1177/0278364917710543 *
NIMA MOHAJERIN ET AL: "Multi-Step Prediction of Occupancy Grid Maps with Recurrent Neural Networks", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 21 December 2018 (2018-12-21), XP081018852 *
SCHREIBER MARCEL ET AL: "Long-Term Occupancy Grid Prediction Using Recurrent Neural Networks", 2019 INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), IEEE, 20 May 2019 (2019-05-20), pages 9299 - 9305, XP033593545, DOI: 10.1109/ICRA.2019.8793582 *

Also Published As

Publication number Publication date
GB202105220D0 (en) 2021-05-26

Similar Documents

Publication Publication Date Title
CN111137292B (en) Method and system for learning lane change strategies via actuator-evaluation network architecture
Gupta et al. Deep learning for object detection and scene perception in self-driving cars: Survey, challenges, and open issues
US10678253B2 (en) Control systems, control methods and controllers for an autonomous vehicle
US10955842B2 (en) Control systems, control methods and controllers for an autonomous vehicle
US11016495B2 (en) Method and system for end-to-end learning of control commands for autonomous vehicle
US11501525B2 (en) Systems and methods for panoptic image segmentation
US20190361454A1 (en) Control systems, control methods and controllers for an autonomous vehicle
US20220343138A1 (en) Analysis of objects of interest in sensor data using deep neural networks
CN110007675B (en) Vehicle automatic driving decision-making system based on driving situation map and training set preparation method based on unmanned aerial vehicle
US11587329B2 (en) Method and apparatus for predicting intent of vulnerable road users
WO2019177562A1 (en) Vehicle system and method for detecting objects and object distance
Fernández-Llorca et al. Two-stream networks for lane-change prediction of surrounding vehicles
Paravarzar et al. Motion prediction on self-driving cars: A review
US11970175B2 (en) System for obtaining a prediction of an action of a vehicle and corresponding method
GB2606339A (en) Motion prediction with ego motion compensation and consideration of occluded objects
US20230154198A1 (en) Computer-implemented method for multimodal egocentric future prediction
Siddiqui et al. Object/Obstacles detection system for self-driving cars
US20230048926A1 (en) Methods and Systems for Predicting Properties of a Plurality of Objects in a Vicinity of a Vehicle
EP4361961A1 (en) Method of determining information related to road user
Abirami et al. Secured Public Transportation System using Ambient Intelligence
Reddy Artificial Superintelligence: AI Creates Another AI Using A Minion Approach
CN117121060A (en) Computer-implemented method and system for training machine learning methods
Chen et al. Deep Anticipation: Light Weight Intelligent Mobile Sensing in IoT by Recurrent Architecture
KR20220049983A (en) Apparatus and method for classifying three-dimensional point cloud using semantic segmentation
Naik et al. Autonomous Car

Legal Events

Date Code Title Description
WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)