US12555368B2 - Method for temporal correction of multimodal data - Google Patents
Info
- Publication number
- US12555368B2 (application US18/337,153)
- Authority
- US
- United States
- Prior art keywords
- sensor
- data set
- training
- neural network
- chronological
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active, expires
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S7/00—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00
- G01S7/02—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S13/00
- G01S7/40—Means for monitoring or calibrating
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S7/00—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00
- G01S7/02—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S13/00
- G01S7/40—Means for monitoring or calibrating
- G01S7/4004—Means for monitoring or calibrating of parts of a radar system
- G01S7/4039—Means for monitoring or calibrating of parts of a radar system of sensor or antenna obstruction, e.g. dirt- or ice-coating
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S7/00—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00
- G01S7/02—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S13/00
- G01S7/40—Means for monitoring or calibrating
- G01S7/4052—Means for monitoring or calibrating by simulation of echoes
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S7/00—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00
- G01S7/02—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S13/00
- G01S7/41—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S13/00 using analysis of echo signal for target characterisation; Target signature; Target cross-section
- G01S7/411—Identification of targets based on measurements of radar reflectivity
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S7/00—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00
- G01S7/02—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S13/00
- G01S7/41—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S13/00 using analysis of echo signal for target characterisation; Target signature; Target cross-section
- G01S7/417—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S13/00 using analysis of echo signal for target characterisation; Target signature; Target cross-section involving the use of neural networks
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S7/00—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00
- G01S7/48—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S17/00
- G01S7/4802—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S17/00 using analysis of echo signal for target characterisation; Target signature; Target cross-section
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S7/00—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00
- G01S7/48—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S17/00
- G01S7/497—Means for monitoring or calibrating
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S7/00—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00
- G01S7/52—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S15/00
- G01S7/52004—Means for monitoring or calibrating
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S7/00—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00
- G01S7/52—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S15/00
- G01S7/539—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S15/00 using analysis of echo signal for target characterisation; Target signature; Target cross-section
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/09—Supervised learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/75—Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
- G06V10/751—Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/776—Validation; Performance evaluation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2218/00—Aspects of pattern recognition specially adapted for signal processing
- G06F2218/12—Classification; Matching
- G06F2218/16—Classification; Matching by matching signal segments
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/62—Extraction of image or video features relating to a temporal dimension, e.g. time-based feature extraction; Pattern tracking
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/75—Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
Definitions
- the present disclosure relates to a method for the chronological correction of multimodal data, with the aim of identifying a chronological offset between measurement timepoints of data sets from different, non-synchronous sensors.
- the disclosure also relates to a method for training a neural network with the aim of being able to perform an aforementioned method for chronological correction of multimodal data.
- the disclosure also relates to a computer program which implements one of the aforementioned methods, a machine-readable data storage medium and/or a download product having such a computer program, as well as one or multiple computers comprising the aforementioned computer program.
- ADAS: Advanced Driver Assistance Systems
- AD: Autonomous Driving
- ML: machine learning
- sensor fusion: combining the data of multiple sensors
- the data of multiple sensors are often converted to a common timepoint for sensor fusion.
- the timestamps of the sensor data required for the aforementioned conversion are in this case either unknown or known only with insufficient accuracy.
- Multimodal data are typically divided into individual chronological steps (referred to as frames), in which case each frame references, e.g., one measurement of each sensor. Given that the sensors typically do not measure at exactly the same time, the chronological offset between the individual measurements of a frame must be known. If the movement of the sensors (e.g., in a vehicle) and the movement of objects (e.g., other vehicles or pedestrians) that may be present in the scene are known, the chronological offset can be used to convert the measured sensor data to a common timepoint.
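The conversion to a common timepoint can be illustrated with a minimal sketch. It assumes a constant-velocity motion model and 2-dimensional points; the function name `propagate_to_reference` and all numeric values are hypothetical and are not taken from the disclosure.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def propagate_to_reference(points: List[Point],
                           velocities: List[Point],
                           dt: float) -> List[Point]:
    """Shift each measured point by velocity * dt (constant-velocity assumption).

    dt is the chronological offset in seconds between the measurement
    timepoint and the common reference timepoint.
    """
    if len(points) != len(velocities):
        raise ValueError("points and velocities must have the same length")
    return [(x + vx * dt, y + vy * dt)
            for (x, y), (vx, vy) in zip(points, velocities)]


# Hypothetical frame: the second sensor measured 0.05 s before the reference.
points = [(10.0, 2.0), (25.0, -1.0)]      # positions in metres
velocities = [(4.0, 0.0), (-2.0, 1.0)]    # relative velocities in m/s
corrected = propagate_to_reference(points, velocities, dt=0.05)
```

With a known offset and known velocities the correction is a simple extrapolation; the difficulty addressed by the disclosure is that the offset is usually not known accurately.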
- a method for the chronological correction of multimodal data is proposed.
- the aim of this method is to identify a chronological offset between the measurement timepoints of data sets from different, non-synchronous sensors and to then synchronize the data sets to common reference timepoints.
- the corrected data sets then each contain data related to the common reference timepoints.
- the method thereby comprises at least the steps described hereinafter.
- a first data set from a reference sensor is received with measurements at different measurement timepoints.
- a second data set from a second sensor with measurements at different measurement timepoints each not exactly matching those of the reference sensor is received.
- the first and second data sets are then read by a neural network.
- a plurality of feature vectors for the first and second data sets at the respective measurement timepoints is subsequently identified by the latter network.
- the neural network in each case merges and compares the first and second feature vectors, which refer to corresponding measurement timepoints that do not match exactly. Parameters of a chronological correction are in particular identified thereby.
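- a chronological offset between the respective measurement timepoints, and/or a corrected data set from the second sensor related to the measurement timepoints of the reference sensor, is then identified.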
- This corrected data set from the second sensor contains the data that the second sensor would have measured if it had measured at the same measurement timepoints at which the reference sensor also measured.
- the value identified for the chronological offset between the measurement timepoints of the reference sensor and the second sensor, and/or the corrected data set from the second sensor related to the measurement timepoints of the reference sensor, can then be output.
- data can be converted to a common timepoint after recording.
- This can in particular reduce the requirements for other synchronization measures (e.g., software timestamps instead of hardware timestamps, peer-to-peer synchronization) of the sensors.
- This reduction of the aforementioned synchronization measures is associated with a significant cost reduction.
- one advantage of the method proposed herein is that the chronological error of multimodal data is reduced.
- the lower the abstraction of the data, the more important the synchronization is. For example, if a current system fuses tracked objects using a camera and radar, then, with improved synchronization, the raw data (camera images and radar reflections) of the sensors can already be fused, e.g., in a common tracker. Doing so in turn increases the quality of tracking results.
- data sets for training machine learning methods can be recorded without much synchronization and can then be corrected in time by means of the method described hereinabove and hereinafter. Doing so can in turn significantly reduce costs that would otherwise arise from synchronizing data that were recorded with an unintended chronological offset and are intended for training another machine learning procedure not described herein.
- a particular advantage of the method presented herein over existing methods is that the exact measurement timepoints of the sensors need not be known in order to convert the measurement data to a reference timepoint. This either saves additional costs for the synchronization or enables new sensor fusion algorithms that would not have been feasible without accurate synchronization. This includes, e.g., a fusion of raw data of different sensors.
- a further advantage of the method presented herein is that the neural network can learn from the training data how chronological errors manifest themselves in the data. In particular, a better correction can be achieved as a result.
- One example from daily life, in which the synchronization is applicable according to the method proposed herein, is editing video material. For example, if the image was recorded with a video camera and the sound was recorded with a separate recorder to improve sound quality, the image and sound may not initially be synchronized. Thus, for example, the mouth movements of a speaker do not match the words being spoken.
- the chronological offset is also not necessarily constant, as the time bases of the two recording devices used for the recording can drift relative to each other.
- the neural network can learn to check the spoken words for plausibility against the mouth movements and to derive from this indications for determining and correcting the chronological offset.
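To make the notion of a chronological offset between two recordings concrete, the following sketch estimates a constant integer lag by brute-force cross-correlation. This is a classical baseline swapped in purely for illustration and is not the neural-network method of the disclosure; the function name `estimate_lag` and the sample values are hypothetical.

```python
from typing import Sequence


def estimate_lag(reference: Sequence[float],
                 delayed: Sequence[float],
                 max_lag: int) -> int:
    """Return the integer lag (in samples) that maximizes the correlation.

    A positive result means that `delayed` lags behind `reference`.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, ref_value in enumerate(reference):
            j = i + lag
            if 0 <= j < len(delayed):
                score += ref_value * delayed[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Hypothetical envelopes: the same event appears two samples later
# in the second recording.
reference = [0, 0, 1, 3, 1, 0, 0, 0, 0, 0]
delayed = [0, 0, 0, 0, 1, 3, 1, 0, 0, 0]
lag = estimate_lag(reference, delayed, max_lag=4)
```

Such a baseline assumes a constant offset and directly comparable signals. Neither assumption holds in general for multimodal data, which is the motivation for a learned approach.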
- different sensors of the same or different measurement principle, in particular radar, lidar, ultrasonic, and/or camera sensors, are selected as the reference sensor and second sensor.
- data of such sensors is fused.
- a single sensor cannot provide data of consistent quality seamlessly in all situations, but a fusion of data from multiple sensors that act physically differently can.
- Environmental monitoring as a whole must function seamlessly.
- the measured values of the reference sensor as well as the measured values of the second sensor are each projected into a selected coordinate system.
- the neural network extracts first feature vectors related to the reference sensor and second feature vectors related to the second sensor.
- the measured values can in particular be converted into a form that is required by, for example, a neural network of a predetermined architecture. Adapting the data to the architecture requires less effort than adapting the architecture to the data.
- the aforementioned selected coordinate system can be given by a 2-dimensional or 3-dimensional Cartesian coordinate system.
- a third data set from a third sensor is further received with measurements at different measurement timepoints not exactly matching those of the reference sensor and those of the second sensor.
- a chronological offset between the measurement timepoints of the reference sensor and the third sensor is then identified in a manner similar to the method steps in the case of the second data set by means of a separate neural network.
- a third data set from a third sensor is received with measurements at different measurement timepoints which do not exactly match those of the reference sensor and those of the second sensor.
- a chronological offset between the measurement timepoints of the reference sensor and the third sensor is then identified in a manner similar to the method steps in the case of the second data set by means of the same neural network.
- the difference from the preceding exemplary embodiment is that, in the exemplary embodiment described here, the same neural network used in connection with the data of the second sensor is used again, whereas in the previous case a separate neural network is used.
- the data sets each relate to environmental detection of a vehicle environment and/or a traffic situation by a system having corresponding sensors.
- the aforementioned sensors are, e.g., installed in a vehicle for assistance-based and/or autonomous driving, or are part of a traffic monitoring system.
- a fusion of the data of several sensors is particularly well-suited to ensure seamless environmental monitoring in general, although not every sensor used can seamlessly function in all situations on its own.
- a control signal is identified based on the analysis of the data sets related to a common reference timepoint.
- This control signal is in this case designed to trigger the initiation of braking, the initiation of acceleration, the control of the steering system for initiating cornering, the control of lighting, the control of the hazard warning system, and/or the control of the windscreen wipers as an action in a vehicle.
- the probability that the respectively triggered response is appropriate to the traffic situation detected using the data sets is then advantageously increased.
- the disclosure relates to a method for training a neural network for use in the method described hereinabove.
- the method for training in this case comprises at least the steps described hereinafter.
- training examples are provided. These each include at least a first data set from a reference sensor and a second data set from a second sensor.
- the first and second data sets are processed using the method described hereinabove into a chronological offset value and/or into a corrected data set from the second sensor.
- the value for the chronological offset and/or the corrected data set is evaluated with a predetermined cost function in terms of its quality. Subsequently, parameters and weights of the neural network are optimized.
- the latter optimization is performed with the aim of improving the assessment by the aforementioned predetermined cost function that is obtained when further training examples are processed.
- the neural network learns, through its ability to generalize, to synchronize even unseen data sets of sensor data of the reference sensor on the one hand and the second sensor on the other hand.
- the aforementioned predetermined cost function measures the plausibility of the chronological offset or the corrected data set, or its conformity with, e.g., a target output previously known for the respective training example.
- the aforementioned predetermined cost function measures the fulfillment of a similarity condition and/or consistency condition between the corrected data set on the one hand and the first data set on the other hand.
- This variant is thus a self-supervised learning approach in the context of training.
- This variant is characterized in that manual labeling of training examples with target outputs is not required.
- the cost function specified hereinabove measures a quality of a processing product.
- the above processing product has thereby been identified by downstream processing of the corrected data set and/or by a downstream processing of the second data set with additional use of the chronological offset identified.
- the “loss”, i.e. the cost function, from the downstream processing is “recycled”. Furthermore, doing so avoids the introduction of artifacts during the correction, which would particularly interfere with the desired downstream processing.
- the measurement timepoints of the reference sensor and of the second sensor are each precisely known in the context of training. Furthermore, a speed of the platform supporting both sensors as well as potential other speeds of objects in the field of view of the sensors are identified.
- the speed of the platform carrying both sensors can correspond to the self-movement of the sensors, i.e., the sensors are in this case firmly connected to the platform.
- it is also possible for the sensors to have at least one component that moves relative to the platform. Such possibly additionally provided speed components can also be identified.
- the aforementioned platform can be part of a vehicle or robot.
- the sensors are, e.g., part of a column with radar and camera sensors, which can be positioned on the roadside or at a road intersection.
- the speeds of further objects in the field of view of the sensor or sensors, such as other road users, can be identified.
- this can also be performed in the case where sensors are also transported on a self-moving vehicle or robot.
- a comparison data set from the further sensor related to the measurement timepoints of the reference sensor is identified. The previously determined speeds and the precisely known chronological offset between the respective measurement timepoints of the sensors are used for this purpose.
- a physical model regarding the dynamics of the sensors, vehicles, robots and other objects involved can be used.
- the aforementioned comparison data set is compared to the corrected data set identified by the neural network by means of a cost function. For example, this can be done by considering the mean squared error of the difference between the corresponding data sets.
- the parameters and weights of the neural network can be adjusted with respect to an anticipated future minimization of the cost function.
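The comparison by mean squared error can be sketched as follows. The sketch assumes that both data sets are lists of 2-dimensional points in the same order; the function name `mean_squared_error` and the example values are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def mean_squared_error(corrected: List[Point],
                       comparison: List[Point]) -> float:
    """Mean of the squared coordinate differences between two point lists.

    Assumes that corrected[i] and comparison[i] describe the same object.
    """
    if not corrected or len(corrected) != len(comparison):
        raise ValueError("data sets must be non-empty and of equal length")
    total = 0.0
    for (cx, cy), (rx, ry) in zip(corrected, comparison):
        total += (cx - rx) ** 2 + (cy - ry) ** 2
    return total / (2 * len(corrected))


# Hypothetical corrected data set (network output) and comparison data set
# (derived from known offsets and speeds during training).
corrected_set = [(1.0, 2.0), (3.0, 4.0)]
comparison_set = [(1.0, 2.5), (2.0, 4.0)]
mse = mean_squared_error(corrected_set, comparison_set)
```

The one-to-one pairing of points is an assumption of this sketch; for unordered point clouds a set-based metric is more suitable.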
- the cost function specified in the previous embodiment is based on metrics that quantify the similarity of point clouds.
- geometric or photometric criteria can be used as a cost function.
- the chamfer distance can be used in this context.
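A minimal sketch of a chamfer distance between two point clouds is given below. Several variants exist; this one sums the two mean nearest-neighbour Euclidean distances, which is an assumption, since the disclosure does not fix a specific variant. The function name `chamfer_distance` is hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def chamfer_distance(cloud_a: List[Point], cloud_b: List[Point]) -> float:
    """Symmetric chamfer distance (sum of mean nearest-neighbour distances)."""
    if not cloud_a or not cloud_b:
        raise ValueError("point clouds must be non-empty")

    def one_sided(source: List[Point], target: List[Point]) -> float:
        # For every source point, take the distance to its nearest target point.
        return sum(min(math.dist(p, q) for q in target)
                   for p in source) / len(source)

    return one_sided(cloud_a, cloud_b) + one_sided(cloud_b, cloud_a)


# Hypothetical clouds: the second point differs between the two sensors.
cloud_ref = [(0.0, 0.0), (1.0, 0.0)]
cloud_corrected = [(0.0, 0.0), (1.0, 1.0)]
distance = chamfer_distance(cloud_ref, cloud_corrected)
```

Unlike a point-wise mean squared error, this metric does not require the two clouds to have the same number of points or a known correspondence. The brute-force search is quadratic in the number of points and is meant only for illustration.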
- the chronological offset identified by the neural network is fed into an algorithm to be trained for environmental detection.
- the cost function, which characterizes the quality of a performed environmental detection, is then also used to determine the quality of the chronological offset identified. Ultimately, it is precisely this quality that is important, e.g., in the at least partially automated control of a vehicle.
- the disclosure relates to a computer program comprising machine-readable instructions which, when executed on one or multiple computers, prompt the computer(s) to perform one of the methods described hereinabove and hereinafter.
- the disclosure also comprises a machine-readable data medium on which the above computer program is stored, and/or a download product with the above computer program, as well as a computer equipped with the aforementioned computer program or the aforementioned machine-readable data medium.
- FIG. 1: an exemplary embodiment of a method 1000 for the chronological correction of multimodal data;
- FIG. 2: an exemplary embodiment of a method 2000 for training a neural network for use in method 1000;
- FIG. 3: an exemplary embodiment of a further method 3000, which is based on the method 2000.
- FIG. 1 shows an exemplary schematic flowchart of a method 1000 for the chronological correction of multimodal data.
- the aim of method 1000 is to identify a chronological offset between the measurement timepoints t1, t2, t3; t′1, t′2, t′3 of data sets 10, 20 from different, non-synchronous sensors 1, 2 of a sensor arrangement S and to then synchronize the data sets to common reference timepoints.
- the sensors 1, 2 can be different sensors of the same or different measurement principle, e.g., radar, lidar, ultrasonic, and/or camera sensors.
- a first data set 10 from a reference sensor 1 with measurements at different measurement timepoints t1, t2, t3 is received.
- a second data set 20 from a second sensor 2 with measurements at different measurement timepoints t′1, t′2, t′3, each not exactly matching those of the reference sensor, is received.
- the first data set 10 and the second data set 20 are read by a neural network, and a plurality of feature vectors 11, 21 are in each case identified for the first data set 10 and the second data set 20 at the respective measurement timepoints t1, t2, t3 and t′1, t′2, t′3.
- This can be performed by respective backbones B1 or B2 of a separate or a common neural network, which in each case process the data of the reference sensor 1 and the second sensor 2.
- the first feature vectors 11 and the second feature vectors 21, each relating to corresponding but not exactly matching measurement timepoints, are merged and compared.
- Fusion layers F of a neural network are used for this purpose, which in this step in particular also identify parameters 4 of a chronological correction.
- a chronological offset 5 between the respective measurement timepoints and/or a corrected data set 6 from the second sensor 2 related to the measurement timepoints of the reference sensor 1 is then determined in step 500 using the parameters 4.
- a value 5 for the chronological offset between the measurement timepoints of reference sensor 1 and the measurement timepoints of the second sensor 2, and/or the corrected data set 6 from the second sensor 2 based on the measurement timepoints t1, t2, t3 of the reference sensor 1, is output.
- the measured values of the reference sensor 1 as well as the measured values of the second sensor 2 can each be projected into a selected coordinate system, e.g., a 2-dimensional or 3-dimensional Cartesian coordinate system, in method step 300.
- first feature vectors 11 related to reference sensor 1 and second feature vectors 21 related to second sensor 2 can in each case be extracted by backbones B1, B2 of a separate or a common neural network.
- data sets 10 and 20 can be reflection lists of radar or lidar sensors 1 and 2.
- the reflection lists can each be projected into a Cartesian grid, in which case multiple data points can fall within one grid cell. Multiple layers of a neural network then process this representation.
- both the reference sensor 1 and the second sensor 2 use the same representation after the (pre-)processing described hereinabove, i.e., a 2-dimensional Cartesian grid in bird's eye view.
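The projection of a reflection list into a 2-dimensional Cartesian grid can be sketched as follows. The tuple layout `(x, y, value)`, the function name `project_to_grid`, and the grid parameters are assumptions made for illustration only.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Reflection = Tuple[float, float, float]   # (x, y, measured value)
Cell = Tuple[int, int]


def project_to_grid(reflections: List[Reflection],
                    cell_size: float,
                    x_min: float, y_min: float,
                    n_x: int, n_y: int) -> Dict[Cell, List[Reflection]]:
    """Assign every reflection to the bird's-eye-view cell it falls into.

    Several reflections may land in the same cell; reflections outside
    the grid are discarded.
    """
    grid: Dict[Cell, List[Reflection]] = defaultdict(list)
    for x, y, value in reflections:
        ix = int((x - x_min) // cell_size)
        iy = int((y - y_min) // cell_size)
        if 0 <= ix < n_x and 0 <= iy < n_y:
            grid[(ix, iy)].append((x, y, value))
    return dict(grid)


# Hypothetical reflection list of a radar or lidar sensor.
reflections = [(0.2, 0.3, 1.0), (0.7, 0.1, 2.0), (1.5, 0.4, 0.5), (9.0, 9.0, 1.0)]
grid = project_to_grid(reflections, cell_size=1.0,
                       x_min=0.0, y_min=0.0, n_x=2, n_y=2)
```

In practice the points in each cell would be reduced to a fixed-size cell feature (for example by pooling) before being passed to the network layers; that step is omitted here.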
- the output layers of the two backbones B1 and B2 are merged by concatenation.
- the feature vectors 11, 21 of the two sensors 1 and 2 are concatenated for each x-y cell of the grid.
- the feature vectors 11, 21 fused by this concatenation are then processed in step 400 of the method by further layers F of a neural network.
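The per-cell concatenation can be sketched with nested lists standing in for the feature maps of the two backbones. The layout `grid[row][column] -> feature vector` and the function name `concatenate_per_cell` are hypothetical.

```python
from typing import List

FeatureMap = List[List[List[float]]]   # [row][column] -> feature vector


def concatenate_per_cell(features_ref: FeatureMap,
                         features_second: FeatureMap) -> FeatureMap:
    """Concatenate the feature vectors of both sensors for each grid cell."""
    if len(features_ref) != len(features_second):
        raise ValueError("feature maps must have the same number of rows")
    fused: FeatureMap = []
    for row_ref, row_second in zip(features_ref, features_second):
        if len(row_ref) != len(row_second):
            raise ValueError("feature maps must have the same number of columns")
        fused.append([cell_ref + cell_second
                      for cell_ref, cell_second in zip(row_ref, row_second)])
    return fused


# Hypothetical 1 x 2 grid: two features from the reference sensor backbone,
# one feature from the second sensor backbone.
features_ref = [[[1.0, 2.0], [3.0, 4.0]]]
features_second = [[[0.5], [0.6]]]
fused = concatenate_per_cell(features_ref, features_second)
```

This works only because both sensors share the same grid representation, so that cell (x, y) refers to the same region of space in both feature maps.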
- these could be, e.g., 2-dimensional convolutional layers.
- Parameters 4 of a chronological correction can be calculated or extracted based on the output of the fusion layers F.
- the parameters 4 of a chronological correction can also be provided directly by the output of the fusion layers.
- the entries of multiple transformation matrices (with rotation and spatial as well as temporal translation) can be calculated.
- a corresponding transformation matrix could, e.g., be calculated for each grid cell, the entries of which can be linked to parameters 4 of the chronological correction.
- the entries of the respective transformation matrix can in each case be provided by known functions of parameters 4, as well as of other potential parameters which are also obtained as an output from the fusion layers.
- the parameters 4 can be, e.g., the (relative) speeds of the sensors and the (relative) angular speed, which are thus the output of the fusion layers F.
- a chronological offset 5 between the respective measurement timepoints and/or a corrected data set from the second sensor 2 related to the measurement timepoints of the reference sensor 1 can then be determined using the parameters 4.
- the transformation matrices can, e.g., also be directly selected as an output from the fusion layers F.
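One way in which the entries of a transformation matrix could be given by known functions of the parameters is sketched below for the 2-dimensional case. It assumes a rigid motion with constant relative speed and constant angular speed over the offset `dt`, and covers only rotation and spatial translation; the function names and values are hypothetical.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def transformation_matrix(vx: float, vy: float, omega: float, dt: float) -> Matrix:
    """Homogeneous 2D transform for constant speed (vx, vy) and angular speed omega.

    The rotation angle is omega * dt and the translation is (vx * dt, vy * dt).
    """
    theta = omega * dt
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, vx * dt],
            [s, c, vy * dt],
            [0.0, 0.0, 1.0]]


def apply_transform(matrix: Matrix, point: Tuple[float, float]) -> Tuple[float, float]:
    """Apply the homogeneous transform to a 2D point."""
    x, y = point
    return (matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
            matrix[1][0] * x + matrix[1][1] * y + matrix[1][2])


# Hypothetical parameters: 2 m/s along x, no rotation, offset of 0.1 s.
shifted = apply_transform(transformation_matrix(2.0, 0.0, 0.0, 0.1), (1.0, 1.0))
# Hypothetical pure rotation by a quarter turn.
rotated = apply_transform(transformation_matrix(0.0, 0.0, math.pi / 2, 1.0), (1.0, 0.0))
```

Because every entry is a smooth function of the parameters, such a matrix remains differentiable with respect to them.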
- a chronological offset 5 can be identified.
- Another option is to calculate the chronological differences of the measurement per grid cell.
- the self-movement measurement e.g., by a further sensor
- Another option is thereby provided for also using layers of a neural network to calculate the chronological offset. These layers can also use inputs from other sensors, e.g., those associated with the self-movement measurement.
- It is advantageous for the transformation to be differentiable. Otherwise, the layers of the neural network (backbones and fusion layers) cannot be trained by backpropagation.
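Why differentiability matters can be seen in a small one-dimensional sketch. It compares a shift implemented by linear interpolation, which varies smoothly with the shift amount, against a shift rounded to whole cells, which is piecewise constant and therefore yields a zero gradient almost everywhere. The example is an assumption made for illustration: the patent does not prescribe interpolation, and the function names and the signal are hypothetical.

```python
import math


def shift_linear(signal, shift):
    """Shift a 1D signal to the right by a fractional number of cells (linear interpolation)."""
    n = len(signal)
    out = []
    for i in range(n):
        p = min(max(i - shift, 0.0), n - 1.0)  # clamp the sample position to the grid
        lo = int(math.floor(p))
        hi = min(lo + 1, n - 1)
        frac = p - lo
        out.append((1.0 - frac) * signal[lo] + frac * signal[hi])
    return out


def shift_rounded(signal, shift):
    """Shift by the nearest whole number of cells; piecewise constant in `shift`."""
    k = int(round(shift))
    n = len(signal)
    return [signal[min(max(i - k, 0), n - 1)] for i in range(n)]


def squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def numerical_grad(shift_fn, signal, target, shift, eps=1e-3):
    """Central finite difference of the squared error with respect to the shift."""
    up = squared_error(shift_fn(signal, shift + eps), target)
    down = squared_error(shift_fn(signal, shift - eps), target)
    return (up - down) / (2 * eps)


signal = [0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.0]
target = shift_linear(signal, 0.4)  # the "true" shift is 0.4 cells

g_lin = numerical_grad(shift_linear, signal, target, 0.1)     # non-zero, points toward 0.4
g_round = numerical_grad(shift_rounded, signal, target, 0.1)  # zero: no training signal
```

With the interpolated shift, the negative gradient indicates that increasing the shift reduces the error, so an optimizer can move toward the correct value; with the rounded shift, it receives no information.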
- The aforementioned options for calculating the transformation are advantageously derived from the parameters of the chronological correction.
- A transformation matrix per cell of the Cartesian grid can be used, i.e., a local correction is performed.
- A transformation matrix could also be calculated for the entire scene, corresponding to a global correction.
- Global correction has the advantage of a more robust estimate, but the different movements of different objects in the scene cannot be modeled with a single global transformation matrix.
- The parameters 4 of the chronological correction, i.e., the output of the fusion layers F, e.g., given by relative speeds, can then be calculated per grid cell or globally for the entire scene, depending on whether a transformation matrix per cell or a global transformation matrix is desired.
- A different representation can be used for a local correction. For example, if a three-dimensional grid is used, then parameters and transformation matrices can be calculated per voxel. If a point-processing network is used, then a transformation matrix can be calculated for each point (e.g., lidar point or radar reflection).
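The trade-off between local and global correction can be made concrete with a small sketch. It assumes per-cell relative speeds on a 2x2 grid and, purely as an illustrative choice, takes the global estimate to be the mean over all cells; the patent does not specify how a global estimate is formed, and all names and values below are hypothetical.

```python
import math


def local_displacements(v_cells, dt):
    """Per-cell displacement (dx, dy) from per-cell relative speeds."""
    return [[(vx * dt, vy * dt) for vx, vy in row] for row in v_cells]


def global_displacement(v_cells, dt):
    """One displacement for the whole scene; here the mean over all cells (an assumption)."""
    cells = [v for row in v_cells for v in row]
    mean_vx = sum(v[0] for v in cells) / len(cells)
    mean_vy = sum(v[1] for v in cells) / len(cells)
    return (mean_vx * dt, mean_vy * dt)


# Hypothetical scene: left column static, right column holds an object moving at 10 m/s in x
v_cells = [[(0.0, 0.0), (10.0, 0.0)],
           [(0.0, 0.0), (10.0, 0.0)]]
dt = 0.05  # assumed time offset in seconds

local = local_displacements(v_cells, dt)
glob = global_displacement(v_cells, dt)

# Per-cell error that remains when the single global displacement is used everywhere
residuals = [[math.hypot(lx - glob[0], ly - glob[1]) for lx, ly in row] for row in local]
```

In this example the global displacement is wrong for every cell by the same amount, because it averages a static background and a moving object; a per-cell correction represents both exactly but relies on fewer measurements per estimate.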
- A third data set from a third sensor can be received with measurements at different measurement timepoints not exactly matching those of the reference sensor 1 and those of the second sensor 2, respectively, and, in a manner similar to the method steps in the case of the second data set 20, a chronological offset between the measurement timepoints of the reference sensor 1 and the third sensor can be identified by means of a separate neural network.
- Alternatively, a third data set from a third sensor can be received with measurements at different measurement timepoints not exactly matching those of the reference sensor 1 and those of the second sensor 2, respectively, and, in a manner similar to the method steps in the case of the second data set 20, a chronological offset between the measurement timepoints of the reference sensor 1 and the third sensor is identified by means of the same neural network as in the case of the data from the second sensor 2.
- The data sets 10, 20 in FIG. 1 can each refer to the environmental detection of a vehicle environment and/or a traffic situation by a system S with corresponding sensors 1, 2.
- The sensors 1, 2 can in this case be installed in a vehicle S for assistance-based and/or autonomous driving. However, it is also possible for the sensors 1, 2 to be part of a system S used for traffic monitoring.
- Step 600 is followed by a further method step 700, in which a control signal 7 is identified based on the analysis of the data sets related to a common reference timepoint. The control signal 7 is designed to trigger, as an action in vehicle S, the initiation of braking, the control of acceleration, the control of the steering system for initiating cornering, the control of the lighting, the control of the hazard light system, and/or the control of the windscreen wipers.
- FIG. 2 shows a schematic flow diagram of an example of steps of a method 2000 for training a neural network for use, e.g., in the method shown in FIG. 1.
- In a first step 100′, training examples are provided, each containing at least a first data set 10′ from a reference sensor 1 and a second data set 20′ from a second sensor 2.
- The first data set 10′ and the second data set 20′ are processed using the method shown in FIG. 1 into a value for the chronological offset 5 and/or a corrected data set 6 from the second sensor 2.
- The value for the chronological offset 5 and/or the corrected data set 6 is then evaluated in step 300′ in terms of its quality with a predetermined cost function 8.
- Parameters and weights of the neural network, namely those which occur in the backbones B1, B2, the fusion layers F, or the function T, are optimized. This optimization is performed with the goal of improving the assessment obtained by the cost function 8 during the further processing of training examples 10′, 20′.
- The aforementioned cost function 8 measures the plausibility of the chronological offset 5 or the corrected data set 6, or their conformance with a target output previously known for the respective training example. It is also possible that cost function 8 measures the fulfillment of a similarity condition and/or consistency condition between corrected data set 6 on the one hand and the first data set on the other hand. Furthermore, the cost function 8 can measure a quality of a processing product identified by downstream processing of the corrected data set 6 and/or by downstream processing of the second data set 20 with additional use of the chronological offset 5 identified.
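Two of the cost-function variants described above can be sketched as follows. The sketch assumes an absolute error against a known target offset and a mean squared error as the similarity condition between the corrected data set and the first data set; both choices, the weighting factor, and all function names are hypothetical, since the patent does not fix a specific formula.

```python
def offset_cost(predicted_offset, target_offset):
    """Conformance with a known target output: absolute error in seconds."""
    return abs(predicted_offset - target_offset)


def consistency_cost(corrected, reference):
    """Similarity between corrected data set and first data set: mean squared error.

    Both inputs are flat lists of per-cell values on the same grid."""
    if len(corrected) != len(reference):
        raise ValueError("data sets must have the same number of cells")
    return sum((c - r) ** 2 for c, r in zip(corrected, reference)) / len(corrected)


def total_cost(predicted_offset, target_offset, corrected, reference, weight=1.0):
    """Weighted sum of both terms; `weight` is an assumed hyperparameter."""
    return (offset_cost(predicted_offset, target_offset)
            + weight * consistency_cost(corrected, reference))


# Hypothetical training example: offset predicted as 30 ms, known target 20 ms
cost = total_cost(
    predicted_offset=0.03,
    target_offset=0.02,
    corrected=[0.0, 0.0, 0.0, 0.0],
    reference=[2.0, 2.0, 2.0, 2.0],
    weight=0.5,
)
```

The third variant, in which the quality of a downstream processing product is measured, would replace or extend these terms with the loss of the downstream task.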
- FIG. 3 shows further steps of a method 3000 for training a neural network for use, e.g., in the method according to FIG. 1.
- This method is based on the method 2000 explained in connection with FIG. 2.
- The measurement timepoints for both reference sensor 1 and second sensor 2 are precisely known.
- A speed of the platform supporting both sensors 1 and 2, as well as potential other speeds of objects in the field of view of the sensors, are further identified.
- A comparison data set 9 from the second sensor 2, which is related to the measurement timepoints of the reference sensor 1, is identified in step 600′.
- The comparison data set 9 is then compared in step 700′ by means of a cost function with the corrected data set 6 identified by the neural network and, with a view to minimizing the cost function, the parameters and weights of the neural network are adjusted in step 800′.
- The cost function described in connection with FIGS. 2 and 3 can in particular be based on metrics quantifying the similarity of point clouds.
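The construction of a comparison data set from known timepoints and speeds, and its comparison with the network output, can be sketched as follows. The sketch assumes a constant-velocity model for the motion compensation and uses the symmetric Chamfer distance as one common point-cloud similarity metric; the patent names neither, and the function names and values are hypothetical.

```python
import math


def motion_compensate(points, velocity, t_measured, t_reference):
    """Move 2D points to the reference timepoint, assuming constant relative velocity."""
    dt = t_reference - t_measured
    return [(x + velocity[0] * dt, y + velocity[1] * dt) for x, y in points]


def chamfer_distance(a, b):
    """Symmetric Chamfer distance: mean nearest-neighbor distance in both directions."""
    a_to_b = sum(min(math.dist(p, q) for q in b) for p in a) / len(a)
    b_to_a = sum(min(math.dist(q, p) for p in a) for q in b) / len(b)
    return a_to_b + b_to_a


# Second-sensor points measured at t = 0.00 s; the reference sensor measured at t = 0.02 s
second = [(10.0, 0.0), (20.0, 2.0)]
comparison = motion_compensate(second, velocity=(-5.0, 0.0),
                               t_measured=0.00, t_reference=0.02)

# Hypothetical network output that is off by 0.1 m in x for every point
network_output = [(x + 0.1, y) for x, y in comparison]

cost = chamfer_distance(network_output, comparison)  # about 0.2
```

The brute-force nearest-neighbor search is quadratic in the number of points; for realistic lidar or radar point clouds a spatial index or a batched tensor implementation would be used instead.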
- The chronological offset identified by the neural network can also be fed into an algorithm to be trained for environmental detection. In that case, the cost function, which characterizes the quality of a completed environmental detection, is also used to determine the quality of the chronological offset identified.
Abstract
- receiving a first data set from a reference sensor with measurements at different measurement timepoints,
- receiving a second data set of a second sensor with measurements at different measurement timepoints, each not exactly matching those of the reference sensor,
- reading the first and the second data sets by a neural network and identifying a respective plurality of feature vectors for the first and second data set at the respective measurement timepoints,
- merging and comparing the respective feature vectors, which refer to corresponding, not exactly matching measurement timepoints, by the neural network so that parameters of a chronological correction are identified, and
- identifying a chronological offset between the respective measurement timepoints of the reference sensor and the second sensor, and/or a corrected data set from the second sensor based on the measurement timepoints of the reference sensor.
Description
Claims (17)
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| DE102022206346.5 | 2022-06-23 | | |
| DE102022206346.5A DE102022206346A1 (en) | 2022-06-23 | 2022-06-23 | Method for temporal correction of multimodal data |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| US20230419649A1 (en) | 2023-12-28 |
| US12555368B2 (en) | 2026-02-17 |
Family
ID=89075714
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US18/337,153 (US12555368B2 (en), Active, expires 2044-05-29) | 2022-06-23 | 2023-06-19 | Method for temporal correction of multimodal data |
Country Status (3)
| Country | Link |
|---|---|
| US (1) | US12555368B2 (en) |
| CN (1) | CN117289223A (en) |
| DE (1) | DE102022206346A1 (en) |
Families Citing this family (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US12208814B2 (en) * | 2021-06-11 | 2025-01-28 | Zf Friedrichshafen Ag | Sensor performance validation in advanced driver-assistance system verification |
| DE102022200735A1 (en) * | 2022-01-24 | 2023-07-27 | Robert Bosch Gesellschaft mit beschränkter Haftung | Method and controller for training an object detector |
| DE102024108434A1 (en) * | 2024-03-25 | 2025-09-25 | Aumovio Autonomous Mobility Germany Gmbh | Method for operating a sensor system for object detection |
| CN118965411B (en) * | 2024-10-17 | 2025-02-07 | 山东舜林建设有限公司 | A method and related equipment for protecting municipal engineering construction data |
| CN121114949B (en) * | 2025-11-12 | 2026-02-03 | 北京海舶无人船科技有限公司 | An automatic calibration method and system for extrinsic parameters of a multimodal sensor |
Citations (11)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US9770172B2 (en) * | 2013-03-07 | 2017-09-26 | Volcano Corporation | Multimodal segmentation in intravascular images |
| US20190018722A1 (en) * | 2017-07-12 | 2019-01-17 | Vinay Ramanath | Method and system for deviation detection in sensor datasets |
| US10650253B2 (en) * | 2015-05-22 | 2020-05-12 | Continental Teves Ag & Co. Ohg | Method for estimating traffic lanes |
| US20210382469A1 (en) * | 2020-06-08 | 2021-12-09 | International Business Machines Corporation | Generating a hybrid sensor to compensate for intrusive sampling |
| US11210560B2 (en) * | 2019-10-02 | 2021-12-28 | Mitsubishi Electric Research Laboratories, Inc. | Multi-modal dense correspondence imaging system |
| US11263524B2 (en) * | 2018-03-07 | 2022-03-01 | International Business Machines Corporation | Hierarchical machine learning system for lifelong learning |
| US11443515B2 (en) * | 2018-12-21 | 2022-09-13 | Ambient AI, Inc. | Systems and methods for machine learning enhanced intelligent building access endpoint security monitoring and management |
| US11600074B2 (en) * | 2021-06-29 | 2023-03-07 | Anno.Ai, Inc. | Object re-identification |
| US11610112B2 (en) * | 2017-07-05 | 2023-03-21 | Siemens Aktiengesellschaft | Method for the computer-aided configuration of a data-driven model on the basis of training data |
| US11994615B2 (en) * | 2017-07-19 | 2024-05-28 | Intel Corporation | Compensating for a sensor deficiency in a heterogeneous sensor array |
| US12354342B2 (en) * | 2022-04-29 | 2025-07-08 | Toyota Research Institute, Inc. | Network for multisweep 3D detection |
2022
- 2022-06-23: DE application DE102022206346.5A (DE102022206346A1), status: Pending
2023
- 2023-06-19: US application US18/337,153 (US12555368B2), status: Active
- 2023-06-25: CN application CN202310753087.9A (CN117289223A), status: Pending
Also Published As
| Publication number | Publication date |
|---|---|
| US20230419649A1 (en) | 2023-12-28 |
| DE102022206346A1 (en) | 2023-12-28 |
| CN117289223A (en) | 2023-12-26 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | FEPP | Fee payment procedure | Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
| | AS | Assignment | Owner name: ROBERT BOSCH GMBH, GERMANY; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:GLAESER, CLAUDIUS;TIMM, FABIAN;DREWS, FLORIAN;AND OTHERS;SIGNING DATES FROM 20231006 TO 20231117;REEL/FRAME:065648/0719 |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION COUNTED, NOT YET MAILED |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: ALLOWED -- NOTICE OF ALLOWANCE NOT YET MAILED |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT RECEIVED; Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED |
| | STCF | Information on status: patent grant | Free format text: PATENTED CASE |