EP3847576A1 - Method and system for improved object marking in sensor data - Google Patents
Method and system for improved object marking in sensor data
- Publication number
- EP3847576A1 (application EP19773742.2A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- scene
- state
- data
- data record
- sensor
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/56—Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/768—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using context analysis, e.g. recognition aided by known co-occurring patterns
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/29—Graphical models, e.g. Bayesian networks
Definitions
- the present invention relates to a method and a system for improved object marking in sensor data.
- training data sets are often used, which can contain, for example, image and/or video data, for example in order to learn automatic object recognition in such or similar data.
- object recognition can be used, for example, in autonomous driving or flight operation to recognize objects in the vehicle environment.
- in order to ensure reliable object recognition, a large number of training data records may be required.
- objects identified in a (training) data record are often classified, marked or labeled and form an object-label pair which can be processed by machine in machine learning.
- a course of a road, for example, can be provided as an object with a marking that corresponds to the object class "road".
- object marking in image and video data sets can be cost-intensive, since it can be automated either not at all or only to a very limited extent. For this reason, such image and video annotations are predominantly carried out by human editors; for example, the annotation of a single captured image for semantic segmentation can take more than an hour on average.
- the object of the invention is therefore to provide a possibility for the simplified or more cost-effective provision of object markings or data containing annotations.
- the process has the following steps:
- a scene in a first state is detected by at least one sensor.
- the scene can be, for example, a vehicle environment, a street scene, a course of a road, a traffic situation or the like, and can include static and/or dynamic objects such as traffic areas, buildings, road users or the like.
- the sensor may be a single optical sensor, such as a camera, a lidar sensor, or a fusion of such or similar sensors.
- At least one object contained in the scene is assigned a first object marking, for example a first annotation, in a first data record containing the scene in the first state.
- the first data record can contain an image or an image sequence which reproduces the scene in its first state, that is to say, for example, contains an image of a street course.
- the first object marking can, for example, frame, fill, label or otherwise mark the object, preferably optically.
- the object and the object marking can form an object-label pair that can be processed, for example, in machine learning.
- the object marking can denote a certain object class, such as street, tree, building, traffic sign, pedestrian or the like.
- this can mean, for example, that a road is driven along at least twice and is detected by the sensor each time, in which case, for example, the different capture times distinguish the first state from the second state.
- one or more objects of the scene are already marked in the first state, e.g. a course of the road.
- object markings therefore do not have to be created from scratch again. Rather, this effort only has to be expended once, and the second data record can then be derived from it.
- a location to be captured, for whose image content an annotation already exists, can be captured again in one or more other states, the effort of annotation being incurred only initially.
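By way of illustration (not part of the original disclosure), the take-over of existing annotations into a re-captured data record can be sketched as follows. All names (`DataRecord`, `take_over_labels`) are hypothetical assumptions, not from the patent.

```python
# Minimal sketch: a data record pairs a captured scene state with its
# annotations; for a re-captured state of the same scene, the existing
# object markings are taken over instead of being created from scratch.
from dataclasses import dataclass, field

@dataclass
class DataRecord:
    scene_id: str          # identifies the captured scene/location
    state: str             # e.g. "day", "night", "fog"
    labels: dict = field(default_factory=dict)  # object id -> object class

def take_over_labels(first: DataRecord, second: DataRecord) -> DataRecord:
    """Copy the object markings of the first record into the second record
    of the same scene; only changed objects would still need manual review."""
    if first.scene_id != second.scene_id:
        raise ValueError("records do not describe the same scene")
    second.labels = dict(first.labels)  # at least partial take-over
    return second

rec_day = DataRecord("route-7", "day", {"obj-1": "street", "obj-2": "traffic sign"})
rec_night = DataRecord("route-7", "night")
take_over_labels(rec_day, rec_night)
```

The annotation effort is thus expended once for the first record and reused for any further state of the same scene.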
- a further development provides that, in order to recognize the scene in the second state, location information of the scene is assigned to the first data record.
- the location information can be provided, for example, by a suitable sensor, e.g. by GPS or the like. This makes it easier to recognize the scene or to assign a data record to a specific scene.
- sensor data can also be merged in order to provide the location information.
- this can be done using a combination of GPS and camera intrinsics; for example, ego-motion data of a vehicle can be taken into account.
- the first data record is associated with viewing angle and/or position information of the scene. This can be done in addition to assigning location information and can be based, for example, on ego-motion data of a vehicle, on GPS data, on camera intrinsics or the like. This further improves recognition.
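As a hedged illustration of how such location and viewing-angle information could be used to recognize a previously annotated scene, the following sketch compares GPS position and heading of two recordings. The distance and angle thresholds are illustrative assumptions.

```python
# Sketch: decide whether a new recording shows the same scene as an
# already annotated data record, based on GPS position and viewing angle.
import math

def is_same_scene(loc_a, loc_b, heading_a, heading_b,
                  max_dist_m=15.0, max_heading_deg=20.0):
    """loc = (lat, lon) in degrees; heading in degrees. Distance uses an
    equirectangular approximation, adequate for short distances."""
    lat_a, lon_a = map(math.radians, loc_a)
    lat_b, lon_b = map(math.radians, loc_b)
    x = (lon_b - lon_a) * math.cos((lat_a + lat_b) / 2)
    y = lat_b - lat_a
    dist_m = 6371000.0 * math.hypot(x, y)
    dh = abs((heading_a - heading_b + 180) % 360 - 180)  # wrapped heading diff
    return dist_m <= max_dist_m and dh <= max_heading_deg

# Nearly the same road position and viewing direction -> same scene:
same = is_same_scene((48.1351, 11.5820), (48.13515, 11.58205), 90.0, 95.0)
```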
- based on the first data record, a depth prediction of the first object marking can be carried out, e.g. monocularly, by means of a stereo depth estimate, an estimate of the optical flow and/or based on LIDAR data. A prediction of semantic segmentation can also be carried out in the unknown image, that is to say in the second data record.
- a further development provides that the object marking, or the label, is transformed so that it fits the new image of the second data record more precisely. This transformation is also known as warping.
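A minimal sketch of this warping step, under the assumption of a planar homography between the two recordings (in practice the transformation would be estimated from matched features or poses), can look as follows. The pure-NumPy nearest-neighbour warp and all names are illustrative, not the patent's implementation.

```python
# Sketch: transform an existing integer label mask with a homography so
# that the object marking fits the new image of the second data record.
import numpy as np

def warp_labels(labels: np.ndarray, H: np.ndarray) -> np.ndarray:
    """Inverse-warp a label mask with homography H (maps old -> new image)."""
    h, w = labels.shape
    ys, xs = np.mgrid[0:h, 0:w]
    Hinv = np.linalg.inv(H)
    # For each target pixel, look up the source pixel (nearest neighbour).
    src = Hinv @ np.stack([xs.ravel(), ys.ravel(),
                           np.ones(h * w)]).astype(float)
    sx = np.round(src[0] / src[2]).astype(int)
    sy = np.round(src[1] / src[2]).astype(int)
    valid = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    out = np.zeros_like(labels)
    out.reshape(-1)[valid] = labels[sy[valid], sx[valid]]
    return out

# Shift an annotated "street" region (label 1) two pixels to the right:
mask = np.zeros((6, 6), dtype=np.int32)
mask[2:4, 1:3] = 1
H = np.array([[1.0, 0.0, 2.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
warped = warp_labels(mask, H)
```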
- a SLAM method (Simultaneous Localization And Mapping) can be used to obtain a better location and position determination.
- the effort for object marking or annotation can be reduced particularly significantly if the take-over of the first object marking is at least partially automated by an artificial intelligence module, or AI module for short.
- the AI module can have at least one processor and can be set up, e.g. by program instructions, to emulate human-like decision-making structures in order to solve problems independently, such as, in this case, automatic object marking or annotation.
- at least one artificial neural network of the AI module, which can be configured as multi-layer and/or convolutional, determines matching image regions of the scene in the first and second data records.
- the artificial neural network can provide a pixel-wise match mask as an output. This can be a good basis for manual, semi-automatic or fully automatic object marking.
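To illustrate what such a pixel-wise match mask represents, the following sketch uses a simple thresholded intensity difference as a stand-in for the (unspecified) comparison network: 1 means "unchanged between the two states", 0 means "changed, annotation needs review". The tolerance value is an assumption.

```python
# Stand-in for the comparison network's output: a pixel-wise match mask
# over the first and second data records of the same scene.
import numpy as np

def match_mask(img_a: np.ndarray, img_b: np.ndarray, tol: float = 0.1) -> np.ndarray:
    """1 where the two recordings agree within `tol` (normalized), else 0."""
    a = img_a.astype(float) / 255.0
    b = img_b.astype(float) / 255.0
    return (np.abs(a - b) <= tol).astype(np.uint8)

first_state = np.full((4, 4), 200, dtype=np.uint8)
second_state = first_state.copy()
second_state[0, 0] = 60        # one region of the scene has changed
mask01 = match_mask(first_state, second_state)
```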
- the AI module can be trained using the first and/or second data record, which can accordingly be fed to the AI module as a training data record.
- at least one distinguishing feature of the scene between the first state and the second state can be determined, and the second object marking can be assigned to the distinguishing feature. This is possible at least if the distinguishing feature, for example the difference class, already has a sufficiently good quality (e.g. a statistical test with high confidence) and the comparison network indicates a match for the remaining image content of the scene. Then, for example, an option can be offered to take over the object marking, i.e. the annotation, automatically. In other words, on the basis of the above-mentioned or another artificial neural network, a prediction can be carried out with existing training data in order to detect any changes in the scene. Since an image-label pair for the scene already exists in the training data, a high prediction quality can be achieved. A difference between annotation and prediction gives an indication of which objects must be annotated.
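The comparison between the taken-over annotation and a fresh prediction can be sketched as follows (illustrative only): for each object class, a per-class IoU is computed, and classes below a confidence threshold are flagged as needing (re-)annotation. The threshold of 0.5 and all names are assumptions.

```python
# Sketch: flag object classes whose taken-over annotation disagrees with
# the prediction on the re-captured scene, so only those need annotating.
import numpy as np

def classes_needing_annotation(annotation: np.ndarray,
                               prediction: np.ndarray,
                               iou_threshold: float = 0.5) -> list:
    flagged = []
    for cls in np.union1d(np.unique(annotation), np.unique(prediction)):
        a = annotation == cls
        p = prediction == cls
        union = np.logical_or(a, p).sum()
        iou = np.logical_and(a, p).sum() / union if union else 1.0
        if iou < iou_threshold:
            flagged.append(int(cls))
    return flagged

anno = np.zeros((4, 4), dtype=np.int32)
anno[1:3, 1:3] = 1                      # taken-over "street" region
pred = np.zeros((4, 4), dtype=np.int32)
pred[1:3, 1:3] = 1                      # prediction agrees on the street
pred[0, 3] = 2                          # but finds a new object class
needs_review = classes_needing_annotation(anno, pred)
```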
- a further development provides that the scene in the second state can be captured by an image sequence, and an unfavorable position from which the scene is captured in the second state can be compensated for on the basis of at least one single image upstream and/or downstream of the individual image to be marked.
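As a simple illustration of such compensation within an image sequence, the following sketch selects the best frame within a small window around the frame to be marked. The per-frame quality score (e.g. a sharpness measure) and all names are hypothetical assumptions.

```python
# Sketch: compensate an unfavorable capture position by choosing the
# highest-quality neighbouring frame of the image sequence.
def best_compensating_frame(scores: list, idx: int, window: int = 2) -> int:
    """Return the index of the best-scoring frame among frame `idx` and up
    to `window` frames upstream/downstream of it."""
    lo = max(0, idx - window)
    hi = min(len(scores), idx + window + 1)
    return max(range(lo, hi), key=lambda i: scores[i])

# frame 3 is blurred (low score); the upstream frame 2 compensates for it:
scores = [0.7, 0.8, 0.9, 0.2, 0.6]
chosen = best_compensating_frame(scores, 3)
```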
- the first state and the second state of the scene can differ in terms of weather conditions, light conditions or the like.
- the scene can be captured again if the visibility conditions deteriorate due to fog compared to sunny weather, at night or the like.
- the second state can, for example if it includes darkness, poor visibility or the like, cause one or more objects of the scene to no longer be visible in the second data record.
- invisible areas can be marked or annotated accordingly, or can be automatically excluded based on, for example, a signal-to-noise ratio.
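A hedged sketch of such an automatic exclusion follows: image blocks whose local variance falls below a noise floor are given a dedicated "invisible" label, so they are ignored during annotation take-over. The statistic, block size, threshold and the reserved label value are illustrative assumptions, not taken from the patent.

```python
# Sketch: exclude low-signal (e.g. fog, darkness) areas from the
# taken-over object markings based on a simple per-block SNR criterion.
import numpy as np

INVISIBLE = 255  # reserved label for areas excluded from annotation

def exclude_low_snr(labels: np.ndarray, image: np.ndarray,
                    block: int = 4, min_std: float = 2.0) -> np.ndarray:
    out = labels.copy()
    h, w = image.shape
    for y in range(0, h, block):
        for x in range(0, w, block):
            patch = image[y:y+block, x:x+block].astype(float)
            if patch.std() < min_std:   # hardly any signal in this block
                out[y:y+block, x:x+block] = INVISIBLE
    return out

img = np.zeros((8, 8), dtype=np.uint8)            # left half: featureless fog
img[:, 4:] = np.tile(np.array([0, 80, 160, 240], dtype=np.uint8), (8, 1))
lab = np.ones((8, 8), dtype=np.uint8)             # taken-over "street" labels
cleaned = exclude_low_snr(lab, img)
```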
- the invention also relates to a system for object marking in sensor data.
- the system can in particular be operated in accordance with the method described above and accordingly further developed according to one or more of the embodiment variants described above.
- the system has at least one, preferably optical, sensor for detecting a scene and a data processing device, for example a computer with a processor, a memory and/or the like.
- the data processing device is set up to assign a first object marking to at least one object contained in the scene in a first data record containing the scene in a first state, and to at least partially take over the first object marking contained in the first data record as a second object marking in a second data record for the object recognized in a second state of the scene.
- the system can have a second sensor for determining the location and/or position during the detection of the scene, the location and/or position determination being assigned to the detected scene, i.e. in particular to the first data record.
- the second sensor can comprise, for example, one or more sensors, such as for GPS positioning, for determining ego-motion or the like.
- Figure 1 is a schematic of a system according to an embodiment of this invention.
- Figure 2 shows a practical application of the method using the example of a road scene.
- FIG. 1 shows a diagram of a system 100 which is suitable for the partially automated and / or fully automated marking or annotation of an object or an object class recognized in an image or in an image sequence.
- the system 100 comprises a data processing device 110, which can have a processor, a storage device, in particular for program code, etc.
- the data processing device 110 has at least one artificial intelligence module 120, or AI module for short, which is set up, for example by means of a multilayer artificial neural network 130, for pattern recognition in an image or in an image sequence.
- the system has at least one first sensor 140, which is designed as an optical sensor, for example as a camera, and at least one second sensor 150 for determining the location and / or position.
- the sensors 140, 150 are arranged, by way of example, on or in a motor vehicle 160 and can also be borrowed from another vehicle system.
- the first sensor 140 can thus be part of a driver assistance system that can also be set up for autonomous driving operation of the motor vehicle 160.
- the second sensor 150 can, for example, be part of a navigation system of the motor vehicle 160.
- the system 100 can be operated using the method described below.
- the motor vehicle 160 is moved through a scene 170, which here is, by way of example, a traffic situation with an object 180, which can be, e.g., a static object in the form of a street or the like.
- This scene 170 is recorded in a first state as an image or image sequence by means of the first sensor 140 and stored in a first data record 190.
- the first state of the scene 170 corresponds, for example, to a daytime travel of the motor vehicle 160 through the scene, with the scene being assumed to be illuminated as bright as day.
- based on the location and/or position determination by the second sensor 150, location information indicating where the scene was recorded, as well as viewing angle and/or position information, are also stored in the first data record 190.
- the same or an at least similar scene is recorded again in a second state, which differs from the first state, which is why the newly recorded scene in the second state is denoted by 170' in FIG. 1.
- the object 180 is still part of the scene 170'.
- this scene 170' in the second state is stored in a second data record 190'.
- the first data record 190 is fed to the data processing device 110, and with its help, e.g. manually or partially automated, possibly also fully automated by the AI module 120, the object 180 is marked with a first object marking 195, i.e. an annotation.
- the object marking 195 can, for example, be a highlighting of a street.
- the second data record 190' is also fed to the data processing device 110 and processed therein.
- the AI module 120 is also set up to recognize the object 180 in the second data record 190' and to assign a second object marking 195' to it, which is the same as the first object marking 195 in the first data record 190 when the object 180 is unchanged. For recognizing the scene 170' and/or the object 180, the AI module 120 accesses the information on the location and position of the recording of the scene 170, which is stored in the first data record 190. As a result of the processing by the AI module 120, the second data record 190' now also contains the similar or same scene 170' and the second object marking 195'.
- FIG. 2 shows an exemplary scene 170 on the left-hand side, in which the object 180 is a course of a road, which is already provided here with the first object marking 195. It is assumed that comparatively bad weather prevailed during the recording of scene 170 and the view is therefore slightly restricted. On the right-hand side of FIG. 2, scene 170' is recorded again in clearer weather.
- the AI module 120 has recognized the scene 170' and has provided the object 180, that is to say the course of the road, with the second object marking 195'.
- the system 100 and the method described above can be modified in many ways. For example, based on the first data record 190, a depth prediction of the image already having the first object marking can be carried out, e.g. monocularly, by a stereo depth estimate, an estimate of the optical flow and/or on the basis of LIDAR data. There can also be a prediction of semantic segmentation in the unknown image, that is to say in the second data record 190'.
- the first object marking 195 can be transformed so that the object marking fits the new image of the second data record 190' more precisely; this transformation is also known as warping. It is also possible for a SLAM (Simultaneous Localization And Mapping) method to be used to obtain a better location and position determination. It is also conceivable for the artificial neural network 130 to provide a pixel-wise match mask as an output.
- it is also possible that at least one distinguishing feature of the scene 170, 170' between the first state and the second state is determined and the second object marking 195' is assigned to the distinguishing feature, at least if the distinguishing feature, e.g. the difference class, already has a sufficiently good quality (e.g. a statistical test with high confidence) and the artificial neural network 130 indicates a match for the remaining image content of the scene 170, 170'; in this case, e.g., an option is offered to take over the object marking 195 automatically.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Multimedia (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Databases & Information Systems (AREA)
- Medical Informatics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Molecular Biology (AREA)
- Mathematical Physics (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
Description
Claims
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE102018214979.8A DE102018214979A1 (en) | 2018-09-04 | 2018-09-04 | Method and system for improved object marking in sensor data |
PCT/EP2019/073385 WO2020048940A1 (en) | 2018-09-04 | 2019-09-03 | Method and system for improved object marking in sensor data |
Publications (1)
Publication Number | Publication Date |
---|---|
EP3847576A1 true EP3847576A1 (en) | 2021-07-14 |
Family
ID=68062888
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP19773742.2A Pending EP3847576A1 (en) | 2018-09-04 | 2019-09-03 | Method and system for improved object marking in sensor data |
Country Status (5)
Country | Link |
---|---|
US (1) | US11521375B2 (en) |
EP (1) | EP3847576A1 (en) |
CN (1) | CN112639812A (en) |
DE (1) | DE102018214979A1 (en) |
WO (1) | WO2020048940A1 (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11610412B2 (en) * | 2020-09-18 | 2023-03-21 | Ford Global Technologies, Llc | Vehicle neural network training |
DE102021207093A1 (en) | 2021-07-06 | 2023-01-12 | Robert Bosch Gesellschaft mit beschränkter Haftung | Apparatus and method for providing classified digital recordings to an automatic machine learning system and updating machine-readable program code therewith |
DE102022209401A1 (en) | 2022-01-18 | 2023-07-20 | Robert Bosch Gesellschaft mit beschränkter Haftung | Method for generating training data for an adaptive method |
CN117475397B (en) * | 2023-12-26 | 2024-03-22 | 安徽蔚来智驾科技有限公司 | Target annotation data acquisition method, medium and device based on multi-mode sensor |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3131020B1 (en) * | 2015-08-11 | 2017-12-13 | Continental Automotive GmbH | System and method of a two-step object data processing by a vehicle and a server database for generating, updating and delivering a precision road property database |
EP3130891B1 (en) * | 2015-08-11 | 2018-01-03 | Continental Automotive GmbH | Method for updating a server database containing precision road information |
US9734455B2 (en) * | 2015-11-04 | 2017-08-15 | Zoox, Inc. | Automated extraction of semantic information to enhance incremental mapping modifications for robotic vehicles |
US11563895B2 (en) * | 2016-12-21 | 2023-01-24 | Motorola Solutions, Inc. | System and method for displaying objects of interest at an incident scene |
JP2019023858A (en) * | 2017-07-21 | 2019-02-14 | パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカPanasonic Intellectual Property Corporation of America | Learning data generation device, learning data generation method, machine learning method, and program |
CN107578069B (en) * | 2017-09-18 | 2020-12-29 | 北京邮电大学世纪学院 | Image multi-scale automatic labeling method |
US10866588B2 (en) * | 2017-10-16 | 2020-12-15 | Toyota Research Institute, Inc. | System and method for leveraging end-to-end driving models for improving driving task modules |
US10175697B1 (en) * | 2017-12-21 | 2019-01-08 | Luminar Technologies, Inc. | Object identification and labeling tool for training autonomous vehicle controllers |
US10691943B1 (en) * | 2018-01-31 | 2020-06-23 | Amazon Technologies, Inc. | Annotating images based on multi-modal sensor data |
-
2018
- 2018-09-04 DE DE102018214979.8A patent/DE102018214979A1/en active Pending
-
2019
- 2019-09-03 WO PCT/EP2019/073385 patent/WO2020048940A1/en unknown
- 2019-09-03 CN CN201980057805.0A patent/CN112639812A/en active Pending
- 2019-09-03 US US17/054,692 patent/US11521375B2/en active Active
- 2019-09-03 EP EP19773742.2A patent/EP3847576A1/en active Pending
Also Published As
Publication number | Publication date |
---|---|
CN112639812A (en) | 2021-04-09 |
US20210081668A1 (en) | 2021-03-18 |
WO2020048940A1 (en) | 2020-03-12 |
DE102018214979A1 (en) | 2020-03-05 |
US11521375B2 (en) | 2022-12-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2020048940A1 (en) | Method and system for improved object marking in sensor data | |
DE102008023970A1 (en) | Method and device for identifying traffic-related information | |
DE102016210534A1 (en) | Method for classifying an environment of a vehicle | |
DE102008041679A1 (en) | Method for environment recognition for navigation system in car, involves storing data of object or feature in storage, and classifying object or feature by comparison of data after visual inspection of object or feature | |
WO2020002261A1 (en) | Localization system and method for operating same | |
DE102018116036A1 (en) | Training a deep convolutional neural network for individual routes | |
DE102014221803A1 (en) | Method and device for determining a current driving situation | |
DE112020003091T5 (en) | System for realizing automatic iteration of predictive model based on data operation | |
DE102018133457B4 (en) | Method and system for providing environmental data | |
WO2020048669A1 (en) | Method for determining a lane change indication of a vehicle, computer-readable storage medium, and vehicle | |
DE102020211636A1 (en) | Method and device for providing data for creating a digital map | |
WO2020126167A1 (en) | Method for identifying at least one pattern in the surroundings of a vehicle, control device for carrying out such a method, and vehicle having such a control device | |
DE102019214200A1 (en) | Translation of training data between observation modalities | |
DE102017004721A1 (en) | Method for locating a vehicle | |
DE102018007962A1 (en) | Method for detecting traffic light positions | |
DE102021204687A1 (en) | Process for scene interpretation of an environment of a vehicle | |
WO2017174227A1 (en) | Method for determining a pose of an at least partially autonomously moving vehicle using specially selected landmarks transmitted from a back end server | |
DE102020110730A1 (en) | Method and device for increasing the availability of an automated driving function or a driver assistance system | |
EP3772017A1 (en) | Rail signal detection for autonomous railway vehicles | |
DE102021001043A1 (en) | Method for the automatic detection and localization of anomalies in data recorded by means of a lidar sensor | |
DE102021124736A1 (en) | Method and device for determining a position hypothesis | |
DE102018121274B4 (en) | Process for visualizing a driving intention, computer program product and visualization system | |
WO2020164841A1 (en) | Method for providing a training data set quantity, method for training a classifier, method for controlling a vehicle, computer-readable storage medium and vehicle | |
DE102019103192A1 (en) | Method for generating training data for a digital, adaptive camera system | |
DE102019108722A1 (en) | Video processing for machine learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: UNKNOWN |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE |
|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE |
|
17P | Request for examination filed |
Effective date: 20210406 |
|
AK | Designated contracting states |
Kind code of ref document: A1 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
|
DAV | Request for validation of the european patent (deleted) | ||
DAX | Request for extension of the european patent (deleted) | ||
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: EXAMINATION IS IN PROGRESS |
|
17Q | First examination report despatched |
Effective date: 20230821 |