EP3721370A1 - Trainieren und betreiben eines maschinen-lern-systems - Google Patents
Trainieren und Betreiben eines Maschinen-Lern-Systems (Training and operating a machine learning system)
- Publication number
- EP3721370A1 (application EP18789090.0A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- depth
- depth map
- learning system
- machine learning
- image data
- Prior art date
- 2017-12-04
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/64—Three-dimensional objects
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60W—CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
- B60W60/00—Drive control systems specially adapted for autonomous road vehicles
- B60W60/001—Planning or execution of driving tasks
- B60W60/0011—Planning or execution of driving tasks involving control alternatives for a single driving scenario, e.g. planning several paths to avoid obstacles
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/254—Fusion techniques of classification results, e.g. of results related to same input data
- G06F18/256—Fusion techniques of classification results, e.g. of results related to same input data of results relating to different input data, e.g. multimodal recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/50—Depth or shape recovery
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/809—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of classification results, e.g. where the classifiers operate on the same input data
- G06V10/811—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of classification results, e.g. where the classifiers operate on the same input data the classifiers operating on different input data, e.g. multi-modal recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10028—Range image; Depth image; 3D point clouds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30248—Vehicle exterior or interior
- G06T2207/30252—Vehicle exterior; Vicinity of vehicle
- G06T2207/30261—Obstacle
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/56—Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
- G06V20/58—Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
Definitions
- the present invention relates generally to the field of artificial intelligence.
- More particularly, the invention relates to a method for teaching a machine learning system, a method for operating a machine learning system, and a machine learning system itself.
- Machine learning systems and/or artificial intelligence systems are generally used here which, for example on the basis of image data recorded with a camera of the motor vehicle, classify the objects recognizable in the image data and thus ultimately identify objects automatically.
- Such machine learning systems for detecting objects may, for example, comprise a neural network.
- The object recognition by means of a machine learning system can also be improved by evaluating data from other sensors in addition to image data. Such systems are also called multi-path systems.
- In such a system, a learning system such as a neural network, which detects the environment based on image information from a single camera, can be supplemented by a module which records and interprets 3D information and/or temporal movement information.
- Machine learning systems are usually taught with training data. This can be a challenge especially with multi-path systems.
- With the invention, a machine learning system can be extensively taught and/or trained, so that overall the machine learning system and/or an object recognition based on the machine learning system can be improved.
- A method for teaching and/or training a machine learning system is proposed.
- In the method, image data is fed into the machine learning system, and at least part of the image data is processed.
- The method is characterized in particular by the following steps:
- synthetically generating at least a portion of at least one depth map having a plurality of depth information values, wherein each of the depth information values correlates with a distance to an object;
- feeding the at least one depth map into the machine learning system and processing at least a portion of the depth information values; and
- based on the processed image data and based on the processed depth information values of the at least one depth map, adjusting, varying and/or changing at least one parameter value of at least one parameter of the machine learning system, wherein the adapted at least one parameter value influences an interpretation of input data by the machine learning system and/or an output of the machine learning system.
- the machine learning system can designate an arbitrarily designed artificial intelligence system.
- the ML system can be designed as a classifier, for example as a neural network.
- The ML system can be designed as a multi-path system and be set up to process, analyze and/or interpret depth data in addition to image data.
- the ML system can be embodied as a one-part system or, for example, have a plurality of modules which can each process different data.
- the ML system may include a first module for processing image data and a second module for processing depth maps.
- Data can be supplied to the ML system, for example, via a suitable data connection and/or a suitable interface of the ML system.
- For teaching, the data can be propagated through the ML system in a forward direction in the context of a forward propagation and/or in a backward direction in the context of a backward propagation.
- The parameter value of the at least one parameter of the ML system can be iteratively and/or successively adapted, changed and/or varied such that a reaction and/or the interpretation of any further input data with respect to an intended use of the ML system is improved and/or optimized.
- If the ML system is used for object recognition, the parameter value of the at least one parameter can thus be adjusted, changed and/or varied with the method so that ultimately the precision of the object recognition is improved.
- the "interpretation of the input data by the ML system” can mean that the system processes the input data and a in particular by the parameter value of the at least one parameter at least partially influenced output provides.
- If the ML system is a classifier, the system may output at least one class name, class, and/or at least one probability value for a class of objects.
- The parameter may therefore generally denote a quantity, in particular a mathematical quantity, based on which the ML system analyzes and/or interprets the input data supplied to it, such as sensor data, image data and/or depth maps.
- the above-mentioned image data and / or the at least one depth map may in particular designate training data of the ML system.
- the image data and / or the depth map may also be labeled.
- the image data and / or the depth map may also contain information regarding, for example, the objects located in the image data and / or the depth map.
- Based on the labels, the ML system can for training purposes determine a recognition error, and the at least one parameter value of the at least one parameter can be adjusted while minimizing the recognition error in order to teach the ML system and/or to optimize the object recognition.
- the depth map may designate a disparity map that may represent data from any proximity information acquisition sensor.
- The depth map may represent, include and/or exhibit information regarding a distance to an object, i.e. distance information.
- In other words, the depth map can represent spatial information, and the depth information values of the depth map can indicate distance data.
- the input data may designate any sensor data, such as image data of a camera and / or a depth map of a distance sensor, which are evaluated, analyzed and / or interpreted by the ML system, for example for object recognition.
- The input data may be unlabeled and/or designate data which are fed to the ML system for actual object recognition after the ML system has been trained.
- Synthetic generation can be understood as artificial production. Synthetically generated and / or artificially generated depth information values may therefore denote distance data that was not acquired by a sensor and / or distance sensor but, for example, was generated manually and / or by machine.
- In other words, a training method for an ML system, such as a neural network, is proposed, in which the system is supplied both with image data, such as images from a camera, and with an at least partially synthetically generated depth map.
- the parameter value of the at least one parameter of the ML system is adapted to teach the ML system.
- several or all parameters of the ML system can be adapted so that, for example, object recognition by the ML system is improved.
- the ML system can be trained with image data and exclusively with synthetically generated depth maps.
- real depth maps can be used for training, which were acquired with a sensor.
- Likewise, synthetically generated image data can be supplied to the system for teaching.
- Synthetically generating the depth map advantageously makes it possible to produce and train an artificial scenario in which, for example, an interpretation of the image data and an interpretation of the depth map by the ML system would lead to different responses and/or results. Furthermore, compared to image data, a depth map can be generated synthetically with little effort, quickly and cost-efficiently, since it is less complex than image data.
- In this way, cost-effective depth maps can be generated with which, in addition to the image data, optical illusions can be trained. For example, a scenario can be modeled in which a camera takes a picture of a billboard on which a road is shown. An interpretation of only this image of a road could lead to the ML system interpreting the road as a real road; this scenario is sketched below.
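- To illustrate, the following is a minimal numpy sketch of the billboard scenario, contrasting a plausible depth map of a real road with the flat depth map a billboard would produce; all shapes, sizes and distances are invented for illustration and are not from the patent:

```python
import numpy as np

def road_depth_map(height=96, width=128, max_depth_m=200.0):
    """Plausible depth map of an open road: distance grows towards the horizon."""
    rows = np.linspace(max_depth_m, 2.0, height).reshape(-1, 1)  # far at top, near at bottom
    return np.tile(rows, (1, width))

def billboard_depth_map(height=96, width=128, distance_m=15.0):
    """Depth map of a flat billboard: every pixel lies at the same distance."""
    return np.full((height, width), distance_m)

# The camera image of both scenes may show the same road, but the depth maps
# differ sharply -- exactly the discrepancy the synthetic training exploits.
road, poster = road_depth_map(), billboard_depth_map()
print(float(np.abs(road - poster).mean()))  # large mean deviation between the scenes
```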
- Teaching a learning system is usually done by processing large amounts of training data. However, the training data can only map a limited number of scenarios; there may therefore be objects and/or scenarios that are unknown to the system.
- The present invention therefore makes it possible, with additional synthetically generated depth maps, to train a correct assessment of objects and/or scenarios that are not included in a conventional training data set of real data.
- Without such synthetic depth maps, the parameters of the system, such as weights of a neural network, could be determined and/or chosen such that the system weights the complex image data more strongly than the depth maps, since more information may be present in the image data. When using the trained ML system, this may result in the system following an interpretation of the image data, for example when mutually divergent image data and depth maps are being interpreted, since such deviations are typically underrepresented in real datasets, occur rarely and are therefore possibly insufficiently trained.
- the use of a multipath system which, for example, detects objects based on data from different sensors, such as image data and depth maps, can generally enable a redundant safeguarding of the object recognition. Since it is relatively unlikely that two different recognition paths of the system, such as a first path based on the image data and a second path based on the depth maps, are erroneous, each of the paths can serve as a plausibility check of the other path, whereby overall the object recognition can be improved.
- Such a multi-path system can also be trained with the inventive method.
- A particular advantage results from the use of synthetically generated depth maps to train the system. The biggest gain is in the treatment of so-called corner cases, i.e. rare special cases in which the image data would allow a different conclusion than the depth map, such as the optical illusions described above. Such cases are very rare and difficult to learn by simply feeding real scenes into the system. In general, therefore, the present invention can decisively improve a learning system and/or an ML system which must reliably decide on the basis of visual data and/or image data whether a relevant object is present.
- the method further comprises the step of associating the image data with the at least one depth map.
- In this case, the parameter value of the machine learning system is adapted in dependence on the processed image data and in dependence on the processed depth map. In this way, the parameter value of the at least one parameter can be matched to both data sources, i.e. both the image data and the depth map.
- a multi-path system can be comprehensively trained, which analyzes and / or interprets an environment both based on image data, ie based on visual information, and based on a depth map, ie based on spatial information.
- the depth map comprises a matrix, an array and / or a list of entries, each entry representing a pixel of a device for acquiring depth information, spatial information and / or 3D information.
- a value of each entry is a depth information value for indicating a distance between the device and an object.
- the depth map may in particular designate a disparity map and / or the depth information values may designate disparity values and / or distance data.
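- Put concretely, such a depth map can be thought of as a small matrix of per-pixel distance values. A minimal sketch, with invented shape and metre values:

```python
import numpy as np

# A depth map as described above: one entry per sensor pixel, each value
# encoding the distance (here in metres) between the device and an object.
depth_map = np.array([
    [150.0, 150.0, 12.0],
    [150.0,  12.0, 12.0],
], dtype=np.float32)

print(depth_map[1, 2])  # distance seen by the pixel at row 1, column 2 -> 12.0 m
```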
- According to one embodiment of the invention, the at least one depth map represents data of a stereo camera, a multi-view camera, a distance measuring device, or a radar-based, ultrasound-based and/or laser-based distance measuring device.
- In general, the depth map may represent data of any sensor for acquiring distance information, distance data and/or spatial information.
- the image data may designate data of any optical sensor, such as a camera, an RGB camera, a color camera, a grayscale camera, and / or an infrared camera.
- the step of synthetically generating the at least one portion of the at least one depth map further comprises the substeps of defining, establishing, and / or determining a plurality of depth information values of the depth map and storing the plurality of defined depth information values in the depth map.
- In particular, at least about 1% of the depth information values of the synthetically generated depth map may be defined. By setting at least about 1% of the depth information values, it can be ensured that a sufficiently large and/or massive object is artificially generated in the depth map, so that, when a real depth map is fed into the trained ML system, statistical noise in the data is not falsely recognized as an object. Overall, both the training process and object recognition with the trained system can thereby be further improved.
- According to one embodiment of the invention, the defined depth information values are values of at least a subset of entries of the depth map, which subset represents a contiguous pixel area of pixels of a depth information acquisition device, such that by defining the depth information values, distance information is generated with respect to a geometrically contiguous object in the depth map.
- In this way, a massive and/or relatively large object can be artificially generated in the depth map, which can represent a real object.
- the artificially generated in the depth map object can have any shape, contour and / or size.
- the object can be generated at any position in the depth map.
- Also, several objects can be generated in a single depth map, for example at different positions, as in the sketch below.
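- The following is a hedged numpy sketch of such an insertion: a contiguous rectangular region covering at least about 1% of the entries is set to a fictitious distance. The function name, shapes and the use of a rectangle are illustrative assumptions, not the patent's prescribed method:

```python
import numpy as np

rng = np.random.default_rng(42)

def insert_synthetic_object(depth_map, distance_m, min_fraction=0.01):
    """Define a contiguous rectangular pixel region at a fictitious distance.

    The region covers at least `min_fraction` (about 1%) of all entries, so the
    artificial object is large enough not to be mistaken for statistical noise.
    """
    h, w = depth_map.shape
    min_pixels = int(min_fraction * h * w)
    obj_h = int(rng.integers(max(2, min_pixels // w + 1), h // 2))
    obj_w = max(min_pixels // obj_h + 1, int(rng.integers(2, w // 2)))  # enforce minimum area
    top = int(rng.integers(0, h - obj_h))
    left = int(rng.integers(0, w - obj_w))
    out = depth_map.copy()
    out[top:top + obj_h, left:left + obj_w] = distance_m  # the contiguous object
    return out

depth = np.full((96, 128), 150.0)            # empty scene: everything far away
synthetic = insert_synthetic_object(depth, distance_m=12.0)
assert (synthetic != depth).mean() >= 0.01   # at least ~1% of entries were defined
```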
- In this way, a geometric deviation between the image data and the at least one depth map is generated.
- the discrepancy between the depth map and the image data may mean that the object exists only in the depth map.
- the discrepancy may mean that different objects are present in the image data and the depth map.
- For example, in the image data a roadway may be recognizable which, however, originates only from a billboard, whereas in the depth map the billboard may be recognizable as a solid object at a certain distance.
- According to one embodiment of the invention, the parameter value of the at least one parameter of the machine learning system is adapted such that, in the event of a discrepancy between the image data and the at least one depth map, an interpretation of the depth map by the machine learning system is preferred over an interpretation of the image data by the machine learning system.
- For example, the ML system then provides a corresponding output, which can cause the vehicle to perform a braking operation and/or an evasive maneuver.
- In this way, the reliability and/or precision of the object recognition by the trained ML system can be increased. When such a trained ML system is used in a vehicle, this can also increase the safety of the vehicle.
- Since the object is generated synthetically, the distance can be a fictitious distance of a fictional object.
- In particular, the distance to the object can lie in a safety-relevant area of a vehicle, so that the ML system can be extensively trained for use in a vehicle.
- Depending on which sensor's data the depth map represents, the corresponding distance of the synthetically generated object can be selected. For example, if the depth map represents data from an ultrasound-based distance measuring device, then the object synthesized in the depth map can be generated at a smaller distance than would be the case for, say, a radar-based distance measuring device, so as to take into account the shorter range of the ultrasonic distance measuring device.
- According to one embodiment of the invention, the method further comprises the following steps: synthetically generating a plurality of depth maps by defining a plurality of depth information values of each depth map, such that different objects are generated in the depth maps, and feeding the plurality of depth maps into the machine learning system.
- In this way, the training of the ML system can be improved. Many different scenarios with many different objects can thus be trained, whereby object recognition with the trained ML system can be significantly improved.
- According to one embodiment of the invention, the objects generated in the synthetically generated depth maps differ from one another with regard to a contour, a dimension, a position in the respective depth map and/or with respect to a distance.
- objects that are different from one another in the depth maps can be generated, which may allow the training of different scenarios and / or the recognition of different objects.
- the objects produced in the depth maps may, for example, have a round, oval, angular, polygonal, quadrangular, triangular or any other contour and / or geometry.
- the different objects can be generated in particular randomly. For example, certain parameters of the objects, such as dimensions, sizes, geometries, positions in the depth maps, or the like, may be varied randomly, such as using a random number generator.
- Alternatively, the objects generated in the depth maps can originate from scanned real objects. In this way, large amounts of different depth maps with different objects can be generated efficiently and quickly and used to train the ML system; a randomized generator of this kind is sketched below.
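- A possible randomized generator, as a hedged sketch: each call places one elliptical object of random size and position at a random distance within the safety-relevant range mentioned later in the description (5 cm to 200 m); all names, shapes and parameter values are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_synthetic_depth_map(h=96, w=128, background_m=150.0):
    """One synthetic depth map with a randomly placed and sized elliptical object."""
    cy, cx = int(rng.integers(10, h - 10)), int(rng.integers(10, w - 10))
    ry, rx = int(rng.integers(5, 20)), int(rng.integers(5, 30))  # random dimensions
    distance_m = float(rng.uniform(0.05, 200.0))                 # 5 cm .. 200 m
    yy, xx = np.ogrid[:h, :w]
    mask = ((yy - cy) / ry) ** 2 + ((xx - cx) / rx) ** 2 <= 1.0  # contiguous ellipse
    depth = np.full((h, w), background_m)
    depth[mask] = distance_m
    return depth

# A whole batch of mutually different synthetic depth maps for training.
batch = np.stack([random_synthetic_depth_map() for _ in range(64)])
print(batch.shape)  # (64, 96, 128)
```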
- the machine learning system is an artificial neural network, in particular a multilayer artificial neural network.
- the at least one parameter of the machine learning system is a weight of a node of an artificial neural network.
- the neural network may be, for example, a linear, a non-linear, a recurrent and / or a convolutional neural network.
- a second aspect of the invention relates to the use of at least one at least partially synthetically generated depth map in combination with image data for teaching and / or training a machine learning system, in particular for teaching an ML system, as described above and below.
- a third aspect of the invention relates to a method of operating a machine learning system for a motor vehicle, wherein the machine learning system is taught by a method as described above and below.
- the method of operating the ML system may, as it were, denote a method of recognizing objects using the ML system.
- A fourth aspect of the invention relates to a machine learning system, in particular for a motor vehicle, which is taught with a method as described above and below.
- Fig. 1 shows a machine learning system according to an embodiment of the invention.
- FIG. 2 is a flowchart illustrating steps of a method of teaching a machine learning system according to one embodiment of the present invention.
- FIG. 3 is a flowchart illustrating steps of a method of operating a machine learning system according to one embodiment of the present invention
- Fig. 1 shows a machine learning system 10 according to an embodiment of the invention.
- the ML system 10 may be an artificial intelligence system 10 of any type.
- the ML system 10 may include at least one neural network 12.
- The neural network 12 can be multi-layered and be a linear, non-linear, recurrent and/or convolutional neural network 12.
- the neural network 12 may include one or more convolutional layers.
- The ML system 10 of FIG. 1 is designed as a multi-path system 10, in which image data 14 can be fed into and processed by the system 10 as input variables or input data via a first path 11a.
- Via a second path 11b, depth maps 16a, 16b and/or distance data 16a, 16b can be fed into and processed by the system 10.
- the system 10 shown in FIG. 1 has, by way of example, three modules 12a, 12b, 12c.
- A first module 12a is set up to process, analyze and/or interpret the image data 14 and to determine and/or output a first interpretation 18a based on the image data 14.
- the image data 14 may be, for example, images of a camera, an RGB camera, a color camera, a grayscale camera and / or an infrared camera.
- the second module 12b is configured to process, analyze and / or interpret the depth maps 16a, 16b and to determine and / or output a second interpretation 18b based on the depth maps 16a, 16b.
- The depth maps 16b denote real depth maps 16b, which may represent, for example, data from a stereo camera, a multi-view camera, a distance measuring device, or a radar-based, ultrasound-based and/or laser-based distance measuring device.
- Alternatively, the depth maps 16b may come from any information source from which depth information can be extracted, such as a mono camera unit with a structure-from-motion algorithm.
- The depth maps 16a designate synthetically generated depth maps 16a, which are supplied to the system 10 for training purposes, as described in detail below. The depth maps 16a may therefore be generated artificially and may emulate data from a stereo camera, a multi-view camera, a distance measuring device, or a radar-based, ultrasound-based and/or laser-based distance measuring device.
- the two interpretations 18a, 18b are supplied by way of example in FIG. 1 to a third module 12c, which determines and / or outputs a final interpretation 18c based on the first and second interpretation 18a, 18b.
- The three modules 12a, 12b, 12c can each be separate and/or independent modules 12a, 12b, 12c. Alternatively, the modules 12a, 12b, or all three modules 12a, 12b, 12c, may be combined into a single module. In particular, the modules 12a-12c can each be designed as neural networks 12a-12c, and/or the modules 12a-12c can be designed as a common neural network 12. A minimal architecture sketch of this layout follows below.
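- As a hedged illustration of this two-path layout, the following PyTorch sketch loosely mirrors the modules 12a (image path), 12b (depth path) and 12c (fusion); all layer types and sizes are invented for illustration and are not specified by the patent:

```python
import torch
import torch.nn as nn

class MultiPathNet(nn.Module):
    """Two input paths plus a fusion head, loosely mirroring modules 12a-12c."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.image_path = nn.Sequential(            # cf. module 12a: RGB images
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(8), nn.Flatten(),
            nn.Linear(16 * 8 * 8, num_classes),
        )
        self.depth_path = nn.Sequential(            # cf. module 12b: depth maps
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(8), nn.Flatten(),
            nn.Linear(8 * 8 * 8, num_classes),
        )
        self.fusion = nn.Linear(2 * num_classes, num_classes)  # cf. module 12c

    def forward(self, image, depth):
        interp_a = self.image_path(image)   # first interpretation (cf. 18a)
        interp_b = self.depth_path(depth)   # second interpretation (cf. 18b)
        interp_c = self.fusion(torch.cat([interp_a, interp_b], dim=1))  # cf. 18c
        return interp_a, interp_b, interp_c

net = MultiPathNet()
a, b, c = net(torch.randn(4, 3, 96, 128), torch.randn(4, 1, 96, 128))
print(c.shape)  # torch.Size([4, 10])
```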
- For teaching the system 10, image data 14 are supplied via the first path 11a and/or fed in via a corresponding interface of the system 10.
- Likewise, synthetically generated depth maps 16a are supplied to the system 10 via the second path 11b and/or fed in via a corresponding interface.
- Optionally, real depth maps 16b can also be supplied to the system 10.
- The neural networks 12a, 12b each process the data supplied to them, i.e. the image data 14 and the synthetically generated depth maps 16a and real depth maps 16b.
- the image data 14 and / or the depth maps 16a, 16b can be labeled, ie have information regarding their content, such as objects contained in the image data 14 and / or depth maps 16a, 16b.
- In a forward propagation of the image data 14, the neural network 12a may determine and/or output the interpretation 18a, which may be, for example, a class of objects and/or probability values.
- Based on the labeling of the image data 14, a recognition error can also be determined for the interpretation 18a.
- Analogously, the neural network 12b may determine and/or output the interpretation 18b, which may likewise be a class of objects and/or probability values. For the interpretation 18b, too, a recognition error can be determined based on the labels of the depth maps 16a, 16b.
- The neural networks 12a, 12b can then be operated in backward propagation, wherein parameter values of parameters of the neural networks 12a, 12b, which may in particular denote weights of nodes of the neural networks 12a, 12b, are each adapted, changed and/or varied while minimizing the recognition errors.
- The interpretations 18a, 18b may further be supplied to the neural network 12c to determine and/or output a final interpretation 18c, for which again a recognition error can be determined.
- The neural network 12c may also be operated in backward propagation, and the parameter values of the parameters and/or the weights of the nodes of the neural network 12c may be adapted, changed and/or varied while minimizing the recognition error.
- the image data 14 and the depth maps 16a, 16b may be propagated together through the system 10 and the entire neural network 12 to obtain the interpretation 18c.
- The neural network 12 may then also be operated in backward propagation, and the weights of the nodes of the overall system 10 and/or of the entire neural network 12 can be adjusted, varied and/or changed while minimizing the recognition error. A compact sketch of such a training step follows below.
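- The following self-contained PyTorch sketch shows one such training step, with two toy paths standing in for the modules 12a and 12b. The loss layout (one recognition error per interpretation, minimized jointly by gradient descent) is one plausible reading of the text, and all shapes and hyperparameters are invented:

```python
import torch
import torch.nn as nn

# Two toy paths standing in for modules 12a and 12b (see the sketch above).
image_path = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
depth_path = nn.Sequential(nn.Flatten(), nn.Linear(1 * 32 * 32, 10))
params = list(image_path.parameters()) + list(depth_path.parameters())
optimizer = torch.optim.SGD(params, lr=1e-3)
criterion = nn.CrossEntropyLoss()

images = torch.randn(8, 3, 32, 32)    # stand-in for labeled image data 14
depths = torch.randn(8, 1, 32, 32)    # stand-in for synthetic depth maps 16a
labels = torch.randint(0, 10, (8,))   # labels attached to the training data

interp_a = image_path(images)                     # forward propagation, path 1
interp_b = depth_path(depths)                     # forward propagation, path 2
loss = criterion(interp_a, labels) + criterion(interp_b, labels)  # recognition errors
optimizer.zero_grad()
loss.backward()                                   # backward propagation
optimizer.step()                                  # adapt the node weights
```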
- In general, for teaching, the system 10 is supplied with image data 14 and with synthetically generated depth maps 16a, and the parameter values of the parameters of the system 10, in particular the weights of the nodes of the neural network 12, are adapted to teach the system 10 and/or the neural network 12. The parameter values and/or weights adapted in this way then influence the interpretation of, and/or the response of the system 10 to, any input data, such as images from a camera in a vehicle and sensor data from an ultrasound, radar or laser proximity sensor.
- During operation, the trained machine learning system 10 is supplied with image data 14, for example from a camera, and with real depth maps 16b containing distance information. The depth maps 16b can come, for example, from a stereo camera. Based on the data from both information sources, i.e. the image data 14 and the depth maps 16b, the overall system 10 analyzes the environment.
- In the case of optical illusions, confusion and/or an incorrect interpretation 18a can occur. If, for example, persons can be seen on a billboard, the image-based part and/or the first path 11a of the system 10 cannot distinguish between a real person and a person on the poster.
- Also, for items and objects that were not included in the training, the system 10 cannot decide what they are. It may therefore happen that an unknown gray box in the image data 14 is recognized as a gray floor, a bench or a door. In order to support such decision cases, it may be advantageous to use the depth maps 16b in the second path 11b of the system 10.
- The invention therefore provides synthetically generated depth maps 16a for training the system 10, whereby the training of the system 10, and in particular of the module 12b, is significantly expanded.
- The synthetically generated depth maps 16a may be in the same file format as the real depth maps 16b, such as disparity maps.
- The depth maps 16a, 16b may include, for example, a matrix and/or a list of entries, each entry representing a pixel of a device for capturing depth information and having a depth information value for indicating a distance between the device and an object. Such a depth map can be enriched and/or modified by different artificially generated objects at different positions.
- the depth maps 16a may be largely and / or fully synthetically generated.
- For this purpose, a plurality of depth information values, in particular at least 1% of the depth information values, can be defined and/or specified and stored in order to produce a synthetically generated depth map 16a.
- In particular, a contiguous pixel region in the synthetically generated depth maps 16a can be manipulated and/or defined, so that geometrically contiguous objects are generated in the synthetically generated depth maps 16a which can represent real objects in real depth maps 16b.
- The manipulated and/or defined depth information values can furthermore be selected such that they correspond to a distance to the respective object of between 5 cm and 500 m, in particular between 5 cm and 200 m. This allows the objects to be generated at safety-relevant distances, for example for a vehicle. If the depth map is stored as a disparity map, such distances translate into disparity values as sketched below.
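- For completeness, a small sketch of the standard pinhole-stereo relation that would map such fictitious distances to disparity values; the focal length and baseline are invented example values, not from the patent:

```python
# Standard pinhole-stereo relation: disparity = f * B / Z
FOCAL_PX = 700.0     # focal length in pixels (assumed)
BASELINE_M = 0.12    # stereo baseline in metres (assumed)

def distance_to_disparity(distance_m: float) -> float:
    """Disparity in pixels for an object at distance_m metres."""
    return FOCAL_PX * BASELINE_M / distance_m

# The safety-relevant range from the text: 5 cm up to 500 m.
for z in (0.05, 1.0, 10.0, 200.0, 500.0):
    print(f"{z:7.2f} m -> {distance_to_disparity(z):9.3f} px")
```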
- For example, in a depth map 16a a block may be created in the middle of a road, which is delineated from the road within the depth map 16a by different depth information values.
- Alternatively or additionally, the second path 11b with the module 12b may be extended or replaced by another module based on motion information.
- The synthetic generation of motion information, e.g. in the form of an optical flow, may likewise lead to an improved scope of training. This can be illustrated by an example in which an object that is unknown or ambiguous moves through the field of view of the system 10; a toy flow field of this kind is sketched below.
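- A hedged numpy sketch of such a synthetic optical-flow field: per-pixel (dx, dy) motion vectors in which a rectangular object moves to the right through an otherwise static scene; shapes and velocities are invented:

```python
import numpy as np

h, w = 96, 128
flow = np.zeros((h, w, 2), dtype=np.float32)  # static background: zero motion
flow[40:60, 30:60] = (3.0, 0.0)               # object region moves 3 px right per frame
print(float(flow[..., 0].max()))              # -> 3.0
```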
- image data 14 can also be generated at least partially synthetically and used to train the system 10.
- FIG. 2 shows a flowchart for illustrating steps of a method for teaching a machine learning system 10 according to one embodiment of the invention.
- In a step S1, a depth map 16a is generated at least partially synthetically. For this purpose, several entries of the depth map 16a are manipulated, fixed, changed and/or defined in step S1.
- In particular, a subset of entries of the depth map 16a can be manipulated and/or defined, which subset represents a contiguous pixel area, so that a geometrically contiguous object is generated in the depth map 16a.
- the depth map 16a may also be stored and / or deposited in a data storage device.
- In a step S2, the synthetically generated depth map 16a is fed into the ML system 10, for example via a suitable interface. Furthermore, image data 14 are fed to the system 10 in step S2.
- The image data 14 can originate from a camera and/or be stored on a data storage device.
- the image data 14 can be assigned to the synthetically generated depth map 16a.
- In a step S3, the image data 14 and the synthetically generated depth map 16a are processed, interpreted and/or evaluated by the system 10.
- Also in step S3, a first interpretation 18a based on the image data 14 and a second interpretation 18b based on the depth map 16a are determined, generated and/or output by the system 10.
- The interpretations 18a, 18b may each comprise a class of objects and/or probability values for objects and/or object classes.
- In a step S4, at least one parameter value of at least one parameter of the system 10 is adapted and/or changed, so that the system 10 is trained based on the processed image data 14 and the processed depth map 16a.
- For this purpose, the interpretations 18a, 18b are propagated in the backward direction through the system 10, wherein the parameter value of the at least one parameter can be adjusted while minimizing a recognition error.
- In particular, all parameter values of all parameters of the system 10 can be adapted in this way to teach the system 10.
- Optionally, the two interpretations 18a, 18b can also be processed into a final interpretation 18c of the system 10, which in turn can optionally be output. Alternatively or additionally, in step S4 the final interpretation 18c and a corresponding recognition error of this interpretation 18c can also be used for teaching the system 10 and/or for adjusting the parameter values.
- The object generated in the synthetic depth map 16a in step S1 may be contained only in the depth map 16a and not in the image data 14. Also, different objects may be present in the image data 14 and the depth map 16a, so that a discrepancy exists between the image data 14 and the depth map 16a. This in turn may cause the interpretations 18a, 18b to diverge. If the interpretations 18a, 18b deviate from one another, the parameter values of the parameters of the system 10 can also be adapted in step S4 such that the interpretation 18b based on the depth map 16a is preferred over the interpretation 18a based on the image data 14.
- In this case, the final interpretation 18c may preferably match the interpretation 18b, and the parameter values of the system 10 may be selected accordingly.
- The steps S1 to S4 can be run through several times in order to comprehensively teach the system 10, with different depth maps 16a containing mutually different objects being generated in step S1 each time and fed into the system 10.
- The objects in the depth maps 16a may differ from each other in terms of dimension, size, shape, geometry, position, distance and/or any other property. In this way, the system 10 can be trained on all kinds of objects and scenarios.
- FIG. 3 is a flowchart illustrating steps of a method of operating a machine learning system 10 according to one embodiment of the invention.
- The system 10 described with respect to FIG. 3 has the same elements and features as the system 10 of FIG. 1. Furthermore, the system 10 of FIG. 3 may be taught according to the method described with reference to FIG. 2.
- The system 10 can be used in particular for object recognition in a motor vehicle.
- In a step S1, image data 14, for example from a camera of the motor vehicle, are supplied to the system 10. Further, in step S1, the system 10 is supplied with a depth map 16b with distance information, such as from a stereo camera, an ultrasonic sensor or any other distance sensor.
- In a step S2, the image data 14 and the depth map 16b are processed, interpreted and/or analyzed by the system 10. Based on the image data 14, the system 10 may determine a first interpretation 18a of a scenario depicted in the image data 14. Further, the system 10 may determine a second interpretation 18b based on the depth map 16b.
- The two interpretations 18a, 18b are then further processed in a step S3 and optionally compared with one another. Based on the interpretations 18a, 18b, a final interpretation 18c of the scenario depicted in the image data 14 and the depth map 16b is determined and/or created in step S3. If the two interpretations 18a, 18b do not agree, then for safety reasons the interpretation 18b based on the depth map 16b may be preferred over the interpretation 18a based on the image data 14, as in the sketch below.
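- A minimal sketch of such a safety-oriented comparison rule, assuming each path outputs class probabilities; the rule itself (argmax agreement, depth wins on conflict) is one plausible reading of the text, not a prescribed implementation:

```python
import numpy as np

def fuse_interpretations(probs_image, probs_depth):
    """If the two paths disagree on the most likely class, follow the depth path."""
    cls_a = int(np.argmax(probs_image))   # interpretation 18a (image path)
    cls_b = int(np.argmax(probs_depth))   # interpretation 18b (depth path)
    return cls_a if cls_a == cls_b else cls_b

# Billboard scenario: the image path sees a road, the depth path a solid obstacle.
road_like = np.array([0.9, 0.1])   # [P(road), P(obstacle)] from the image path
obstacle = np.array([0.2, 0.8])    # the depth path reports a nearby solid object
print(fuse_interpretations(road_like, obstacle))  # -> 1: the obstacle wins
```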
- The final interpretation 18c may further be supplied to other components of the motor vehicle, such as a control unit. Based on the interpretation 18c, a reaction of the vehicle, such as a braking operation and/or an evasive maneuver, can then be determined, initiated and/or performed.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE102017221765.0A DE102017221765A1 (de) | 2017-12-04 | 2017-12-04 | Trainieren und Betreiben eines Maschinen-Lern-Systems |
PCT/EP2018/078177 WO2019110177A1 (de) | 2017-12-04 | 2018-10-16 | Trainieren und betreiben eines maschinen-lern-systems |
Publications (1)
Publication Number | Publication Date |
---|---|
EP3721370A1 true EP3721370A1 (de) | 2020-10-14 |
Family
ID=63896157
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP18789090.0A Pending EP3721370A1 (de) | 2017-12-04 | 2018-10-16 | Trainieren und betreiben eines maschinen-lern-systems |
Country Status (4)
Country | Link |
---|---|
US (1) | US11468687B2 (de) |
EP (1) | EP3721370A1 (de) |
DE (1) | DE102017221765A1 (de) |
WO (1) | WO2019110177A1 (de) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20200143960A (ko) * | 2019-06-17 | 2020-12-28 | 현대자동차주식회사 | 영상을 이용한 객체 인식 장치 및 그 방법 |
US11669593B2 (en) | 2021-03-17 | 2023-06-06 | Geotab Inc. | Systems and methods for training image processing models for vehicle data collection |
US11682218B2 (en) | 2021-03-17 | 2023-06-20 | Geotab Inc. | Methods for vehicle data collection by image analysis |
DE102021119906A1 (de) | 2021-07-30 | 2023-02-02 | Bayerische Motoren Werke Aktiengesellschaft | Verfahren zum Erzeugen von Trainingsdaten zum selbstüberwachten Trainieren eines neuronalen Netzes, Verfahren zum selbstüberwachten Trainieren eines neuronalen Netzes, Computerprogramm sowie Recheneinrichtung für ein Fahrzeug |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE102008001256A1 (de) | 2008-04-18 | 2009-10-22 | Robert Bosch Gmbh | Verkehrsobjekt-Erkennungssystem, Verfahren zum Erkennen eines Verkehrsobjekts und Verfahren zum Einrichten eines Verkehrsobjekt-Erkennungssystems |
EP3688666A1 (de) * | 2017-11-03 | 2020-08-05 | Siemens Aktiengesellschaft | Segmentierung und entrauschung von tiefenbildern für erkennungsanwendungen unter verwendung generativer kontradiktorischer neuronaler netzwerke |
US10740876B1 (en) * | 2018-01-23 | 2020-08-11 | Facebook Technologies, Llc | Systems and methods for generating defocus blur effects |
RU2698402C1 (ru) * | 2018-08-30 | 2019-08-26 | Самсунг Электроникс Ко., Лтд. | Способ обучения сверточной нейронной сети для восстановления изображения и система для формирования карты глубины изображения (варианты) |
2017
- 2017-12-04 DE DE102017221765.0A patent/DE102017221765A1/de active Pending
2018
- 2018-10-16 WO PCT/EP2018/078177 patent/WO2019110177A1/de unknown
- 2018-10-16 US US16/762,757 patent/US11468687B2/en active Active
- 2018-10-16 EP EP18789090.0A patent/EP3721370A1/de active Pending
Also Published As
Publication number | Publication date |
---|---|
WO2019110177A1 (de) | 2019-06-13 |
US11468687B2 (en) | 2022-10-11 |
DE102017221765A1 (de) | 2019-06-06 |
US20210182577A1 (en) | 2021-06-17 |
Legal Events
Date | Code | Title | Description
---|---|---|---
| STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: UNKNOWN
| STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE
| PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012
| STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE
2020-07-06 | 17P | Request for examination filed | Effective date: 20200706
| AK | Designated contracting states | Kind code of ref document: A1; Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR
| AX | Request for extension of the european patent | Extension state: BA ME
| DAV | Request for validation of the european patent (deleted) |
| DAX | Request for extension of the european patent (deleted) |
| STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: EXAMINATION IS IN PROGRESS
2022-09-30 | 17Q | First examination report despatched | Effective date: 20220930