US11978262B2 - Image classification and associated training for safety-relevant classification tasks - Google Patents

Image classification and associated training for safety-relevant classification tasks

Info

Publication number
US11978262B2
US11978262B2
Authority
US
United States
Prior art keywords
image
classifier
learning
information density
resolved
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active, expires
Application number
US17/357,071
Other versions
US20210406587A1 (en
Inventor
Udo Mayer
Current Assignee
Robert Bosch GmbH
Original Assignee
Robert Bosch GmbH
Priority date
Filing date
Publication date
Application filed by Robert Bosch GmbH filed Critical Robert Bosch GmbH
Publication of US20210406587A1 publication Critical patent/US20210406587A1/en
Assigned to ROBERT BOSCH GMBH reassignment ROBERT BOSCH GMBH ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: MAYER, UDO
Application granted granted Critical
Publication of US11978262B2 publication Critical patent/US11978262B2/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58 Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60Q ARRANGEMENT OF SIGNALLING OR LIGHTING DEVICES, THE MOUNTING OR SUPPORTING THEREOF OR CIRCUITS THEREFOR, FOR VEHICLES IN GENERAL
    • B60Q9/00 Arrangement or adaptation of signal devices not provided for in one of main groups B60Q1/00 - B60Q7/00, e.g. haptic signalling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/217 Validation; Performance evaluation; Active pattern learning techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/59 Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
    • G06V20/597 Recognising the driver's state or behaviour, e.g. attention or drowsiness

Definitions

  • the present invention relates to the automatic classification of image data with regard to its content, for example for the at least semi-automated driving of vehicles.
  • Such neural networks may, for example, be made up of multiple layers connected one after another, in which the dimensionality of the task is significantly reduced by using convolution kernels and by downsampling.
  • Such neural networks are also characterized in that the data are processed in a massively parallel manner.
  • Great Britain Patent No. GB 2 454 857 B describes an example of a method in which, using a self-learning neural network, a microscope image is classified according to which objects it contains.
  • a method for training a classifier for image data with the aid of learning image data and associated labels.
  • image data also includes in particular such films and image sequences.
  • the complete film, or the complete image sequence, contains an additional piece of information regarding the dynamic change in the image content. This dynamic quality is missing from the individual images. The importance of this dynamic quality is consistent with the fact that people perceive moving or flashing image content preferentially over static image content.
  • Each label includes an allocation to one or more classes of a predefined classification.
  • the classes may represent objects which are visible in the learning image data. In image data of traffic situations, these objects may in particular be pedestrians, other vehicles, road boundaries, traffic signs and other traffic-relevant objects.
  • space-resolved relevance maps are provided for some or ideally for all data sets of learning image data. These relevance maps indicate how relevant which spatial areas of the particular learning image data are for the assessment of the situation shown in these learning image data.
  • a data set of learning image data may be a static image or a frame from a film or from an image sequence.
  • a data set of learning image data may also be a film or an image sequence.
  • a dynamic change in a particular spatial area in the course of a film or an image sequence may already provide a reason for this spatial area to be classified as particularly relevant.
  • learning samples are ascertained from data sets of learning image data and associated relevance maps.
  • spatial areas that have a local relevance below a predefined threshold value may, for example, be blanked out, blurred or otherwise garbled in the learning samples.
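  • By way of illustration only, such an ascertainment of a learning sample could be sketched as follows (a minimal NumPy sketch; the array shapes, the function name and the blank-out-by-threshold strategy are assumptions for the example, not prescribed by the method):

```python
import numpy as np

def ascertain_sample(image: np.ndarray, relevance: np.ndarray,
                     threshold: float = 0.5) -> np.ndarray:
    """Blank out all spatial areas whose local relevance is below threshold.

    image:     H x W x C array of pixel values
    relevance: H x W array of local relevance values
    """
    mask = relevance >= threshold           # True where the area is relevant
    return image * mask[..., np.newaxis]    # irrelevant areas become zero

# Example: a 4x4 single-channel image where only the right half is relevant.
image = np.ones((4, 4, 1))
relevance = np.zeros((4, 4))
relevance[:, 2:] = 1.0
sample = ascertain_sample(image, relevance)
```

Blurring or another form of garbling could be substituted for the multiplication by zero without changing the principle.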
  • the learning samples are fed to the classifier. Thereafter, the classifier maps the learning samples to allocations to one or more classes. Parameters which characterize the behavior of the classifier are then optimized with the aim that these allocations are consistent with the labels of the learning image data from which the learning samples have originated.
  • the classifier contains an artificial neural network, ANN, or is an ANN
  • the trainable parameters may include, for example, weights with which inputs of neurons or other processing units are weighted and summed to form the activation of the particular neuron, or of the particular processing unit.
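  • The optimization of trainable parameters toward consistency with the labels could be sketched, purely by way of illustration, with a toy linear classifier in place of an ANN (all names, shapes and the gradient-descent details are assumptions; the method does not prescribe any particular optimizer):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins: 32 learning samples of 64 pixels each (already masked by
# their relevance maps) and binary labels derived from the left image half.
samples = rng.normal(size=(32, 64))
labels = (samples[:, :32].sum(axis=1) > 0).astype(float)

# Trainable parameters: one weight per input pixel plus a bias.
w = np.zeros(64)
b = 0.0
lr = 0.1

for _ in range(500):                        # optimize toward the labels
    logits = samples @ w + b
    probs = 1.0 / (1.0 + np.exp(-logits))   # predicted class allocation
    grad = probs - labels                   # gradient of cross-entropy loss
    w -= lr * samples.T @ grad / len(labels)
    b -= lr * grad.mean()

# After training, the allocations should be consistent with the labels.
accuracy = ((probs > 0.5) == labels).mean()
```

The abort criterion here is simply a fixed iteration count; any other criterion could be used.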
  • the space-resolved relevance map may come from an arbitrary source. By way of example, it may be supplied together with the labels as further additional information to the learning image data. However, the space-resolved relevance map may also be retrieved, for example, from another trained ANN. This means that this further ANN may be specifically trained on the question of which spatial areas of the image data require particular attention in which situation, regardless of what exactly is contained in these spatial areas of the image data.
  • the effect is particularly pronounced in the case of classifiers which process image data from traffic situations.
  • usually large solid angles of the surroundings of a vehicle are detected.
  • usually only the information from a small part of this solid angle is actually relevant for managing the driving task. If this were not the case, a person who may only look in one direction at a time would have no chance of mastering the driving task.
  • the tendency that a recognition by the classifier “zeroes in” on image areas away from the actual traffic event is suppressed.
  • the training is also more effective and faster since the classifier no longer has to learn to distinguish what is important from what is unimportant.
  • there may be a division of labor between the ANN, which identifies relevant image areas, and the classifier, which subsequently examines these image areas for the objects they contain. Overall, this is easier to train than a monolithic classifier which performs both tasks.
  • the present invention also relates to a method for classifying image data using a trained classifier.
  • image data may also include entire image sequences or films with dynamic additional information.
  • a space-resolved relevance map is provided which indicates how relevant which spatial areas of the image data are for the assessment of the situation shown in these image data. From the image data and the space-resolved relevance map, a sample is ascertained in which the information from the image data is locally more pronounced where the local relevance according to the relevance map is higher. This sample is fed to the classifier and is mapped by the classifier to an allocation to one or more classes of a predefined classification.
  • areas which have a local relevance below a predefined threshold value may in this case, for example, be blanked out, blurred or otherwise garbled.
  • the space-resolved relevance map for at least one data set of image data may be retrieved from a trained ANN.
  • the present invention also relates to a method for measuring a space-resolved relevance map for a concrete data set of image data (or learning image data).
  • This space-resolved relevance map may be used, for example, for training a classifier, for forming a sample to be processed by the classifier, or for training an ANN which ascertains space-resolved relevance maps for image data.
  • the image data are presented to at least one test subject.
  • the test subject is given the task of perceiving the content of the image data which from its point of view is relevant and, once it has done so, making an input. While the image data are being presented, it is observed to which spatial areas of the image data the test subject turns its attention.
  • It may optionally be checked whether the test subject not only believes it has identified the content of the image data, but also has actually identified it correctly. For example, not only may an input be requested to the effect that the test subject has perceived the relevant content of the image data, but the test subject may also be asked what exactly, in its opinion, it has identified. The answer to this question may be compared with a pre-known label as to which objects are actually contained in the image data.
  • the presentation may be ended (for instance the image or film may be blanked out), and multiple object names may be presented, from which the test subject must select those that are correct.
  • the test subject's input indicating that the relevant content has been perceived may then be rejected, for example, if, when asked, the correct object or at least the correct object class (for instance “dog” or “animal” as the class above “husky”) is not named.
  • the local relevance of those areas to which the test subject has turned its attention in the relevance map is increased. This may in particular take place, for example, also in interaction with many test subjects. From the consideration of the image data by each test subject, the information as to which spatial areas of the image data are on average perceived as relevant may be aggregated for example by a voting mechanism.
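  • Such a voting mechanism across many test subjects could be sketched as follows (an illustrative NumPy sketch; the function name, the boolean attention masks and the quorum value are assumptions):

```python
import numpy as np

def aggregate_relevance(attention_masks, quorum=0.5):
    """Aggregate per-subject attention into one relevance map by voting.

    attention_masks: list of H x W boolean arrays, one per test subject,
                     True where that subject turned its attention.
    quorum:          fraction of subjects whose attention an area needs in
                     order to be marked as relevant.
    """
    votes = np.mean([m.astype(float) for m in attention_masks], axis=0)
    return (votes >= quorum).astype(float)

# Three test subjects looked at a 2x2 image; only the top-left area drew
# the attention of a majority of them.
m1 = np.array([[True, True], [False, False]])
m2 = np.array([[True, False], [False, False]])
m3 = np.array([[True, False], [False, False]])
relevance = aggregate_relevance([m1, m2, m3])
```

Graded rather than binary maps would result from returning the vote fractions directly.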
  • the spatial areas of the image data to which the test subject turns its attention do not necessarily depend solely on the image data themselves, but may also be influenced by a task assigned to this test subject.
  • the driver may be busy with the driving task while the passenger is looking for a parking space, a mailbox or a certain business. Therefore, if, for example, a relevance map for the at least semi-automated driving of the vehicle is being measured, a driver rather than a passenger should be used as the test subject.
  • the head posture, the eye position and/or the eye movements of the test subject are recorded.
  • the areas to which the test subject turns its attention may then be evaluated based on the head posture, the eye position and/or the eye movements. This is an indicator that may hardly be consciously influenced by the test subject.
  • a driver of a vehicle controls his/her selection of what he/she considers important from the traffic situation, usually via head posture (for instance a shoulder check), eye position and/or eye movements. Other movements are restricted as a result of being strapped into the driver's seat.
  • different sub-areas of the image data may successively become visible to the test subject. Those sub-areas of the image data which are visible at the time of the input made by the test subject may then be deemed to be those areas to which the test subject turns its attention. This does not require any special hardware in order to identify exactly where the test subject is looking.
  • one and the same data set of image data may for example be presented to multiple test subjects.
  • different sequences of sub-areas of the image data which successively become visible may then be presented to these test subjects. This may include changing the order of the sub-areas which successively become visible, and/or presenting to some test subjects sub-areas which are not presented to other test subjects.
  • a representative conclusion is then drawn as to which sub-areas of the image data are relevant, for example, for the assessment of traffic situations.
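  • The scheme of successively revealed sub-areas in varied orders could be sketched as follows (an illustrative sketch; the sub-area names and the reveal counts are invented for the example):

```python
def measure_attended_areas(subarea_order, recognized_after):
    """Sub-areas deemed attended for one test subject: exactly those that
    were visible when the subject made the input indicating recognition.

    subarea_order:    sequence of sub-area identifiers, revealed one by one
                      (different subjects may see different orders).
    recognized_after: number of reveals after which the input was made.
    """
    return set(subarea_order[:recognized_after])

# Two test subjects see the same four sub-areas in different orders.
order_a = ["road", "traffic_sign", "billboard", "sky"]
order_b = ["billboard", "road", "traffic_sign", "sky"]
attended_a = measure_attended_areas(order_a, 2)   # input after two reveals
attended_b = measure_attended_areas(order_b, 3)   # input after three reveals

# Sub-areas attended across subjects are candidates for increased relevance.
common = attended_a & attended_b
```

No gaze-tracking hardware is required for this variant; only the reveal schedule and the time of the input are needed.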
  • the space-resolved relevance map and an ANN which generates such relevance maps may also be used to check whether a vehicle driver or machine operator is currently turning his/her attention to those things that are presently important in terms of safety.
  • the present invention also relates to a method for observing and/or controlling the attention of a vehicle driver or machine operator.
  • image data of the situation in which the driven vehicle or the operated machine is located are detected by at least one sensor.
  • a space-resolved relevance map is retrieved from a trained artificial neural network, ANN. This space-resolved relevance map indicates how relevant which areas of the image data are for the assessment of the situation shown in these image data.
  • a piece of information and/or a warning is output to the vehicle driver or machine operator.
  • the vehicle driver or machine operator may be informed at all times about which aspects of his/her present situation are presently particularly important from a safety-related point of view. If it should be found, when comparing the actual behavior of the vehicle driver or machine operator, that he/she is turning his/her attention to something other than the presently important aspects, he/she may be informed of this by way of a warning.
  • advertising in shops or at the curbside may attract a lot of attention.
  • the advertising is often designed in such a way that certain “hooks,” such as a favorable price, are placed in the foreground and may be read even from a passing vehicle.
  • the price is then marked with an asterisk indicating conditions, and any attempt to read these conditions written in small print may take a lot of attention away from the traffic situation.
  • an overlay of the situation may be presented to the vehicle driver or machine operator with an indication of at least one spatial area of the image data, the local relevance of which exceeds a predefined threshold value according to the relevance map.
  • the area of the situation that is presently particularly relevant may be highlighted for example in a head-up display on a windshield or in data glasses worn by the vehicle driver or machine operator, through the insertion of a border or a similar indication.
  • the head posture, the eye position and/or the eye movements of the vehicle driver or machine operator are recorded.
  • the head posture, the eye position and/or the eye movements are used to evaluate which part of the situation the vehicle driver or machine operator is predominantly observing.
  • this part of the situation is consistent with at least one spatial area of the image data, the local relevance of which exceeds a predefined threshold according to the relevance map.
  • a visual, acoustic and/or haptic warning device perceptible to the vehicle driver or machine operator is activated.
  • the methods may in particular be entirely or partially computer implemented. Therefore, the present invention also relates to a computer program containing machine-readable instructions which, when executed on one or multiple computer(s), upgrade the computer(s) to the device described above and/or prompt the computer(s) to carry out one of the methods described above.
  • control units for vehicles and embedded systems for technical devices which are also capable of executing machine-readable instructions are also to be regarded as computers.
  • a download product is a digital product which is transferrable via a data network, i.e., downloadable by a user of the data network, and which may be offered for immediate download in an online shop, for example.
  • a computer may be equipped with the computer program, with the machine-readable data medium or with the download product.
  • FIG. 1 shows an exemplary embodiment of method 100 for training a classifier 1 , in accordance with the present invention.
  • FIG. 2 shows an exemplary embodiment of method 200 for classifying image data 2 , in accordance with the present invention.
  • FIG. 3 shows an example of the generation of a sample 23 for classification based on image data 2 , in accordance with the present invention.
  • FIG. 4 shows an exemplary embodiment of method 300 for measuring a relevance map 12 , 22 , in accordance with the present invention.
  • FIG. 5 shows an exemplary embodiment of method 400 for observing and/or controlling the attention of a vehicle driver or machine operator 40 , in accordance with the present invention.
  • FIG. 1 is a schematic flowchart of an exemplary embodiment of method 100 for training a classifier 1 for image data 2 .
  • In step 110 , space-resolved relevance maps 12 are provided for learning image data 11 . According to block 111 , those relevance maps 12 may be retrieved, for example, from an appropriately trained ANN.
  • In step 120 , learning samples 13 are ascertained from learning image data 11 and associated relevance maps 12 ; in these learning samples, the information from learning image data 11 is locally more pronounced where the local relevance according to relevance map 12 is higher.
  • spatial areas whose local relevance is below a predefined threshold value may be blanked out, blurred or otherwise garbled.
  • In step 130 , learning samples 13 are fed to classifier 1 and are mapped to allocations to one or multiple classes 3 a through 3 c .
  • In step 140 , parameters 15 which characterize the behavior of classifier 1 are optimized with the aim that classes 3 a through 3 c delivered by classifier 1 are consistent with labels 14 of learning image data 11 from which learning samples 13 originated. This optimization may be continued until an arbitrary abort criterion is met.
  • the fully trained state of parameters 15 is denoted by reference numeral 15 *.
  • FIG. 2 is a schematic flowchart of an exemplary embodiment of method 200 for classifying image data 2 .
  • image data 2 may be recorded optionally by at least one sensor carried by a vehicle.
  • a space-resolved relevance map 22 is provided which, in particular according to block 211 , may be retrieved from an ANN, for example.
  • a sample 23 is ascertained in which the information from image data 2 is locally more pronounced where the local relevance according to relevance map 22 is higher.
  • sample 23 is fed to classifier 1 . According to block 231 , in particular, only sample 23 , which is significantly data-reduced compared to image data 2 , may be transferred by the sensor via a bus system of the vehicle, rather than image data 2 themselves.
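  • The data reduction ahead of the bus transfer could be sketched, for example, by cropping the sample to the bounding box of the relevant area (an illustrative NumPy sketch; the image size, the function name and the cropping strategy are assumptions):

```python
import numpy as np

def reduce_for_bus(image, relevance, threshold=0.5):
    """Crop the sample to the bounding box of the relevant area so that only
    a fraction of the raw sensor image has to cross the vehicle bus."""
    mask = relevance >= threshold
    rows = np.flatnonzero(mask.any(axis=1))   # rows containing relevance
    cols = np.flatnonzero(mask.any(axis=0))   # columns containing relevance
    return image[rows.min():rows.max() + 1, cols.min():cols.max() + 1]

image = np.ones((100, 200))            # raw sensor image (assumed size)
relevance = np.zeros((100, 200))
relevance[40:60, 80:120] = 1.0         # only a 20 x 40 region is relevant
sample = reduce_for_bus(image, relevance)
reduction = sample.size / image.size   # fraction of the data put on the bus
```

In this example only 4% of the raw image would have to be transferred.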
  • sample 23 is mapped by classifier 1 to the sought allocation to classes 3 a through 3 c of the predefined classification.
  • FIG. 3 shows an example of how image data 2 may be converted into a sample 23 .
  • Image data 2 , which here are in the form of a static image, show a traffic situation including a road 25 , an oncoming vehicle 26 and a traffic sign 27 .
  • a billboard 28 is visible on the left-hand curbside.
  • Relevance map 22 assesses road 25 , vehicle 26 and the right-hand curbside, where traffic signs such as sign 27 are located, as relevant. This area is therefore unchanged in sample 23 for the classification, while the details of billboard 28 are blanked out.
  • FIG. 4 is a schematic flowchart of an exemplary embodiment of method 300 for measuring a relevance map 12 , 22 for image data 2 , 11 .
  • image data 2 , 11 are presented to at least one test subject 4 . Meanwhile, it is observed to which spatial areas 2 a , 11 a of image data 2 , 11 test subject 4 turns its attention.
  • In response to input 41 made by test subject 4 , indicating that it has perceived the relevant content of image data 2 , 11 , the spatial areas 2 a , 11 a to which test subject 4 has previously turned its attention are detected in step 320 . In step 330 , the local relevance of these areas 2 a , 11 a in relevance map 12 , 22 is increased.
  • Box 310 shows, by way of example, two possible ways in which the turning of attention to areas 2 a , 11 a may be established. These possibilities may be used individually or also in combination.
  • head posture 42 a , eye position 42 b and/or eye movements 42 c of test subject 4 may be recorded.
  • areas 2 a , 11 a of image data 2 , 11 to which test subject 4 turns its attention may then be evaluated based on head posture 42 a , eye position 42 b and/or eye movements 42 c.
  • different sub-areas of image data 2 , 11 may be successively made visible to test subject 4 until test subject 4 recognizes image data 2 , 11 and makes input 41 .
  • one and the same data set of image data 2 , 11 may be presented to multiple test subjects 4 .
  • Different sequences of sub-areas of image data 2 , 11 which successively become visible may be presented to these test subjects 4 . (Block 313 b ).
  • those sub-areas of image data 2 , 11 which are visible at the time of input 41 made by test subject 4 may be deemed to be those areas 2 a , 11 a to which test subject 4 turns its attention.
  • In step 330 , the local relevance of ascertained areas 2 a , 11 a in relevance map 12 , 22 is increased, it being possible in particular to use, for example, a voting mechanism across many test subjects 4 .
  • FIG. 5 is a schematic flowchart of an exemplary embodiment of method 400 for observing and/or controlling the attention of a vehicle driver or machine operator 40 .
  • image data 2 of the situation in which the driven vehicle or the operated machine is located are detected by at least one sensor.
  • a space-resolved relevance map 22 is retrieved from a trained artificial neural network, ANN.
  • a piece of information and/or a warning 6 is output to the vehicle driver or machine operator 40 .
  • Box 430 shows two possible ways in which the piece of information and/or warning 6 may be generated. These possibilities may be used individually or also in combination.
  • an overlay of the situation may be presented to the vehicle driver or machine operator 40 with an indication of at least one image area, the local relevance of which exceeds a predefined threshold value according to the relevance map.
  • head posture 42 a , eye position 42 b and/or eye movements 42 c of vehicle driver or machine operator 40 may be recorded.
  • these may be used to evaluate which part 7 of the situation vehicle driver or machine operator 40 is predominantly observing.
  • it may be checked to what extent this part 7 of the situation is consistent with at least one spatial area of the image data, the local relevance of which exceeds a predefined threshold value according to relevance map 22 . If part 7 of the situation is not consistent with the aforementioned spatial area of the image data (logical value 0), a visual, acoustic and/or haptic warning device perceptible to vehicle driver or machine operator 40 may be activated.
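  • The check of whether the observed part of the situation is consistent with the safety-relevant area, and the resulting activation of the warning device, could be sketched as follows (an illustrative sketch; the overlap criterion and all names are assumptions):

```python
import numpy as np

def attention_warning(observed_area, relevance, threshold=0.5, min_overlap=0.2):
    """Return True if the warning device should be activated.

    observed_area: H x W boolean array of where the driver or operator is
                   predominantly looking (evaluated e.g. from head posture,
                   eye position and/or eye movements).
    relevance:     H x W relevance map retrieved from the trained ANN.
    """
    relevant = relevance >= threshold         # safety-relevant spatial area
    if not relevant.any():
        return False                          # nothing safety-critical present
    hit = (observed_area & relevant).sum() / relevant.sum()
    return bool(hit < min_overlap)            # attention misses the relevant area

# Right half of the scene is safety-relevant; compare two gaze directions.
relevance = np.zeros((4, 4))
relevance[:, 2:] = 1.0
looking_left = np.zeros((4, 4), dtype=bool)
looking_left[:, :2] = True
looking_right = np.zeros((4, 4), dtype=bool)
looking_right[:, 2:] = True
```

The returned flag could drive a visual, acoustic and/or haptic warning device.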


Abstract

A method for training a classifier for image data using learning image data and associated labels, each of the labels including an allocation to one or multiple classes of a predefined classification. In the method, for each data set of learning image data, space-resolved relevance maps are provided, which indicate how relevant which spatial areas of the particular learning image data are for the assessment of the situation shown in the learning image data. From data sets of learning image data and associated relevance maps, learning samples are ascertained; the learning samples are fed to the classifier; and classifier parameters are optimized with the aim that the classifier maps the learning samples to allocations to one or multiple classes which are consistent with the labels of the learning image data from which the learning samples originate.

Description

CROSS REFERENCE
The present application claims the benefit under 35 U.S.C. § 119 of German Patent Application No. DE 102020208008.9 filed on Jun. 29, 2020, which is expressly incorporated herein by reference in its entirety.
FIELD
The present invention relates to the automatic classification of image data with regard to its content, for example for the at least semi-automated driving of vehicles.
BACKGROUND INFORMATION
Around 90% of the information a human driver needs to drive a vehicle in traffic is visual information. For the at least semi-automated driving of vehicles, it is therefore indispensable to correctly evaluate the content of image data of any modality which are recorded when observing the vehicle surroundings. A classification of the image data according to which traffic-relevant objects are contained therein, such as for example other road users, road markings, obstacles and traffic signs, is of particular importance for the task of driving.
To deal with this complexity, artificial neural networks are used. Such neural networks may, for example, be made up of multiple layers connected one after another, in which the dimensionality of the task is significantly reduced by using convolution kernels and by downsampling. Such neural networks are also characterized in that the data are processed in a massively parallel manner. Great Britain Patent No. GB 2 454 857 B describes an example of a method in which, using a self-learning neural network, a microscope image is classified according to which objects it contains.
For the safety-related assessment of neural networks and other trainable classifiers, it is important to what extent their behavior is explainable and comprehensible.
SUMMARY
Within the scope of the present invention, a method is provided for training a classifier for image data with the aid of learning image data and associated labels. Besides static images and frames (individual images) from films or image sequences, the term "image data" also includes in particular such films and image sequences. The complete film, or the complete image sequence, contains an additional piece of information regarding the dynamic change in the image content. This dynamic quality is missing from the individual images. The importance of this dynamic quality is consistent with the fact that people perceive moving or flashing image content preferentially over static image content.
Each label includes an allocation to one or more classes of a predefined classification. By way of example, the classes may represent objects which are visible in the learning image data. In image data of traffic situations, these objects may in particular be pedestrians, other vehicles, road boundaries, traffic signs and other traffic-relevant objects.
In accordance with an example embodiment of the present invention, space-resolved relevance maps are provided for some or ideally for all data sets of learning image data. These relevance maps indicate how relevant which spatial areas of the particular learning image data are for the assessment of the situation shown in these learning image data. By way of example, a data set of learning image data may be a static image or a frame from a film or from an image sequence. However, a data set of learning image data may also be a film or an image sequence. By way of example, a dynamic change in a particular spatial area in the course of a film or an image sequence may already provide a reason for this spatial area to be classified as particularly relevant.
In the simplest case, the relevance may be indicated in a binary fashion (for instance 0=not relevant, 1=relevant), but also in arbitrary gradings which may express a relative prioritization of spatial areas in the image data relative to one another.
In accordance with an example embodiment of the present invention, learning samples are ascertained from data sets of learning image data and associated relevance maps. In these learning samples, the higher the local relevance according to the relevance map, the more pronounced the information from the learning image data. For this purpose, spatial areas that have a local relevance below a predefined threshold value may, for example, be blanked out, blurred or otherwise garbled in the learning samples.
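The formation of such learning samples can be illustrated with a minimal sketch (pure NumPy; the function name, the 0.5 threshold and the neutral gray fill value are illustrative assumptions, and blanking out stands in for the blurring or garbling alternatives also mentioned):

```python
import numpy as np

def make_learning_sample(image, relevance_map, threshold=0.5):
    """Blank out image areas whose local relevance falls below the threshold.

    image:         H x W x C float array in [0, 1]
    relevance_map: H x W float array in [0, 1] (0 = not relevant, 1 = relevant)
    """
    low = relevance_map < threshold   # boolean mask of low-relevance pixels
    sample = image.copy()             # leave the original learning image intact
    sample[low] = 0.5                 # replace with neutral gray; blurring or
                                      # other garbling would serve the same purpose
    return sample
```

The relevant areas pass through unchanged, so the information from the learning image data remains locally more pronounced exactly where the relevance map assigns high relevance.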
The learning samples are fed to the classifier. Thereafter, the classifier maps the learning samples to allocations to one or more classes. Parameters which characterize the behavior of the classifier are then optimized with the aim that these allocations are consistent with the labels of the learning image data from which the learning samples have originated.
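The optimization of the characterizing parameters can be sketched with a deliberately small stand-in classifier: a linear softmax model whose weights are adjusted by gradient descent until its class allocations match the labels. This is a toy illustration of the training objective, not the patented classifier itself; the function name and hyperparameters are assumptions:

```python
import numpy as np

def train_classifier(samples, labels, n_classes, lr=0.5, epochs=200):
    """Fit a linear softmax classifier on flattened learning samples so that
    its class allocations become consistent with the labels (cross-entropy)."""
    X = np.stack([np.asarray(s, dtype=float).ravel() for s in samples])  # N x D
    Y = np.eye(n_classes)[labels]            # one-hot labels
    W = np.zeros((X.shape[1], n_classes))    # the trainable parameters
    for _ in range(epochs):
        logits = X @ W
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)    # softmax class allocations
        W -= lr * X.T @ (p - Y) / len(X)     # gradient step toward consistency
                                             # between allocations and labels
    return W
```

In the method of the present invention, the samples fed in here would be the relevance-masked learning samples rather than the raw learning images.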
If, for example, the classifier contains an artificial neural network, ANN, or is an ANN, the trainable parameters may include, for example, weights with which the inputs of neurons or other processing units are combined in a weighted manner to form the activation of the particular neuron, or of the particular processing unit.
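A single processing unit with such weighted inputs may be sketched as follows (the ReLU nonlinearity is an assumption chosen for concreteness; any other activation function could take its place):

```python
import numpy as np

def neuron_activation(inputs, weights, bias=0.0):
    """The trainable parameters (weights, bias) combine the inputs of a
    processing unit in a weighted sum; a ReLU nonlinearity then yields
    the unit's activation."""
    return max(0.0, float(np.dot(inputs, weights) + bias))
```

Training adjusts `weights` (and `bias`) for every such unit so that the network's overall class allocations match the labels.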
The space-resolved relevance map may come from an arbitrary source. By way of example, it may be supplied together with the labels as further additional information to the learning image data. However, the space-resolved relevance map may also be retrieved, for example, from another trained ANN. This means that this further ANN may be specifically trained on the question of which spatial areas of the image data require particular attention in which situation, regardless of what exactly is contained in these spatial areas of the image data.
It has been found that taking the relevance map into account during training makes the behavior of the classifier much more comprehensible and explainable. Analyses of the “heat maps” of those pixels which were significant for an allocation to particular classes (for instance types of objects) have in the past shown that the decision for particular classes was often made on the basis of image pixels which did not even belong to the objects in question. This behavior is at least partially suppressed by taking the relevance map into account.
The effect is particularly pronounced in the case of classifiers which process image data from traffic situations. In this case, usually large solid angles of the surroundings of a vehicle are detected. In any situation, however, usually only the information from a small part of this solid angle is actually relevant for managing the driving task. If this were not the case, a person who may only look in one direction at a time would have no chance of mastering the driving task. Taking the relevance map into account suppresses the tendency of the classifier to “zero in” on image areas away from the actual traffic event.
At the same time, the training is also more effective and faster since the classifier no longer has to learn to distinguish what is important from what is unimportant. By way of example, there may be a division of labor between the ANN, which identifies relevant image areas, and the classifier, which subsequently examines these image areas for the objects they contain. Overall, this is easier to train than a monolithic classifier which performs both tasks.
The present invention also relates to a method for classifying image data using a trained classifier. As explained above, besides static images, these image data may also include entire image sequences or films with dynamic additional information.
Within the scope of this method, in accordance with an example embodiment of the present invention, a space-resolved relevance map is provided, which indicates how relevant which spatial areas of the image data are for the assessment of the situation shown in these image data. From the image data and the space-resolved relevance map, a sample is ascertained in which, the higher the local relevance according to the relevance map, the locally more pronounced the information from the image data. This sample is fed to the classifier and is mapped by the classifier to an allocation to one or more classes of a predefined classification.
In a manner analogous to training, areas which have a local relevance below a predefined threshold value may in this case, for example, be blanked out, blurred or otherwise garbled.
Taking account only of image areas previously identified as relevant suppresses the aforementioned tendency to base the classification of image data on objects which are not even part of the currently relevant traffic situation. At the same time, however, it also enables a compression of the data to be transferred within the vehicle. If, for example, multiple cameras are installed in and at the vehicle, then the significantly information-reduced sample may be created close to the particular camera. This sample may then be forwarded over a bus system, which supplies the entire vehicle, to a central classifier, which evaluates the traffic situation as a whole. Less bandwidth is then required for this transfer.
Most of today's vehicles are equipped with a CAN bus or other bus system, to which many other vehicle systems are connected. Such a bus system enables all connected participants to communicate with one another. Compared to the previous harness of dedicated cables between each two participants communicating with one another, this saves considerable cabling effort. However, the price for this is that the connected participants have to share the bandwidth of the bus system. Generally, only one participant at a time is able to send. If the entire vehicle surroundings are observed by a plurality of sensors (such as high-resolution cameras), large amounts of data are generated which may no longer be able to be transferred in full over the bus system. Even a “high-speed” CAN bus only has a maximum bandwidth of 1 Mbit/s, which is already too little for a full-HD video data stream. However, by significantly reducing the amount of information by forming the sample prior to the transfer over the bus system and thus compressing the data in a lossy manner, the bandwidth is sufficient even for transporting the data obtained from multiple cameras and reduced in the same way.
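The bandwidth mismatch mentioned above can be checked with a few lines of arithmetic (the 30 fps frame rate and 24-bit color depth of the raw full-HD stream are assumed values for illustration):

```python
# Raw full-HD video stream vs. "high-speed" CAN bandwidth.
width, height, bits_per_pixel, fps = 1920, 1080, 24, 30

raw_bits_per_s = width * height * bits_per_pixel * fps   # ~1.49 Gbit/s raw
can_bits_per_s = 1_000_000                               # 1 Mbit/s high-speed CAN

print(raw_bits_per_s / can_bits_per_s)  # prints 1492.992 — the raw stream
                                        # exceeds CAN capacity ~1500-fold
```

Even aggressively compressed full-HD video would still need several Mbit/s, which is why only the information-reduced sample, not the raw camera data, is sent over the bus.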
Also when classifying image data using the classifier, in a manner analogous to training the classifier, the space-resolved relevance map for at least one data set of image data may be retrieved from a trained ANN.
The present invention also relates to a method for measuring a space-resolved relevance map for a concrete data set of image data (or learning image data). This space-resolved relevance map may be used, for example, for training a classifier, for forming a sample to be processed by the classifier, or for training an ANN which ascertains space-resolved relevance maps for image data.
In accordance with an example embodiment of the present invention, in this method, the image data are presented to at least one test subject. The test subject is given the task of perceiving the content of the image data which from its point of view is relevant and, once it has done so, making an input. While the image data are being presented, it is observed to which spatial areas of the image data the test subject turns its attention.
In response to the input made by the test subject indicating that it has perceived the relevant content of the image data, spatial areas of these image data to which the test subject has previously turned its attention are recorded. It is thus ascertained which spatial areas of the image data were the basis for the test subject deciding that it has identified the content of the image data.
It may optionally be checked whether the test subject not only believes it has identified the content of the image data, but also has actually identified it correctly. For example, not only may an input be requested to the effect that the test subject has perceived the relevant content of the image data, but it may also be asked what exactly the test subject in its opinion has identified. The answer to this question may be compared with a pre-known label as to which objects are actually contained in the image data. By way of example, when the test subject inputs that it has perceived the relevant content of the image data, the presentation may be ended (for instance the image or film may be blanked out), and multiple object names may be presented, from which the test subject must select those that are correct. The test subject's input indicating that the relevant content has been perceived may then be rejected, for example, if, when asked, the correct object or at least the correct object class (for instance “dog” or “animal” as the class above “husky”) is not named.
The local relevance of those areas to which the test subject has turned its attention in the relevance map is increased. This may in particular take place, for example, also in interaction with many test subjects. From the consideration of the image data by each test subject, the information as to which spatial areas of the image data are on average perceived as relevant may be aggregated for example by a voting mechanism.
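A possible voting mechanism of the kind mentioned, aggregating binary attention masks from many test subjects into one relevance map, might look as follows (the majority threshold of 0.5 and the use of the mean vote as the relevance grading are illustrative assumptions):

```python
import numpy as np

def aggregate_relevance(attention_masks, vote_threshold=0.5):
    """Voting mechanism: each test subject contributes a binary H x W mask of
    the areas they attended to. Areas attended to by at least the threshold
    fraction of subjects receive the mean vote as relevance; others stay 0."""
    votes = np.mean(np.stack(attention_masks).astype(float), axis=0)
    return np.where(votes >= vote_threshold, votes, 0.0)
```

Graded (non-binary) relevance falls out naturally: an area attended to by all subjects ends up more relevant than one attended to by a bare majority.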
The spatial areas of the image data to which the test subject turns its attention do not necessarily depend solely on the image data themselves, but may also be influenced by a task assigned to this test subject. In a vehicle, for example, the driver may be busy with the driving task while the passenger is looking for a parking space, a mailbox or a certain business. Therefore, if, for example, a relevance map for the at least semi-automated driving of the vehicle is being measured, a driver rather than a passenger should be used as the test subject.
In one particularly advantageous embodiment of the present invention, the head posture, the eye position and/or the eye movements of the test subject are recorded. The areas to which the test subject turns its attention may then be evaluated based on the head posture, the eye position and/or the eye movements. These are indicators that can hardly be consciously influenced by the test subject. At the same time, a driver of a vehicle controls his/her selection of what he/she considers important from the traffic situation, usually via head posture (for instance a shoulder check), eye position and/or eye movements. Other movements are restricted as a result of being strapped into the driver's seat.
As an alternative or also in combination with this, in another advantageous embodiment of the present invention, different sub-areas of the image data may successively become visible to the test subject. Those sub-areas of the image data which are visible at the time of the input made by the test subject may then be deemed to be those areas to which the test subject turns its attention. This does not require any special hardware in order to identify exactly where the test subject is looking.
In particular, one and the same data set of image data may for example be presented to multiple test subjects. By way of example, different sequences of sub-areas of the image data which successively become visible may then be presented to these test subjects. This may include changing the order of the sub-areas which successively become visible, and/or presenting to some test subjects sub-areas which are not presented to other test subjects. On average across these test subjects, a representative conclusion is then drawn as to which sub-areas of the image data are relevant, for example, for the assessment of traffic situations.
The space-resolved relevance map and an ANN which generates such relevance maps may also be used to check whether a vehicle driver or machine operator is currently turning his/her attention to those things that are presently important in terms of safety.
Therefore, the present invention also relates to a method for observing and/or controlling the attention of a vehicle driver or machine operator. In accordance with an example embodiment of the present invention, in this method, image data of the situation in which the driven vehicle or the operated machine is located are detected by at least one sensor. For these image data, a space-resolved relevance map is retrieved from a trained artificial neural network, ANN. This space-resolved relevance map indicates how relevant which areas of the image data are for the assessment of the situation shown in these image data.
Based on this relevance map, a piece of information and/or a warning is output to the vehicle driver or machine operator. By way of example, regardless of his/her present actual behavior, the vehicle driver or machine operator may be informed at all times about which aspects of his/her present situation are presently particularly important from a safety-related point of view. If it should be found, when comparing the actual behavior of the vehicle driver or machine operator, that he/she is turning his/her attention to something other than the presently important aspects, he/she may be informed of this by way of a warning.
Behind all this is the consideration that, for a human driver, one of the greatest challenges when learning to drive is that of separating what is important from what is unimportant in the flood of information of the traffic situation. Again and again, there are situations in which the student driver devotes his/her attention entirely to one aspect and the driving instructor has to point out that something else is actually more important.
In addition, in inner cities for example, advertising in shops or at the curbside may attract a lot of attention. The advertising is often designed in such a way that certain “hooks,” such as a favorable price, are placed in the foreground and may be read even from a passing vehicle. However, the price is then marked with an asterisk indicating conditions, and any attempt to read these conditions written in small print may take a lot of attention away from the traffic situation.
By way of example, an overlay of the situation may be presented to the vehicle driver or machine operator with an indication of at least one spatial area of the image data, the local relevance of which exceeds a predefined threshold value according to the relevance map. For this purpose, the area of the situation that is presently particularly relevant may be highlighted for example in a head-up display on a windshield or in data glasses worn by the vehicle driver or machine operator, through the insertion of a border or a similar indication.
In another particularly advantageous embodiment of the present invention, the head posture, the eye position and/or the eye movements of the vehicle driver or machine operator are recorded. The head posture, the eye position and/or the eye movements are used to evaluate which part of the situation the vehicle driver or machine operator is predominantly observing.
It is checked, to what extent this part of the situation is consistent with at least one spatial area of the image data, the local relevance of which exceeds a predefined threshold according to the relevance map. In response to establishing that the part of the situation predominantly being observed is not consistent with the spatial area of the image data identified as particularly relevant, a visual, acoustic and/or haptic warning device perceptible to the vehicle driver or machine operator is activated.
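The described consistency check between the observed gaze and the relevant image areas may be sketched as follows (representing the gaze as a single pixel coordinate, the tolerance radius and the relevance threshold are illustrative assumptions not specified in the method):

```python
import numpy as np

def attention_warning(gaze_point, relevance_map, threshold=0.5, radius=2):
    """Check whether the observed gaze point falls within (or near) a spatial
    area whose local relevance exceeds the threshold; if not, signal that the
    visual, acoustic and/or haptic warning device should be activated.

    The radius tolerates small gaze-estimation jitter around the gaze point.
    """
    y, x = gaze_point
    h, w = relevance_map.shape
    y0, y1 = max(0, y - radius), min(h, y + radius + 1)
    x0, x1 = max(0, x - radius), min(w, x + radius + 1)
    consistent = bool((relevance_map[y0:y1, x0:x1] > threshold).any())
    return not consistent   # True -> activate the warning device
```

In a deployed system, `gaze_point` would be derived from the recorded head posture, eye position and/or eye movements, and `relevance_map` retrieved from the trained ANN.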
Besides the described example of advertising, there are many other situations in which the unfamiliar and unexpected suddenly attracts attention. By way of example, a banknote which is lying on the floor because a colleague has lost it may suddenly come into view when operating a punching machine. The operator then focuses first on this banknote instead of observing the working area of the machine and in particular making sure that both hands are outside the danger zone. This may be identified through comparison with the relevance map, in which specifically this danger zone is rated as particularly relevant.
The methods may in particular be entirely or partially computer implemented. Therefore, the present invention also relates to a computer program containing machine-readable instructions which, when executed on one or multiple computer(s), upgrade the computer(s) to the device described above and/or prompt the computer(s) to carry out one of the methods described above. In this sense, control units for vehicles and embedded systems for technical devices which are also capable of executing machine-readable instructions are also to be regarded as computers.
In addition, the present invention also relates to a machine-readable data medium and/or a download product containing the computer program. A download product is a digital product which is transferrable via a data network, i.e. downloaded by a user of the data network, which may be offered for immediate download in an online shop, for example.
Furthermore, a computer may be equipped with the computer program, with the machine-readable data medium or with the download product.
Further measures which improve the present invention will be presented in greater detail below together with the description of the preferred exemplary embodiments of the present invention and with reference to figures.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 shows an exemplary embodiment of method 100 for training a classifier 1, in accordance with the present invention.
FIG. 2 shows an exemplary embodiment of method 200 for classifying image data 2, in accordance with the present invention.
FIG. 3 shows an example of the generation of a sample 23 for classification based on image data 2, in accordance with the present invention.
FIG. 4 shows an exemplary embodiment of method 300 for measuring a relevance map 12, 22, in accordance with the present invention.
FIG. 5 shows an exemplary embodiment of method 400 for observing and/or controlling the attention of a vehicle driver or machine operator 40, in accordance with the present invention.
DETAILED DESCRIPTION OF EXAMPLE EMBODIMENTS
FIG. 1 is a schematic flowchart of an exemplary embodiment of method 100 for training a classifier 1 for image data 2. In step 110, space-resolved relevance maps 12 are provided for learning image data 11; according to block 111, these relevance maps 12 may be retrieved, for example, from an appropriately trained ANN. In step 120, learning samples 13 are ascertained from learning image data 11 and associated relevance maps 12, in which, the higher the local relevance according to relevance map 12, the locally more pronounced the information from learning image data 11. In this case, in particular according to block 121, spatial areas whose local relevance is below a predefined threshold value may be blanked out, blurred or otherwise garbled.
In step 130, learning samples 13 are fed to classifier 1 and are mapped to allocations to one or multiple classes 3 a through 3 c. In step 140, parameters 15 which characterize the behavior of classifier 1 are optimized with the aim that classes 3 a through 3 c delivered by classifier 1 are consistent with labels 14 of learning image data 11 from which learning samples 13 originated. This optimization may be continued until an arbitrary abort criterion is met. The fully trained state of parameters 15 is denoted by reference numeral 15*.
FIG. 2 is a schematic flowchart of an exemplary embodiment of method 200 for classifying image data 2. In step 205, image data 2 may optionally be recorded by at least one sensor carried by a vehicle. In step 210, a space-resolved relevance map 22 is provided for image data 2, which, in particular according to block 211, may be retrieved, for example, from an ANN. In step 220, in a manner analogous to step 120, a sample 23 is ascertained, in which, the higher the local relevance according to relevance map 22, the locally more pronounced the information from image data 2.
In step 230, sample 23 is fed to classifier 1; according to block 231, for example, only sample 23, which is significantly data-reduced compared to image data 2, may be transferred by the sensor via a bus system of the vehicle, but not image data 2 themselves. In step 240, sample 23 is mapped by classifier 1 to the sought allocation to classes 3 a through 3 c of the predefined classification.
FIG. 3 shows an example of how image data 2 may be converted into a sample 23. Image data 2, which here are in the form of a static image, show a traffic situation including a road 25, an oncoming vehicle 26 and a traffic sign 27. In addition, a billboard 28 is visible on the left-hand curbside. Relevance map 22 assesses road 25, vehicle 26 and the right-hand curbside, where traffic signs such as sign 27 are located, as relevant. This area is therefore unchanged in sample 23 for the classification, while the details of billboard 28 are blanked out.
FIG. 4 is a schematic flowchart of an exemplary embodiment of method 300 for measuring a relevance map 12, 22 for image data 2, 11. In step 310, image data 2, 11 are presented to at least one test subject 4. Meanwhile, it is observed to which spatial areas 2 a, 11 a of image data 2, 11 test subject 4 turns its attention.
In response to input 41 made by test subject 4 indicating that it has perceived the relevant content of image data 2, 11, spatial areas 2 a, 11 a to which test subject 4 has previously turned its attention are detected in step 320. In step 330, the local relevance of these areas 2 a, 11 a in relevance map 12, 22 is increased.
Box 310 shows, by way of example, two possible ways in which the turning of attention to areas 2 a, 11 a may be established. These possibilities may be used individually or also in combination.
According to block 311, head posture 42 a, eye position 42 b and/or eye movements 42 c of test subject 4 may be recorded. According to block 312, areas 2 a, 11 a of image data 2, 11 to which test subject 4 turns its attention may then be evaluated based on head posture 42 a, eye position 42 b and/or eye movements 42 c.
According to block 313, different sub-areas of image data 2, 11 may be successively made visible to test subject 4 until test subject 4 recognizes image data 2, 11 and makes input 41. In particular, according to block 313 a, one and the same data set of image data 2, 11 may be presented to multiple test subjects 4. Different sequences of sub-areas of image data 2, 11 which successively become visible may then be presented to these test subjects 4 (block 313 b).
According to block 314, those sub-areas of image data 2, 11 which are visible at the time of input 41 made by test subject 4 may be deemed to be those areas 2 a, 11 a to which test subject 4 turns its attention.
In step 330, the local relevance of ascertained areas 2 a, 11 a in relevance map 12, 22 is increased, it being possible in particular to use, for example, a voting mechanism across many test subjects 4.
FIG. 5 is a schematic flowchart of an exemplary embodiment of method 400 for observing and/or controlling the attention of a vehicle driver or machine operator 40. In step 410, image data 2 of the situation, in which the driven vehicle or the operated machine is located, are detected by at least one sensor. In step 420, for these image data 2, a space-resolved relevance map 22 is retrieved from a trained artificial neural network, ANN. In step 430, based on this relevance map 22, a piece of information and/or a warning 6 is output to the vehicle driver or machine operator 40.
Box 430 shows two possible ways in which the piece of information and/or warning 6 may be generated. These possibilities may be used individually or also in combination.
According to block 431, an overlay of the situation may be presented to the vehicle driver or machine operator 40 with an indication of at least one image area, the local relevance of which exceeds a predefined threshold value according to the relevance map.
According to block 432, head posture 42 a, eye position 42 b and/or eye movements 42 c of vehicle driver or machine operator 40 may be recorded. In block 433, these may be used to evaluate which part 7 of the situation vehicle driver or machine operator 40 is predominantly observing. In block 434, it may be checked to what extent this part 7 of the situation is consistent with at least one spatial area of the image data, the local relevance of which exceeds a predefined threshold value according to relevance map 22. If part 7 of the situation is not consistent with the aforementioned spatial area of the image data (logical value 0), a visual, acoustic and/or haptic warning device perceptible to vehicle driver or machine operator 40 may be activated.

Claims (8)

What is claimed is:
1. A method for training an image classifier, the method comprising the following steps:
for each of a plurality of learning images:
obtaining a respective space-resolved relevance map, wherein an information density of image information included over an entirety of the respective obtained learning image is a single first information density, and the obtained respective space-resolved relevance map indicates different respective relevancies of different spatial areas of the respective image data;
based on the obtained respective space-resolved relevance maps, generating a respective modified image by modifying the obtained respective image, the modification being performed by, for each of one or more areas of the respective image, reducing the information density of the respective area to be at a respective other information density that is lower than the first information density, so that different areas of the respective modified image have different information densities than one another, the respective modified image being a respective classifier input sample;
feeding an entirety of the respective classifier input sample to the classifier;
executing the classifier by which the classifier processes the respective classifier input sample as a whole, and not the respective learning image, to identify one or more of predefined classes of objects contained in the respective learning image; and
performing a comparison to determine how consistent the one or more identified classes which the classifier has identified for the respective classifier input sample are with one or more of the predefined classes with which the respective learning image has been labeled; and
based on the consistency determinations, optimizing one or more parameters of the classifier to increase a consistency at which the classifier identifies the classes with the classes with which the learning images are labeled.
2. The method as recited in claim 1, wherein the reducing of the information density of the respective area is performed by setting all pixels of the respective area:
to be at a same pixel value so that the area becomes a blanked out region of the image; or
to be a blurred or garbled region of the image.
3. A method for classifying an image using a trained classifier, the method comprising the following steps:
obtaining the image including image information, wherein an information density of the image information over an entirety of the image is a single first information density;
obtaining a space-resolved relevance map that indicates different respective relevancies of different spatial areas of the image;
based on the obtained space-resolved relevance map, generating a modified image by modifying the obtained image, the modification being performed by, for each of one or more areas of the image, reducing the information density of the respective area to be at a respective other information density that is lower than the first information density, so that different areas of the modified image have different information densities than one another, the modified image being a classifier input sample;
feeding an entirety of the classifier input sample to the classifier; and
executing the classifier by which the classifier processes the sample as a whole, and not the obtained image, to identify one or more predefined classes of objects contained in the obtained image.
4. The method as recited in claim 3, wherein the image is obtained by a recordation by at least one sensor carried by a vehicle, and the sample, but not the recorded image, is transferred to the classifier via a bus system of the vehicle, which is also used by other on-board systems of the vehicle.
5. The method as recited in claim 1, wherein the respective space-resolved relevance map is retrieved from a trained artificial neural network (ANN).
6. The method as recited in claim 3, wherein the obtainment of the space-resolved relevance map is from a trained artificial neural network (ANN).
7. A non-transitory machine-readable data medium on which is stored a computer program that is executable by one or more computers and that, when executed by the one or more computers, causes the one or more computers to perform a method for training an image classifier, the method including the following steps:
for each of a plurality of learning images:
obtaining a respective space-resolved relevance map, wherein an information density of image information included over an entirety of the respective obtained learning image is a single first information density, and the obtained respective space-resolved relevance map indicates different respective relevancies of different spatial areas of the respective image;
based on the obtained respective space-resolved relevance maps, generating a respective modified image by modifying the obtained respective image, the modification being performed by, for each of one or more areas of the respective image, reducing information density of the respective area to be at a respective other information density that is lower than the first information density, so that different areas of the respective modified image have different information densities than one another, the respective modified image being a respective classifier input sample;
feeding an entirety of the respective classifier input sample to the classifier;
executing the classifier by which the classifier processes the respective classifier input sample as a whole, and not the respective learning image, to identify one or more of predefined classes of objects contained in the respective learning image; and
performing a comparison to determine how consistent the one or more identified classes which the classifier has identified for the respective classifier input sample are with one or more of the predefined classes with which the respective learning image has been labeled; and
based on the consistency determinations, optimizing one or more parameters of the classifier to increase a consistency at which the classifier identifies the classes with the classes with which the learning images are labeled.
8. A computer configured to train an image classifier, the computer being configured to perform a method, the method comprising:
for each of a plurality of learning images:
obtaining a respective space-resolved relevance map, wherein an information density of image information included over an entirety of the respective obtained learning image is a single first information density, and the obtained respective space-resolved relevance map indicates different respective relevancies of different spatial areas of the respective image;
based on the obtained respective space-resolved relevance maps, generating a respective modified image by modifying the obtained respective image, the modification being performed by, for each of one or more areas of the respective image, reducing the information density of the respective area to be at a respective other information density that is lower than the first information density, so that different areas of the respective modified image have different information densities than one another, the respective modified image being a respective classifier input sample;
feeding an entirety of the respective classifier input sample to the classifier;
executing the classifier by which the classifier processes the respective classifier input sample as a whole, and not the respective learning image, to identify one or more of predefined classes of objects contained in the respective learning image; and
performing a comparison to determine how consistent the one or more identified classes which the classifier has identified for the respective classifier input sample are with one or more of the predefined classes with which the respective learning image has been labeled; and
based on the consistency determinations, optimizing one or more parameters of the classifier to increase a consistency at which the classifier identifies the classes with the classes with which the learning images are labeled.
US17/357,071 2020-06-29 2021-06-24 Image classification and associated training for safety-relevant classification tasks Active 2042-05-24 US11978262B2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
DE102020208008.9A DE102020208008A1 (en) 2020-06-29 2020-06-29 Image classification and related training for security-related classification tasks
DE102020208008.9 2020-06-29

Publications (2)

Publication Number Publication Date
US20210406587A1 (en) 2021-12-30
US11978262B2 (en) 2024-05-07

Family

ID=78827053

Family Applications (1)

Application Number Title Priority Date Filing Date
US17/357,071 Active 2042-05-24 US11978262B2 (en) 2020-06-29 2021-06-24 Image classification and associated training for safety-relevant classification tasks

Country Status (3)

Country Link
US (1) US11978262B2 (en)
CN (1) CN113935362A (en)
DE (1) DE102020208008A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE102020208008A1 (en) * 2020-06-29 2021-12-30 Robert Bosch Gesellschaft mit beschränkter Haftung Image classification and related training for security-related classification tasks
US11800065B2 (en) 2021-08-19 2023-10-24 Geotab Inc. Mobile image surveillance systems and methods
CN116824198A (en) * 2022-03-21 2023-09-29 华为云计算技术有限公司 Bias evaluation method, bias evaluation device, bias evaluation medium, bias evaluation program product and electronic equipment

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2454857B (en) 2006-09-18 2010-06-09 Bosch Gmbh Robert Method for processing a microscope intensity image
US20170103269A1 (en) * 2015-10-07 2017-04-13 Honda Motor Co., Ltd. System and method for providing laser camera fusion for identifying and tracking a traffic participant
US20180018553A1 (en) * 2015-03-20 2018-01-18 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Relevance score assignment for artificial neural networks
US20180240221A1 (en) * 2017-02-17 2018-08-23 Cogisen S.R.L. Method for image processing and video compression
US10699192B1 (en) * 2019-01-31 2020-06-30 StradVision, Inc. Method for optimizing hyperparameters of auto-labeling device which auto-labels training images for use in deep learning network to analyze images with high precision, and optimizing device using the same
US20210125104A1 (en) * 2019-10-25 2021-04-29 Onfido Ltd Machine learning inference system
US20210406587A1 (en) * 2020-06-29 2021-12-30 Robert Bosch Gmbh Image classification and associated training for safety-relevant classification tasks

Also Published As

Publication number Publication date
DE102020208008A1 (en) 2021-12-30
US20210406587A1 (en) 2021-12-30
CN113935362A (en) 2022-01-14

Similar Documents

Publication Publication Date Title
US11978262B2 (en) Image classification and associated training for safety-relevant classification tasks
US11067405B2 (en) Cognitive state vehicle navigation based on image processing
US11017250B2 (en) Vehicle manipulation using convolutional image processing
DE112018007287T5 (en) VEHICLE SYSTEM AND METHOD FOR DETECTING OBJECTS AND OBJECT DISTANCE
DE112014007249B4 (en) Image processing device, vehicle display system, display device, image processing method and image processing program
WO2019048011A1 (en) Gesture control for communication with an autonomous vehicle on the basis of a simple 2d camera
DE102017100198A1 (en) FIXING GENERATION FOR MACHINE LEARNING
WO2014173863A1 (en) Method and apparatus for detecting non-motorised road users
DE102008043743A1 (en) Sensor signals e.g. video sensor signals, evaluating method for detecting e.g. traffic sign in surrounding of vehicle, involves evaluating information based on evaluation specification, and outputting information based on evaluation result
DE102007001099A1 (en) Driver assistance system for traffic sign recognition
WO2013152929A1 (en) Learning method for automated recognition of traffic signs, method for determining an updated parameter set for the classification of a traffic sign and traffic sign recognition system
WO2018215242A2 (en) Method for determining a driving instruction
DE102018128634A1 (en) Method for providing visual information about at least part of an environment, computer program product, mobile communication device and communication system
US20120189161A1 (en) Visual attention apparatus and control method based on mind awareness and display apparatus using the visual attention apparatus
DE102021130548A1 (en) Presentation of objects in an image based on anomalies
DE102016120066A1 (en) A computer implemented method for controlling an object recognition system
US11823305B2 (en) Method and device for masking objects contained in an image
DE102020205825A1 (en) System for deception detection, prevention and protection of ADAS functions
DE102020205831A1 (en) System for deception detection, prevention and protection of ADAS functions
EP3553695A1 (en) Method and system for displaying image data from at least one night vision camera of a vehicle
EP4287147A1 (en) Training method, use, software program and system for the detection of unknown objects
DE102021207258B3 (en) Method for automatically controlling at least one vehicle function of a vehicle and notification system for a vehicle
US20230267653A1 (en) Generating realistic images from specified semantic maps
WO2024110149A1 (en) Method for detecting at least one object in a vehicle interior
DE102017221045A1 (en) Method and device for providing a warning message by means of a visual field display unit for a vehicle

Legal Events

Date Code Title Description
FEPP: Fee payment procedure; ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY
STPP: Information on status: patent application and granting procedure in general; DOCKETED NEW CASE - READY FOR EXAMINATION
AS: Assignment; Owner name: ROBERT BOSCH GMBH, GERMANY; ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MAYER, UDO;REEL/FRAME:058694/0725; Effective date: 20210916
STPP: Information on status: patent application and granting procedure in general; NON FINAL ACTION MAILED
STPP: Information on status: patent application and granting procedure in general; RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER
STPP: Information on status: patent application and granting procedure in general; NON FINAL ACTION MAILED
STPP: Information on status: patent application and granting procedure in general; RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER
STPP: Information on status: patent application and granting procedure in general; NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS
ZAAB: Notice of allowance mailed; ORIGINAL CODE: MN/=.
STPP: Information on status: patent application and granting procedure in general; PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED
STCF: Information on status: patent grant; PATENTED CASE