WO2021165077A1 - Method and device for evaluating an image classifier (Verfahren und Vorrichtung zur Bewertung eines Bildklassifikators) - Google Patents
- Publication number
- WO2021165077A1 (PCT/EP2021/052931)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- classifier
- robot
- areas
- image classifier
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/56—Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
- G06V20/58—Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/217—Validation; Performance evaluation; Active pattern learning techniques
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/28—Determining representative reference patterns, e.g. by averaging or distorting; Generating dictionaries
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/0014—Image feed-back for automatic industrial control, e.g. robot with camera
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
- G06V10/7747—Organisation of the process, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/776—Validation; Performance evaluation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/64—Three-dimensional [3D] objects
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/06—Recognition of objects for industrial automation
Definitions
- the invention relates to a method for evaluating an image classifier, a method for training an image classifier, a method for operating an image classifier, a training device, a computer program, a control system and a machine-readable storage medium.
- Image classifiers are a key technology for operating at least partially autonomous and / or mobile robots. It has been shown that image classifiers learned from data, in particular neural networks, currently provide the best classification performance.
- the advantage of the method with features according to independent claim 1 is that it provides an insight into the functioning of an image classifier.
- the method enables the determination of elements of an image which are relevant from a safety perspective and which the image classifier is intended to recognize. This allows an insight into the accuracy of the classifications of the classifier.
- the method can be used in order to be able to determine whether a mobile robot that carries out its navigation based on the output of an image classifier is safe enough to be able to operate it.
- the invention deals with a computer-implemented method for evaluating an image classifier, a classifier output of the image classifier being provided for controlling an at least partially autonomous robot (100, 220), the method for evaluating comprising the following steps:
- An image classifier can be understood to mean a device that is designed to accept images (also: image data) and generate a classification output that characterizes the image data or parts thereof.
- an image classifier can be used to determine in which parts of an input image objects are located.
- an image classifier can be used to detect other road users.
- the corresponding classifier output can then be used to control the robot.
- the classifier output can be used to determine a trajectory on which the robot moves through its environment without collision. That is to say, the image data preferably show an environment of the robot.
- an image classifier can also be used for other classification tasks, for example for semantic segmentation.
- the image classifier classifies every desired point in an input image, for example every pixel of a camera image, into a desired class. This can be used, for example, for a mobile robot to recognize the boundaries of the drivable area of the surroundings based on an input image and to plan a trajectory based on this.
- An image classifier can contain a model from the area of machine learning, such as a neural network.
- the model can be used to classify the input of the image classifier.
- the image classifier can have preprocessing and / or postprocessing methods.
- a post-processing method can be, for example, a non-maximum suppression, which can be used to merge different bounding boxes of the same objects.
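As an illustration of the post-processing step mentioned above, the following is a minimal sketch of greedy non-maximum suppression. The box format `(x_min, y_min, x_max, y_max)`, the function names and the default threshold of 0.5 are assumptions made for this example; the text does not specify a particular variant.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_maximum_suppression(boxes: List[Box], scores: List[float],
                            iou_threshold: float = 0.5) -> List[int]:
    """Return the indices of the boxes that are kept, best score first.

    A box is dropped if it overlaps an already kept, higher-scoring box
    by more than iou_threshold (hypothetical default value).
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in kept):
            kept.append(i)
    return kept
```

With two strongly overlapping boxes for the same object and one distant box, only the higher-scoring of the overlapping pair and the distant box are kept.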
- Different types of images can be used as input data for an image classifier, in particular sensor data, for example from a camera sensor, a radar sensor, a LIDAR sensor, an ultrasonic sensor or an infrared camera sensor. Audio recordings from microphones can also be represented as image data, for example in the form of spectral images, and used as input for an image classifier. It is also conceivable that several types of sensor data are combined in order to obtain an input datum for the image classifier.
- image data can be generated synthetically with the aid of computer-aided measures.
- images can be calculated or rendered based on physical models.
- the images that are used for input to the image classifier can be recorded directly by a sensor and passed on to the image classifier.
- image data can be recorded or generated before the classification and then preferably temporarily stored on a storage medium before they are passed on to the image classifier.
- In particular, hard disks, flash drives or solid-state disks can be used.
- Image data can also be held in a dynamic memory.
- a control signal can be determined with which an at least partially autonomous robot can be controlled.
- An at least partially autonomous robot can be understood to mean a robot which at least temporarily carries out a task independently without the control of a person. It can use sensors and actuators for this purpose, for example.
- a partially autonomous robot can be, for example, an autonomously driving vehicle, a lawnmower robot, a vacuum robot or a drone. In the following, the term robot is understood to mean an at least partially autonomous robot.
- An image data set can be understood to mean a set of image data, it being possible for specific information in the form of annotations to be assigned to the image data.
- Annotation of an image data can be understood here as a set of information that describes the image data or parts thereof or contains further additional information about the image.
- Image data can depict scenes, whereby scenes can contain objects.
- a scene can be understood to be a situation in the real world, in particular the environment of the robot.
- a scene can represent a set of objects in a street situation.
- objects can be understood to mean other road users, for example.
- a scene can be understood to mean the virtual world on the basis of which an image datum was synthesized.
- the objects can be understood as virtual elements of the scene.
- Annotations can be assigned to image data, whereby annotations can comprise information about the respective depicted scene and / or image areas.
- an annotation can contain a set of bounding boxes that describe the position of the objects depicted in the image datum.
- the annotation contains pixel-precise information with regard to the class of a pixel (i.e. a semantic segmentation) of the image data.
- an annotation contains information on weather and / or environmental conditions that prevailed when the specific image data was recorded, e.g. rain, solar radiation, time of day or the condition of the ground.
- an annotation contains information about the scene in which the image was recorded.
- the annotation can contain information about the relative position of the sensor in relation to other objects in the scene, for example. This information can later be used, for example, to determine the 3-dimensional position of an object that is mapped 2-dimensionally in an image datum (for example a camera image).
- the 3-dimensional position information of objects in the scene is contained directly in the annotation, for example in the form of a relative vector from the sensor to the object.
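One way to derive a position from a 2-dimensional bounding box and the known sensor pose can be sketched as follows. This is only an illustration under strong assumptions that are not taken from the text: a pinhole camera with a horizontal optical axis, a flat ground plane, and a bounding box whose bottom edge touches the ground. All names and parameters are hypothetical.

```python
from typing import Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels,
# with the image row index increasing downwards.
Box = Tuple[float, float, float, float]


def ground_position_from_box(box: Box, focal_px: float, cx: float, cy: float,
                             camera_height_m: float) -> Tuple[float, float]:
    """Estimate (lateral offset, forward distance) in metres.

    Assumes a flat ground plane and a pinhole camera mounted
    camera_height_m above it with a horizontal optical axis.
    """
    x_min, _, x_max, y_max = box
    u = 0.5 * (x_min + x_max)   # horizontal centre of the box
    v = y_max                   # bottom edge, assumed to touch the ground
    if v <= cy:
        raise ValueError("bottom edge must lie below the horizon row")
    forward_m = focal_px * camera_height_m / (v - cy)
    lateral_m = (u - cx) * forward_m / focal_px
    return lateral_m, forward_m
```

For a camera mounted 1.5 m above the ground with a focal length of 1000 px, a box whose bottom edge lies 75 px below the principal point corresponds to a distance of 20 m under these assumptions.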
- the first image data set can preferably be selected for the method in such a way that a system is used for recording which corresponds to or is similar to the later robot.
- the first image data set can be recorded in such a way that a test driver controls the vehicle in such a way that desired image data can be recorded by the sensors of the vehicle.
- the first data record is recorded by a vehicle which, in terms of the sensor system, is structurally identical to the vehicle for which the image classifier is to be evaluated.
- the first image data set is generated synthetically with the aid of a computer-aided model.
- the model can preferably be selected in such a way that it at least resembles the robot in terms of shape, physical properties and sensors.
- the scene can be understood as the combination of the arrangement and properties of virtual objects that can be used to generate synthetic image data.
- the first image data set can also be obtained from existing sources.
- For example, there are a number of freely accessible data sets on the Internet that can be used for the purpose of assessing an image classifier.
- the annotations required for the method can either be generated manually or at least partially automatically for the various image data.
- the annotations preferably contain relative information with regard to the element of an image datum to be classified and the system that is / was used for the recording.
- a vehicle can be designed in such a way that it can record a camera-based image data set which can then be used to evaluate an image classifier that is to be used later in the vehicle or a structurally identical vehicle.
- the annotations of the image data can contain, for example, bounding boxes of objects to be detected in the vicinity of the vehicle.
- they can contain information about the position of the objects to be recognized in relation to the vehicle for a specific image datum. This information can later be used to determine the relevance value of an object.
- the model data of the synthetic model can be included directly as information in the annotations.
- the data described in the previous paragraph can be simulated using a computer. This requires a virtual model of the sensor and its position in the simulated scene.
- This position and / or positions of simulated objects that are later to be recognized by the image classifier can in this case be included directly in the annotation.
- the annotations also preferably contain information relating to, for example, the speed of the robot, the acceleration, the steering angle, the drive positions or the planned trajectory, with each item of information representing values or templates that were available at the time an image datum was recorded.
- This information is preferably also contained in the annotation for the objects of the scene, insofar as it makes sense.
- areas of a scene can then be determined that the robot could have reached within a certain time at the time of the recording. These areas can be determined, for example, using the time-to-collision with other objects and / or the time-to-react and / or time-to-brake and / or time-to-steer and / or time-to-kickdown.
- the areas can be understood as safety-critical areas in which the robot must be able to detect other objects with high accuracy and reliability, for example in order to plan a safe trajectory.
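How far a robot could travel within such a time can be approximated with basic kinematics. The sketch below assumes straight-line motion with constant acceleration, which is a simplification chosen for this example and not a model prescribed by the text; the function and parameter names are hypothetical.

```python
def reachable_distance(speed_mps: float, horizon_s: float,
                       max_accel_mps2: float = 0.0) -> float:
    """Upper bound on the distance covered within horizon_s seconds.

    Assumes straight-line motion starting at speed_mps with a constant
    acceleration of max_accel_mps2 (simplifying assumption).
    """
    if speed_mps < 0 or horizon_s < 0 or max_accel_mps2 < 0:
        raise ValueError("inputs must be non-negative")
    return speed_mps * horizon_s + 0.5 * max_accel_mps2 * horizon_s ** 2
```

The time horizon passed in could be, for example, the time-to-react or the time-to-brake recorded for the respective image.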
- An image area to be classified can be understood as at least part of an image data item for which the image classifier is intended to predict a certain object or a certain class that is mapped by the image area.
- the areas to be classified can be understood as images of the objects that are to be detected by the image classifier.
- the image areas can be understood as the pixels of an image, whereby each pixel can be assigned to an object.
- the image areas to be classified can then be assigned relevance values.
- Each area to be classified is preferably given a relevance value.
- a relevance value can be understood here as a value which indicates the extent to which a misclassification of the image classifier for this image area can become critical for the behavior of the robot using the image classifier. For example, image areas that depict objects very far away from the robot can be assigned small relevance values. Image areas that depict objects close to the robot can, by contrast, be assigned high relevance values, since a misclassification of them would have a greater impact on the robot.
- Relevance values can be represented by scalar values.
- a relevance value can be binary or real.
- For all or some of the image areas to be classified, it can then be determined whether they have been correctly classified by the image classifier.
- the evaluation of the image classifier can then preferably take place on the basis of the relevance values of the incorrectly classified image areas. For example, the evaluation can take place in the form of a sum or an average of the relevance values of the incorrectly classified areas.
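The sum and average mentioned above can be sketched in a few lines. The function name, the `mode` parameter and the convention of returning 0.0 when nothing was misclassified are assumptions made for this example.

```python
from statistics import mean
from typing import Sequence


def evaluation_measure(relevance: Sequence[float], correct: Sequence[bool],
                       mode: str = "sum") -> float:
    """Aggregate the relevance values of the misclassified image areas.

    relevance[i] is the relevance value of image area i and correct[i]
    states whether the classifier handled that area correctly.
    """
    if len(relevance) != len(correct):
        raise ValueError("relevance and correct must have the same length")
    missed = [r for r, ok in zip(relevance, correct) if not ok]
    if mode == "sum":
        return float(sum(missed))
    if mode == "mean":
        # Assumed convention: no misclassified areas yields 0.0.
        return float(mean(missed)) if missed else 0.0
    raise ValueError("mode must be 'sum' or 'mean'")
```

A larger value indicates a worse classifier in this formulation, since only the relevance of the errors is accumulated.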
- the advantage of this approach is that a numerical and objective value can be determined which can be used to make a decision as to whether the image classifier can be used as part of the control of the robot. This enables a differentiated insight into the functioning of an image classifier as described above. This is a considerable improvement since, in particular, machine-learning-based image classifiers exhibit black box behavior that cannot otherwise be viewed in a satisfactory manner.
- the image areas to be classified are each assigned to an object.
- the advantage of this approach is that the relevance of an image area can reflect the relevance of the object. This allows the relevance of objects in a scene to be assessed based on the image areas. In return, this allows a detailed insight into the behavior of the image classifier for recognizing objects in a scene.
- the step of determining the areas that can be reached by the robot is based on movement information from the robot.
- the robot's movement information allows a determination to be made as to which areas of the scene the robot can plausibly move into at all. Objects in these areas should therefore be able to be predicted with a high degree of accuracy in order to control the robot in such a way that it does not collide with the objects, for example.
- the movement information can be extracted from the annotation or estimated with the aid of the image data. For example, several consecutive images of the first image data set can be used to estimate the speed of the robot. In the case of, for example, stationary manufacturing robots, information about the areas that can be reached by the robot can also be obtained from the robot's data sheets.
- The advantage of this approach is therefore that areas can be determined in which the image classifier should be able to reliably recognize objects. Since other areas of the scene may be less relevant or not relevant at all, this method allows a detailed and targeted insight into the operation of the image classifier, and the evaluation allows a better assessment of the recognition performance of the image classifier. By contrast, in known methods the recognition performance of an image classifier is estimated on all image areas of an image. The recognition performance of the image classifier with regard to safe and error-free operation of the robot can therefore be assessed much better via the areas that can be reached.
- the step of determining the relevance values comprises the following steps: • Determination of depth information of the objects;
- image areas can only be assigned a relevance value other than zero, for example, if the corresponding objects can actually interact with the robot in a safety-critical manner.
- the recognition of a pedestrian is irrelevant for the trajectory planning of a robot under safety-critical standards if the robot cannot travel faster than 30 km / h and the pedestrian to be recognized is, for example, more than 500 m away.
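The pedestrian example can be checked with a short calculation. Assuming the robot drives straight towards the pedestrian at its maximum speed $v$ (an idealization that is not stated in the text), the time needed to cover a distance $d$ of 500 m is:

```latex
t = \frac{d}{v}
  = \frac{500\,\mathrm{m}}{30\,\mathrm{km/h}}
  = \frac{500\,\mathrm{m}}{\tfrac{30}{3.6}\,\mathrm{m/s}}
  = 60\,\mathrm{s}
```

Since the pedestrian is more than 500 m away, the robot needs at least 60 s to reach them. If the time horizon used for the reachable areas is much shorter than this, the pedestrian lies outside those areas and can be given a relevance value of zero.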
- the behavior of the image classifier in relation to the safety of the robot can be assessed much more precisely using the method presented.
- the step of evaluating the image classifier includes determining an evaluation measure, the method for evaluating the image classifier further including the following additional step:
- the evaluation measure can be selected such that the worse the performance of the image classifier, the greater it is. In this case, retraining would take place if the assessment measure is greater than the assessment threshold.
- the assessment measure can be, for example, the sum or the average of the relevance values of all misclassified image areas.
- Retraining the image classifier can be understood to mean a method that adapts the parameters of the image classifier with the aid of the second image data set in such a way that the recognition accuracy of the image classifier is further improved.
- the second image data set can in turn contain annotations that can be used to adapt the parameters with the aid of a supervised learning method.
- the second image data set can be determined using the same method as the first image data set. It is also conceivable that the second image data set contains at least parts of the image data and / or annotations of the first image data set.
- the advantage of this approach is that the recognition performance of the image classifier can be improved until it is sufficient for the image classifier to be operated in a real product.
- this approach also offers the advantage that a just sufficient evaluation result can be further improved and thus a certain safety buffer can be achieved with regard to the recognition accuracy of the image classifier. It is also conceivable that the image classifier is retrained with second image data sets that differ between the iterations in order to further increase the recognition performance.
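The interplay of evaluation and retraining can be sketched as a simple loop. The callables `evaluate` and `retrain`, the iteration limit and the return format are hypothetical placeholders for this example; the text does not prescribe a concrete interface.

```python
from typing import Callable, Tuple, TypeVar

C = TypeVar("C")  # type of the (hypothetical) classifier object


def evaluate_and_retrain(classifier: C,
                         evaluate: Callable[[C], float],
                         retrain: Callable[[C], C],
                         threshold: float,
                         max_iterations: int = 10) -> Tuple[C, float, bool]:
    """Retrain while the evaluation measure exceeds the threshold.

    Assumes a measure that grows as the classifier gets worse.
    Returns the final classifier, its measure and whether the
    measure satisfies the threshold.
    """
    measure = evaluate(classifier)
    iterations = 0
    while measure > threshold and iterations < max_iterations:
        classifier = retrain(classifier)   # e.g. with a second data set
        measure = evaluate(classifier)
        iterations += 1
    return classifier, measure, measure <= threshold
```

The iteration limit is a practical safeguard added here so that the loop terminates even if the threshold is never reached.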
- image data of the first image data set can be used for at least a part of the second image data set.
- images can be removed from the first image data set or exchanged in each iteration.
- images can be removed from the second image data set or exchanged in each iteration.
- the retraining of the image classifier is carried out based on relevance values of image areas of the second image data set.
- the advantage of this approach is that image areas that are less relevant or not relevant from the evaluation point of view can be weighted in the training in such a way that they have little or no influence on the training of the image classifier. This simplifies the training of the image classifier, which in turn significantly increases its recognition performance. The performance of the overall system increases accordingly if an image classifier trained in this way is used as part of the control of a robot.
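One possible way to weight image areas by relevance during training is a relevance-weighted cross-entropy loss. This particular loss is an assumption chosen for illustration; the text only states that image areas can be weighted. The function name and the normalization by the total weight are likewise hypothetical.

```python
import math
from typing import Sequence


def relevance_weighted_loss(correct_class_probs: Sequence[float],
                            relevance: Sequence[float]) -> float:
    """Cross-entropy over image areas, weighted by relevance value.

    correct_class_probs[i] is the probability the classifier assigns to
    the annotated class of image area i. Areas with relevance 0 do not
    contribute to the loss at all.
    """
    if len(correct_class_probs) != len(relevance):
        raise ValueError("inputs must have the same length")
    total_weight = sum(relevance)
    if total_weight == 0:
        return 0.0
    eps = 1e-12  # guards against log(0)
    weighted = sum(-w * math.log(max(p, eps))
                   for p, w in zip(correct_class_probs, relevance))
    return weighted / total_weight
```

In this formulation, a badly classified area with relevance 0 leaves the loss unchanged, so the parameter update is driven only by the relevant areas.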
- FIG. 1 shows, schematically, the flowchart of the method of evaluating an image classifier
- FIG. 2 schematically shows a control system
- FIG. 3 schematically shows an autonomous vehicle that is controlled by the control system
- FIG. 4 schematically shows a production robot which is controlled by the control system.
- FIG. 1 shows a flow chart of a method for evaluating an image classifier (60).
- the image classifier (60) is designed such that it can recognize vehicles in motorway situations, whereby the classifier output (y) includes bounding boxes.
- an image data set is determined. This can be done, for example, with the aid of a test vehicle in which a camera is installed that is suitable for recording image data.
- the image data set shows image data from motorway situations on which vehicles can be recognized.
- vehicle data such as speed and steering angle that are present at the respective recording time of an image are assigned to the image data. Alternatively, these vehicle data can also be estimated from the recorded image data after the recording.
- the image data set determined in this way is then manually annotated by a person.
- semi-automatic annotation can also be carried out with the aid of a second image classifier.
- the second image classifier can suggest annotations that can be checked by a person and changed if necessary.
- the second image classifier carries out the annotation in a fully automated manner, in that the suggestions of the second image classifier are used directly as annotations.
- the annotations generated contain information relating to the other vehicles in the image data recorded, bounding boxes of the vehicles in the image, as well as the installation position and orientation of the camera sensor.
- the annotations additionally contain 3-dimensional information such as position, orientation, speed and / or direction of travel of the vehicles to be detected accordingly.
- In a second step (301), it is then determined for each of the images of the image data set which areas the test vehicle could have reached in a specified time at the point in time when the image was taken.
- the time-to-react, for example, can be used here as the time.
- instead of the time-to-react, the time-to-collision, time-to-brake, time-to-steer or time-to-kickdown, or combinations of these times, can be used.
- the calculation of the reachable areas takes place with the help of the speed information in the annotations, as well as information about the position of the vehicle.
- the result is information about which areas the vehicle could have reached in a certain time at the point in time when an image data item was recorded in the scene in which the image data item was recorded.
- a relevance value is determined for the other vehicles of the image data.
- the 3-dimensional position of the other vehicles is determined on the basis of the annotated bounding boxes and the installation position of the camera sensor. Alternatively, this information can also be extracted directly from the annotation, if it is available.
- the relevance value can be defined as 1 for all vehicles that are in one of the areas determined in the previous step, and otherwise as 0. Alternatively, it is possible that the vehicles are assigned a value between 0 and 1 if they are outside an area determined in the previous step. Alternatively, it is also conceivable that vehicles are also assigned a value between 0 and 1 in one of the areas determined in the previous step. It is also conceivable that the relevance value of an object also depends on the speed and trajectory of the object. For example, objects outside the reachable areas can also receive a relevance value greater than 0 if, for example, they are moving towards the corresponding reachable areas.
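The relevance rules described above can be combined in a small function. The sketch assumes that the reachable area is a circle of radius `reach_radius_m` around the test vehicle at the origin and that relevance decays linearly outside it; both are simplifications chosen for this example, and all names and default values are hypothetical.

```python
import math
from typing import Tuple

Vec2 = Tuple[float, float]  # (lateral, forward) in metres or metres/second


def relevance_value(position: Vec2, velocity: Vec2, reach_radius_m: float,
                    falloff_m: float = 50.0) -> float:
    """Relevance of another vehicle relative to the test vehicle at (0, 0).

    1.0 inside the reachable area, a value between 0 and 1 outside it if
    the object is approaching, and 0.0 otherwise.
    """
    distance = math.hypot(position[0], position[1])
    if distance <= reach_radius_m:
        return 1.0
    # A negative radial speed means the distance to the test vehicle shrinks.
    radial_speed = (position[0] * velocity[0]
                    + position[1] * velocity[1]) / distance
    if radial_speed >= 0:
        return 0.0
    excess = distance - reach_radius_m
    return max(0.0, 1.0 - excess / falloff_m)
```

A binary variant results if the last two branches are replaced by a plain `return 0.0`.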
- In a fourth step (303), the image data of the first image data set are classified by the image classifier (60).
- In the classification, a vehicle can either be detected in an image datum, that is to say correctly classified, or not.
- In a fifth step (304), the recognition accuracy of the image classifier (60) is evaluated.
- the mean value or the median of the relevance values can also be used as an evaluation measure.
- the image classifier (60) can be retrained in a sixth step (306) with the aid of a second image data set.
- the evaluation threshold is defined as 0. This is synonymous with the statement that all vehicles with a relevance greater than 0 must be recognized. In the event that this does not occur, the image classifier is retrained.
- the second image data set can be determined using one of the methods that can also be used to determine the first image data set. If the evaluation measure satisfies the evaluation threshold value, the image classifier (60) can be released.
- FIG. 2 shows an actuator (10) in its surroundings (20) in interaction with a control system (40).
- the surroundings (20) are detected at preferably regular time intervals with a sensor (30), in particular an imaging sensor such as a video sensor, which can also be provided by a plurality of sensors, for example a stereo camera.
- the sensor signal (S) - or, in the case of several sensors, one sensor signal (S) each - from the sensor (30) is transmitted to the control system (40).
- the control system (40) thus receives a sequence of sensor signals (S).
- the control system (40) uses this to determine control signals (A) which are transmitted to the actuator (10).
- the control system (40) receives the sequence of sensor signals (S) from the sensor (30) in an optional receiving unit (50), which converts the sequence of sensor signals (S) into a sequence of input images (x) (alternatively, the sensor signal (S) can be taken over directly as the input image (x)).
- the input image (x) can, for example, be a section or further processing of the sensor signal (S).
- the input image (x) comprises individual frames of a video recording. In other words, the input image (x) is determined as a function of the sensor signal (S).
- the sequence of input images (x) is fed to an image classifier (60) which, for example, was evaluated as in the first embodiment and whose evaluation measure was below the evaluation threshold value.
- the image classifier (60) is preferably parameterized by parameters (f) which are stored in a parameter memory (P) and are provided by this.
- the image classifier (60) determines from the input images (x) classifier outputs (y).
- the classifier outputs (y) are fed to an optional conversion unit (80) which uses them to determine control signals (A) which are fed to the actuator (10) in order to control the actuator (10) accordingly.
- the classifier output (y) includes information about objects that the sensor (30) has detected.
- the actuator (10) receives the control signals (A), is controlled accordingly and carries out a corresponding action.
- the actuator (10) can include a control logic (not necessarily structurally integrated) which determines a second control signal from the control signal (A) with which the actuator (10) is then controlled.
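The signal flow from sensor signal (S) via input image (x) and classifier output (y) to control signal (A) can be sketched as a generator. The three callables stand in for the receiving unit (50), the image classifier (60) and the conversion unit (80); their signatures and the toy usage below are assumptions made for illustration.

```python
from typing import Callable, Iterable, Iterator, TypeVar

S = TypeVar("S")  # sensor signal
X = TypeVar("X")  # input image
Y = TypeVar("Y")  # classifier output
A = TypeVar("A")  # control signal


def control_loop(sensor_signals: Iterable[S],
                 receive: Callable[[S], X],
                 classify: Callable[[X], Y],
                 convert: Callable[[Y], A]) -> Iterator[A]:
    """Yield one control signal per incoming sensor signal."""
    for signal in sensor_signals:
        image = receive(signal)    # receiving unit (50)
        output = classify(image)   # image classifier (60)
        yield convert(output)      # conversion unit (80)


# Toy usage: the "sensor signal" is simply the distance to an obstacle.
commands = list(control_loop(
    [5.0, 50.0],
    receive=lambda s: s,
    classify=lambda x: {"obstacle_distance_m": x},
    convert=lambda y: "brake" if y["obstacle_distance_m"] < 10.0 else "drive",
))
```

Passing the identity function as `receive` corresponds to the case where the sensor signal is used directly as the input image.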
- In further embodiments, the control system (40) comprises the sensor (30). In still further embodiments, the control system (40) alternatively or additionally also comprises the actuator (10). In further preferred embodiments, the control system (40) comprises one or a plurality of processors (45) and at least one machine-readable storage medium (46) on which instructions are stored which, when executed on the processors (45), cause the control system (40) to carry out the method according to the invention.
- a display unit (10a) is provided as an alternative or in addition to the actuator (10).
- FIG. 3 shows how the control system (40) can be used to control an at least partially autonomous robot, here an at least partially autonomous motor vehicle (100).
- the sensor (30) can be, for example, a video sensor preferably arranged in the motor vehicle (100).
- the image classifier (60) is set up to identify objects from the input images (x).
- the actuator (10), which is preferably arranged in the motor vehicle (100), can be, for example, a brake, a drive or a steering system of the motor vehicle (100).
- the control signal (A) can then be determined in such a way that the actuator or actuators (10) are controlled in such a way that the motor vehicle (100) prevents, for example, a collision with the objects identified by the image classifier (60), in particular when these are objects of certain classes, e.g. pedestrians.
- the at least partially autonomous robot can also be another mobile robot (not shown), for example one that moves by flying, swimming, diving or walking.
- the mobile robot can also be, for example, an at least partially autonomous lawnmower or an at least partially autonomous cleaning robot.
- the control signal (A) can be determined in such a way that the drive and / or steering of the mobile robot are controlled such that the at least partially autonomous robot prevents, for example, a collision with objects identified by the image classifier (60).
- a display unit (10a) can be controlled with the control signal (A) and, for example, the determined safe areas can be displayed.
- it is also possible for the display unit (10a) to be controlled with the control signal (A) in such a way that it emits an optical or acoustic warning signal when it is determined that the motor vehicle (100) threatens to collide with one of the objects identified by the image classifier (60).
- FIG. 4 shows how the control system (40) can be used to control a production robot (220), such as a PUMA robot, wherein the work space (212) of the production robot (220) can also be entered by people (210).
- the control system (40) receives image data from a camera sensor (30), on the basis of which it controls an actuator (10), the actuator (10) driving the movement of the production robot (220) and a gripper at the end of the arm of the production robot (220), with which workpieces (211a, 211b) can be gripped.
- the control system (40) can also use the included image classifier (60) to recognize people (210) who are in the work area (212) of the production robot (220).
- the movement of the production robot (220) can be adapted by the control system (40) in such a way that the person or people (210) are not touched or injured by the production robot (220).
- the movement of the production robot (220) is selected such that the arm of the production robot (220) maintains a certain minimum distance from the person or persons (210) in the work space (212).
- the image classifier (60) was trained with images of people (210) in or around the work space (212) of the production robot (220).
- a first image data record can be recorded, the images of the first image data record also being able to show people (210) in or around the workspace of the production robot (220).
- the images of the first image data set can be provided with annotations in the form of bounding boxes for the persons (210) on the corresponding images for evaluation, with each bounding box also being assigned a relevance value. This relevance value can be defined as 1 if the corresponding bounding box shows a person (210) who is in the work space (212) of the production robot (220), and otherwise it can be defined as 0.
- the sum of the relevance values of the bounding boxes of the first data record not recognized by the image classifier (60) must be 0. This is equivalent to requiring that the image classifier (60) must not miss any person (210) inside the work space (212) of the production robot (220), while this is not required for people outside the work space. Alternatively, it is conceivable that people outside the work space (212) receive higher relevance values the closer they are to the work space (212). It is also conceivable that in this case the sum of the relevance values may be greater than 0 while the image classifier (60) is still evaluated as sufficiently safe for use.
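The relevance-weighted evaluation described in the preceding paragraphs can be illustrated with a short Python sketch. It is only a minimal illustration under stated assumptions, not the claimed method: the text does not say how a bounding box counts as "recognized", so matching by intersection-over-union (IoU) with a threshold of 0.5 is an assumption, and the names `Annotation`, `missed_relevance`, `is_sufficiently_safe` and the parameter `max_missed` are hypothetical. With the binary relevance values (1 inside the work space, 0 otherwise), the classifier passes only if the summed relevance of all missed boxes is 0.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


@dataclass
class Annotation:
    box: Box
    in_workspace: bool  # True if the annotated person is inside the work space


def relevance(annotation: Annotation) -> float:
    """Binary relevance value: 1 inside the work space, 0 otherwise."""
    return 1.0 if annotation.in_workspace else 0.0


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def missed_relevance(annotations: List[Annotation],
                     detections: List[Box],
                     iou_threshold: float = 0.5) -> float:
    """Sum the relevance values of all annotated boxes with no matching detection.

    Assumption: an annotation counts as recognized if any detection
    overlaps it with IoU >= iou_threshold.
    """
    total = 0.0
    for ann in annotations:
        recognized = any(iou(ann.box, det) >= iou_threshold for det in detections)
        if not recognized:
            total += relevance(ann)
    return total


def is_sufficiently_safe(missed: float, max_missed: float = 0.0) -> bool:
    """Pass if the missed relevance does not exceed the allowed maximum.

    max_missed = 0.0 corresponds to the strict requirement; a larger value
    models the graded-relevance variant, where a sum greater than 0 may be allowed.
    """
    return missed <= max_missed


if __name__ == "__main__":
    annotations = [
        Annotation(box=(10, 10, 50, 100), in_workspace=True),
        Annotation(box=(200, 10, 240, 100), in_workspace=False),
    ]
    detections = [(12, 11, 49, 98)]  # only the person inside the work space is found
    missed = missed_relevance(annotations, detections)
    print(missed, is_sufficiently_safe(missed))
```

In this sketch, missing the person outside the work space does not count against the classifier, whereas missing the person inside the work space alone would make the evaluation fail. For the graded variant, `relevance` could return values between 0 and 1 that grow as a person gets closer to the work space, with `max_missed` chosen accordingly.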
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Data Mining & Analysis (AREA)
- Medical Informatics (AREA)
- Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- General Health & Medical Sciences (AREA)
- Software Systems (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Biology (AREA)
- Life Sciences & Earth Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Bioinformatics & Computational Biology (AREA)
- Robotics (AREA)
- Image Analysis (AREA)
Priority Applications (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202180014940.4A CN115104132B (zh) | 2020-02-17 | 2021-02-08 | 用于评价图像分类器的方法和装置 |
| US17/790,578 US12462574B2 (en) | 2020-02-17 | 2021-02-08 | Method and device for evaluating an image classifier |
| JP2022549288A JP7473663B2 (ja) | 2020-02-17 | 2021-02-08 | 画像分類器を評価するための方法及び装置 |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| DE102020201939.8 | 2020-02-17 | ||
| DE102020201939.8A DE102020201939A1 (de) | 2020-02-17 | 2020-02-17 | Verfahren und Vorrichtung zur Bewertung eines Bildklassifikators |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2021165077A1 (de) | 2021-08-26 |
Family
ID=74572774
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/EP2021/052931 Ceased WO2021165077A1 (de) | 2020-02-17 | 2021-02-08 | Verfahren und vorrichtung zur bewertung eines bildklassifikators |
Country Status (5)
| Country | Link |
|---|---|
| US (1) | US12462574B2 |
| JP (1) | JP7473663B2 |
| CN (1) | CN115104132B |
| DE (1) | DE102020201939A1 |
| WO (1) | WO2021165077A1 |
Families Citing this family (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US12613510B2 (en) * | 2023-01-24 | 2026-04-28 | Gdm Holding Llc | On-robot data collection |
| CN116188449B (zh) * | 2023-03-13 | 2023-08-08 | 哈尔滨市科佳通用机电股份有限公司 | 铁路货车缓解阀拉杆开口销丢失故障识别方法及设备 |
Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2019175012A1 (de) * | 2018-03-14 | 2019-09-19 | Robert Bosch Gmbh | Verfahren zum erzeugen eines trainingsdatensatzes zum trainieren eines künstlichen-intelligenz-moduls für eine steuervorrichtung eines fahrzeugs |
Family Cites Families (26)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2011517632A (ja) * | 2008-02-20 | 2011-06-16 | コンチネンタル・テーヴエス・アクチエンゲゼルシヤフト・ウント・コンパニー・オツフエネハンデルスゲゼルシヤフト | 車両の周辺にある物体を検出する方法及び援助システム |
| JP2009282760A (ja) | 2008-05-22 | 2009-12-03 | Toyota Motor Corp | 車両制御装置 |
| US9122958B1 (en) | 2014-02-14 | 2015-09-01 | Social Sweepster, LLC | Object recognition or detection based on verification tests |
| US9704043B2 (en) | 2014-12-16 | 2017-07-11 | Irobot Corporation | Systems and methods for capturing images and annotating the captured images with information |
| US11854308B1 (en) * | 2016-02-17 | 2023-12-26 | Ultrahaptics IP Two Limited | Hand initialization for machine learning based gesture recognition |
| US11841920B1 (en) * | 2016-02-17 | 2023-12-12 | Ultrahaptics IP Two Limited | Machine learning based gesture recognition |
| CN106845374B (zh) * | 2017-01-06 | 2020-03-27 | 清华大学 | 基于深度学习的行人检测方法及检测装置 |
| US11348269B1 (en) * | 2017-07-27 | 2022-05-31 | AI Incorporated | Method and apparatus for combining data to construct a floor plan |
| WO2019113510A1 (en) * | 2017-12-07 | 2019-06-13 | Bluhaptics, Inc. | Techniques for training machine learning |
| US20190205744A1 (en) * | 2017-12-29 | 2019-07-04 | Micron Technology, Inc. | Distributed Architecture for Enhancing Artificial Neural Network |
| US11173605B2 (en) * | 2018-02-26 | 2021-11-16 | dogugonggan Co., Ltd. | Method of controlling mobile robot, apparatus for supporting the method, and delivery system using mobile robot |
| CN110494863B (zh) | 2018-03-15 | 2024-02-09 | 辉达公司 | 确定自主车辆的可驾驶自由空间 |
| US10725475B2 (en) * | 2018-04-09 | 2020-07-28 | Toyota Jidosha Kabushiki Kaisha | Machine learning enhanced vehicle merging |
| EP3797339B1 (en) * | 2018-05-22 | 2025-03-12 | Starship Technologies OÜ | Method and system for automatic autonomous road crossing |
| US11875012B2 (en) * | 2018-05-25 | 2024-01-16 | Ultrahaptics IP Two Limited | Throwable interface for augmented reality and virtual reality environments |
| US10937173B2 (en) * | 2018-11-15 | 2021-03-02 | Qualcomm Incorporated | Predicting subject body poses and subject movement intent using probabilistic generative models |
| US11397272B2 (en) * | 2018-12-11 | 2022-07-26 | Exxonmobil Upstream Research Company | Data augmentation for seismic interpretation systems and methods |
| US11922323B2 (en) * | 2019-01-17 | 2024-03-05 | Salesforce, Inc. | Meta-reinforcement learning gradient estimation with variance reduction |
| GB201906234D0 (en) * | 2019-05-03 | 2019-06-19 | Microsoft Technology Licensing Llc | An interpretable neural network |
| DE102019210507A1 (de) * | 2019-07-16 | 2021-01-21 | Robert Bosch Gmbh | Vorrichtung und computerimplementiertes Verfahren für die Verarbeitung digitaler Sensordaten und Trainingsverfahren dafür |
| US11126855B2 (en) * | 2019-08-08 | 2021-09-21 | Robert Bosch Gmbh | Artificial-intelligence powered ground truth generation for object detection and tracking on image sequences |
| EP3923192A1 (en) * | 2020-06-12 | 2021-12-15 | Robert Bosch GmbH | Device and method for training and testing a classifier |
| GB2598761A (en) * | 2020-09-11 | 2022-03-16 | Nokia Technologies Oy | Domain adaptation |
| US12148194B2 (en) * | 2020-09-14 | 2024-11-19 | Intelligent Fusion Technology, Inc. | Method, device, and storage medium for targeted adversarial discriminative domain adaptation |
| US20220308592A1 (en) * | 2021-03-26 | 2022-09-29 | Ohmnilabs, Inc. | Vision-based obstacle detection for autonomous mobile robots |
| EP4105839B1 (en) * | 2021-06-16 | 2025-05-28 | Robert Bosch GmbH | Device and method to adapt a pretrained machine learning system to target data that has different distribution than the training data without the necessity of human annotations on target data |
2020
- 2020-02-17 DE DE102020201939.8A patent/DE102020201939A1/de active Pending
2021
- 2021-02-08 US US17/790,578 patent/US12462574B2/en active Active
- 2021-02-08 JP JP2022549288A patent/JP7473663B2/ja active Active
- 2021-02-08 CN CN202180014940.4A patent/CN115104132B/zh active Active
- 2021-02-08 WO PCT/EP2021/052931 patent/WO2021165077A1/de not_active Ceased
Non-Patent Citations (4)
| Title |
|---|
| JONATHON BYRD ET AL: "What is the Effect of Importance Weighting in Deep Learning?", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 8 December 2018 (2018-12-08), XP081368966 * |
| M ALTHOFF: "Reachability Analysis and its Application to the Safety Assessment of Autonomous Cars", 3 January 2018 (2018-01-03), XP055587085, Retrieved from the Internet <URL:https://mediatum.ub.tum.de/doc/963752/642175.pdf> * |
| MATTHIAS ALTHOFF: "Dissertation", 2010, TECHNISCHE UNIVERSITÄT MÜNCHEN, article "Reachability Analysis and its Application to the Safety Assessment of Autonomous Cars" |
| OHN-BAR ESHED ET AL: "Are all objects equal? Deep spatio-temporal importance prediction in driving videos", PATTERN RECOGNITION, vol. 64, 1 April 2017 (2017-04-01), pages 425 - 436, XP029864359, ISSN: 0031-3203, DOI: 10.1016/J.PATCOG.2016.08.029 * |
Cited By (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN114973056A (zh) * | 2022-03-28 | 2022-08-30 | 华中农业大学 | 基于信息密度的快速视频图像分割标注方法 |
| US20230419529A1 (en) * | 2022-05-25 | 2023-12-28 | Sick Ag | Method and apparatus for acquiring master data of an object |
| US12573070B2 (en) * | 2022-05-25 | 2026-03-10 | Sick Ag | Method and apparatus for acquiring master data of an object |
| US20240119274A1 (en) * | 2022-09-23 | 2024-04-11 | International Business Machines Corporation | Training neural networks with convergence to a global minimum |
Also Published As
| Publication number | Publication date |
|---|---|
| DE102020201939A1 (de) | 2021-08-19 |
| JP7473663B2 (ja) | 2024-04-23 |
| CN115104132A (zh) | 2022-09-23 |
| US12462574B2 (en) | 2025-11-04 |
| US20230038337A1 (en) | 2023-02-09 |
| JP2023513385A (ja) | 2023-03-30 |
| CN115104132B (zh) | 2025-09-12 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| WO2021165077A1 (de) | Verfahren und vorrichtung zur bewertung eines bildklassifikators | |
| EP3695244B1 (de) | Verfahren und vorrichtung zum erzeugen eines inversen sensormodells und verfahren zum erkennen von hindernissen | |
| DE102018206208A1 (de) | Verfahren, Vorrichtung, Erzeugnis und Computerprogramm zum Betreiben eines technischen Systems | |
| DE102021002798A1 (de) | Verfahren zur kamerabasierten Umgebungserfassung | |
| DE102019209457A1 (de) | Verfahren zum Trainieren eines künstlichen neuronalen Netzes, künstliches neuronales Netz, Verwendung eines künstlichen neuronalen Netzes sowie entsprechendes Computerprogramm, maschinenlesbares Speichermedium und entsprechende Vorrichtung | |
| EP4248418A2 (de) | Verfahren und system zur annotation von sensordaten | |
| DE102019209463A1 (de) | Verfahren zur Bestimmung eines Vertrauenswertes eines Objektes einer Klasse | |
| DE102022100545A1 (de) | Verbesserte objekterkennung | |
| DE102021003567A1 (de) | Verfahren zur Erkennung von Objektbeziehungen und Attributierungen aus Sensordaten | |
| DE102021206475A1 (de) | Hindernisdetektion im Gleisbereich auf Basis von Tiefendaten | |
| DE102020134530A1 (de) | Rückaufprallwarnsystem mit temporärem cnn | |
| DE102023100599A1 (de) | Rechnerisch effizientes unüberwachtes dnn-vortraining | |
| DE102018109680A1 (de) | Verfahren zum Unterscheiden von Fahrbahnmarkierungen und Bordsteinen durch parallele zweidimensionale und dreidimensionale Auswertung; Steuereinrichtung; Fahrassistenzsystem; sowie Computerprogrammprodukt | |
| DE102020214596A1 (de) | Verfahren zum Erzeugen von Trainingsdaten für ein Erkennungsmodell zum Erkennen von Objekten in Sensordaten einer Umfeldsensorik eines Fahrzeugs, Verfahren zum Erzeugen eines solchen Erkennungsmodells und Verfahren zum Ansteuern einer Aktorik eines Fahrzeugs | |
| DE102021201178A1 (de) | Computerimplementiertes verfahren zum erzeugen von zuverlässigkeitsangaben für computervision | |
| DE102016218196A1 (de) | Virtuelles strassenoberflächenerfassungs-testumfeld | |
| DE102022209403B4 (de) | Verfahren zum Überprüfen der Durchführung einer Prädiktionsaufgabe durch ein neuronales Netzwerk | |
| DE102022206131A1 (de) | Klassifikator und Verfahren für die Erkennung von Objekten aus Sensordaten auf der Basis einer vorgegebenen Klassenhierarchie | |
| EP4113392B1 (de) | Verfahren zum prüfen der zuverlässigkeit einer ki-basierten objekt-detektion | |
| DE102020208981A1 (de) | Verfahren zur Schätzung der Geometrie eines Bewegungsweges | |
| DE102019220615A1 (de) | Verfahren und Vorrichtung zum Erkennen und Klassifizieren von Objekten | |
| DE102023209075A1 (de) | Computerimplementiertes Verfahren, Verarbeitungsvorrichtung und Fahrzeugsteuervorrichtung | |
| DE102024210706A1 (de) | Objektdetektion mit Distanzbestimmung | |
| DE102024203903A1 (de) | Verfahren zum Trainieren eines Maschinenlernmodells für eine Detektionsanwendung in einem Fahrzeug | |
| EP4645253A1 (de) | Verfahren zum ermitteln einer umgebungsrepräsentation einer umgebung eines fahrzeugs |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21704243; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2022549288; Country of ref document: JP; Kind code of ref document: A |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 21704243; Country of ref document: EP; Kind code of ref document: A1 |
| | WWG | Wipo information: grant in national office | Ref document number: 202180014940.4; Country of ref document: CN |
| | WWG | Wipo information: grant in national office | Ref document number: 17790578; Country of ref document: US |