DE102019204514A1

DE102019204514A1 - Device and method for object detection

Info

Publication number: DE102019204514A1
Application number: DE102019204514.6A
Authority: DE
Inventors: Masato Takami; Alexander Lengsfeld
Original assignee: Robert Bosch GmbH
Current assignee: Robert Bosch GmbH
Priority date: 2019-03-29
Filing date: 2019-03-29
Publication date: 2020-10-01
Also published as: CN111753838A

Abstract

Device and computer-implemented method for object detection, wherein at least one parameter (308A, 308B) for a scaling of at least a part of an image, for a weighting of a probability that a specific object is present, or a number of image areas to be determined for the object detection is determined as a function of a result (306) of a semantic segmentation (304) of the image (302), wherein an image area is defined by the at least one parameter (308A, 308B) and at least a part of the image, and wherein the object detection (310) is carried out in at least a part of the image area as a function of the parameter.

Description

State of the art

To detect objects in an image, algorithms referred to as detectors are used. In conventional object detection, an object is detected by determining the coordinates of a box surrounding the object, a bounding box, and a probability value of an object class for the box. In the automotive sector, for example, other vehicles are detected with such object detectors. The system must detect objects of different sizes. A vehicle that is very close to the camera of another vehicle represents an object that has to be described by a significantly larger bounding box than objects representing vehicles that are at a greater distance from the camera. This size variation enlarges the search space and thus the computational demands on the system.

One method of describing, for example, the surroundings of a vehicle is semantic segmentation. Each pixel in an image captured by the camera is assigned to a class. As a result, even pixels of very small objects at a great distance are classified. Individual objects cannot be distinguished from one another by conventional semantic segmentation. This means that a transition in the image between objects of the same class, e.g. two overlapping vehicles, is not recognizable.

There are approaches which, starting from a semantic segmentation, extract information about object boundaries by merging and splitting connected segments. Deep learning methods that combine the properties of object detection and semantic segmentation, and thereby generate pixel-accurate object descriptions (an instance segmentation), perform better than such approaches. Mask R-CNN, for example, is used for this purpose. It is described, for example, in Kaiming He et al., Mask R-CNN, arXiv:1703.06870.

The bounding boxes are determined in an initial object detection. In a subsequent object detection, a semantic segmentation is carried out within a bounding box found by the initial object detection. This presupposes a good initial object detection, since objects that were missed in the initial object detection are not recognized subsequently either.

It is desirable to improve object detection over this.

Disclosure of the invention

This is achieved by the subject matter of the independent claims.

A computer-implemented method for object detection provides that at least one parameter for a scaling of at least a part of an image, for a weighting of a probability that a specific object is present, or a number of image areas to be determined for the object detection is determined as a function of a result of a semantic segmentation of the image, wherein an image area is defined by the at least one parameter and at least a part of the image, and wherein the object detection is carried out in at least a part of the image area as a function of the parameter. Semantic segmentation comprises the precise localization of objects in images. For each pixel of an image, membership of a known object class is determined. The object detection is carried out only in the part of the image that was determined by the semantic segmentation. This improves the performance of the object detection. For example, in digital image processing with an artificial neural network, the semantic segmentation of the image determines areas of the image as candidate bounding boxes and assigns them to object classes. These candidate bounding boxes, which are related to object classes, already localize an object that is subsequently detected in the object detection.

In one aspect, the image area in the image is determined by means of scaling as a function of the part of the image. Because the object is detected in a scaled image area, objects that would otherwise be undetectable at the normal resolution of the image can be found.

In this aspect, a size of the part of the image is preferably compared with a reference size, wherein the part of the image is enlarged by the scaling if the size falls below the reference size, or wherein the part of the image is reduced by the scaling if the image area exceeds the reference size. Very small parts of the image containing object pixels can be scaled up so that objects become detectable that would not have been found at the normal resolution. Conversely, very large objects in very large parts of the image, which are too large for the field of view of the detector, become detectable by scaling down.
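The comparison with a reference size can be illustrated with a minimal sketch. The function names and the interpretation of "size" as a side length in pixels are assumptions made for illustration; the description does not specify how the size is measured or how the scaling factor is chosen.

```python
def scaling_direction(region_size: float, reference_size: float) -> str:
    """Decide the scaling direction by comparing against the reference size."""
    if region_size < reference_size:
        return "enlarge"  # size falls below the reference size
    if region_size > reference_size:
        return "reduce"   # size exceeds the reference size
    return "keep"


def scale_factor(region_size: float, reference_size: float) -> float:
    """Assumed choice: scale so that the region matches the reference size."""
    if region_size <= 0 or reference_size <= 0:
        raise ValueError("sizes must be positive")
    return reference_size / region_size
```

Under these assumptions a 16-pixel region compared with a 64-pixel reference would be enlarged by a factor of 4, and a 256-pixel region would be reduced by a factor of 0.25.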

In one aspect, the semantic segmentation determines an object class from a plurality of object classes, and at least one property of a detector for the object detection is determined as a function of the object class. Detector properties that are particularly suitable for this object class can thus be used. This further improves object detection.

The image preferably comprises a plurality of pixels, wherein the semantic segmentation assigns object classes to pixels pixel by pixel or superpixel by superpixel. Superpixel by superpixel means that several grouped pixels are considered together. This precise assignment makes it possible to carry out the object detection on object pixels. This further improves object detection.

The part of the image is preferably defined by a contiguous area of pixels that are assigned to the same object class. The object detection is thus carried out on pixels of a specific detected object of this object class.
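One way to obtain contiguous areas of pixels that share an object class is connected-component labelling on the per-pixel class map. The following is a minimal sketch, not the method of the application: the 4-neighbourhood, the background label 0 and the function name are assumptions.

```python
from collections import deque


def connected_regions(class_map, background=0):
    """Return [(class_id, [(row, col), ...]), ...] for 4-connected regions.

    class_map is a 2-D list of per-pixel class ids, as a semantic
    segmentation might produce. Pixels labelled `background` are skipped.
    """
    rows = len(class_map)
    cols = len(class_map[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            cls = class_map[r][c]
            if cls == background or seen[r][c]:
                continue
            # Breadth-first flood fill over neighbours of the same class.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and class_map[ny][nx] == cls):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append((cls, pixels))
    return regions
```

Two separated groups of pixels with the same class id come out as two regions, which matches the idea that each part of the image is one contiguous area of a single object class.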

With regard to the device for object detection, the device comprises a classification device which is designed to determine at least one parameter for a scaling of at least a part of an image, for a weighting of a probability that a specific object is present, or a number of image areas to be determined for the object detection, as a function of a result of a semantic segmentation of the image, wherein an image area is defined by the at least one parameter and at least a part of the image, and wherein the device comprises a detector which is designed to carry out the object detection in at least a part of the image area as a function of the parameter. This device is considerably more effective than individual systems operating sequentially.

The device advantageously comprises a scaling device which is designed to determine the image area in the image by means of scaling as a function of the part of the image. The scaling device makes it possible to adapt parts of the image to the detector so that otherwise undetectable objects can be recognized.

The scaling device is preferably designed to compare a size of the part of the image with a reference size, to enlarge the part of the image by the scaling if the size falls below the reference size, or to reduce the part of the image by the scaling if the image area exceeds the reference size. The reference size makes it possible to choose the direction in which the size is adjusted.

The classification device is preferably designed to determine an object class from a plurality of object classes by means of semantic segmentation, and to determine at least one property of the detector for the object detection as a function of the object class. The detector can thus be set for the respective object class.

The image preferably comprises a plurality of pixels, wherein the classification device is designed to assign object classes to pixels by means of semantic segmentation, pixel by pixel or superpixel by superpixel. This enables a pixel-by-pixel assignment of parameters to the pixels of the various object classes.

The part of the image is preferably defined by a contiguous area of pixels that are assigned to the same object class. The image area based on the part of the image is defined by the parameter. As a result, the pixels of an object are assigned to the same parameter via pixels of the same object class.

Further advantageous embodiments emerge from the following description and the drawing. The drawing shows:

Figure 1 a schematic representation of parts of a device for object detection,
Figure 2 steps in a method for object detection,
Figure 3 a schematic representation of part of an object detection,
Figure 4 a schematic representation of part of a scaling.

Figure 1 schematically shows parts of a device 100 for object detection.

The device 100 comprises a classification device 102. The device 100 comprises a detector 104. The device 100 comprises an optional scaling device 106. The device 100 comprises a generating device 108. An input device 110 and an output device 112 can be provided. In the example, these devices comprise one or more processors and at least one memory for instructions. In the example, data lines 114 connect these devices.

The input device 110 is designed for the input of an image. The image can be read from a memory or received from outside the device 100.

The classification device 102 is designed to determine at least one parameter as a function of a result of a semantic segmentation of the image. An image area is defined by the parameter. For example, parameters are determined for a scaling of at least a part of an image, for a weighting of a probability that a specific object is present, or for a number of image areas to be determined for the object detection.

In the example, the classification device 102 is an artificial neural network.

In one aspect, the image area is a part of the image.

In another aspect, the image area in the image is determined by means of scaling as a function of a part of the image that is defined as a function of the parameter. In this aspect, the scaling device 106 is provided and designed for this scaling.

The scaling device 106 can be designed to compare a size of the part of the image with a reference size, to enlarge the part of the image by the scaling if the size falls below the reference size, or to reduce the part of the image by the scaling if the image area exceeds the reference size. In the example, the scaling device 106 is designed as an artificial neural network or as a filter.

In one aspect, the classification device 102 is designed to determine an object class from a plurality of object classes by means of semantic segmentation.

In this aspect, the classification device 102 can be designed to determine at least one property of the detector for the object detection as a function of the object class.

In one aspect, the image comprises a plurality of pixels. In this aspect, the classification device 102 can be designed to assign object classes to pixels pixel by pixel by means of semantic segmentation.

In this aspect, the part of the image is preferably defined by a contiguous area of pixels that are assigned to the same object class.

The detector 104 is designed to carry out the object detection in at least a part of the image area as a function of the parameter.

In the example, the detector 104 is an artificial neural network.

In one aspect, the classification device 102 is designed to learn the parameter during the object detection. The classification device 102 can be designed as a self-learning artificial neural network.

A computer-implemented method for object detection is described with reference to Figure 2.

In a step 202, after the start, a result of a semantic segmentation of an image is determined. A step 204 is then carried out.

In step 204, a parameter is determined as a function of the result of the semantic segmentation of the image. An image area is defined by the parameter. An optional step 206 is then carried out.

In the optional step 206, the image area in the image is determined by means of scaling as a function of a part of the image that is defined as a function of the parameter. For example, a size of the part of the image is compared with a reference size, wherein the part of the image is enlarged by the scaling if the size falls below the reference size, or wherein the part of the image is reduced by the scaling if the image area exceeds the reference size.

A step 208 is then carried out.

If the optional step 206 is omitted, the part of the image itself is used unscaled as the image area for the object detection.

In step 208, the object detection is carried out in at least a part of the image area as a function of the parameter.

In step 202, provision can be made to determine an object class from a plurality of object classes by means of semantic segmentation. In addition, if the image comprises a plurality of pixels, provision can be made in step 202 to assign object classes to pixels pixel by pixel by means of semantic segmentation. In this case, the part of the image can be defined by a contiguous area of pixels that are assigned to the same object class.

In this case, provision can be made in step 208 to determine at least one property of a detector for the object detection as a function of the object class.

The method then ends.
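The sequence of steps 202 to 208 can be sketched as a small orchestration function. This is only an illustration of the control flow: the callables `segment`, `derive_parameters`, `detect` and `scale`, and the `"box"` entry of a parameter, are hypothetical placeholders for components the description leaves open (in the example they would be neural networks or filters).

```python
from typing import Any, Callable, Dict, List, Optional

Image = List[List[Any]]
Parameter = Dict[str, Any]  # hypothetical: {"box": (top, left, bottom, right), ...}


def run_object_detection(
    image: Image,
    segment: Callable[[Image], Any],
    derive_parameters: Callable[[Any], List[Parameter]],
    detect: Callable[[Image, Parameter], List[Any]],
    scale: Optional[Callable[[Image, Parameter], Image]] = None,
) -> List[Any]:
    # Step 202: semantic segmentation of the whole image.
    segmentation = segment(image)
    # Step 204: parameters derived from the segmentation result; each one
    # defines a part of the image through its (hypothetical) "box" entry.
    parameters = derive_parameters(segmentation)

    detections: List[Any] = []
    for parameter in parameters:
        top, left, bottom, right = parameter["box"]
        part = [row[left:right] for row in image[top:bottom]]
        # Step 206 (optional): scale the part; otherwise use it unscaled.
        image_area = scale(part, parameter) if scale is not None else part
        # Step 208: object detection in the image area, depending on the parameter.
        detections.extend(detect(image_area, parameter))
    return detections
```

Passing `scale=None` corresponds to omitting the optional step 206, in which case the unscaled part of the image is handed to the detector.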

Provision can be made to use, for the object detection in those parts of the image in which no assignment of pixels to an object class can be determined by semantic segmentation, a detector whose property is unspecific with regard to the plurality of object classes.
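Selecting detector properties by object class, with a class-unspecific fallback, could look like the following sketch. The class names, property names and all values are hypothetical placeholders; the description does not state which detector properties are adapted.

```python
from typing import Dict, Optional

# Hypothetical class-specific detector properties (illustrative values only).
CLASS_SPECIFIC_PROPERTIES: Dict[str, Dict[str, float]] = {
    "vehicle": {"aspect_ratio": 2.0, "score_threshold": 0.6},
    "pedestrian": {"aspect_ratio": 0.4, "score_threshold": 0.5},
}

# Hypothetical properties of a detector that is unspecific with regard to
# the object classes, used where no class could be assigned.
UNSPECIFIC_PROPERTIES: Dict[str, float] = {
    "aspect_ratio": 1.0,
    "score_threshold": 0.5,
}


def detector_properties(object_class: Optional[str]) -> Dict[str, float]:
    """Return a copy of the properties for the class, or the unspecific set."""
    chosen = CLASS_SPECIFIC_PROPERTIES.get(object_class, UNSPECIFIC_PROPERTIES)
    return dict(chosen)
```

A call with `None` or with an unknown class name falls back to the unspecific properties, which mirrors the use of a class-unspecific detector for parts of the image without a class assignment.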

In one aspect, a plurality of object detections is carried out for a plurality of images using the procedure described. If, for example, an artificial neural network is used for the semantic segmentation, anchor points of bounding boxes are determined as parameters, for example. For an exemplary neural network, a plurality of parameters is determined in the method. This is equivalent to generating candidate bounding boxes. This means that the artificial neural network for semantic segmentation is supplemented by an artificial neural network which learns to generate candidate bounding boxes on the basis of the semantic segmentation. The advantage is that large candidate bounding boxes are generated in large areas of the image that contain an object of an object class, while small candidate bounding boxes are generated in small areas of the image that contain an object of the same object class. Areas of the image that are not assigned to any object class receive no bounding box. This reduced number of candidates reduces the computing effort and the computing resources required for the subsequent object recognition many times over.
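The relationship between region size and candidate box size can be illustrated without a neural network. The sketch below is a hand-written stand-in for the learned candidate generation described above: the `Region` and `Candidate` types, the centre anchor point and the relative scales are all assumptions made for illustration.

```python
from typing import List, NamedTuple, Sequence


class Region(NamedTuple):
    object_class: str
    top: int
    left: int
    bottom: int  # exclusive
    right: int   # exclusive


class Candidate(NamedTuple):
    object_class: str
    center_y: float
    center_x: float
    height: float
    width: float


def candidate_boxes(
    regions: Sequence[Region],
    scales: Sequence[float] = (0.5, 0.75, 1.0),
) -> List[Candidate]:
    """Generate candidates anchored at each region centre, sized relative to
    the region's extent. Areas without a region produce no candidates."""
    candidates = []
    for region in regions:
        height = region.bottom - region.top
        width = region.right - region.left
        center_y = region.top + height / 2
        center_x = region.left + width / 2
        for s in scales:
            candidates.append(
                Candidate(region.object_class, center_y, center_x, height * s, width * s)
            )
    return candidates
```

With these assumptions the number of candidates grows with the number of segmented regions rather than with the image size, and an image without any classified region yields an empty candidate list.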

Figure 3 shows a schematic representation of part of an object detection. Figure 3 depicts a system that combines object detection and semantic segmentation in such a way that the advantages of both approaches are ideally exploited.

For an image 302, a semantic segmentation 304 is carried out in order to define a search area for the object detection within the entire image and thereby reduce the search space of the object detection. An image 306 resulting from the semantic segmentation 304 is shown schematically in Figure 3. The semantic segmentation 304 is determined, for example, by the classification device 102. In the example, a first region 306A and a second region 306B of contiguous pixels are determined. The surface area of the first region 306A is larger than the surface area of the second region 306B. In the example, both regions are assigned the same object class "vehicle". It is also possible for the image 302 to contain a plurality of different objects or several objects of the same kind. In this case, a plurality of different object classes is assigned to a plurality of different regions.

Based on the image 302, the parameter for the object detection 310 is determined. In the example, a plurality of first bounding boxes 308A in an input image 308 for an object detection 310 is assigned to the first area 306A. In the example, a plurality of second bounding boxes 308B in the input image 308 for the object detection 310 is assigned to the second area 306B. For the object detection 310 of the areas 306A and 306B, the plurality of candidate bounding boxes and their respective object class are thus known.
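
The description does not state how the plurality of candidate bounding boxes is derived from an area. One conceivable approach, given here only as a sketch under that assumption, is to take the tight box around the area and to generate several candidates around its center at different scale factors. The helper names tight_box and candidate_boxes and the scale factors are hypothetical.

```python
def tight_box(area):
    """Smallest axis-aligned box (x0, y0, x1, y1) around (row, col) pixels, x1/y1 exclusive."""
    ys = [p[0] for p in area]
    xs = [p[1] for p in area]
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)

def candidate_boxes(box, scales=(1.0, 1.25, 1.5)):
    """Generate candidate boxes around the same center; the scale factors are an assumption."""
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    w, h = x1 - x0, y1 - y0
    return [(cx - s * w / 2, cy - s * h / 2, cx + s * w / 2, cy + s * h / 2)
            for s in scales]

area = [(2, 3), (2, 4), (3, 3), (3, 4)]  # pixels given as (row, col)
box = tight_box(area)
candidates = candidate_boxes(box)
print(box, candidates)
```

Because the candidates are derived per area, a large area yields large candidate boxes and a small area yields small ones, and the object class of each candidate is known from the segmentation.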

In this example, a detector set up for the detection of small objects is used for the object detection 310 of the small second area 306B. For the comparatively larger first area 306A, a detector set up for the detection of larger objects is used for the object detection 310. In this case, a part of the image 302 is fed to the object detection 310 as the image area.

Optionally, instead of or in addition to adapting the detector, the image area 306A or 306B is scaled appropriately. This is shown schematically in Figure 4. Figure 4 shows the scaling of a part of the image 402 into a scaled image area 402'. In this case, the resolution or surface area of the image area 402 differs from the resolution or surface area of the image area 402' at the input of the object detection. In this case, the part of the image 402 is fed to the object detection as the scaled image area 402'. Another part of the image 404, likewise shown in Figure 4, is fed to the object detection unscaled. In the example, the scaled image area 402' comprises two mutually overlapping bounding boxes 402A and 402B for mutually overlapping vehicles that are easier to recognize in the scaled image area 402' than in the unscaled image area 402. In this example, the scaling enlarges the unscaled image area 402 in terms of surface area and resolution. This simplifies a subsequent object detection.
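
As a non-binding sketch, the scaling of a part of the image before it is fed to the object detection may look as follows. Nearest-neighbour interpolation is assumed here solely for brevity; the description does not prescribe an interpolation method, and the function names crop and resize_nearest are hypothetical.

```python
def crop(image, box):
    """Cut the part of the image given by box = (x0, y0, x1, y1), x1/y1 exclusive."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]

def resize_nearest(patch, factor):
    """Scale a patch by factor using nearest-neighbour sampling (an assumed method)."""
    src_h, src_w = len(patch), len(patch[0])
    dst_h = max(1, round(src_h * factor))
    dst_w = max(1, round(src_w * factor))
    return [
        [patch[min(src_h - 1, int(y / factor))][min(src_w - 1, int(x / factor))]
         for x in range(dst_w)]
        for y in range(dst_h)
    ]

image = [
    [0, 0, 0, 0],
    [0, 1, 2, 0],
    [0, 3, 4, 0],
    [0, 0, 0, 0],
]
part = crop(image, (1, 1, 3, 3))      # unscaled part of the image
scaled = resize_nearest(part, 2.0)    # enlarged image area for the detector
print(scaled)
```

A factor below 1 reduces the patch in the same way, which corresponds to the downscaling of large areas.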

For very large objects, for which the semantic segmentation determines, for example, many candidate bounding boxes that indicate objects within the outer contours of a vehicle, it can be useful to determine, starting from an unscaled image area, a scaling that reduces the size of the scaled image area.

In these examples, an input image is classified by the semantic segmentation, preferably pixel by pixel, by determining image areas whose pixels are assigned to the same object class, e.g. vehicle. In the example, these image areas are divided into different categories according to the size of their surface area. Contiguous areas with a large surface area are optionally scaled down and fed to a detector of the object class, in the example vehicle. The detector is operated with the optimal parameters for this size and/or object class. Small areas are handled correspondingly, these images optionally being scaled up in order to detect the very small objects of the object class, in the example vehicle.
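
The categorization by surface area and the choice of a scaling can be illustrated by the following sketch. It assumes a single reference area, a tolerance band around it, and a scale factor chosen so that the scaled area matches the reference area; the function name choose_scaling, the category names and all numeric values are hypothetical.

```python
import math

def choose_scaling(area_px, reference_px, tolerance=0.25):
    """Return (category, scale factor) for an area, compared with a reference area.

    A factor above 1 enlarges the image area, a factor below 1 reduces it.
    The tolerance band and the square-root rule are assumptions of this sketch.
    """
    ratio = area_px / reference_px
    if ratio < 1.0 - tolerance:
        category = "small"
    elif ratio > 1.0 + tolerance:
        category = "large"
    else:
        return "medium", 1.0
    # Side lengths scale with the square root of the surface-area ratio.
    return category, math.sqrt(reference_px / area_px)

for area_px in (400, 1600, 6400):
    print(area_px, choose_scaling(area_px, reference_px=1600))
```

The returned category could equally be used to select a detector parameter set for the respective size instead of, or in addition to, scaling the image area.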

This results in an object detection that can capture objects of widely differing sizes in an image. The prior partitioning at the bounding-box level is advantageous for the operation of the object detection, since the number of candidate bounding boxes after the semantic segmentation is smaller than is possible when bounding boxes are determined without the semantic segmentation. This leads to a significant reduction in runtime.

With the method described here, it is possible to treat the outputs of both algorithms, i.e. the algorithm for the semantic segmentation and the algorithm for the object detection, as equally valid and to achieve greater stability of the system through the redundancy gained. In the areas in which the semantic segmentation has erroneously classified no pixels as the object class, in the example vehicle, the object detection can be carried out with the standard parameters and, given a very high detection probability for an object of the object class, can overrule the semantic segmentation.
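
One way of expressing this overruling, shown here only as a sketch, is to apply a stricter probability threshold to detections that are not supported by the semantic segmentation. The two threshold values, the function names and the toy support check are assumptions and are not specified in the description.

```python
def fuse(detections, supported_by_segmentation,
         default_threshold=0.5, override_threshold=0.9):
    """Keep detections; unsupported ones must reach the higher override threshold."""
    kept = []
    for det in detections:
        supported = supported_by_segmentation(det["box"])
        threshold = default_threshold if supported else override_threshold
        if det["score"] >= threshold:
            kept.append(det)
    return kept

def supported_by_segmentation(box):
    # Toy stand-in: the segmentation found vehicle pixels only left of x = 10.
    return box[0] < 10

detections = [
    {"box": (0, 0, 4, 4), "score": 0.60},    # supported, kept
    {"box": (20, 0, 24, 4), "score": 0.60},  # unsupported, too uncertain
    {"box": (30, 0, 34, 4), "score": 0.95},  # unsupported, overrules segmentation
]
kept = fuse(detections, supported_by_segmentation)
print([d["box"] for d in kept])
```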

As shown schematically in Figure 5, the semantic segmentation 304 is supplemented by a candidate generator 502, which learns to generate candidate bounding boxes 308A, 308B on the basis of the semantic segmentation 304. In the example, an artificial neural network with corresponding features is provided. The remaining procedure is as described for Figure 3, the same reference signs being used in Figure 3 and Figure 5 for elements with the same function. The candidate generator 502 is designed to generate, as described above, large candidate bounding boxes in large areas of the object class, in the example vehicle, and small candidate bounding boxes in small areas. In the example, the classification device 102 is designed to learn anchor points for the candidate bounding boxes as parameters during the object detection.

This system can be used, for example, for person detection in the surveillance sector, in robotics or in the automotive sector, in particular when object sizes vary very strongly between the near and far range. Furthermore, on the motorway in particular, at high speeds, the detection of distant vehicles is also necessary. For example, use in automated emergency braking is very advantageous.

Claims

Computer-implemented method for object detection, characterized in that at least one parameter for a scaling of at least a part of an image, for a weighting of a probability that a specific object is present, or a number of image areas to be determined for the object detection is determined depending on a result of a semantic segmentation of the image, wherein an image area is defined by the at least one parameter and at least a part of the image, and wherein the object detection is carried out depending on the parameter in at least a part of the image area.

Method according to Claim 1, characterized in that the image area in the image is determined by means of scaling depending on the part of the image.

Method according to Claim 2, characterized in that a size of the part of the image is compared with a reference size, wherein the part of the image is enlarged by the scaling if the size falls below the reference size, or wherein the part of the image is reduced by the scaling if the image area exceeds the reference size.

Method according to one of the preceding claims, characterized in that the semantic segmentation determines an object class from a plurality of object classes, at least one property of a detector for object detection being determined as a function of the object class.

Method according to one of the preceding claims, characterized in that the image comprises a plurality of pixels, the semantic segmentation assigning object classes to pixels pixel by pixel or superpixel by superpixel.

Method according to Claim 5, characterized in that the part of the image is defined by a contiguous area of pixels which are assigned to the same object class.

Device (100) for object detection, characterized in that the device (100) comprises a classification device (102) which is designed to determine at least one parameter for a scaling of at least a part of an image, for a weighting of a probability that a specific object is present, or a number of image areas to be determined for the object detection depending on a result of a semantic segmentation of the image, wherein an image area is defined by the at least one parameter and at least a part of the image, and wherein the device (100) comprises a detector (104) which is designed to carry out the object detection depending on the parameter in at least a part of the image area.

Device (100) according to Claim 7, characterized in that the device (100) comprises a scaling device (104) which is designed to determine the image area in the image by means of scaling depending on the part of the image.

Device (100) according to Claim 8, characterized in that the scaling device (104) is designed to compare a size of the part of the image with a reference size, to enlarge the part of the image by the scaling if the size falls below the reference size, or to reduce the part of the image by the scaling if the image area exceeds the reference size.

Device (100) according to one of Claims 7 to 9, characterized in that the classification device (102) is designed to determine an object class from a plurality of object classes by means of semantic segmentation, and to determine at least one property of the detector (104) for object detection depending on the object class.

Device (100) according to one of Claims 7 to 10, characterized in that the image comprises a plurality of pixels, the classification device (102) being designed to assign object classes to pixels by means of semantic segmentation, pixel by pixel or superpixel by superpixel.

Device (100) according to Claim 11, characterized in that the part of the image is defined by a contiguous area of pixels which are assigned to the same object class.

Computer program, characterized in that the computer program comprises computer-readable instructions which, when executed by a computer, cause the method according to one of Claims 1 to 6 to run.

Computer program product, characterized in that the computer program product comprises a memory on which a computer program according to Claim 13 is stored.

Computer-implemented method for at least partially autonomous movement of a vehicle, characterized in that an image is captured by a camera of the vehicle, wherein an object detection is carried out for the image according to the method according to one of Claims 1 to 6, and wherein at least one control for the vehicle, in particular for automated braking, steering or acceleration of the vehicle, is determined depending on the result of the object detection.

Control device for at least partially automated movement of a vehicle, characterized in that the control device comprises an interface for receiving an image that is captured by a camera of the vehicle, wherein the control device comprises the device for object detection for the image according to one of Claims 7 to 12, and wherein the control device comprises an interface for an actuator, which is designed to control the vehicle depending on a result of the object detection, in particular for automated braking, steering or acceleration of the vehicle.