DE102021205271A1

DE102021205271A1 - Quality check of training data for classification models for semantic segmentation of images

Info

Publication number: DE102021205271A1
Application number: DE102021205271.1A
Authority: DE
Inventors: Andres Mauricio Munoz Delgado
Original assignee: Robert Bosch GmbH
Current assignee: Robert Bosch GmbH
Priority date: 2021-05-21
Filing date: 2021-05-21
Publication date: 2022-11-24

Abstract

Method (100) for checking the quality of training data (2) for a classification model (1), comprising the steps:
• a classification model (1) is provided (110) that has been trained with the aim of mapping the training images (2a) onto the respectively associated target segmentation maps (2b);
• a relevance evaluation function (5) is provided (120);
• a set of calibration images (6) is provided (130);
• calibration evaluations (6#) are determined (140) with the relevance evaluation function (5);
• with the relevance evaluation function (5), a test evaluation (2a#) is determined (150) as to the extent to which parts of a training image (2a) are relevant for the decisions of the classification model (1) to assign the pixels of the training image (2a) indicated by the selection mask (2c) to classes (4);
• in response to the test evaluation (2a#) being consistent with the calibration evaluations (6#) (160), it is determined (170) that the portion of the target segmentation map (2b) for the training image (2a) indicated by the selection mask (2c) is correct;
• otherwise, it is determined (180) that the portion of the target segmentation map (2b) for the training image (2a) indicated by the selection mask (2c) is not correct.

Description

The present invention relates to the quality checking of training data for classification models with regard to whether these training data contain correct target segmentation maps for the respective training images as “labels”.

State of the art

Driving a vehicle in public road traffic is a complex task that requires continuous sensing of the vehicle's surroundings and a timely reaction to the appearance of objects, such as traffic signs, and to the behavior of other road users. A prerequisite for a correct reaction is that objects and other road users are classified correctly, so that, for example, a stop sign is always recognized as a stop sign.

To drive a vehicle in an at least partially automated manner, the classification of objects, which humans learn long before their first driving lesson, must be reproduced by machine. WO 2018/184 963 A2 discloses a method with which objects in the surroundings of a vehicle can be recognized using artificial neural networks (ANNs). ANNs can be used in particular to create semantic segmentation maps that assign the pixels of an input image to classes, which in turn represent, for example, types of objects. Such ANNs are trained with training data that contain training images and associated target segmentation maps with labels for each pixel of the training images.

The success of the training depends heavily on the labels in the training data being accurate with regard to the respective application. DE 10 2019 204 139 A1 discloses a method for training an ANN that is particularly robust against errors in the labels.

Disclosure of the invention

Within the scope of the invention, a method for checking the quality of training data for a classification model has been developed. This classification model is designed to assign a class to each individual pixel of an input image and thus to create a semantic segmentation map of the input image. The training data contain training images and associated target segmentation maps.

For this method, a classification model is first provided that has been trained with the aim of mapping the training images onto the respectively associated target segmentation maps. The behavior of the classification model is thus determined by the assignments of target segmentation maps to training images contained in the training data.

A relevance evaluation function is provided. For an input image and a binary selection mask of pixels of this input image, this relevance evaluation function indicates the extent to which parts of the input image are relevant for the decisions of the classification model to assign the pixels of the input image indicated by the selection mask to classes.
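The text does not prescribe how the relevance evaluation function is implemented. As a minimal sketch, assuming an occlusion-based approach (an illustrative choice, not taken from the source), the relevance of a pixel can be measured as the fraction of masked pixels whose predicted class changes when that pixel is blanked out. The names `occlusion_relevance` and `toy_model`, the pixel codes, and the class numbers are hypothetical.

```python
from typing import Callable, List

Image = List[List[int]]
Mask = List[List[bool]]


def occlusion_relevance(model: Callable[[Image], Image], image: Image,
                        mask: Mask, baseline: int = 0) -> List[List[float]]:
    """Relevance of every pixel for the class decisions on the masked pixels.

    Assumption: relevance = fraction of masked pixels whose predicted class
    changes when the pixel in question is replaced by `baseline`.
    """
    reference = model(image)
    selected = [(r, c) for r, row in enumerate(mask)
                for c, on in enumerate(row) if on]
    relevance = [[0.0] * len(row) for row in image]
    for r, row in enumerate(image):
        for c, _ in enumerate(row):
            occluded = [list(line) for line in image]
            occluded[r][c] = baseline
            prediction = model(occluded)
            changed = sum(prediction[i][j] != reference[i][j]
                          for i, j in selected)
            relevance[r][c] = changed / len(selected) if selected else 0.0
    return relevance


def toy_model(image: Image) -> Image:
    """Hypothetical stand-in for a trained model.

    Input codes: 0 = background, 1 = person, 9 = bicycle.
    Output classes: 0 = background, 1 = pedestrian, 2 = cyclist, 3 = bicycle.
    A person pixel is a cyclist only if a bicycle pixel is a direct neighbour.
    """
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            if image[r][c] == 9:
                out[r][c] = 3
            elif image[r][c] == 1:
                neighbours = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
                near_bike = any(0 <= i < height and 0 <= j < width
                                and image[i][j] == 9 for i, j in neighbours)
                out[r][c] = 2 if near_bike else 1
    return out


image = [[0, 0, 0],
         [0, 1, 9],
         [0, 0, 0]]
mask = [[False, False, False],
        [False, True, False],
        [False, False, False]]  # ask only about the person pixel
relevance = occlusion_relevance(toy_model, image, mask)
```

In this toy setup, both the person pixel and the neighbouring bicycle pixel receive a relevance of 1.0, while the background receives 0.0.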

The motivation for the relevance evaluation function is that the classification scores alone do not yet allow a reliable assessment of whether the training data used are labeled with target classification scores that are correct in the context of the respective application. The fundamental tendency of image classifiers to “memorize” the available training data (“overfitting”) is independent of whether the labels are correct or not. Therefore, it is not the classification scores that are analyzed here, but the output of the relevance evaluation function.

When, for example, a contiguous group of pixels representing a specific object is assigned to a class representing the type of the object, it is often not only these pixels themselves that are relevant. Rather, the classification of an object often also follows from its context in the input image. For example, the pixels belonging to a person are not sufficient on their own to determine beyond doubt whether this person is a pedestrian, a cyclist or an e-scooter rider. Distinguishing between a pedestrian and an e-scooter rider is particularly difficult because the person is in a standing position in both cases. However, the distinction is important for assessing a traffic situation, since an e-scooter rider travels considerably faster than a pedestrian and must use the cycle path or, where there is no cycle path, the roadway. The person can be identified as an e-scooter rider only on the basis of further pixels that belong to the e-scooter. The same applies to a cyclist, who can likewise be distinguished from a person sitting for other reasons only by the pixels belonging to the bicycle.

It is therefore to be expected that, for a person correctly recognized as an e-scooter rider or cyclist, the relevance evaluation function marks the e-scooter or the bicycle as relevant to the decision. In contrast, the classification of a person incorrectly recognized as an e-scooter rider or cyclist will not depend on pixels belonging to an e-scooter or bicycle.

In the quality control of products, too, the classification of certain pixels of the input image often depends on the context. For example, certain areas of the surface of a component can be coated with a functional layer. An image of the component then shows both areas of the surface that are coated with the functional layer and areas in which the base material of the component is visible. The finding that certain pixels show an uncoated area does not in itself allow a conclusive statement as to whether this is a regularly uncoated area or a defect in the coating. If, however, the uncoated area is, for example, an “island” completely surrounded by a coated area, it is very likely a defect where the coating has flaked off or is locally missing for other reasons.

Thus, for an uncoated area correctly recognized as a defect, the relevance evaluation function can, for example, mark the coating material surrounding this defect as relevant to the decision.

In particular, the relevance evaluation function can, for example, be designed to indicate which pixels of the input image must be taken into account in order to correctly assign all pixels of the input image indicated by the selection mask to classes.

A set of calibration images is provided for the method, for which it is known that the classification model assigns or has assigned the correct classes to the pixels of these calibration images. These can be, for example, images that

• were provided from the outset with target segmentation maps that assign a class label known to be correct to individual pixels, and
• were or are mapped by the classification model onto segmentation maps that at least approximately correspond to these target segmentation maps.

However, images can also be used as calibration images, for example,

• for which no target segmentation map was known in advance,
• but for which a subsequent check of the segmentation map determined by the classification model shows that this segmentation map is correct.

The set of calibration images can be considerably smaller than the total set of training images. In return, an amount of effort that would not be feasible for the total set of training images can be invested in guaranteeing that the pixels of every calibration image are labeled with the correct classes. Thus, when the task is to check a large set of training images for quality, the required effort can be focused on acquiring and labeling a small set of calibration images.

On the one hand, the relevance evaluation function is used to determine, for the calibration images and predefined binary selection masks of pixels of these calibration images, calibration evaluations as to the extent to which parts of the calibration images are relevant for the decisions of the classification model to assign the pixels of the calibration image indicated by the selection mask to classes. Several calibration evaluations can thus be generated from one calibration image by using different binary selection masks to ask for the explanations for the class assignments of different sets of pixels. For example, explanations can be requested one after the other for the class assignments of pixels that belong to different classes. If the calibration image contains several instances of an object of a certain class, explanations for the class assignment of these individual instances can be requested one after the other.
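As a minimal sketch of how several selection masks might be derived from one calibration image, the following builds one binary mask per class from a segmentation map. The function name `masks_per_class` is hypothetical; per-instance masks would additionally require connected-component labeling, which is not shown here.

```python
from typing import Dict, List


def masks_per_class(segmentation: List[List[int]]) -> Dict[int, List[List[bool]]]:
    """Build one binary selection mask per class present in the map."""
    classes = sorted({label for row in segmentation for label in row})
    return {
        k: [[label == k for label in row] for row in segmentation]
        for k in classes
    }


# Hypothetical 2x2 segmentation map with classes 0, 1 and 2.
segmentation = [[0, 1],
                [2, 1]]
masks = masks_per_class(segmentation)
```

Each mask can then be passed to the relevance evaluation function in turn, yielding one calibration evaluation per class.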

On the other hand, starting from a binary selection mask of pixels of a training image, the relevance evaluation function is used to determine a test evaluation as to the extent to which parts of the training image are relevant for the decisions of the classification model to assign the pixels of the training image indicated by the selection mask to classes. Thus, analogously to the calibration images, explanations can be requested, for example, as to why a certain group of pixels is classified as a certain type of object.

It is checked to what extent the test evaluation is consistent with the calibration evaluations. In response to the test evaluation being consistent with the calibration evaluations, it is determined that the portion of the target segmentation map for the training image indicated by the selection mask is correct. If, on the other hand, the test evaluation is not consistent with the calibration evaluations, it is determined that the portion of the target segmentation map for the training image indicated by the selection mask is not correct.

If a sufficient number of calibration images is available and a sufficient number of binary selection masks is examined, the classification model can realistically assign the pixels selected by the respective mask to the correct classes in all cases only if it bases its decision on meaningful image areas. As explained above, these image areas can also include the surrounding context of the pixels to be classified. The classification model must therefore have learned which image areas are relevant for assigning which pixel groups to classes. This learning experience can include, for example, that an e-scooter or a bicycle turns a neighboring person into an e-scooter rider or a cyclist, or that surrounding coating material turns an uncoated spot on a surface into a defect. The learning experience is reflected in a certain characteristic distribution of the image parts considered relevant in each case.

Conversely, this means that the classification model will, with high probability, also assign selected pixel groups from other images to the correct classes if it relies on image parts from this distribution in each case. For the training images, these class assignments are given by the target segmentation maps, because the training was aimed precisely at mapping the training images at least approximately onto these target segmentation maps.

The case can now arise that areas of a training image queried with the selection mask are assigned to the classes according to the target segmentation map on the basis of entirely different image areas than would be expected from said distribution. This indicates that the classification model is making a class assignment that was more or less “forced” on it through “overfitting” during training with incorrect labels in the target segmentation map, without being correct in substance. Accordingly, in response to the test evaluation not being consistent with the calibration evaluations, it can be determined that the portion of the target segmentation map for the training image indicated by the selection mask is not correct.

Figuratively speaking, when trained with incorrect labels, the classification model may reason, for example: “I think this group of pixels belongs to a pedestrian. But if the boss says it is a cyclist, then that must be so... He is right! There are bicycles standing around everywhere in this image. That convinces me. It is a cyclist.” The test evaluation for the corresponding training image thus marks a large number of areas distributed over the entire image, each showing bicycles, as relevant to the decision to assign a single person to the class “cyclist”. This does not match the distribution of the calibration evaluations, which in this example all have in common that only an area immediately around the person in question can ever be relevant for the decision between pedestrian, cyclist and e-scooter rider.

It was recognized that the check of whether the correct image context is taken into account when assigning certain pixel groups to classes can be influenced by “overfitting” of the classification model to certain training data to a much lesser extent than the class assignment itself. Precisely when a class assignment is objectively wrong, it cannot be based on a suitable image context, because such a context cannot be present (otherwise the class assignment would be objectively correct). In the above example, in which the pedestrian is wrongly turned into a cyclist, the classification model can “memorize” the classification as a cyclist that the loss function “drums into” it. However, this does not “conjure up” a bicycle in the immediate vicinity of the person in the training image, so the classification model has to look for some other explanation (here: the bicycles distributed elsewhere in the image).

The check of the extent to which the test evaluation is consistent with the calibration evaluations can include, for example, determining at least one feature and/or at least one quantity from at least one calibration evaluation, and determining a corresponding feature or a corresponding quantity from the test evaluation. It can then be checked to what extent this corresponding feature or quantity is consistent with the feature or quantity for the calibration evaluation. In this way, the comparison of the test evaluations with the calibration evaluations is abstracted to a certain degree. In particular, this makes it easier to compare a specific test evaluation with a large number of calibration evaluations that belong to calibration images with very different content.
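The source does not specify a concrete consistency criterion. As a minimal sketch, assuming a single scalar quantity per evaluation and a simple deviation test against the calibration values (both are illustrative assumptions, as are the name `is_consistent`, the `tolerance` parameter and the example numbers), the comparison could look like this:

```python
from statistics import mean, stdev
from typing import Sequence


def is_consistent(test_value: float, calibration_values: Sequence[float],
                  tolerance: float = 3.0) -> bool:
    """Check whether a quantity from the test evaluation lies within
    `tolerance` standard deviations of the calibration quantities.

    Assumption: at least two calibration values are available.
    """
    mu = mean(calibration_values)
    sigma = stdev(calibration_values)
    return abs(test_value - mu) <= tolerance * sigma


# Hypothetical quantity: fraction of the image marked as relevant.
calibration_fractions = [0.10, 0.12, 0.11, 0.09]
label_ok = is_consistent(0.11, calibration_fractions)       # close to calibration
label_suspect = not is_consistent(0.60, calibration_fractions)  # far outside
```

A training label whose test quantity falls far outside the calibration range would then be flagged as not correct; other criteria, such as comparing whole distributions, are equally conceivable.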

The following are particularly suitable as features or quantities, for example:

• a quantitative portion of the calibration image to which the relevance evaluation function assigns a relevance above a predefined threshold; and/or
• classes to which the classification model assigns those pixels of the calibration image to which the relevance evaluation function assigns a relevance above a predefined threshold; and/or
• a spatial position of the pixels relevant for the assignment to classes relative to the pixels in the calibration image indicated by the selection mask.

For example, the number of pixels and/or other features of the calibration image to which the relevance evaluation function assigns a relevance above a predetermined threshold can be determined as the quantitative proportion.

For example, the classes to which the classification model assigns the relevant pixels of the calibration image can be recorded as a frequency distribution over the classes, such as a histogram.
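The first two features can be illustrated with a minimal sketch in plain Python. It assumes that the relevance evaluation is available as a 2D list of per-pixel relevance values and that the model's class assignment is available as a 2D list of class names of the same shape; the function names, the threshold of 0.5 and the toy data are hypothetical and not taken from the description.

```python
from collections import Counter

def relevant_fraction(relevance, threshold):
    """Fraction of all pixels whose relevance exceeds the threshold."""
    flat = [r for row in relevance for r in row]
    return sum(r > threshold for r in flat) / len(flat)

def relevant_class_histogram(relevance, class_map, threshold):
    """Histogram of predicted classes over pixels with relevance above the threshold."""
    hist = Counter()
    for rel_row, cls_row in zip(relevance, class_map):
        for r, c in zip(rel_row, cls_row):
            if r > threshold:
                hist[c] += 1
    return hist

# Toy 3x3 relevance map and class map (assumed data, for illustration only).
relevance = [
    [0.0, 0.1, 0.0],
    [0.2, 0.9, 0.8],
    [0.0, 0.7, 0.1],
]
class_map = [
    ["road", "road", "road"],
    ["road", "bicycle", "bicycle"],
    ["road", "bicycle", "road"],
]

fraction = relevant_fraction(relevance, threshold=0.5)
histogram = relevant_class_histogram(relevance, class_map, threshold=0.5)
```

In this toy case three of nine pixels exceed the threshold, and all three are assigned to the class "bicycle".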

A spatial position of the decision-relevant pixels can be recorded, for example, as a direction and a distance between the pixels to be classified and a reference point of the group of decision-relevant pixels. The direction can, for example, be quantized into the classic Cartesian directions north, south, east and west. The reference point can, for example, be a centroid of the group of decision-relevant pixels.
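A possible way to compute such a direction and distance is sketched below. The sketch assumes pixels are given as (row, column) tuples with rows growing downward, so that "north" means a smaller row index, and it represents the pixels to be classified by their own centroid; both choices, like the function names, are assumptions made for illustration.

```python
import math

def centroid(pixels):
    """Centroid of a list of (row, col) pixel coordinates."""
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    return sum(rows) / len(rows), sum(cols) / len(cols)

def direction_and_distance(mask_pixels, relevant_pixels):
    """Quantized direction and Euclidean distance from the centroid of the
    masked pixels to the centroid of the decision-relevant pixels."""
    mask_row, mask_col = centroid(mask_pixels)
    rel_row, rel_col = centroid(relevant_pixels)
    d_row, d_col = rel_row - mask_row, rel_col - mask_col
    distance = math.hypot(d_row, d_col)
    # Ties (including a zero offset) fall to the east/west axis by convention here.
    if abs(d_col) >= abs(d_row):
        direction = "east" if d_col >= 0 else "west"
    else:
        direction = "south" if d_row > 0 else "north"
    return direction, distance

# Person occupies a 2x2 block; the relevant pixels lie three rows below it.
person = [(2, 2), (2, 3), (3, 2), (3, 3)]
relevant = [(5, 2), (5, 3), (6, 2), (6, 3)]
direction, distance = direction_and_distance(person, relevant)
```

For a bicycle directly beneath its rider, the distance stays small; relevant pixels far away from the person would yield a much larger distance.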

Advantageously, a distribution and/or a summary statistic is determined for the feature or quantity over all calibration images. It is then checked whether the corresponding feature or quantity for the training image is consistent with this distribution or summary statistic. In this way, the qualitatively very different roles and origins of calibration evaluations on the one hand and test evaluations on the other can be captured particularly well: each individual test evaluation is to be compared with a large number of calibration evaluations, because the calibration evaluations define the distribution against which the test evaluation must be measured.

For this purpose, the summary statistic can, for example, include a mean and a standard deviation. The corresponding feature or quantity can then be deemed consistent with this statistic if it lies within a range around the mean whose width is measured in standard deviations.
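The mean and standard deviation criterion can be sketched as follows. The width of the accepted range (here `k = 2.0` standard deviations), the use of the sample standard deviation and the example values are assumptions; the description leaves these choices open.

```python
from statistics import mean, stdev

def is_consistent(test_value, calibration_values, k=2.0):
    """True if test_value lies within k standard deviations of the calibration mean."""
    mu = mean(calibration_values)
    sigma = stdev(calibration_values)  # sample standard deviation, needs >= 2 values
    return abs(test_value - mu) <= k * sigma

# Hypothetical feature: distance between the masked pixels and the relevant pixels,
# measured on five calibration images.
calibration_distances = [2.0, 3.0, 2.5, 3.5, 3.0]

near = is_consistent(3.2, calibration_distances)   # relevant context close to the person
far = is_consistent(40.0, calibration_distances)   # relevant context elsewhere in the image
```

A test evaluation whose feature falls outside the range would lead to the finding that the corresponding portion of the target segmentation map is not correct.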

The behavior of the relevance evaluation function always depends to some extent on the classification model used. This influence can be reduced by providing several classification models and determining, starting from each of these classification models, the extent to which a portion of the target segmentation map of the training image is correct. The determinations obtained from the different classification models can then be combined into a final result, for example by majority vote.
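Combining the per-model determinations by majority vote might look like the following sketch. It assumes each model contributes a boolean finding (`True` meaning the label portion was found correct) and treats a tie as "not correct"; the tie rule is an assumption, since the description does not specify one.

```python
def majority_vote(findings):
    """Combine per-model findings (True = label portion correct) into one result.

    A strict majority of True values is required; ties count as not correct.
    """
    return sum(findings) > len(findings) / 2

final_result = majority_vote([True, False, True])
```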

Particularly advantageously, the several classification models comprise several modifications of one and the same classification model. The determinations obtained then all relate qualitatively to the same classification model. At the same time, the final result obtained by combining these determinations is abstracted to some degree from small changes to this model.

For this purpose, starting from a classification model, at least one modification of this classification model can be generated, for example, by deactivating a random selection of neurons or other processing units of this classification model (“drop-out”).
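As a rough illustration of generating such modifications, the sketch below applies random binary masks to the hidden units of a toy one-hidden-layer network written in plain Python. The network, its weights, the drop probability and all function names are hypothetical; a real segmentation model would be far larger, and the outputs are not rescaled after masking.

```python
import random

def make_dropout_masks(n_units, n_variants, drop_prob, seed=0):
    """One binary mask per model variant; 0 deactivates a hidden unit."""
    rng = random.Random(seed)
    return [
        [0 if rng.random() < drop_prob else 1 for _ in range(n_units)]
        for _ in range(n_variants)
    ]

def forward(x, w_hidden, w_out, mask):
    """Toy network: ReLU hidden layer, masked, followed by a linear output."""
    hidden = [
        max(0.0, sum(wi * xi for wi, xi in zip(w, x))) * m
        for w, m in zip(w_hidden, mask)
    ]
    return sum(wo * h for wo, h in zip(w_out, hidden))

# Assumed toy weights; the trained weights are shared by all variants.
w_hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w_out = [1.0, 1.0, 1.0]
x = [1.0, 2.0]

full_output = forward(x, w_hidden, w_out, mask=[1, 1, 1])
variant_output = forward(x, w_hidden, w_out, mask=[1, 0, 1])
masks = make_dropout_masks(n_units=3, n_variants=5, drop_prob=0.3)
```

Because every variant reuses the trained weights and only switches units off, no additional training is needed to obtain the variants.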

Just like the original classification model, the modifications can be regarded as having been trained with the training data under examination.

If it is determined in the course of the method that portions of target segmentation maps for one or more training images are not correct, this can be used as feedback to improve the training of the classification model and thus ultimately to increase the accuracy of the semantic segmentation delivered by this model.

For this purpose, portions of target segmentation maps for at least one training image that have been determined to be incorrect can be changed. In particular, if one and the same training image has several regions with incorrect labels in the target segmentation map, a suitable strategy can be used to determine a new configuration of these regions in the target segmentation map. This strategy can include, for example, that the new target class assignments determined for the regions in question do not contradict one another. For example, it makes no sense to classify a person as a cyclist while at the same time classifying the vehicle that person is on as an e-scooter.

This procedure is somewhat analogous to a candidate in a multiple-choice exam first collecting the questions he is unsure about and then, at the end of the exam and within the remaining time, revising them in such a way that the new answers are at least internally consistent. On average, this increases the score achieved.

However, it is also possible, for example, to remove from the training data at least one training image for which portions of the target segmentation map were determined to be incorrect. For the classification accuracy ultimately achieved by the classification model, omitting the training image may well be a lesser evil than keeping this training image with a target segmentation map that contradicts the target segmentation maps of other training images.

This procedure is somewhat analogous to discarding poor-quality images when a large number of images of an object from different perspectives are fused by photogrammetry into a three-dimensional model of this object. A view from the respective perspective is then lost, but this is less detrimental to the fusion than keeping the poor image, which could create contradictions with other images.

If the training data have been changed by adapting target segmentation maps and/or by omitting training images, the classification model can be retrained with them. It can then be assessed, for example, whether a higher proportion of the target segmentation maps is determined to be correct when the method described here is carried out again. It can also be assessed, for example, whether the classification accuracy measured on test or validation data after training with the changed training data is better than in the originally trained state.

In a further advantageous embodiment, images recorded with at least one sensor are supplied to the retrained classification model as input images. The pixels of the input images are assigned to classes by the classification model. A control signal is formed from the segmentation maps thus obtained. A vehicle, a monitoring system, and/or a system for the quality control of series-produced products is controlled with this control signal.

The method is particularly advantageous in connection with quality control or other applications in which examples of associations between class assignments on the one hand and image contexts on the other are scarce. If, for example, many units of the previously mentioned product, in which part of the surface is finished with a functional layer, are manufactured in series, the normal case is that an inspected unit is free of defects, and the exception is that the coating on this unit is damaged. At the same time, such damage usually affects only very small portions of the coating in terms of area. In the coated regions, far more pixels are therefore to be assigned to the “coating” class than to the “base material” class. This favors “overfitting” during training, in which a classification model that is over-parameterized relative to the available amount of training data more or less “memorizes” the training data instead of drawing a generalized lesson from them. For the purpose of assessing the extent to which target segmentation maps are correct with regard to the present application, taking the decision-relevant image portions into account creates an additional source of information that is not affected by any “overfitting” of the classification model.

Furthermore, in quality control in particular there is also prior knowledge about which image regions can be decision-relevant at all. The manufactured product is typically always located at the same position in the image. It is therefore clear which image regions belong to the product at all and which image regions (such as a background) contain information that has nothing to do with the product itself.

In particular, the method can be wholly or partly computer-implemented. The invention therefore also relates to a computer program with machine-readable instructions which, when executed on one or more computers, cause the computer or computers to carry out the described method. In this sense, control units for vehicles and embedded systems for technical devices that are likewise able to execute machine-readable instructions are also to be regarded as computers.

The invention likewise relates to a machine-readable data carrier and/or a download product with the computer program. A download product is a digital product that can be transmitted over a data network, i.e. downloaded by a user of the data network, and that can be offered for sale, for example, in an online shop for immediate download.

Furthermore, a computer can be equipped with the computer program, with the machine-readable data carrier or with the download product.

Further measures improving the invention are presented in more detail below, together with the description of the preferred exemplary embodiments of the invention, with reference to figures.

Exemplary embodiments

The figures show:

Figure 1 an exemplary embodiment of the method 100;
Figure 2 an exemplary distinction between training images 2a and 2a' with correct (k) and incorrect (i) target segmentation maps 2b, respectively.

Figure 1 is a schematic flowchart of an exemplary embodiment of the method 100 for quality checking of training data 2 for a classification model 1 for the semantic segmentation of input images 3.

In step 110, a classification model 1 trained on the training data 2, which contain training images 2a and associated target segmentation maps 2b, is provided. According to block 111, several classification models 1 can also be provided; according to block 111a, these can in particular be, for example, modifications 1' of a classification model 1. Such modifications 1' can be generated from the original classification model 1 by deactivating (drop-out) a random selection of neurons or other processing units. This has the advantage that, starting from a state of the original classification model 1 trained with the training data 2, these modifications 1' can likewise be treated as trained with the training data 2.

In step 120, the relevance evaluation function 5 is provided, which, for an input image 3 of the classification model 1 and a binary selection mask 3a of pixels of this input image 3, indicates the portions of the input image that are decision-relevant for assigning these selected pixels to classes 4.

In step 130, a set of calibration images 6 is provided for which it is known that the classification model 1 assigns or has assigned the correct classes 4 to the pixels of these calibration images 6.

In step 140, calibration evaluations 6# are determined with the relevance evaluation function 5 for the calibration images 6 and predetermined binary selection masks 6a of pixels of these calibration images. These calibration evaluations 6# indicate the portions of the calibration images 6 that are decision-relevant for assigning the pixels selected according to the masks 6a to classes 4.

In step 150, a test evaluation 2a# is determined with the relevance evaluation function 5 on the basis of a binary selection mask 2c of pixels of a training image 2a. This test evaluation 2a# indicates the extent to which portions of the training image 2a are relevant for the decisions of the classification model 1 to assign the pixels of the training image 2a indicated by the selection mask 2c to classes 4.

In step 160, it is checked whether the test evaluation 2a# is consistent with the calibration evaluations 6#. If this is the case (truth value 1), it is determined in step 170 that the portion of the target segmentation map 2b for the training image 2a indicated by the selection mask 2c is correct. Otherwise (truth value 0), it is determined in step 180 that the portion of the target segmentation map 2b for the training image 2a indicated by the selection mask 2c is not correct.

According to block 161, at least one feature and/or at least one quantity can be evaluated from at least one calibration evaluation 6#. According to block 162, a corresponding feature or quantity can then be evaluated from the test evaluation 2a#. According to block 163, it can then be checked to what extent this corresponding feature or quantity is consistent with the feature or quantity for the calibration evaluation 6#.

Here, in particular, according to block 163a, a distribution and/or a summary statistic can be determined for the feature or quantity over all calibration images 6#. According to block 163b, it can then be checked whether the corresponding feature or quantity is consistent with this distribution or summary statistic.

According to block 165, determinations of the extent to which portions of the target segmentation map 2b of the training image 2a are correct can be combined into a final result 7 across different classification models 1, such as modifications 1' of a classification model 1 generated by drop-out.

Insofar as portions of target segmentation maps 2b are identified as not correct, they can be changed according to block 190a. Alternatively or in combination, the associated training images 2a can be removed from the training data 2. The classification model 1 can be retrained in step 200 with the training data 2' changed in this way.

In step 210, images recorded with at least one sensor 9 are supplied as input images 3 to the retrained classification model 1*. In step 220, the pixels of the input images 3 are assigned to classes 4 by the classification model 1*. In step 230, a control signal 230a is formed from the segmentation maps thus obtained. In step 240, a vehicle 50, a monitoring system 60, and/or a system 70 for the quality control of series-produced products is controlled with the control signal 230a.

Figure 2 illustrates, using a simple example, how a training image 2a with a correct target segmentation map 2b can be distinguished by the method described above from a training image 2a' with an incorrect (i) target segmentation map 2b.

By way of example, three calibration images 6, 6', 6" are drawn, all of which show a person 10. The binary selection mask 6a, 6a', 6a" for the image portion that is to be assigned to classes 4 covers this person 10 in each case.

Calibration image 6 shows the person 10 on an e-scooter 11. The pixels selected with the selection mask 6a are therefore correctly assigned to the class “e-scooter rider”. Decision-relevant for this is the image portion 6#, which shows the e-scooter 11.

Calibration image 6' shows the person 10 on a bicycle 12. The pixels selected with the selection mask 6a' are therefore correctly assigned to the class “cyclist”. Decision-relevant for this is the image portion 6#', which shows the bicycle 12.

Calibration image 6" shows the person 10 as a pedestrian. The pixels selected with the selection mask 6a" are therefore correctly assigned to the class “pedestrian”. Relevant to this decision is an image portion 6#" that includes not only the person 10 but also the person's immediate surroundings, because these surroundings are checked for bicycles, e-scooters or similar objects.

Training image 2a shows a person 10 on a bicycle 12. For the decision as to which class 4 the person 10 selected with the selection mask 2c is assigned to, the relevance evaluation function 5 identifies the bicycle 12 as the decision-relevant image portion 2a#. This is consistent with the distribution of the calibration evaluations 6#, 6#' and 6#". It is therefore determined that the target segmentation map 2b, which assigns the class “cyclist” to the pixels belonging to the person 10, is correct (k).

Training image 2a' shows a person 10 as a pedestrian, together with two bicycles 12 that are unrelated to the person 10. The relevance evaluation function 5 identifies these bicycles 12 as the decision-relevant portion 2a#' for the classification of the person 10 selected with the selection mask 2c. This is not consistent with the distribution of the calibration evaluations 6#, 6#' and 6#", all of which mark only image portions in the immediate spatial vicinity of the person 10 as decision-relevant. It is therefore determined that the target segmentation map 2b, which assigns the class “cyclist” to the pixels belonging to the person 10, is incorrect (i) in this respect.
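The comparison in this example can be sketched as a check on the spatial position of the decision-relevant pixels relative to the selection mask. The following is a minimal sketch, not the method as implemented by the applicant: the grids, the relevance threshold of 0.5, the calibration distances and the tolerance factor of 1.5 are all illustrative assumptions, and `max_relevance_distance` is a hypothetical helper.

```python
import math


def max_relevance_distance(relevance, mask, threshold=0.5):
    """Largest distance from any relevant pixel to the nearest masked pixel.

    relevance: list of rows with relevance scores in [0, 1].
    mask: binary selection mask of the same shape (1 = selected pixel).
    """
    masked = [(r, c) for r, row in enumerate(mask)
              for c, m in enumerate(row) if m]
    relevant = [(r, c) for r, row in enumerate(relevance)
                for c, v in enumerate(row) if v > threshold]
    if not masked or not relevant:
        return 0.0
    return max(min(math.hypot(r - mr, c - mc) for mr, mc in masked)
               for r, c in relevant)


# Selection mask 2c: the person occupies column 3 of rows 0 and 1.
mask_2c = [
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
# Training image 2a: the relevant bicycle is directly below the person.
relevance_2a = [
    [0, 0, 0.0, 0.0, 0.0, 0, 0],
    [0, 0, 0.0, 0.0, 0.0, 0, 0],
    [0, 0, 0.9, 0.9, 0.9, 0, 0],
]
# Training image 2a': the relevant bicycles are far away from the person.
relevance_2a_prime = [
    [0.0, 0, 0, 0, 0, 0, 0.0],
    [0.0, 0, 0, 0, 0, 0, 0.0],
    [0.9, 0, 0, 0, 0, 0, 0.9],
]

# Assumed distances observed for the calibration evaluations 6#, 6#', 6#".
calibration_distances = [1.0, 1.41, 1.41]
limit = 1.5 * max(calibration_distances)  # assumed tolerance rule

d_2a = max_relevance_distance(relevance_2a, mask_2c)
d_2a_prime = max_relevance_distance(relevance_2a_prime, mask_2c)
label_2a_correct = d_2a <= limit              # consistent with calibration
label_2a_prime_correct = d_2a_prime <= limit  # not consistent
```

Under these assumptions, the relevant pixels for training image 2a lie within the calibrated range, whereas those for training image 2a' lie well outside it, which mirrors the verdicts correct (k) and incorrect (i) described above.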

CITATIONS INCLUDED IN THE DESCRIPTION

This list of documents cited by the applicant was generated automatically and is included solely for the reader's information. The list is not part of the German patent or utility model application. The DPMA assumes no liability for any errors or omissions.

Patent literature cited

WO 2018/184963 A2 [0003]
DE 102019204139 A1 [0004]

Claims

Method (100) for checking the quality of training data (2) for a classification model (1) that is designed to assign a class (4) to individual pixels of an input image (3) and thereby create a semantic segmentation map of the input image (3), wherein the training data (2) contain training images (2a) and associated target segmentation maps (2b), with the steps:
• a classification model (1) is provided (110) that has been trained with the objective of mapping the training images (2a) onto the respectively associated target segmentation maps (2b);
• a relevance evaluation function (5) is provided (120) which, for an input image (3) and a binary selection mask (3a) of pixels of this input image (3), indicates to what extent (3#) portions of the input image (3) are relevant for the decisions of the classification model (1) to assign the pixels of the input image (3) indicated by the selection mask (3a) to classes (4);
• a set of calibration images (6) is provided (130) for which it is known that the classification model (1) assigns, or has assigned, the correct classes (4) to the pixels of these calibration images (6);
• with the relevance evaluation function (5), for the calibration images (6) and predetermined binary selection masks (6a) of pixels of these calibration images (6), calibration evaluations (6#) are determined (140) that indicate to what extent portions of the calibration images (6) are relevant for the decisions of the classification model (1) to assign the pixels of the calibration image (6) indicated by the selection mask (6a) to classes (4);
• with the relevance evaluation function (5), starting from a binary selection mask (2c) of pixels of a training image (2a), a test evaluation (2a#) is determined (150) that indicates to what extent portions of the training image (2a) are relevant for the decisions of the classification model (1) to assign the pixels of the training image (2a) indicated by the selection mask (2c) to classes (4);
• in response to the test evaluation (2a#) being consistent with the calibration evaluations (6#) (160), it is determined (170) that the portion of the target segmentation map (2b) for the training image (2a) indicated by the selection mask (2c) is correct;
• otherwise, it is determined (180) that the portion of the target segmentation map (2b) for the training image (2a) indicated by the selection mask (2c) is incorrect.

Method (100) according to claim 1, wherein the relevance evaluation function (5) indicates which pixels of the input image (3) must be taken into account in order to correctly assign all pixels of the input image (3) indicated by the selection mask (3a) to classes (4).

Method (100) according to any one of claims 1 to 2, wherein
• at least one feature and/or at least one variable is evaluated (161) from at least one calibration evaluation (6#);
• a corresponding feature, or a corresponding variable, is evaluated (162) from the test evaluation (2a#); and
• it is checked (163) to what extent this corresponding feature or variable is consistent with the feature or variable for the calibration evaluation (6#).

Method (100) according to claim 3, wherein the feature or variable comprises
• a quantitative portion of the calibration image (6) to which the relevance evaluation function (5) assigns a relevance above a predetermined threshold value; and/or
• classes (4) to which the classification model (1) assigns those pixels of the calibration image (6) to which the relevance evaluation function (5) assigns a relevance above a predetermined threshold value; and/or
• a spatial position of the pixels relevant for the assignment to classes (4) relative to the pixels in the calibration image (6) specified by the selection mask (6a).

Method (100) according to any one of claims 3 to 4, wherein
• for the feature or variable, a distribution and/or a summary statistic is determined (163a) across all calibration evaluations (6#); and
• it is checked (163b) whether the corresponding feature or variable is consistent with this distribution or with this summary statistic.

Method (100) according to claim 5, wherein
• the summary statistic includes a mean and a standard deviation; and
• the corresponding feature or variable is considered consistent with this statistic if it falls within a range, measured in standard deviations, around the mean.
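The consistency check based on a mean and a standard deviation can be sketched as follows. This is a minimal illustration under assumptions: the function name `is_consistent`, the range of two standard deviations and the example values (the share of each image with relevance above a threshold) are not taken from the claims.

```python
import statistics


def is_consistent(test_value, calibration_values, num_std=2.0):
    """Check whether test_value lies within num_std standard deviations
    of the mean of the calibration values.

    num_std is an assumed parameter; the claim only requires a range
    measured in standard deviations around the mean.
    """
    mean = statistics.mean(calibration_values)
    # Sample standard deviation; requires at least two calibration values.
    std = statistics.stdev(calibration_values)
    return abs(test_value - mean) <= num_std * std


# Assumed feature: share of image pixels with relevance above a threshold,
# evaluated for five calibration evaluations.
calibration_fractions = [0.10, 0.12, 0.11, 0.09, 0.13]

plausible = is_consistent(0.12, calibration_fractions)   # close to the mean
suspicious = is_consistent(0.30, calibration_fractions)  # far from the mean
```

A wider range tolerates more variation in correctly labelled training images, while a narrower range flags more labels for review; the appropriate value depends on the application.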

Method (100) according to any one of claims 1 to 6, wherein
• several classification models (1) are provided (111);
• on the basis of each of these classification models (1), it is determined (160) to what extent a portion of the target segmentation map (2b) of the training image (2a) is correct; and
• these determinations are combined (165) into a final result (7).

Method (100) according to claim 7, wherein, starting from a classification model (1), at least one modification (1') of this classification model (1) is generated (111a) by deactivating a random selection of neurons or other processing units of this classification model (1).

Method (100) according to any one of claims 7 to 8, wherein several determinations are combined into a final result (7) by majority decision.
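The combination of several determinations can be sketched as follows. This is a minimal sketch under assumptions: `random_unit_mask` and `majority_decision` are hypothetical helpers, the deactivation probability is illustrative, and treating a tie as "incorrect" is an assumed convention that the claims do not prescribe.

```python
import random
from collections import Counter


def random_unit_mask(num_units, drop_prob, rng):
    """Describe one modification 1' of a model as a list of booleans,
    where True means the unit stays active and False means deactivated."""
    return [rng.random() >= drop_prob for _ in range(num_units)]


def majority_decision(verdicts):
    """Combine per-model verdicts into a final result 7.

    Each verdict is True if that model's check found the label portion
    correct. A tie counts as incorrect (assumed convention).
    """
    counts = Counter(verdicts)
    return counts[True] > counts[False]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
unit_mask = random_unit_mask(num_units=8, drop_prob=0.25, rng=rng)

# Assumed verdicts obtained from three model variants for one label portion.
final_result = majority_decision([True, True, False])
```

Applying a unit mask to an actual network depends on the framework in use and is therefore omitted here.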

Method (100) according to any one of claims 1 to 9, wherein
• portions of target segmentation maps (2b) that were determined to be incorrect for at least one training image (2a) are changed (190a), and/or
• at least one training image (2a) for which a portion of the target segmentation map (2b) was determined to be incorrect is removed (190b) from the training data (2); and wherein
• the classification model (1) is trained again (200) using the training data (2') changed in this way.

Method (100) according to claim 10, wherein
• images recorded with at least one sensor (9) are supplied (210) to the retrained classification model (1*) as input images (3);
• the pixels of the input images (3) are assigned (220) to classes (4) by the classification model (1*);
• a control signal (230a) is formed (230) from the segmentation maps obtained in this way; and
• a vehicle (50), a monitoring system (60), and/or a system (70) for the quality control of mass-produced products is controlled (240) with the control signal (230a).

Computer program containing machine-readable instructions which, when executed on one or more computers, cause the computer or computers to carry out a method (100) according to any one of claims 1 to 11.

Machine-readable data carrier with the computer program according to claim 12.

One or more computers with the computer program according to claim 12, and/or with the machine-readable data carrier according to claim 13.