DE102020203807A1

DE102020203807A1 - Method and apparatus for training an image classifier

Info

Publication number: DE102020203807A1
Application number: DE102020203807.4A
Authority: DE
Inventors: Konrad Groh
Original assignee: Robert Bosch GmbH
Current assignee: Robert Bosch GmbH
Priority date: 2020-03-24
Filing date: 2020-03-24
Publication date: 2021-09-30

Abstract

Computer-implemented method for training an image classifier (60), wherein the image classifier (60) determines a corresponding plurality of output signals for a plurality of input images, wherein each output signal is based on feature representations of at least two input images of the plurality of input images, and wherein the feature representations are determined by the image classifier (60).

Description

The invention relates to a method for training an image classifier, a method for operating an image classifier, a training device, a computer program and a machine-readable storage medium.

Prior art

A method for regularizing the training of an image classifier is known from DE 10 2018 211 875, which was not published before the priority date.

Advantages of the invention

Neural networks currently achieve top results in classification accuracy in various fields. During training, however, they are prone to overfitting, i.e. adapting too closely to the data used for training.

The advantage of the method having the features of independent claim 1 is that an image classifier can be regularized during training in such a way that overfitting is avoided or at least mitigated. This increases the classification accuracy of the image classifier.

Disclosure of the invention

In a first aspect, the invention relates to a computer-implemented method for training an image classifier, wherein the image classifier determines a corresponding plurality of output signals for a plurality of input images, wherein each output signal is based on feature representations of at least two input images of the plurality of input images, and wherein the feature representations are determined by the image classifier.

An image classifier can be understood as a device that is designed to accept one or more input images and to determine one or more output signals that characterize a classification of the corresponding input images or of parts of the corresponding input images. For example, an image classifier can be used to detect in which parts of an image, provided to the image classifier as an input image, objects are located.

In addition to object detection, an image classifier can also be used for other classification tasks, for example semantic segmentation. In this case, the image classifier assigns every desired point in an input image, for example every pixel of a camera image, to a desired class.

The determination of an output signal by the image classifier can include further preprocessing and/or postprocessing methods. In object detection, for example, detections that overlap strongly can be merged into a single detection.
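The text does not say how strongly overlapping detections are merged. One common choice, used here purely as an illustration, is greedy non-maximum suppression on the intersection over union (IoU) of bounding boxes. The box format `(x1, y1, x2, y2)`, the scores, the threshold of 0.5 and the function names are assumptions, not part of the source.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def suppress_overlaps(detections, threshold=0.5):
    """Keep the highest-scoring box of every group of strongly overlapping boxes.

    detections: list of (box, score) pairs; threshold is an assumed IoU limit.
    """
    kept = []
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        # A box survives only if it does not overlap strongly with a kept box.
        if all(iou(box, kept_box) < threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept


detections = [
    ((0, 0, 10, 10), 0.9),
    ((1, 1, 11, 11), 0.8),    # overlaps strongly with the first box
    ((20, 20, 30, 30), 0.7),  # separate object
]
merged = suppress_overlaps(detections)
```

With these toy boxes, the second detection is absorbed by the first, and the third is kept as a separate detection.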

The input images fed to the image classifier can represent different types of image data, in particular sensor data, for example from a camera sensor, a radar sensor, a LIDAR sensor, an ultrasonic sensor or an infrared camera sensor. Audio recordings from microphones can also be used as input images, for example in the form of spectral images.

It is also conceivable that several types of input data are combined in order to obtain an input image for the image classifier.

Alternatively, input images can be generated synthetically by computer-aided means. For example, images can be computed or rendered based on physical models.

It is assumed that the image classifier is designed such that it is able to process the input data of the input image.

The output signal can be understood as a prediction of a property of the input image by the image classifier. The output signal can deviate from a desired output signal, that is, the image classifier can output a misclassification. In this sense, the performance of an image classifier can be understood as its classification accuracy, i.e. its ability to output a desired classification for an input image. In general, high performance is of the utmost importance, since misclassifications can lead to undesired and/or dangerous behavior of a device that the image classifier operates.

For training the image classifier, a training data set can be used that comprises input images and corresponding desired output signals. On the basis of the training data set, the image classifier can preferably be trained by a stepwise gradient descent method based on batches. For one step of the gradient descent method, a batch with a predefined number of input images can be drawn from the training data, preferably at random. The image classifier can determine corresponding output signals for the input images of the batch. A cost value can then be determined that characterizes the extent to which the output signals match the corresponding desired output signals of the batch. The training can then be carried out with known methods based on the cost value.
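The batch-based training step described above can be sketched as follows. This is a deliberately tiny illustration, not the patented training procedure: the "images" are single numbers, the "classifier" is the one-parameter model `output = weight * input`, and the cost is a mean squared error. All names, the learning rate and the batch size are assumptions.

```python
import random


def sample_batch(dataset, batch_size, rng):
    """Draw a batch of (input, desired output) pairs at random, without replacement."""
    return rng.sample(dataset, batch_size)


def cost_value(outputs, targets):
    """Mean squared deviation between output signals and desired output signals."""
    return sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(outputs)


def training_step(weight, batch, learning_rate):
    """One gradient descent step for the toy model output = weight * input."""
    outputs = [weight * x for x, _ in batch]
    targets = [t for _, t in batch]
    cost = cost_value(outputs, targets)
    # Analytic gradient of the mean squared error with respect to the weight.
    grad = sum(2 * (weight * x - t) * x for x, t in batch) / len(batch)
    return weight - learning_rate * grad, cost


rng = random.Random(0)
dataset = [(float(x), 3.0 * x) for x in range(1, 9)]  # toy data, true weight is 3
weight = 0.0
costs = []
for _ in range(200):
    batch = sample_batch(dataset, batch_size=4, rng=rng)
    weight, cost = training_step(weight, batch, learning_rate=0.01)
    costs.append(cost)
```

In a real image classifier the gradient would be obtained by backpropagation through the network rather than from a closed-form expression.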

In the context of the invention, it is understood that an image classifier can determine feature representations of input images. For example, the image classifier can be implemented as a neural network, wherein the layers of the neural network can each determine feature representations for corresponding input images.

Instead of using certain feature representations of an input image directly as the output signal, as is otherwise customary, the first aspect of the invention additionally uses the feature representations of other input images of the batch to determine the output signal for the input image. For this purpose, for example, a linear combination of the representation of the input image and further representations of other input images can be used.

This approach can be understood as adding corresponding noise to the actual feature representation of the input image. Adding this noise has a regularizing effect on the training of the image classifier. Regularization during training is known to counteract overfitting and can thus improve the performance of the image classifier.
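The noise interpretation can be made explicit with a short rearrangement. The notation is an assumption introduced for illustration only: $f_j$ denotes the feature representation of the $j$-th input image of a batch of size $B$, $\alpha_{ij}$ the weights of the linear combination, and $y_i$ the resulting output signal.

```latex
y_i \;=\; \sum_{j=1}^{B} \alpha_{ij}\, f_j
    \;=\; f_i \;+\; \underbrace{(\alpha_{ii} - 1)\, f_i \;+\; \sum_{j \neq i} \alpha_{ij}\, f_j}_{\text{noise term } \varepsilon_i}
```

Because the batch is preferably drawn at random, the other representations $f_j$ contributing to $\varepsilon_i$ change from step to step, so the term behaves like a data-dependent perturbation of $f_i$.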

In a further aspect of the invention, it is conceivable that the plurality of input images is a sequence of input images and that an output signal for a first input image depends on a feature representation of the predecessor and/or a feature representation of the successor of the first input image.

A predecessor or successor can be understood as an input image that precedes or follows the first input image, for example within a batch. For this purpose, the batch can be regarded as ring-shaped: for the first element of the batch of input images, the predecessor is the last element of the batch, and for the last element of the batch, the successor is the first element of the batch.
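The ring-shaped view of the batch amounts to index arithmetic modulo the batch size. The helper below is a minimal sketch; the function name and the zero-based indexing are assumptions.

```python
def ring_neighbours(batch_size, index):
    """Return the (predecessor, successor) indices of `index` in a ring-shaped batch."""
    predecessor = (index - 1) % batch_size  # the first element wraps to the last
    successor = (index + 1) % batch_size    # the last element wraps to the first
    return predecessor, successor


first = ring_neighbours(4, 0)
last = ring_neighbours(4, 3)
```

For a batch of four images, the first image has the last image as its predecessor, and the last image has the first image as its successor.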

The advantage of this development is that the regularization can be computed efficiently by a computer, since the feature representations of the sequence of input images are typically stored sequentially in the computer's memory. As a result, the computer can quickly access the representation of the predecessor or successor, which accelerates the training of the image classifier. The advantage of accelerated training is that, within a predefined training time, the image classifier can carry out more training steps and thus process more input images. This extracts more information from the training data and thereby leads to a higher performance of the image classifier.

In a further aspect, it is conceivable that an output signal is determined by a linear combination of the feature representation of the predecessor, the feature representation of the successor and the feature representation of the first input image.

The advantage of the linear combination is that the weighting of the individual feature representations can be adjusted such that a desired noise behavior is achieved during training. In this way, the training of the image classifier can be adapted to the training data, which further increases the performance of the image classifier.
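A linear combination of predecessor, own and successor representations over a ring-shaped batch could look like the sketch below. The source gives no concrete weights; the values 0.25, 0.5 and 0.25, the function name and the representation of features as plain lists of floats are assumptions chosen for illustration.

```python
def mix_features(features, w_prev, w_self, w_next):
    """Combine each feature vector with its ring predecessor and successor.

    features: list of equally long feature vectors, one per input image of the batch.
    """
    n = len(features)
    mixed = []
    for i in range(n):
        prev_f = features[(i - 1) % n]  # predecessor, wrapping around the batch
        own_f = features[i]
        next_f = features[(i + 1) % n]  # successor, wrapping around the batch
        mixed.append(
            [w_prev * p + w_self * s + w_next * q
             for p, s, q in zip(prev_f, own_f, next_f)]
        )
    return mixed


features = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
mixed = mix_features(features, w_prev=0.25, w_self=0.5, w_next=0.25)
```

Shifting weight from `w_self` to the neighbour weights increases the strength of the perturbation, which is the tuning knob the paragraph above refers to.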

In a further aspect, the invention relates to a computer-implemented method for classifying an input image by means of an image classifier, comprising the steps:

• training the image classifier according to a method of the previous aspects;
• providing the trained image classifier to an inference device;
• determining an output signal based on the input image by the inference device, wherein, to determine the output signal, the inference device feeds the input image to the image classifier and the image classifier determines the output signal.

The advantage of the approach of this aspect is that the image classifier trained with a method of the previous aspects can determine corresponding output signals for individual input images. For this purpose, the image classifier can determine the output signal independently of representations of a preceding or following input image. This enables the inference device to determine the output signal in real time. This is of the utmost importance in particular in safety-critical applications of the inference device, since a delayed determination of an output signal can lead to undesired and/or safety-critical behavior.
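The difference between training and inference can be sketched as a switch on the final mixing step. The source only states that, at inference, the output signal does not depend on neighbouring representations; passing the own representation through unchanged, the scalar features and the weights below are assumptions for illustration.

```python
def mix_with_neighbours(features, weights):
    """Training-time mixing over a ring-shaped batch (weights: predecessor, own, successor)."""
    w_prev, w_self, w_next = weights
    n = len(features)
    return [
        w_prev * features[(i - 1) % n]
        + w_self * features[i]
        + w_next * features[(i + 1) % n]
        for i in range(n)
    ]


def output_signals(features, training, weights=(0.1, 0.8, 0.1)):
    """Mix neighbouring representations during training only."""
    if training:
        return mix_with_neighbours(features, weights)
    # Inference: each output depends on its own representation alone,
    # so a single image can be processed as soon as it arrives.
    return list(features)


batch = [1.0, 2.0, 4.0]
train_out = output_signals(batch, training=True)
infer_out = output_signals([2.0], training=False)
```

Because the inference path needs no other images, there is no waiting for a successor frame, which is what makes real-time operation possible.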

Embodiments of the invention are explained in more detail below with reference to the accompanying drawings. The drawings show:

Fig. 1 schematically, a structure of a control system for controlling an actuator;
Fig. 2 schematically, an exemplary embodiment for controlling an at least partially autonomous robot;
Fig. 3 schematically, an exemplary embodiment for controlling a manufacturing system;
Fig. 4 schematically, an exemplary embodiment for controlling an access system;
Fig. 5 schematically, an exemplary embodiment for controlling a monitoring system;
Fig. 6 schematically, an exemplary embodiment for controlling a personal assistant;
Fig. 7 schematically, an exemplary embodiment for controlling a medical imaging system;
Fig. 8 an exemplary determination of a cost value during a training step.

Description of the exemplary embodiments

Fig. 1 shows an actuator (10) in its environment (20) interacting with a control system (40). At preferably regular time intervals, the environment (20) is captured by a sensor (30), in particular an imaging sensor such as a video sensor, which can also be provided by a plurality of sensors, for example a stereo camera. The sensor signal (S) of the sensor (30), or one sensor signal (S) each in the case of several sensors, is transmitted to the control system (40). The control system (40) thus receives a sequence of sensor signals (S). From these, the control system (40) determines control signals (A), which are transmitted to the actuator (10).

The control system (40) receives the sequence of sensor signals (S) of the sensor (30) in an optional receiving unit (50), which converts the sequence of sensor signals (S) into a sequence of input images (x) (alternatively, each sensor signal (S) can also be adopted directly as an input image (x)). The input image (x) can be, for example, a section or a further processing of the sensor signal (S). The input image (x) comprises individual frames of a video recording. In other words, the input image (x) is determined as a function of the sensor signal (S). The sequence of input images (x) is fed to an image classifier (60).

The image classifier (60) is preferably parameterized by parameters (ϕ), which are stored in and provided by a parameter memory (P).

The image classifier (60) determines output signals (y) from the input images (x). The output signals (y) are fed to an optional conversion unit (80), which determines from them control signals (A) that are fed to the actuator (10) in order to control the actuator (10) accordingly. The output signal (y) comprises information about objects that the sensor (30) has captured.

The actuator (10) receives the control signals (A), is controlled accordingly and carries out a corresponding action. The actuator (10) can comprise control logic (not necessarily structurally integrated) that determines from the control signal (A) a second control signal, with which the actuator (10) is then controlled.
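The signal chain of Fig. 1, from sensor signal (S) through input image (x) and output signal (y) to control signal (A), can be sketched as a composition of three functions. Everything below is a placeholder: the "image" is just a list of labels, the classifier is a stand-in that reports which labels occur, and the rule "brake if a pedestrian is present" is an assumed example of a conversion unit.

```python
def receiving_unit(sensor_signal):
    """Optional receiving unit (50): here the sensor signal is used directly as input image."""
    return sensor_signal


def image_classifier(input_image):
    """Stand-in for the image classifier (60): reports which labels occur in a toy 'image'."""
    return {"objects": sorted(set(input_image))}


def conversion_unit(output_signal):
    """Optional conversion unit (80): maps the output signal (y) to a control signal (A)."""
    return "brake" if "pedestrian" in output_signal["objects"] else "continue"


def control_step(sensor_signal):
    """One pass through the control system (40) for a single sensor signal (S)."""
    x = receiving_unit(sensor_signal)
    y = image_classifier(x)
    return conversion_unit(y)


sensor_sequence = [["road", "car"], ["road", "pedestrian"]]
control_signals = [control_step(s) for s in sensor_sequence]
```

Each element of the sensor sequence is processed independently, matching the description of a control system that receives a sequence of sensor signals and emits a control signal for each.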

In further embodiments, the control system (40) comprises the sensor (30). In still further embodiments, the control system (40) alternatively or additionally also comprises the actuator (10).

In further preferred embodiments, the control system (40) comprises one or more processors (45) and at least one machine-readable storage medium (46) on which instructions are stored that, when executed on the processors (45), cause the control system (40) to carry out the method according to the invention.

In alternative embodiments, a display unit (10a) is provided as an alternative or in addition to the actuator (10).

Fig. 2 shows how the control system (40) can be used to control an at least partially autonomous robot, here an at least partially autonomous motor vehicle (100).

The sensor (30) can be, for example, a video sensor preferably arranged in the motor vehicle (100).

The image classifier (60) is configured to identify objects from the input images (x).

The actuator (10), preferably arranged in the motor vehicle (100), can be, for example, a brake, a drive or a steering system of the motor vehicle (100). The control signal (A) can then be determined such that the actuator or actuators (10) are controlled in such a way that the motor vehicle (100), for example, avoids a collision with the objects identified by the image classifier (60), in particular if they are objects of certain classes, e.g. pedestrians.

Alternatively, the at least partially autonomous robot can also be another mobile robot (not shown), for example one that moves by flying, swimming, diving or walking. The mobile robot can also be, for example, an at least partially autonomous lawnmower or an at least partially autonomous cleaning robot. In these cases too, the control signal (A) can be determined such that the drive and/or steering of the mobile robot are controlled in such a way that the at least partially autonomous robot, for example, avoids a collision with objects identified by the image classifier (60).

Alternatively or additionally, the display unit (10a) can be controlled with the control signal (A), for example to display the determined safe areas. In a motor vehicle (100) with non-automated steering, for example, it is also possible for the display unit (10a) to be controlled with the control signal (A) such that it emits an optical or acoustic warning signal when it is determined that the motor vehicle (100) is at risk of colliding with one of the reliably identified objects.

Fig. 3 shows an exemplary embodiment in which the control system (40) is used to control a manufacturing machine (11) of a manufacturing system (200) by controlling an actuator (10) that controls this manufacturing machine (11). The manufacturing machine (11) can be, for example, a machine for punching, sawing, drilling and/or cutting.

The sensor (30) can then be, for example, an optical sensor that captures, e.g., properties of manufactured products (12a, 12b). It is possible that these manufactured products (12a, 12b) are movable. It is possible that the actuator (10) controlling the manufacturing machine (11) is controlled depending on an assignment of the captured manufactured products (12a, 12b), so that the manufacturing machine (11) accordingly carries out a subsequent processing step on the correct one of the manufactured products (12a, 12b). It is also possible that, by identifying the correct properties of the same one of the manufactured products (12a, 12b) (i.e. without a misassignment), the manufacturing machine (11) adapts the same manufacturing step accordingly for processing a subsequent manufactured product.

Figure 4 shows an embodiment in which the control system (40) is used to control an access system (300). The access system (300) can comprise a physical access control, for example a door (401). The video sensor (30) is configured to capture a person. The captured image can be interpreted by means of the image classifier (60). If several persons are captured at the same time, assigning the persons (i.e. the objects) to one another makes it possible, for example, to determine the identity of the persons particularly reliably, for example by analyzing their movements. The actuator (10) can be a lock that, depending on the control signal (A), releases the access control or not, for example opens the door (401) or not. For this purpose, the control signal (A) can be selected depending on the interpretation of the image classifier (60), for example depending on the determined identity of the person. Instead of the physical access control, a logical access control can also be provided.
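The selection of the control signal (A) for the lock can be illustrated with a minimal sketch. It is not taken from the application: the set `AUTHORIZED`, the class `ControlSignal`, the function `choose_control_signal` and the confidence threshold are hypothetical names and values chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical set of identities that may pass; not specified in the application.
AUTHORIZED = {"alice", "bob"}


@dataclass
class ControlSignal:
    """Stand-in for the control signal (A) sent to the actuator (10)."""
    unlock: bool


def choose_control_signal(identity, confidence, threshold=0.9):
    """Release the door only for an authorized identity.

    The confidence threshold is an assumption of this sketch; the
    application only says that (A) depends on the determined identity.
    """
    allowed = identity in AUTHORIZED and confidence >= threshold
    return ControlSignal(unlock=allowed)


signal = choose_control_signal("alice", 0.97)
print(signal.unlock)
```

A logical access control could reuse the same decision and replace the lock with, for example, the granting of a login.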

Figure 5 shows an embodiment in which the control system (40) is used to control a monitoring system (400). This embodiment differs from the embodiment shown in Figure 5 in that, instead of the actuator (10), the display unit (10a) is provided, which is controlled by the control system (40). For example, the image classifier (60) can determine an identity of the objects recorded by the video sensor (30) in order to infer from it, for example, which of them are suspicious, and the control signal (A) can then be selected such that this object is highlighted in color by the display unit (10a).

Figure 6 shows an embodiment in which the control system (40) is used to control a personal assistant (250). The sensor (30) is preferably an optical sensor that receives images of a gesture of a user (249).

Depending on the signals from the sensor (30), the control system (40) determines a control signal (A) for the personal assistant (250), for example by the image classifier performing gesture recognition. This determined control signal (A) is then transmitted to the personal assistant (250), which is thus controlled accordingly. The determined control signal (A) can in particular be selected such that it corresponds to a presumed desired control by the user (249). This presumed desired control can be determined depending on the gesture recognized by the image classifier (60). The control system (40) can then select the control signal (A) for transmission to the personal assistant (250) depending on the presumed desired control, and/or select the control signal (A) for transmission to the personal assistant (250) in accordance with the presumed desired control.

This corresponding control can include, for example, the personal assistant (250) retrieving information from a database and presenting it in a form that the user (249) can take in.
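How a recognized gesture might be turned into a control signal (A) for the personal assistant (250) can be sketched as a simple lookup. The gesture labels, the commands and the function `control_signal_for` are hypothetical; the application does not specify any particular gestures or commands.

```python
# Hypothetical mapping from recognized gestures to assistant commands.
# Neither the gesture labels nor the commands come from the application.
GESTURE_TO_COMMAND = {
    "swipe_left": "previous_item",
    "swipe_right": "next_item",
    "open_palm": "pause",
}


def control_signal_for(gesture):
    """Return the presumed desired command, or None if the gesture is unknown."""
    return GESTURE_TO_COMMAND.get(gesture)


print(control_signal_for("open_palm"))
```

Returning `None` for an unknown gesture is one possible design choice; it leaves the assistant unchanged instead of guessing a command.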

Instead of the personal assistant (250), a household appliance (not shown), in particular a washing machine, a stove, an oven, a microwave or a dishwasher, can also be provided in order to be controlled accordingly.

Figure 7 shows an embodiment in which the control system (40) is used to control a medical imaging system (500), for example an MRI, X-ray or ultrasound device. The sensor (30) can be, for example, an imaging sensor; the display unit (10a) is controlled by the control system (40). For example, the image classifier (60) can determine whether a region recorded by the imaging sensor is conspicuous, and the control signal (A) can then be selected such that this region is highlighted in color by the display unit (10a).

Figure 8 shows, by way of example, the determination of a cost value (l) based on a batch (604) of input images (x_v, x₁, x_n) and desired output signals (y_1t) for training an image classifier (60).

The batch (604) of input images (x_v, x₁, x_n) is provided to the image classifier (60). The batch (604) has an order in which a first input image (x₁) has a predecessor input image (x_v) and a successor input image (x_n). For the first input image (x₁), the batch (604) further comprises a corresponding desired output signal (y_1t), which characterizes a desired classification.

For the various input images (x_v, x₁, x_n), corresponding output signals (u, v, w) are determined by the image classifier (60). To this end, the image classifier (60) determines a first output signal (v) for the first input image (x₁), a predecessor output signal (u) for the predecessor input image (x_v) and a successor output signal (w) for the successor input image (x_n).
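The step of obtaining the three output signals for an ordered batch can be sketched as follows. `toy_classifier` is a hypothetical stand-in for the image classifier (60), and skipping the first and last image of the batch is an assumption of this sketch, since the application does not say how images without a predecessor or successor are treated.

```python
def toy_classifier(image):
    """Hypothetical stand-in for the image classifier (60).

    Maps an 'image' (here a list of pixel values in [0, 1]) to two
    class scores that sum to 1.
    """
    mean = sum(image) / len(image)
    return [mean, 1.0 - mean]


def neighbour_outputs(batch, classifier):
    """Return (u, v, w) for every image that has both neighbours.

    u is the predecessor output, v the output for the image itself
    and w the successor output.
    """
    outputs = [classifier(x) for x in batch]  # one forward pass per image
    triples = []
    for i in range(1, len(batch) - 1):
        triples.append((outputs[i - 1], outputs[i], outputs[i + 1]))
    return triples


batch = [[0.0, 0.2], [0.4, 0.6], [0.8, 1.0]]  # x_v, x_1, x_n
u, v, w = neighbour_outputs(batch, toy_classifier)[0]
print(u, v, w)
```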

The various output signals (u, v, w) can then be merged by a combination function in order to determine a combined output signal (y₁) for the first input image (x₁). This can be done, for example, using the following linear combination:

$$y_{1} = a \cdot u + b \cdot v + c \cdot w,$$

where y₁ is the combined output signal (y₁), u is the predecessor output signal (u), v is the first output signal (v) and w is the successor output signal (w). The values a, b and c are predefined scalars. The predefined scalars can preferably be chosen such that b is greater than the other two values.
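The linear combination can be written out element-wise in a few lines. The concrete weights a = c = 0.25 and b = 0.5 are assumptions of this sketch; they merely follow the stated preference that b be the largest value, and they sum to 1 so that class probabilities remain class probabilities.

```python
def combine(u, v, w, a=0.25, b=0.5, c=0.25):
    """Combined output y_1 = a*u + b*v + c*w, applied per class.

    The default weights are illustrative assumptions, not values from
    the application.
    """
    return [a * ui + b * vi + c * wi for ui, vi, wi in zip(u, v, w)]


u = [0.2, 0.8]  # predecessor output signal
v = [0.6, 0.4]  # first output signal
w = [0.4, 0.6]  # successor output signal
y1 = combine(u, v, w)
print(y1)  # approximately [0.45, 0.55]
```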

After the combined output signal (y₁) has been determined, it can be determined to what extent it matches the desired output signal (y_1t). A cost function (606) can be used for this purpose, for example the cross-entropy function. The cost function can determine a cost value (l), on the basis of which the training can be carried out.
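A cross-entropy cost for a single combined output signal can be sketched as below. The sketch assumes that the combined output is a vector of class probabilities and that the desired output signal is given as a class index; both are assumptions, as is the small constant `eps` that guards against taking the logarithm of zero.

```python
import math


def cross_entropy(y, target_class, eps=1e-12):
    """Cross entropy between a probability vector y and a class index.

    Assumes y sums to 1; eps avoids log(0) and is an implementation
    detail of this sketch.
    """
    return -math.log(max(y[target_class], eps))


y1 = [0.45, 0.55]  # example combined output signal
cost = cross_entropy(y1, target_class=1)
print(cost)
```

The cost is 0 when the full probability mass lies on the desired class and grows as that probability shrinks.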

It is conceivable, for example, that a known gradient descent method, such as SGD or Adam, is carried out on the basis of the cost value.
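A plain gradient descent update on the cost of the combined output can be sketched with a toy model. Everything here is an illustrative assumption: the classifier is a one-feature logistic model (`predict`), the weights a, b, c and the learning rate are arbitrary, and the gradient is approximated by central finite differences, whereas a practical implementation would typically use automatic differentiation. Adam is not shown.

```python
import math


def predict(theta, x):
    """Toy classifier: probability of class 1 for a scalar 'image' x."""
    return 1.0 / (1.0 + math.exp(-(theta[0] * x + theta[1])))


def cost(theta, x_v, x_1, x_n, target, a=0.25, b=0.5, c=0.25):
    """Cross entropy of the combined output against the desired class (0 or 1)."""
    y1 = a * predict(theta, x_v) + b * predict(theta, x_1) + c * predict(theta, x_n)
    p = y1 if target == 1 else 1.0 - y1
    return -math.log(max(p, 1e-12))


def sgd_step(theta, sample, lr=0.5, h=1e-6):
    """One gradient descent update; gradient via central finite differences."""
    grad = []
    for i in range(len(theta)):
        plus = list(theta)
        plus[i] += h
        minus = list(theta)
        minus[i] -= h
        grad.append((cost(plus, *sample) - cost(minus, *sample)) / (2 * h))
    return [t - lr * g for t, g in zip(theta, grad)]


sample = (0.8, 1.0, 1.2, 1)  # x_v, x_1, x_n, desired class
theta = [0.0, 0.0]
initial = cost(theta, *sample)
for _ in range(50):
    theta = sgd_step(theta, sample)
final = cost(theta, *sample)
print(initial, final)
```

Because the cost is computed on the combined output, each update changes the parameters through all three forward passes, not only through the one for the first input image.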

CITATIONS INCLUDED IN THE DESCRIPTION

This list of the documents cited by the applicant was generated automatically and is included solely for the better information of the reader. The list is not part of the German patent or utility model application. The DPMA assumes no liability for any errors or omissions.

Patent literature cited

DE 102018211875 [0002]

Claims

1. Computer-implemented method for training an image classifier (60), the image classifier (60) determining a corresponding plurality of output signals for a plurality of input images, each output signal being based on feature representations of at least two input images of the plurality of input images, the feature representations being determined by the image classifier (60).

2. Method according to claim 1, wherein the plurality of input images is a sequence of input images and an output signal (y) for a first input image (x) depends on a feature representation of the predecessor and/or a feature representation of the successor of the first input image.

3. Method according to claim 2, wherein the output signal (y) is determined by a linear combination of the feature representation of the predecessor, the feature representation of the successor and the feature representation of the first input image.

4. Computer-implemented method for classifying an input image by means of an image classifier (60), comprising the steps of:
• training the image classifier (60) according to any one of claims 1 to 3;
• providing the trained image classifier (60) to an inference device;
• determining an output signal (y) based on the input image (x) by the inference device, wherein, in order to determine the output signal (y), the inference device feeds the input image (x) to the image classifier (60) and the image classifier (60) determines the output signal (y).

5. Control system (40) for controlling an actuator (10) and/or a display device (10a), the control system comprising an image classifier (60) which has been trained according to any one of claims 1 to 3, the control of the actuator (10) and/or of the display device (10a) being based on an output signal (y) of the image classifier (60).

6. Training device which is configured to carry out the method according to any one of claims 1 to 3.

7. Computer program which is configured to carry out the method according to any one of claims 1 to 5.

8. Machine-readable storage medium (46, 146) on which the computer program according to claim 7 is stored.