DE102022128074A1

DE102022128074A1 - METHOD FOR CONTROLLING A MACHINE ELEMENT AND CONTROL ARRANGEMENT

Info

Publication number: DE102022128074A1
Application number: DE102022128074.8A
Authority: DE
Inventors: to be named later
Original assignee: Sensor Technik Wiedemann GmbH
Current assignee: Sensor Technik Wiedemann GmbH
Priority date: 2022-10-24
Filing date: 2022-10-24
Publication date: 2024-04-25
Also published as: WO2024088917A1

Abstract

The invention relates to a method for controlling a machine element in which, during the steps of identifying and categorizing a gesture, progress is indicated to the user by a change in the frame around the gesture.

Description

The present invention relates to a method for controlling a machine element. The invention further relates to a control arrangement and a computer arrangement.

BACKGROUND

Gesture control is becoming increasingly important in robotics. On the one hand, this covers commands that a user gives to a machine by gesture rather than by manual input. On the other hand, it covers gestures that a machine should be able to recognize as autonomously as possible, e.g. when danger is imminent or when the user is carrying out other activities at the same time.

For a user interacting with a machine, the main concerns are that the gesture is recognized reliably and that this recognition takes place as quickly as possible.

There is therefore a need to address these aspects, at least in part, and thereby improve the interaction between human and machine.

SUMMARY OF THE INVENTION

This need is met by the subject matter of the independent claims. Further developments and embodiments of the proposed principle are specified in the dependent claims.

The inventor has recognized that a frequent problem in gesture-controlled human-machine interaction is the time delay between the user's gesture and the start of the action. Depending on the algorithm used, the delay between the gesture and the corresponding machine action can range from several hundred milliseconds to a few seconds.

This delay often leads the user to abort or change the gesture because they receive no feedback from the machine and therefore cannot judge whether the gesture has gone unrecognized or has been recognized incorrectly. At the same time, the gesture should be recognized reliably, so the user must hold the desired gesture for some time to ensure that the correct gesture is recognized across several images. Reliable recognition, and consequently correct activation of actions, therefore requires various criteria to be met, for example: the object must be recognized across several frames; the detection probability must exceed a certain threshold; the position and size of the recognized object (the hand making the gesture) may change only by a certain amount; and others.

To improve this interaction, the inventor proposes a method for controlling a machine element in which the user, while performing the gesture, is given information about the identification of the gesture and its categorization. In this way, the user can adapt or slightly change the gesture and can hold it for the necessary period of time (but no longer) until the machine starts executing the action requested by the gesture.

Accordingly, in a method according to the proposed principle, a plurality of images is captured, from which the user's gestures are identified and categorized. The captured images are also optionally output again, e.g. on a screen, so that the user can see the gesture and adjust it if necessary. This feedback to the user improves recognition of the gesture.

A subset of the plurality of images captured by the camera is then acquired. The subset can be equal to the plurality, so that subsequent processing takes place in real time. A smaller subset, however, has the advantage of requiring less processor power for processing. In some aspects, for example, only every second image is used for further processing. In other aspects, 10 images per second are used for processing, i.e. for identifying the gesture and categorizing it.
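The two subsampling options mentioned above (every second image, or roughly 10 images per second) can be sketched as follows. This is a minimal illustration only; the function names `every_nth` and `stride_for_rate` and the rounding rule are assumptions and do not come from the application.

```python
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def every_nth(frames: Iterable[T], n: int = 2) -> Iterator[T]:
    """Yield every n-th frame, starting with the first one."""
    if n < 1:
        raise ValueError("n must be at least 1")
    for index, frame in enumerate(frames):
        if index % n == 0:
            yield frame


def stride_for_rate(camera_fps: float, target_fps: float = 10.0) -> int:
    """Stride that reduces the camera rate to roughly target_fps (assumed rule)."""
    if camera_fps <= 0 or target_fps <= 0:
        raise ValueError("frame rates must be positive")
    # Never go below a stride of 1: a slow camera is processed frame by frame.
    return max(1, round(camera_fps / target_fps))
```

With a stride of 1 the subset equals the full set of captured images, which corresponds to the real-time case described above.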

In two subsequent steps, at least one object is identified for each image of the subset of captured images, in particular by means of a network trained by machine learning. The identified object is also categorized by the network from a set of predetermined objects, yielding a confidence value, with a defined control command assigned to each of the predetermined objects.

Depending on the deep learning network used, these steps can be performed jointly or sequentially. In one aspect, a network is used in which identification and categorization are carried out essentially jointly. In particular, an approach was chosen for the ground truth during training that uses feedback for both identification and categorization. In some aspects, a machine-learning network based on YOLO is used.

In a further step, the user is given feedback about the identified object. For this purpose, a symbol associated with the identified object is displayed on the screen, which indicates the confidence value or a criterion derived from it. The user is thus informed which of the items present in the images was identified by the network as a gesture object and which gesture this item is taken to represent after categorization. This allows the user to intervene at an early stage if, for example, the object is incorrectly identified, the wrong gesture is categorized, or the identification repeatedly fails due to external parameters (e.g. hand position, lighting, etc.).

The confidence value represents the probability, as determined by the network and based on the ground truth, that the categorized object corresponds to the true object and thus to the correct gesture.

The confidence value determined for the categorized object is then compared with a threshold value over a number of images from the subset of captured images. If the confidence values assigned to the categorized object exceed the threshold value for that number of images, a control command assigned to the categorized object is generated and issued. At the same time, the symbol associated with the identified object is changed on the screen, informing the user that the control command has been issued.
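The threshold comparison over a number of images can be sketched as a small counter. This is an illustrative sketch under stated assumptions: the class name `GestureTrigger`, the default values, and the reading that the images must be consecutive and identically categorized are assumptions, not details confirmed by the text.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class GestureTrigger:
    """Counts processed frames in which one category stays above the threshold."""

    threshold: float = 0.7      # assumed default; the text names several options
    required_frames: int = 30   # assumed default number of images
    _label: Optional[str] = field(default=None, init=False)
    _count: int = field(default=0, init=False)

    def update(self, label: Optional[str], confidence: float) -> Optional[str]:
        """Feed one processed frame; return the category once the command is due."""
        if label is None or confidence <= self.threshold:
            # No object, or confidence not above the threshold: start over.
            self._label, self._count = None, 0
            return None
        if label != self._label:
            # A different category was reported: restart counting for it.
            self._label, self._count = label, 0
        self._count += 1
        if self._count >= self.required_frames:
            self._count = 0
            return label  # caller issues the command and changes the symbol
        return None
```

A caller would invoke `update` once per processed image and, whenever a category is returned, issue the assigned control command and change the on-screen symbol at the same time.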

In this way, the user is kept informed about the progress of the process and the "probability of success" while the gesture is being identified and categorized. This feedback makes the system much easier to operate, because the user receives feedback about the state of the gesture recognition, the state of the system, the effect on the system, the actions initiated and their consequences. In some aspects, the proposed method ensures that a control command is executed only if a predetermined confidence value is exceeded and this condition holds for a certain duration.

For example, the user can be alerted that the gesture has not been identified or cannot be categorized with the required accuracy, so that the user can change how the gesture is presented, e.g. adjust the angle to the camera, move closer or further away, move the hand away from the body, and the like.

In one aspect, the symbol associated with the identified object comprises a frame arranged around the identified object. This shows the user directly which item in the captured and processed image has been identified as an object. In addition, the confidence value can be communicated in various ways. Options include a number, in particular a percentage, whose magnitude depends on the confidence value; a frame thickness that depends on the confidence value; a frame color that depends on the confidence value; or a combination of several of these.
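One way to combine the three options (percentage, frame thickness, frame color) is sketched below. The linear mapping, the red-to-green color ramp and the names `FrameStyle` and `frame_style` are assumptions chosen for illustration; the text does not prescribe a particular mapping.

```python
from typing import NamedTuple, Tuple


class FrameStyle(NamedTuple):
    label: str                   # percentage text shown next to the frame
    thickness: int               # frame line width in pixels
    color: Tuple[int, int, int]  # RGB color of the frame


def frame_style(confidence: float, max_thickness: int = 8) -> FrameStyle:
    """Map a confidence value in [0, 1] to a frame style (assumed linear mapping)."""
    c = min(1.0, max(0.0, confidence))  # clamp out-of-range values
    thickness = 1 + round(c * (max_thickness - 1))
    # Fade from red (low confidence) to green (high confidence).
    color = (round(255 * (1 - c)), round(255 * c), 0)
    return FrameStyle(f"{round(c * 100)}%", thickness, color)
```

The returned values could be passed to any drawing routine that renders a rectangle and a text label around the identified object.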

In this context, the confidence value may depend on the categorization in previous images of the subset. In particular, the confidence value can remain above the threshold if the same identified object is categorized identically across several images. If an overall value is formed from the confidence values, e.g. a sum, it should increase as long as the identified object has been categorized identically across several images. In such a case, the color of the frame or the displayed confidence value can change accordingly, so that the user is shown the increasing confidence in correct gesture recognition. In some aspects, the first time the threshold is exceeded is also displayed, so that the user knows that the control command will be generated after a predetermined time has elapsed, provided the gesture continues to be recognized correctly.

In a further aspect, the symbol associated with the identified object changes, in particular if the identified object was categorized in the same way in previous images of the subset. In other words, for a control command to be executed, a predetermined confidence value must be exceeded and this condition must hold for a certain duration. Accordingly, in some aspects, a symbol associated with the identified object forms, for example, a frame around the identified object whose thickness and/or color depends on the categorization of the identified object in successive images of the subset. In particular, its thickness and/or color can change if the identified object was categorized identically in previous images of the subset.

In some embodiments, a symbol associated with the issued control command is displayed on the screen. This can be, for example, the gesture that was recognized, or a name or short description of the executed control command. Likewise, the symbol associated with the identified object can be removed from the screen a defined period of time after the assigned control command has been issued. This indicates to the user that the system is available for a new gesture.

In another aspect, the symbol associated with the identified object can be removed from the screen if the confidence values assigned to the categorized object fall below the threshold value for the number of images from the subset. In some aspects, however, it may not be sensible to do this the first time a single value falls below the threshold. Rather, the proposed method can tolerate a certain amount of miscategorization or categorization with low confidence values.

In some aspects, the threshold value is therefore still deemed to have been exceeded by the confidence values assigned to the categorized object for the number of images from the subset if, within that number of images, the value falls below the threshold in fewer than 7 images, in particular fewer than 5, in particular fewer than 3. The threshold value for successful categorization can be, for example, above 50%, above 60%, above 70% or above 80%.
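The tolerance rule can be sketched as a check over a window of confidence values. This is a minimal sketch; the function name `window_passes` and the choice of defaults (threshold of 70%, at most 2 misses, i.e. fewer than 3) are assumptions picked from the ranges named above.

```python
from typing import Sequence


def window_passes(
    confidences: Sequence[float],
    threshold: float = 0.7,
    max_misses: int = 2,
) -> bool:
    """Return True if the window counts as exceeding the threshold.

    A miss is an image whose confidence is not above the threshold. The
    window still passes as long as there are at most max_misses of them.
    """
    if not confidences:
        return False  # nothing was observed, so nothing can be triggered
    misses = sum(1 for value in confidences if value <= threshold)
    return misses <= max_misses
```

Setting `max_misses` to 6 or 4 would correspond to the "fewer than 7" and "fewer than 5" variants.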

The number of images from the subset in which an identified and categorized object must have a confidence value above the threshold can, in some aspects, depend on the category or on the categorized object itself. This makes it possible for certain categorized objects to trigger their associated control command with higher priority, i.e. more quickly. For example, for a categorized object to which a stop control command is assigned, the number of images in which the identified object must be categorized accordingly can be significantly smaller. In some aspects, it is one third smaller, e.g. only 20 images instead of 30.
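A category-dependent image count can be expressed as a simple lookup. The sketch below uses the 30 and 20 image figures from the example above; the category key `"stop"` and all identifier names are hypothetical.

```python
DEFAULT_REQUIRED_FRAMES = 30

# Hypothetical category names; only the stop example comes from the text.
REQUIRED_FRAMES_BY_CATEGORY = {
    "stop": DEFAULT_REQUIRED_FRAMES * 2 // 3,  # one third fewer images
}


def required_frames(category: str) -> int:
    """Number of images a category must stay above the threshold."""
    return REQUIRED_FRAMES_BY_CATEGORY.get(category, DEFAULT_REQUIRED_FRAMES)
```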

Another aspect concerns the problem that identified objects move, for example when a user moves the hand making the gesture. For this purpose, the identification and categorization network generates a motion vector that is assigned to the identified object so that the object can be followed. In other aspects, an identified object is still categorized across two consecutive images as long as its position in the respective image deviates by no more than a specified number of pixels. In other words, a value can be defined for how many pixels an object in one image (or a symbol associated with the object, such as a frame around it) may shift relative to the object in the following image while still being categorized or identified.

This value can be proportional, by a factor, to the size of the object or to the size of a frame placed around the object. Similarly, such a factor can be used to define a tolerance for the distance of the object, for example the user's hand, from the recognition unit and hence for the changing size of the recognized hand. These tolerances and checks are important so that the gesture recognition has an indication of whether it is dealing with a single input or with multiple or changing users. In some further aspects, objects such as hand gestures that are located in the middle of the captured image are given preferential treatment, i.e. are preferentially identified or categorized. The network is thus trained primarily to identify and categorize objects in the center of the captured images.
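The size-proportional tolerances can be illustrated with a check on two consecutive bounding boxes. This is a sketch under assumptions: the `Box` layout, the factor values and the use of the larger box side and the box area as size measures are illustrative choices, not taken from the application.

```python
import math
from typing import NamedTuple


class Box(NamedTuple):
    cx: float  # center x in pixels
    cy: float  # center y in pixels
    w: float   # width in pixels
    h: float   # height in pixels


def same_object(
    prev: Box,
    curr: Box,
    shift_factor: float = 0.25,  # assumed: allowed shift relative to box size
    size_factor: float = 0.3,    # assumed: allowed relative change in area
) -> bool:
    """Decide whether curr is still the object seen in prev."""
    prev_area = prev.w * prev.h
    if prev_area <= 0:
        return False
    # Allowed displacement in pixels scales with the size of the previous box.
    max_shift = shift_factor * max(prev.w, prev.h)
    shift = math.hypot(curr.cx - prev.cx, curr.cy - prev.cy)
    # Relative area change approximates a change in distance to the camera.
    size_change = abs(curr.w * curr.h - prev_area) / prev_area
    return shift <= max_shift and size_change <= size_factor
```

A frame-counting step could reset its counter whenever `same_object` returns `False`, treating the detection as a new input or a different user.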

Another aspect concerns blocking further control commands until the previous control command, whose gesture was recognized, has been processed. It is also intended to indicate this block to the user. Accordingly, in some aspects, the symbol associated with the identified or categorized object is changed immediately after the command is issued and is then locked until the control command has been processed.

The term blocking means that further identification or categorization of an additional gesture is not displayed. In this way, the user is informed that the recognized control command is still being processed and has not yet been completed. Accordingly, in these aspects, renewed identification or categorization is not displayed, even if it continues to run in the background. In other aspects, gesture recognition can also be suspended until the control command has been processed.

The block can be communicated to the user in various ways, e.g. by displaying a further symbol or by a further change to the symbol associated with the identified object. For example, after correct recognition, an icon of the recognized gesture can be displayed whose color changes once the control command has been issued. The icon then keeps this color for as long as the control command is being executed.

In some aspects, however, gesture identification and categorization continue in the background. This is useful if there is to be an exception to the blocking of gesture categorization, namely a gesture associated with the control command "Stop" or an equivalent (e.g. an abort command). This ensures that a user can still interrupt the movement of the machine or a function being performed at any time.
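The blocking behavior with a stop exception can be sketched as a small gate in front of the command output. The class name `CommandGate`, the command strings and the choice that a stop command also ends the running command are assumptions made for this illustration.

```python
from typing import FrozenSet


class CommandGate:
    """Blocks new commands while one is running, except for stop commands."""

    def __init__(self, always_allowed: FrozenSet[str] = frozenset({"stop"})) -> None:
        self._always_allowed = always_allowed
        self._busy = False

    @property
    def busy(self) -> bool:
        """True while a previously issued command is still being processed."""
        return self._busy

    def submit(self, command: str) -> bool:
        """Return True if the command may be issued to the machine."""
        if command in self._always_allowed:
            # Assumption: a stop interrupts whatever is currently running.
            self._busy = False
            return True
        if self._busy:
            return False  # blocked: the previous command is not finished yet
        self._busy = True
        return True

    def finished(self) -> None:
        """Signal that the running command has been processed."""
        self._busy = False
```

The `busy` flag could also drive the display, for example by keeping the changed icon color for as long as it is `True`.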

In some aspects, the method proposed here can be implemented in a computer arrangement. This comprises a memory with a program stored therein and at least one processor connected to the memory. The program is designed to carry out the method according to one of the above claims. In another aspect, a control arrangement is proposed with a camera for recording a plurality of images in the field of view of the camera, a screen, at least one processor connected to a memory, and a program stored in the memory. The program has instructions which, when executed on the at least one processor, carry out the above-mentioned steps. These include, among others:

- capturing at least a subset of the plurality of images recorded by the camera;
- outputting the plurality of recorded images on the screen;
- identifying at least one object for each image of the subset of recorded images;
- categorizing the at least one object from a set of predetermined objects to form a confidence value, wherein each of the predetermined objects is assigned a defined control command;
- outputting on the screen a symbol associated with the identified object which indicates the confidence value or a criterion derived therefrom;
- comparing the confidence values assigned to the categorized object with a threshold value for a number of images from the subset of recorded images;
- and, if the threshold value is exceeded by the confidence values assigned to the categorized object for the number of images from the subset of recorded images:
- ◯ generating and issuing the control command assigned to the categorized object; and
- ◯ changing the symbol associated with the identified object on the screen.
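The comparison and triggering steps listed above can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation of the disclosure: the class `GestureTrigger`, the mapping `COMMANDS`, the gesture names, the threshold of 0.8 and the run length are all hypothetical, and the identification and categorization by the network are assumed to have already produced a category and a confidence value per image.

```python
from collections import deque

# Assumption: each predetermined object (gesture) maps to one control command.
COMMANDS = {"thumbs_up": "start", "open_palm": "stop"}


class GestureTrigger:
    """Issues a command once the confidence value exceeds the threshold
    for a fixed number of consecutive images of the same category."""

    def __init__(self, threshold=0.8, frames_required=5):
        self.threshold = threshold
        self.frames_required = frames_required
        # Sliding window over the most recent (category, confidence) pairs.
        self.history = deque(maxlen=frames_required)

    def update(self, category, confidence):
        """Feed one categorized image; return a command or None."""
        # A change of category starts a new run.
        if self.history and self.history[-1][0] != category:
            self.history.clear()
        self.history.append((category, confidence))
        window_full = len(self.history) == self.frames_required
        if window_full and all(c > self.threshold for _, c in self.history):
            self.history.clear()       # avoid re-issuing on the next image
            return COMMANDS[category]  # caller issues it and changes the symbol
        return None


trigger = GestureTrigger(threshold=0.8, frames_required=3)
results = [trigger.update("open_palm", c) for c in (0.9, 0.85, 0.95)]
print(results)  # [None, None, 'stop']
```

A caller would change the symbol on the screen, e.g. the frame color, at the moment `update` returns a command.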

BRIEF DESCRIPTION OF THE DRAWINGS

Further aspects and embodiments according to the proposed principle will become apparent with reference to the various embodiments and examples which will be described in detail in connection with the accompanying drawings.

FIG. 1 shows a timing diagram for a first embodiment of the method according to the proposed principle;
FIG. 2 shows a timing diagram for a second embodiment of the method according to the proposed principle;
FIG. 3 shows a further aspect of an embodiment of the method according to the proposed principle;
FIG. 4 is a screenshot of a screen showing the confidence value and a frame around the identified gesture;
FIG. 5 is a flow diagram of an embodiment of the method according to some aspects of the proposed principle;
FIG. 6 shows an embodiment of a control arrangement according to some aspects of the proposed principle;
FIG. 7 shows an embodiment of different gestures as object categories with associated control commands.

DETAILED DESCRIPTION

The following embodiments and examples show various aspects and their combinations according to the proposed principle. The embodiments and examples are not always to scale. Likewise, various elements may be shown enlarged or reduced in size in order to emphasize individual aspects. It goes without saying that the individual aspects and features of the embodiments and examples shown in the figures can readily be combined with one another without affecting the inventive principle. Some aspects have a regular structure or shape. It should be noted that in practice minor deviations from the ideal shape may occur without contradicting the inventive idea.

In addition, the individual figures, features and aspects are not necessarily shown in the correct size, nor are the proportions between the individual elements necessarily correct. Some aspects and features are emphasized by showing them enlarged. However, terms such as "top", "above", "bottom", "below", "larger", "smaller" and the like are correctly represented in relation to the elements in the figures. It is therefore possible to infer such relationships between the elements from the figures.

LIST OF REFERENCE SYMBOLS

1: Computer, control arrangement
2: Memory
3: Processor
4: Camera
5: Screen
6: Machine element
10: Image, subset
11: Identified object
12, 12a: Associated symbol
13, 13a: Frame
13b: Frame indicating that the gesture was correctly recognized over a longer period of time
13c: Frame after the control command has been started
20: Categorized object
21: Assigned control command

Claims

Method for controlling a machine element (6), comprising:
- recording (S1) a plurality of images;
- optionally outputting (S2) the plurality of images on a screen;
- capturing (S3) at least a subset (10) of the plurality of images recorded by the camera;
- identifying (S4), in particular by a network trained by means of machine learning, at least one object (11) for each image of the subset of recorded images;
- categorizing (S5), in particular by the network, the at least one object (11) from a set of predetermined objects to form a confidence value, wherein each of the predetermined objects is assigned a defined control command;
- outputting (S6) on the screen a symbol (12, 12a) associated with the identified and/or categorized object (11), which indicates the confidence value or a criterion derived therefrom;
- comparing (S7) the confidence values assigned to the categorized object (20) with a threshold value for a number of images from the subset of recorded images;
- if the threshold value is exceeded by the confidence values assigned to the categorized object (20) for the number of images from the subset (10) of recorded images:
- ◯ generating (S8) and issuing the control command (21) assigned to the categorized object (20); and
- ◯ changing (S8') the symbol (13c) associated with the identified object (11) on the screen.

Method according to claim 1, in which the symbol associated with the identified object (11) has a frame (13) around the identified object, wherein optionally the confidence value is indicated by at least one of the following:
- a number, in particular a percentage value, the size of which depends on the confidence value;
- a thickness of the frame which depends on the confidence value;
- a color of the frame which depends on the confidence value;
- a combination of several of the aforementioned.

Method according to one of the preceding claims, wherein the confidence value or a criterion derived therefrom depends on a categorization of successive images of the subset (10), and in particular increases if the identified object (11) was categorized in the same way for previous images of the subset (10).

Method according to one of the preceding claims, in which the symbol associated with the identified object (11) has a frame (13) around the identified object, the thickness and/or color of which depends on a categorization of the identified object in successive images of the subset (10), in particular the thickness and/or color of which changes if the identified object (11) was categorized in the same way for previous images of the subset (10).

Method according to one of the preceding claims, further comprising: - outputting a symbol associated with the issued control command on the screen; and/or - removing the symbol associated with the identified object from the screen after a defined period of time after the associated control command has been issued.

Method according to one of the preceding claims, wherein the symbol (12) associated with the identified and/or categorized object (11) contains a designation of a category of the categorized object.

Method according to one of the preceding claims, further comprising, if the confidence values assigned to the categorized object fall below the threshold value for the number of images from the subset of recorded images: - removing the symbol associated with the identified and/or categorized object from the screen.

Method according to one of the preceding claims, in which the threshold value is also deemed to be exceeded by the confidence values assigned to the categorized object for the number of images from the subset if, within that number of images, the confidence value falls below the threshold value for fewer than 5 images, in particular fewer than 3 images.

Method according to one of the preceding claims, in which the number of images recorded from the subset depends on the categorized object; and in particular is smaller for an object to which a stop control command is assigned than for another object.

Method according to one of the preceding claims, in which - the subset comprises only every second image from the plurality of images recorded by the camera; or - the subset comprises only 2.5 images from the plurality of images recorded by the camera.

Method according to one of the preceding claims, in which an identified object is categorized identically in two consecutive images if its position in the respective images deviates by no more than a specified number of pixels.

Method according to one of the preceding claims, in which the symbol (13c) associated with the identified object (11) remains changed on the screen after the associated control command has been issued until the control command has been processed.

Method according to one of the preceding claims, further comprising aborting a control command that is being processed if the threshold value is exceeded by the confidence values assigned to a categorized object (20) associated with an abort command for the number of images from the subset (10) of recorded images.

Computer arrangement comprising: - a memory with a program stored therein; - at least one processor connected to the memory; - wherein the program is designed to carry out the method according to one of the above claims.

Control arrangement with a camera (4) for recording a plurality of images in the field of view of the camera, a screen (5), at least one processor (3) connected to a memory (2), and a program (7) stored in the memory (2) which has instructions which, when executed on the at least one processor (3), carry out the steps:
- capturing at least a subset (10) of the plurality of images recorded by the camera;
- outputting the plurality of recorded images (10) on the screen (5);
- identifying at least one object (11) for each image of the subset of recorded images;
- categorizing the at least one object (11) from a set of predetermined objects to form a confidence value, each of the predetermined objects being assigned a defined control command;
- outputting on the screen a symbol (12, 12a) associated with the identified object which indicates the confidence value or a criterion derived therefrom;
- comparing the confidence values assigned to the categorized object with a threshold value for a number of images from the subset of recorded images;
- if the threshold value is exceeded by the confidence values assigned to the categorized object (11) for the number of images from the subset of recorded images:
- ◯ generating and issuing the control command assigned to the categorized object; and
- ◯ changing the symbol associated with the identified object on the screen.

Control arrangement according to claim 15, in which the symbol associated with the identified and/or categorized object (11) has a frame (13, 13a) around the identified and/or categorized object, wherein optionally the confidence value is indicated by at least one of the following:
- a number, in particular a percentage value, the size of which depends on the confidence value;
- a thickness of the frame which depends on the confidence value;
- a color of the frame which depends on the confidence value;
- a combination of several of the aforementioned.

Control arrangement according to one of claims 15 to 16, in which the symbol associated with the identified object (11) has a frame (13) around the identified object (11), the thickness and/or color of which depends on a categorization of the identified object in successive images of the subset (10), in particular the thickness and/or color of which changes if the identified object (11) was categorized in the same way for previous images of the subset (10).

Control arrangement according to one of claims 15 to 17, further comprising: - outputting a symbol associated with the issued control command on the screen; and/or - removing the symbol (12, 12a) associated with the identified and/or categorized object (11) from the screen after a defined period of time after the associated control command has been issued.

Control arrangement according to one of claims 15 to 18, wherein the symbol (12, 12a) associated with the identified and/or categorized object (11) contains a designation of a category of the categorized object.

Control arrangement according to one of claims 15 to 19, further comprising, if the confidence values assigned to the categorized object fall below the threshold value for the number of images from the subset of recorded images: - removing the symbol (12, 12a) associated with the identified and/or categorized object (11) from the screen.

Control arrangement according to one of claims 15 to 20, in which the threshold value is also deemed to be exceeded by the confidence values assigned to the categorized object for the number of images from the subset if, within that number of images, the confidence value falls below the threshold value for fewer than 4 images, in particular fewer than 3 images.

Control arrangement according to one of claims 15 to 21, in which the number of images from the subset depends on the categorized object; and in particular is smaller for an object to which a stop control command is assigned than for another object.

Control arrangement according to one of claims 15 to 22, in which - the subset comprises only every second image from the plurality of images recorded by the camera; or - the subset comprises only 2.5 images from the plurality of images recorded by the camera.

Control arrangement according to one of claims 15 to 23, in which an identified object (11) is categorized identically in two consecutive images if its position in the respective images deviates by no more than a specified number of pixels.

Control arrangement according to one of claims 15 to 24, in which the symbol (13c) associated with the identified object (11) remains changed on the screen after the associated control command has been issued until the control command has been processed.

Control arrangement according to one of claims 15 to 25, further comprising aborting a control command that is being processed if the threshold value is exceeded by the confidence values assigned to a categorized object (20) associated with an abort command for the number of images from the subset (10) of recorded images.