DE102020121565A1

DE102020121565A1 - Device and method for interacting with a graphical user interface and for testing an application

Info

Publication number: DE102020121565A1
Application number: DE102020121565.7A
Authority: DE
Inventors: Andreas Rau; Jenny Rau; Andreas Zeller
Original assignee: Cispa Helmholtz Zentrum Fuer Informationssicherheit Ggmbh
Current assignee: Cispa Helmholtz Zentrum Fuer Informationssicherheit Ggmbh
Priority date: 2020-08-17
Filing date: 2020-08-17
Publication date: 2022-02-17
Also published as: WO2022037818A1

Abstract

The invention relates to a device (1) and a method for interacting with a graphical user interface and for testing an application. A voice command is provided, from which a text element is identified. A lexical content is determined for an interactive element (22, 23, 24) of the user interface. A degree of semantic similarity between the text element and the lexical content is determined, and the interactive element (22, 23, 24) is then interacted with depending on that degree of similarity.

Description

The invention relates to a method for interacting with a graphical user interface, the graphical user interface having an interaction element.

This is to be understood to mean that the user interface has one or more interaction elements. As a rule, the user interface will have a large number of interaction elements. An interaction element is an interactive element with a graphical representation, that is, an element that can be displayed and interacted with in order to trigger an action. The interaction element can therefore in particular be what is known as a "widget". For example, an interaction element can be clickable, in which case an action is triggered when the interaction element is clicked. There are also interaction elements that require input, drop-down menus, and many other types of interaction elements.

Provision can be made for the interaction with the user interface to control functionalities of an application that uses the user interface as its interface. In particular, provision can be made for a program flow of the application to be controlled by triggering an action through an interaction with the interaction element. An interaction with an interaction element can in particular trigger an action that transfers the application and/or the user interface from one state into another.

An interaction can also be referred to as an executed action.

An application can be characterized in particular by the fact that it is a software application or a computer program that performs a function that is useful to the user.

Graphical user interfaces (GUIs) are ubiquitous. Interaction often takes place by the user clicking on graphically displayed interactive elements with a computer mouse or a finger and making entries using a keyboard. This is cumbersome.

There is therefore a need for more ergonomic ways of interacting.

In recent years, voice controls have been developed that can improve operation. Voice assistants, however, usually take a different route: they do not interact with a graphical user interface but trigger functionalities of the application in some other way.

Direct voice control of graphical user interfaces is currently implemented only in a very rudimentary form. It is known to control graphical user interfaces remotely by using voice commands. Here the voice commands merely replace manual operation, in that commands are uttered such as "move the cursor to" or "click on the button labeled <Name>", where <Name> is fixed in advance. Flexible voice control is therefore not possible, since the user must speak the correct terms and must navigate through the graphical user interface in the same way as with manual interaction.

The invention also relates to a method for testing an application that uses a user interface as its interface.

Interaction is easier in such an environment, since the operating system on which the application runs generally provides a software interface for interacting with the graphical user interface. Interactions that are normally carried out manually can therefore be performed programmatically via the software interface.

The testing of applications is of great practical importance. In addition to manually executed tests, automated test procedures are becoming increasingly important because they can be carried out with less effort and yield a high degree of comparability.

It is known to use crawlers for this purpose, which trigger essentially random interactions with the graphical user interface and successively explore a large number of states of the user interface. Such an approach is slow and can explore only a fraction of the state space.

Alternative methods have recently been developed in which a complete test run is first programmed for a test application. This test run is then applied to an alternative application. Compared to a randomized crawler, such an approach can explore a larger region of the state space of the application under test in the same amount of time. However, the method is very laborious, because a complete test run has to be programmed for each individual test case.

Against this background, the invention is based on the object of creating methods for interacting with a graphical user interface and for testing an application that make it easier to navigate through the graphical user interface. A further aim is to reduce the time required to develop a test protocol for an application and to allow different applications to be tested with the same test protocol.

Where variants of the invention are described below, they can be combined with one another as desired, provided that a combination is not ruled out for technical reasons.

To achieve the aforementioned object, the invention proposes the features of claim 1. In particular, for a method for interacting with a graphical user interface of the type described at the outset, the invention proposes that a voice command is provided from which a text element is identified, that a lexical content is determined for the interaction element, that a degree of semantic similarity between the text element and the lexical content is determined, and that the interaction element is interacted with depending on that degree of similarity. This is to be understood to mean that, in addition to the one identified text element, further text elements can also be identified. In addition to the one voice command, further voice commands can also be given. Both will in fact be the norm.

This makes navigating through the graphical user interface considerably easier. It enables voice control that is not tied to the terms specified by the user interface. Because a degree of semantic similarity is determined between text elements of the voice command and interaction elements, terms can be used that are syntactically completely dissimilar but semantically similar. For example, the voice command "Sign me in" can be applied to a graphical user interface whose corresponding button is labeled "Login". This enables natural voice control without having to adopt the terminology of the graphical user interface being controlled.

This is useful not only for people who cannot recognize the terms because of impaired vision, but also for sighted people, since interaction elements can lie outside the displayed area and are therefore not visible to the user at all times.

The voice command can consist of a single word, but as a rule it will be formed by a sequence of text elements. A text element can in turn consist of a single word or be formed by a sequence of words. The voice command is preferably formed from words of a natural language.

The voice command is preferably formed according to the rules of a grammar. This is particularly preferably a formal grammar.

The grammar preferably divides text elements into different classes. Some possible classes are described in more detail further below by way of example. The identified text element preferably belongs to a class whose text elements designate interaction elements.

Provision can be made for the provided voice command to be broken down programmatically into a sequence of text elements. The sequence can also consist of just one text element, although it will usually comprise several. The breakdown can be performed, for example, by a text parser, preferably according to the rules of the aforementioned grammar.

For example, provision can be made for the grammar to be used to recognize whether a text element consists of a single word or of a sequence of several words. For this purpose, signal words or punctuation marks can be provided, for example, which cause several words to be combined into one text element.
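The grouping of several words into one text element can be illustrated with a minimal sketch. It assumes, purely for illustration, that double quotation marks are the punctuation marks that combine words; the function name `split_text_elements` and the quoting convention are hypothetical and are not taken from the patent's grammar.

```python
import re

def split_text_elements(command: str) -> list[str]:
    """Split a command into text elements.

    Assumption for this sketch: words enclosed in double quotes form a
    single multi-word text element; every other whitespace-separated
    word is its own text element.
    """
    # First alternative: a quoted run (captured without the quotes).
    # Second alternative: a single run of non-whitespace characters.
    pattern = re.compile(r'"([^"]+)"|(\S+)')
    elements = []
    for quoted, single in pattern.findall(command):
        elements.append(quoted if quoted else single)
    return elements

elements = split_text_elements('click "sign in" then type "Jane Doe"')
print(elements)
```

A real implementation would presumably derive the grouping from the formal grammar rather than from a single regular expression.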

Preferably, a list is created that contains, in the order in which they occur, those text elements that belong to a class whose text elements designate interaction elements. Any macros that may be present are resolved beforehand.

The voice command is preferably formed according to the rules of the aforementioned grammar, but can otherwise be formed freely. It is advantageous, however, if additional information about an application with functionality similar to that of the application whose user interface is to be interacted with is used when drawing up the voice command. For this purpose, descriptive documentation of such an application, or results from testing such an application, can be used. What is important here is that this can be a different application, with a user interface that differs from the user interface that is the subject of the method according to the invention. This is a great advantage of the invention, since the voice command, once created, can be applied to a large number of different applications.

Provision can be made for the voice command to be provided by converting spoken input into a sequence of text elements. In principle, any voice recording device and any speech recognition method that the person skilled in the art considers suitable can be used for this purpose. Alternatively or additionally, provision can be made for the voice command to be provided by supplying a sequence of text elements directly, for example through keyboard input or by reading the content of a text file. Here too, the sequence of text elements can consist of just a single text element, although it preferably has more than one.

In principle, any method known from computational linguistics for determining semantic similarity can be used to determine the degree of semantic similarity between the text element and the lexical content. For example, the cosine similarity of the two text elements being compared can be calculated, normalized to the length of the compared strings. The cosine similarity yields a numerical value in the interval [-1, 1], where values close to 1 indicate a high degree of semantic similarity.
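As a minimal sketch of the cosine similarity mentioned above, assuming the two text elements have already been mapped to numeric vectors of equal dimension: the function name and the example vectors are illustrative only, and the additional normalization to string length mentioned in the text is not modeled here.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; the result lies in [-1, 1]."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)

# Hypothetical vectors: same direction, orthogonal, opposite direction.
same = cosine_similarity([1.0, 0.0], [2.0, 0.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 3.0])
opposite = cosine_similarity([1.0, 0.0], [-1.0, 0.0])
```

Vectors pointing in the same direction score 1, orthogonal vectors 0, and opposite vectors -1, which matches the interval stated in the text.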

The determination of the degree of semantic similarity can be based on a model space for a set of words, with each word of the set being assigned an element in the model space. The model space has a metric with which a distance between elements can be determined, this distance describing the semantic distance between the elements.

For example, the model space can be an n-dimensional vector space with n >= 2. The elements can be vectors in the vector space, and the metric can be given, for example, by the angle between vectors of the vector space.

In order to assign elements of the model space, or vectors, to the words, relationships between words can be determined on the basis of a text corpus. Words with high semantic similarity are then assigned to elements of the model space that lie close to one another. One well-known model is the word2vec model, which is trained with a neural network.

To carry out the method, the text elements to be compared are then matched against the model space, so that elements of the model space are assigned to the text elements. The semantic distance between the text elements can then be determined by means of the metric.
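The lookup-and-compare step can be sketched as follows. The two-dimensional vectors in `TOY_EMBEDDINGS` are invented for illustration and do not come from a trained model such as word2vec; the function names are likewise hypothetical. The metric used is the angle between vectors, as suggested above.

```python
import math

# Hypothetical toy model space: each word is assigned a 2-D vector.
# A real system would load vectors from a trained embedding model.
TOY_EMBEDDINGS = {
    "login": (0.9, 0.1),
    "signin": (0.8, 0.2),
    "search": (0.1, 0.9),
}

def angle(u, v):
    """Angle in radians between two non-zero vectors (smaller means closer)."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.hypot(*u) * math.hypot(*v)
    # Clamp to [-1, 1] to guard against rounding just outside acos's domain.
    return math.acos(max(-1.0, min(1.0, dot / norms)))

def closest_label(text_element, labels):
    """Return the label whose vector has the smallest angle to the text element."""
    query = TOY_EMBEDDINGS[text_element]
    return min(labels, key=lambda label: angle(query, TOY_EMBEDDINGS[label]))

best = closest_label("signin", ["login", "search"])
```

With these toy vectors, "signin" lies much closer to "login" than to "search", so the button labeled "login" would be the candidate for interaction.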

If the semantic similarity between a first text element and a second text element is to be determined, one or both of which consist of several words of the model space, provision can be made for the semantic similarities to be calculated pairwise and arranged in a matrix. The matrix entries determined in this way can then be reduced to a single scalar that describes the semantic similarity. For this purpose, the largest value can be determined for each row or for each column, and the values determined in this way can be added together. For normalization, the sum can then be divided by the number of values added. If the cosine similarity is used for each entry of the matrix, the resulting scalar again lies between -1 and 1.
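The matrix-based aggregation can be sketched as below. The word-level scores in `TOY_SCORES` are made up for the example, and the function names are hypothetical; any word-level similarity (for instance a cosine similarity over embeddings) could be passed in instead. The sketch takes the maximum per row; taking it per column would be the analogous alternative.

```python
def similarity_matrix(words_a, words_b, word_similarity):
    """Pairwise similarities: one row per word of the first text element."""
    return [[word_similarity(a, b) for b in words_b] for a in words_a]

def aggregate(matrix):
    """Take the largest value of each row, sum them, divide by their count."""
    row_maxima = [max(row) for row in matrix]
    return sum(row_maxima) / len(row_maxima)

# Hypothetical word-level scores; unknown pairs default to 0.0.
TOY_SCORES = {
    ("sign", "log"): 0.8,
    ("sign", "in"): 0.1,
    ("in", "log"): 0.1,
    ("in", "in"): 1.0,
}

def toy_word_similarity(a, b):
    return TOY_SCORES.get((a, b), 0.0)

matrix = similarity_matrix(["sign", "in"], ["log", "in"], toy_word_similarity)
score = aggregate(matrix)  # row maxima 0.8 and 1.0, averaged
```

Because each row maximum lies in [-1, 1] when cosine similarity is used, their mean does too, which is the property the text relies on.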

The determination of a degree of semantic similarity can therefore be based on a model of a set of words, the model representing relationships between the words. These relationships can consist, for example, in words occurring together in a context, such as a sentence or some other unit of meaning. A large number of methods for this are known to the person skilled in the art.

Such relationships between words are not taken into account in purely syntactic comparisons of text elements.

In order to be able to assign a suitable lexical content to the interaction element, provision can be made for the lexical content of the interaction element to be determined by identifying a lexical content arranged on the interaction element.

This approach fails if no lexical content is arranged on the interaction element.

However, a description that characterizes the interaction element will usually be arranged in its vicinity. A graphically displayed description element will thus regularly be spatially associated with the interaction element.

It can therefore be advantageous if, in addition to or instead of identifying a lexical content arranged on the interaction element, the lexical content of the interaction element is determined by determining the distances to surrounding description elements that have a lexical content and selecting the lexical content of the nearest description element. For this purpose, provision can be made, for example, for the Euclidean distances from the interaction element to its neighboring description elements to be determined, the description element with the smallest distance then being selected. To determine the distance between two elements, the Euclidean distance between corners of the elements, for example their top-left corners, can be used. Provision can be made here for a description element that overlaps the interaction element to be selected preferentially.
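A minimal sketch of this nearest-description lookup follows. It assumes that every element is an axis-aligned rectangle given by its top-left corner, width, and height; the `Element` class and all function names are hypothetical. Overlapping description elements are preferred, and ties within each group are broken by the Euclidean distance between top-left corners.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence

@dataclass
class Element:
    x: float       # left edge
    y: float       # top edge
    width: float
    height: float
    text: str = ""  # lexical content, empty if none

def overlaps(a: Element, b: Element) -> bool:
    """True if the two rectangles share any area."""
    return (a.x < b.x + b.width and b.x < a.x + a.width
            and a.y < b.y + b.height and b.y < a.y + a.height)

def corner_distance(a: Element, b: Element) -> float:
    """Euclidean distance between the top-left corners."""
    return math.hypot(a.x - b.x, a.y - b.y)

def nearest_description(target: Element,
                        candidates: Sequence[Element]) -> Optional[Element]:
    """Pick the labelled candidate closest to target, preferring overlaps."""
    labelled = [c for c in candidates if c.text.strip()]
    if not labelled:
        return None
    # Sort key: overlapping candidates first (False < True), then by distance.
    return min(labelled,
               key=lambda c: (not overlaps(target, c), corner_distance(target, c)))

field = Element(100, 100, 200, 30)
labels = [Element(100, 70, 80, 20, "Username"),
          Element(100, 160, 80, 20, "Password")]
chosen = nearest_description(field, labels)
```

In this example neither label overlaps the input field, so the one whose corner is closer ("Username", 30 units away) is chosen over "Password" (60 units away).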

Preferably, a check is first made as to whether a lexical content is arranged on the interaction element itself. If this check is negative, the immediate surroundings can then be searched for lexical content as described.

Provision can be made for only those elements with which no interaction is possible to count as description elements. This is not mandatory, however, since a neighboring interaction element can also carry a shared description. Provision can therefore also be made for all graphically displayed neighboring elements with lexical content to be considered as description elements.

In a further advantageous embodiment of the method, it can be provided that the lexical content of the interaction element or of the description element is determined by taking it directly from a text field of a data structure describing a current state of the user interface. Such a data structure is often available via a software interface of the graphical user interface. The data structure can contain information about the interaction element or the description element in the form of data fields. The text field can in particular be a data field of the data structure that is assigned to the interaction element or the description element.

If no such data field is available, the lexical content of the interaction element or the description element can also be determined by applying a text recognition and/or image recognition method to the graphical representation of the interaction element or the description element. All common methods can be used for this. For text recognition, OCR (optical character recognition) methods can be used, for example.

It can happen that the text field of the data structure, or the text determined by text recognition, contains a character such as an icon that is not covered by the grammar. It can be provided that a method is applied which assigns lexical content to such a character, or to a graphic element determined by image recognition. For this purpose, a list can be stored, for example, that maps characters and/or graphic elements to lexical content. In such a case, the method would essentially consist of accessing the corresponding list entry.
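
A possible shape of this fallback chain, sketched under stated assumptions: the node is a plain mapping with hypothetical keys `"text"` and `"image"`, the OCR step is passed in as a caller-supplied function rather than a specific library call, and `ICON_WORDS` is an illustrative stand-in for the stored list.

```python
from typing import Callable, Mapping, Optional

# Illustrative stand-in for the stored list that maps icons to words.
ICON_WORDS = {"\U0001F50D": "search", "\u2709": "mail", "\u2699": "settings"}


def replace_icons(text: str,
                  icon_words: Mapping[str, str] = ICON_WORDS) -> str:
    """Replace each known icon character by its word (a list lookup)."""
    replaced = "".join(f" {icon_words[ch]} " if ch in icon_words else ch
                       for ch in text)
    return " ".join(replaced.split())  # normalise whitespace


def lexical_content(node: Mapping[str, object],
                    ocr: Optional[Callable[[object], str]] = None) -> str:
    # Prefer the text field of the data structure describing the GUI state.
    text = node.get("text") or ""
    # Fall back to text recognition on the rendered element if needed.
    if not text and ocr is not None:
        text = ocr(node.get("image"))
    return replace_icons(str(text))
```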

It can be provided that the interaction element and/or the description element are identified, and/or that information about these elements is obtained, by means of a data structure describing a current state of the user interface.

A text determined for an interaction element or a description element can still contain unwanted Unicode characters that carry no meaning, line breaks or the like. It can therefore be provided that, in order to determine the lexical content, a recognized text is first cleaned by removing a selection of characters or replacing them with spaces. For example, it can be provided that all special characters are replaced by spaces and all stop words are removed.

It can be provided that the recognized and preferably already cleaned character string is broken down into a list of words, which then form the determined lexical content.
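
The cleaning and splitting steps can be illustrated with a short sketch. The stop-word set is a small illustrative sample, and lower-casing the words is an additional assumption made here so that later comparisons are case-insensitive; neither is specified in the description.

```python
import re

# Illustrative sample only; a real stop-word list would be much longer.
STOP_WORDS = frozenset({"the", "a", "an", "to", "of", "and", "or", "your"})


def tokenize(raw: str, stop_words: frozenset = STOP_WORDS) -> list:
    """Clean a recognized text and break it down into a list of words."""
    # Replace every run of special characters (including line breaks
    # and underscores) by a single space.
    cleaned = re.sub(r"[\W_]+", " ", raw)
    # Split into words and drop stop words.
    return [w for w in cleaned.lower().split() if w not in stop_words]
```

For instance, a label such as "Enter your e-mail:" spread over two lines is reduced to the words "enter", "e" and "mail".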

Navigation on the user interface can be made considerably easier if the voice command includes a designation for a macro, the macro in turn comprising a sequence of text elements. The sequence of text elements for the macro is preferably stored in a text file. This has the advantage that the voice command can be much simpler and can contain more abstract instructions, while details can be stored in the macros. When the voice command is processed, the macros are preferably resolved directly, in that the sequence of text elements formed by the macro replaces the designation of the macro in the voice command. The macro can itself contain macros, which are then resolved recursively.
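
A minimal sketch of recursive macro resolution follows. It assumes the macros have already been loaded from their text file into a dictionary that maps each macro name to its sequence of text elements; the function name and the cycle check are additions made for this illustration.

```python
from typing import Dict, List, Sequence, Tuple


def expand_macros(tokens: Sequence[str],
                  macros: Dict[str, Sequence[str]],
                  _active: Tuple[str, ...] = ()) -> List[str]:
    """Replace every macro name by its text elements, recursively."""
    expanded: List[str] = []
    for token in tokens:
        if token in macros:
            # Guard against macros that (indirectly) contain themselves.
            if token in _active:
                raise ValueError(f"cyclic macro definition: {token}")
            expanded.extend(
                expand_macros(macros[token], macros, _active + (token,)))
        else:
            expanded.append(token)
    return expanded
```

With a macro `login` defined as `["enter", "username", "credentials"]` and a nested macro `credentials` defined as `["enter", "password"]`, the command `["login", "logout"]` resolves to `["enter", "username", "enter", "password", "logout"]`.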

It can be provided that, if an interaction requires an input but no such input is contained in the voice command with resolved macros, the method is paused until an input is made. It can be provided here that a visually or acoustically perceptible prompt is given to the user of the method. It can be provided that the input is made by voice control or via the usual input paths required by the user interface.

It can be provided that an interaction element that lies outside the current display of the graphical user interface is only interacted with after it has been brought into the current display by scrolling. In addition, it can be provided that an interaction element is always displayed before it is interacted with.

In an advantageous embodiment of the method, it can be provided that the interaction element is identified by selecting an interactive element of a current state of the graphical user interface. For the identification, use can be made, for example, of functionality known from conventional crawlers, which can access a data structure describing the graphical user interface.

In order to exclude interactive elements that exist but are not visible from an interaction, it can be provided that only an interactive element that has a finite extent is selected. It can also be expedient to select only an interactive element that is not completely, or not even partially, covered by other displayed elements.
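
One way to express this visibility filter is sketched below. The sketch interprets "finite extent" as a non-zero width and height and uses a hypothetical stacking index `z` to decide which element lies on top; both are assumptions of this illustration, as are all names.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Box:
    """Hypothetical bounding box with a stacking index (higher = on top)."""
    x: float
    y: float
    w: float
    h: float
    z: int = 0


def has_extent(b: Box) -> bool:
    return b.w > 0 and b.h > 0


def intersects(a: Box, b: Box) -> bool:
    return (a.x < b.x + b.w and b.x < a.x + a.w
            and a.y < b.y + b.h and b.y < a.y + a.h)


def covers(top: Box, bottom: Box) -> bool:
    """True if `top` hides `bottom` completely."""
    return (top.x <= bottom.x and top.y <= bottom.y
            and top.x + top.w >= bottom.x + bottom.w
            and top.y + top.h >= bottom.y + bottom.h)


def is_selectable(target: Box, others: Iterable[Box],
                  reject_partial_cover: bool = False) -> bool:
    if not has_extent(target):
        return False  # exists in the data structure but is not visible
    for other in others:
        if other.z <= target.z or not has_extent(other):
            continue  # only elements drawn on top can hide the target
        if covers(other, target):
            return False
        if reject_partial_cover and intersects(other, target):
            return False
    return True
```

The flag `reject_partial_cover` switches between the two variants mentioned: excluding only fully covered elements, or also partially covered ones.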

In order to avoid incorrect interactions, in a further advantageous embodiment of the method it can be provided that the interaction element is only interacted with if the determined degree of similarity exceeds a threshold value. For example, experiments have shown that a threshold value of 0 is particularly suitable when cosine similarity is used. A threshold value that is set too high should also be avoided, so that inherently good candidates are not ultimately rejected.

If the threshold value is not reached and no similarity is established in any other way, and if the threshold value is also not reached, and no similarity is otherwise established, when further degrees of similarity between text elements of the voice command and lexical contents of interaction elements are determined, it can be provided that an interaction is selected at random. Interactions that have already taken place earlier in the course of executing the method or the voice command are preferably excluded here. Such embodiments of the method ensure that even larger deviations between the voice command and the functionality of the application to which the method is applied do not represent an obstacle.
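
The threshold test and the random fallback can be sketched as follows. The vectors passed to `cosine` stand for whatever numeric representation of text elements and lexical contents is used (the description does not fix one here), and `choose_interaction` with its parameters is a hypothetical helper.

```python
import random
from math import sqrt
from typing import Mapping, Optional, Sequence, Set


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either vector is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def choose_interaction(scores: Mapping[str, float],
                       already_done: Set[str],
                       threshold: float = 0.0,
                       rng=random) -> Optional[str]:
    """Pick the best candidate above the threshold, else a random one."""
    above = {name: s for name, s in scores.items() if s > threshold}
    if above:
        return max(above, key=above.get)
    # Fallback: random choice among interactions not yet performed.
    remaining = [name for name in scores if name not in already_done]
    return rng.choice(remaining) if remaining else None
```

Passing a seeded `random.Random` instance as `rng` makes the fallback reproducible, which can be convenient when the method is used for testing.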

In a further advantageous embodiment of the method, it can be provided that degrees of similarity between text elements of the voice command and lexical contents determined for interaction elements of the graphical user interface are determined, and that the interaction element for which the determined degree of similarity is highest is interacted with first. What is decisive in this embodiment of the method is that more than one degree of similarity is determined before the interaction in question. For this purpose, degrees of similarity can be determined from one text element to several lexical contents, from several text elements to one lexical content, and from several text elements to several lexical contents.

In this description of the invention, “several” means “at least two”.

Determining several degrees of similarity and selecting a best pair can make navigation on a user interface by means of voice commands considerably more flexible.

This becomes particularly evident when it is provided that the order of the text elements of the voice command is disregarded. It can thus be provided that several text elements for which a degree of similarity is determined are treated as equal, in particular all text elements in the list containing the text elements that belong to a class describing the interaction elements. This makes it possible to successfully execute even voice commands that specify interactions in a particular order when the application to which the method is applied requires a different order of interactions, or does not provide for some of the interactions contained in the voice command at all.

If the order is neither to be ignored completely nor to be adhered to strictly, this can be achieved by introducing a weighting that lowers the degree of similarity the more, the later the text element occurs in the voice command. Alternatively or additionally, it can be provided that the order of the text elements takes precedence when the degrees of similarity are equal or differ only slightly.
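
The selection of a best pair, with and without regard to order, might look like the sketch below. The `similarity` function is supplied by the caller, and the linear `order_penalty` is only one conceivable weighting, chosen here for illustration.

```python
from typing import Callable, Optional, Sequence, Tuple


def best_pair(similarity: Callable[[str, str], float],
              text_elements: Sequence[str],
              contents: Sequence[str],
              order_penalty: float = 0.0
              ) -> Optional[Tuple[float, str, str]]:
    """Return (score, text element, lexical content) of the best pair."""
    best: Optional[Tuple[float, str, str]] = None
    for position, text in enumerate(text_elements):
        for content in contents:
            # Later text elements are lowered by position * order_penalty;
            # with order_penalty == 0 the order is disregarded.
            score = similarity(text, content) - order_penalty * position
            # Strict comparison: on a tie, the earlier text element wins.
            if best is None or score > best[0]:
                best = (score, text, content)
    return best
```

With `order_penalty=0.0` all text elements are treated as equal, so a later text element can be executed first if it matches the current state of the user interface better.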

It can be provided that at least one action is assigned to the text element of the voice command and that the at least one action assigned to the text element is compared with an action assigned to the interaction element. The degree of similarity of the text element to the interaction element is preferably determined as a function of the result of the comparison. Particularly preferably, the degree of similarity is only determined if the actions are compatible. If an interaction element only allows a click as an interaction and an input is linked to the text element, the actions are not compatible and it would be superfluous to determine a degree of similarity. Preferably, it can also be provided that the degree of similarity is weighted even when the actions are compatible with one another. An example in which lowering the degree of similarity can be expedient is described further below in the description of the method shown in FIG. 1.
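
The compatibility check can be placed in front of the similarity computation, as in this small sketch. Actions are modelled as plain strings such as "click" or "type", which is an assumption of the example, as are the function name and the optional `weight`.

```python
from typing import Callable, Optional, Set


def similarity_if_compatible(text_actions: Set[str],
                             element_actions: Set[str],
                             similarity: Callable[[], float],
                             weight: float = 1.0) -> Optional[float]:
    """Compute the similarity only if at least one action is shared."""
    if not (text_actions & element_actions):
        return None  # incompatible: skip the similarity computation
    return weight * similarity()
```

Passing the similarity as a zero-argument callable keeps a potentially costly computation from running for incompatible pairs.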

For better control of the execution of a voice command, it can be provided that it is stored whether and/or how often the interaction element has been interacted with during the execution of the method. Alternatively or additionally, it can be provided that a state model of the user interface is created and/or updated from interactions with the user interface and from the states of the user interface reached as a result of those interactions. The information collected can be used to further influence the method in an advantageous manner, as the following exemplary embodiment also shows.

It can thus be provided that the degree of similarity is lowered if it is recognized that the interaction element has already been interacted with earlier in the course of executing the method. For this purpose, a weighting factor is preferably provided, which is multiplied by the previously determined degree of similarity or subtracted from it. The degree of similarity is preferably lowered further the more often an interaction element has been interacted with. Such embodiments of the method have the advantage, among other things, that the voice command can be processed faster and more reliably, since in particular it is avoided that an action that does not lead to the goal is triggered several times or even over and over again.
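
Both variants of the weighting, multiplicative and subtractive, are sketched below. The concrete factor and penalty values are arbitrary illustrations; the description does not specify them.

```python
def damp_multiplicative(score: float, visits: int,
                        factor: float = 0.5) -> float:
    """Multiply the score by `factor` once per previous interaction."""
    return score * factor ** visits


def damp_subtractive(score: float, visits: int,
                     penalty: float = 0.25) -> float:
    """Subtract `penalty` from the score once per previous interaction."""
    return score - penalty * visits
```

Note that the multiplicative form only lowers positive scores; for a negative score, multiplying by a factor below 1 would raise it, so the subtractive form may be preferable if negative similarities can occur.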

It can also be provided that, for states of the user interface that are reached for the first time in the course of executing the method, those text elements of the voice command that require or permit an input as an interaction are executed first. It can be provided that this only takes place if a threshold value for the determined degree of similarity is exceeded.

Alternatively, it can also be provided that the determined degree of similarity is increased for such text elements.

The state of the user interface and/or the application can be saved after the method or the voice command has been executed. This can be useful, for example, for evaluation or testing purposes.

To achieve the aforementioned object, the invention also proposes using a method for interacting with a graphical user interface to execute a function of an application that uses the user interface as an interface, the method being designed according to the invention, in particular as described above or according to one of the claims directed to a method for interacting with a graphical user interface. The voice command is preferably provided by the user, for example by way of voice input.

To achieve the aforementioned object, the invention also proposes the features of the independent claim directed to a method for testing an application. In particular, in a method for testing an application that uses a user interface as an interface, it is thus proposed according to the invention that at least one voice command is specified, that for each of the at least one voice command a method for interacting with a graphical user interface is carried out, this method being designed according to the invention, in particular as described above or according to one of the claims directed to a method for interacting with a graphical user interface, and that a state model of the user interface is generated and/or updated and stored in retrievable form, the state model comprising the interactions carried out and the states of the user interface reached as a result.

Several or a large number of voice commands are preferably specified. The voice commands preferably test different functions of the application.

The state model can subsequently be used to evaluate the test result. For example, it can be determined which functions the tested application has.
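
A state model of this kind can be kept as a simple transition table, as in the following sketch. The class and its methods are hypothetical; states and interactions are represented by strings here, whereas a real implementation would need some way of recognizing that two observed screens are the same state.

```python
from collections import Counter
from typing import Dict, Set, Tuple


class StateModel:
    """Records interactions and the user-interface states they lead to."""

    def __init__(self) -> None:
        # (source state, interaction) -> state reached by the interaction
        self.transitions: Dict[Tuple[str, str], str] = {}
        # how often each interaction was carried out in each state
        self.interaction_counts: Counter = Counter()

    def record(self, source: str, interaction: str, target: str) -> None:
        self.transitions[(source, interaction)] = target
        self.interaction_counts[(source, interaction)] += 1

    def states(self) -> Set[str]:
        found: Set[str] = set()
        for (source, _), target in self.transitions.items():
            found.update((source, target))
        return found

    def reached_before(self, state: str) -> bool:
        return state in self.states()
```

After a test run, `states()` gives a rough picture of which parts of the application were reached, and `interaction_counts` can feed the lowering of the degree of similarity for interactions that were already carried out.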

The method steps mentioned above are each preferably carried out with the aid of a computer. Preferably, all method steps are carried out with the aid of a computer. However, it is preferably provided that at least the voice input takes place by a user speaking a voice command. Alternatively, the voice command can also be read from a file with the aid of a computer.

To achieve the aforementioned object, the invention also proposes the features of the independent claim directed to a technical device. In particular, it is thus proposed according to the invention that the technical device is set up to carry out a method for interacting with a graphical user interface and/or a method for testing an application that uses a user interface, for example the aforementioned user interface, as an interface, the respective method being designed according to the invention, in particular as described above or according to one of the claims directed to a corresponding method.

To achieve the aforementioned object, the invention also proposes an arrangement with a technical device and with a technical facility on which an application using the user interface as an interface is stored ready for execution and access. The technical device is designed according to the invention, in particular as described above or below. Furthermore, at least one interface is provided that enables data to be transmitted from the technical device to the technical facility. The at least one interface preferably also enables data to be transmitted from the technical facility to the technical device.

The interfaces can in particular be hardware interfaces or software interfaces.

The technical device and/or the technical facility is preferably an electronic device such as a computer or a smartphone. They can be the same device or separate devices.

Input means for entering a voice command are preferably provided on the technical device and/or on the technical facility.

The technical device preferably has at least one data memory and one data processing unit. A computer program that is executed to carry out the methods described above can be stored in the data memory.

The technical facility preferably has a display on which the elements of the graphical user interface that are to be shown can be displayed.

A sensor, preferably an optical sensor such as a video camera, can be provided with which what is shown on the display can be captured. The sensor can be connected to the technical device in order to transmit the captured data to it.

The technical device and the technical facility can share resources, such as a shared data memory or a shared processor.

The invention will now be described in more detail with reference to a few exemplary embodiments, but is not limited to these few exemplary embodiments. Further exemplary embodiments result from combining the features of one or more claims with one another and/or with one or more features of the exemplary embodiments.

The figures show:

FIG. 1 a flow chart of an exemplary embodiment of a method according to the invention for interacting with a graphical user interface,
FIG. 2 a flow chart of an exemplary embodiment of a method for testing an application that has a user interface,
FIG. 3 a display of a graphical user interface that is in a specific state,
FIGS. 4 to 6 each an exemplary embodiment of an arrangement according to the invention.

In the following description of the invention, elements that have the same function are given the same reference numerals, even if their design or shape differs.

FIG. 1 shows an exemplary embodiment of a method 100 according to the invention.

In method step 101, the method is initialized by providing a voice command that is to be applied to an application with a graphical user interface. The application can be an online shop, for example. The voice command can be, for example, "Sign me in".

A macro can be stored for the label "Sign me in", which reads, for example, as follows:

For Sign me in Do
Click Sign in
And Write Max Mustermann In Username
And Write 123456 In Password

The voice command can be provided by voice input. To do this, the user speaks the words "Sign me in".

By means of a microphone and a speech recognition method, the voice input is converted into the voice command, that is, into a sequence of words.

In step 102, the voice command is then broken down into its text elements using a grammar and a parser, and the text elements are divided into classes.

The text element "Sign me in" belongs to a first class. The first class describes a macro. The parser recognizes this from a list of labels that includes the text element "Sign me in". To execute the voice command "Sign me in", the macro "Sign me in" is then executed. As a result, the label of the macro is replaced by its content.

The text elements "For", "Do" and "And" belong to a second class. The second class describes words that control the voice command. The words "For" and "Do" indicate that the text element between them is the label of a macro. The word "And" means, for example, that in addition to the interaction stated before the "And", a further interaction takes place, which is described after the "And".

The text element "Click" and the word combination "Write" and "In" belong to a third class. The third class relates to the type of interaction. For example, the word "Click" can be interpreted to mean that the interaction element "Sign in" is to be clicked. This can be a short click or a long click. The text element consisting of the word combination "Write" and "In" means that the text element between the words "Write" and "In" is to be entered as text into the input that belongs to the interaction element named after the word "In".

The text elements "Sign in", "Username" and "Password" belong to a fourth class. The fourth class describes words that denote interaction elements. As explained in more detail below, the application does not need to use exactly these terms. It is sufficient that there is a semantic similarity.

A list is then created that contains the text elements of the fourth class in the order in which they occur. Each element of the list is linked with all actions that are compatible with the type of interaction taken from the text elements of the third class. Only actions that are possible for compatible interaction elements are assigned. To return to the previous example, the actions "click" and "long-click" are assigned to the word "Click" if the graphical user interface permits such actions.
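Step 102 could be sketched as follows. This is a minimal illustration under stated assumptions, not the grammar or parser of the embodiment: the names MACROS, ACTIONS and parse_command are hypothetical, the action name "set-text" is invented for text input (the description only names "click" and "long-click"), and simple regular expressions stand in for a real grammar.

```python
import re

# Hypothetical macro table (class 1): macro label -> expanded command text.
MACROS = {
    "Sign me in": "Click Sign in And Write Max Mustermann In Username "
                  "And Write 123456 In Password",
}

# Hypothetical mapping from class-3 keywords to GUI actions.
# "set-text" is an assumed action name; it does not appear in the description.
ACTIONS = {"Click": ["click", "long-click"], "Write": ["set-text"]}


def parse_command(command):
    """Return the ordered list of (target, actions, input_value) entries."""
    # Class 1: replace a macro label by the content of the macro.
    text = MACROS.get(command.strip(), command)
    entries = []
    # Class 2: the control word "And" separates individual interactions.
    for part in re.split(r"\s+And\s+", text.strip()):
        write = re.fullmatch(r"Write\s+(.+?)\s+In\s+(.+)", part)
        click = re.fullmatch(r"Click\s+(.+)", part)
        if write:
            # Class 3 "Write ... In ..." with a class-4 target after "In".
            entries.append((write.group(2), ACTIONS["Write"], write.group(1)))
        elif click:
            # Class 3 "Click" followed by a class-4 target.
            entries.append((click.group(1), ACTIONS["Click"], None))
        else:
            raise ValueError(f"unrecognized instruction: {part!r}")
    return entries


entries = parse_command("Sign me in")
```

Each entry keeps the class-4 target word together with its candidate actions and, for text input, the value to be entered, so that later steps can consult all three. A value that itself contains the word "And" would be split incorrectly by this simplified sketch.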

In step 103, the current state of the graphical user interface is analyzed. As a rule, the operating system on which the application is running provides an interface by means of which the current state of the graphical user interface can be analyzed. For example, Android, or web pages on desktop PCs, offer access to a structure that hierarchically lists all graphical elements of the current state of the graphical user interface and provides some basic information about these elements, such as their type, location, horizontal extent and vertical level. It is also possible to tell whether the elements currently appear on the display or whether they can only be reached by scrolling the display.

From the structure it is therefore possible to derive, in particular, interactive elements and the ways of interacting with them, such as clicking, entering text or selecting from a drop-down menu, as well as any label that is directly linked to the interaction element and graphically displayed on it. A way of interacting with the interaction elements is also provided, either by using a peripheral device or by a programmed instruction that replaces it.

Non-interactive elements, such as purely descriptive elements, can also be derived, for which properties such as location, horizontal extent and vertical level can likewise be stored. The description elements can in particular have a text-based or an image-based description (such as an icon).

In the exemplary embodiment described here, step 103 specifically determines the interaction elements and the description elements of the current state of the graphical user interface, together with all available information about these elements that is required for the rest of the method.

In step 104, the respective label of each current interaction element is determined. This is easy if a label is directly linked to the interaction element, since the label can then be taken from the information available for the interaction element. This is the case, for example, in FIG. 3 for the interaction element with reference numeral 24, on which the label "Einloggen" ("Log in") is graphically displayed.

Frequently, however, a graphically displayed label is not directly linked to the interaction element; instead, a non-interactive description element is located near the interaction element. In FIG. 3 this applies, for example, to the interaction elements 22 and 23, to which the description elements 25 ("Username") and 26 ("Passwort", that is, "Password") are respectively assigned.

In order to assign the correct description element to these interaction elements, the distance between each surrounding description element and the respective interaction element can be determined, for example as the distance between their upper left corners. The description element with the smallest distance to the interaction element can then, for example, be assigned to it. If there are overlapping elements, the overlapping element with the smallest distance can be selected; otherwise, the non-overlapping element with the smallest distance is selected.
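The assignment rule described above can be illustrated with a short sketch. It assumes axis-aligned bounding boxes and a Euclidean distance between upper left corners; the description does not fix a particular distance measure, and the names Box, overlaps, corner_distance and nearest_description as well as the coordinates are hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box of a GUI element (assumed representation)."""
    name: str
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def overlaps(a, b):
    """True if the two boxes share any area."""
    return (a.x < b.x + b.w and b.x < a.x + a.w
            and a.y < b.y + b.h and b.y < a.y + a.h)


def corner_distance(a, b):
    """Euclidean distance between the upper left corners (an assumption)."""
    return hypot(a.x - b.x, a.y - b.y)


def nearest_description(element, descriptions):
    """Prefer overlapping description elements, then take the closest one."""
    if not descriptions:
        return None
    overlapping = [d for d in descriptions if overlaps(element, d)]
    pool = overlapping or descriptions
    return min(pool, key=lambda d: corner_distance(element, d))


# Invented layout loosely following FIG. 3: labels to the left of the fields.
field_22 = Box("input 22", 100, 50, 200, 30)
field_23 = Box("input 23", 100, 100, 200, 30)
label_25 = Box("Username", 20, 50, 70, 30)
label_26 = Box("Passwort", 20, 100, 70, 30)
labels = [label_25, label_26]
```

With this layout, nearest_description(field_22, labels) yields the "Username" element and nearest_description(field_23, labels) yields the "Passwort" element.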

The lexical content is then extracted from the description elements. It can be linked directly to the description element in the hierarchical structure. If it is not, for example because an image file is stored, the lexical content can be extracted, for example, by means of a text recognition method (e.g. OCR software) or, in the case of icons, by means of a method that assigns a lexical content to the icon.

As already described above, to finally determine the lexical content, the recognized text can additionally be cleaned up and broken down into words.

As a result, step 104 determines, for each interaction element, a lexical content that designates it.

In step 105, a semantic degree of similarity is then determined between the words of the list created in step 102 and the lexical contents determined in step 104. The dashed lines indicate that data determined in steps 102 and 104 are processed in step 105. Possible methods for determining the similarity have already been mentioned above. If, for example, the list has N words and there are M interaction elements with associated lexical content, up to N*M degrees of similarity are determined in total.

To reduce the number of calculations required, it can be provided that degrees of similarity are determined only where the actions are compatible with one another. This can be done by checking whether the actions assigned to a text element of the list include an action that can be carried out on the interaction element. If no such action is possible, no similarity calculation of the semantic content is needed, since an interaction with the interaction element is ruled out.
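The pruning of the up to N*M comparisons can be sketched as follows. The similarity measure is passed in as a function, because this section does not specify it; toy_similarity and the scores in TOY_SCORES are invented placeholder values for illustration only, and similarity_table and "set-text" are hypothetical names.

```python
def similarity_table(entries, elements, similarity):
    """Compute similarities only for action-compatible pairs.

    entries:  list of (word, wanted_actions) from step 102
    elements: list of (lexical_content, supported_actions) from step 104
    Returns a dict mapping (entry_index, element_index) -> similarity.
    """
    table = {}
    for i, (word, wanted) in enumerate(entries):
        for j, (label, supported) in enumerate(elements):
            if set(wanted).isdisjoint(supported):
                continue  # no executable action: skip the similarity calculation
            table[(i, j)] = similarity(word, label)
    return table


# Invented placeholder scores; a real system would use a semantic model.
TOY_SCORES = {
    ("Sign in", "Einloggen"): 0.8,
    ("Username", "Username"): 1.0,
    ("Username", "Passwort"): 0.2,
    ("Password", "Username"): 0.2,
    ("Password", "Passwort"): 0.9,
}


def toy_similarity(word, label):
    return TOY_SCORES.get((word, label), 0.0)


entries = [("Sign in", ["click", "long-click"]),
           ("Username", ["set-text"]),
           ("Password", ["set-text"])]
elements = [("Username", ["set-text"]),
            ("Passwort", ["set-text"]),
            ("Einloggen", ["click"])]
table = similarity_table(entries, elements, toy_similarity)
```

In this example N = 3 and M = 3, but only five of the nine possible pairs are evaluated, because a click entry is never compared with a pure text field and a text entry is never compared with the button.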

In step 106, the calculated degree of similarity is modified by means of a weighting. If the degree of similarity is expressed as a real number, the weighting can be applied multiplicatively. Alternatively, subtraction of a number can be provided.

Firstly, it can be provided that certain actions reduce the calculated degree of similarity. If, for example, a word in the list contains an optional data element, a click action can be assigned to it, with the degree of similarity being lowered. This makes it possible, for example, to deal adequately with drop-down menus that reveal an editable input only once they have been clicked.

Secondly, it can be provided that the calculated degree of similarity is reduced if it is established that the interaction element has already been interacted with earlier during execution of the method and the voice command. It can be provided that the degree of similarity is reduced further and further as the number of interactions increases. In this way, undesired interaction chains can be prevented from repeating, and new interaction paths can be reached. To make this possible, it can be provided that the states reached during execution of the voice command and the interactions carried out are stored.
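A minimal sketch of the multiplicative weighting in step 106 is given below. The function name weighted_score, the decay factor 0.5 and the parameter action_factor are assumptions chosen for illustration; the description only states that the weighting can be multiplicative (or subtractive) and that the reduction grows with the number of earlier interactions.

```python
def weighted_score(score, times_interacted, decay=0.5, action_factor=1.0):
    """Reduce a similarity score multiplicatively (assumed scheme).

    score:            similarity from step 105
    times_interacted: how often the element was already interacted with
    decay:            assumed factor applied once per earlier interaction
    action_factor:    assumed factor below 1.0 for actions that lower the score,
                      e.g. a click assigned to an entry with optional data
    """
    return score * action_factor * decay ** times_interacted


fresh = weighted_score(0.8, times_interacted=0)
repeated = weighted_score(0.8, times_interacted=2)
fallback_click = weighted_score(0.8, times_interacted=0, action_factor=0.5)
```

Because each earlier interaction multiplies the score by the same factor below 1, an element that has been used repeatedly falls behind elements that have not yet been tried, which matches the stated aim of avoiding repeated interaction chains.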

In step 107, a suitable candidate for the next interaction is then selected. For this purpose, the interaction element is selected for which the highest degree of similarity has been determined, after any reduction made in step 106.

If the highest degree of similarity determined is below a threshold value, the exemplary embodiment described here provides that an arbitrary interaction element is selected, restricted to elements that have not yet been interacted with.
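The selection rule of step 107, including the fallback below the threshold, could look like the following sketch. The function name select_element, the threshold value 0.5 and the use of a random choice for the "arbitrary" element are assumptions; the description does not state a threshold value or how the arbitrary element is picked.

```python
import random


def select_element(scores, elements, visited, threshold=0.5, rng=random):
    """Pick the next interaction element.

    scores:   dict element_id -> weighted similarity (after step 106)
    elements: all interaction element ids of the current state
    visited:  ids that have already been interacted with
    """
    if scores:
        best = max(scores, key=scores.get)
        if scores[best] >= threshold:
            return best  # highest weighted similarity wins
    # Below the threshold: fall back to an element not yet interacted with.
    unvisited = [e for e in elements if e not in visited]
    return rng.choice(unvisited) if unvisited else None


confident = select_element({"a": 0.9, "b": 0.3}, ["a", "b", "c"], set())
fallback = select_element({"a": 0.2, "b": 0.3}, ["a", "b", "c"], {"a", "b"})
```

Passing the random number generator as a parameter keeps the fallback reproducible in tests, for example by supplying random.Random(0).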

In step 108, the interaction element is then interacted with. If an assignment exists, this is done by applying an action that is assigned to the corresponding text element from the list. If an input is required, the content to be entered has already been linked to the corresponding action in step 102. If several actions have equal effective degrees of similarity, one of them is selected.

In step 109, a termination criterion is then checked. For example, it can be checked whether a maximum time has been reached and/or whether a desired target state has been reached. For example, the method can be ended when, in the example discussed here, the login has been completed successfully.

It can also be provided that the selected text element is removed from the list after an interaction. In this case, the method can furthermore be ended when the list no longer contains any elements.

If the termination criterion is met, the method is ended in step 110. In particular when the method is used for testing an application, it can be useful to store the states reached during execution of the voice command and the interactions carried out.

If the termination criterion is not met, the method continues at step 103.
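The loop over steps 103 to 109 with its termination criteria can be summarized in a sketch. The names run_until_done, step and goal_reached are hypothetical; step stands for one pass through steps 103 to 108 and is assumed to return the list entry it consumed, or None if a fallback interaction was carried out.

```python
import time


def run_until_done(entries, step, goal_reached, max_seconds=30.0):
    """Repeat steps 103 to 108 until a termination criterion holds (step 109)."""
    remaining = list(entries)
    log = []  # stored interactions, useful when testing an application
    deadline = time.monotonic() + max_seconds
    # Criteria: list exhausted, target state reached, or maximum time elapsed.
    while remaining and not goal_reached() and time.monotonic() < deadline:
        consumed = step(remaining)  # one interaction with the GUI
        log.append(consumed)
        if consumed in remaining:
            remaining.remove(consumed)  # strike the entry from the list
    return log  # step 110: the method ends


# Stand-in callbacks for illustration: consume entries in order, no GUI.
log = run_until_done(["Sign in", "Username", "Password"],
                     step=lambda remaining: remaining[0],
                     goal_reached=lambda: False)
```

The time limit also bounds the loop when only fallback interactions occur and no list entry is ever consumed.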

FIG. 2 shows an exemplary embodiment of a method 200 for testing an application that has a graphical user interface.

To carry out the method, a plurality of voice commands is first provided in step 201. This can be done by voice input or by providing a text file that contains the voice commands.

In step 202, it is then checked whether all voice commands have already been processed or whether a voice command is still unprocessed.

If a voice command is still unprocessed, the method 100 shown in FIG. 1 is executed in step 203. The result of method 100 is added to a state model, which is initially empty and already exists after the first iteration; the state model comprises the interactions carried out for the respective voice command and the states of the graphical user interface reached as a result.

The method then continues at step 202 with the remaining voice commands.

If no further voice command is to be executed, the method 200 is ended in step 204 by storing the state model that has been created so that it can be retrieved for further analysis.
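Method 200 can be sketched as a loop that feeds each voice command to method 100 and accumulates a state model. The names explore_application, run_command and fake_run, the record layout and the JSON serialization are assumptions for illustration; the description does not prescribe a storage format for the state model.

```python
import json


def explore_application(commands, run_command):
    """Run every voice command and collect the results in a state model.

    run_command stands for method 100 and is assumed to return a list of
    (interaction, resulting_state) pairs for one voice command.
    """
    state_model = []            # initially empty
    pending = list(commands)    # step 201: voice commands provided
    while pending:              # step 202: any command still unprocessed?
        command = pending.pop(0)
        trace = run_command(command)  # step 203: execute method 100
        state_model.append({"command": command, "trace": trace})
    return state_model          # step 204: model is ready to be stored


def fake_run(command):
    """Stand-in for method 100 that reports a single invented interaction."""
    return [("click", f"state after {command}")]


model = explore_application(["Sign me in", "Open cart"], fake_run)
stored = json.dumps(model, indent=2)  # retrievable form for later analysis
```

The serialized string could be written to a file so that the recorded interactions and states remain available for further analysis.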

FIG. 3 shows a display 14 of a graphical user interface that is in a current state. In the state shown, a total of three interactive interaction elements 22, 23 and 24 and two non-interactive description elements 25 and 26 are displayed. To log in, the user name and password must first be entered in fields 22 and 23 before the login can be carried out via button 24. The lexical content "Einloggen" ("Log in") is arranged on the interaction element 24. For the two input fields 22 and 23, the lexical content assigned to them, "Username" and "Passwort" ("Password"), is located on the respectively nearest description fields 25 and 26. It is described in more detail elsewhere that, and in what way, a login is performed by means of the voice command "Sign me in" through interaction with the user interface. Despite the different order of the text elements in the voice command and despite considerable syntactic deviations from the text fields, the voice command can be executed successfully and the user is successfully logged in.

The methods 100, 200 described above can be carried out, for example, with the arrangements 12 or with the technical devices 1 that are designed according to FIGS. 4 to 6 and are described below. For this purpose, the method steps are carried out automatically by means of the technical device 1, which can be, for example, a server or a smartphone.

The arrangement 12 comprises the technical device 1 and the technical facility 2.

In FIGS. 4 and 5, the device 1 has resources that are separate from the facility 2. FIG. 6 shows an example in which the technical device 1 and the technical facility 2 share resources such as the processor 15 and the data memory 16.

For example, the technical facility 2 can be an electronic device 13 such as a computer or a mobile device, preferably a smartphone. The electronic device 13 can at the same time also form the technical device 1, as in FIG. 6. However, as in FIG. 4 or 5, the technical device 1 can also be a separate device such as a server.

The technical facility 2 and/or the technical device 1 have a display 14 on which graphical elements of the graphical user interface can be displayed in a current state.

A software application that uses the user interface as an interface is stored on the technical facility 2, ready for execution and access.

A computer program is stored on the technical device 1, ready for execution and access.

The technical facility 2 has a data input interface 6 via which data can be received.

Via the data input interface 6, the technical facility 2 can receive data from the data output interface 3 of the technical device 1. The received data may include, for example, specific actions to be performed by interaction elements of the user interface.

The technical device 1 has a data input interface 9 via which data can be received. For this purpose, as in FIG. 4, the data input interface 9 can be connected to a data output interface 17 of the technical facility 2 via a wireless and/or wired data line 10, such as an Internet connection.

In FIGS. 4 and 6, lexical contents of elements of the user interface are determined via a data structure or other memory contents stored in the form of data. In FIG. 5, the graphical representation of the user interface currently shown on the display 14 is recorded by a sensor 8, such as a video camera, and the recordings are transmitted to the technical device 1.
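The determination of lexical content from a data structure can be illustrated with a minimal Python sketch. It assumes a hypothetical flat list of `UiNode` records with screen coordinates; the names `UiNode` and `lexical_content` are illustrative and not taken from the application. A text arranged on the element itself is used if present; otherwise the text of the closest description element is selected, using Euclidean distance as an assumed distance measure.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class UiNode:
    """Hypothetical record for one element of the current UI state."""
    x: float
    y: float
    text: str = ""
    interactive: bool = False


def lexical_content(element: UiNode, nodes: List[UiNode]) -> Optional[str]:
    """Return the lexical content assigned to an interaction element."""
    # Case 1: the text is arranged directly on the element.
    if element.text:
        return element.text
    # Case 2: fall back to the closest non-interactive description element.
    descriptions = [n for n in nodes if not n.interactive and n.text]
    if not descriptions:
        return None
    nearest = min(
        descriptions,
        key=lambda n: math.hypot(n.x - element.x, n.y - element.y),
    )
    return nearest.text
```

For an unlabeled input field placed next to a "Username" label, this sketch would assign "Username" to the field. In the camera-based variant of FIG. 5, the `text` values would first have to be obtained by text recognition, which is not shown here.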

The technical device 1 and the technical facility 2 each have at least one data memory 16 and one processor 15. For reasons of clarity, these are not explicitly drawn in FIGS. 4 and 5.

In FIGS. 4 and 5, the interfaces 3, 6, 9 and 17 are designed as hardware interfaces; in FIG. 6, they are designed as software interfaces.

In summary, the invention relates to a device 1 and a method for interacting with a graphical user interface and for testing an application. A voice command is provided, from which a text element is identified. A lexical content is determined for an interactive element 22, 23, 24 of the user interface. A semantic degree of similarity of the text element to the lexical content is determined, and the interactive element 22, 23, 24 is then interacted with depending on the degree of similarity.
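The selection step summarized above can be sketched in Python as follows. This is an illustration under stated assumptions, not the implementation of the application: the semantic degree of similarity is approximated by cosine similarity over a tiny hand-written vector table (`TOY_VECTORS`), where a real system would use a trained language model. The names `InteractionElement`, `similarity` and `select_element` and the threshold value 0.8 are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical toy word vectors standing in for a trained embedding model.
TOY_VECTORS = {
    "login": (0.9, 0.1, 0.0),
    "search": (0.1, 0.9, 0.0),
    "find": (0.2, 0.8, 0.1),
    "cart": (0.0, 0.1, 0.9),
}


@dataclass
class InteractionElement:
    element_id: int
    lexical_content: str


def cosine(a, b) -> float:
    """Cosine similarity of two equally long vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def similarity(text_element: str, lexical_content: str) -> float:
    """Assumed similarity measure: cosine of toy vectors, exact match otherwise."""
    a, b = text_element.lower(), lexical_content.lower()
    if a not in TOY_VECTORS or b not in TOY_VECTORS:
        return 1.0 if a == b else 0.0
    return cosine(TOY_VECTORS[a], TOY_VECTORS[b])


def select_element(
    text_element: str,
    elements: List[InteractionElement],
    threshold: float = 0.8,
) -> Optional[InteractionElement]:
    """Pick the most similar element, but only if it exceeds the threshold."""
    best, best_score = None, 0.0
    for element in elements:
        score = similarity(text_element, element.lexical_content)
        if score > best_score:
            best, best_score = element, score
    return best if best_score > threshold else None
```

With elements labeled "Login", "Search" and "Cart", the text element "find" would select the "Search" element in this sketch, while a word unknown to the table selects nothing. Returning `None` below the threshold reflects the variant in which an interaction only takes place if the degree of similarity exceeds a threshold value.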

List of reference signs

1: Technical device
2: Technical facility
3: Data output interface of 1
6: Data input interface of 2
8: Sensor
9: Data input interface of 1
10: Data line
12: Test arrangement
13: Electronic device
14: Display of a graphical user interface
15: Processor
16: Data memory
17: Data output interface of 2
22: Interaction element
23: Further interaction element
24: Further interaction element
25: Description element
26: Further description element

Claims

A method for interacting with a graphical user interface, the graphical user interface having an interaction element, characterized in that a voice command is provided from which a text element is identified, that lexical content for the interaction element is determined with computer assistance, that a semantic degree of similarity of the text element to the lexical content is determined with the aid of a computer and that the interaction element is interacted with depending on the degree of similarity.

Method according to the preceding claim, characterized in that the lexical content for the interaction element is determined by identifying a lexical content arranged on the interaction element and/or by determining distances from surrounding description elements with lexical content and selecting the lexical content of the closest description element.

Method according to one of the preceding claims, characterized in that the lexical content of the interaction element or the description element is determined by taking it directly from a text field of a data structure describing a current state of the user interface and/or by applying a method of text recognition and/or image recognition to a graphic representation of the interaction element or the description element, in particular wherein, in the case of an initially identified graphic element and/or an initially identified character that is not covered by the grammar, a method is used that assigns a lexical content to the graphic element and/or the character.

Method according to one of the preceding claims, characterized in that the voice command is provided by a spoken input being converted into a sequence of text elements and/or that the voice command is provided as a sequence of text elements, in particular by a keyboard input or by a text file.

Method according to one of the preceding claims, characterized in that the voice command is broken down into a sequence of text elements with the aid of a computer, preferably by means of a parser.

Method according to one of the preceding claims, characterized in that the voice command includes a designation for a macro, the macro for its part including a sequence of text elements preferably stored in a text file.

Method according to any one of the preceding claims, characterized in that the interaction element is identified by selecting an interactive element of a current state of the graphical user interface, in particular wherein the interactive element has a finite extent and/or is unobscured.

Method according to one of the preceding claims, characterized in that the interaction element is only interacted with if the ascertained degree of similarity exceeds a threshold value.

Method according to one of the preceding claims, characterized in that semantic degrees of similarity of one or more than one text element of the voice command to one or more than one lexical content, which is determined for one or more than one corresponding interaction element, are determined and that the interaction element for which the determined degree of similarity is highest is interacted with first.

Method according to one of the preceding claims, characterized in that at least one action is assigned to the text element of the voice command and that the at least one action assigned to the text element is compared with an action assigned to the interaction element, in particular wherein the degree of similarity of the text element to the interaction element is determined depending on the result of the comparison.

Method according to one of the preceding claims, characterized in that it is stored whether and/or how often the interaction element has been interacted with during the execution of the method and/or that from interactions with the user interface and from the states of the user interface achieved as a result of the interactions a state model of the user interface is created and/or updated.

Method according to one of the preceding claims, characterized in that the degree of similarity is reduced, preferably by means of a weighting factor, if it is recognized that the interaction element has already been interacted with previously during the execution of the method.

Method according to one of the preceding claims, characterized in that in the case of states of the user interface which are reached for the first time during the execution of the method, those text elements of the voice command which require or enable an input as an interaction are executed first.

Use of the method according to one of the preceding claims for executing a function of an application using the user interface as an interface, in particular with the user providing the voice command.

Method for testing an application using a user interface as an interface, characterized in that at least one voice command is specified, that the method according to one of the preceding claims is carried out for each of the at least one voice command and that a state model of the user interface is generated and/or updated and is stored in a retrievable manner, the state model comprising the interactions carried out and the states of the user interface achieved as a result.

Technical device (1), characterized in that the technical device (1) is set up to carry out the method for interacting with a graphical user interface according to one of claims 1 to 13 and/or the method for testing an application using a user interface as an interface according to claim 15, in particular with input means being designed for inputting a voice command.

Arrangement (12) with a technical device (1) according to the preceding claim and with a technical facility (2) on which an application using the user interface as an interface is stored ready for execution and access, with at least one interface being formed for the transmission of data from the technical device (1) to the technical facility (2).
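The state model and the weighting of already visited interaction elements referred to in the claims can be illustrated with a short Python sketch. It is only one possible reading: the class name `StateModel`, the string state identifiers and the weighting factor of 0.5 are assumptions made for illustration.

```python
class StateModel:
    """Hypothetical state model: records interactions and resulting states."""

    def __init__(self, revisit_weight: float = 0.5):
        # Assumed weighting factor applied to elements already interacted with.
        self.revisit_weight = revisit_weight
        # (state, element_id) -> number of interactions performed so far
        self.interaction_counts = {}
        # (state, element_id) -> state reached as a result of the interaction
        self.transitions = {}

    def weighted(self, state: str, element_id: int, similarity: float) -> float:
        """Reduce the degree of similarity for previously used elements."""
        if self.interaction_counts.get((state, element_id), 0) > 0:
            return similarity * self.revisit_weight
        return similarity

    def record(self, state: str, element_id: int, next_state: str) -> None:
        """Store an interaction and the state of the user interface it led to."""
        key = (state, element_id)
        self.interaction_counts[key] = self.interaction_counts.get(key, 0) + 1
        self.transitions[key] = next_state
```

In a test run, `record` would be called after every interaction, and `weighted` would be applied to each degree of similarity before the most similar interaction element is chosen, so that elements not yet exercised are favored.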