DE10045080C2

DE10045080C2 - Method and device for recording and executing queries from data networks

Info

Publication number: DE10045080C2
Application number: DE10045080A
Authority: DE
Inventors: Ganesh Puri Sebastian Jung
Original assignee: Individual
Current assignee: Individual
Priority date: 2000-09-12
Filing date: 2000-09-12
Publication date: 2003-03-20
Anticipated expiration: 2020-09-13
Also published as: DE10045080A1

Description

The present invention relates to a method and a device for recording and executing queries from data networks. In particular, the invention relates to the recording and execution of queries from the Internet.

With the spread of the Internet, a form of information gathering has become established in which information can be retrieved from a remote server computer onto a personal computer via the Internet. A large number of data pages can be stored on the server computer. To access the information on the data pages, the user uses a browser. The browser allows information to be queried from specific data pages, as well as navigation within a data page and between different data pages. To reach a data page, the browser is given the start address of the data page, a so-called URL (Uniform Resource Locator), whereupon the data page or parts of it are downloaded to the personal computer and displayed in the browser. Navigation within a data page or between different data pages is made possible by so-called tags. A tag can be, for example, a link, a form, or a frame. A link is an image or a piece of text which, when selected (e.g. by a mouse click), causes a further data page to be requested from the server computer. A form, when selected, offers the user form fields that must be filled in before the request is submitted, in order to specify the request further. The completed form can then in turn be used to retrieve specific information from the Internet. Frames provide a further structuring of a data page; they make it possible to navigate within a frame without necessarily affecting the parts of the data page outside the frame.

Data pages can have either a static or a dynamic structure. Static data pages are held on the server computer and sent to the personal computer at the user's request. Dynamic data pages are first assembled on the basis of the information contained in the user's request; processing the request and assembling the dynamic data page can involve various processing steps, such as database accesses or arithmetic operations. The results of these processing steps are then assembled into a dynamic data page and sent to the user's computer.

The data pages are displayed to the user by means of the browser. The browser is software that reads the structure of the data pages and, based on that structure, produces a formatted output of the information for the user. The structure of the data pages is tree-shaped: the tree consists of nodes, each of which, except for the root node, has exactly one parent node and several ordered child nodes.
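The tree structure described above can be illustrated with a minimal Python sketch. The names `Node`, `TreeBuilder`, and `parse_page` are hypothetical and do not come from the patent; the sketch assumes well-formed markup in which every opening tag has a matching closing tag, and it makes no attempt to handle void elements such as `<br>`.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser
from typing import Dict, List, Optional


@dataclass
class Node:
    """One node of the page tree: a type, at most one parent, ordered children."""
    kind: str
    # The parent link is excluded from repr/compare to avoid cycles.
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)
    children: List["Node"] = field(default_factory=list)
    attrs: Dict[str, Optional[str]] = field(default_factory=dict)
    text: str = ""


class TreeBuilder(HTMLParser):
    """Builds a Node tree from well-formed markup (illustrative assumption)."""

    def __init__(self) -> None:
        super().__init__()
        self.root = Node("#root")   # the only node without a parent
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, parent=self.current, attrs=dict(attrs))
        self.current.children.append(node)   # children keep document order
        self.current = node

    def handle_endtag(self, tag):
        if self.current.parent is not None:
            self.current = self.current.parent

    def handle_data(self, data):
        self.current.text += data.strip()


def parse_page(markup: str) -> Node:
    builder = TreeBuilder()
    builder.feed(markup)
    builder.close()
    return builder.root
```

For example, `parse_page("<html><body><p>a</p></body></html>")` yields a root whose single child is the `html` node, whose single child is `body`, and so on.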

On the Internet, Internet browsers are used to transfer the data pages. The transmission protocol used is usually HTTP (Hypertext Transfer Protocol). The Internet browser is able to receive, process, and display data pages in HTML or XML format, and to send data pages, documents, or data in this format. The tree structure of the usual HTML and XML data pages is described in the Document Object Model (DOM) of the W3 Consortium (WWW.W3C.org).

Users often wish to carry out certain requests repeatedly. Until now this has been made easier by storing the repeated request for a data page as a bookmark, which allows faster retrieval by calling up the bookmark, e.g. via a button or a pop-up menu. When the bookmark is executed, the request for the data page stored under the bookmark is carried out.

Internet browsers are also known which, on repeated access to a page whose query requires form fields to be filled in, cache the values previously entered for the form fields and offer them to the user for selection or as default values.

So-called meta-engines, which automatically accumulate the results of different pages, are also known. A request from the user is first sent to several different pages. The results from the different pages are then combined and displayed to the user.

European patent application EP 0945811 A1 discloses an information device with an automatic Internet reading function. The information device disclosed therein comprises access means that automatically query tags. The tags on a data page can be followed according to certain rules, the rule being to proceed through the hierarchy of data pages either depth first or breadth first. The device thus enables automatic, passive browsing of the Internet. Information on data pages that changes over time can be received passively through queries made at staggered times, much like information broadcast on television. One disadvantage of the device is that the queries are carried out in an unspecified manner, so the device does not make it possible to retrieve particular information from the network in a targeted way. It is especially disadvantageous that the device does not take earlier queries into account when following the request fields and rigidly follows a fixed scheme.
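The two traversal rules mentioned for the prior-art device (depth first or breadth first) can be contrasted in a short Python sketch. It only illustrates the general traversal orders, not the device of EP 0945811 A1 itself; the function name `crawl_order` and the link table are assumptions made for this example.

```python
from collections import deque
from typing import Dict, List


def crawl_order(links: Dict[str, List[str]], start: str, depth_first: bool) -> List[str]:
    """Return the order in which pages are visited.

    `links` maps each page to the ordered list of pages it links to
    (a hypothetical stand-in for the tags found on a data page).
    """
    pending = deque([start])
    seen = {start}
    order = []
    while pending:
        # Depth first takes the newest entry, breadth first the oldest.
        page = pending.pop() if depth_first else pending.popleft()
        order.append(page)
        targets = links.get(page, [])
        # Reversing keeps the left-to-right link order when popping from the right.
        for target in (reversed(targets) if depth_first else targets):
            if target not in seen:
                seen.add(target)
                pending.append(target)
    return order
```

Both variants follow every reachable link in a fixed order, which matches the criticism above: nothing in the traversal depends on what the user actually selected in an earlier session.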

IBM Research Disclosure 42772, November 1999/1485, "Internet Macros: A System and Method for Automation of Internet Activities", publishes a system and a method that support the automation of repeatable sequences of user activities on the Internet. Similar to calling up bookmarks, sequences of particular navigation links can be queried at regular intervals. This allows information sources to be checked regularly, or messages from Internet sources to be printed regularly. The sequences of navigation links to be queried can be stored as a macro or script. The system also allows these macros to be recorded, edited, and deleted. It thus becomes possible to record a particular order of navigation items, such as request fields, form fields, or buttons, and then to execute them automatically during a retrieval. One disadvantage of the system and method is that the macro stores the concrete navigation items, such as the name and address of a link or data page, as they were at the time of recording. As a result, when the macro is executed later, the retrieval may abort with an error because one or more navigation items have changed in the meantime. The macro described can therefore be used successfully only if the recorded navigation items match exactly those found on the network at the time of retrieval. This is particularly disadvantageous because merely renaming or moving a link causes a retrieval carried out with such a macro to fail.

Further disadvantages of the prior art become apparent when dynamic data pages are retrieved. The retrieval of a dynamically generated page often cannot be automated by storing a bookmark. Although individual accesses can be stored and called up again with a bookmark, complete access to a dynamically generated page is often not possible with a bookmark, because access to certain data pages depends on the preceding access. The access can depend either on the data page that was accessed previously or on the form-field inputs transmitted during the preceding access. This dependency also exists when, before the data sought by the user are displayed, request fields are shown that are generated dynamically and refer to data pages that always carry new and different names. The requests generated from these fields cannot be recorded as a bookmark, because the content of the bookmark becomes invalid as soon as the name of the data page, and with it the request field, changes. For example, a data page contains a link to a current news item. The link points to the page 4711.asp today and to the more recent news item on the page 0815.asp tomorrow. In both cases the link is at the same position on the data page, but the news page behind it has a different name every day.
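The 4711.asp / 0815.asp example can be made concrete with a small Python sketch. It assumes, purely for illustration, that a position is encoded as a list of child indices leading from the root of the page tree to the link; this section of the patent does not specify the path format, and the page dictionaries and the name `follow` are hypothetical.

```python
from typing import Dict, List


def follow(page: Dict, path: List[int]) -> Dict:
    """Walk from the root of a page tree along a list of child indices."""
    node = page
    for index in path:
        node = node["children"][index]
    return node


# Today's page: the news link is the second child of the body.
today = {"kind": "body", "children": [
    {"kind": "h1", "text": "News", "children": []},
    {"kind": "a", "href": "4711.asp", "children": []},
]}

# Tomorrow's page: same layout, but the link target has a new name.
tomorrow = {"kind": "body", "children": [
    {"kind": "h1", "text": "News", "children": []},
    {"kind": "a", "href": "0815.asp", "children": []},
]}

bookmarked_name = "4711.asp"   # what a bookmark would store
recorded_position = [1]        # what a position-based recording would store

# The stored name is stale tomorrow, while the stored position still
# leads to whatever link currently occupies that place on the page.
link_today = follow(today, recorded_position)["href"]
link_tomorrow = follow(tomorrow, recorded_position)["href"]
```

A plain index path like this breaks as soon as the page layout itself changes, which is presumably why the figures listed further below refer to tolerance algorithms.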

DE 198 58 163 A1 discloses a method for reading out information from a data store and forwarding the information read out to a client application. The client application then transmits information from an electronic device of a recipient, which is forwarded by the electronic device to the respective data store. An identifier is automatically added to each item of information, or group of information, read out from a data store; the identifier identifies the respective data store and/or the information itself and is transmitted along with it to the client application or the recipient. The method makes it possible to exchange information between a data store and a client application without the client application being connected to the electronic device that manages the respective data store. The method therefore does not concern the recording and execution of queries from data networks.

DE 199 63 981 A1 discloses a method for locating documents using hyperlinks. In this method a server system receives from a client network node a request for data content, the request comprising a first resource locator that specifies the digital information to be obtained from a storage system. In response to the first resource locator, the specified digital information is generated dynamically and stored at an address in memory. The dynamically generated information is then accessed via a second resource locator in order to transfer the specified digital information over the network to the client network node. The method thus concerns the retrieval of information that is generated dynamically at the time of retrieval, but not the recording of queries in the data network.

GB 2312975 A describes a method for resolving hyperlinks in which query information is extracted from the hyperlinks and used to carry out database queries. Hyperlink targets can then be identified from the results of the database queries. The hyperlinks thus contain query routines that are sent to the databases at the time the hyperlink is executed.

The devices and methods known from the prior art have the particular disadvantage that automatic retrievals from the Internet can be carried out only if the navigation items and the structure and content of the data pages do not change. This does not reflect practice, however, since on the Internet in particular the data pages, their structure, and the links they contain are subject to constant revision and therefore change very often. The devices and methods known from the prior art are therefore of only limited use for recording retrievals from the Internet.

The object of the present invention is therefore to provide an improved method, and a device for carrying out the method, for recording and executing queries from data networks that takes into account changes in the navigation items or in the structure of the data pages to be queried.

The invention is defined in claims 1 and 29. The dependent claims define preferred embodiments of the invention.

The method is divided into two parts: recording queries and executing recorded queries. Recording a query comprises first storing a start address, e.g. a URL, and retrieving the data page represented by the start address, and further at least one retrieval from the data network and/or the recording of at least one first position information that defines selected data on the retrieved data page. Recording a retrieval comprises selecting data elements, such as tags, on the previously retrieved data page; recording a second position information that defines a section of the previously retrieved data page, the section containing at least the selected data elements; and retrieving at least one of the data pages defined by the previously selected data elements.

Executing the recorded queries, the second part of the method, comprises retrieving the start address stored during recording and storing the current data page represented by the start address, carrying out the recorded retrievals, and/or determining the selected data on the retrieved data pages using the recorded first position information. Carrying out a retrieval comprises determining at least one section of the previously retrieved current data page using the recorded second position information, and retrieving and storing the current data pages defined by the data elements located in the recorded section.
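The recording and execution steps can be sketched as a small Python data structure and a replay loop. This is a simplified illustration under stated assumptions, not the claimed method: position information is assumed to be a list of child indices, each recorded section is reduced to a single link element, and the names `RecordedQuery`, `node_at`, `execute`, and the injected `fetch` function are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Path = List[int]  # assumed encoding of position information: child indices from the root


@dataclass
class RecordedQuery:
    start_address: str
    # Second position information: one path per recorded retrieval,
    # each locating the data element (here: a link) to follow.
    retrieval_paths: List[Path] = field(default_factory=list)
    # First position information: where the selected data sit on the final page.
    selected_data_path: Optional[Path] = None


def node_at(page: Dict, path: Path) -> Dict:
    node = page
    for index in path:
        node = node["children"][index]
    return node


def execute(query: RecordedQuery, fetch: Callable[[str], Dict]) -> Dict:
    """Replay a recorded query against the pages as they are now."""
    page = fetch(query.start_address)          # retrieve the current start page
    for path in query.retrieval_paths:         # carry out the recorded retrievals
        element = node_at(page, path)          # locate the element by position
        page = fetch(element["href"])          # follow it, whatever it is called today
    if query.selected_data_path is None:
        return page
    return node_at(page, query.selected_data_path)
```

Passing `fetch` as a parameter keeps the sketch independent of any network library; in a test it can simply be a dictionary lookup.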

The present invention is described below on the basis of preferred embodiments with reference to the drawings.

The figures show:

Fig. 1 a tree with typed nodes, representing a tree-structured data page;

Fig. 2 a flowchart showing the creation of a path from the tree of a data page;

Fig. 3 a flowchart showing the creation of a tree from a path;

Fig. 4 a flowchart showing the mapping of a tree created from a data page to a tree created from a path;

Figs. 5a-13 flowcharts that refine the flowchart of Fig. 4:

Fig. 5a a flowchart showing the end of recursion from Fig. 4 due to different node types;

Fig. 5b a flowchart showing the end of recursion from Fig. 4 when there are no further child nodes;

Fig. 6 a flowchart showing the vertical tolerance algorithm;

Fig. 7a a flowchart showing the vertical search in the page node;

Fig. 7b a flowchart showing the result processing in the vertical tolerance algorithm;

Fig. 8 a flowchart showing the mapping of the children with the horizontal tolerance algorithm;

Fig. 9 a flowchart showing the body of the mapping loop for the horizontal tolerance algorithm;

Fig. 10 a flowchart showing the search loop for a better page node;

Fig. 11 a flowchart showing the search loop for a better path node, with result processing;

Fig. 12 a flowchart showing the search loop for better path nodes;

Fig. 13 a flowchart showing the result processing for better path and page nodes.

The method and the corresponding device according to an embodiment of the present invention make it possible to store the position within a data page at which selected data are located. Selected data, such as news or weather information, can thus be found again by their position. The invention covers both the case in which the selected data of interest to the user are held on the data page reached from the start address, and the case in which finding the selected data involves a sequence of retrievals from the data network. In the case of a sequence of retrievals, the position of data elements on the current data page is stored in each case, and retrieving the data elements so defined takes a further step toward the selected data. When a query recorded in this way is executed, the structure of the data pages is recovered from the position information, and the selected data are found again by their position.

Recording queries makes it possible to query selected data from data networks automatically, without the user having to retrieve data elements manually or enter additional information. Data on dynamically generated data pages can also be selected and retrieved automatically. The method automatically records the request fields of the dynamically generated data pages as second position information and determines the data selected by the user on the retrieved data pages using the first position information.

The method can be used to record a single selected data element or a group of selected data elements. The data elements selected by the second position information are then selected automatically when the recorded query is executed, so that the data pages behind them are retrieved automatically.

By recording several start addresses, each with a sequence of retrievals and a first position information, according to an exemplary embodiment of the present invention, regions within the data pages that contain the selected data can be designated, so that when the recorded queries are executed, parts of different data pages are extracted and reassembled. In this way the method can select several dynamic data pages, retrieve the results, and compile a results data page from them.

The method and the device are particularly suitable for queries that the user performs repeatedly and for queries in which the results of several retrievals are to be combined, since the invention drastically reduces the number of inputs and searches required of the user.

As a further step, the method comprises displaying the current selected data as the result of the query. The results of the query are expediently displayed on a results data page.

The data pages preferably have an ordered tree structure, as HTML pages do, for example. The invention makes it possible to recover the tree structure of a data page when the query is executed and to locate the selected data by their position in the tree structure. It is particularly preferred to store the first and the second position information each as a path, the path containing a section of the node structure of the retrieved data page. In a preferred embodiment of the invention, the section and/or the selected data are determined by mapping the path onto the current data page on the basis of similarities between the structure of the current data page and the path. This similarity-based mapping is particularly advantageous because the selected data can still be found even in slightly changed trees representing the tree structure of the current data page. Any suitable fault-tolerant algorithm can be used for the similarity-based mapping. One example of such an algorithm is given in Phelps, T. A., Wilensky, R.: "Robust Intra-document Locations", Proceedings of the 9th World Wide Web Conference, Amsterdam, 15-19 May 2000, although the "path-walk" method presented in that article has not proven robust enough.

The recording of the position information, or of the paths, expediently comprises storing nodes, the so-called path nodes, and first route information that defines the route between the path nodes, as well as storing additional information about the type and/or parameters of the retrievals to be recorded. It is particularly expedient if the additional information and/or parameters include a region and/or a position of the selected data on a data page, a password, a date, a start or end value, a search term, a link, a particular frame, a form or inputs for form fields, or other configuration settings. This allows specific regions on retrieved data pages to be defined, dynamic data pages to be selected, and inputs for dynamic data pages to be supplied. This additional information is expediently stored as further path nodes and first route information.

When queries are executed, the structure of the current data page is preferably determined in the form of page nodes and second route information that defines the route between the page nodes. The path is then mapped onto the current data page by assigning page nodes to the corresponding path nodes, each path node being assigned at least one page node; it is particularly preferred that a fault-tolerant assignment is made when an unambiguous assignment is not possible. It is particularly advantageous if the fault-tolerant assignment takes the types of the path nodes and page nodes into account. The type of a node is preferably defined by the structure of the data page; hyperlinks, for example, have the type "a" and forms have the type "form". It is also advantageous if the first and the second route information are additionally taken into account in the fault-tolerant assignment.
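The idea of a type-aware, fault-tolerant assignment can be illustrated with a small Java sketch. It is not the algorithm of the patent's flowcharts: the method name assign, the "nearest child of the same type" fallback, and the element types in the example are all assumptions chosen for illustration.

```java
import java.util.Arrays;
import java.util.List;

class TolerantMatchDemo {
    // Returns the 1-based index of the page child assigned to the recorded child,
    // or -1 if the current page has no child of the recorded type.
    // Assumes recordedIndex >= 1 (children are numbered from 1).
    static int assign(List<String> pageChildTypes, int recordedIndex, String recordedType) {
        // Unambiguous case: same position and same type as recorded.
        if (recordedIndex <= pageChildTypes.size()
                && pageChildTypes.get(recordedIndex - 1).equals(recordedType)) {
            return recordedIndex;
        }
        // Assumed fallback: the child of the recorded type closest to the recorded position.
        int best = -1;
        for (int i = 1; i <= pageChildTypes.size(); i++) {
            if (!pageChildTypes.get(i - 1).equals(recordedType)) {
                continue;
            }
            if (best == -1 || Math.abs(i - recordedIndex) < Math.abs(best - recordedIndex)) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Recorded: the link was the 2nd child and had type "a".
        // Current page: an element was inserted in front of it, shifting the link by one.
        List<String> current = Arrays.asList("img", "p", "a", "form");
        System.out.println(assign(current, 2, "a"));
    }
}
```

A real implementation would also weigh the route information between nodes, as the text suggests; this sketch looks at one level of children only.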

It is further preferred if the fault-tolerant assignment takes into account the number of nodes on at least one level of the tree of the path or of the current data page, or the depth of the level in the path and/or in the structure of the current data page at which the respective path node or page node is located. It is particularly expedient if storing the current data page includes the structure of the current data page as completely as possible, including its nodes and the route information between the nodes. To make use of the complete structural information of the data pages both at recording time and at execution time, it is particularly advantageous if at least part of the structure of the retrieved page is stored during recording and the section and/or the selected data are then determined by mapping the structure of the recorded data pages onto the structure of the current data pages.
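The per-level node counts mentioned above are easy to compute. The following Java sketch shows one way to do so and to compare two trees level by level. The class LevelStats and the distance measure levelDistance are hypothetical illustrations; the patent does not specify how the counts enter the assignment.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal typed tree node for this sketch.
class Knoten {
    final String typ;
    final List<Knoten> children = new ArrayList<>();

    Knoten(String typ) { this.typ = typ; }

    Knoten insertChild(String childTyp) {
        Knoten child = new Knoten(childTyp);
        children.add(child);
        return child;
    }
}

class LevelStats {
    // counts.get(d) is the number of nodes at depth d; the root has depth 0.
    static List<Integer> nodesPerLevel(Knoten root) {
        List<Integer> counts = new ArrayList<>();
        collect(root, 0, counts);
        return counts;
    }

    private static void collect(Knoten k, int depth, List<Integer> counts) {
        if (depth == counts.size()) {
            counts.add(0);
        }
        counts.set(depth, counts.get(depth) + 1);
        for (Knoten c : k.children) {
            collect(c, depth + 1, counts);
        }
    }

    // Assumed similarity measure: sum of absolute per-level count differences.
    static int levelDistance(Knoten a, Knoten b) {
        List<Integer> x = nodesPerLevel(a);
        List<Integer> y = nodesPerLevel(b);
        int d = 0;
        for (int i = 0; i < Math.max(x.size(), y.size()); i++) {
            int xi = i < x.size() ? x.get(i) : 0;
            int yi = i < y.size() ? y.get(i) : 0;
            d += Math.abs(xi - yi);
        }
        return d;
    }

    public static void main(String[] args) {
        Knoten recorded = new Knoten("Wurzel");
        recorded.insertChild("A").insertChild("C");
        recorded.insertChild("B");
        System.out.println(nodesPerLevel(recorded));
    }
}
```

A small distance would indicate that the current page still resembles the recorded one on the levels compared.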

According to a further preferred exemplary embodiment of the invention, selected data that are distributed over several data pages can be queried starting from several start addresses. For this purpose several queries are recorded, each comprising storing a start address, retrieving the data page represented by the start address, and recording at least one retrieval from the data network and/or at least one first position information that defines selected data on the previously retrieved data page, until the retrievals and/or the first position information have been recorded for all start addresses. When the queries are executed, the steps explained above are carried out starting from each stored start address until the current selected data have been determined for every start address. This makes it possible to combine a large number of queries on different data pages into one query and to merge the results of the individual sub-queries into an overall result. Preferably, the current selected data determined for the various start addresses are extracted and displayed on a presentation data page or overall results data page. The user thus receives, on a single data page, exactly the desired information previously queried from many different data pages. Individually tailored queries can therefore be carried out in a targeted manner without the user having to make additional inputs.
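Combining several recorded queries into one result page can be sketched in Java as follows. The classes RecordedQuery and QueryRunner, the resolver parameter (which stands in for page retrieval plus path mapping), and the example addresses are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

// One recorded query: a start address plus the recorded position information (path).
class RecordedQuery {
    final String startAddress;
    final String path;

    RecordedQuery(String startAddress, String path) {
        this.startAddress = startAddress;
        this.path = path;
    }
}

class QueryRunner {
    // Runs every recorded query and concatenates the results into one page.
    // 'resolver' is a placeholder: (start address, path) -> current selected data.
    static String run(List<RecordedQuery> queries,
                      BiFunction<String, String, String> resolver) {
        StringBuilder page = new StringBuilder();
        for (RecordedQuery q : queries) {
            page.append(q.startAddress)
                .append(": ")
                .append(resolver.apply(q.startAddress, q.path))
                .append('\n');
        }
        return page.toString();
    }

    public static void main(String[] args) {
        List<RecordedQuery> queries = new ArrayList<>();
        queries.add(new RecordedQuery("http://news.example/", "Wurzel{[A]}"));
        queries.add(new RecordedQuery("http://weather.example/", "Wurzel{[B]}"));
        // Stub resolver: a real one would fetch the page and map the path onto it.
        System.out.print(run(queries, (address, path) -> "data for " + path));
    }
}
```

Passing the resolver in as a function keeps the aggregation step independent of how pages are fetched and matched.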

According to a further preferred exemplary embodiment of the invention, the steps for recording the information relevant to the query, comprising the start address and the position information, are recorded as an independent, reusable unit, a so-called macro. Such a macro preferably constitutes an independent unit; processing it comprises generating, storing, or deleting the macro or parts of it, as well as changing or editing its content. Queries stored as macros can thus be filed separately and modified as required. Preferably, several recorded macros are combined into a meta-macro: to generate a meta-macro, a meta-macro template is first opened, the content of the recorded macros is copied into the template, and the filled template is saved as a new meta-macro. The content of a macro or meta-macro is expediently edited in a suitable editor. When a macro or meta-macro is executed, the steps for executing the recorded queries are carried out automatically. If macros or meta-macros recorded in this way are stored on public servers and third parties are given categorized access to them, a broad range of recorded queries on a wide variety of topics preferably arises, which can be accessed by topic.
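As a rough illustration of the macro and meta-macro idea, the following Java sketch models a macro as a start address with its recorded paths and builds a meta-macro by copying macro contents into an initially empty container. The class names Makro and MetaMakro, their fields, and the method fromTemplate are hypothetical; the patent does not prescribe a storage format.

```java
import java.util.ArrayList;
import java.util.List;

// A recorded query stored as a reusable unit.
class Makro {
    final String startAddress;
    final List<String> paths;   // recorded position information, one path per step

    Makro(String startAddress, List<String> paths) {
        this.startAddress = startAddress;
        this.paths = paths;
    }
}

class MetaMakro {
    final String name;
    final List<Makro> makros = new ArrayList<>();

    MetaMakro(String name) { this.name = name; }

    // Mirrors the described steps: start from an empty template,
    // copy the content of each recorded macro into it, return the filled result.
    static MetaMakro fromTemplate(String name, List<Makro> recorded) {
        MetaMakro meta = new MetaMakro(name);
        for (Makro m : recorded) {
            // Copy the content so later edits to the original do not affect the meta-macro.
            meta.makros.add(new Makro(m.startAddress, new ArrayList<>(m.paths)));
        }
        return meta;
    }

    public static void main(String[] args) {
        List<Makro> recorded = new ArrayList<>();
        recorded.add(new Makro("http://news.example/", new ArrayList<>()));
        MetaMakro meta = fromTemplate("morning", recorded);
        System.out.println(meta.name + " contains " + meta.makros.size() + " macro(s)");
    }
}
```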

The method described here is particularly well suited to recording queries from the Internet. The start address is preferably a unique address (URL) on the Internet. The data page reachable via the start address conforms to one of the formats HTML, XML, or WAP, or to another format usable on the Internet or in another public or non-public data network. According to a preferred embodiment of the invention, the selected data are information on a web page that is determined by the start address, a URL, and the position information, which defines an access path.

The subject matter of the invention can, for example, be used particularly preferably for automatically querying current news that is offered on various web pages. The subject matter of the present invention is also suitable for automatic surfing on the Internet; in particular, the present invention makes it possible to query various web pages automatically for specific information, such as sales offers.

According to a preferred exemplary embodiment of the invention, the section of the current data page and/or the selected data are determined with the aid of a neural network by comparing the position information with the structure of the current data page. In this case the entire data page is stored together with the absolute position of the section that contains the data elements for further surfing or the results. The network is first trained to find the section in the stored page. The stored page is then altered automatically so that the absolute position of the section and the tree structure change, and the neural network is trained to find the section in the altered page as well. The page can be altered again and again, and with each alteration and each new round of training the neural network becomes more robust. It can then preferably find the section again even in a dynamically changing data page.

In the embodiment of the invention described in more detail here, queries from the Internet are recorded as so-called SurfMacros. A SurfMacro fulfills several functions for the user. It can be used, for example, as a data accumulator on the Internet, as a macro agent or Internet macro, as a meta-machine for freely definable topics, and as a Java Surflet, a small program that does the surfing for the user.

A SurfMacro recorder is a meta-meta-machine that is embedded in a framework for creating meta-sites and is used to record the queries, that is, the SurfMacros.

Fig. 1 shows an example of a tree with typed nodes, as occurs in a tree-structured HTML data page on the Internet, which is used to illustrate the structure of the paths employed. The path stores the node structure of the entire page; the section to be designated is delimited by [].

The node types are abstracted here (they are not real HTML node types) in order to illustrate the principle.

The region from the node of type C printed in bold to the node of type D printed in bold is represented by the following path, in which Wurzel denotes the root node:

Wurzel{A{A|B|[C}B|A{B{D|D|D]}D}}

The symbols have the following meanings:
{ Next tree level
| Next node on the same tree level
} Back to the previous tree level
[ Start of the region to be marked
] End of the region to be marked

When the query is recorded as a SurfMacro, the URL of the HTML start data page is stored and a path as explained above is created from this HTML page. The section defined by the second position information, which contains the data elements, for example links to be clicked or results, is marked for further processing by the '[' and ']' symbols within the path.

When the query is later executed with the SurfMacro, at the second point in time, the start data page stored in the SurfMacro is first retrieved, the retrieved data page is stored, and its tree structure is extracted as the current tree. The tree structure recorded in the SurfMacro, which is stored as a path as described above, is then mapped onto the tree structure of the current tree. On the basis of this mapping, the section or relevant region is found in the current data page. The mapping of the trees uses a fault-tolerant search, so that even a slightly changed tree structure leads to the correct result.

The flowcharts for creating a SurfMacro path from a tree, for creating a tree from a path, and for mapping trees onto each other are explained below.

In presenting the flowcharts it is assumed that parsing the data page and building the tree are standard algorithms that need not be explained in detail.

For searching in typed, ordered trees, the flowcharts use a few terms and methods that are customary for tree searching. To avoid misunderstandings, they are briefly explained here:
Moving one step from a node toward the root of the tree leads to its parent node. Moving from a node toward the leaves leads to its children.

The following methods are used here for searching in the tree structure:
knoten.getParent() returns the parent node of a node,
knoten.getChild(n) returns the n-th child node of a node; in the ordered tree the children are numbered upward from 1, and n is the resulting node number,
knoten.getChildCount() returns the number of child nodes of a node.

The assignment "neuerKnoten = alterKnoten.getParent()" assigns to the variable neuerKnoten the node that is the parent node of alterKnoten.

The assignment "neuerKnoten = alterKnoten.getChild(n)" assigns to the variable neuerKnoten the node that is the n-th among the ordered children of alterKnoten.

Since we assume trees with typed nodes, the node type can be accessed with knoten.getTyp(), and type equality of two nodes can be checked with "knoten1.getTyp() == knoten2.getTyp()".

Two nodes are checked for equality with "alterKnoten == neuerKnoten?".
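The node interface described above could look roughly like this in Java. This is a minimal sketch, not code from the patent: only the method names getParent, getChild, getChildCount and getTyp come from the text (insertChild is taken from the later description of Fig. 3); the class layout, the string-valued type, and the return value of insertChild are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal typed, ordered tree node (illustrative sketch).
class Knoten {
    private final String typ;
    private final Knoten parent;
    private final List<Knoten> children = new ArrayList<>();

    Knoten(String typ) { this(typ, null); }

    private Knoten(String typ, Knoten parent) {
        this.typ = typ;
        this.parent = parent;
    }

    // Appends a new child of the given type and returns it (return value is an assumption).
    Knoten insertChild(String childTyp) {
        Knoten child = new Knoten(childTyp, this);
        children.add(child);
        return child;
    }

    Knoten getParent() { return parent; }                   // null for the root
    Knoten getChild(int n) { return children.get(n - 1); }  // children are numbered from 1
    int getChildCount() { return children.size(); }
    String getTyp() { return typ; }
}

class KnotenDemo {
    public static void main(String[] args) {
        Knoten wurzel = new Knoten("Wurzel");
        Knoten a = wurzel.insertChild("A");
        Knoten b = wurzel.insertChild("B");
        System.out.println(wurzel.getChild(2) == b);        // node identity
        System.out.println(a.getParent() == wurzel);
        System.out.println(a.getTyp().equals(b.getTyp()));  // type comparison
    }
}
```

With string-valued types, as assumed here, actual Java code should compare types with equals rather than ==; the == in the text reads as flowchart notation.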

Methods of the Java class String are used for string operations. For the string result:
result = "": assigns an empty string
result.length(): returns the number of characters in the string
result.charAt(i): returns the character at index i of the string. Indices are counted from 0, so the first character of the string is result.charAt(0) and the last is result.charAt(result.length()-1)
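The string operations listed above behave as follows in a short Java example. The class name StringOpsDemo, the helper lastChar, and the sample string are chosen for illustration only.

```java
class StringOpsDemo {
    // Returns the last character of a non-empty string using length() and charAt().
    static char lastChar(String s) {
        return s.charAt(s.length() - 1);
    }

    public static void main(String[] args) {
        String result = "";                    // empty string
        System.out.println(result.length());   // number of characters
        result = "Wurzel{A|B}";
        System.out.println(result.charAt(0));  // first character, index 0
        System.out.println(lastChar(result));  // last character, index length() - 1
    }
}
```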

The creation of a path from the tree of a data page is explained by means of the flowchart in Fig. 2.

The creation of the path is presented as a recursive method of tree nodes, "Knoten.makePath()". If a tree node has children, the result is assembled from the sub-results of "Knoten.getChild(i).makePath()".

The method is called for the root node of the tree with Wurzel.makePath() and then returns a path for the entire tree. Two nodes named "startKnoten" and "endKnoten" are passed as parameters in the call. On the basis of these nodes, the '[' and ']' characters are inserted when the path is created.

From the tree shown in Fig. 1, for example, the algorithm shown in Fig. 2 generates the path "Wurzel{A{A|B|[C}B|A{B{D|D|D]}D}}".
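Since Fig. 2 is not reproduced here, the following Java sketch is a reconstruction of a recursive makePath that is consistent with the example path above, not the flowchart itself. Two details are inferred from the example string and are therefore assumptions: the '|' separator is omitted after a sibling whose output already ends with '}', and ']' is written directly after the type of endKnoten. The tree in examplePath is the one implied by the path string.

```java
import java.util.ArrayList;
import java.util.List;

class Knoten {
    final String typ;
    final List<Knoten> children = new ArrayList<>();

    Knoten(String typ) { this.typ = typ; }

    Knoten insertChild(String childTyp) {
        Knoten child = new Knoten(childTyp);
        children.add(child);
        return child;
    }

    // Recursively serializes this subtree; '[' precedes startKnoten, ']' follows endKnoten.
    String makePath(Knoten startKnoten, Knoten endKnoten) {
        StringBuilder result = new StringBuilder();
        if (this == startKnoten) result.append('[');
        result.append(typ);
        if (this == endKnoten) result.append(']');
        if (!children.isEmpty()) {
            result.append('{');
            for (int i = 0; i < children.size(); i++) {
                // '|' separates siblings unless the previous sibling already ended with '}'.
                if (i > 0 && result.charAt(result.length() - 1) != '}') {
                    result.append('|');
                }
                result.append(children.get(i).makePath(startKnoten, endKnoten));
            }
            result.append('}');
        }
        return result.toString();
    }
}

class MakePathDemo {
    // Builds the example tree implied by the path string in the text and serializes it.
    static String examplePath() {
        Knoten wurzel = new Knoten("Wurzel");
        Knoten a1 = wurzel.insertChild("A");
        a1.insertChild("A");
        a1.insertChild("B");
        Knoten start = a1.insertChild("C");
        wurzel.insertChild("B");
        Knoten a2 = wurzel.insertChild("A");
        Knoten b = a2.insertChild("B");
        b.insertChild("D");
        b.insertChild("D");
        Knoten end = b.insertChild("D");
        a2.insertChild("D");
        return wurzel.makePath(start, end);
    }

    public static void main(String[] args) {
        System.out.println(examplePath());
    }
}
```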

The creation of a tree from a path is shown by means of the flowchart in Fig. 3.

In this flowchart the path is assumed to have already been broken down into individual tokens. Whenever one of the characters {, }, [, ] or | occurs in the path, a new token begins. The path Wurzel{A{A|B|[C}B|A{B{D|D|D]}D}} thus consists of the tokens Wurzel, {, A, {, A, |, B, |, [, C, }, B, |, A, {, B, {, D, |, D, |, D, ], }, D, } and }. path.nextToken() returns all tokens one after another, and path.hasMoreTokens() is true if further tokens are available. The root of the tree to be created is represented by the variable root. A new root node is created by new Knoten(type), where type is the type of the node to be created. A new child node is created by Knoten.insertChild(type), where type is the type of the child node to be created.

From the path "Wurzel{A{A|B|[C}B|A{B{D|D|D]}D}}", for example, this algorithm generates the tree shown in Fig. 1.
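A minimal Java sketch of this tree construction is given below. Fig. 3 itself is not reproduced in the text, so the token handling is an assumption inferred from the example path: '{' descends into the most recently created node, '}' returns to its parent, '|' merely separates siblings, '[' marks the next node as start node and ']' marks the previous node as end node. The fields isStart and isEnd and the class PathParser are hypothetical, and the sketch assumes a well-formed path.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.StringTokenizer;

class PathParser {
    /** Minimal tree node (assumed layout) with start/end markers. */
    static class Knoten {
        final String type;
        final List<Knoten> children = new ArrayList<>();
        boolean isStart;
        boolean isEnd;

        Knoten(String type) { this.type = type; }

        Knoten insertChild(String childType) {
            Knoten child = new Knoten(childType);
            children.add(child);
            return child;
        }
    }

    /** Builds a tree from a well-formed path and returns its root. */
    static Knoten parse(String pathText) {
        StringTokenizer path = new StringTokenizer(pathText, "{}[]|", true);
        Deque<Knoten> parents = new ArrayDeque<>(); // nodes whose child list is open
        Knoten root = null;
        Knoten last = null;                          // most recently created node
        boolean startPending = false;
        while (path.hasMoreTokens()) {
            String token = path.nextToken();
            switch (token) {
                case "{": parents.push(last); break;  // children of 'last' follow
                case "}": parents.pop(); break;       // child list is complete
                case "|": break;                      // sibling separator
                case "[": startPending = true; break; // next node is the start node
                case "]": last.isEnd = true; break;   // previous node is the end node
                default:
                    if (root == null) {
                        root = new Knoten(token);
                        last = root;
                    } else {
                        last = parents.peek().insertChild(token);
                    }
                    last.isStart = startPending;
                    startPending = false;
            }
        }
        return root;
    }

    public static void main(String[] args) {
        Knoten root = parse("Wurzel{A{A|B|[C}B|A{B{D|D|D]}D}}");
        System.out.println(root.type + " has " + root.children.size() + " children");
    }
}
```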

The assignment of a tree created from a data page to a tree created from a path is shown in the flowchart in Fig. 4. All further flowcharts, i.e. Fig. 5a to Fig. 14, are detailed views of Fig. 4. The relationship between the flowcharts follows from the double-bordered boxes: Fig. 4, for example, contains references to Figs. 5a, 5b, 6 and 9. This means that the corresponding sequences take the place of those boxes within Fig. 4.

Within the flowchart, the terms "seitenKnoten" (page node), "seitenWurzel" (page root), "pfadKnoten" (path node) and "pfadWurzel" (path root) are used. "seitenWurzel" is the root of the tree created from a data page, and "pfadWurzel" is the root of the tree created from a path. "seitenKnoten" and "pfadKnoten" are initially set to "seitenWurzel" and "pfadWurzel", respectively.

Fig. 4 represents the recursive method matchNode. As parameters it receives a tree node created from a path (pfadKnoten) and a Boolean value doMatch.

If doMatch == No, the nodes and their subtrees are not assigned to one another; only the similarity of the trees is checked. If the trees match exactly, the return value is 0; otherwise the return value indicates how many nodes would have to be inserted, changed or deleted to make the trees match.
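To illustrate what such a return value counts, the following Java sketch compares two trees strictly position by position. It is a deliberately simplified stand-in and not the tolerant matching of Fig. 4: children are paired by index only, a differing type counts as one changed node, and every node of an unpaired subtree counts as one inserted or deleted node. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class TreeDifference {
    /** Minimal tree node (assumed layout). */
    static class Knoten {
        final String typ;
        final List<Knoten> children = new ArrayList<>();

        Knoten(String typ) { this.typ = typ; }

        Knoten insertChild(String childTyp) {
            Knoten child = new Knoten(childTyp);
            children.add(child);
            return child;
        }
    }

    /** Number of nodes in the subtree rooted at node, including node itself. */
    static int size(Knoten node) {
        int total = 1;
        for (Knoten child : node.children) total += size(child);
        return total;
    }

    /** 0 for identical trees, otherwise the number of differing nodes under index pairing. */
    static int difference(Knoten seitenKnoten, Knoten pfadKnoten) {
        int result = seitenKnoten.typ.equals(pfadKnoten.typ) ? 0 : 1; // changed node
        int common = Math.min(seitenKnoten.children.size(), pfadKnoten.children.size());
        for (int i = 0; i < common; i++) {
            result += difference(seitenKnoten.children.get(i), pfadKnoten.children.get(i));
        }
        // Unpaired subtrees would have to be deleted or inserted as a whole.
        for (int i = common; i < seitenKnoten.children.size(); i++) {
            result += size(seitenKnoten.children.get(i));
        }
        for (int i = common; i < pfadKnoten.children.size(); i++) {
            result += size(pfadKnoten.children.get(i));
        }
        return result;
    }

    public static void main(String[] args) {
        Knoten seite = new Knoten("TABLE");
        seite.insertChild("TR").insertChild("TD");
        Knoten pfad = new Knoten("TABLE");
        pfad.insertChild("TR").insertChild("TD");
        pfad.insertChild("TR");
        System.out.println(difference(seite, pfad));
    }
}
```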

If doMatch == Yes, exactly one page node is assigned to each path node. After the assignment, it can thus be determined exactly to which page nodes the path nodes startKnoten and endKnoten belong, and therefore to which position in the data page.

A path node is assigned to a page node by the method call seitenKnoten.setLink(pfadKnoten).

The first call of the method shown in Fig. 4 is seitenWurzel.matchNode(pfadWurzel, Yes). Recursive calls of matchNode then occur within Figs. 4, 9, 10, 11, 12 and 13. For every call, the "seitenKnoten" on which the call is made as well as the parameters "pfadKnoten" and "doMatch" are indicated.

To enable a robust assignment even when the tree stored in the path and the tree occurring in the current data page do not match exactly because the content has changed since the recording, the vertical and the horizontal tolerance algorithms are carried out so that the trees can still be matched as well as possible.

If the types of two nodes to be matched do not agree, the vertical tolerance algorithm is applied first in order to find two matching node types.

The vertical tolerance algorithm, which changes the values of seitenKnoten, pfadKnoten and result, is shown in the flowchart in Fig. 6.

The vertical tolerance algorithm is intended to allow two trees of differing structure to be matched when individual nodes have been inserted. First, the 1-child recursion of the pfadKnoten is carried out. The following applies:
pfadKnoten.getChildCount() == 1
&& pfadKnoten.getChild(1).getTyp() == seitenKnoten.getTyp()
→ vertical tolerance algorithm successful.

If, however, pfadKnoten.getChild(1).getTyp() != seitenKnoten.getTyp(), the recursion over the pfadKnoten continues: as long as pfadKnoten and the children of pfadKnoten each have exactly one child node but a type different from seitenKnoten, move on to the next child node of pfadKnoten. As soon as a child node of pfadKnoten has the same type as seitenKnoten, the vertical tolerance algorithm is successful.

If the 1-child recursion of the pfadKnoten is not successful, the vertical tolerance algorithm carries out the 1-child recursion of the seitenKnoten, which is shown in Fig. 7a. Here the following applies:
seitenKnoten.getChildCount() == 1
→ carry out the vertical tolerance algorithm for seitenKnoten.getChild(1), starting again with "pfadKnoten".

After the vertical tolerance algorithm has been carried out, both the seitenKnoten to be matched and the pfadKnoten may have changed. The resulting processing of results is shown in Fig. 7b.

If the method was called with doMatch = Yes, exactly one seitenKnoten must be assigned to every pfadKnoten. Therefore, all pfadKnoten skipped in the vertical tolerance algorithm are assigned to the seitenKnoten that was found.

If the method was called with doMatch = No, a rating of the assignment must be returned. Therefore, 1 is added for every step taken in the 1-child recursion of pfadKnoten and seitenKnoten.
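The vertical tolerance algorithm and its result processing can be sketched in Java as follows. This is one reading of the text, not a transcription of Figs. 6, 7a and 7b: the single-child chain below pfadKnoten is followed first, and if no node of matching type is found, the algorithm steps down one single-child level on the page side and starts again with the original pfadKnoten. The Result class, the link field, the 1-based getChild and the helper linkSkipped are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class VerticalTolerance {
    /** Minimal tree node (assumed layout); getChild is 1-based as in the text. */
    static class Knoten {
        private final String typ;
        private final List<Knoten> children = new ArrayList<>();
        Knoten link; // for a path node: the page node assigned to it (hypothetical field)

        Knoten(String typ) { this.typ = typ; }

        Knoten insertChild(String childTyp) {
            Knoten child = new Knoten(childTyp);
            children.add(child);
            return child;
        }

        int getChildCount() { return children.size(); }
        Knoten getChild(int nr) { return children.get(nr - 1); }
        String getTyp() { return typ; }

        /** Called on a page node: assigns the given path node to it. */
        void setLink(Knoten pfadKnoten) { pfadKnoten.link = this; }
    }

    /** Outcome of a successful run: the matched pair, the step count and the skipped path nodes. */
    static class Result {
        final Knoten seitenKnoten;
        final Knoten pfadKnoten;
        final int steps;
        final List<Knoten> skippedPfadKnoten;

        Result(Knoten seitenKnoten, Knoten pfadKnoten, int steps, List<Knoten> skipped) {
            this.seitenKnoten = seitenKnoten;
            this.pfadKnoten = pfadKnoten;
            this.steps = steps;
            this.skippedPfadKnoten = skipped;
        }
    }

    /** Returns the matching pair, or null if no pair of equal type is found. */
    static Result run(Knoten seitenKnoten, Knoten pfadKnoten) {
        int seitenSteps = 0;
        Knoten s = seitenKnoten;
        while (true) {
            // 1-child recursion of the path nodes, always restarting at pfadKnoten.
            List<Knoten> skipped = new ArrayList<>();
            Knoten p = pfadKnoten;
            while (!p.getTyp().equals(s.getTyp()) && p.getChildCount() == 1) {
                skipped.add(p);
                p = p.getChild(1);
            }
            if (p.getTyp().equals(s.getTyp())) {
                return new Result(s, p, seitenSteps + skipped.size(), skipped);
            }
            // 1-child recursion of the page nodes.
            if (s.getChildCount() != 1) return null;
            s = s.getChild(1);
            seitenSteps++;
        }
    }

    /** doMatch = Yes: every skipped path node is assigned to the page node found. */
    static void linkSkipped(Result r) {
        for (Knoten skipped : r.skippedPfadKnoten) r.seitenKnoten.setLink(skipped);
    }

    public static void main(String[] args) {
        Knoten pfad = new Knoten("DIV");
        pfad.insertChild("CENTER").insertChild("TABLE");
        Knoten seite = new Knoten("TABLE");
        Result r = run(seite, pfad);
        System.out.println(r.steps + " step(s), matched type " + r.pfadKnoten.getTyp());
    }
}
```

With doMatch = No, the steps value of the result would be added to the rating; with doMatch = Yes, linkSkipped would be applied instead.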

If the types of pfadKnoten and seitenKnoten still do not match after the vertical tolerance algorithm, the end of recursion due to different node types, shown in Fig. 5a, takes place.

If the method was called with doMatch = Yes, all children of pfadKnoten are assigned to seitenKnoten.

If the method was called with doMatch = No, the number of child nodes of seitenKnoten and 2 are added to the number of child nodes of pfadKnoten, and the result is returned as the return value of matchNode.

If pfadKnoten.getChildCount() == 0 holds after the vertical tolerance algorithm, the end of recursion due to no further child nodes, shown in Fig. 5b, takes place. If the method was called with doMatch = Yes, children of pfadKnoten are assigned to seitenKnoten.

If the method was called with doMatch = No, the number of child nodes of pfadKnoten is returned as the return value of matchNode.

After the vertical tolerance algorithm has been carried out in Fig. 4, and provided the node types of pfadKnoten and seitenKnoten match and pfadKnoten.getChildCount() > 0, the horizontal tolerance algorithm shown in Fig. 8 is carried out. The horizontal tolerance algorithm determines which children of pfadKnoten are to be matched with which children of seitenKnoten. Two basic rules apply: 1. Every pfadKnoten must be assigned to exactly one seitenKnoten, and 2. the matching must be carried out such that as few nodes as possible would have to be removed, added or changed for the two trees to match.

In the horizontal tolerance algorithm shown in Fig. 8, the child nodes of pfadKnoten are traversed using pfadKnotenNr and the child nodes of seitenKnoten using seitenKnotenNr. The loop terminates when the last child of pfadKnoten or the last child of seitenKnoten has been reached. The body of the assignment loop for the horizontal tolerance algorithm is shown in Fig. 9.

First, it is determined whether the child nodes designated by pfadKnotenNr and seitenKnotenNr, together with the subtrees below these nodes, are similar enough to dispense with the search for a better matching for pfadKnoten.getChild(pfadKnotenNr) and seitenKnoten.getChild(seitenKnotenNr). For this purpose,
"baseMatch" = number of nodes that must be removed, added or changed in order to match the subtrees exactly,
"summeSeitenKnoten" = number of nodes of the subtree below seitenKnoten.getChild(seitenKnotenNr) and
"summePfadKnoten" = number of nodes of the subtree below pfadKnoten.getChild(pfadKnotenNr)
are determined. If the types match and
baseMatch <= (summeSeitenKnoten + summePfadKnoten) / 7
holds, the matching is considered sufficient. The value 7 is chosen arbitrarily here and has proven useful in practice. This termination condition is somewhat broader than the alternative condition baseMatch == 0. Its purpose is to shorten the computation time when the trees differ only slightly.
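As a small worked example, the sufficiency test can be written as a Java predicate. The text does not say whether the division is an integer or a real division; the sketch assumes Java integer division, so that for small subtrees (fewer than 7 nodes in total) only an exact match passes. The class name and the parameter typesMatch are hypothetical.

```java
class MatchThreshold {
    /** Divisor from the text; described there as chosen arbitrarily. */
    static final int DIVISOR = 7;

    /**
     * True if the two subtrees are considered similar enough to stop searching
     * for a better partner. Integer division is an assumption of this sketch.
     */
    static boolean isSufficient(boolean typesMatch, int baseMatch,
                                int summeSeitenKnoten, int summePfadKnoten) {
        return typesMatch
                && baseMatch <= (summeSeitenKnoten + summePfadKnoten) / DIVISOR;
    }

    public static void main(String[] args) {
        // 10 + 11 = 21 nodes in total: up to 3 differing nodes are tolerated.
        System.out.println(isSufficient(true, 3, 10, 11));
        System.out.println(isSufficient(true, 4, 10, 11));
    }
}
```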

If no sufficient match of the subtrees was found in Fig. 9, the search loop for a better seitenKnoten, shown in Fig. 10, is carried out first.

For all children of seitenKnoten from seitenKnotenNr to seitenKnoten.getChildCount(), it is determined whether they fit pfadKnoten.getChild(pfadKnotenNr) better than the best match so far, "bestSeitenMatch". In the process, bestSeitenKnotenNr is determined, i.e. the number of the child node of seitenKnoten that, together with the subtree below it, best fits pfadKnoten.getChild(pfadKnotenNr).
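The search loop can be sketched in Java as shown below. In the method described here the rating of a candidate would come from a matchNode call with doMatch = No; in the sketch it is passed in as a function, and the demo uses a trivial type comparison as a stand-in. The 1-based child numbering, the tie-breaking in favour of the earliest child and all helper names are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntBiFunction;

class BetterPageNodeSearch {
    /** Minimal tree node (assumed layout); getChild is 1-based as in the text. */
    static class Knoten {
        private final String typ;
        private final List<Knoten> children = new ArrayList<>();

        Knoten(String typ) { this.typ = typ; }

        Knoten insertChild(String childTyp) {
            Knoten child = new Knoten(childTyp);
            children.add(child);
            return child;
        }

        int getChildCount() { return children.size(); }
        Knoten getChild(int nr) { return children.get(nr - 1); }
        String getTyp() { return typ; }
    }

    /** Stand-in rating for the demo: 0 if the types agree, otherwise 1. */
    static int typeCost(Knoten seitenKind, Knoten pfadKind) {
        return seitenKind.getTyp().equals(pfadKind.getTyp()) ? 0 : 1;
    }

    /**
     * Returns bestSeitenKnotenNr: the number of the child of seitenKnoten, at or
     * after seitenKnotenNr, with the lowest rating against pfadKind.
     */
    static int findBestSeitenKnotenNr(Knoten seitenKnoten, int seitenKnotenNr, Knoten pfadKind,
                                      ToIntBiFunction<Knoten, Knoten> cost) {
        int bestSeitenKnotenNr = seitenKnotenNr;
        int bestSeitenMatch = Integer.MAX_VALUE;
        for (int nr = seitenKnotenNr; nr <= seitenKnoten.getChildCount(); nr++) {
            int match = cost.applyAsInt(seitenKnoten.getChild(nr), pfadKind);
            if (match < bestSeitenMatch) { // strict comparison keeps the earliest best child
                bestSeitenMatch = match;
                bestSeitenKnotenNr = nr;
            }
        }
        return bestSeitenKnotenNr;
    }

    public static void main(String[] args) {
        Knoten seite = new Knoten("BODY");
        seite.insertChild("P");
        seite.insertChild("DIV");
        seite.insertChild("TABLE");
        Knoten pfadKind = new Knoten("TABLE");
        System.out.println(findBestSeitenKnotenNr(seite, 1, pfadKind, BetterPageNodeSearch::typeCost));
    }
}
```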

If bestSeitenKnotenNr designates the last child of seitenKnoten, the processing in Fig. 9 is aborted, the loop in Fig. 8 is ended, and the final processing in Fig. 4 takes place.

Here, however, we do not yet describe the final processing in Fig. 4; we first present the details of the horizontal tolerance algorithm in Figs. 11-13 before discussing the end of the processing in Fig. 4.

Next, the search loop for a better page node with result processing, shown in Fig. 11, takes place:
If a bestSeitenKnotenNr != seitenKnotenNr was found during the search loop for a better seitenKnoten in Fig. 10, summeSeitenKnoten and baseMatch must be recalculated, since these numbers are used for comparison during the search loop for a better pfadKnoten.

If the types match and
baseMatch <= (summeSeitenKnoten + summePfadKnoten) / 7
holds, the matching is considered sufficient and the final processing in Fig. 9 is carried out.

Otherwise, the search loop for a better pfadKnoten, shown in Fig. 12, is carried out first. The principle is similar to that of the search loop for a better seitenKnoten shown in Fig. 10:
All child nodes from pfadKnotenNr+1 to pfadKnoten.getChildCount() are matched one after another with seitenKnoten.getChild(seitenKnotenNr) to determine whether one of these children of pfadKnoten fits better than the best matching so far. If a better matching results, the value of this matching, "bestPfadMatch", is added to the future return value "result" of the method matchNode.

Since neither in Fig. 4 nor in Fig. 9 or Fig. 11 were the types and subtrees of the child nodes to be matched similar enough, the result processing for a better pfadKnoten and seitenKnoten, shown in Fig. 13, now begins from Fig. 11.

If matchNode was called with doMatch = Yes, all children of pfadKnoten between pfadKnotenNr and bestPfadKnotenNr are assigned to seitenKnoten.getChild(bestSeitenKnotenNr).

If matchNode was called with doMatch = No, it is checked whether the last child of pfadKnoten has been reached and, if so, seitenKnoten.getChildCount() - bestSeitenKnotenNr is added to the return value "result".

We are now at the end of the processing shown in Fig. 11. It remains to be shown how, in Fig. 9, after the processing of the search loop for a better pfadKnoten with result processing according to Fig. 11, the body of the assignment loop for the horizontal tolerance algorithm ends, and how, after the assignment of the children by the horizontal tolerance algorithm, the final processing of the method matchNode takes place in Fig. 4.

If, during the processing shown in Figs. 9 and 11, it is determined that two child nodes have been found that fit together well enough and are to be matched without searching for further, even better fitting child nodes ("typesMatch == Yes"), the final processing for matching types takes place at the end of Fig. 9.

If matchNode was called with doMatch = Yes, matchNode is called with doMatch = Yes for seitenKnoten.getChild(bestSeitenKnotenNr) and pfadKnoten.getChild(bestPfadKnotenNr).

If matchNode was called with doMatch = No, baseMatch is added to the return value "result".

If the body of the assignment loop in Fig. 9 terminates with "Continue", the assignment loop shown in Fig. 8 continues. If, however, the body of the assignment loop in Fig. 9 terminates with "Abort", or one of the termination conditions shown in Fig. 8 occurs, the final processing in Fig. 4 takes place. If, during the assignment of the child nodes, the last child of seitenKnoten has been reached but children of pfadKnoten still remain, the remaining children of pfadKnoten are matched with the last assigned child of seitenKnoten.

If matchNode was called with doMatch = Yes, pfadKnoten and seitenKnoten are now assigned to each other and 0 is returned as the return value.

If matchNode was called with doMatch = No, "result" is returned as the return value.

According to a further exemplary embodiment of the invention, a device for automatically recording and executing queries from the Internet is described. The device has access means with which a connection to the Internet can be established and via which start addresses and the data pages behind them on the Internet are accessed. The device further has a memory suitable for storing the start addresses, the position information and also entire data pages from the Internet. In addition, the device has a central processing unit (CPU) as computing means and, advantageously, a monitor as display means for displaying the data retrieved from the Internet. All components of the device are interconnected in terms of information technology such that the device can carry out the method described above. Expediently, the device is a personal computer (PC) comprising at least a modem for establishing a connection to the Internet, a working memory, a processing unit and a monitor. The processing unit controls the queries from the Internet by accessing a start address on the Internet via the modem, storing the start address and the data page retrieved by the access in the working memory, and causing the monitor to display the retrieved information. The device also comprises input means via which the user can, for example, select a link on the previously retrieved data page. The processing unit then causes the link to be stored as position information in the working memory, the information behind the link on the Internet to be accessed via the modem, and this information likewise to be stored in the working memory. If the user now recognizes information of interest on the retrieved data page, the user marks this information as selected data. The position of these selected data is likewise stored as first position information in the working memory. When the query is executed, the start address stored in the working memory is accessed again and the data page thus retrieved is stored as the current data page in the working memory. The position information stored through the sequence of previously recorded retrievals is now used by the processing unit to locate the selected data on the current data page. After the selected data have been found successfully on the current data page, the processing unit causes the current selected data to be displayed on the monitor. An advantageous refinement of the embodiment presented here is that parts of the device, such as the memory or the processing unit, are located in a remote server computer, and the user at his own location needs only an input means (mouse, keyboard), a monitor for displaying the information, and an Internet connection (modem).

Overall, a method and a device are thus provided that enable particularly convenient, configurable and fault-tolerant recording and execution of queries from data networks.

Claims

1. A method for recording and executing queries from the Internet and/or other data networks, comprising the following steps:

a) storage of a start address, retrieval of a data page starting from the start address, recording of at least one retrieval from the data network and/or of at least one item of first position information defining selected data, the recording of a retrieval comprising:
Selection of data elements on the previously retrieved data page;
Recording of second position information defining a section of the previously retrieved data page which contains the selected data elements;
Retrieving at least one of the data pages defined by the data elements;

and at a second point in time the steps:

a) Retrieving the start address and storing the current data page, starting from the start address;
b) Execution of the recorded retrievals, the execution of a retrieval comprising:
Determining at least a section of the previously retrieved current data page by the stored second position information;
Retrieval and storage of the current data pages defined by the data elements in the recorded section; and/or
c) Determination of the selected data on the retrieved data pages by the first position information.

2. The method according to claim 1, further comprising the step at the second point in time:

a) Displaying the currently selected data as the result of the query.

3. The method according to any one of claims 1 to 2, wherein the data pages have an ordered tree structure (such as HTML pages).

4. The method according to any one of claims 1 to 3, wherein the first and the second position information are in each case a path that contains a section of the node structure of the retrieved data page with designation of start and end nodes.

5. The method of claim 4, wherein the determination of the section in step c) and/or the determination of the selected data in step d) takes place by mapping the path onto the current data page on the basis of similarities between the structure of the current data page or the position information and the path.

6. The method according to any one of the preceding claims, wherein the recording of the position information comprises:
storing nodes (the path nodes) and first route information defining the routes between the path nodes, and/or
storing additional information about the type and/or parameters of the retrievals to be recorded, the additional information and/or parameters comprising: a range of information on a data page, a password, a date, a start or end value, a search term, a link or other configuration settings.

7. The method according to claim 6, wherein the additional information and/or parameters are stored as additional path nodes.

8. The method according to any one of the preceding claims, wherein the mapping of the path onto the current data page initially comprises:
determining nodes in the current data page (the page nodes) and second route information defining routes between the page nodes.

9. The method according to any one of the preceding claims, wherein the mapping of the path to the current data page further comprises:
assigning corresponding page nodes to the path nodes, each path node being assigned to at least one page node, and, where a unique assignment is not possible, making a fault-tolerant assignment.

10. The method of claim 9, wherein the fault-tolerant assignment comprises taking into account the types of the path nodes and page nodes.

11. The method according to claim 9 or 10, wherein the fault-tolerant assignment comprises taking into account the first and second route information.

12. The method according to any one of claims 9 to 11, wherein the fault-tolerant assignment comprises taking into account the number of nodes on at least one level of the path and of the structure of the current data page.

13. The method according to any one of claims 9 to 12, wherein the fault-tolerant assignment comprises taking into account the level depth, in the path and/or in the structure of the current data page, at which a path node and/or a page node is located.
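The fault-tolerant assignment of claims 9 to 13 can be pictured as a scoring problem. The sketch below is one possible reading, not the patented procedure: it assumes each node is summarized by its type, its level depth and the number of nodes on its level, and it uses arbitrary weights. The route information of claim 11 is not modelled. `NodeInfo`, `score` and `assign` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class NodeInfo:
    node_type: str  # node type, e.g. the tag name (claim 10)
    depth: int      # level depth of the node (claim 13)
    siblings: int   # number of nodes on the same level (claim 12)


def score(path_node: NodeInfo, page_node: NodeInfo) -> float:
    """Similarity between a recorded path node and a candidate page node.

    The weights are illustrative assumptions, not values from the patent.
    """
    s = 0.0
    if path_node.node_type == page_node.node_type:
        s += 2.0
    s -= abs(path_node.depth - page_node.depth)
    s -= 0.5 * abs(path_node.siblings - page_node.siblings)
    return s


def assign(path_node: NodeInfo, page_nodes: List[NodeInfo]) -> NodeInfo:
    """Use the unique exact match if there is one, else the best-scoring node."""
    exact = [n for n in page_nodes if n == path_node]
    if len(exact) == 1:
        return exact[0]
    return max(page_nodes, key=lambda n: score(path_node, n))


recorded = NodeInfo("table", depth=3, siblings=2)
current_page = [
    NodeInfo("div", depth=3, siblings=2),
    NodeInfo("table", depth=4, siblings=3),  # table moved one level down
    NodeInfo("p", depth=1, siblings=5),
]
print(assign(recorded, current_page))
```

In this example no page node matches the recorded node exactly, so the table that moved one level deeper is chosen because its type still matches and its depth and level size are close.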

14. The method according to any one of claims 8 to 13, wherein the retrieval of the current data page comprises storing the structure of the current data page, including page nodes and second route information.

15. The method of claim 14, wherein the recording in step a) comprises storing at least part of the structure of the retrieved data pages, and the determination in step c) and/or d) comprises mapping the structure of the recorded data pages onto the structure of the current data pages.

16. The method according to any one of the preceding claims for automatically recording and executing queries on selected data distributed over a plurality of data pages, comprising:
at the first point in time:

aa) performing step a) several times, each time for a different start address, until the retrievals and/or the first position information are recorded for all these start addresses;

at the second point in time:

bb) performing step b) for each start address stored in step aa);
cc) performing step c) several times, the retrievals from the data network recorded for each start address being executed;
dd) performing step d) several times until the current selected data are determined for each start address.

17. The method according to claim 16, further comprising in step dd) the steps:
extracting the current selected data; and
presenting the extracted data on a presentation data page.
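Claims 16 and 17 amount to running a recorded query once per start address and collecting the results on a single page. A minimal sketch, assuming the query for one start address is available as a callable and that the presentation data page is a plain HTML table. `run_all` and `presentation_page` are hypothetical names.

```python
from html import escape
from typing import Callable, Dict, List


def run_all(start_addresses: List[str], query: Callable[[str], str]) -> Dict[str, str]:
    """Execute the recorded query once per start address and collect the data."""
    return {address: query(address) for address in start_addresses}


def presentation_page(results: Dict[str, str]) -> str:
    """Render the extracted data as one HTML table, one row per start address."""
    rows = "\n".join(
        f"<tr><td>{escape(address)}</td><td>{escape(value)}</td></tr>"
        for address, value in results.items()
    )
    return f"<html><body><table>\n{rows}\n</table></body></html>"


# Stand-in for a real replay function (assumption for illustration only).
results = run_all(["news-a", "news-b"], lambda address: address.upper() + " <headline>")
print(presentation_page(results))
```

The extracted values are escaped before they are written into the table, so markup contained in the retrieved data cannot alter the presentation page.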

18. The method according to any one of the preceding claims, wherein in step a) a standalone, reusable unit, a macro, is recorded that includes the start address and the position information.

19. The method of claim 18, further comprising editing at least one macro, the editing comprising:
generating, changing, saving or deleting the macro or parts of the macro.

20. The method according to any one of claims 18 or 19, comprising generating a meta macro from recorded macros, comprising the steps:
Opening a meta macro template;
copying the contents of the recorded macros into the meta macro template;
saving the filled meta macro template as a meta macro.
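The three steps of claim 20 can be sketched as follows. The claims do not specify a storage format, so this example assumes that a macro holds a start address and two index paths and that a meta macro is saved as JSON text. `Macro`, `MetaMacro` and `build_meta_macro` are hypothetical names.

```python
import copy
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Macro:
    """A recorded, reusable query (fields are illustrative assumptions)."""
    start_address: str
    section_path: List[int]
    data_path: List[int]


@dataclass
class MetaMacro:
    """An initially empty container that plays the role of the template."""
    name: str
    macros: List[Macro] = field(default_factory=list)


def build_meta_macro(name: str, recorded: List[Macro]) -> str:
    meta = MetaMacro(name)                 # open a meta macro template
    meta.macros = copy.deepcopy(recorded)  # copy the recorded macro contents
    return json.dumps(asdict(meta))        # save the filled template (here: JSON text)


saved = build_meta_macro("morning-news", [
    Macro("news-a", [1], [0, 1]),
    Macro("news-b", [2, 0], [3]),
])
print(saved)
```

Copying the macros rather than referencing them keeps the meta macro independent of later edits to the individual macros.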

21. The method according to any one of claims 18 to 20, further comprising running at least one macro or meta macro, including automatically performing steps b) to d) with the content recorded in the macro or meta macro.

22. The method according to any one of claims 18 to 21, further comprising storing macros or meta macros on public servers for use by third parties.

23. The method according to any one of the preceding claims, wherein the start address is a unique address (URL) on the Internet or in another public or non-public data network.

24. The method according to any one of the preceding claims, wherein the selected data are stored on data pages such as HTML data pages, XML data pages or WAP-enabled data pages.

25. The method according to any one of claims 23 or 24, wherein the path is an access path based on the URL to selected data on a website.

26. The method according to any one of the preceding claims, wherein the selected data are stored on dynamic data pages and the selected data change depending on the retrieval.

27. The method according to any one of the preceding claims, wherein the data to be retrieved automatically are current news items on various web pages that are retrieved regularly.

28. The method according to any one of the preceding claims, wherein the determination of the section of the current data page and/or of the selected data, by comparing the position information with the structure of the current data page, is performed by means of a neural network.

29. Device for performing the method according to one of the preceding claims.

30. Computer program, storable on a computer-readable data carrier, which, when the computer program is running on a computer, implements the method according to one of claims 1 to 28.