DE19859169A1

DE19859169A1 - Learning system for controlling autonomous earth moving machines

Info

Publication number: DE19859169A1
Application number: DE19859169A
Authority: DE
Inventors: Patrick S Rowe
Original assignee: Carnegie Mellon University
Current assignee: Carnegie Mellon University
Priority date: 1997-12-19
Filing date: 1998-12-21
Publication date: 1999-06-24
Also published as: JPH11315556A

Abstract

The system comprises a machine with several mechanical links joined at joints, which are moved simultaneously on command. A processing system (20-38) executes at least one script. Each script has at least one variable parameter defining a movement of the machine. A learning algorithm modifies the value of at least one variable parameter on the basis of a desired result and measured conditions affecting the result. An independent claim is also included for a method for controlling autonomous earthmoving machines.

Description

The invention relates to a system and a method for controlling the movement of robotic or remote-controlled machines, and in particular to a learning algorithm for modifying the control parameters of remote-controlled machines during earthmoving work.

Certain types of machines perform repeated movements during operation. For example, a hydraulic excavator performs repeated movements such as cutting or digging and loading during earthmoving work. Systems for automating the control of earthmoving and other types of machines are currently being developed to reduce the need for human operators and to carry out tasks as quickly and accurately as possible. The term "earthmoving machine" and similar terms refer here to excavators, wheel loaders, track-type tractors, motor graders, agricultural machines, pavers, asphalt pavers and the like, which are, first, mobile across a work site and, second, able to alter the topography or geography of a work site with a tool or functional part of the machine, such as a bucket, shovel, blade, ripper, compacting roller and the like.

Systems for remote-controlled machines that "learn" during operation are currently being developed. The "learning process" typically involves storing a series of steps for performing a function, such as digging and dumping, and repeating those steps on command. Current learning functions are designed to repeat recurring tasks, reducing the need for operators to perform the same task several times. Conditions on a work site can change frequently, however, so that a programmed series of steps becomes less efficient as conditions change. On an excavation site, for example, the terrain at the dig face changes constantly, and the amount and distribution of material in a truck bed changes as loads are added. The properties of the excavated material can also change as new soil layers are exposed, for example large boulders, rocks, gravel, loose sand, or sticky clay. A programmed series of steps that is highly efficient at the start of a shift can become less efficient as work progresses.

B. Song and A. Koivo, "Neural Adaptive Control of Excavators", Proceedings of International Conference on Intelligent Robots and Systems, Volume 1, pp. 162-167, disclose a feed-forward torque term for adapting the dig plan to changes in the texture of the material being excavated. The torque term is computed by a neural network trained to compute the inverse dynamics of the excavator. Although introducing the feed-forward torque term improved tracking and stability overall, neural networks require considerable computation time to train; once trained, however, they compute predictions very quickly. A further drawback of neural networks is that they must be retrained to incorporate information about new data, i.e. they do not readily adapt autonomously to changes in the environment.

Rule-based systems for controlling work on an excavation site are known, for example, from D. Seward, "LUCIE - The Autonomous Remote-Controlled Excavator", Industrial Robot International Quarterly, Volume 19, No. 1, pp. 14-18. Such systems typically require a large number of rules to handle variable conditions during excavation, and the rules must be implemented before work starts. They cannot adjust or adapt the rules to cope with unforeseen situations or to optimize movements on the basis of previous experience. It is also unclear how the parameters or thresholds in the rules are generated.

DE 198 04 006.7 A1 is regarded as disclosing a system and a method for controlling the movement of machines using parameterized scripts. Different sets of script parameters can be selected depending on the operating mode or when different events occur. The parameters of a movement, e.g. for a hydraulic excavator, are computed using inverse kinematics, joint velocity information and various heuristics. Some of the heuristically determined values are used to compute the parameters that relate to soil conditions. The angles between the moving components can differ widely depending on the properties of the excavated material, e.g. dry sand versus wet mud, and a wrong angle can lead to inaccurate placement of the soil. The equations by which these parameters are computed do not change unless they are reprogrammed in the system. If the excavator performs poorly because of bad assumptions or heuristically determined values, there is therefore currently no way to improve performance without interrupting operation of the machine.

The object of the invention is to eliminate the difficulties present in the prior art. In particular, a method and a system are to be provided that can autonomously monitor the progress of the work and modify the programming during operation, so that the machine works efficiently over a wide range of digging and loading conditions.

In one embodiment of the invention, a motion planning algorithm is used to control an autonomous machine. The motion planning algorithm consists of a skeleton or script that captures the general direction of the movement, while parameters in the script are filled in with the kinematic details for a particular machine and set of movements. A learning algorithm computes the script parameters using feedback on the machine's performance during the preceding cycle with the current parameter set. The parameters are adjusted so that the machine's performance improves during subsequent work cycles. The new parameters are computed by the learning algorithm using a predictive function approximator and evaluated against various performance criteria, e.g. the time required to perform a task and the accuracy with which the task was performed. The performance criteria are weighted so that the prediction of the outcome of alternative movements emphasizes the criteria considered most important. As data accumulate from repeated movements, the algorithm uses the history of the results of different movements to recompute and refine the parameters and improve performance.

The invention is explained in more detail with reference to the drawings, in which:

Fig. 1 is a flow diagram of the motion planning scheme in which the learning algorithm according to the invention is used,

Fig. 2 is a perspective view of an excavator loading a truck, and

Fig. 3 is a plan view of the positions of an excavator, a dig face and a pile of excavated material on an excavation site in a polar coordinate system.

Fig. 1 shows several components of a preferred embodiment of the system according to the invention, with one or more sensor systems 20 that provide perception information about the machine's surroundings, e.g. an excavation on an earthmoving site. The information supplied by the sensor system 20 is processed by one or more software modules in a recognition system 33, which extract a particular piece of information about the surroundings or supply a desired result for the machine's actions. On an earthmoving site, for example, the recognition system 33 can perform functions such as recognizing loading receptacles and determining their location and orientation 23, determining the desired area to be excavated 24, determining the desired area for unloading the excavated material 26, and detecting obstacles 28. A learning algorithm 30 computes script parameters using earlier results of actions together with the processed information. The parameters are used in scripts 32, which are patterns or templates that describe how a particular task (lot) is to be performed as a series of steps. The scripts 32 generate commands for controllers 22 that position the machine's movable components to accomplish the required tasks.

The learning algorithm 30 uses the information supplied by the recognition system 33 about the current initial conditions 31 and the desired results to compute the machine's next action. The initial conditions 31 can include any information about the surroundings needed to accomplish the tasks, such as the shape and location of the excavation site, the height of the truck being loaded and the location into which the material is loaded, or the initial starting configuration of the machine itself. The desired results can relate to any aspect of the task, such as a specific volume of soil captured per load, excavating a maximum amount of soil in the minimum possible time, and/or the location for the mass of loaded material. The learning algorithm 30 returns a proposed action for the machine to perform, namely the one it expects to best achieve the desired results under the current initial conditions.

The learning algorithm 30 sends a query to the recognition system 33, requesting the initial conditions 31 of the surroundings or terrain and the desired result. After the recognition system 33 responds, the learning algorithm 30 returns a proposed action for the machine. The query can be split into separate parts, e.g. one part covering the initial state of the surroundings and the desired result of one action such as digging, and another part covering the desired result of a further action, e.g. the loading procedure. In this way, the recognition system 33 can formulate a response to the second query while the first action is being carried out. The initial conditions 31 can include any information about the surroundings needed to accomplish the task, for example the shape and location of the excavation site, the truck loading height and the location into which the material is loaded. The results of actions 35 can relate to any aspect of performing the task, for example the specific volume of soil captured per load, excavating a maximum amount of soil in the minimum possible time, and/or the location for the mass of loaded material. Information about the surroundings and the desired results is available from the sensor system 20 and the recognition system 33 via the results-of-actions data 35.

The optimization routine 38 of the learning algorithm 30 generates a proposed action for the excavator using the initial conditions 31, the desired results, and experience from previous actions and results. The optimization routine 38 uses a function approximator 36 to predict the result of a candidate action before it is executed on the machine. The predicted result is then used to compute the cost of the candidate action, which reflects how close the predicted result is to the desired result. Function approximation can be done in various ways; the preferred embodiment uses a memory-based learning model in which all of the machine's earlier actions are stored exactly and explicitly over the entire service life of a remote-controlled machine. One such memory-based learning model is locally weighted regression. In locally weighted regression, each point in a database is assigned a weight that depends on its distance from the candidate action. Segments or localities of complex, non-linear functions can be approximated by relatively simple algebraic models, such as linear or quadratic equations. An exponential weighting term is typically used, and the coefficients of the local model of the data are computed using the weighted data. The weighting gives data points closer to the candidate action greater influence than points farther away. Segments of complex, non-linear functions can therefore be approximated, because weighting the local points of a segment permits different coefficients for different regions of the non-linear function.
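The locally weighted regression described above can be illustrated with a minimal sketch. It is not the patented implementation: the function names `solve_linear` and `lwr_predict`, the Gaussian form of the exponential weight, the `bandwidth` value and the small `ridge` term are all assumptions made for illustration. The sketch fits a local linear model centered on the candidate action, so the fitted intercept is the predicted outcome.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def lwr_predict(actions, outcomes, query, bandwidth=1.0, ridge=1e-9):
    """Predict one scalar outcome for `query` from stored (action, outcome) pairs.

    Assumed weighting: w = exp(-distance^2 / (2 * bandwidth^2)), so stored
    actions near the query dominate the local linear fit.
    """
    d = len(query)
    # Weighted normal equations for y = beta0 + beta . (x - query).
    a = [[0.0] * (d + 1) for _ in range(d + 1)]
    c = [0.0] * (d + 1)
    for x, y in zip(actions, outcomes):
        dist2 = sum((xi - qi) ** 2 for xi, qi in zip(x, query))
        w = math.exp(-dist2 / (2.0 * bandwidth ** 2))
        z = [1.0] + [xi - qi for xi, qi in zip(x, query)]
        for i in range(d + 1):
            c[i] += w * z[i] * y
            for j in range(d + 1):
                a[i][j] += w * z[i] * z[j]
    for i in range(d + 1):
        a[i][i] += ridge  # keeps the system solvable when data are sparse
    beta = solve_linear(a, c)
    return beta[0]  # the model is centered on the query
```

Because every stored cycle is kept and the fit is redone per query, new data take effect immediately without a separate training phase, which is the property the text contrasts with neural networks.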

The database inputs, namely the candidate remote-controlled or robot actions, and the outputs, namely the results of the robot actions, depend on the task itself. For an autonomous excavator that uses parameterized scripts, the inputs or candidate actions are a set of script parameters. The outputs are the variables to be optimized or improved, such as the execution time of a task, machine efficiency, and/or accuracy in performing the task. The output variables are preferably measured or recorded with suitable sensor systems in every work cycle.

For example, a simplified implementation of the learning algorithm 30 for an earthmoving machine, namely an excavator 50 (Fig. 2), involves choosing a limited number of input and output variables to monitor while the task is performed. In this example, the script parameters most relevant for the learning algorithm 30 to determine for an excavation work cycle were chosen as follows:

- the angle between a boom 55 and a horizontal plane 56 that triggers the swing toward a truck 57,
- the angle of rotation about a swing axis 60 that triggers the movement of a stick 54 for the unloading maneuver,
- the angle of rotation about the swing axis 60 that triggers the start of opening the bucket 58,
- the angle between the stick 54 and the boom 55 that triggers the opening of the bucket 58,
- the angle between the stick 54 and the boom 55 for the first half of the unloading process,
- the angle between the stick 54 and the boom 55 for the second half of the unloading process, and
- the angle of the bucket 58 that triggers the movement of the stick 54 from the first to the second unloading position.

In the exemplary embodiment, the output variables stored for each work cycle comprise the time (t) to complete the unloading movement, which begins immediately after digging ends and finishes when the excavator has dumped the load and swung back to the dig site, and the location of a pile of excavated material 52 in polar coordinates (r, θ) relative to the excavator. The output space is therefore three-dimensional and the input space seven-dimensional, giving a total of 10 numbers stored in the database for each work cycle.
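One way to picture the per-cycle database entry is as a record of seven inputs and three outputs. The following is only a sketch: the class and field names are hypothetical labels for the seven script parameters and three output variables named in the text, and the units noted in the comments are assumptions.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class UnloadParams:
    """Seven script parameters (inputs); angles assumed to be in radians."""
    boom_angle_swing_trigger: float         # boom 55 vs. horizontal plane 56
    swing_angle_stick_trigger: float        # rotation about swing axis 60
    swing_angle_bucket_trigger: float       # rotation about swing axis 60
    stick_boom_angle_bucket_trigger: float  # stick 54 vs. boom 55
    stick_boom_angle_first_half: float      # first half of unloading
    stick_boom_angle_second_half: float     # second half of unloading
    bucket_angle_stick_trigger: float       # bucket 58 angle


@dataclass(frozen=True)
class CycleOutcome:
    """Three measured outputs; seconds and metres are assumed units."""
    time_s: float      # time t to complete the unloading movement
    pile_r: float      # radial coordinate r of the pile 52
    pile_theta: float  # angular coordinate theta of the pile 52


@dataclass(frozen=True)
class CycleRecord:
    params: UnloadParams
    outcome: CycleOutcome

    def as_row(self):
        """Flatten to the 10 numbers stored per work cycle."""
        return astuple(self.params) + astuple(self.outcome)
```

A record built from one work cycle flattens to a 10-tuple whose first seven entries are the inputs and last three the outputs, matching the dimensions given in the text.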

The optimization routine 38 generates an initial set of candidate script parameters to be executed. These candidate parameters are then evaluated by the function approximator, which returns a predicted result of the action. The predicted result is used to compute the cost or score of executing the given candidate action. The optimization routine 38 then uses the cost information to select a new set of candidate script parameters with a lower cost or higher score. This process continues until the optimization routine 38 arrives at a final set of script parameters.
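The candidate, predict, cost, new-candidate loop just described might be sketched as follows. This is an illustration under stated assumptions, not the routine from the patent: `weighted_cost` uses a weighted squared error as one plausible cost, `predict` stands in for the function approximator, and the random-perturbation update is a simple placeholder for whichever optimizer is actually used.

```python
import random


def weighted_cost(predicted, desired, weights):
    """Weighted squared distance between predicted and desired results."""
    return sum(w * (p - d) ** 2 for p, d, w in zip(predicted, desired, weights))


def optimize(predict, start, desired, weights, step=0.1, iterations=200, seed=0):
    """Keep proposing candidates and retain the one with the lowest cost.

    predict: hypothetical callable mapping script parameters to predicted outputs.
    start:   initial candidate script parameters.
    """
    rng = random.Random(seed)
    best = list(start)
    best_cost = weighted_cost(predict(best), desired, weights)
    for _ in range(iterations):
        # Propose a new candidate near the current best (placeholder strategy).
        candidate = [x + rng.gauss(0.0, step) for x in best]
        cost = weighted_cost(predict(candidate), desired, weights)
        if cost < best_cost:  # only lower-cost candidates are accepted
            best, best_cost = candidate, cost
    return best, best_cost
```

Raising the weight of one output, for example cycle time relative to pile position, shifts the search toward candidates that do well on that criterion.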

The candidate set of script parameters used by the function approximator 36 is chosen by the optimization component 38. Various optimization routines can be used in the optimization component 38 of the learning algorithm 30 in Fig. 1 to choose a candidate action. If the dimensionality of the action space is small enough, a coarse search over all actions at some finite resolution may be acceptable. In other cases, when the dimensionality is high, e.g. greater than 4, some options are to choose actions at random, or to make random changes to the best previous actions. Interpolation can also be tried, generating a new "candidate action" from a linear combination of previous actions in the database. Another possibility is to generate actions at random and to use an estimate of the reliability of the predicted accuracy as a way of prioritizing them. If the desired result can be expressed as a cost function to be minimized or maximized, an algorithm known as gradient descent can be used to select an action.
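As a hedged illustration of the coarse search option for low-dimensional action spaces, the sketch below evaluates a cost on a regular grid. The function name, the grid resolution, and the toy cost are assumptions for the example only.

```python
import itertools

def coarse_grid_search(cost_of, bounds, points_per_axis=5):
    """Evaluate cost_of on a regular grid and return the cheapest action."""
    axes = []
    for low, high in bounds:
        step = (high - low) / (points_per_axis - 1)
        axes.append([low + i * step for i in range(points_per_axis)])
    # The grid has points_per_axis ** len(bounds) actions in total.
    return min(itertools.product(*axes), key=cost_of)

# Hypothetical two-parameter cost with its minimum on a grid point.
def toy_cost(action):
    a, b = action
    return (a - 0.5) ** 2 + (b - 0.25) ** 2

best_action = coarse_grid_search(toy_cost, [(0.0, 1.0), (0.0, 1.0)])
```

The number of evaluations grows exponentially with the number of parameters (5 points per axis gives 625 actions in 4 dimensions and 3125 in 5), which is consistent with the text's suggestion to switch to random or interpolated candidates beyond roughly 4 dimensions.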

In a preferred embodiment of the invention, the well-known downhill simplex method is used, with a locally weighted linear regression function in the approximator 36 applied for each function evaluation. The initial simplex is computed using information about the gradient at the starting point. Instead of adding a small value along each dimension of the input space, a value proportional to each term in the gradient (or the negative gradient, if minimizing) is added to the starting point. In this way, if there is a very steep slope along one dimension of the input space, the corresponding simplex vertex starts further downhill and can reach a minimum more quickly. The optimization routine can be started from several different locations if local minima are a potential problem. The starting points are chosen as the previous machine actions in the database with the best results, such as lowest cost, shortest time, or smallest radial error.
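The gradient-scaled initial simplex can be sketched as below. The text does not give the exact construction, so this is one plausible reading: each extra vertex is offset along a single axis by an amount proportional to that axis's gradient term. The function name and the `scale` factor are assumptions.

```python
def gradient_scaled_simplex(start, gradient, scale=0.1, minimize=True):
    """Return n + 1 simplex vertices for an n-dimensional start point.

    Vertex i + 1 is the start point moved along axis i by an amount
    proportional to the gradient term for that axis (negated when
    minimizing, so that steep axes start further downhill).
    """
    sign = -1.0 if minimize else 1.0
    vertices = [list(start)]
    for i, g in enumerate(gradient):
        vertex = list(start)
        vertex[i] += sign * scale * g
        vertices.append(vertex)
    return vertices

# Steep slope along axis 0, shallow slope along axis 1.
simplex = gradient_scaled_simplex([1.0, 2.0], [10.0, 0.5])
```

With these inputs the vertex along the steep axis moves a full unit while the vertex along the shallow axis moves only 0.05. A gradient term of exactly zero would collapse its vertex onto the start point, so a practical version would need a small minimum offset.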

The optimization component 38 runs until it reaches a minimum cost value. An example of a cost function suitable for the dump phase of the mass excavation task is the following equation:

$$c = w_1 t + w_2 \,(\mathit{loc}_{des} - \mathit{loc}_{act})^2 \qquad [1]$$

where $w_1$ and $w_2$ are weights on the respective terms and $\mathit{loc}_{des} - \mathit{loc}_{act}$ is the error between the desired and the actual location of the soil pile during the dump phase of the motion. For the situation shown in Figs. 2 and 3, the cost function can be written as follows:

$$c = w_t t + w_r \,(r_{des} - r_{act})^2 + w_\theta \,(\theta_{des} - \theta_{act})^2 \qquad [2]$$

Here $t$ is the execution time, $r_{act}$ and $\theta_{act}$ are the actual radius and angle of the soil pile 52, measured from a reference point on the excavator to the center of the soil pile 52 (Fig. 3), and $r_{des}$ and $\theta_{des}$ are the desired coordinates of the soil pile 52. The factor $w_\theta$ in front of the tangential error term is a weighting term used to dimensionalize and/or otherwise scale the tangential error relative to the radial error, using the equation

$$s = r\theta \qquad [3]$$

where $s$ is the arc length in the tangential direction at radius $r$. This particular cost function is a linear combination of the various output variables, namely the execution time and the components of the accuracy of the operation, e.g. the radial and tangential position errors. These terms can be weighted with adjustable weights, depending on what is considered more important. For example, the weights can be chosen by determining how much one second of time is worth in units of distance: if $w_t = 2$ and $w_r = 400$, then 0.5 seconds of time is equivalent to 5 centimeters of spatial error. The ability to adjust these weights allows a human supervisor to favor one criterion over another depending on the nature of the task.
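The worked numbers above can be checked with a small sketch of equation [2]. Two assumptions are made that the text does not state explicitly: distances are in meters and angles in radians, and the tangential weight is derived from the radial weight through $s = r\theta$ as $w_\theta = w_r r_{des}^2$. The function name is hypothetical.

```python
def dump_cost(t, r_des, r_act, theta_des, theta_act,
              w_t=2.0, w_r=400.0, w_theta=None):
    """Equation [2]: weighted time plus squared radial and tangential errors."""
    if w_theta is None:
        # Assumption: convert the angular error to arc length (s = r * theta)
        # so that it is penalized like a radial error of the same length.
        w_theta = w_r * r_des ** 2
    radial = (r_des - r_act) ** 2
    tangential = (theta_des - theta_act) ** 2
    return w_t * t + w_r * radial + w_theta * tangential

cost_half_second = dump_cost(0.5, 5.0, 5.0, 0.0, 0.0)    # time only
cost_radial_5cm = dump_cost(0.0, 5.0, 5.05, 0.0, 0.0)    # 0.05 m radial error
cost_arc_5cm = dump_cost(0.0, 5.0, 5.0, 0.0, 0.01)       # 0.01 rad at r = 5 m
```

Under these assumptions all three cases cost about 1: $2 \times 0.5 = 1$ for the time term and $400 \times 0.05^2 = 1$ for either spatial term, which matches the stated equivalence of half a second to 5 centimeters.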

Examples of further criteria that could be included in the cost function are: the error between the desired and the actual volume of soil captured in the bucket during digging, the machine efficiency expressed as available versus consumed power, and a measure of the smoothness of the motion, particularly when perception sensors are mounted on the machine. With this technique, the choice of an action to be executed on the robot can be posed as an optimization problem. Many techniques for optimizing a function are already known. With this approach, the learning system chooses, from what it already knows, the action it believes to be best, rather than a random action.

The optimization routine 38 uses the function approximator 36 to predict the result of a candidate machine action, and this prediction is used to calculate a rank for that action. In this embodiment, the function approximator 36 uses a locally weighted linear regression algorithm to calculate the predicted results. In the preferred embodiment, a weighting scheme is used to emphasize the previously stored parameters that lie closer to the "candidate" parameters. The following weighting function is one of many that can be used:

In this weighting function (equation [4]), $w_i$ is the weight assigned to the i-th set of parameters in the database, $i$ ranges from 1 to $n$, $n$ is the number of cycles for which data are stored, $x_i$ is the i-th data point, i.e. the set of parameters used when the data point was recorded, $\mathit{input}$ is the set of one or more "candidate" parameters supplied to the function approximator 36 whose output is to be predicted, $K$ is the kernel width, which scales the exponential term, and $D$ is a Euclidean distance function that returns the squared distance between the input and the i-th data point. Points very close to the input receive a higher weight than points further away; the kernel width controls how local the fit is. Large kernel widths result in a more globally uniform weighting of the data points, while a small kernel width gives weight only to the very nearest data points. All input data are normalized to between 0 and 1, based on predetermined limits on the range of each input parameter, before the weight is calculated. This prevents values with a very different scale from dominating the distance calculation. For a database with a large amount of data, the number of cycles included in the calculation can be limited to a selected number. In addition, intelligent data structures such as k-d trees can be used to greatly speed up data retrieval and weight calculation.
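The weighting step can be sketched as follows. The weighting equation itself is not legible in this text, so the form used here, $w_i = \exp(-D(x_i, \mathit{input}) / K^2)$, is an assumption chosen only because it matches the description (an exponential of the squared distance, scaled by the kernel width). The function names, parameter bounds, and sample data are likewise hypothetical.

```python
import math

def normalize(point, bounds):
    """Map each parameter to [0, 1] using its predetermined (low, high) range."""
    return [(v - low) / (high - low) for v, (low, high) in zip(point, bounds)]

def squared_distance(a, b):
    """D: squared Euclidean distance between two parameter sets."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kernel_weights(stored, query, bounds, kernel_width):
    """Assumed form: w_i = exp(-D(x_i, input) / K**2) on normalized inputs."""
    q = normalize(query, bounds)
    return [
        math.exp(-squared_distance(normalize(x, bounds), q) / kernel_width ** 2)
        for x in stored
    ]

# Hypothetical stored parameter sets (two joint angles in degrees) and ranges.
stored = [[5.0, 14.0], [50.0, 14.0], [95.0, 30.0]]
bounds = [(0.0, 180.0), (0.0, 90.0)]
query = [5.0, 14.0]

narrow = kernel_weights(stored, query, bounds, kernel_width=0.3)
wide = kernel_weights(stored, query, bounds, kernel_width=3.0)
```

With the narrow kernel the weights fall off quickly with distance from the query, while the wide kernel weights all three stored points almost equally, which is the behavior the paragraph above describes.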

Both the input and the output terms of each data point are multiplied by the weights. One way to express this in matrix form is:

$$Z = WX \qquad [5]$$
$$v = Wy \qquad [6]$$

Assuming there are $n$ data points in the database and $m$ input terms for each data point, $W$ is an ($n \times n$) diagonal matrix of the weights, $X$ is an ($n \times m$) matrix of the input terms or parameters of each data point, and $y$ is an ($n \times 1$) vector of the output value associated with each data point. In the case of multiple outputs, for example time and spatial accuracy, there are several different $y$ and $v$ vectors, but the same weights are used because they are a function of the distance between the input terms. So that the linear model is not required to pass through the origin, an additional column of ones is appended to the matrix $X$, giving an ($n \times (m + 1)$) matrix.

After the $Z$ matrix and the $v$ vector have been calculated, the $m + 1$ coefficients $\beta$ of the linear model are found by solving the following equation:

$$Z\beta = v \qquad [7]$$

To provide a numerically stable algorithm in the presence of singular or nearly singular matrices, a singular value decomposition algorithm is used to solve the above matrix equation for $\beta$. A different $\beta$ vector is calculated for each prediction and for each output.

Once $\beta$ has been calculated, the predicted output is

$$y' = \beta^{T}\,\mathit{input} \qquad [8]$$
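Equations [5] to [8] can be put together in a short dependency-free sketch. Two substitutions are made and should be read as assumptions: the weights use an assumed form $\exp(-D / K^2)$, and the least-squares solution of $Z\beta = v$ is obtained from the normal equations $Z^T Z \beta = Z^T v$ by Gaussian elimination rather than by the singular value decomposition the text calls for. All function names are hypothetical.

```python
import math

def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x

def lwr_predict(xs, ys, query, kernel_width):
    """Locally weighted linear regression prediction at `query`."""
    # Assumed weight form: exp(-squared distance / K**2).
    w = [math.exp(-sum((a - b) ** 2 for a, b in zip(x, query))
                  / kernel_width ** 2) for x in xs]
    # Z = W X with a column of ones appended to X, and v = W y.
    z = [[wi * term for term in list(x) + [1.0]] for wi, x in zip(w, xs)]
    v = [wi * y for wi, y in zip(w, ys)]
    cols = len(z[0])
    # Normal equations (Z^T Z) beta = Z^T v, swapped in for the SVD solve.
    ztz = [[sum(row[i] * row[j] for row in z) for j in range(cols)]
           for i in range(cols)]
    ztv = [sum(row[i] * vi for row, vi in zip(z, v)) for i in range(cols)]
    beta = solve_linear(ztz, ztv)
    # y' = beta^T input, with a 1 appended to match the column of ones.
    return sum(b * term for b, term in zip(beta, list(query) + [1.0]))

# Hypothetical data drawn from the exact plane y = 3a - 2b + 1.
xs = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.2]]
ys = [3.0 * a - 2.0 * b + 1.0 for a, b in xs]
prediction = lwr_predict(xs, ys, [0.3, 0.7], kernel_width=1.0)
```

Because the sample data lie on an exact plane, the prediction reproduces the plane at the query point regardless of the weights. The normal-equation route fails with a division by zero when $Z^T Z$ is singular, which is precisely the case the singular value decomposition in the text is meant to handle.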

One of the more important variables to select in the locally weighted linear regression technique is the kernel width $K$. This can be done automatically by cross-validation. During cross-validation, one or more data points are removed from the existing database, and the resulting "sub-database" is queried with the removed points. The error between the predicted and the actual response is calculated. In this way, the kernel width $K$ that yields the lowest cross-validation error can be chosen.
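A leave-one-out version of this cross-validation can be sketched for a single input parameter. The one-dimensional closed-form line fit, the assumed weight form $\exp(-D / K^2)$, the candidate kernel widths, and the sine test data are all illustrative assumptions.

```python
import math

def lwr_1d(xs, ys, query, k):
    """Locally weighted straight-line fit, evaluated at `query`."""
    # Effective weights are the squared kernel weights, because Z = WX and
    # v = Wy scale each residual by w before it is squared.
    ww = [math.exp(-2.0 * (x - query) ** 2 / k ** 2) for x in xs]
    total = sum(ww)
    mx = sum(w * x for w, x in zip(ww, xs)) / total
    my = sum(w * y for w, y in zip(ww, ys)) / total
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ww, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ww, xs, ys))
    slope = sxy / sxx if sxx > 0.0 else 0.0
    return my + slope * (query - mx)

def loo_error(xs, ys, k):
    """Mean squared leave-one-out prediction error for kernel width k."""
    total = 0.0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        total += (lwr_1d(rest_x, rest_y, xs[i], k) - ys[i]) ** 2
    return total / len(xs)

def choose_kernel_width(xs, ys, candidates):
    """Pick the candidate width with the lowest cross-validation error."""
    return min(candidates, key=lambda k: loo_error(xs, ys, k))

# Hypothetical, strongly nonlinear response sampled at 11 points.
xs = [i / 10.0 for i in range(11)]
ys = [math.sin(2.0 * math.pi * x) for x in xs]
candidates = [0.05, 0.2, 1.0, 5.0]
best_k = choose_kernel_width(xs, ys, candidates)
```

For a response this curved, a very wide kernel behaves like a single global line and leaves a large held-out error, so the selection favors a narrower width. On nearly linear or very noisy data the balance would shift toward wider kernels.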

Locally weighted linear regression has desirable properties, including readily available gradient information, the ability to handle noisy data, and freedom from local-minimum problems (such as those in neural networks), because the answer is computed analytically. More sophisticated regression algorithms, such as Bayesian regression, can additionally return confidence intervals on the predicted output and noise statistics. Quadratic regression algorithms can be used when the functions are locally quadratic rather than linear.

After the next machine action has been determined by the learning algorithm 30, the scripts 32 are filled in with the script parameters, the action is executed, and the results are measured and stored. As this process of accumulating more data continues, the script parameters are updated to improve the machine's performance. Various robot learning systems require a period of experimentation or "practice" to explore areas of the terrain, which can eventually lead to more efficient operation. According to the invention, some initial calibration runs can be carried out before the task begins. For example, with an excavator it is possible to experiment with the digging motion parameters and then deposit the soil back in the same place after the trial is finished. The experimentation phase can also provide data to be used when the result of an action cannot be fully measured during operation. This can happen when a sensor system's line of sight to the region of interest is obstructed, so that the data are noisy or missing entirely. In this situation, data obtained during calibration can be used to replace the missing data.

If there is no "history" or database to rely on for starting or improving operations when a machine is put to work at a new site for the first time, one solution is to use data from previous operations and/or machines and to assume that the environmental conditions are similar. Another option is to have a human operator initially perform a few work cycles and have the system store the relevant parameters and measured results in the database. A further method is to start with an initial set of parameters calculated from machine models and heuristically determined values. The initial parameters can then be improved once the learning algorithm takes over control.

The learning algorithm according to the invention can also be combined with other motion control techniques, for example rule-based systems. This hybrid approach allows knowledge relating to invariant parts of a given task to be encoded in rules. Parameters can then be calculated for those parts of the task where learning can improve performance.

The invention can be applied to various types of construction and agricultural machinery. An example of a script for a hydraulic excavator is shown below; it was designed to

(1) move around obstacles detected in the terrain,
(2) minimize the time for each bucket load by coupling simultaneous movements of the stick 54, the boom 55, the bucket 58, and the swing about the pivot axis 60 of the excavator 50 (Fig. 2), and
(3) minimize spillage of soil on the ground around the truck during dumping.

For this script, all parameters are joint angles. The numbers in bold are the parameters recalculated for each bucket load. The parameters defined as commands are calculated on the basis of the geometry of the excavator. The parameters in the rule portion of each script step are the trigger parameters, which are calculated using dynamic information or heuristic values.

As an example of how to read the script, consider the first two steps of the swing sub-script. The script starts at step 1 when digging is finished, and the swing command associated with step 1 is 5°, which is the current swing angle. During this time, the boom angle is monitored as the boom 55 is raised. When the boom 55 passes a certain angle, in this case 14°, the script advances from step 1 to step 2, and the swing command changes from 5° to 101°, causing the excavator 50 to swing to the designated dump location. The script continues to issue this swing command until the conditions are met to advance to step 3 and change the swing command again.
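The two swing steps just described can be modeled as a small trigger-driven state machine. This is a sketch for reading the script, not the patent's script format: the class, field, and function names are assumptions, and only the 5°, 14°, and 101° values come from the text. The trigger for leaving step 2 is not given here, so the sketch simply stays in step 2.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SwingStep:
    command_deg: float                    # swing command sent while active
    trigger_joint: Optional[str] = None   # joint monitored to leave this step
    trigger_deg: Optional[float] = None   # advance once the angle reaches this

def tick(steps, index, joint_angles):
    """Advance the script if the active step's trigger fires, then return
    (swing command in degrees, index of the active step)."""
    step = steps[index]
    fired = (step.trigger_joint is not None
             and joint_angles[step.trigger_joint] >= step.trigger_deg)
    if fired and index + 1 < len(steps):
        index += 1
    return steps[index].command_deg, index

swing_script = [
    SwingStep(5.0, "boom", 14.0),  # step 1: hold 5 deg until boom passes 14 deg
    SwingStep(101.0),              # step 2: swing to the dump location
]

index = 0
commands = []
for boom_angle in (8.0, 12.0, 14.5, 20.0):   # boom being raised
    command, index = tick(swing_script, index, {"boom": boom_angle})
    commands.append(command)
```

The swing command stays at 5° while the boom is below 14° and switches to 101° on the first reading past it. The trigger values such as 14° are the kind of variable parameter that the learning algorithm would adjust between bucket loads.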

The learning algorithm modifies the parameters when it determines that a change will improve the effectiveness of the operation described above. The scripts can be applied to many different kinds of complex motion control in robotic systems, wherever complicated tasks can be broken down into a series of simple steps. Using variable parameters in general commands is an effective way to refine operations. Coupling the movements of connections through parameters whose values represent external conditions, including the occurrence of events resulting from the movement of other connections, and internal factors such as power limitations, simplifies the generation of commands or sub-scripts and allows flexible operation.

Claims

1. A method for controlling the automated movement of an earth moving machine having a plurality of mechanical connections joined at connection points, which are simultaneously movable in response to commands generated by at least one script, each script containing at least one variable parameter defining a movement of the machine, comprising the following steps:
Determining at least one desired result,
Measuring conditions in the vicinity of the machine that relate to each desired result, and
Determining a candidate value for at least one variable parameter using a learning algorithm before executing the script in which the variable parameter is used.

2. The method according to claim 1, characterized in that the candidate value for each variable parameter is a predetermined value that is used during the initial execution of at least one script, further comprising the following steps:
Storing the measured conditions during at least one execution cycle, and
Storing the candidate value used for each variable parameter during at least one execution cycle.

3. The method according to claim 2, characterized in that the learning algorithm comprises the following step:
Executing a function approximator (36) to evaluate each candidate value by predicting the result using at least one stored value for each variable parameter and at least one of the stored conditions from at least one execution cycle.

4. The method according to claim 3, characterized in that the learning algorithm (30) further includes the following step:
Optimizing a cost function which is dependent on at least one variable which represents the desired result and on at least one corresponding data value from the actual measured conditions.

5. The method according to claim 4, characterized in that executing the function approximator (36) comprises the step of:
Predicting the result of the machine action using a weighted regression algorithm, wherein the weighting factors are calculated using an exponential function that is dependent on the difference between at least one stored variable parameter and the candidate value.

6. The method according to claim 5, characterized in that the step of executing the function approximator (36) comprises using a locally linear function to approximate a segment of the response.

7. The method according to claim 5, characterized in that the step of executing the function approximator (36) comprises using a locally quadratic function to approximate a segment of the response.

8. A system for controlling the automated movement of a machine, comprising:
a machine with several mechanical connections connected to one another at connection points, which can be moved at the same time on command,
a processing system for executing at least one script, each of which has at least one variable parameter defining a movement of the machine, and
a learning algorithm for modifying the values of at least one variable parameter based on a desired result of a machine action and conditions measured in the environment of the machine that relate to the desired result.

9. The system of claim 8, characterized in that at least one candidate value for each variable parameter is a predetermined value during the initial execution of at least one script, and that a data storage device is provided for storing the measured conditions during at least one execution cycle and for storing the value used for each variable parameter during at least one execution cycle.

10. The system of claim 9, characterized in that the learning algorithm (30) comprises a function approximator (36) that evaluates at least one candidate value for modifying at least one variable parameter by predicting the result using at least one stored value for each variable parameter and at least one of the stored measured conditions from at least one execution cycle.

11. The system of claim 10, characterized in that the learning algorithm further comprises:
a cost function which is dependent on at least one variable which represents the desired result and on at least one corresponding data value from the actually measured conditions.

12. The system of claim 11, characterized in that the function approximator (36) further comprises:
a weighted regression algorithm for predicting the result of the machine action, wherein the weighting factors are calculated using an exponential function that is dependent on the difference between at least one stored variable parameter and the candidate value.

13. The system of claim 12, characterized in that the function approximator further uses a locally linear function to approximate a segment of the response of the machine.

14. The system of claim 12, characterized in that the function approximator further uses a locally quadratic function to approximate a segment of the response of the machine.

15. The system of claim 8, characterized in that at least one fixed command is provided for controlling the movement of the machine.