DE102021213898A1

DE102021213898A1 - Method for optimizing resources in a communication network using at least two agents

Info

Publication number: DE102021213898A1
Application number: DE102021213898.5A
Authority: DE
Inventors: Felix Rauterberg; Michael Urlaub; Julia Rosenberger
Original assignee: Robert Bosch GmbH
Current assignee: Robert Bosch GmbH
Priority date: 2021-12-07
Filing date: 2021-12-07
Publication date: 2023-06-07

Abstract

The invention relates to a method for optimizing resources in a communication network using at least two agents, wherein the communication network has at least two network nodes, wherein each network node in the communication network is assigned one of the at least two agents, and wherein the method (10, 20) comprises the following step: optimizing the resources in the communication network by the at least two agents based on a reinforcement learning algorithm (11, 21).

Description

The invention relates to a method for optimizing resources in a communication network using at least two agents, and in particular to a method with which bandwidth utilization and/or the utilization of individual network nodes in the communication network can be optimized.

In general, a communication network is understood to be a transport system for data traffic. In particular, a communication network provides an end-to-end connection between the end participants or communication participants, that is, the network nodes, so that they can communicate with one another.

One example of such a communication network is an Internet-of-Things system. The Internet of Things (IoT) is generally understood to mean the technologies of a global infrastructure that make it possible to network physical and virtual objects with one another and to let them work together by means of information and communication technologies.

The Internet of Things covers a huge range of industries and use cases, from a single device to a massive cross-platform deployment of embedded technologies and cloud systems. Known examples include systems based on smart sensors, that is, sensors that combine the actual acquisition of the measured variable with the complete signal conditioning and signal processing in a single housing. The smart sensors can be designed, for example, to record consumption data relating to an infrastructure component, such as the consumption of water or energy, and to transmit the corresponding consumption data, in particular wirelessly or by radio, to a server or host of a corresponding provider.

A disadvantage here, however, is that Internet-of-Things devices usually have only limited resources such as memory, computing capacity, and bandwidth. At the same time, the volumes of data to be processed and the demand for data processing on end devices or edge devices are increasing.

Document EP 2 342 607 A1 discloses a method for developing a multi-agent system, for example an automation and/or production system, comprising software and/or hardware components whose resources and functions are represented and/or controlled by software agents. Each software agent has the ability to achieve defined goals through interaction with the environment and with other agents. The resources and functions of the software agents and of other software components are represented as services, and each software agent, in order to achieve its own goals, calls services of other software agents and offers its own resources and functions to other software agents as a service.

The object of the invention is therefore to specify an improved method for optimizing resources in a communication network.

This object is achieved by a method for optimizing resources in a communication network using at least two agents according to the features of claim 1.

The object is also achieved by a communication network according to the features of claim 7.

Disclosure of the Invention

According to one embodiment of the invention, this object is achieved by a method for optimizing resources in a communication network using at least two agents, wherein the communication network has at least two network nodes, and wherein each network node in the communication network is assigned one of the at least two agents. The resources in the communication network are optimized by the at least two agents based on a reinforcement learning algorithm.

An agent is understood here to be a software unit with a defined goal, with autonomous behavior for achieving that goal, and with the ability to interact with an environment and with other agents. In particular, an agent contains a representation of the relevant environment, or at least information obtained from a partial observation of the environment; defined goals, which may for example be specified indirectly by a trained neural network; and capabilities, that is, the possible actions from which the agent can choose. A system consisting of at least two agents is also referred to as a multi-agent system.

A reinforcement learning algorithm is further understood to be a machine learning algorithm that has no classical training phase with labeled data and instead learns independently through interaction with its environment and through rewards that it receives in response to the actions it performs. The aim is to maximize the rewards received over the long term.
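One common way to formalize the maximization of rewards over the long term is the discounted return. The application itself gives no formula; the symbols below (reward $r$, discount factor $\gamma$, time step $t$) are assumptions introduced only for illustration.

```latex
G_t = \sum_{k=0}^{\infty} \gamma^{k}\, r_{t+k+1}, \qquad 0 \le \gamma < 1
```

Under this assumption, an agent seeks a policy that maximizes the expected value of $G_t$, so that rewards received sooner count more than rewards received later.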

In general, a further distinction is made between classical reinforcement learning and deep reinforcement learning. Classical reinforcement learning is based on a strategy, or policy, that can be stored in the form of a look-up table. Deep reinforcement learning, by contrast, uses a multi-layer neural network that is trained in advance.
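As a minimal sketch of the classical variant, the policy can be held in a look-up table indexed by state and action. The update shown is the standard Q-learning rule, which the application does not name; the class `TabularAgent`, the state and action encodings, and the values of `alpha`, `gamma`, and `epsilon` are illustrative assumptions.

```python
import random
from collections import defaultdict


class TabularAgent:
    """Hypothetical agent whose policy is stored as a look-up table."""

    def __init__(self, actions, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.alpha = alpha      # learning rate (assumed value)
        self.gamma = gamma      # discount factor (assumed value)
        self.epsilon = epsilon  # exploration probability (assumed value)
        self.q = defaultdict(float)  # look-up table: (state, action) -> value
        self.rng = random.Random(seed)

    def greedy(self, state):
        # Best known action for this state according to the table.
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def act(self, state):
        # Explore occasionally, otherwise exploit the table.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return self.greedy(state)

    def learn(self, state, action, reward, next_state):
        # Standard Q-learning update: move the table entry toward
        # the reward plus the discounted best value of the next state.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
```

In a multi-agent setting, one such table could be kept per agent. A deep variant would replace the dictionary with a neural network that approximates the same values.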

This enables intelligent resource allocation by means of multi-objective optimization, in particular in large, decentralized, and dynamic systems. The resource optimization is data-based, capable of learning, and adaptive, and it can adjust to dynamic changes in the available resources. In particular, new network nodes can be integrated into the optimization without difficulty.

Overall, an improved method for optimizing resources in a communication network is thus specified.

The step of optimizing the resources by the at least two agents based on the reinforcement learning algorithm can comprise optimizing the resources based on the reinforcement learning algorithm and on predefined priorities.

Priority here generally denotes the precedence of one item or task over other tasks. In particular, tasks with the highest priorities can be handled preferentially.

This has the advantage that basic functions of the communication network in particular can be handled preferentially and are not impaired by the optimization of the resources.

The priorities can be specified, for example, by a provider and/or by the providers of individual algorithms to be executed.
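The preferential handling of high-priority tasks described above can be sketched with a priority queue. The function name `order_by_priority`, the task names, and the convention that a larger number means a higher priority are assumptions made for illustration; the application does not prescribe a data structure.

```python
import heapq


def order_by_priority(tasks):
    """Return task names so that higher-priority tasks come first.

    tasks: iterable of (name, priority) pairs. Tasks with equal priority
    keep their original relative order.
    """
    # heapq is a min-heap, so the priority is negated; the index breaks ties.
    heap = [(-priority, index, name) for index, (name, priority) in enumerate(tasks)]
    heapq.heapify(heap)
    ordered = []
    while heap:
        _, _, name = heapq.heappop(heap)
        ordered.append(name)
    return ordered
```

An agent could consult such an ordering before allocating resources, so that a basic network function is served before optional analysis tasks.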

In one embodiment, the reinforcement learning algorithm is designed to optimize bandwidth utilization in the communication network.

In particular, the reinforcement learning algorithm can be designed to query the bandwidths between the individual network nodes and, where applicable, the priorities of data to be transmitted between the network nodes and/or access rights, and then to assign a transmission path within the communication network to the data to be transmitted based on the results of the query, in order to receive a reward for doing so.
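The query, assign, and reward cycle described above can be sketched as follows. This is not the learned policy of the application: a greedy choice of the path with the most free bandwidth stands in for the agent's decision, and the reward shaping (plus or minus the priority) as well as the names `bottleneck` and `assign_path` are assumptions.

```python
def bottleneck(path, bandwidth):
    """Smallest free bandwidth along a path given as a list of node names."""
    return min(bandwidth[(a, b)] for a, b in zip(path, path[1:]))


def assign_path(paths, bandwidth, demand, priority=1):
    """Pick a transmission path for the data and compute a reward.

    paths: candidate paths, each a list of node names
    bandwidth: dict mapping directed links (a, b) to free bandwidth
    demand: bandwidth required by the data to be transmitted
    priority: priority of the data (assumed: larger is more important)
    """
    # Queried state: free bandwidth on each link of each candidate path.
    best = max(paths, key=lambda p: bottleneck(p, bandwidth))
    fits = bottleneck(best, bandwidth) >= demand
    # Assumed reward: positive if the demand fits, negative otherwise.
    reward = priority if fits else -priority
    if fits:
        # Reserve the bandwidth on every link of the chosen path.
        for link in zip(best, best[1:]):
            bandwidth[link] -= demand
    return best, reward
```

A learning agent would replace the greedy `max` with an action drawn from its policy and would use the returned reward to update that policy.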

Optimizing bandwidth utilization has the advantage that available bandwidth can be utilized optimally, latencies caused by forwarding data can be minimized, data losses due to bandwidth bottlenecks can be prevented, and the costs of expanding the network infrastructure can be reduced. In addition, the possibilities of edge computing in an Internet-of-Things system can be extended, for example.

In a further embodiment, the reinforcement learning algorithm is designed to optimize the utilization of each of the individual network nodes in the communication network.

In particular, the reinforcement learning algorithm can be designed to query the currently available resources of the individual network nodes and the open tasks, that is, which algorithms are to be executed, and then to assign the execution of the open tasks to individual network nodes based on the query results, in order to receive a reward.
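The assignment of open tasks to nodes can be sketched in the same way. Again, a simple heuristic (largest task first, node with the most free capacity) stands in for the learned decision; the function `assign_tasks`, the abstract capacity units, and the reward of plus one per placed task and minus one per unplaced task are assumptions.

```python
def assign_tasks(free_capacity, open_tasks):
    """Assign each open task to the node with the most free capacity.

    free_capacity: dict mapping node name to free compute units (assumed unit)
    open_tasks: list of (task_name, required_units) pairs
    Returns (assignment, reward), where assignment maps task name to node.
    """
    capacity = dict(free_capacity)  # work on a copy of the queried state
    assignment = {}
    reward = 0
    # Place the largest tasks first so they are not starved by small ones.
    for task, need in sorted(open_tasks, key=lambda t: -t[1]):
        node = max(capacity, key=capacity.get)
        if capacity[node] >= need:
            capacity[node] -= need
            assignment[task] = node
            reward += 1
        else:
            # No node can currently host this task.
            reward -= 1
    return assignment, reward
```

Tasks that cannot be placed remain open and could be offered again once resources are released or a new node joins the network.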

Optimizing the utilization of the individual network nodes has the advantage that available resources can be used optimally, so that, for example, data analysis and further computations can also be carried out on resource-constrained network nodes or end devices. This in turn can, for example, extend the possibilities of edge computing and reduce the costs that would arise from installing additional or more powerful hardware.

The communication network can be an Internet-of-Things system.

Such Internet-of-Things systems usually consist of decentralized, networked, and resource-constrained network nodes or end devices. The network has the structure of a meshed network and is dynamic, meaning that network nodes can be added at runtime, data and data streams can be added or removed, for example by installing a further sensor, and communication paths can be added or removed.

Optimizing the resources therefore has a positive effect in Internet-of-Things systems in particular.
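The dynamic meshed structure, with one agent assigned to each network node, could be modeled roughly as follows. The class `MeshNetwork`, its method names, and the use of plain identifiers for agents are hypothetical and serve only to illustrate nodes and communication paths being added or removed at runtime.

```python
class MeshNetwork:
    """Hypothetical in-memory model of a dynamic meshed network."""

    def __init__(self):
        self.links = {}   # node -> set of neighbouring nodes
        self.agents = {}  # node -> identifier of the agent assigned to it

    def add_node(self, node, agent):
        # A node joining at runtime is registered together with its agent.
        self.links.setdefault(node, set())
        self.agents[node] = agent

    def remove_node(self, node):
        # Drop the node, its agent, and every link that ends at it.
        for neighbour in self.links.pop(node, set()):
            self.links[neighbour].discard(node)
        self.agents.pop(node, None)

    def add_link(self, a, b):
        # Communication paths are modeled as undirected links.
        self.links[a].add(b)
        self.links[b].add(a)

    def remove_link(self, a, b):
        self.links[a].discard(b)
        self.links[b].discard(a)
```

Because every structural change goes through these four methods, an optimization step can always be run against the current topology.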

However, the communication network being an Internet-of-Things system is only a preferred embodiment. The communication network can, for example, also be any other network in which communication units can be added or removed dynamically. Furthermore, the communication network can also be a static system, for example.

In addition, the communication network can be designed to process sensor data.

A sensor, also referred to as a detector, transducer, or probe, is a technical component that can detect certain physical or chemical properties and/or the material composition of its environment, either qualitatively or quantitatively as a measured variable.

In this way, conditions outside the actual data processing system on which the method is executed can be captured in a simple manner and taken into account in the resource optimization.

A further embodiment of the invention also specifies a communication network, wherein the communication network has at least two network nodes and at least two agents that are designed to optimize resources in the communication network, wherein each network node in the communication network is assigned one of the at least two agents, and wherein the at least two agents are designed to optimize the resources in the communication network by means of a reinforcement learning algorithm.

This enables intelligent resource allocation within the communication network by means of multi-objective optimization, in particular in large, decentralized, and dynamic systems. The resource optimization is data-based, capable of learning, and adaptive, and it can adjust to dynamic changes in the available resources. In particular, new network nodes can be integrated into the optimization without difficulty. Overall, an improved communication network in which the resources are optimized is thus specified.

The at least two agents can be designed to optimize the resources in the communication network by means of the reinforcement learning algorithm and based on predefined priorities. This has the advantage that basic functions of the communication network in particular can be handled preferentially and are not impaired by the optimization of the resources.

In one embodiment, the reinforcement learning algorithm is again designed to optimize bandwidth utilization in the communication network. Optimizing bandwidth utilization has the advantage that available bandwidth can be utilized optimally, latencies caused by forwarding data can be minimized, data losses due to bandwidth bottlenecks can be prevented, and the costs of expanding the network infrastructure can be reduced. In addition, the possibilities of edge computing in an Internet-of-Things system can be extended, for example.

In a further embodiment, the reinforcement learning algorithm is designed to optimize the utilization of each of the individual network nodes in the communication network. Optimizing the utilization of the individual network nodes has the advantage that available resources can be used optimally, so that, for example, data analysis and further computations can also be carried out on resource-constrained network nodes or end devices. This in turn can, for example, extend the possibilities of edge computing and reduce the costs that would arise from installing additional or more powerful hardware.

Furthermore, the communication network can be an Internet-of-Things system.

Such Internet-of-Things systems usually consist of decentralized, networked, and resource-constrained network nodes or end devices. The network has the structure of a meshed network and is dynamic, meaning that network nodes can be added at runtime, data and data streams can be added or removed, for example by installing a further sensor, and communication paths can be added or removed.

In addition, the communication network can be designed to process sensor data. In this way, conditions outside the actual data processing system on which the optimization is executed can be captured in a simple manner and taken into account in the resource optimization.

In summary, the present invention specifies a method for optimizing resources in a communication network using at least two agents, with which bandwidth utilization and/or the utilization of individual network nodes in the communication network can be optimized.

The embodiments and developments described can be combined with one another as desired.

Further possible embodiments, developments, and implementations of the invention also include combinations, not explicitly mentioned, of features of the invention described above or below with regard to the exemplary embodiments.

List of Figures

The accompanying drawings are intended to provide a further understanding of the embodiments of the invention. They illustrate embodiments and, together with the description, serve to explain principles and concepts of the invention.

Other embodiments and many of the advantages mentioned become apparent with reference to the drawings. The elements shown in the drawings are not necessarily drawn to scale with respect to one another.

The figures show:

FIG. 1 a block diagram of a communication network according to embodiments of the invention;
FIG. 2 a flowchart of a method for optimizing resources in a communication network using at least two agents according to a first embodiment; and
FIG. 3 a flowchart of a method for optimizing resources in a communication network using at least two agents according to a second embodiment.

In the figures of the drawings, identical reference signs denote identical or functionally identical elements, parts, or components, unless stated otherwise.

FIG. 1 shows a block diagram of a communication network 1 according to embodiments of the invention.

As FIG. 1 shows, the communication network has a plurality of network nodes 2, that is, communication participants or end devices and/or sensors, which can communicate with one another, and in particular exchange data, via a network 3, for example a wireless network.

One example of such a communication network 1 is an Internet of Things system. The Internet of Things (IoT) is generally understood to mean technologies of a global infrastructure that make it possible to network physical and virtual objects with one another and to have them work together by means of information and communication technologies.

As FIG. 1 shows, the communication network 1 has at least two agents 4 which are designed to optimize resources in the communication network, each network node 2 in the communication network 1 being assigned one of the at least two agents 4, and the at least two agents 4 being designed to optimize the resources in the communication network by means of a reinforcement learning algorithm.

According to the embodiments of FIG. 1, an intelligent allocation of resources within the communication network 1 is thus made possible by means of multi-objective optimization, in particular in large, decentralized and dynamic systems. The resource optimization is data-based, capable of learning and adaptive, and can adapt to dynamic changes in the available resources. In particular, new network nodes can be integrated into the optimization without difficulty. Overall, an improved communication network 1 in which the resources are optimized is thus specified.

According to the embodiments of FIG. 1, the agents 4 are software units, each implemented on a network node 2, the agents 4 being further designed to coordinate with one another.

FIG. 1 shows two agents 4. However, more than two agents can also be provided, for example a separate agent for each network node in the communication network.

The two agents 4 are further designed to optimize the resources in the communication network 1 by means of the reinforcement learning algorithm and based on predefined priorities.

According to the embodiments of FIG. 1, the reinforcement learning algorithm is designed to optimize bandwidth utilization in the communication network.

In addition, the reinforcement learning algorithm is designed to optimize the utilization of each of the individual network nodes in the communication network.

Overall, an agent 4 thus decides on the resources of one or more network nodes 2 in the communication network 1, the agent 4 being designed to utilize the corresponding network nodes 2 optimally without overloading them.

According to the embodiments of FIG. 1, the communication network 1 is furthermore an Internet of Things system.

In addition, the communication network 1 is designed to process sensor data.

FIG. 2 shows a flowchart of a method 10 for optimizing resources in a communication network using at least two agents according to a first embodiment.

FIG. 2 shows in particular a method 10 for optimizing resources in a communication network using at least two agents, the communication network having at least two network nodes, each network node in the communication network being assigned one of the at least two agents, and wherein, in a step 11, the resources in the communication network are optimized by the at least two agents based on a reinforcement learning algorithm.

According to the first embodiment, the step 11 of optimizing the resources based on the reinforcement learning algorithm by the at least two agents comprises optimizing the resources based on the reinforcement learning algorithm and based on predefined priorities.

According to the first embodiment, the reinforcement learning algorithm is designed to optimize bandwidth utilization in the communication network.

As FIG. 2 shows, a current state is first queried in a step 12. Querying the current state can include querying the current bandwidths between the network nodes, querying open tasks and in particular which participant or network node requires which data, querying priorities of the data, and/or querying access rights.
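The state queried in step 12 can be pictured as a simple record. The following is a minimal sketch only; the names `DataRequest`, `NetworkState`, `free_bandwidth`, `denied` and the convention that a larger `priority` value means more urgent are assumptions made for illustration and are not taken from the application.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataRequest:
    """One open task: a network node needs a piece of data (hypothetical type)."""
    data_id: str
    source: str
    destination: str
    priority: int  # assumed convention: larger value = more urgent


@dataclass
class NetworkState:
    """Snapshot an agent might assemble in step 12 (hypothetical type)."""
    # (node_a, node_b) -> currently free bandwidth on that link
    free_bandwidth: dict = field(default_factory=dict)
    # open tasks, i.e. which node requires which data
    open_requests: list = field(default_factory=list)
    # node -> set of data ids that node must not receive (access rights)
    denied: dict = field(default_factory=dict)

    def by_priority(self):
        """Return the open requests, most urgent first."""
        return sorted(self.open_requests, key=lambda r: r.priority, reverse=True)


state = NetworkState(
    free_bandwidth={("A", "B"): 1.0, ("A", "C"): 5.0},
    open_requests=[DataRequest("d1", "A", "B", 1), DataRequest("d2", "A", "C", 5)],
)
ordered = [r.data_id for r in state.by_priority()]
```

In this sketch the priorities are kept on the individual requests, so that a later assignment step could process the most urgent data first.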

In a step 13, transmission paths, in particular transmission paths between corresponding network nodes, are then assigned to the individual data items to be transmitted within the communication network. The assignment of transmission paths can include data splitting, that is, dividing the data so that only part of the data is transmitted on one transmission path and the data are reassembled, if required, at another point within the communication network. Detours or multi-hop routing can also be considered, for example if a direct connection is too heavily utilized. Dynamic changes in the communication network can likewise be taken into account, for example routes that cease to exist when network nodes are removed, or newly available routes when network nodes are added.
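The detour case mentioned above, in which a multi-hop route is preferred because the direct connection is too heavily utilized, can be illustrated with a conventional widest-path search. This is a hand-written stand-in and not the reinforcement learning policy of the application; the function name `widest_path` and the link table are assumptions for illustration.

```python
import heapq


def widest_path(links, src, dst):
    """Return (bottleneck, path) maximizing the minimum free bandwidth on the path.

    links maps (node_a, node_b) -> free bandwidth and is treated as undirected.
    Returns (0.0, []) if dst cannot be reached from src.
    """
    graph = {}
    for (u, v), free in links.items():
        graph.setdefault(u, []).append((v, free))
        graph.setdefault(v, []).append((u, free))

    best = {src: float("inf")}  # best known bottleneck per node
    heap = [(-float("inf"), src, [src])]  # max-heap via negated bottleneck
    while heap:
        neg_width, node, path = heapq.heappop(heap)
        width = -neg_width
        if node == dst:
            return width, path
        if width < best.get(node, 0.0):
            continue  # stale heap entry
        for nxt, free in graph.get(node, []):
            new_width = min(width, free)
            if new_width > best.get(nxt, 0.0):
                best[nxt] = new_width
                heapq.heappush(heap, (-new_width, nxt, path + [nxt]))
    return 0.0, []


# The direct link A-B is nearly saturated, so the detour via C is chosen.
links = {("A", "B"): 1.0, ("A", "C"): 5.0, ("C", "B"): 4.0}
bottleneck, path = widest_path(links, "A", "B")
```

Removing a node from `links` or adding new entries changes the result on the next call, which mirrors the dynamic changes of the network described above.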

According to the first embodiment, read rights, that is, information about which network node must not receive which data, can also be taken into account. However, the data can also be transmitted in encrypted form, so that such read rights do not have to be taken into account.
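One conceivable way to take read rights into account is to discard candidate routes that pass through a node which must not receive the data. The sketch below assumes a hypothetical `denied` table (node to set of forbidden data ids) and assumes unencrypted transmission; neither the table nor the function names come from the application.

```python
def route_respects_read_rights(route, data_id, denied):
    """True if no node after the source is forbidden from receiving data_id.

    route is a list of node names starting at the source node.
    denied maps node -> set of data ids that node must not receive.
    """
    return all(data_id not in denied.get(node, set()) for node in route[1:])


def permitted_routes(routes, data_id, denied):
    """Keep only the candidate routes that respect the read rights."""
    return [r for r in routes if route_respects_read_rights(r, data_id, denied)]


denied = {"C": {"d1"}}
candidates = [["A", "C", "B"], ["A", "B"]]
allowed = permitted_routes(candidates, "d1", denied)
```

If the data are transmitted in encrypted form, this filter could simply be skipped, as noted above.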

Furthermore, it is also possible to define preferred routes in advance and to take these into account in the assignment in step 13.

In addition, FIG. 2 shows a step 14 of assigning a reward to the at least two agents based on the assignment in step 13. The reward results in particular from the effect of the assignment on the communication network. The more closely an optimal state is approached, the higher the reward.

However, the assignment of a reward is only relevant for training the reinforcement learning algorithm. If, on the other hand, an already fully trained reinforcement learning algorithm is used, the assignment of a reward in step 14 can be omitted.

The agents can then learn from the reward received, with the optimization goal of maximizing the reward.
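The application does not name a specific reinforcement learning algorithm. Purely as an illustration of how an agent could learn from a received reward, the following sketch uses a tabular Q-learning update; the choice of Q-learning, the function name `q_update` and the values of `alpha` and `gamma` are assumptions.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q-value.

    q maps (state, action) -> estimated long-term reward.
    alpha is the learning rate, gamma the discount factor (assumed values).
    """
    # Value of the best action available in the next state (0.0 if none).
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


q = defaultdict(float)
# An agent chose the detour route in some state and received reward 1.0.
new_value = q_update(q, "congested", "detour", 1.0, "balanced", ["direct", "detour"])
```

Repeating this update over many assignments shifts the table toward actions that yield higher rewards. With a fully trained table, the update and the reward could be omitted and only the best action looked up.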

FIG. 3 shows a flowchart of a method 20 for optimizing resources in a communication network using at least two agents according to a second embodiment.

FIG. 3 again shows a method 20 for optimizing resources in a communication network using at least two agents, the communication network having at least two network nodes, each network node in the communication network being assigned one of the at least two agents, and wherein, in a step 21, the resources in the communication network are optimized by the at least two agents based on a reinforcement learning algorithm.

The difference between the method 20 according to the second embodiment shown in FIG. 3 and the method 10 according to the first embodiment shown in FIG. 2 is in particular that the reinforcement learning algorithm according to the second embodiment is designed to optimize the utilization of each of the individual network nodes in the communication network.

As FIG. 3 shows, a current state is again first queried in a step 22. Querying the current state in step 22 can include, for example, querying a current resource utilization, for example of storage and/or computing capacities, of the individual network nodes monitored by one of the at least two agents, querying a current resource utilization of network nodes monitored by others of the at least two agents, and/or querying a processing status of tasks currently being executed.

In a step 23, a task to be executed is then assigned to at least one of the network nodes for execution, based on the information obtained in step 22. The assignment in step 23 is made in particular with the aim of distributing a maximum number of tasks to the network nodes without overloading resources, while achieving a resource utilization that is as uniform as possible.

However, it is also possible that a task to be executed is not assigned to any device in step 23, for example if all network nodes are already fully utilized at that point in time. In this case, the task would be deferred, and an attempt would first be made to assign other tasks.
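The goals of step 23 (assign as many tasks as possible, avoid overload, keep utilization even, defer what does not fit) can be made concrete with a simple greedy baseline. This is not the learned policy of the application but a hand-written heuristic for comparison; `assign_tasks` and the scalar `demand` and `capacity` values are assumptions for illustration.

```python
def assign_tasks(tasks, load, capacity):
    """Greedily assign tasks to nodes without exceeding any node's capacity.

    tasks is a list of (task_id, demand) pairs.
    load and capacity map node -> current load / maximum load.
    Returns (assignment, deferred, new_load).
    """
    load = dict(load)  # work on a copy; the caller's state stays unchanged
    assignment, deferred = {}, []
    for task_id, demand in tasks:
        # Nodes that can take the task without being overloaded.
        candidates = [n for n in load if load[n] + demand <= capacity[n]]
        if not candidates:
            deferred.append(task_id)  # no node has room: defer, try other tasks
            continue
        # Pick the node whose relative utilization stays lowest afterwards.
        node = min(candidates, key=lambda n: (load[n] + demand) / capacity[n])
        load[node] += demand
        assignment[task_id] = node
    return assignment, deferred, load


tasks = [("t1", 3), ("t2", 3), ("t3", 5)]
assignment, deferred, new_load = assign_tasks(
    tasks, load={"n1": 0, "n2": 0}, capacity={"n1": 4, "n2": 4}
)
```

Here `t3` exceeds the remaining capacity of every node and is therefore deferred, while the smaller tasks are spread over both nodes.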

A task to be executed can also be executed on a network node on which an agent is implemented. Read rights of the individual network nodes can also be taken into account, that is, which network node may read which network data.

In addition, FIG. 3 again shows a step 24 of assigning a reward to the at least two agents based on the assignment in step 23. The reward again results in particular from the effect of the assignment on the communication network. The more closely an optimal state is approached, the higher the reward.
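The application leaves the exact form of the reward open. One possible sketch, assuming that "closer to the optimal state" means evenly utilized nodes and no overload, is shown below; the function `load_balance_reward`, the use of the standard deviation and the `overload_penalty` weight are assumptions, and the number of successfully assigned tasks is deliberately left out to keep the example short.

```python
from statistics import pstdev


def load_balance_reward(load, capacity, overload_penalty=1.0):
    """Return a reward that is higher for even, non-overloaded utilization.

    load and capacity map node -> current load / maximum load.
    """
    utilization = [load.get(n, 0.0) / capacity[n] for n in capacity]
    overloaded = sum(1 for u in utilization if u > 1.0)
    # Even utilization gives a spread of 0; each overloaded node costs a penalty.
    return -pstdev(utilization) - overload_penalty * overloaded


capacity = {"n1": 4, "n2": 4}
balanced = load_balance_reward({"n1": 2, "n2": 2}, capacity)
unbalanced = load_balance_reward({"n1": 4, "n2": 0}, capacity)
overloaded = load_balance_reward({"n1": 5, "n2": 0}, capacity)
```

With these inputs the evenly loaded case scores highest and the overloaded case lowest, which matches the stated goal of uniform utilization without overload.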

Again, however, the assignment of a reward is only relevant for training the reinforcement learning algorithm. If, on the other hand, an already fully trained reinforcement learning algorithm is used, the assignment of a reward in step 24 can be omitted.

The agents can then again learn from the reward received, with the optimization goal of maximizing the reward.


Claims

Method for optimizing resources in a communication network using at least two agents, the communication network having at least two network nodes, each network node in the communication network being assigned one of the at least two agents, and the method (10, 20) having the following step: - optimizing the resources in the communication network based on a reinforcement learning algorithm by the at least two agents (11, 21).

Method according to claim 1, wherein the step of optimizing the resources based on the reinforcement learning algorithm by the at least two agents (11, 21) comprises optimizing the resources based on the reinforcement learning algorithm and based on predefined priorities.

Method according to claim 1 or 2, wherein the reinforcement learning algorithm is designed to optimize bandwidth utilization in the communication network.

Method according to one of claims 1 to 3, wherein the reinforcement learning algorithm is designed to optimize the utilization of each of the individual network nodes in the communication network.

Method according to one of claims 1 to 4, wherein the communication network is an Internet of Things system.

Method according to one of claims 1 to 5, wherein the communication network is designed to process sensor data.

Communication network, the communication network (1) having at least two network nodes (2) and at least two agents (4) which are designed to optimize resources in the communication network (1), each network node (2) in the communication network (1) being assigned one of the at least two agents (4), and wherein the at least two agents (4) are designed to optimize the resources in the communication network (1) by means of a reinforcement learning algorithm.

Communication network according to claim 7, wherein the at least two agents (4) are designed to optimize the resources in the communication network (1) by means of the reinforcement learning algorithm and based on predefined priorities.

Communication network according to claim 7 or 8, wherein the reinforcement learning algorithm is designed to optimize bandwidth utilization in the communication network.

Communication network according to one of claims 7 to 9, wherein the reinforcement learning algorithm is designed to optimize the utilization of each of the individual network nodes in the communication network.

Communication network according to one of claims 7 to 10, wherein the communication network (1) is an Internet of Things system.

Communication network according to one of claims 7 to 11, wherein the communication network (1) is designed to process sensor data.