WO2022232994A1 - Devices and methods for autonomous distributed control of computer networks - Google Patents

Devices and methods for autonomous distributed control of computer networks

Info

Publication number
WO2022232994A1
Authority
WO
WIPO (PCT)
Prior art keywords
network
network node
hidden state
nodes
network nodes
Application number
PCT/CN2021/091915
Other languages
French (fr)
Inventor
Albert CABELLOS
Guillermo BERNARDEZ
Bo Wu
Jose SUAREZ VARELA
Sheng Xu
Miquel FERRIOL GALMES
Xiangle CHENG
Pere BARLET
Shihan XIAO
Wenjie Liu
Original Assignee
Huawei Technologies Co., Ltd.
Application filed by Huawei Technologies Co., Ltd.
Priority to PCT/CN2021/091915
Publication of WO2022232994A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L 41/12: Discovery or management of network topologies
    • H04L 41/08: Configuration management of networks or network elements
    • H04L 41/0803: Configuration setting
    • H04L 41/0823: Configuration setting characterised by the purposes of a change of settings, e.g. optimising configuration for enhancing reliability
    • H04L 41/0894: Policy-based network configuration management
    • H04L 41/04: Network management architectures or arrangements
    • H04L 41/046: Network management architectures or arrangements comprising network management agents or mobile agents therefor
    • H04L 41/50: Network service management, e.g. ensuring proper service fulfilment according to agreements
    • H04L 41/5061: Network service management characterised by the interaction between service providers and their network customers, e.g. customer relationship management
    • H04L 41/5067: Customer-centric QoS measurements

Definitions

  • the present disclosure relates to communication networks in general. More specifically, the present disclosure relates to devices and methods for autonomous distributed control of computer networks.
  • Network optimization techniques are generally used to improve network performance, so as to provide network services to network users with improved quality of service and quality of user experience, and also satisfy other service requirements.
  • network optimization is facing challenges to meet stricter requirements, such as lower optimization cost, shorter optimization time, higher optimization accuracy, and the like.
  • Artificial intelligence provides techniques that use an electronic machine to mimic human intelligence. Artificial intelligence techniques aim to solve many problems, such as reasoning, planning, learning, natural language processing, perception, moving and manipulating objects to name a few. Artificial intelligence techniques have already been used in various applications, such as autonomous vehicles, medical diagnosis, playing games (such as Chess) , search engines (such as Google search) , online assistants (such as Siri) , and image recognition, among many others. Artificial intelligence techniques have also been used in the field of telecommunications, e.g., for improving telecommunications services and products.
  • a network node for performing, i.e. supporting a distributed networking function in a communication network.
  • the network node is connected via one or more communication links to one or more neighboring network nodes of a plurality of network nodes of the communication network.
  • the distributed networking function may be a distributed routing function for routing data within the communication network.
  • the network node comprises a message processing module, including a message aggregation module and a hidden state update module.
  • the message aggregation module is configured to receive a message from each of the one or more neighboring network nodes, wherein the message comprises information about the current hidden state of the respective neighboring network node.
  • the hidden state update module is configured to generate an updated hidden state of the network node based on the current hidden state of the network node and the one or more current hidden states of the one or more neighboring network nodes.
  • the message processing module may implement an iterative update process, including a plurality of message exchange stages and a corresponding plurality of stages of updating the hidden state of the network node.
  • the network node further comprises a control policy module configured to select, based on the updated hidden state of the network node and the updated hidden states of the one or more neighboring network nodes, one or more actions for increasing, i.e. optimizing a performance measure of the distributed networking function in the communication network.
  • the network node according to the first aspect as part of a plurality of like network nodes of the communication network may learn a communication protocol configured to optimize a distributed networking function in a fully automated and decentralized manner, meaning that the network node only exchanges information with other nodes in its neighbourhood.
  • this results in a very good scalability of the optimization of the distributed networking function provided by the plurality of network nodes to larger networks.
  • the optimization of the distributed networking function provided by the plurality of network nodes may be easily adapted to other networks not seen during a training phase. More specifically, the optimization of the distributed networking function provided by the plurality of network nodes may be trained in controlled testbeds (e.g., in a networking lab) and then be deployed in real-world networks.
  • the message processing module is configured to iteratively generate the updated hidden state of the network node based on the current hidden state of the network node and the one or more current hidden states of the one or more neighboring network nodes for a predefined number of iterations or until a stop criterion has been reached.
  • the message processing module comprises a graph neural network, GNN, in particular a message passing neural network, MPNN, configured to generate the updated hidden state of the network node based on the current hidden state of the network node and the one or more current hidden states of the one or more neighboring network nodes.
  • the message processing module is configured to initialize the hidden state of the network node based on one or more input features of the graph neural network.
  • the one or more input features comprise at least one of: one or more adjustable network node parameters relating to a global network routing strategy, in particular information about the next forwarding hops or link weights; traffic measurement information, in particular link utilization information; and graph-based measurement information, in particular betweenness centrality information.
  • each graph node of the GNN represents one of the plurality of network nodes and each graph edge of the GNN represents a respective communication link between two respective network nodes of the plurality of network nodes.
  • the message processing module is configured to represent the current hidden state of the network node as a hidden state vector.
  • the hidden state vector comprises a first component indicative of a weight of a respective communication link and/or a second component indicative of a utilization measure of a respective communication link.
  • one or more further components of the hidden state vector are zero-padded.
  • the GNN is defined by one or more adjustable GNN parameters, in particular GNN weights, wherein the values of the one or more adjustable GNN parameters, in particular GNN weights, are based on a training stage of the GNN, wherein in the training stage the plurality of network nodes have defined a first network architecture and at least one second network architecture different from the first network architecture.
  • the control policy module comprises a multi-agent reinforcement learning, MARL, neural network configured to select, based on the updated hidden state of the network node and the updated hidden states of the one or more neighboring network nodes, the one or more actions for increasing, i.e. optimizing a performance measure of the global function in the communication network.
  • the distributed networking function is a distributed routing function within the communication network, wherein a reward defined by the MARL neural network is a change of the maximum communication link load.
  • the control policy module is configured to generate a destination-based routing table and to route data within the communication network on the basis of the destination-based routing table.
  • the control policy module is configured to generate the destination-based routing table by determining a weighted shortest path to a destination network node of the plurality of network nodes based on a plurality of weights of the plurality of communication links defining the weighted shortest path to the destination network node.
  • the MARL neural network is defined by one or more adjustable MARL neural network parameters, in particular MARL neural network weights, wherein the values of the one or more adjustable MARL neural network parameters, in particular MARL neural network weights, are based on a training stage of the MARL neural network, wherein in the training stage the plurality of network nodes have defined a first network architecture and at least one second network architecture different from the first network architecture.
  • the network node is a router or a switch.
  • Embodiments disclosed herein address the design of an automated distributed protocol in which the network itself learns how to achieve a predefined goal, e.g. routing optimization, without feeding the system with any previous knowledge about the information that should be exchanged.
  • embodiments disclosed herein provide a fully decentralized system where a set of agents are executed in network nodes and coordinate with each other to optimize a global goal. This can be especially interesting from an industry perspective, as the system can be implemented into separate AI-enabled network devices that run agents in a distributed way. Such a distributed system leverages MARL to achieve effective cooperation between agents towards the desired goal. Moreover, thanks to the GNN, embodiments disclosed herein also offer generalization capabilities, i.e. the ability to respond successfully to other network scenarios not seen during training. In this context, the assembly of network nodes 110a-g can be trained in a controlled network environment (e.g., at the vendor’s facilities) and then be directly deployed in real networks.
  • an assembly of network nodes comprising a plurality of network nodes according to the first aspect is provided.
  • a method for performing, i.e. supporting a distributed networking function by a network node in a communication network is provided.
  • the network node is connected via one or more communication links to one or more neighboring network nodes of a plurality of network nodes of the communication network.
  • the method comprises the steps of: receiving a message from each of the one or more neighboring network nodes, wherein the message comprises information about the current hidden state of the respective neighboring network node; generating an updated hidden state of the network node based on the current hidden state of the network node and the one or more current hidden states of the one or more neighboring network nodes; and selecting, based on the updated hidden state of the network node and the updated hidden states of the one or more neighboring network nodes, one or more actions for increasing, i.e. optimizing a performance measure of the distributed networking function in the communication network.
  • the message receiving step and the updating step may be repeated for iteratively updating the hidden state of the network node.
  • a computer program product comprising a computer-readable storage medium for storing program code which causes a computer or a processor to perform the method according to the third aspect when the program code is executed by the computer or the processor.
  • Fig. 1 shows a schematic diagram illustrating a communication network including a plurality of network nodes according to an embodiment
  • Fig. 2 shows a schematic diagram illustrating in more detail one network node according to an embodiment
  • Fig. 3 shows a schematic diagram illustrating a communication network including a plurality of network nodes and a message exchanged between the plurality of network nodes according to an embodiment
  • Fig. 4 shows a schematic diagram illustrating a training stage and an application stage of the plurality of network nodes according to an embodiment
  • FIG. 5 shows in more detail processing steps implemented at a network node according to an embodiment
  • Fig. 6 shows a flow diagram illustrating a method for performing a distributed networking function according to an embodiment.
  • a disclosure in connection with a described method may also hold true for a corresponding device or system configured to perform the method and vice versa.
  • a corresponding device may include one or a plurality of units, e.g. functional units, to perform the described one or plurality of method steps (e.g. one unit performing the one or plurality of steps, or a plurality of units each performing one or more of the plurality of steps) , even if such one or more units are not explicitly described or illustrated in the figures.
  • if a specific apparatus is described based on one or a plurality of units, e.g. functional units, a corresponding method may include one step to perform the functionality of the one or plurality of units (e.g. one step performing the functionality of the one or plurality of units, or a plurality of steps each performing the functionality of one or more of the plurality of units), even if such one or plurality of steps are not explicitly described or illustrated in the figures. Further, it is understood that the features of the various exemplary embodiments and/or aspects described herein may be combined with each other, unless specifically noted otherwise.
  • Figure 1 shows a schematic diagram illustrating a communication network 100 including a plurality, i.e. an assembly of network nodes 110a-g according to an embodiment.
  • the plurality of network nodes 110a-g are configured to communicate with each other using messages 150 via a plurality of communication links, such as the communication link 111 between the network node 110a and its neighboring network node 110b.
  • the network node 110a will be described in more detail with the understanding that the exemplary network node 110a is a representative example for the plurality of network nodes and that the other network nodes 110b-g may have the same features as the exemplary network node 110a.
  • the network node 110a is configured to support, i.e. perform a distributed networking function in the communication network 100.
  • the distributed networking function may be a distributed routing function for routing data within the communication network 100.
  • the plurality of network nodes 110a-g may comprise a plurality of network routers or switches 110a-g.
  • the communication network 100 may be an IP-based communication network 100.
  • the network node 110a may comprise a processing circuitry 120, including, for instance, one or more processors 120 for processing data and implementing an agent on the network node 110a, a communication interface 121 for exchanging data via the plurality of communication links with the neighboring network nodes 110b-d and a non-volatile memory 140.
  • the processing circuitry 120 of the exemplary network node 110a may be implemented in hardware and/or software.
  • the hardware may comprise digital circuitry, or both analog and digital circuitry.
  • Digital circuitry may comprise components such as application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), digital signal processors (DSPs), artificial intelligence (AI) processors or general-purpose processors.
  • the non-volatile memory 140 may be configured to store data and executable program code which, when executed by the processing circuitry 120 causes the exemplary network node 110a to perform the functions, operations and methods described in the following.
  • the network node 110a comprises a message processing module 121, including a message aggregation module 121a and a hidden state update module 121b.
  • the message processing module 121 including the message aggregation module 121a and the hidden state update module 121b, may be implemented by the processing circuitry 120 of the network node 110a.
  • the message aggregation module 121a of the network node 110a is configured to receive a message 150 from each of its neighboring network nodes, which in the example shown in figure 2 are the neighboring network nodes 110b-e.
  • the respective message 150 received from the respective neighboring network node 110b-e comprises information about the current hidden state of the respective neighboring network node 110b-e.
  • the hidden state update module 121b of the network node 110a is configured to generate an updated hidden state of the network node 110a based on the current hidden state of the network node 110a and the one or more current hidden states of the neighboring network nodes 110b-e.
  • the network node 110a further comprises a control policy module 123 (which may be implemented by the processing circuitry 120) configured to select, based on the updated hidden state of the network node 110a and the updated hidden states of the neighboring network nodes 110b-e, one or more actions for increasing, i.e. optimizing a performance measure of the distributed networking function in the communication network.
  • the processing circuitry 120 of the network node 110a may implement one or more agents, which, in turn, implement, for instance, the message processing module 121 and the control policy module 123.
  • the agents i.e. the network nodes 110a-g learn what information to exchange to achieve a global goal.
  • the plurality of network nodes 110a-g may implement a multi-agent reinforcement learning (MARL) system.
  • each network node 110a-g may implement one or more RL agents.
  • agents may correspond to different types of network elements, from physical (e.g., the network nodes 110a-g or links) to more abstract ones (e.g., end-to-end paths) .
  • each agent may be configured to control the configuration and local state of the entity associated with it.
  • each agent, i.e. network node 110a-g, has some internal attributes (also referred to as a hidden state).
  • the plurality of network nodes 110a-g may implement a graph neural network, GNN, in the form of a Message-Passing Graph Neural Network (MPNN) .
  • nodes/links of the network 100 can be represented as node/link entities in the GNN, and the information exchanged internally by these GNN entities are ultimately messages transmitted over the network infrastructure.
  • Further details about the Message-Passing Graph Neural Network implemented by embodiments disclosed herein may be found in J. Gilmer, S.S. Schoenholz, P.F. Riley, O. Vinyals, and G.E. Dahl, “Neural message passing for quantum chemistry,” in Proceedings of the International Conference on Machine Learning (ICML), Volume 70, pp. 1263–1272, 2017.
  • each agent ends up learning what information is relevant to exchange with its neighbors, and also discovers how to adapt its internal attributes (i.e. its hidden state) to optimize the global objective, i.e. to increase a performance measure of a distributed networking function in the communication network 100, such as minimizing the average delay.
  • a key aspect of this process is parameter sharing.
  • all agents, i.e. network nodes 110a-g may have the same internal parameters and optimize them jointly.
  • the agents, i.e. network nodes 110a-g may share the GNN parameters (i.e. the parameters of the neural networks implemented within the MPNN) .
  • the plurality of trained agents, i.e. network nodes 110a-g can be replicated and distributed across multiple devices of any other network.
  • each agent 110a-g modifies its internal attributes and adapts them based on its local observations. More specifically, as already described above, these attributes may be included in the hidden states exchanged by the GNN, and each agent, i.e. network node 110a-g shares these hidden state representations with its neighbors via the messages 150 over the network 100.
  • hidden states may be simultaneously optimized for both encoding meaningful information within the messages 150 between the agents and determining the best actions to take.
  • implementing MPNN into a MARL system provides a distributed solution that facilitates the cooperation between multiple agents that communicate via messages 150.
  • the assembly of trained agents, i.e. network nodes 110a-g, can be successfully deployed in any network architecture without the necessity of retraining.
  • the message aggregation module 121a of the exemplary network node 110a receives the information from its neighboring network nodes 110b-e. As already described above, this can be implemented by a GNN-based, in particular an MPNN-based, module that is responsible for processing this data, in particular choosing what relevant information of the neighbors’ hidden states should be considered. Then, the output of the message aggregation module 121a is used by the hidden state update module 121b. As already described above, the hidden state update module 121b may be based on an NN as well and is configured to combine the selected information from the previous step with the internal hidden state of the network node 110a, thereby generating an updated hidden representation for that network node 110a.
  • the message aggregation module 121a and the hidden state update module 121b are part of the message processing module 121 that may be identified with the operation of the MPNN.
  • the functions executed by these modules are not necessarily modeled by an NN.
  • In a final stage, after the iterative execution of the message processing module 121, the control policy module 123 is in charge of choosing an action to pursue the global network goal based on the final hidden states generated by the agents, i.e. the network nodes 110a-g.
  • the control policy module 123 may be implemented as an NN that produces the output policy.
  • the distributed networking function performed by the plurality of network nodes 110a-g is a distributed routing function within the communication network 100, wherein a reward defined by the MARL neural network is a change of the maximum communication link load (i.e. load balancing) .
  • the goal of the network 100 is to minimize the utilization of the maximum loaded link (which is a common concern in carrier-grade networks, i.e., geographically-distributed networks with hundreds of nodes like Internet Service Provider (ISP) ones) .
  • each network node 110a-g may be implemented as a network router 110a-g, wherein each router 110a-g is equipped with an autonomous agent.
  • the agents of the routers 110a-g receive as inputs the hidden states from the neighboring agents of the GNN that interconnects all the agents 110a-g. With this information the agents change the local configuration (e.g. the routing weights) of each router 110a-g, for instance by dynamically adjusting the weights (and thus, the flow of the paths) to achieve the network goal.
  • additional configurations may be possible, for instance by dynamically adjusting MPLS paths, ECMP weights, etc.
  • One main advantage provided by embodiments disclosed herein in this scenario is that, while conventional solutions assume that the traffic matrix (bandwidth per source-destination pair) is static, the autonomous agents 110a-g can adjust the configuration of the network 100 in the presence of dynamic traffic.
  • FIG. 3 shows the communication network 100 including the plurality of network nodes 110a-g and a message 150 exchanged between the plurality of network nodes 110a-g according to an embodiment.
  • each node 110a-g of the network 100 may assemble one or several AI-based modules/agents that, at execution time, iteratively define (and process) the message communications, such as the message 150, with the direct neighboring nodes, i.e. agents.
  • Figure 4 shows schematically a training stage and an application stage of the plurality of network nodes 110a-g according to an embodiment.
  • the plurality of network nodes 110a-g may be trained in a controlled network environment (e.g., at the vendor’s facilities) and then be directly deployed in different real networks without any retraining required.
  • Each agent, i.e. router 110a-g, computes, collects, or is provided with some input features.
  • These input features may include, but are not limited to: internal device attributes that relate to the global network routing strategy and that can be modified by the device’s own actions in order to optimize the routing (such as next forwarding hops or link weights); traffic measurements (such as link utilization); and/or generic graph-based measurements (such as betweenness centrality).
  • All the agents, i.e. routers 110a-g, initialize their hidden states based on their corresponding input features (see 501 of figure 5).
  • input features are encoded into hidden states, which may be fixed-size vectors.
  • this initialization may be performed by adding the input features and applying zero-padding to fit the dimension of the hidden state vector.
  • each agent, i.e. router 110a-g, sends the same message, which contains its current hidden representation, to all its neighboring agents (see 503 of figure 5). Consequently, each agent receives the hidden states of all its neighboring agents.
  • every agent, i.e. router 110a-g, processes the received hidden states in the following way. Firstly, each neighbor hidden state is combined with the agent’s own representation by means of the message aggregation module 121a (see 505), which can be implemented as a neural network. Secondly, the resulting combined representations of the previous step may be aggregated using a predefined aggregation function, such as an element-wise sum of the hidden state vectors. Finally, the agent, i.e. router 110a-g, updates its own hidden state with the new aggregated information using the hidden state update module 121b (see 507), which may be implemented as a neural network as well. As a result, each agent generates an updated hidden representation, i.e. hidden state, that potentially incorporates information from its neighborhood.
  • the message exchange and processing of steps 503, 505 and 507 may be repeated several times, each turn considering the updated agent hidden states.
  • During these iterations, there is a recognizable pattern of periodic message communications between neighboring nodes.
  • The range of this information exchange expands over successive message-passing iterations.
  • Thus, agents gain access to information from more distant nodes in the network.
  • Hidden states may be expected to evolve from sparse data to much denser vector representations as the iterations of the message-passing procedure are executed.
  • The number of iterations can either be predefined or depend on some parameter (e.g., a convergence criterion over the hidden states).
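  • For illustration, such an iterative exchange with a convergence criterion over the hidden states could be organized as sketched below; the toy update function, the tolerance and the iteration cap are placeholders chosen for this example, not values from the disclosure.

```python
# Minimal sketch of the iterative message exchange with a convergence
# criterion over the hidden states (illustrative assumption, not the
# disclosed implementation).
import numpy as np

def toy_update(h_self, h_neighbors):
    # Placeholder update: average the node's state with its neighbors' mean.
    return 0.5 * h_self + 0.5 * np.mean(h_neighbors, axis=0)

def run_message_passing(states, neighbors, update_fn=toy_update,
                        max_iters=8, tol=1e-3):
    """states: {node: hidden-state vector}; neighbors: {node: [node, ...]}."""
    for _ in range(max_iters):
        new_states = {n: update_fn(states[n], [states[m] for m in neighbors[n]])
                      for n in states}
        delta = max(np.linalg.norm(new_states[n] - states[n]) for n in states)
        states = new_states
        if delta < tol:          # hidden states have stabilized
            break
    return states

nodes = {"a": np.array([1.0, 0.0]), "b": np.array([0.0, 1.0]), "c": np.array([0.5, 0.5])}
adjacency = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}
final_states = run_message_passing(nodes, adjacency)
```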
  • each agent 110a-g individually evaluates its final hidden representation using the control policy module 123 (see 509), which may be implemented as another neural network.
  • This module provides the agent 110a-g with the output routing policy. Based on this, the agent 110a-g decides what action to take, i.e. how to modify its configuration parameters. For instance, the action might involve a change in the next hops of a Forwarding Information Base (FIB) in a router, or varying a link weight that is later used to compute the routing configuration via a weighted shortest path algorithm (e.g., Dijkstra’s algorithm).
  • the distributed networking function performed by the plurality of network nodes 110a-g may be the fulfillment of different QoS requirements.
  • different users of the communication network 100 may have different QoS requirements.
  • these different QoS requirements may be expressed as a given maximum delay, jitter and guaranteed bandwidth, or in the form of a deterministic QoS with on-time and in-time delivery.
  • the granularity of the QoS demands can be defined as packets with the same destination, with the same source-destination, or with the same 5-tuple (flows) .
  • each network node, i.e. router 110a-g, may be equipped with one agent.
  • User-generated packets may be tagged with a label, for instance in the Traffic Class field of the IPv6 header, or the Type of Service field in IPv4. This label may be statically associated with a given QoS demand.
  • the agents, i.e. the network nodes 110a-g, can read such fields and are configured with the mapping between the QoS label and the QoS requirements (e.g., bounded delay). Then, based on a set of input features such as local link utilization, the effective delay and jitter of the packets per flow (as defined before) and the information learned via the hidden state of the GNN, the agent of each network node 110a-g makes local decisions to fulfill the QoS demands, i.e. to optimize the performance of the distributed networking function. Such local decisions can have the form of routing (configuring the weights of the local links) or changing the local configuration of the queue scheduling policy.
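  • For illustration, the following sketch shows one possible label-to-requirement mapping and a simple local decision; the label values, the delay bounds and the weight-adjustment heuristic are assumptions made for this example, not part of the disclosure.

```python
# Minimal sketch of a QoS label-to-requirement mapping and one possible local
# decision (illustrative values and heuristic, not the disclosed behavior).
QOS_PROFILES = {
    46: {"max_delay_ms": 10.0},    # e.g. a low-latency traffic class
    0:  {"max_delay_ms": 200.0},   # e.g. best-effort traffic
}

def local_decision(label, measured_delay_ms, link_utilization, link_weights):
    """If the measured per-flow delay violates the bound for this label,
    raise the weight of the locally most utilized link so that weighted
    shortest-path routing shifts traffic away from it."""
    bound = QOS_PROFILES.get(label, {"max_delay_ms": float("inf")})["max_delay_ms"]
    if measured_delay_ms > bound:
        busiest = max(link_utilization, key=link_utilization.get)
        link_weights[busiest] = link_weights[busiest] + 1.0
    return link_weights

weights = local_decision(label=46, measured_delay_ms=14.2,
                         link_utilization={"l1": 0.9, "l2": 0.3},
                         link_weights={"l1": 1.0, "l2": 1.0})
# weights == {"l1": 2.0, "l2": 1.0}
```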
  • embodiments disclosed herein can operate dynamically based on the traffic and make an efficient use of the resources, adjusting the network configuration based on the current load of users.
  • embodiments disclosed herein may be applied to other distributed networking functions, such as Service Function Chaining routing or Network Function Virtualization. In each case, embodiments disclosed herein may decide the flow of the packets while achieving a certain network goal, such as load-balancing the utilization of the services, meeting QoS, and the like.
  • Figure 6 shows a flow diagram illustrating a method 600 for performing a distributed networking function by the network node 110a in the communication network 100.
  • the method 600 comprises a first step 601 of receiving a message 150 from each of the one or more neighboring network nodes 110b-d, wherein the message 150 comprises information about a hidden state of the respective neighboring network node 110b-d.
  • the method 600 comprises a second step 603 of generating an updated hidden state of the network node 110a, based on a hidden state of the network node 110a and the information about the one or more hidden states of the one or more neighboring network nodes.
  • the method 600 further comprises a step 605 of selecting, based on the updated hidden state of the network node, one or more actions for increasing a performance measure of the distributed networking function in the communication network.
  • the disclosed system, apparatus, and method may be implemented in other manners.
  • the described embodiment of an apparatus is merely exemplary.
  • the unit division is merely logical function division and may be another division in an actual implementation.
  • a plurality of units or components may be combined or integrated into another system, or some features may be ignored or not performed.
  • the displayed or discussed mutual couplings or direct couplings or communication connections may be implemented by using some interfaces.
  • the indirect couplings or communication connections between the apparatuses or units may be implemented in electronic, mechanical, or other forms.
  • the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • functional units in the embodiments of the invention may be integrated into one processing unit, or each of the units may exist alone physically, or two or more units are integrated into one unit.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

A network node (110a) for performing a distributed networking function in a communication network (100) is provided. The network node is connected via communication links to neighboring network nodes (110b-d). The network node comprises a message aggregation module (121a) and a hidden state update module (121b). The message aggregation module is configured to receive a message from each neighboring network node comprising information about a hidden state of the neighboring network node. The hidden state update module is configured to generate an updated hidden state of the network node based on a hidden state of the network node and the information about the hidden states of the neighboring network nodes. The network node further comprises a control policy module (123) configured to select, based on the updated hidden state of the network node, an action for optimizing the distributed networking function.

Description

Devices and methods for autonomous distributed control of computer networks
TECHNICAL FIELD
The present disclosure relates to communication networks in general. More specifically, the present disclosure relates to devices and methods for autonomous distributed control of computer networks.
BACKGROUND
Network optimization techniques are generally used to improve network performance, so as to provide network services to network users with improved quality of service and quality of user experience, and also satisfy other service requirements. As networks increasingly expand and become more complex, network optimization is facing challenges to meet stricter requirements, such as lower optimization cost, shorter optimization time, higher optimization accuracy, and the like.
Artificial intelligence provides techniques that use an electronic machine to mimic human intelligence. Artificial intelligence techniques aim to solve many problems, such as reasoning, planning, learning, natural language processing, perception, moving and manipulating objects to name a few. Artificial intelligence techniques have already been used in various applications, such as autonomous vehicles, medical diagnosis, playing games (such as Chess) , search engines (such as Google search) , online assistants (such as Siri) , and image recognition, among many others. Artificial intelligence techniques have also been used in the field of telecommunications, e.g., for improving telecommunications services and products.
Some recent works pursue automated network control and optimization with the goal of achieving self-driving networks, in part motivated by the increasing interest of the networking community to leverage different machine learning techniques, such as Reinforcement Learning (RL) . However, most of the state-of-the-art AI-based approaches are fully centralized solutions, which contrast with the distributed operation that most of the networks implement nowadays.
SUMMARY
It is an objective of the present disclosure to provide improved devices and methods for autonomous distributed control of computer networks.
The foregoing and other objectives are achieved by the subject matter of the independent claims. Further implementation forms are apparent from the dependent claims, the description and the figures.
According to a first aspect, a network node for performing, i.e. supporting a distributed networking function in a communication network, is provided. The network node is connected via one or more communication links to one or more neighboring network nodes of a plurality of network nodes of the communication network. The distributed networking function may be a distributed routing function for routing data within the communication network.
The network node comprises a message processing module, including a message aggregation module and a hidden state update module. The message aggregation module is configured to receive a message from each of the one or more neighboring network nodes, wherein the message comprises information about the current hidden state of the respective neighboring network node. The hidden state update module is configured to generate an updated hidden state of the network node based on the current hidden state of the network node and the one or more current hidden states of the one or more neighboring network nodes. The message processing module may implement an iterative update process, including a plurality of message exchange stages and a corresponding plurality of stages of updating the hidden state of the network node.
The network node further comprises a control policy module configured to select, based on the updated hidden state of the network node and the updated hidden states of the one or more neighboring network nodes, one or more actions for increasing, i.e. optimizing a performance measure of the distributed networking function in the communication network.
Thus, advantageously, the network node according to the first aspect as part of a plurality of like network nodes of the communication network may learn a communication protocol configured to optimize a distributed networking function in a fully automated and decentralized manner, meaning that the network node only exchanges information with other nodes in its neighbourhood. Advantageously, this results in very good scalability of the optimization of the distributed networking function provided by the plurality of network nodes to larger networks. Moreover, the optimization of the distributed networking function provided by the plurality of network nodes may be easily adapted to other networks not seen during a training phase. More specifically, the optimization of the distributed networking function provided by the plurality of network nodes may be trained in controlled testbeds (e.g., in a networking lab) and then be deployed in real-world networks.
In a further possible implementation form of the first aspect, the message processing module is configured to iteratively generate the updated hidden state of the network node based on the current hidden state of the network node and the one or more current hidden states of the one or more neighboring network nodes for a predefined number of iterations or until a stop criterion has been reached.
In a further possible implementation form of the first aspect, the message processing module comprises a graph neural network, GNN, in particular a message passing neural network, MPNN, configured to generate the updated hidden state of the network node based on the current hidden state of the network node and the one or more current hidden states of the one or more neighboring network nodes.
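For illustration, the following minimal sketch shows what one such message-passing round could look like for a single node, with small single-layer tanh networks standing in for the message aggregation and hidden state update modules; the hidden-state dimension, the weight matrices, the activation and the element-wise sum aggregation are assumptions made for this example and are not prescribed by the present disclosure. In a deployment, the same round would be executed by every node and repeated over several iterations.

```python
# Minimal sketch of one message-passing round for a single network node
# (illustrative module sizes and activations).
import numpy as np

HIDDEN = 8
rng = np.random.default_rng(0)
# Shared (e.g. jointly trained) parameters: every node would use the same weights.
W_MSG = rng.normal(scale=0.1, size=(2 * HIDDEN, HIDDEN))
W_UPD = rng.normal(scale=0.1, size=(2 * HIDDEN, HIDDEN))

def message_passing_round(h_self, neighbor_states):
    """Combine each neighbor's hidden state with the node's own state,
    aggregate the results, and produce the updated hidden state."""
    messages = [np.tanh(np.concatenate([h_self, h_n]) @ W_MSG)
                for h_n in neighbor_states]
    aggregated = np.sum(messages, axis=0)            # element-wise sum
    return np.tanh(np.concatenate([h_self, aggregated]) @ W_UPD)

h_node = np.zeros(HIDDEN)                            # e.g. an initial hidden state
h_neighbors = [rng.normal(size=HIDDEN) for _ in range(3)]
h_node = message_passing_round(h_node, h_neighbors)  # one update; can be iterated
```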
In a further possible implementation form of the first aspect, the message processing module is configured to initialize the hidden state of the network node based on one or more input features of the graph neural network.
In a further possible implementation form of the first aspect, the one or more input features comprise at least one of: one or more adjustable network node parameters relating to a global network routing strategy, in particular information about the next forwarding hops or link weights; traffic measurement information, in particular link utilization information; and graph-based measurement information, in particular betweenness centrality information.
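As an illustration of how such input features could be assembled, the following sketch computes, for each node of a small example graph, a local link weight, a link utilization value and the node's betweenness centrality; the use of the networkx library, the topology and the numeric values are assumptions made for this example.

```python
# Minimal sketch of assembling per-node input features: an adjustable link
# weight, a traffic measurement and a graph-based measurement (illustrative).
import networkx as nx

graph = nx.Graph()
graph.add_weighted_edges_from([("a", "b", 1.0), ("b", "c", 2.0), ("a", "c", 4.0)])
utilization = {"a": 0.30, "b": 0.75, "c": 0.40}          # e.g. local link utilization

centrality = nx.betweenness_centrality(graph)
features = {
    node: [
        min(attrs["weight"] for attrs in graph[node].values()),  # adjustable link weight
        utilization[node],                                       # traffic measurement
        centrality[node],                                        # graph-based measurement
    ]
    for node in graph.nodes
}
```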
In a further possible implementation form of the first aspect, each graph node of the GNN represents one of the plurality of network nodes and each graph edge of the GNN represents a respective communication link between two respective network nodes of the plurality of network nodes.
In a further possible implementation form of the first aspect, the message processing module is configured to represent the current hidden state of the network node as a hidden state vector.
In a further possible implementation form of the first aspect, the hidden state vector comprises a first component indicative of a weight of a respective communication link and/or a second component indicative of a utilization measure of a respective communication link.
In a further possible implementation form of the first aspect, one or more further components of the hidden state vector are zero-padded.
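A minimal sketch of such an initialization could look as follows; the hidden-state dimension of 8 and the ordering of the components (link weight first, link utilization second, zeros afterwards) are assumptions made for this example.

```python
# Minimal sketch of hidden-state initialization with zero-padding
# (illustrative dimension and feature order).
import numpy as np

HIDDEN_DIM = 8

def init_hidden_state(link_weight, link_utilization, dim=HIDDEN_DIM):
    features = np.array([link_weight, link_utilization], dtype=float)
    return np.concatenate([features, np.zeros(dim - features.size)])

h0 = init_hidden_state(link_weight=3.0, link_utilization=0.42)
# h0 == [3.0, 0.42, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
```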
In a further possible implementation form of the first aspect, the GNN is defined by one or more adjustable GNN parameters, in particular GNN weights, wherein the values of the one or more adjustable GNN parameters, in particular GNN weights, are based on a training stage of the GNN, wherein in the training stage the plurality of network nodes have defined a first network architecture and at least one second network architecture different from the first network architecture.
In a further possible implementation form of the first aspect, the control policy module comprises a multi-agent reinforcement learning, MARL, neural network configured to select, based on the updated hidden state of the network node and the updated hidden states of the one or more neighboring network nodes, the one or more actions for increasing, i.e. optimizing a performance measure of the global function in the communication network.
In a further possible implementation form of the first aspect, the distributed networking function is a distributed routing function within the communication network, wherein a reward defined by the MARL neural network is a change of the maximum communication link load.
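For illustration, such a reward could be computed as sketched below, i.e. as the change of the maximum link utilization caused by an action; the utilization values and the sign convention (positive when the maximum load decreases) are assumptions made for this example.

```python
# Minimal sketch of a load-balancing reward: the change in the utilization of
# the most loaded link before and after an action (illustrative values).
def load_balancing_reward(util_before, util_after):
    return max(util_before.values()) - max(util_after.values())

before = {"a-b": 0.92, "b-c": 0.40, "a-c": 0.35}
after = {"a-b": 0.71, "b-c": 0.55, "a-c": 0.41}
reward = load_balancing_reward(before, after)  # ~0.21, i.e. the maximum load dropped
```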
In a further possible implementation form of the first aspect, the control policy module is configured to generate a destination-based routing table and to route data within the communication network on the basis of the destination-based routing table.
In a further possible implementation form of the first aspect, the control policy module is configured to generate the destination-based routing table by determining a weighted shortest path to a destination network node of the plurality of network nodes based on a plurality of weights of the plurality of communication links defining the weighted shortest path to the destination network node.
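For illustration, the following sketch derives a destination-based next-hop table from per-link weights using Dijkstra's algorithm; the topology and the weights are made up, and the helper function is a hypothetical stand-in for the control policy module's table generation.

```python
# Minimal sketch of building a destination-based next-hop table from link
# weights with Dijkstra's algorithm (illustrative topology and weights).
import heapq

def next_hop_table(graph, source):
    """graph: {node: {neighbor: link_weight}}. Returns {destination: next_hop}."""
    dist = {source: 0.0}
    first_hop = {}
    queue = [(0.0, source, None)]
    while queue:
        d, node, hop = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue                       # stale queue entry
        if hop is not None:
            first_hop[node] = hop
        for neighbor, weight in graph[node].items():
            nd = d + weight
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                # The next hop is inherited from the first edge of the path.
                heapq.heappush(queue, (nd, neighbor, hop if hop else neighbor))
    return first_hop

links = {"a": {"b": 1.0, "c": 4.0}, "b": {"a": 1.0, "c": 1.0},
         "c": {"a": 4.0, "b": 1.0}}
table = next_hop_table(links, "a")  # {'b': 'b', 'c': 'b'}
```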
In a further possible implementation form of the first aspect, the MARL neural network is defined by one or more adjustable MARL neural network parameters, in particular MARL neural network weights, wherein the values of the one or more adjustable MARL neural network parameters, in particular MARL neural network weights, are based on a training stage of the MARL neural network, wherein in the training stage the plurality of network nodes have defined a first network architecture and at least one second network architecture different from the first network architecture.
In a further possible implementation form of the first aspect, the network node is a router or a switch.
Traditional distributed network protocols not only define the exact way network devices communicate with each other, but also explicitly describe the information they exchange at any moment. Embodiments disclosed herein address the design of an automated distributed protocol in which the network itself learns how to achieve a predefined goal, e.g. routing optimization, without feeding the system with any previous knowledge about the information that should be exchanged.
In contrast to existing AI solutions, embodiments disclosed herein provide a fully decentralized system where a set of agents are executed in network nodes and coordinate with each other to optimize a global goal. This can be especially interesting from an industry perspective, as the system can be implemented into separate AI-enabled network devices that run agents in a distributed way. Such a distributed system leverages MARL to achieve effective cooperation between agents towards the desired goal. Moreover, thanks to the GNN, embodiments disclosed herein also offer generalization capabilities, i.e. the ability to respond successfully to other network scenarios not seen during training. In this context, the assembly of network nodes 110a-g can be trained in a controlled network environment (e.g., at the vendor’s facilities) and then be directly deployed in real networks.
According to a second aspect, an assembly of network nodes comprising a plurality of network nodes according to the first aspect is provided.
According to a third aspect, a method for performing, i.e. supporting a distributed networking function by a network node in a communication network, is provided. The network node is connected via one or more communication links to one or more neighboring network nodes of a plurality of network nodes of the communication network. The method comprises the steps of: receiving a message from each of the one or more neighboring network nodes, wherein the message comprises information about the current hidden state of the respective neighboring network node; generating an updated hidden state of the network node based on the current hidden state of the network node and the one or more current hidden states of the one or more neighboring network nodes; and selecting, based on the updated hidden state of the network node and the updated hidden states of the one or more neighboring network nodes, one or more actions for increasing, i.e. optimizing a performance measure of the distributed networking function in the communication network. The message receiving step and the updating step may be repeated for iteratively updating the hidden state of the network node.
According to a fourth aspect, a computer program product is provided comprising a computer-readable storage medium for storing program code which causes a computer or a processor to perform the method according to the third aspect when the program code is executed by the computer or the processor.
Details of one or more embodiments are set forth in the accompanying drawings and the description below. Other features, objects, and advantages will be apparent from the description, drawings, and claims.
BRIEF DESCRIPTION OF THE DRAWINGS
In the following, embodiments of the present disclosure are described in more detail with reference to the attached figures and drawings, in which:
Fig. 1 shows a schematic diagram illustrating a communication network including a plurality of network nodes according to an embodiment;
Fig. 2 shows a schematic diagram illustrating in more detail one network node according to an embodiment;
Fig. 3 shows a schematic diagram illustrating a communication network including a plurality of network nodes and a message exchanged between the plurality of network nodes according to an embodiment;
Fig. 4 shows a schematic diagram illustrating a training stage and an application stage of the plurality of network nodes according to an embodiment;
Fig. 5 shows in more detail processing steps implemented at a network node according to an embodiment; and
Fig. 6 shows a flow diagram illustrating a method for performing a distributed networking function according to an embodiment.
In the following, identical reference signs refer to identical or at least functionally equivalent features.
DETAILED DESCRIPTION OF THE EMBODIMENTS
In the following description, reference is made to the accompanying figures, which form part of the disclosure, and which show, by way of illustration, specific aspects of embodiments of the present disclosure or specific aspects in which embodiments of the present disclosure may be used. It is understood that embodiments of the present disclosure may be used in other aspects and comprise structural or logical changes not depicted in the figures. The following detailed description, therefore, is not to be taken in a limiting sense, and the scope of the present disclosure is defined by the appended claims.
For instance, it is to be understood that a disclosure in connection with a described method may also hold true for a corresponding device or system configured to perform the method and vice versa. For example, if one or a plurality of specific method steps are described, a corresponding device may include one or a plurality of units, e.g. functional units, to perform the described one or plurality of method steps (e.g. one unit performing the one or plurality of steps, or a plurality of units each performing one or more of the plurality of steps) , even if such one or more units are not explicitly described or illustrated  in the figures. On the other hand, for example, if a specific apparatus is described based on one or a plurality of units, e.g. functional units, a corresponding method may include one step to perform the functionality of the one or plurality of units (e.g. one step performing the functionality of the one or plurality of units, or a plurality of steps each performing the functionality of one or more of the plurality of units) , even if such one or plurality of steps are not explicitly described or illustrated in the figures. Further, it is understood that the features of the various exemplary embodiments and/or aspects described herein may be combined with each other, unless specifically noted otherwise.
Figure 1 shows a schematic diagram illustrating a communication network 100 including a plurality, i.e. an assembly, of network nodes 110a-g according to an embodiment. As illustrated in figure 1, the plurality of network nodes 110a-g are configured to communicate with each other using messages 150 via a plurality of communication links, such as the communication link 111 between the network node 110a and its neighboring network node 110b. In the following, the network node 110a will be described in more detail with the understanding that the exemplary network node 110a is a representative example for the plurality of network nodes and that the other network nodes 110b-g may have the same features as the exemplary network node 110a.
As will be described in more detail below, the network node 110a is configured to support, i.e. perform a distributed networking function in the communication network 100. In an embodiment, the distributed networking function may be a distributed routing function for routing data within the communication network 100. Thus, in an embodiment, the plurality of network nodes 110a-g may comprise a plurality of network routers or switches 110a-g. In an embodiment, the communication network 100 may be an IP-based communication network 100.
As illustrated in figure 1, the network node 110a may comprise a processing circuitry 120, including, for instance, one or more processors 120 for processing data and implementing an agent on the network node 110a, a communication interface 121 for exchanging data via the plurality of communication links with the neighboring network nodes 110b-d, and a non-volatile memory 140. The processing circuitry 120 of the exemplary network node 110a may be implemented in hardware and/or software. The hardware may comprise digital circuitry, or both analog and digital circuitry. Digital circuitry may comprise components such as application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), digital signal processors (DSPs), artificial intelligence (AI) processors or general-purpose processors. The non-volatile memory 140 may be configured to store data and executable program code which, when executed by the processing circuitry 120, causes the exemplary network node 110a to perform the functions, operations and methods described in the following.
Further referring to figure 2, the network node 110a comprises a message processing module 121, including a message aggregation module 121a and a hidden state update module 121b. In an embodiment, the message processing module 121, including the message aggregation module 121a and the hidden state update module 121b, may be implemented by the processing circuitry 120 of the network node 110a.
As will be described in more detail below, the message aggregation module 121a of the network node 110a is configured to receive a message 150 from each of its neighboring network nodes, which in the example shown in figure 2 are the neighboring network nodes 110b-e. The respective message 150 received from the respective neighboring network node 110b-e comprises information about the current hidden state of the respective neighboring network node 110b-e. The hidden state update module 121b of the network node 110a is configured to generate an updated hidden state of the network node 110a based on the current hidden state of the network node 110a and the one or more current hidden states of the neighboring network nodes 110b-e.
As illustrated in figure 2, the network node 110a further comprises a control policy module 123 (which may be implemented by the processing circuitry 120) configured to select, based on the updated hidden state of the network node 110a and the updated hidden states of the neighboring network nodes 110b-e, one or more actions for increasing, i.e. optimizing a performance measure of the distributed networking function in the communication network.
As already described above, the processing circuitry 120 of the network node 110a may implement one or more agents, which, in turn, implement, for instance, the message processing module 121 and the control policy module 123. By exchanging messages with the agents of the neighboring network nodes 110b-e, the agents, i.e. the network nodes 110a-g, learn what information to exchange to achieve a global goal. In order to achieve the control and optimization of a distributed networking function in the computer network 100, the plurality of network nodes 110a-g, in an embodiment, may implement a multi-agent reinforcement learning (MARL) system. To this end, in an embodiment, each network node 110a-g may implement one or more RL agents. In an embodiment, agents may correspond to different types of network elements, from physical ones (e.g., the network nodes 110a-g or links) to more abstract ones (e.g., end-to-end paths). Thus, each agent may be configured to control the configuration and local state of the entity associated with it.
Within the MARL system implemented by the plurality of network nodes 110a-g, each agent, i.e. network node 110a-g, has some internal attributes (also referred to as its hidden state). For processing and modeling information about the network state and for enabling the propagation of local information over the whole communication network 100, the plurality of network nodes 110a-g may implement a graph neural network, GNN, in the form of a Message-Passing Graph Neural Network (MPNN). This graph-based AI technique, embedded into the MARL system, allows message communications between the agents, i.e. the network nodes 110a-g, so that each network node 110a-g can modify its internal attributes (i.e. its hidden state) depending on the information provided by the messages received from the neighboring network nodes. Furthermore, there may be a direct mapping between the MPNN execution and the actual computer network where it operates. For instance, nodes/links of the network 100 can be represented as node/link entities in the GNN, and the information exchanged internally by these GNN entities is ultimately carried by messages transmitted over the network infrastructure. Further details about the Message-Passing Graph Neural Network implemented by embodiments disclosed herein may be found in J. Gilmer, S.S. Schoenholz, P.F. Riley, O. Vinyals, and G.E. Dahl, “Neural message passing for quantum chemistry,” in Proceedings of the International Conference on Machine Learning (ICML), Volume 70, pp. 1263–1272, 2017.
By running the MARL system and providing proper rewards for the actions that the agents, i.e. the network nodes 110a-g, take, each agent ends up learning what information is relevant to exchange with its neighbors, and also discovers how to adapt its internal attributes (i.e. its hidden state) to optimize the global objective, i.e. to increase a performance measure of a distributed networking function in the communication network 100, such as minimizing the average delay. As already described above, a key aspect of this process is parameter sharing. During a training phase, all agents, i.e. network nodes 110a-g, may have the same internal parameters and optimize them jointly. In an embodiment, the agents, i.e. network nodes 110a-g, may share the GNN parameters (i.e. the parameters of the neural networks implemented within the MPNN). After the training phase, the plurality of trained agents, i.e. network nodes 110a-g, can be replicated and distributed across multiple devices of any other network.
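The following is a minimal sketch, not the patented implementation, of the parameter-sharing idea described above: every agent holds a reference to one shared parameter set, so all agents apply the same weights to their own local inputs, and the trained weights can later be copied to the devices of a different network. All names, dimensions and the simple tanh update are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
HIDDEN_DIM = 16  # assumed hidden state size

# One shared parameter set (e.g., the MPNN weights), referenced by every agent.
shared_params = {"W_update": rng.normal(scale=0.1, size=(HIDDEN_DIM, HIDDEN_DIM))}

class Agent:
    """Agent on one router: it keeps only local state and a reference to the shared weights."""
    def __init__(self, node_id, params):
        self.node_id = node_id
        self.params = params                      # shared reference, not a copy
        self.hidden = np.zeros(HIDDEN_DIM)

    def update_hidden(self, aggregated_messages):
        # Every agent applies exactly the same weights to its own local inputs.
        return np.tanh(aggregated_messages @ self.params["W_update"])

# During training all routers use the very same parameter dictionary ...
agents = {node: Agent(node, shared_params) for node in ["r1", "r2", "r3", "r4"]}
agents["r1"].hidden = agents["r1"].update_hidden(rng.normal(size=HIDDEN_DIM))

# ... and after training the learned parameters can simply be copied to the
# devices of another network, without retraining.
deployed_params = {k: v.copy() for k, v in shared_params.items()}
```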
Although in the beginning of the training phase all agents, i.e. network nodes 110a-g may have the same internal parameters, each agent 110a-g modifies its internal attributes and adapts them based on its local observations. More specifically, as already described above, these attributes may be included in the hidden states exchanged by the GNN, and each agent, i.e. network node 110a-g shares these hidden state representations with its neighbors via the messages 150 over the network 100. In an embodiment, hidden states may be simultaneously optimized for both encoding meaningful information within the messages 150 between the agents and determining the best actions to take.
Thus, embedding the MPNN into a MARL system, as employed by embodiments disclosed herein, provides a distributed solution that facilitates the cooperation between multiple agents that communicate via messages 150. As already described above, the assembly of trained agents, i.e. network nodes 110a-g, can be successfully deployed in any network architecture without the necessity of retraining.
Further referring to figure 2, in a first stage, the message aggregation module 121a of the exemplary network node 110a receives the information from its neighboring network nodes 110b-e. As already described above, this can be implemented by a GNN-based, in particular an MPNN-based, module that is responsible for processing this data, in particular for choosing what relevant information of the neighbors’ hidden states should be considered. Then, the output of the message aggregation module 121a is used by the hidden state update module 121b. As already described above, the hidden state update module 121b may be based on an NN as well and is configured to combine the selected information from the previous step with the internal hidden state of the network node 110a, thereby generating an updated hidden representation for that network node 110a. As already described above, the message aggregation module 121a and the hidden state update module 121b are part of the message processing module 121, which may be identified with the operation of the MPNN. For some variants of the MPNN, the functions executed by these modules are not necessarily modeled by an NN. Once the hidden state of the network node 110a has been updated, it is transmitted back to its neighboring network nodes 110b-e. Then, the neighboring network nodes 110b-e perform exactly the same process, and the message processing module 121 is executed for several iterations so that the agents, i.e. network nodes 110a-g, move from a local to a global context awareness. In a final stage, after the iterative execution of the message processing module 121, the control policy module 123 is in charge of choosing an action to pursue the global network goal based on the final hidden states generated by the agents, i.e. the network nodes 110a-g. In an embodiment, the control policy module 123 may be implemented as an NN that produces the output policy.
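A hedged sketch of how this message exchange could run across the whole topology is given below: in each iteration every node collects the current hidden states of its neighbors, processes them, and replaces its own hidden state, so that after k iterations a node has seen information from nodes k hops away. The topology, the number of iterations and the `process_messages` placeholder (standing in for the message processing module 121) are assumptions, not the patented implementation.

```python
import numpy as np
import networkx as nx

H = 8  # assumed hidden state size
topology = nx.Graph([("a", "b"), ("a", "c"), ("a", "d"), ("a", "e"), ("b", "f")])
hidden = {n: np.zeros(H) for n in topology.nodes}

def process_messages(own_state, neighbor_states):
    # Placeholder for aggregation + update; a simple mean keeps the sketch runnable.
    return np.mean([own_state] + neighbor_states, axis=0)

NUM_ITERATIONS = 3  # after k iterations a node has "seen" nodes up to k hops away
for _ in range(NUM_ITERATIONS):
    inbox = {n: [hidden[nb] for nb in topology.neighbors(n)] for n in topology.nodes}
    hidden = {n: process_messages(hidden[n], inbox[n]) for n in topology.nodes}

# Each node would now feed hidden[n] into its control policy module 123.
```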
In an embodiment, the distributed networking function performed by the plurality of network nodes 110a-g is a distributed routing function within the communication network 100, wherein a reward defined by the MARL neural network is a change of the maximum communication link load (i.e. load balancing). In other words, the goal of the network 100 is to minimize the utilization of the most loaded link, which is a common concern in carrier-grade networks, i.e. geographically distributed networks with hundreds of nodes, such as those of Internet Service Providers (ISPs).
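A minimal sketch of such a load-balancing reward, under the assumption that link loads are utilization ratios in [0, 1], could compute the reward as the change of the maximum link load, so that reducing the most loaded link yields a positive reward. The link names and numbers are purely illustrative.

```python
def max_link_load(link_utilization):
    """Utilization of the most loaded link in the network."""
    return max(link_utilization.values())

def reward(prev_utilization, new_utilization):
    # Positive when the maximum link load decreases after the agents' actions.
    return max_link_load(prev_utilization) - max_link_load(new_utilization)

before = {("a", "b"): 0.92, ("b", "c"): 0.40, ("a", "c"): 0.35}
after  = {("a", "b"): 0.70, ("b", "c"): 0.55, ("a", "c"): 0.42}
print(reward(before, after))   # 0.22 -> the most loaded link became less utilized
```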
In this embodiment, each network node 110a-g may be implemented as a network router 110a-g, wherein each router 110a-g is equipped with an autonomous agent. In an embodiment, the agents of the routers 110a-g receive as inputs the hidden states from the neighboring agents of the GNN that interconnects all the agents 110a-g. With this information the agents change the local configuration (e.g. the routing weights) of each router 110a-g, for instance by dynamically adjusting the weights (and thus the flow of the paths) to achieve the network goal. In an embodiment, additional configurations may be possible, for instance dynamically adjusting MPLS paths, ECMP weights, etc. One main advantage provided by embodiments disclosed herein in this scenario is that, while conventional solutions assume that the traffic matrix (bandwidth per source-destination pair) is static, the autonomous agents 110a-g can adjust the configuration of the network 100 in the presence of dynamic traffic.
Figure 3 shows the communication network 100 including the plurality of network nodes 110a-g and a message 150 exchanged between the plurality of network nodes 110a-g according to an embodiment. As already described above, each node 110a-g of the network 100 may assemble one or several AI-based modules/agents that, at execution time, iteratively define (and process) the message communications, such as the message 150, with the direct neighboring nodes, i.e. agents.
Figure 4 shows schematically a training stage and an application stage of the plurality of network nodes 110a-g according to an embodiment. As illustrated in figure 4, using an exhaustive training process, i.e. learning from a variety of different network scenarios (architectures), the plurality of network nodes 110a-g may be trained in a controlled network environment (e.g., at the vendor’s facilities) and then be directly deployed in different real networks without any retraining required.
Further details of embodiments for the load balancing scenario described above will be described in the following under further reference to figure 5. Each agent, i.e. router 110a-g, computes, collects, or is provided with, some input features. These input features may include, but are not limited to: internal device attributes that relate to the global network routing strategy and that can be modified by the device’s own actions in order to optimize the routing (such as next forwarding hops or link weights); traffic measurements (such as link utilization); and/or generic graph-based measurements (such as the betweenness centrality).
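As an illustration only, such per-node input features could be assembled as in the sketch below. The exact feature set is implementation-specific; link weights, link utilization and betweenness centrality are used here purely as examples, and the topology and utilization values are invented.

```python
import networkx as nx

topology = nx.Graph()
topology.add_weighted_edges_from([("a", "b", 1.0), ("b", "c", 2.0), ("a", "c", 1.0)])
link_utilization = {("a", "b"): 0.6, ("b", "c"): 0.2, ("a", "c"): 0.4}
centrality = nx.betweenness_centrality(topology)   # generic graph-based measurement

def node_input_features(node):
    """Collect the example features for one router: link weights, utilizations, centrality."""
    incident = list(topology.edges(node))
    return {
        "link_weights": [topology.edges[e]["weight"] for e in incident],
        "link_utilization": [
            link_utilization.get(e, link_utilization.get((e[1], e[0]), 0.0))
            for e in incident
        ],
        "betweenness": centrality[node],
    }

print(node_input_features("a"))
```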
All the agents, i.e. routers 110a-g, initialize their hidden states based on their corresponding input features (see 501 of figure 5). Thus, input features are encoded into hidden states, which may be fixed-size vectors. In an embodiment, this initialization may be performed by inserting the input features into the hidden state vector and zero-padding the remaining components to fit the dimension of the hidden state vector.
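A minimal sketch of this zero-padded initialization, assuming a fixed hidden state dimension, could look as follows; the dimension and the example feature values are assumptions.

```python
import numpy as np

HIDDEN_DIM = 16   # assumed fixed hidden state size

def init_hidden_state(input_features, hidden_dim=HIDDEN_DIM):
    """Place the input features at the start of the vector and zero-pad the rest."""
    features = np.asarray(input_features, dtype=float)
    if features.size > hidden_dim:
        raise ValueError("hidden state too small for the input features")
    return np.concatenate([features, np.zeros(hidden_dim - features.size)])

h0 = init_hidden_state([1.0, 0.6, 0.25])   # e.g., link weight, utilization, centrality
```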
Then, each agent, i.e. router 110a-g, sends the same message, which contains its current hidden representation, to all its neighboring agents (see 503 of figure 5). Consequently, each agent receives the hidden states of all its neighboring agents.
After receiving these messages, every agent, i.e. router 110a-g, processes them in the following way. Firstly, each neighbor hidden state is combined with the agent’s own representation by means of the message aggregation module 121a (see 505), which can be implemented as a neural network. Secondly, the resulting combined representations of the previous step may be aggregated using a predefined aggregation function, such as an element-wise sum of the hidden state vectors. Finally, the agent, i.e. router 110a-g, updates its own hidden state with the new aggregated information using the hidden state update module 121b (see 507), which may be implemented as a neural network as well. As a result, each agent generates an updated hidden representation, i.e. state, that potentially incorporates information from its neighborhood.
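The three steps above can be sketched for a single agent as follows; the simple tanh-activated linear maps stand in for the aggregation and update neural networks and are assumptions, as are the dimensions.

```python
import numpy as np

rng = np.random.default_rng(0)
H = 8
W_combine = rng.normal(scale=0.1, size=(2 * H, H))   # stand-in for message aggregation NN
W_update  = rng.normal(scale=0.1, size=(2 * H, H))   # stand-in for hidden state update NN

def aggregate(own, neighbor_states):
    # 1) combine each neighbor hidden state with the own representation ...
    combined = [np.tanh(np.concatenate([own, nb]) @ W_combine) for nb in neighbor_states]
    # 2) ... then aggregate with a predefined function, here an element-wise sum.
    return np.sum(combined, axis=0)

def update(own, aggregated):
    # 3) update the own hidden state with the aggregated information.
    return np.tanh(np.concatenate([own, aggregated]) @ W_update)

own = rng.normal(size=H)
neighbors = [rng.normal(size=H) for _ in range(4)]
new_hidden = update(own, aggregate(own, neighbors))
```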
As illustrated in figure 5, the message exchange and processing of steps 503, 505 and 507 may be repeated several times, each turn considering the updated agent hidden states. During these iterations there is a recognizable pattern of periodic message communications between neighboring nodes. Since each update adds some local topological awareness through the neighbor representations, this awareness expands over successive message passing iterations. At each turn, agents gain access to information of more distant nodes in the network. Moreover, hidden states may be expected to evolve from sparse data to much denser vector representations as the iterations of the message passing procedure are executed. The number of iterations can either be predefined or depend on some parameter (e.g., a convergence criterion over the hidden states).
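One possible form of such a convergence criterion, sketched below as an assumption, is to keep iterating while the hidden states still change noticeably, up to a maximum number of rounds. Here `message_passing_round` is a placeholder for one execution of the message processing module across all nodes, and the tolerance and iteration limit are illustrative.

```python
import numpy as np

def run_until_converged(hidden, message_passing_round, tol=1e-3, max_iters=10):
    """Iterate message passing until the largest per-node hidden state change drops below tol."""
    for _ in range(max_iters):
        new_hidden = message_passing_round(hidden)
        delta = max(np.linalg.norm(new_hidden[n] - hidden[n]) for n in hidden)
        hidden = new_hidden
        if delta < tol:        # hidden states have stabilized -> stop early
            break
    return hidden

# Toy usage: two nodes whose states are repeatedly averaged converge after two rounds.
hidden = {"a": np.zeros(4), "b": np.ones(4)}
hidden = run_until_converged(hidden, lambda h: {n: 0.5 * (h["a"] + h["b"]) for n in h})
```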
After the final iteration of the message passing stages 503, 505 and 507, each agent 110a-g individually evaluates its final hidden representation using the control policy module 123 (see 509), which may be implemented as another neural network. This module provides the agent 110a-g with the output routing policy. Based on this, the agent 110a-g decides what action to take, i.e. how to modify its configuration parameters. For instance, an action might involve changing the next hops of a Forwarding Information Base (FIB) in a router, or varying a link weight that is later used to compute the routing configuration via a weighted shortest path algorithm (e.g., Dijkstra’s algorithm).
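As a hedged example of the second kind of action, the sketch below raises the weight of one link and then recomputes the destination-based next hops with Dijkstra’s algorithm. The topology, the weight values and the chosen action are illustrative only.

```python
import networkx as nx

g = nx.Graph()
g.add_weighted_edges_from([("r1", "r2", 1.0), ("r2", "r3", 1.0), ("r1", "r3", 3.0)])

def next_hops(graph, source):
    """Destination -> next hop table derived from weighted shortest paths."""
    paths = nx.single_source_dijkstra_path(graph, source, weight="weight")
    return {dst: path[1] for dst, path in paths.items() if len(path) > 1}

print(next_hops(g, "r1"))                 # traffic to r3 goes via r2

g.edges["r1", "r2"]["weight"] = 10.0      # the agent's action: penalize a loaded link
print(next_hops(g, "r1"))                 # traffic to r3 now uses the direct link
```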
In a further embodiment, the distributed networking function performed by the plurality of network nodes 110a-g may be the fulfillment of different QoS requirements. For instance, different users of the communication network 100 may have different QoS requirements. In an embodiment, these different QoS requirements may be expressed as a given maximum delay, jitter and guaranteed bandwidth, or in the form of a deterministic QoS with on-time and in-time delivery. The granularity of the QoS demands can be defined as packets with the same destination, with the same source-destination pair, or with the same 5-tuple (flows). As already described above, each network node, i.e. router 110a-g, may be equipped with one agent. User-generated packets may be tagged with a label, for instance in the Traffic Class field of the IPv6 header, or the Type of Service field of the IPv4 header. This label may be statically associated with a given QoS demand. The agents, i.e. the network nodes 110a-g, can read such fields and are configured with the mapping between the QoS label and the QoS requirements (e.g., a bounded delay). Then, based on a set of input features such as the local link utilization, the effective delay and jitter of the packets per flow (as defined before) and the information learned via the hidden state of the GNN, the agent of each network node 110a-g makes local decisions to fulfill the QoS demands, i.e. to optimize the performance of the distributed networking function. Such local decisions can take the form of routing decisions (configuring the weights of the local links) or of changing the local configuration of the queue scheduling policy.
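A minimal sketch of such a static label-to-requirements mapping is given below. The label values (written as Traffic Class / Type of Service style codepoints) and the delay, jitter and bandwidth bounds are purely illustrative assumptions.

```python
# Static mapping between a packet's QoS label and the QoS requirements the agent
# has to fulfill; values are illustrative, not taken from the embodiment.
QOS_PROFILES = {
    0x2E: {"max_delay_ms": 10,   "max_jitter_ms": 2,    "min_bandwidth_mbps": 5},
    0x18: {"max_delay_ms": 50,   "max_jitter_ms": 10,   "min_bandwidth_mbps": 1},
    0x00: {"max_delay_ms": None, "max_jitter_ms": None, "min_bandwidth_mbps": None},
}

def qos_requirements(traffic_class):
    # Unknown labels fall back to best-effort treatment.
    return QOS_PROFILES.get(traffic_class, QOS_PROFILES[0x00])

print(qos_requirements(0x2E))   # e.g., a low-delay, low-jitter profile
```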
A common issue of existing QoS solutions is that they operate on the principle of overprovisioning, making an inefficient use of the resources. Advantageously, embodiments disclosed herein can operate dynamically based on the traffic and make an efficient use of the resources, adjusting the network configuration based on the current load of users.
As will be appreciated, in addition to the load balancing and QoS requirements examples described above, embodiments disclosed herein may be applied to other distributed networking functions, such as Service Function Chaining routing or Network Function Virtualization. In each case, embodiments disclosed herein may decide the flow of the packets while achieving a certain network goal, such as load-balancing the utilization of the services, fulfilling QoS requirements, and the like.
Figure 6 shows a flow diagram illustrating a method 600 for performing a distributed networking function by the network node 110a in the communication network 100. The method 600 comprises a first step 601 of receiving a message 150 from each of the one or more neighboring network nodes 110b-d, wherein the message 150 comprises information about a hidden state of the respective neighboring network node 110b-d. Moreover, the method 600 comprises a second step 603 of generating an updated hidden state of the network node 110a, based on a hidden state of the network node 110a and the information about the one or more hidden states of the one or more neighboring network nodes. The method 600 further comprises a step 605 of selecting, based on the updated hidden state of the network node, one or more actions for increasing a performance measure of the distributed networking function in the communication network.
The person skilled in the art will understand that the "blocks" ( "units" ) of the various figures (method and apparatus) represent or describe functionalities of embodiments of the present disclosure (rather than necessarily individual "units" in hardware or software) and thus describe equally functions or features of apparatus embodiments as well as method embodiments (unit = step) .
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus, and method may be implemented in other manners. For example, the described embodiment of an apparatus is merely exemplary. For example, the unit division is merely logical function division and may be another division in an actual implementation. For example, a plurality of units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the displayed or discussed mutual couplings or direct couplings or communication connections may be implemented by using some interfaces. The indirect couplings or communication connections between the apparatuses or units may be implemented in electronic, mechanical, or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, functional units in the embodiments of the invention may be integrated into one processing unit, or each of the units may exist alone physically, or two or more units are integrated into one unit.

Claims (19)

  1. A network node (110a) for performing a distributed networking function in a communication network (100) , wherein the network node (110a) is connected via one or more communication links (111) to one or more neighboring network nodes (110b-d) of a plurality of network nodes (110a-g) of the communication network (100) , wherein the network node (110a) comprises:
    a message processing module (121), including a message aggregation module (121a) and a hidden state update module (121b), wherein the message aggregation module (121a) is configured to receive a message (150) from each of the one or more neighboring network nodes (110b-d), wherein the message (150) comprises information about a hidden state of the respective neighboring network node (110b-d), and wherein the hidden state update module (121b) is configured to generate an updated hidden state of the network node (110a) based on a hidden state of the network node (110a) and the information about the one or more hidden states of the one or more neighboring network nodes (110b-d); and
    a control policy module (123) configured to select, based on the updated hidden state of the network node, one or more actions for increasing a performance measure of the distributed networking function in the communication network (100) .
  2. The network node (110a) of claim 1, wherein the message processing module (121) is configured to iteratively generate the updated hidden state of the network node (110a) based on the hidden state of the network node (110a) and the information about the one or more hidden states of the one or more neighboring network nodes (110b-d) for a predefined number of iterations or until a stop criterion has been reached.
  3. The network node (110a) of claim 1 or 2, wherein the message processing module (121) comprises a graph neural network, GNN, configured to generate the updated hidden state of the network node (110a) based on the hidden state of the network node and the information about the one or more hidden states of the one or more neighboring network nodes (110b-d) .
  4. The network node (110a) of claim 3, wherein the message processing module (121) is configured to initialize the hidden state of the network node (110a) based on one or more input features of the GNN.
  5. The network node (110a) of claim 4, wherein the one or more input features of the GNN comprise at least one of: one or more adjustable network node parameters relating to a global network routing strategy, traffic measurement information, and graph-based measurement information.
  6. The network node (110a) of any one of claims 3 to 5, wherein each graph node of the GNN represents one of the plurality of network nodes (110a-g) and wherein each graph edge of the GNN represents a respective communication link (111) between two respective network nodes of the plurality of network nodes (110a-g) .
  7. The network node (110a) of any one of claims 3 to 6, wherein the message processing module (121) is configured to represent the hidden state of the network node (110a) as a hidden state vector.
  8. The network node (110a) of claim 7, wherein the hidden state vector comprises a first component indicative of a weight of a respective communication link (111) and/or a second component indicative of a utilization measure of a respective communication link (111) .
  9. The network node (110a) of claim 8, wherein one or more further components of the hidden state vector are zero-padded.
  10. The network node (110a) of any one of claims 3 to 5, wherein the GNN is defined by one or more adjustable GNN parameters and wherein values of the one or more adjustable GNN parameters are based on a training stage of the GNN, wherein in the training stage the plurality of network nodes (110a-g) have defined a first network architecture and at least one second network architecture.
  11. The network node (110a) of any one of the preceding claims, wherein the control policy module (123) comprises a multi-agent reinforcement learning, MARL, neural network configured to select, based on the updated hidden state of the network node (110a) and the information about the updated hidden states of the one or more neighboring network nodes (110b-d), the one or more actions for increasing the performance measure of the distributed networking function in the communication network (100) .
  12. The network node (110a) of claim 11, wherein the distributed networking function is a distributed routing function within the communication network (100) and wherein a reward defined by the MARL neural network is a change of the maximum communication link load.
  13. The network node (110a) of claim 12, wherein the control policy module (123) is configured to generate a destination-based routing table and to route data within the communication network (100) on the basis of the destination-based routing table.
  14. The network node (110a) of claim 13, wherein the control policy module (123) is configured to generate the destination-based routing table by determining a weighted shortest path to a destination network node of the plurality of network nodes (110b-g) based on a plurality of weights of the plurality of communication links (111) defining the weighted shortest path to the destination network node.
  15. The network node (110a) of any one of claims 11 to 14, wherein the MARL neural network is defined by one or more adjustable MARL neural network parameters and wherein values of the one or more adjustable MARL neural network parameters are based on a training stage of the MARL neural network, wherein in the training stage the plurality of network nodes (110a-g) have defined a first network architecture and at least one second network architecture.
  16. The network node (110a) of any one of the preceding claims, wherein the network node (110a) is a router or a switch.
  17. An assembly of network nodes (110a-g) comprising a plurality of network nodes (110a-g) according to any one of the preceding claims.
  18. A method (600) for performing a distributed networking function by a network node (110a) in a communication network (100) , wherein the network node (110a) is connected via one or more communication links (111) to one or more neighboring network nodes (110b-d) of a plurality of network nodes (110b-g) of the communication network (100) , wherein the method (600) comprises:
    receiving (601) a message (150) from each of the one or more neighboring network nodes (110b-d) , wherein the message (150) comprises information about a hidden state of the respective neighboring network node (110b-d) ;
    generating (603) an updated hidden state of the network node (110a) , based on a hidden state of the network node (110a) and the information about the one or more hidden states of the one or more neighboring network nodes (110b-d) ; and
    selecting (605) , based on the updated hidden state of the network node (110a) , one or more actions for increasing a performance measure of the distributed networking function in the communication network (100) .
  19. A computer program product comprising a computer-readable storage medium for storing program code which causes a computer or a processor to perform the method (600) of claim 18 when the program code is executed by the computer or the processor.
PCT/CN2021/091915 2021-05-06 2021-05-06 Devices and methods for autonomous distributed control of computer networks WO2022232994A1 (en)

Priority Applications (1)

Application Number: PCT/CN2021/091915
Priority Date: 2021-05-06
Filing Date: 2021-05-06
Title: Devices and methods for autonomous distributed control of computer networks (WO2022232994A1)

Publications (1)

Publication Number: WO2022232994A1 (en)

Family ID: 83932560


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116032818A (en) * 2022-12-09 2023-04-28 烽火通信科技股份有限公司 SFC path generation method and system based on centrality constraint
CN116170327A (en) * 2023-03-01 2023-05-26 西安电子科技大学 Segmented routing network incremental deployment method based on graph neural network and reinforcement learning

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101247273A (en) * 2008-02-27 2008-08-20 北京航空航天大学 Maintenance method of service cooperated node organization structure in distributed environment
CN101635654A (en) * 2008-07-23 2010-01-27 华为技术有限公司 Method, device and system for measuring network traffic
EP2897402A2 (en) * 2011-10-27 2015-07-22 Telefonaktiebolaget L M Ericsson (Publ) Distributed parameter adaptation in a communication network
US20170238251A1 (en) * 2014-09-28 2017-08-17 Telefonaktiebolaget Lm Ericsson (Publ) Method and Network Node for Facilitating Synchronization in Network

Similar Documents

Publication Publication Date Title
US10547563B2 (en) Efficient message forwarding for distributed resource orchestration
US10574567B2 (en) Modeling a border gateway protocol network
JP6271039B2 (en) Path selection in hybrid networks
WO2022232994A1 (en) Devices and methods for autonomous distributed control of computer networks
WO2021243585A1 (en) Method and system for generating network configurations using graph neural network
CN110679120B (en) Communication network node
Dai et al. Routing optimization meets Machine Intelligence: A perspective for the future network
CN111340192B (en) Network path allocation model training method, path allocation method and device
Rafiq et al. Service function chaining and traffic steering in SDN using graph neural network
Nguyen et al. Toward adaptive joint node and link mapping algorithms for embedding virtual networks: A conciliation strategy
Zhou et al. Multi-task deep learning based dynamic service function chains routing in SDN/NFV-enabled networks
Bhavanasi et al. Dealing with changes: Resilient routing via graph neural networks and multi-agent deep reinforcement learning
Khezri et al. Deep Q-learning for dynamic reliability aware NFV-based service provisioning
Guo et al. Intelligent edge network routing architecture with blockchain for the IoT
Ghosh et al. A centralized hybrid routing model for multicontroller SD‐WANs
GB2537085A (en) Determining bandwidth requirements for network services
Feng et al. A delay-aware deployment policy for end-to-end 5G network slicing
León et al. Logistic regression-based solution to predict the transport assistant placement in SDN networks
Wang et al. A Q-Learning based Routing Optimization Model in a Software Defined Network
Blose et al. Scalable Hybrid Switching-Driven Software Defined Networking Issue: From the Perspective of Reinforcement Learning
Zhang et al. SDN Multi-Domain Routing for Knowledge-Defined Networking
Dogan et al. Analysing Performances of Dynamic Routing Protocols on Various Network Topologies
Wei et al. G-Routing: Graph Neural Networks-Based Flexible Online Routing
Kobayashi et al. Embedding chains of virtual network functions in inter-datacenter networks
Wang et al. Deep Reinforcement Learning based Intelligent Routing for Multi-service Traffic SDN

Legal Events

121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 21939642; Country of ref document: EP; Kind code of ref document: A1)

NENP: Non-entry into the national phase (Ref country code: DE)

122 EP: PCT application non-entry in European phase (Ref document number: 21939642; Country of ref document: EP; Kind code of ref document: A1)