CN114286413B

CN114286413B - TSN network joint routing and stream distribution method and related equipment

Info

Publication number: CN114286413B
Application number: CN202111290231.7A
Authority: CN
Inventors: 魏翼飞; 阳柳; 李骏; 王小娟; 宋梅
Original assignee: Beijing University of Posts and Telecommunications
Current assignee: Beijing University of Posts and Telecommunications
Priority date: 2021-11-02
Filing date: 2021-11-02
Publication date: 2023-09-19
Anticipated expiration: 2041-11-02
Also published as: CN114286413A

Abstract

The application provides a TSN network joint routing and flow distribution method and related equipment, wherein the method comprises the following steps: constructing a system model of the TSN network based on the software defined network, wherein the system model comprises a controller; constructing a Markov decision model of communication flow distribution and routing problems in a TSN network, and determining a state space, an action space and a reward function; taking the controller as an agent, based on the Markov decision model, taking the minimum end-to-end average time delay of the communication flow under the constraint condition as an optimization target, and obtaining a routing strategy of the communication flow by utilizing a DQN algorithm; and distributing routing paths for each communication flow according to the routing strategy. The technical scheme of the application can meet the service quality of high-priority traffic transmission, and simultaneously, low-priority traffic can finish transmission within the maximum end-to-end time delay.

Description

TSN network joint routing and stream distribution method and related equipment

Technical Field

The present application relates to the field of communications networks, and in particular, to a TSN network combined routing and flow allocation method and related devices.

Background

A time-sensitive network (Time-Sensitive Networking, TSN) is a mixed-traffic system that carries both deterministic and non-deterministic traffic. Each message generated in a TSN network is classified, according to its communication requirements, as Time-Triggered (TT) traffic, Audio Video Bridging (AVB) traffic, or Best-Effort (BE) traffic.

The TSN network relies mainly on bounded delay and jitter to guarantee quality of service. To prevent best-effort traffic from interfering with real-time traffic, the different traffic classes in the TSN network must be scheduled and routed. At present, in order to simplify and abstract this complex problem, many methods assume that the set of routing paths and the flows to be scheduled are fixed and known a priori, which results in low network utilization. In addition, these methods no longer apply when a link changes or bursty traffic occurs; their generalization capability is low and they cannot schedule traffic efficiently.

Disclosure of Invention

In view of the above, the present application is directed to a TSN network joint routing and flow allocation method and related devices for solving the above-mentioned problems.

Based on the above object, a first aspect of the present application provides a method for joint routing and flow allocation of a TSN network, including:

constructing a system model of the TSN network based on the software defined network, wherein the system model comprises a controller;

constructing a Markov decision model of communication flow distribution and routing problems in a TSN network, and determining a state space, an action space and a reward function;

taking the controller as an agent, based on the Markov decision model, taking the minimum end-to-end average time delay of the communication flow under the constraint condition as an optimization target, and obtaining a routing strategy of the communication flow by utilizing a DQN algorithm;

and distributing routing paths for each communication flow according to the routing strategy.

Further, the routing policy includes:

and the intelligent agent selects a next-hop node for each communication flow in the current node queues according to the current network state until each communication flow finishes path allocation or reaches preset iteration times.

Further, the constraint is represented by the following formula:

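$$
\begin{aligned}
&\text{(a)}\quad D_{f_k}(t) \le \tau_{TT}, && \forall f_k \in F_{TT} \\
&\text{(b)}\quad D_{f_k}(t) \le \tau_{AVB}, && \forall f_k \in F_{AVB} \\
&\text{(c)}\quad D_{f_k}(t) \le T, && \forall f_k \in F_{BE} \\
&\text{(d)}\quad c_{ij}(t) \le u_{ij}, && \forall e_{ij} \in E
\end{aligned}
$$

wherein $D_{f_k}(t)$ represents the end-to-end delay in the transmission of a communication flow from a source node to a destination node, $t$ represents the time slot, $f_k$ represents a communication flow, $F_{TT}$ represents the set of time-triggered TT flows, $\tau_{TT}$ represents the maximum value of the end-to-end delay of TT traffic, $\tau_{AVB}$ represents the maximum value of the end-to-end delay of audio video bridging AVB traffic, $F_{AVB}$ represents the set of AVB flows, $T$ represents the communication period, $F_{BE}$ represents the set of best-effort BE flows, $c_{ij}(t)$ represents the link capacity used by the communication flows from node $i$ to node $j$, and $u_{ij}$ represents the link capacity from node $i$ to node $j$;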

the optimization objective is represented by the following formula:

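$$\min \sum_{t \in T'} \left( \omega_1 \bar{D}_{TT}(t) + \omega_2 \bar{D}_{AVB}(t) \right)$$

wherein $\omega_1$ and $\omega_2$ are weights that represent the optimization preference, with $\omega_1 + \omega_2 = 1$, $T'$ denotes all time slots in the communication period, $\bar{D}_{TT}(t)$ represents the normalized average delay of TT traffic at time slot $t$, and $\bar{D}_{AVB}(t)$ represents the normalized average delay of AVB traffic at time slot $t$.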

Further, the state space includes a network state including a node link, a remaining capacity of the node link, a node queue, and a state of the communication flow;

the action space comprises selecting a next-hop node for each communication flow in the current node and forwarding the next-hop node, so that the communication flow enters a corresponding priority queue;

the reward function $r_t$ is represented by the following formula:

wherein $\rho_t$ and $\eta_t$ represent control functions: $\rho_t = -1$ when each of the communication flows of time slot $t$ reaches the destination node, otherwise $\rho_t = 0$; a further control term is set according to whether the accumulated delay at the current node exceeds the maximum allowable delay; if the communication flow has not reached the destination node and the maximum allowable delay is not exceeded, $\eta_t = -1$, otherwise $\eta_t = 0$; $u$ is a constant greater than 0; and $\phi_t$ represents a function that is positively correlated with the current node queue length.

Further, the obtaining the routing policy of the communication flow by using the DQN algorithm further includes:

acquiring a network topology diagram of a TSN network;

performing feature extraction on each node of the network topological graph by using a pre-trained graph convolution neural network to obtain a feature extraction result;

and updating the network state based on the feature extraction result.

Further, the number of layers of the graph convolutional neural network is 2, and the propagation rule of the $l$-th hidden layer of the graph convolutional neural network is represented by the following formula:

$$H^{(l+1)} = \sigma\left( \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{(l)} W^{(l)} \right)$$

wherein $\sigma(\cdot)$ represents the activation function, $\tilde{A} = A + I$ represents adding a self-loop for each node, $A$ represents the connection relation between nodes, $I$ is an identity matrix, $\tilde{D}$ represents the degree matrix, which records the number of links connected to each node, and $W^{(l)}$ represents the weight matrix of the $l$-th layer of the graph convolutional neural network;

the graph convolution operator of the graph convolutional neural network is represented by the following formula:

$$h_i^{(l+1)} = \sigma\left( \sum_{j \in N_i} \frac{1}{c_{ij}} h_j^{(l)} W^{(l)} \right)$$

wherein $h_i^{(l+1)}$ represents the features of node $i$ at layer $(l+1)$, $h_j^{(l)}$ represents the features of node $j$ at layer $l$, $N_i$ represents the set of neighbor nodes of node $i$, and $c_{ij}$ represents the normalization factor;

the forward propagation formula of the graph convolutional neural network is as follows:

$$Z = \sigma\left( \hat{A}\, \sigma\left( \hat{A} H^{(0)} W^{(0)} \right) W^{(1)} \right)$$

wherein $\hat{A} = \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}}$ represents the normalized $\tilde{A}$, $H^{(0)}$ represents the node feature matrix, and $W^{(0)}$ and $W^{(1)}$ represent the weight matrices of the first layer and the second layer of the graph convolutional neural network, respectively.

Further, the system model comprises a topology management module, a flow management module and a queue management module; wherein,

the topology management module is configured to obtain network topology information of the TSN network and to represent it as a directed graph $G = (V, E)$, where $V = \{v_1, v_2, \dots, v_N\}$ represents the node set of the $N$ switches in the network, and $E = \{e_{ij} \mid i, j \in N,\ i \neq j\}$ represents the set of $L$ physical links;

the traffic management module is used for acquiring communication tasks in the TSN network and abstracting the communication tasks into the communication flows, wherein each communication flow $f_k$ is represented by the following tuple:

$$f_k = \left( n_{src,k},\ n_{dst,k},\ r_k,\ p_k,\ \tau_k,\ \delta_k \right)$$

wherein $n_{src,k}, n_{dst,k} \in V$ represent the source node and the destination node of the communication flow $f_k$, respectively, $r_k$ represents the size of the communication flow, $p_k \in \mathbb{N}^*$ represents the period of the communication flow, $\tau_k \in \mathbb{R}^+$ represents the maximum allowable delay of the communication flow, and $\delta_k$ represents the priority of the communication flow;

The queue management module is used for generating a node queue of the communication flow according to the priority of the communication flow, and the expression of the node queue is as follows:

$$q_i \equiv \{ q_{i,1}, q_{i,2}, \dots, q_{i,p} \}$$

wherein $q_{i,p}$ represents the $p$-th priority queue of node $v_i$.

Based on the same inventive concept, a second aspect of the present application provides a TSN network joint routing and flow distribution device, comprising:

a first construction module, configured to construct a system model of the TSN network based on a software-defined network, the system model comprising a controller;

a second construction module, configured to construct a Markov decision model of the communication flow allocation and routing problem in the TSN network, and to determine a state space, an action space and a reward function;

a data processing module, configured to take the controller as an agent and, based on the Markov decision model, with the minimum end-to-end average delay of the communication flows under the constraint conditions as the optimization target, obtain a routing policy for the communication flows by using a DQN algorithm;

a policy execution module, configured to allocate a routing path for each of the communication flows according to the routing policy.

Based on the same inventive concept, a third aspect of the present application provides an electronic device, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method according to the first aspect when executing the program.

Based on the same inventive concept, a fourth aspect of the present application provides a non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method of the first aspect.

From the above, the TSN network joint routing and flow distribution method and related equipment provided by the present application consider the mixed scheduling of critical and non-critical traffic in the TSN network, which is closer to a real network environment. This increases the flexibility of traffic scheduling, ensures low delay and jitter for TT traffic transmission, reduces the end-to-end delay of AVB traffic, and allows BE traffic to be transmitted normally within the maximum end-to-end delay.

Drawings

In order to more clearly illustrate the technical solutions of the present application or related art, the drawings that are required to be used in the description of the embodiments or related art will be briefly described below, and it is apparent that the drawings in the following description are only embodiments of the present application, and other drawings may be obtained according to the drawings without inventive effort to those of ordinary skill in the art.

Fig. 1 is a flowchart of a TSN network joint routing and flow allocation method according to an embodiment of the present application;

Fig. 2 is a schematic structural diagram of a system model of a TSN network according to an embodiment of the present application;

Fig. 3 is a flowchart of updating the network state using a graph convolutional neural network according to an embodiment of the present application;

Fig. 4 is a schematic structural diagram of a TSN network joint routing and flow distribution device according to an embodiment of the present application;

Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.

Detailed Description

The present application will be further described in detail below with reference to specific embodiments and with reference to the accompanying drawings, in order to make the objects, technical solutions and advantages of the present application more apparent.

It should be noted that unless otherwise defined, technical or scientific terms used in the embodiments of the present application should be given the ordinary meaning as understood by one of ordinary skill in the art to which the present application belongs. The terms "first," "second," and the like, as used in embodiments of the present application, do not denote any order, quantity, or importance, but rather are used to distinguish one element from another. The word "comprising" or "comprises", and the like, means that elements or items preceding the word are included in the element or item listed after the word and equivalents thereof, but does not exclude other elements or items.

As described in the background section, related-art solutions for scheduling traffic in a TSN network struggle to meet practical needs: they schedule only one type of traffic while holding the other fixed, and they assume that the set of routing paths and the flows to be scheduled are fixed and known a priori. In the course of realizing the present application, the applicant found the following problems in the prior art: transmitting traffic over a fixed set of routing paths leads to low network utilization; more importantly, when a link between nodes in the network fails or the traffic changes, the traffic can no longer be scheduled reasonably.

In view of this, the present application provides a scheme of combining routing and flow distribution for TSN networks, which combines graph convolutional neural networks (Graph convolutional neural network, GCN) with deep reinforcement learning algorithms to schedule and route traffic of TSN networks, so as to ensure low delay and jitter of TT service transmission, reduce end-to-end delay of AVB service, and enable normal transmission of BE traffic.

The following describes the technical scheme of the present application in detail through specific examples.

Referring to fig. 1, a TSN network joint routing and flow allocation method provided in an embodiment of the present application specifically includes the following steps:

Step S101, constructing a system model of the TSN network based on a software-defined network (Software Defined Network, SDN), the system model comprising a controller.

In this step, in conjunction with fig. 2, the system model includes a data plane, a control plane, and an application plane.

The data plane comprises end devices, switches, and the full-duplex physical links between them. An end device is a network device that generates communication tasks. Typically, the end device that generates a message is the talker, and the end device at which the message arrives over the physical links is the listener; each end device can be both a talker and a listener. The switch acts as a bridge in the message transmission process, receiving and sending messages according to a schedule. The network topology can be represented as a directed graph $G = (V, E)$, where $V = \{v_1, v_2, \dots, v_N\}$ represents the node set of the $N$ switches in the network, and $E = \{e_{ij} \mid i, j \in N,\ i \neq j\}$ represents the set of $L$ physical links.

The control plane includes the controllers, which in turn include a Centralized User Configuration (CUC) entity, a Centralized Network Configuration (CNC) entity, and an SDN controller. The controllers are connected through a gateway. It should be noted that the CUC collects the communication requirements (frequency, delay and jitter requirements) and sends them to the CNC; the CNC calculates the routing paths and the schedule according to the communication requests provided by the CUC; and, because of the hop count limit of the TSN network, dynamic network connectivity in the physical system is provided by the SDN controller.

The control plane connects the data plane and the application plane and, by collecting network state information such as communication flow state and node state, provides the application plane with a global view of the data plane. The application plane may provide different services, including network monitoring, data storage, and traffic scheduling and routing.
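As a minimal illustrative sketch (not part of the described embodiment), the directed-graph representation of the topology could be held as adjacency lists. The function name `build_topology`, the 1-based switch numbering and the example links are assumptions made only for this illustration; each full-duplex physical link is modelled here as two directed edges.

```python
def build_topology(num_switches, duplex_links):
    """Build adjacency lists for a directed graph G = (V, E).

    Hypothetical helper: every full-duplex link (i, j) is stored as
    the two directed edges e_ij and e_ji.
    """
    nodes = list(range(1, num_switches + 1))  # V = {v_1, ..., v_N}
    adjacency = {v: [] for v in nodes}
    for i, j in duplex_links:
        if i == j:
            raise ValueError("a link must join two distinct switches (i != j)")
        if i not in adjacency or j not in adjacency:
            raise ValueError(f"unknown switch in link ({i}, {j})")
        adjacency[i].append(j)
        adjacency[j].append(i)
    return nodes, adjacency


# Example: four switches connected in a ring.
nodes, adjacency = build_topology(4, [(1, 2), (2, 3), (3, 4), (1, 4)])
```

Adjacency lists keep the neighbor lookup cheap, which is convenient when a next hop has to be chosen among the neighbors of the current node.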

Step S102, a Markov decision model of the communication flow distribution and routing problem in the TSN network is constructed, and a state space, an action space and a reward function are determined.

In this step, the allocation and routing of communication flows in the TSN network is a sequential decision problem, so the process can be modeled as a Markov decision process. Because the environment changes dynamically and the solution space is large and highly complex, a reinforcement learning algorithm can be used to solve the problem.

Step S103, taking the controller as an agent and, based on the Markov decision model, with the minimum end-to-end average delay of the communication flows under the constraint conditions as the optimization target, obtaining a routing policy for the communication flows by using a DQN (Deep Q-Network) algorithm.

In this step, a reinforcement learning algorithm requires no prior knowledge of the environment and instead acquires information by interacting with it. The reinforcement learning process is as follows: at time step $t$, the agent receives the state $s_t \in S$ and selects an action $a_t \in A$ according to the policy $\pi(a_t \mid s_t)$, where the policy $\pi$ maps the state space to the action space, denoted $\pi: S \to P(A)$; after executing the action $a_t$, the agent obtains an immediate reward $r_t$ and then transitions to the next state $s_{t+1}$ with transition probability $P(s_{t+1} \mid s_t, a_t)$. The iteration continues until the episode ends or a termination condition is met. The goal of the agent is to maximize the long-term cumulative return, which can be expressed as $R_t = \sum_{k=0}^{\infty} \gamma^k r_{t+k}$, where $\gamma \in (0, 1]$ represents the discount factor.
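The cumulative return described above can be illustrated with a short sketch for a finite episode. The function name `discounted_return` and the example rewards are assumptions for illustration only.

```python
def discounted_return(rewards, gamma):
    """Return r_0 + gamma * r_1 + gamma^2 * r_2 + ... for one finite episode."""
    if not 0.0 < gamma <= 1.0:
        raise ValueError("gamma must lie in (0, 1]")
    total = 0.0
    # Accumulate backwards so each step is: total = r + gamma * total.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


example_return = discounted_return([1.0, 1.0, 1.0], gamma=0.5)
```

With three rewards of 1.0 and a discount factor of 0.5 the return is 1 + 0.5 + 0.25 = 1.75.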

Specifically, reinforcement learning algorithms use a state-value function or an action-value function to evaluate how good it is for the agent to be in a given state, or to take a given action in a given state. The state value of state $s_t$ under policy $\pi$ can be decomposed into the following Bellman expectation equation:

$$V(s_t) = \mathbb{E}_{\pi}\left[ r_t + \gamma V(s_{t+1}) \mid s_t \right]$$

wherein $V(s_t)$ represents the state-value function of state $s_t$ at time slot $t$, and $V(s_{t+1})$ represents the state-value function of state $s_{t+1}$ at time slot $t+1$. An optimal policy can be found from the Bellman equation, wherein the optimal action-value function is defined as:

$$Q^{*}(s_t, a_t) = \mathbb{E}\left[ r_t + \gamma \max_{a_{t+1}} Q^{*}(s_{t+1}, a_{t+1}) \mid s_t, a_t \right]$$

After the state values converge, the optimal policy $\pi^{*}$ can be calculated by the following formula:

$$\pi^{*}(s_t) = \arg\max_{a_t \in A} Q^{*}(s_t, a_t)$$

It should be noted that a model-free reinforcement learning algorithm is adopted because prior knowledge of the scenario cannot be obtained. Moreover, because the action space and the state space of the allocation and routing process of communication flows in the TSN network are very large, enumerating all states and actions in a value-iteration reinforcement learning algorithm (Q-learning) would incur considerable time and memory cost. The DQN algorithm, which introduces deep learning on the basis of Q-learning, is therefore adopted.
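To make the table-enumeration cost concrete, the following is a minimal sketch of the standard tabular Q-learning update that DQN builds upon. It is not the method of the embodiment; the function name `q_learning_update`, the learning rate `alpha` and the string-valued states are assumptions for illustration.

```python
from collections import defaultdict


def q_learning_update(q_table, state, action, reward, next_state, actions, alpha, gamma):
    """Apply one tabular Q-learning step and return the updated Q(state, action)."""
    # Best value reachable from the next state over all candidate actions.
    best_next = max(q_table[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q_table[(state, action)] += alpha * (td_target - q_table[(state, action)])
    return q_table[(state, action)]


# The table needs one entry per (state, action) pair, which is what
# becomes impractical when the state and action spaces are very large.
q_table = defaultdict(float)
q_table[("s1", "a")] = 2.0
updated = q_learning_update(q_table, "s0", "a", 1.0, "s1", ["a", "b"], alpha=0.5, gamma=0.5)
```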

The DQN consists of two networks with identical structure but different parameters: a target network and a Q network. The parameters of the target network are updated with the parameters of the Q network once every $C$ iterations. DQN employs a deep network (the Q network) to approximate the value function $Q(s_t, a_t)$; the approximation obtained by the Q network can be expressed as $Q(s_t, a_t; \theta_i)$, where $\theta_i$ represents the parameters of the Q network at the $i$-th iteration, namely the connection weights of the neural network. The optimization target of each iteration of the Q network is generated by the target network and can be expressed as:

$$y_t = r_t + \gamma \max_{a_{t+1}} Q(s_{t+1}, a_{t+1}; \theta^{-})$$

wherein $a_{t+1}$ represents the action at the next moment, $s_{t+1}$ represents the state at the next moment, and $\theta^{-}$ represents the parameters of the target network.
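The target computation and the periodic parameter copy can be sketched as follows, assuming the standard DQN target of the immediate reward plus the discounted maximum target-network value. The names `dqn_target` and `should_sync_target`, and the handling of terminal transitions, are assumptions for illustration and are not stated in the text.

```python
def dqn_target(reward, next_q_values, gamma, terminal=False):
    """Compute y_t from the target network's Q-values for the next state.

    next_q_values holds Q(s_{t+1}, a; theta_minus) for every candidate action a.
    Returning only the reward on terminal transitions is an assumed convention.
    """
    if terminal:
        return reward
    return reward + gamma * max(next_q_values)


def should_sync_target(iteration, sync_every):
    """True when the target network should copy the Q-network parameters."""
    return iteration > 0 and iteration % sync_every == 0


example_target = dqn_target(-1.0, [0.5, 2.0, 1.0], gamma=0.5)
```

Keeping the target parameters frozen between copies means the regression target does not move at every gradient step.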

During the Q network training process, parameter updates are made by minimizing the following loss functions:

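$$L(\theta_i) = \mathbb{E}\left[ \left( y_t - Q(s_t, a_t; \theta_i) \right)^2 \right]$$

Taking the partial derivative of the loss function with respect to $\theta_i$ gives:

$$\nabla_{\theta_i} L(\theta_i) = -2\, \mathbb{E}\left[ \left( y_t - Q(s_t, a_t; \theta_i) \right) \nabla_{\theta_i} Q(s_t, a_t; \theta_i) \right]$$

Another improvement of the DQN algorithm is the introduction of an experience replay mechanism. At time step $t$, the agent stores its experience $e_t = (s_t, a_t, r_t, s_{t+1})$, comprising the current state, the action, the reward and the next state, in an experience pool $D$, and each time randomly samples a mini-batch from $D$ to update the network parameters. The expression is as follows:

$$L(\theta_i) = \mathbb{E}_{(s_t, a_t, r_t, s_{t+1}) \sim U(D)}\left[ \left( r_t + \gamma \max_{a_{t+1}} Q(s_{t+1}, a_{t+1}; \theta^{-}) - Q(s_t, a_t; \theta_i) \right)^2 \right]$$

wherein $\theta^{-}$ represents the parameters of the target network and $U(D)$ denotes uniform sampling from the experience pool.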
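The experience pool can be sketched with a bounded buffer that discards the oldest experience when full and samples uniformly at random. The class name `ReplayBuffer`, the capacity-based eviction and the optional seed are assumptions for illustration.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity experience pool D with uniform mini-batch sampling."""

    def __init__(self, capacity, seed=None):
        self._buffer = deque(maxlen=capacity)  # oldest entries are dropped first
        self._rng = random.Random(seed)

    def store(self, state, action, reward, next_state):
        """Store one experience e_t = (s_t, a_t, r_t, s_{t+1})."""
        self._buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        """Draw batch_size distinct experiences uniformly at random."""
        return self._rng.sample(list(self._buffer), batch_size)

    def __len__(self):
        return len(self._buffer)


pool = ReplayBuffer(capacity=3, seed=0)
for step in range(5):
    pool.store(step, 0, -1.0, step + 1)
batch = pool.sample(2)
```

Sampling at random from the pool breaks the correlation between consecutive experiences, which is the usual motivation for this mechanism.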

Step S104, allocating a routing path for each communication flow according to the routing policy.

It can be seen that the technical solution of this embodiment considers the mixed scheduling of critical and non-critical traffic in the TSN network, which is closer to a real network environment. It increases the flexibility of traffic scheduling, ensures low delay and jitter for TT traffic transmission, reduces the end-to-end delay of AVB traffic, and ensures the normal transmission of BE traffic. In addition, the constructed system model facilitates the implementation of reinforcement learning and improves the computational efficiency of the DQN algorithm.

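In some embodiments, the routing policy comprises: the agent selects a next-hop node for each communication flow in the current node queues according to the current network state, until every communication flow has completed path allocation or a preset number of iterations is reached.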

In this embodiment, by combining the time delay characteristics of different types of communication flows, the non-critical traffic and the critical traffic are mixed and scheduled, which is closer to the real network environment, and the scheduling flexibility is increased.

In some embodiments, the constraint is represented by the following expression:

$$
\begin{aligned}
&\text{(a)}\quad D_{f_k}(t) \le \tau_{TT}, && \forall f_k \in F_{TT} \\
&\text{(b)}\quad D_{f_k}(t) \le \tau_{AVB}, && \forall f_k \in F_{AVB} \\
&\text{(c)}\quad D_{f_k}(t) \le T, && \forall f_k \in F_{BE} \\
&\text{(d)}\quad c_{ij}(t) \le u_{ij}, && \forall e_{ij} \in E
\end{aligned}
$$

wherein $D_{f_k}(t)$ represents the end-to-end delay in the transmission of a communication flow from a source node to a destination node, $t$ represents the time slot, $f_k$ represents a communication flow, $F_{TT}$ represents the set of time-triggered TT flows, $\tau_{TT}$ represents the maximum value of the end-to-end delay of TT traffic, $\tau_{AVB}$ represents the maximum value of the end-to-end delay of audio video bridging AVB traffic, $F_{AVB}$ represents the set of AVB flows, $T$ represents the communication period, $F_{BE}$ represents the set of best-effort BE flows, $c_{ij}(t)$ represents the link capacity used by the communication flows from node $i$ to node $j$, and $u_{ij}$ represents the link capacity from node $i$ to node $j$.

It should be noted that constraint (a) indicates that the end-to-end delay of TT traffic is less than or equal to the maximum allowable end-to-end delay of TT traffic; constraint (b) indicates that the end-to-end delay of AVB traffic is less than or equal to the maximum allowable end-to-end delay of AVB traffic; constraint (c) indicates that the transmission of BE traffic should be completed within the preset communication period, to ensure that BE traffic can be transmitted in time; and constraint (d) indicates that the link capacity used in the current time slot cannot exceed the link capacity.
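A feasibility check for constraints (a) to (d) could look like the sketch below. The function name `satisfies_constraints`, the dictionary-based flow records and the class labels "TT", "AVB" and "BE" are assumptions made for illustration.

```python
def satisfies_constraints(flows, used_capacity, link_capacity, tau_tt, tau_avb, period):
    """Return True when all four constraints hold for the current time slot.

    flows: list of dicts with keys "traffic_class" and "delay" (hypothetical layout).
    used_capacity / link_capacity: dicts keyed by directed link (i, j).
    """
    # (a) TT delay bound, (b) AVB delay bound, (c) BE finishes within the period.
    delay_limit = {"TT": tau_tt, "AVB": tau_avb, "BE": period}
    for flow in flows:
        if flow["delay"] > delay_limit[flow["traffic_class"]]:
            return False
    # (d) capacity used on each link must not exceed that link's capacity.
    return all(used <= link_capacity[link] for link, used in used_capacity.items())


example_flows = [
    {"traffic_class": "TT", "delay": 0.8},
    {"traffic_class": "AVB", "delay": 1.5},
    {"traffic_class": "BE", "delay": 9.0},
]
feasible = satisfies_constraints(
    example_flows, {(1, 2): 40.0}, {(1, 2): 100.0}, tau_tt=1.0, tau_avb=2.0, period=10.0
)
```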

In particular, the end-to-end delay $D_{f_k}(t)$ in the transmission of a communication flow from a source node to a destination node can be represented by the following formula:

$$D_{f_k}(t) = d_{pr} + d_{tr} + d_{pg} + d_{q}$$

wherein $d_{pr}$ represents the processing delay, the size of which depends on the switch design; $d_{tr}$ represents the transmission delay, which is determined by the frame size and the link transmission rate; $d_{pg}$ represents the propagation delay of a link, the size of which is determined by the propagation medium and the cable length; and $d_{q}$ represents the queuing delay.

Of these, the processing, transmission and propagation delays are deterministic and bounded. Queuing delay, however, arises when several communication flows attempt to transmit at the same switch egress port, and its value is not deterministic because it depends on the current queue length. The end-to-end delay is therefore mainly determined by the queuing delay, and the communication flows can be isolated in time and space to reduce it.
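The four delay components can be illustrated for a single hop with the sketch below, which assumes that the transmission delay is the frame size divided by the link rate and the propagation delay is the cable length divided by the signal speed. The function name `hop_delay`, the units and the example values are assumptions for illustration.

```python
def hop_delay(frame_bits, link_rate_bps, cable_length_m, signal_speed_mps,
              processing_delay_s, queuing_delay_s):
    """Sum the processing, transmission, propagation and queuing delays of one hop."""
    d_tr = frame_bits / link_rate_bps           # transmission delay
    d_pg = cable_length_m / signal_speed_mps    # propagation delay
    return processing_delay_s + d_tr + d_pg + queuing_delay_s


# 1500-byte frame on a 1 Gbit/s link over 100 m of cable, empty queue.
base = hop_delay(12000, 1e9, 100.0, 2e8, 1e-6, 0.0)
# Same hop with 4 microseconds of queuing: only the last term changes.
queued = hop_delay(12000, 1e9, 100.0, 2e8, 1e-6, 4e-6)
```

In this example the first three terms are fixed by the hardware and the frame, so any variation between `base` and `queued` comes from the queue alone.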

The optimization objective is represented by the following formula:

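$$\min \sum_{t \in T'} \left( \omega_1 \bar{D}_{TT}(t) + \omega_2 \bar{D}_{AVB}(t) \right)$$

wherein $\omega_1$ and $\omega_2$ are weights that represent the optimization preference, with $\omega_1 + \omega_2 = 1$, $T'$ denotes all time slots in the communication period, $\bar{D}_{TT}(t)$ represents the normalized average delay of TT traffic at time slot $t$, and $\bar{D}_{AVB}(t)$ represents the normalized average delay of AVB traffic at time slot $t$.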

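Specifically, the normalized average delay of TT traffic at time slot $t$ is represented by the following formula:

$$\bar{D}_{TT}(t) = \frac{1}{|F_{TT}|} \sum_{f_k \in F_{TT}} \frac{D_{f_k}(t)}{\tau_{TT}}$$

wherein $D_{f_k}(t)$ represents the end-to-end delay of communication flow $f_k$, $|F_{TT}|$ represents the number of TT flows, and $\tau_{TT}$ represents the maximum allowable end-to-end delay of TT traffic, which is a constant.

The normalized average delay of AVB traffic at time slot $t$ is represented by the following formula:

$$\bar{D}_{AVB}(t) = \frac{1}{|F_{AVB}|} \sum_{f_k \in F_{AVB}} \frac{D_{f_k}(t)}{\tau_{AVB}}$$

wherein $|F_{AVB}|$ represents the number of AVB flows, and $\tau_{AVB}$ represents the maximum allowable end-to-end delay of AVB traffic, which is a constant.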
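The following sketch shows one way the normalized average delays and the weighted objective could be evaluated. It assumes that the normalized average delay is the mean over flows of each delay divided by the class bound, and that the objective accumulates the weighted sum over time slots; the function names and example values are assumptions for illustration.

```python
def normalized_average_delay(delays, tau_max):
    """Mean of delay / tau_max over the flows of one traffic class in one slot."""
    if not delays:
        return 0.0
    return sum(d / tau_max for d in delays) / len(delays)


def weighted_objective(tt_slots, avb_slots, tau_tt, tau_avb, w1, w2):
    """Accumulate w1 * D_TT(t) + w2 * D_AVB(t) over all time slots."""
    if abs(w1 + w2 - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(
        w1 * normalized_average_delay(tt, tau_tt)
        + w2 * normalized_average_delay(avb, tau_avb)
        for tt, avb in zip(tt_slots, avb_slots)
    )


tt_norm = normalized_average_delay([1.0, 3.0], tau_max=4.0)
objective = weighted_objective([[1.0, 3.0]], [[2.0]], tau_tt=4.0, tau_avb=8.0, w1=0.5, w2=0.5)
```

Dividing by the class bound puts TT and AVB delays on the same scale, so the weights express a preference between classes rather than compensating for different units.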

In some embodiments, the state space comprises a network state, which includes the remaining capacity $g_i$ of the node links, the node queue $q_i$, the state $\Upsilon_k$ of the communication flow $f_k$, and information related to the network state. The flow state is $\Upsilon_k = (n_{src,k}, n_{dst,k}, n_{pos,k}, r_k, p_k, \zeta_k, \delta_k)$, where $n_{src,k}$ and $n_{dst,k}$ represent the source node and the destination node of the communication flow, respectively, $n_{pos,k}$ represents the node where the communication flow is currently located, $r_k$ represents the size of the communication flow, $p_k$ represents the period of the communication flow, $\zeta_k$ represents the cumulative delay of the communication flow up to the current node, and $\delta_k$ represents the priority of the communication flow. The information related to the network state comprises the number of communication tasks $K$, the number of network nodes $N$ and the number of queue priorities $P$.

Specifically, the state $s_t$ of time slot $t$ in the state space $S$ is represented by the following formula:

$$s_t = \{ g_1(t), g_2(t), \dots, g_N(t),\ q_1(t), q_2(t), \dots, q_N(t),\ \Upsilon_1(t), \Upsilon_2(t), \dots, \Upsilon_K(t) \}$$

wherein $g_i(t) = [g_{i1}(t), g_{i2}(t), \dots, g_{iN}(t)]$ represents the remaining capacity of the links connected to node $i$; $q_i(t) = [q_{i,1}(t), q_{i,2}(t), \dots, q_{i,P}(t)]$ represents the queues of node $i$; and $\Upsilon_k(t)$ represents the state of communication flow $f_k$.
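As a hedged sketch, the state could be flattened into a single vector before being fed to a Q network. The `FlowState` dataclass, the use of queue lengths to summarize each priority queue, and the flattening order are assumptions for illustration.

```python
from dataclasses import dataclass, astuple


@dataclass
class FlowState:
    """Per-flow state: source, destination, current node, size, period,
    cumulative delay so far, and priority (hypothetical field names)."""
    src: int
    dst: int
    pos: int
    size: float
    period: int
    cumulative_delay: float
    priority: int


def build_state(remaining_capacity, queue_lengths, flow_states):
    """Concatenate link capacities, queue lengths and flow states into one list."""
    state = []
    for row in remaining_capacity:      # g_i(t): N values per node
        state.extend(row)
    for node_queues in queue_lengths:   # q_i(t): P values per node
        state.extend(node_queues)
    for flow in flow_states:            # flow state: 7 values per flow
        state.extend(astuple(flow))
    return state


state = build_state(
    remaining_capacity=[[0.0, 80.0], [60.0, 0.0]],   # N = 2
    queue_lengths=[[1, 0, 2], [0, 0, 1]],            # P = 3
    flow_states=[FlowState(1, 2, 1, 1500.0, 4, 0.0, 0)],  # K = 1
)
```

With this layout the vector length is N*N + N*P + 7*K, so it grows with both the topology and the number of flows.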

The action space comprises selecting a next-hop node for each communication flow in the current node and forwarding the next-hop node, so that the communication flows enter corresponding priority queues.

It should be noted that, when routing the communication flows (a flow is indivisible and is transmitted on only one path at a time), the agent needs to allocate each flow a path from its source node to its destination node. Directly selecting a complete path for each flow is difficult in a large-scale network, and links shared between different paths may cause collisions and interference between flows. In this scheme, the agent instead selects a next hop for each communication flow at its current node, sends the flow to that next node, and places it in the corresponding priority queue. Compared with selecting a path directly, next-hop selection improves the generalization capability of the algorithm, and the path selection of each communication flow is completed through successive iterations.
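The hop-by-hop construction of a path can be sketched as a loop that repeatedly asks a policy for the next hop until the destination is reached or an iteration limit is hit. The function `route_flow`, the callback signature `choose_next_hop(node, dst, neighbors)` and the trivial example policy are assumptions for illustration; in the scheme described above the choice would come from the trained agent.

```python
def route_flow(src, dst, adjacency, choose_next_hop, max_iterations):
    """Build a path one hop at a time; return None if dst is not reached in time."""
    path = [src]
    node = src
    for _ in range(max_iterations):
        if node == dst:
            return path
        node = choose_next_hop(node, dst, adjacency[node])
        path.append(node)
    return path if node == dst else None


# Stand-in policy: go straight to the destination if it is a neighbor,
# otherwise take the first listed neighbor.
def simple_policy(node, dst, neighbors):
    return dst if dst in neighbors else neighbors[0]


ring = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [1, 3]}
path = route_flow(1, 3, ring, simple_policy, max_iterations=5)
```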

Specifically, $A$ represents the action space, and the action $a_t \in A$ at time slot $t$ is expressed as:

$$a_t = \{ a_k^n(t) \mid k = 1, \dots, K;\ n = 1, \dots, N \}$$

wherein $a_k^n(t)$ indicates whether candidate node $n$ is selected as the next hop of communication flow $k$: $a_k^n(t) = 1$ indicates that node $n$ is selected as the next hop for communication flow $k$; conversely, $a_k^n(t) = 0$ indicates that node $n$ is not selected.
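A greedy next-hop choice restricted to the neighbors of the current node, together with its indicator encoding, could be sketched as follows. The function name `select_next_hop`, the 0-based node indices and the example Q-values are assumptions for illustration.

```python
def select_next_hop(q_values, neighbors):
    """Pick the neighbor with the highest Q-value and return (node, indicator vector).

    q_values[n] is the estimated value of forwarding to node n (0-based index);
    only nodes listed in neighbors are valid candidates.
    """
    if not neighbors:
        raise ValueError("the current node has no candidate next hop")
    best = max(neighbors, key=lambda n: q_values[n])
    indicator = [1 if n == best else 0 for n in range(len(q_values))]
    return best, indicator


# Node 0 has the highest value but is not adjacent, so node 2 is chosen.
next_hop, indicator = select_next_hop([0.9, 0.1, 0.5, 0.7], neighbors=[1, 2])
```

Masking the choice to adjacent nodes keeps the agent from selecting a next hop that has no physical link from the current node.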

The reward function $r_t$ is represented by the following formula:

In some embodiments, referring to fig. 3, the obtaining the routing policy of the communication flow using the DQN algorithm further includes the following steps:

Step S301, obtaining a network topology map of the TSN network.

In this step, the network topology map of the TSN network may be obtained by a topology management module in the control plane.

Step S302, feature extraction is carried out on each node of the network topological graph by utilizing a pre-trained graph convolution neural network so as to obtain a feature extraction result.

In the step, feature extraction is carried out on each node through the graph convolutional neural network, and the processed node features not only comprise the features of the current node but also comprise the features of neighbor nodes.

Step S303, updating the network state based on the feature extraction result.

In this step, when the network topology changes, the features of the current network nodes are updated in time to ensure the effectiveness of the routing policy.

In some embodiments, the number of layers of the graph convolutional neural network is 2, and the propagation rule of the $l$-th hidden layer of the graph convolutional neural network is represented by the following formula:

$$H^{(l+1)} = \sigma\left( \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{(l)} W^{(l)} \right)$$

wherein $\sigma(\cdot)$ represents the activation function, $\tilde{A} = A + I$ represents adding a self-loop for each node, $A$ represents the connection relation between nodes, $I$ is an identity matrix, $\tilde{D}$ represents the degree matrix, which records the number of links connected to each node, and $W^{(l)}$ represents the weight matrix of the $l$-th layer of the graph convolutional neural network.

The graph convolution operator of the graph convolutional neural network is represented by the following formula:

$$h_i^{(l+1)} = \sigma\left( \sum_{j \in N_i} \frac{1}{c_{ij}} h_j^{(l)} W^{(l)} \right)$$

wherein $h_i^{(l+1)}$ represents the features of node $i$ at layer $(l+1)$, $h_j^{(l)}$ represents the features of node $j$ at layer $l$, $N_i$ represents the set of neighbor nodes of node $i$, and $c_{ij}$ represents the normalization factor.
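A two-layer forward pass with self-loops and symmetric degree normalization can be sketched in plain Python as below. ReLU is used as the activation in both layers as an assumption, since the text does not name the activation function; the helper names and the tiny example matrices are also assumptions for illustration, and no training is shown.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def normalize_adjacency(adj):
    """Add self-loops, then scale entry (i, j) by 1 / sqrt(deg_i * deg_j)."""
    n = len(adj)
    a_tilde = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    degree = [sum(row) for row in a_tilde]
    return [[a_tilde[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
            for i in range(n)]


def relu(matrix):
    return [[max(0.0, x) for x in row] for row in matrix]


def gcn_forward(adj, h0, w0, w1):
    """Two propagation steps: each multiplies by the normalized adjacency,
    then by the layer weights, then applies the activation."""
    a_hat = normalize_adjacency(adj)
    h1 = relu(matmul(matmul(a_hat, h0), w0))
    return relu(matmul(matmul(a_hat, h1), w1))


adj = [[0, 1], [1, 0]]                      # two switches joined by one link
features = gcn_forward(adj, [[1.0, 0.0], [0.0, 1.0]], [[2.0], [4.0]], [[1.0, -1.0]])
```

Because each layer mixes a node's features with those of its neighbors, two layers let every node see information from nodes up to two hops away.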

In some embodiments, in conjunction with fig. 2, the system model includes a topology management module (TPM), a Traffic Management Module (TMM), and a Queue Management Module (QMM). Wherein,

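the topology management module is used for acquiring network topology information of the TSN network and representing it as a directed graph $G = (V, E)$, where $V = \{v_1, v_2, \dots, v_N\}$ represents the node set of the $N$ switches in the network, and $E = \{e_{ij} \mid i, j \in N,\ i \neq j\}$ represents the set of $L$ physical links.

The traffic management module is used for acquiring communication tasks in the TSN network and abstracting them into the communication flows, each communication flow $f_k$ being represented by the tuple $f_k = (n_{src,k}, n_{dst,k}, r_k, p_k, \tau_k, \delta_k)$, where $n_{src,k}, n_{dst,k} \in V$ represent the source node and the destination node, $r_k$ represents the size of the communication flow, $p_k$ represents its period, $\tau_k$ represents its maximum allowable delay, and $\delta_k$ represents its priority.

The queue management module is used for generating the node queues of the communication flows according to the priorities of the communication flows, the node queue being expressed as:

$$q_i \equiv \{ q_{i,1}, q_{i,2}, \dots, q_{i,p} \}$$

wherein $q_{i,p}$ represents the $p$-th priority queue of node $v_i$.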
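The per-node set of priority queues can be sketched as a list of FIFO queues. The class name `NodeQueues`, the convention that index 0 is the highest priority, and the strict-priority dequeue order are assumptions for illustration; the text does not specify the service discipline.

```python
from collections import deque


class NodeQueues:
    """Priority queues q_{i,1..p} of one node, each a FIFO of flow identifiers."""

    def __init__(self, num_priorities):
        self.queues = [deque() for _ in range(num_priorities)]

    def enqueue(self, flow_id, priority):
        """Place a flow in the queue that matches its priority (0 = highest, assumed)."""
        self.queues[priority].append(flow_id)

    def dequeue(self):
        """Serve the first non-empty queue in priority order; None if all are empty."""
        for queue in self.queues:
            if queue:
                return queue.popleft()
        return None

    def lengths(self):
        return [len(queue) for queue in self.queues]


node_queues = NodeQueues(num_priorities=3)
node_queues.enqueue("be-1", priority=2)
node_queues.enqueue("tt-1", priority=0)
node_queues.enqueue("avb-1", priority=1)
lengths_before = node_queues.lengths()
served = [node_queues.dequeue() for _ in range(4)]
```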

It should be noted that, the method of the embodiment of the present application may be performed by a single device, for example, a computer or a server. The method of the embodiment can also be applied to a distributed scene, and is completed by mutually matching a plurality of devices. In the case of such a distributed scenario, one of the devices may perform only one or more steps of the method of an embodiment of the present application, the devices interacting with each other to accomplish the method.

It should be noted that the foregoing describes some embodiments of the present application. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments described above and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.

Based on the same inventive concept, the application also provides a TSN network joint routing and flow distribution device corresponding to the method of any embodiment.

Referring to fig. 4, the TSN network joint routing and flow distribution device includes:

a first construction module 401 configured to construct a system model of the TSN network based on a software-defined network, the system model comprising a controller;

a second construction module 402 configured to construct a Markov decision model of the communication flow allocation and routing problem in the TSN network, and to determine a state space, an action space, and a reward function;

a data processing module 403 configured to take the controller as an agent and, based on the Markov decision model, obtain a routing policy for the communication flows by using a DQN algorithm, with minimizing the end-to-end average delay of the communication flows under the constraints as the optimization objective;

a policy enforcement module 404 configured to allocate a routing path to each of the communication flows according to the routing policy.
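As a hedged illustration of how a policy enforcement module might turn a learned routing policy into a concrete path, the following sketch greedily follows the highest-valued next hop from the source node to the destination node. The callable `q_value` is a hypothetical stand-in for a trained DQN, and the function name `extract_path`, the hop limit, and the loop-avoidance rule are assumptions that do not come from the present application.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a trained DQN:
# (current node, destination node, candidate next hop) -> estimated Q-value.
QFunction = Callable[[str, str, str], float]


def extract_path(
    adjacency: Dict[str, List[str]],
    q_value: QFunction,
    src: str,
    dst: str,
    max_hops: int = 16,
) -> List[str]:
    """Greedily pick the next hop with the largest Q-value until the destination is reached."""
    path = [src]
    current = src
    while current != dst:
        if len(path) - 1 >= max_hops:
            raise RuntimeError("hop limit reached before the destination")
        # Skip nodes already on the path so the route cannot loop.
        candidates = [n for n in adjacency.get(current, []) if n not in path]
        if not candidates:
            raise RuntimeError("no unvisited next hop available")
        current = max(candidates, key=lambda n: q_value(current, dst, n))
        path.append(current)
    return path
```

In this sketch a lookup table can play the role of the Q-function during testing, whereas a deployed system would presumably query the trained network instead.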

For convenience of description, the above device is described in terms of separate functional modules. Of course, when implementing the present application, the functions of the modules may be implemented in one or more pieces of software and/or hardware.

The device of the foregoing embodiment is configured to implement the TSN network joint routing and flow allocation method of any one of the foregoing embodiments, and has the beneficial effects of the corresponding method embodiment, which are not repeated here.

Based on the same inventive concept, the application also provides an electronic device corresponding to the method of any embodiment, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor implements the TSN network joint routing and flow distribution method of any embodiment when executing the program.

Fig. 5 shows a more specific hardware architecture of an electronic device according to this embodiment, where the device may include: a processor 1010, a memory 1020, an input/output interface 1030, a communication interface 1040, and a bus 1050. Wherein processor 1010, memory 1020, input/output interface 1030, and communication interface 1040 implement communication connections therebetween within the device via a bus 1050.

The processor 1010 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits, and is used to execute relevant programs to implement the technical solutions provided in the embodiments of the present disclosure.

The memory 1020 may be implemented in the form of ROM (Read Only Memory), RAM (Random Access Memory), a static storage device, a dynamic storage device, or the like. The memory 1020 may store an operating system and other application programs, and when the embodiments of the present specification are implemented in software or firmware, the associated program code is stored in the memory 1020 and executed by the processor 1010.

The input/output interface 1030 is used to connect with an input/output module for inputting and outputting information. The input/output module may be configured as a component in a device (not shown) or may be external to the device to provide corresponding functionality. Wherein the input devices may include a keyboard, mouse, touch screen, microphone, various types of sensors, etc., and the output devices may include a display, speaker, vibrator, indicator lights, etc.

The communication interface 1040 is used to connect a communication module (not shown) to enable communication between the present device and other devices. The communication module may communicate in a wired manner (such as USB or a network cable) or in a wireless manner (such as a mobile network, Wi-Fi, or Bluetooth).

Bus 1050 includes a path for transferring information between components of the device (e.g., processor 1010, memory 1020, input/output interface 1030, and communication interface 1040).

It should be noted that although the above-described device only shows processor 1010, memory 1020, input/output interface 1030, communication interface 1040, and bus 1050, in an implementation, the device may include other components necessary to achieve proper operation. Furthermore, it will be understood by those skilled in the art that the above-described apparatus may include only the components necessary to implement the embodiments of the present description, and not all the components shown in the drawings.

The electronic device of the foregoing embodiment is configured to implement the TSN network joint routing and flow allocation method of any one of the foregoing embodiments, and has the beneficial effects of the corresponding method embodiment, which are not repeated here.

Based on the same inventive concept, the present application also provides a non-transitory computer readable storage medium corresponding to the method of any embodiment, wherein the non-transitory computer readable storage medium stores computer instructions for causing the computer to execute the TSN network joint routing and flow allocation method according to any embodiment.

The computer readable media of the present embodiments include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.

The storage medium of the foregoing embodiment stores computer instructions for causing the computer to perform the TSN network joint routing and flow allocation method of any one of the foregoing embodiments, and has the beneficial effects of the corresponding method embodiments, which are not repeated here.

Those of ordinary skill in the art will appreciate that: the discussion of any of the embodiments above is merely exemplary and is not intended to suggest that the scope of the application (including the claims) is limited to these examples; the technical features of the above embodiments or in the different embodiments may also be combined within the idea of the application, the steps may be implemented in any order, and there are many other variations of the different aspects of the embodiments of the application as described above, which are not provided in detail for the sake of brevity.

Additionally, well-known power/ground connections to Integrated Circuit (IC) chips and other components may or may not be shown within the provided figures, in order to simplify the illustration and discussion, and so as not to obscure the embodiments of the present application. Furthermore, the devices may be shown in block diagram form in order to avoid obscuring the embodiments of the present application, and also in view of the fact that specifics with respect to implementation of such block diagram devices are highly dependent upon the platform within which the embodiments of the present application are to be implemented (i.e., such specifics should be well within purview of one skilled in the art). Where specific details (e.g., circuits) are set forth in order to describe example embodiments of the application, it should be apparent to one skilled in the art that embodiments of the application can be practiced without, or with variation of, these specific details. Accordingly, the description is to be regarded as illustrative in nature and not as restrictive.

While the application has been described in conjunction with specific embodiments thereof, many alternatives, modifications, and variations of those embodiments will be apparent to those skilled in the art in light of the foregoing description. For example, other memory architectures (e.g., dynamic RAM (DRAM)) may use the embodiments discussed.

The present embodiments are intended to embrace all such alternatives, modifications, and variations that fall within the broad scope of the appended claims. Therefore, any omissions, modifications, equivalent substitutions, improvements, and the like that are within the spirit and principles of the embodiments of the application are intended to be included within the scope of the application.

Claims

1. A TSN network joint routing and flow distribution method, comprising:

taking the controller as an agent and, based on the Markov decision model, obtaining a routing policy for the communication flows by using a DQN algorithm, with minimizing the end-to-end average delay of the communication flows under the constraints as the optimization objective; the constraints are represented by a formula whose terms are defined as follows: the delay term represents the end-to-end delay of a communication flow in transmission from the source node to the destination node, $t$ represents the time slot, $f_k$ represents a communication flow, $F_{TT}$ represents the set of time-triggered (TT) flows, $\tau_{TT}$ represents the maximum end-to-end delay of TT traffic, $\tau_{AVB}$ represents the maximum end-to-end delay of audio video bridging (AVB) traffic, $F_{AVB}$ represents the set of AVB traffic, $T$ represents the communication period, $F_{BE}$ represents the set of best-effort (BE) traffic, the used-capacity term indicates the link capacity used by the communication flow from node $i$ to node $j$, and $u_{ij}$ represents the link capacity from node $i$ to node $j$; the optimization objective is represented by a formula in which $\omega_1$ and $\omega_2$ are weights representing the optimization tendency, with $\omega_1 + \omega_2 = 1$, $t'$ represents all time slots in the communication period, and the two delay terms represent the normalized average delay of TT traffic in time slot $t$ and the normalized average delay of AVB traffic in time slot $t$, respectively;

2. The method of claim 1, wherein the routing policy comprises:

3. The method of claim 1, wherein the state space comprises a network state comprising a node link, a remaining capacity of the node link, a node queue, and a state of the communication flow;

the reward function $r_t$ is represented by a formula in which $\rho_t$, $\eta_t$, and a delay-related term all represent control functions: when each of the communication flows in time slot $t$ reaches the destination node, $\rho_t = -1$, and otherwise $\rho_t = 0$; the delay-related control function takes one value if the accumulated delay at the current node exceeds the maximum allowable delay and another value otherwise; if the communication flow has not reached the destination node and the maximum allowable delay has not been exceeded, $\eta_t = -1$, and otherwise $\eta_t = 0$; $u$ is a constant greater than 0; and $\phi_t$ represents a function positively correlated with the current node queue length.

4. The method of claim 3, wherein, before the routing policy of the communication flow is obtained by using the DQN algorithm, the method further comprises:

acquiring a network topology diagram of a TSN network;

and updating the network state based on the feature extraction result.

5. The method of claim 4, wherein the number of layers of the graph convolutional neural network is 2, and the propagation rule of the $l$-th hidden layer of the graph convolutional neural network is represented by the following formula:

6. The method of claim 1, wherein the system model comprises a topology management module, a traffic management module, and a queue management module; wherein,

wherein n is _src,k ,n _dst,k E V represents the communication flow f respectively _k Is provided with a source node, a destination node,indicating the size of the communication stream, P _k ∈N ^* Representing the period of the communication stream τ _k ∈R ⁺ Representing the maximum allowable delay, delta, of a communication stream _k Representing priority of communication flows, and

q _i ＝{q _i，1 ，q _i，2 ，…,q _i，p }

wherein q _i,p Representing node v _i Is the p-th priority queue of (c).

7. A TSN network joint routing and flow distribution device, comprising:

and a data processing module configured to take the controller as an agent and, based on the Markov decision model, obtain a routing policy for the communication flows by using a DQN algorithm, with minimizing the end-to-end average delay of the communication flows under the constraints as the optimization objective; the constraints are represented by a formula whose terms are defined as follows: the delay term represents the end-to-end delay of a communication flow in transmission from the source node to the destination node, $t$ represents the time slot, $f_k$ represents a communication flow, $F_{TT}$ represents the set of time-triggered (TT) flows, $\tau_{TT}$ represents the maximum end-to-end delay of TT traffic, $\tau_{AVB}$ represents the maximum end-to-end delay of audio video bridging (AVB) traffic, $F_{AVB}$ represents the set of AVB traffic, $T$ represents the communication period, $F_{BE}$ represents the set of best-effort (BE) traffic, the used-capacity term indicates the link capacity used by the communication flow from node $i$ to node $j$, and $u_{ij}$ represents the link capacity from node $i$ to node $j$; the optimization objective is represented by a formula in which $\omega_1$ and $\omega_2$ are weights representing the optimization tendency, with $\omega_1 + \omega_2 = 1$, $t'$ represents all time slots in the communication period, and the two delay terms represent the normalized average delay of TT traffic in time slot $t$ and the normalized average delay of AVB traffic in time slot $t$, respectively;

8. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of any one of claims 1 to 6 when executing the program.

9. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1 to 6.