WO2023274304A1 - Distributed routing determination method, electronic device and storage medium - Google Patents

Distributed routing determination method, electronic device and storage medium

Info

Publication number
WO2023274304A1
Authority
WO
WIPO (PCT)
Prior art keywords
domain
intra
training
inter
domain path
Prior art date
Application number
PCT/CN2022/102398
Other languages
English (en)
French (fr)
Inventor
唐春
李丹
李海滨
彭鑫
薄开涛
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2023274304A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/02 Topology update or discovery
    • H04L45/04 Interdomain routing, e.g. hierarchical routing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/04 Inference or reasoning models
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/16 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks using machine learning or artificial intelligence
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/02 Topology update or discovery
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/02 Topology update or discovery
    • H04L45/08 Learning-based routing, e.g. using neural networks or artificial intelligence
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/38 Flow based routing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/44 Distributed routing

Definitions

  • The present application relates to routing technology, and in particular to a distributed machine learning routing determination method, electronic equipment, and storage media.
  • With the explosive growth of network traffic, the number of routing and switching devices at the bottom of the network increases exponentially.
  • The services carried by routing devices are becoming increasingly complex, which calls for an efficient routing control method to forward data so as to improve the quality and efficiency of network communication.
  • At present, routing control in the Internet mainly adopts the methods of "topology-based static routing" and "planning-based traffic engineering". Because static routing does not consider traffic status, such solutions generally need to reserve sufficient network bandwidth to cope with burst traffic, resulting in a very low maximum link utilization in current networks. Because planning-based traffic engineering needs to collect traffic status information of the entire network and solving the planning problem takes a long time, such solutions can hardly respond in real time to highly dynamic network traffic states.
  • Machine-learning-based routing algorithms require continuous training on historical data of network traffic.
  • A machine learning training system typically places both training and inference at a central node that works on data from the entire network, while the dynamic changes of network traffic require the routing model to be updated promptly so as to respond to traffic changes in time.
  • Specifically, the inference node first collects the instantaneous data of each node and aggregates the data by time to form feature data; then the inference node runs the inference logic and delivers the inference results to each node; finally, each node sets the routing of packets according to the inference results. This process can hardly meet the immediacy requirements of traffic changes.
  • Embodiments of the present application provide a distributed routing determination method, electronic equipment, and storage media, which can respond to changes in network traffic in a timely manner by separating route model training from the route inference process.
  • An embodiment of the present application proposes a distributed routing determination method, including the following steps: receiving training information sent by multiple edge routers, the training information including intra-domain traffic data of the domain where the edge router is located and inter-domain delay data corresponding to the edge router; training an intra-domain path model according to the intra-domain traffic data and sending the intra-domain path model to the corresponding edge router, so that the edge router adjusts intra-domain paths according to the intra-domain path model to generate an intra-domain path scheme; training an inter-domain path model according to the inter-domain delay data and generating an inter-domain path scheme according to the inter-domain path model; and sending the inter-domain path scheme to the corresponding edge router, so that the edge router determines a routing path according to the inter-domain path scheme and the intra-domain path scheme.
  • An embodiment of the present application also proposes a distributed routing determination method, including the following steps: acquiring training information and sending the training information to the training system, the training information including intra-domain traffic data of the domain where the edge router is located and inter-domain delay data corresponding to the edge router; receiving the intra-domain path model sent by the training system and adjusting intra-domain paths according to the intra-domain path model to generate an intra-domain path scheme, the intra-domain path model being generated by the training system through training on the intra-domain traffic data; receiving the inter-domain path scheme sent by the training system, the inter-domain path scheme being generated by an inter-domain path model and the inter-domain path model being generated by the training system through training on the inter-domain delay data; and determining a routing path according to the inter-domain path scheme and the intra-domain path scheme.
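  • A minimal, hypothetical sketch of the message flow implied by these two methods is given below. All names (TrainingSystem, EdgeRouter, TrainingInfo, _average) are illustrative only, and "training" is stubbed out as simple averaging rather than the reinforcement learning actually used in the embodiments.

```python
# Hypothetical sketch of steps S100A-S300A / S100B-S300B; names are illustrative.
from dataclasses import dataclass
from typing import Dict, List, Tuple

Pair = Tuple[str, str]

@dataclass
class TrainingInfo:
    router_id: str
    intra_domain_traffic: Dict[Pair, float]   # end-to-end traffic per collection cycle
    inter_domain_delay: Dict[Pair, float]     # measured delay between domains

def _average(samples: List[Dict[Pair, float]]) -> Dict[Pair, float]:
    keys = {k for s in samples for k in s}
    return {k: sum(s.get(k, 0.0) for s in samples) / max(len(samples), 1) for k in keys}

class TrainingSystem:
    def __init__(self) -> None:
        self.reports: List[TrainingInfo] = []

    def receive(self, info: TrainingInfo) -> None:
        """S100A: receive training information sent by an edge router."""
        self.reports.append(info)

    def train(self) -> Tuple[Dict[Pair, float], Dict[Pair, float]]:
        """S200A: produce an intra-domain path model and an inter-domain path model
        (here: plain averages standing in for reinforcement learning)."""
        intra_model = _average([r.intra_domain_traffic for r in self.reports])
        inter_model = _average([r.inter_domain_delay for r in self.reports])
        return intra_model, inter_model

class EdgeRouter:
    def determine_route(self, intra_model: Dict[Pair, float], inter_scheme: list) -> dict:
        """S200B/S300B: combine the locally derived intra-domain scheme with the
        received inter-domain scheme to fix the routing path."""
        intra_scheme = {pair: 1.0 for pair in intra_model}   # placeholder split ratios
        return {"inter_domain_path": inter_scheme, "intra_domain_splits": intra_scheme}

ts = TrainingSystem()
ts.receive(TrainingInfo("R1", {("R1", "R2"): 1000.0}, {("D1", "D2"): 12.0}))
intra_model, inter_model = ts.train()
inter_scheme = [min(inter_model, key=inter_model.get)]   # crude stand-in for S300A
print(EdgeRouter().determine_route(intra_model, inter_scheme))
```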
  • the embodiment of the present application also provides an electronic device, including a memory and a processor.
  • the memory stores computer programs.
  • the processor executes the computer program, the aforementioned routing path determination method is realized.
  • the embodiment of the present application also provides a computer-readable storage medium, storing computer-executable instructions, where the computer-executable instructions are used to execute the method for determining a distributed route as described above.
  • FIG. 1 shows a network topology diagram of a distributed routing determination system according to an embodiment of the present application
  • FIG. 2 is a schematic diagram of data exchange during inter-domain path training according to an embodiment of the present application
  • FIG. 3 is a schematic diagram of data exchange during intra-domain path training according to an embodiment of the present application.
  • FIG. 4 is a schematic diagram of data exchange during intra-domain path reasoning according to an embodiment of the present application.
  • FIG. 5 is a schematic diagram of data exchange during inter-domain path reasoning according to an embodiment of the present application.
  • FIG. 6 is a data exchange logic diagram of a distributed routing determination system according to an embodiment of the present application.
  • FIG. 7A is a flow chart of the training system side of the distributed routing determination method according to the embodiment of the present application.
  • FIG. 7B is a flow chart on the router side of a method for determining a distributed route according to an embodiment of the present application
  • FIG. 8 is a flowchart of a sub-method for inter-domain path training according to an embodiment of the present application.
  • FIG. 9 is a flowchart of a sub-method for intra-domain path training according to an embodiment of the present application.
  • Figure 10 shows a block diagram of a representative computing device that can be used to implement the various features and processes described herein.
  • The present application provides a distributed routing determination system, method, and storage medium. Edge routers periodically collect training information, upload it to the intra-domain training subsystem of their domain, and forward it to the inter-domain training subsystem; the intra-domain training subsystem and the inter-domain training subsystem train the intra-domain path model and the inter-domain path model, respectively, so as to respond to dynamic burst traffic in a timely manner. Meanwhile, the trained intra-domain path model and inter-domain path model are delivered to the corresponding edge routers and the inter-domain reasoning subsystem, which determine the routing path adjustment scheme through distributed inference.
  • The technical solution of the present application exploits the fact that machine learning training is slow while inference with a trained model is fast: it separates model training from model-based inference, so that the model can be continuously updated from historical data and respond to dynamic changes in network traffic in a timely manner.
  • FIG. 1 is a network topology diagram of a distributed route determination system according to an embodiment of the present application.
  • the system includes an inter-domain training subsystem 1 , an intra-domain training subsystem 10 a , an inter-domain reasoning subsystem 10 b and an edge controller 101 .
  • The inter-domain training subsystem 1 can be set in a federated controller (not shown) between multiple domains 2, while the intra-domain training subsystem 10a and the inter-domain reasoning subsystem 10b can be set in the intra-domain controller (not shown) of each domain 2.
  • one set of the inter-domain training subsystem 1 and the intra-domain training subsystem 10a may be deployed in each domain.
  • The latter federated deployment mode ensures the timeliness of the inter-domain path model 301 and the intra-domain path model 403, and reduces the transmission of intermediate data.
  • The federated controller and the intra-domain controllers can provide a REST service as an external interface, so that users can configure and manage the entire training system and browse training results through the web.
  • The federated controller and the intra-domain controllers can communicate based on REST API and gRPC, and the intra-domain controllers can communicate with the edge routers in their domains using the gRPC protocol.
  • The edge routers can exchange data with each training subsystem through unified topology, traffic, delay, and model transceiver interfaces.
  • The edge router 101 is arranged at the edge of each domain as an application-side traffic entrance and is connected to the intra-domain controller of the domain where it is located. The edge router 101 is therefore an intelligent router on which a machine learning model is deployed, so that machine learning methods can be used to periodically perform routing decision inference tasks online according to the local network traffic status and output the traffic split ratio over the candidate paths between nodes, thereby ensuring routing efficiency. This is because the overhead of adjusting the candidate paths at the router is very large, whereas the overhead of adjusting only the traffic split over fixed candidate paths is very small.
  • The candidate paths may be initialized by algorithms such as K-Shortest-Paths (KSP) or independent routing.
  • The routing module 101b of the edge router 101 can adjust packet routing in time based on the traffic split ratio.
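  • As an illustration of how fixed candidate paths and a traffic split ratio could work together, the following hedged sketch initializes K candidate paths with a K-shortest-paths computation and forwards a flow according to a given split ratio. networkx is used here only for convenience; the embodiments do not prescribe any particular library, and the toy topology, weights, and ratio values are made up.

```python
# Hypothetical sketch: fixed candidate paths from KSP, then split-ratio-based forwarding.
import random
from itertools import islice
import networkx as nx

def k_shortest_paths(graph: nx.Graph, src: str, dst: str, k: int = 3) -> list:
    """Return up to k loop-free shortest paths, enumerated in order of total weight."""
    return list(islice(nx.shortest_simple_paths(graph, src, dst, weight="weight"), k))

def pick_path(candidates: list, split_ratio: list) -> list:
    """Choose one candidate path for a packet/flow according to the split ratio."""
    return random.choices(candidates, weights=split_ratio, k=1)[0]

# Example usage on a toy topology.
g = nx.Graph()
g.add_weighted_edges_from([("a", "b", 1), ("b", "d", 1), ("a", "c", 1), ("c", "d", 2)])
paths = k_shortest_paths(g, "a", "d", k=2)   # candidate paths stay fixed
ratio = [0.7, 0.3]                           # split ratio output by the on-router model
print(pick_path(paths, ratio))
```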
  • the other routers within domain 2 are intermediate routers 102 .
  • the machine learning intelligent model may not be deployed on the intermediate router 102, so that it only needs to execute segment routing logic according to the path information of the packet header, without performing any routing decision reasoning tasks.
  • The intra-domain training subsystem 10a responds to the intra-domain training data information and performs offline reinforcement learning (Reinforcement Learning, RL, i.e., machine learning in a "trial and error" manner that obtains rewards by interacting with the environment and guides learning toward maximizing the obtained rewards) to generate an intra-domain path model 403. After the reinforcement learning training is completed, the intra-domain path model 403 is sent to each edge router 101 in the domain, and the corresponding edge router 101 generates an intra-domain path scheme 404 through distributed inference based on the intra-domain path model 403 to adjust intra-domain paths.
  • The inter-domain training subsystem 1 responds to the inter-domain training data information, performs offline training based on the delay data 401 forwarded by each domain to generate the inter-domain path model 301, and sends the inter-domain path model 301 to the inter-domain reasoning subsystem 10b of each domain.
  • After the inter-domain inference subsystem 10b acquires the inter-domain path model 301, it performs a one-time inference, generates an inter-domain path scheme 303 according to the policy, and sends these paths to each edge router 101 in the local domain for end-to-end path selection.
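  • The one-time inter-domain inference itself is model-based in the embodiments; as a hedged stand-in, the sketch below simply picks the domain sequence with the lowest total measured delay over the aggregated per-domain-pair delay data. Function and variable names are illustrative.

```python
# Hypothetical stand-in for one-shot inter-domain inference: shortest-delay domain path.
import networkx as nx

def inter_domain_scheme(delay: dict, src_domain: str, dst_domain: str) -> list:
    g = nx.DiGraph()
    for (a, b), d in delay.items():
        g.add_edge(a, b, delay=d)
    # Dijkstra over measured delays approximates what the trained model would output.
    return nx.dijkstra_path(g, src_domain, dst_domain, weight="delay")

delays = {("D1", "D2"): 12.0, ("D2", "D3"): 9.0, ("D1", "D3"): 30.0}
print(inter_domain_scheme(delays, "D1", "D3"))   # -> ['D1', 'D2', 'D3']
```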
  • The edge router 101 periodically collects training information, uploads it to the intra-domain training subsystem 10a of its domain, and forwards it to the inter-domain training subsystem 1, and the intra-domain training subsystem 10a and the inter-domain training subsystem 1 train the intra-domain path model 403 and the inter-domain path model 301 respectively, so as to respond to dynamic burst traffic in a timely manner.
  • The trained intra-domain path model 403 and inter-domain path model 301 are delivered to the corresponding edge router 101 and the inter-domain inference subsystem 10b respectively, and the routing path adjustment scheme is determined by distributed inference.
  • The application scenario of the intelligent routing training system described in the embodiments of the present application is intended to illustrate the technical solution of the embodiments more clearly and does not constitute a limitation to the technical solution provided in the embodiments of the present application.
  • the intelligent routing training system can be applied to networks such as IP networks, SDH networks, PTN networks, IPRAN networks, and OTN networks.
  • FIG. 1 does not constitute a limitation to the embodiments of the present application; the system may include more or fewer components than shown, combine certain components, or use a different component layout.
  • After the collection module 101a of the edge router 101 collects the delay data, it can upload the data to the delay collection interface of the corresponding inter-domain reasoning subsystem 10b, which forwards them to the inter-domain training subsystem 1 for delay aggregation. After the delay data of each domain are aggregated, the aggregated data can be saved as the raw data set at a specific time.
  • the raw data set at a specific moment may be analyzed, and based on the analysis result, inter-domain training data information may be sent to trigger reinforcement learning training of the inter-domain path model 301 .
  • The trigger condition for the reinforcement learning training of the inter-domain path model 301 is that the analysis result shows that the delay change in the collection cycle exceeds a threshold relative to the delay of the previous training cycle (the threshold can be configured according to training experience); in that case the training process is triggered, and the delay of this cycle is recorded as the current training cycle delay.
  • The trigger condition may also be a change in path asset data (for example, the addition or deletion of routing nodes); in that case, inter-domain training data information will likewise be sent.
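  • A minimal sketch of this trigger logic, assuming a relative delay-change threshold of 20% purely for illustration (the embodiments only say the threshold is configured from training experience):

```python
# Hypothetical trigger check for inter-domain training; threshold value is illustrative.
def should_trigger_training(current_delay: float,
                            last_training_delay: float,
                            threshold_ratio: float = 0.2,
                            assets_changed: bool = False) -> bool:
    if assets_changed:                                # e.g. routing nodes added/removed
        return True
    if last_training_delay == 0:
        return current_delay > 0
    change = abs(current_delay - last_training_delay) / last_training_delay
    return change > threshold_ratio                   # threshold set from training experience

# Example: previous training-cycle delay 10 ms, current cycle 13 ms -> 30% change.
print(should_trigger_training(13.0, 10.0))            # True with the assumed 20% threshold
```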
  • The trained model is delivered by the inter-domain training subsystem 1 to the inter-domain reasoning subsystem 10b of each domain.
  • The inter-domain reasoning subsystem 10b of each domain saves the model and triggers the inference process.
  • The inter-domain path scheme 303 formed based on the inter-domain path model 301 and the inference policy is delivered to all edge routers 101 in the domain where the inter-domain inference subsystem 10b is located.
  • The collection module 101a of each edge router 101 can collect the real historical traffic demand data of its domain over a past period of time (typically one month) and periodically send these data to the intra-domain training subsystem 10a through the traffic collection interface (for example, the traffic data of each flow on each router in the domain are decompressed and aggregated to form the traffic of every path in the domain in each cycle, which is saved to the file system as raw data).
  • the collection module 101a of each edge router 101 regularly collects real historical traffic demand data and aggregates it into a form of traffic demand matrix 203 . Then, the historical traffic demand data is replayed to simulate the historical operation of the network.
  • The data collected by all edge routers 101 in the same domain can be used for training and inference with reinforcement learning algorithms.
  • The data collected by the collection module 101a of the edge router 101 can be aggregated along the time dimension to form a traffic demand matrix, so that only data with representative characteristics are selected; this alleviates both the slowness of machine learning training and the burden of collecting a large amount of data.
  • The aggregation of collected data may adopt a principal component analysis method to select typical characteristic data.
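  • A hedged sketch of this aggregation-and-selection step, using scikit-learn's PCA as one possible implementation of principal component analysis; the component count, the 50th-percentile cut-off, and the array shapes are illustrative assumptions, not values from the embodiments:

```python
# Hypothetical PCA-based selection of representative traffic demand matrices (TMs).
import numpy as np
from sklearn.decomposition import PCA

def select_representative_tms(tms: np.ndarray, n_components: int = 8) -> np.ndarray:
    """tms: array of shape (num_samples, num_node_pairs), one flattened TM per cycle.
    Keep the samples whose PCA projections are most spread out, i.e. typical extremes."""
    pca = PCA(n_components=min(n_components, tms.shape[1]))
    projected = pca.fit_transform(tms)
    distance = np.linalg.norm(projected, axis=1)
    keep = distance >= np.percentile(distance, 50)    # illustrative cut-off
    return tms[keep]

tms = np.random.rand(240, 12)          # e.g. 240 hourly TMs, 12 edge-router pairs
print(select_representative_tms(tms).shape)
```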
  • the intra-domain training data information may be sent based on the elapse of a preset training cycle time by the timer of the intra-domain training subsystem.
  • the trigger condition may also be an intra-domain configuration data change.
  • the intra-domain path model 403 can be interactively and iteratively trained in this simulated network.
  • After training, the intra-domain training subsystem 10a sends the intra-domain path model to the inference module 101c of the edge router 101, which forms the intra-domain path scheme shown in the corresponding figure and selects a path.
  • The intra-domain training subsystem 10a may be deployed in a distributed manner.
  • The intra-domain training subsystem 10a may perform multi-GPU or multi-machine cooperative training based on a parameter server (for example, based on a microservice for GPU training supported by nvidia-docker).
  • each training and inference algorithm can be packaged into docker and provide a unified access method (for example, pv/pvs volume mapping access), so as to coordinate with the parameter server and nvidia-docker for scheduling to complete distributed training.
  • the intra-domain controller can generate simulated network elements and interface configurations according to the topology relationship when simulating and reproducing the historical operating conditions of the network.
  • the training data information in the domain can be sent in response to changes in the topological relationship of the domain.
  • the intra-domain training subsystem 10a may use the historical traffic demand matrix to drive the training system to perform multiple rounds of training.
  • the in-domain training subsystem 10a may perform processing such as algorithm mining, algorithm exploration, and big data analysis on the execution of the historical traffic demand matrix.
  • the intra-domain path model 403 obtained through training may be parameters of machine learning intelligent models representing edge routers that have obtained good evaluations in each traffic demand matrix under a given topology environment.
  • Fig. 7A is a flow chart of the training system side of the distributed route determination method provided by the embodiment of the present application.
  • The route determination method may include, but is not limited to, steps S100A to S300A.
  • Step S100A: receive training information sent by multiple edge routers 101.
  • the training information includes intra-domain traffic data of the domain 2 where the edge router 101 is located and inter-domain delay data corresponding to the edge router 101 .
  • The edge router 101 can periodically collect training information, upload it to the intra-domain training subsystem 10a of the domain where it is located, and forward it to the inter-domain training subsystem 1; the intra-domain training subsystem 10a and the inter-domain training subsystem 1 train the intra-domain path model 403 and the inter-domain path model 301 respectively, so as to respond to dynamic burst traffic in a timely manner.
  • Each edge router 101 can also collect the real historical traffic demand data of its domain over a past period of time (typically one month) and periodically send these data through the traffic collection interface to the intra-domain training subsystem 10a (for example, the traffic data of each flow on each router in the domain are decompressed and aggregated to form the traffic of every path in the domain in each cycle, which is saved to the file system as raw data).
  • After collecting the delay data, the collection module 101a of the edge router 101 can upload it to the delay collection interface of the corresponding inter-domain reasoning subsystem 10b, which forwards it to the inter-domain training subsystem 1 for delay aggregation. After the delay data of each domain are aggregated, the aggregated data can be saved as the raw data set at a specific time.
  • the raw data set at a specific moment may be analyzed, and based on the analysis result, inter-domain training data information may be sent to trigger reinforcement learning training of the inter-domain path model 301 .
  • The trigger condition for the reinforcement learning training of the inter-domain path model 301 is that the analysis result shows that the delay change in the collection cycle exceeds a threshold relative to the delay of the previous training cycle (the threshold can be configured according to training experience); in that case the training process is triggered, and the delay of this cycle is recorded as the current training cycle delay.
  • The trigger condition may also be a change in path asset data (for example, the addition or deletion of routing nodes); in that case, inter-domain training data information will likewise be sent.
  • Step S200A: train and generate an intra-domain path model 403 according to the intra-domain traffic data and send the intra-domain path model 403 to the corresponding edge router 101, so that the edge router 101 adjusts intra-domain paths according to the intra-domain path model 403 to generate an intra-domain path scheme 404; at the same time, train and generate an inter-domain path model 301 according to the inter-domain delay data and generate an inter-domain path scheme 303 according to the inter-domain path model 301.
  • In this step, the inter-domain training subsystem and the intra-domain training subsystem cooperate to train the inter-domain path model 301 and the intra-domain path model 403 respectively, in a logically centralized manner, and send the trained models to the corresponding inter-domain inference subsystem 10b and edge routers 101.
  • Reinforcement learning training is performed offline based on the delay data 201 collected by the edge routers in the network to generate an intra-domain path model 403, and the intra-domain path model 403 is delivered to each edge router 101 in the domain after the reinforcement learning training is completed.
  • The inter-domain training subsystem 1 responds to the inter-domain training data information, performs offline training based on the delay data 401 forwarded by each domain to generate the inter-domain path model 301, and sends the inter-domain path model 301 to the inter-domain reasoning subsystem 10b of each domain.
  • After the inter-domain inference subsystem 10b acquires the inter-domain path model 301, it performs a one-time inference, generates an inter-domain path scheme 303 according to the policy, and sends these paths to each edge router 101 in the local domain for end-to-end path selection.
  • Step S300A: send the inter-domain path scheme 303 to the corresponding edge router 101, so that the edge router 101 determines a routing path according to the inter-domain path scheme 303 and the intra-domain path scheme 404.
  • The inter-domain training subsystem 1 responds to the inter-domain training data information, performs offline training based on the delay data 401 forwarded by each domain to generate the inter-domain path model 301, and sends the inter-domain path model 301 to the inter-domain reasoning subsystem 10b of each domain.
  • After the inter-domain inference subsystem 10b acquires the inter-domain path model 301, it performs a one-time inference, generates an inter-domain path scheme 303 according to the policy, and sends these paths to each edge router 101 in the local domain for end-to-end path selection.
  • The intra-domain training subsystem 10a and the inter-domain training subsystem 1 train the intra-domain path model 403 and the inter-domain path model 301 respectively, so as to respond to dynamic burst traffic in a timely manner.
  • The trained intra-domain path model 403 and inter-domain path model 301 are delivered to the corresponding edge router 101 and the inter-domain inference subsystem 10b respectively, and the routing path adjustment scheme is determined by distributed inference.
  • FIG. 7B is a flowchart on the router side of the method for determining a distributed route provided by an embodiment of the present application.
  • The route determination method may include, but is not limited to, steps S100B to S300B.
  • Step S100B: acquire training information and send the training information to the training system.
  • the training information includes intra-domain traffic data of the domain where the edge router 101 is located and inter-domain delay data corresponding to the edge router.
  • The edge router 101 can periodically collect training information, upload it to the intra-domain training subsystem 10a of the domain where it is located, and forward it to the inter-domain training subsystem 1; the intra-domain training subsystem 10a and the inter-domain training subsystem 1 train the intra-domain path model 403 and the inter-domain path model 301 respectively, so as to respond to dynamic burst traffic in a timely manner.
  • Each edge router 101 can also collect the real historical traffic demand data of its domain over a past period of time (typically one month) and periodically send these data through the traffic collection interface to the intra-domain training subsystem 10a (for example, the traffic data of each flow on each router in the domain are decompressed and aggregated to form the traffic of every path in the domain in each cycle, which is saved to the file system as raw data).
  • the generation of training data is started to form the data set required for training.
  • For example, the real historical traffic demand data may be the traffic matrix of every hour over 10 days, thus forming 240 training samples (referred to as TMs), each of which contains the end-to-end traffic information of all edge routers in the domain (for example, the traffic between the first end and the second end in the traffic collection period is 1000 MB, the traffic between the first end and the third end in the traffic collection period is 2000 MB, and so on).
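  • A minimal sketch of how such TM samples might be assembled from per-flow records, assuming hypothetical field names (hour index, source router, destination router, megabytes); the numbers reuse the 1000 MB / 2000 MB example above:

```python
# Hypothetical aggregation of per-flow records into hourly traffic matrices (TMs).
from collections import defaultdict

def build_hourly_tms(flow_records: list, num_hours: int = 240) -> list:
    """flow_records: iterable of (hour_index, src_router, dst_router, megabytes).
    Returns one TM (dict keyed by router pair) per hour, 24 x 10 days = 240 in total."""
    tms = [defaultdict(float) for _ in range(num_hours)]
    for hour, src, dst, mb in flow_records:
        if 0 <= hour < num_hours:
            tms[hour][(src, dst)] += mb     # aggregate all flows of the same pair
    return tms

records = [(0, "R1", "R2", 400.0), (0, "R1", "R2", 600.0), (0, "R1", "R3", 2000.0)]
tms = build_hourly_tms(records)
print(tms[0][("R1", "R2")], tms[0][("R1", "R3")])   # 1000.0 2000.0
```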
  • After the collection module 101a of the edge router 101 collects the delay data, it can upload the data to the delay collection interface of the corresponding inter-domain reasoning subsystem 10b, which forwards them to the inter-domain training subsystem 1 for delay aggregation. After the delay data of each domain are aggregated, the aggregated data can be saved as the raw data set at a specific time.
  • the collection module 101a of each edge router 101 may also periodically collect real historical traffic demand data and aggregate it into a form of traffic demand matrix 203 . Then, the historical traffic demand data is replayed to simulate the historical operation of the network.
  • the raw data set at a specific moment may be analyzed, and based on the analysis result, inter-domain training data information may be sent to trigger reinforcement learning training of the inter-domain path model 301 .
  • The trigger condition for the reinforcement learning training of the inter-domain path model 301 is that the analysis result shows that the delay change in the collection cycle exceeds a threshold relative to the delay of the previous training cycle (the threshold can be configured according to training experience); in that case the training process is triggered, and the delay of this cycle is recorded as the current training cycle delay.
  • The trigger condition may also be a change in path asset data (for example, the addition or deletion of routing nodes); in that case, inter-domain training data information will likewise be sent.
  • Step S200B: receive the intra-domain path model 403 sent by the training system and adjust intra-domain paths according to the intra-domain path model 403 to generate an intra-domain path scheme 404, where the intra-domain path model 403 is generated by the training system through training on the intra-domain traffic data;
  • receive the inter-domain path scheme 303 sent by the training system, where the inter-domain path scheme 303 is generated by the inter-domain path model 301 and the inter-domain path model 301 is generated by the training system through training on the inter-domain delay data.
  • the intra-domain path model 403 can be trained interactively and iteratively in the simulated network formed in the previous step.
  • After training, the intra-domain training subsystem 10a sends the intra-domain path model to the inference module 101c of the edge router 101, which forms the intra-domain path scheme shown in the corresponding figure and selects a path.
  • The intra-domain training subsystem 10a may be deployed in a distributed manner.
  • The intra-domain training subsystem 10a may perform multi-GPU or multi-machine cooperative training based on a parameter server (for example, based on a microservice for GPU training supported by nvidia-docker).
  • each training and inference algorithm can be packaged into docker and provide a unified access method (for example, pv/pvs volume mapping access), so as to coordinate with the parameter server and nvidia-docker for scheduling to complete distributed training.
  • Distributed training can be performed in the following manner: first, multiple agents are generated based on the number of edge routers, each representing a corresponding edge router; then, for each TM, each agent iteratively applies multiple policies over multiple iterations to perform split operations on the traffic of that TM and inputs the actions into the network simulation environment; next, the network simulation environment calculates the traffic carried by each link for that TM according to the actions of all agents.
  • The models trained for each agent are delivered to the corresponding edge routers 101 by the training system.
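  • A hedged sketch of this multi-agent procedure: one agent per edge router, a simulated environment that accumulates per-link traffic from all agents' split actions, and a reward that penalizes the most loaded link. The random policy is a stub standing in for the reinforcement learning agent; all names and the toy topology are illustrative.

```python
# Hypothetical multi-agent training round over TMs; the policy is a placeholder for RL.
import random

def simulate_links(tm, actions, candidate_paths):
    """Accumulate the traffic each link carries, given every agent's split action."""
    link_load = {}
    for (src, dst), demand in tm.items():
        paths = candidate_paths[(src, dst)]
        ratios = actions[src][(src, dst)]            # agent 'src' splits its own traffic
        for path, ratio in zip(paths, ratios):
            for link in zip(path, path[1:]):
                link_load[link] = link_load.get(link, 0.0) + demand * ratio
    return link_load

def random_policy(tm, agent, candidate_paths):
    """Stub policy producing random split ratios; an RL agent would learn these."""
    actions = {}
    for pair, paths in candidate_paths.items():
        if pair[0] == agent:
            raw = [random.random() + 1e-6 for _ in paths]
            actions[pair] = [r / sum(raw) for r in raw]
    return actions

def training_round(tms, agents, candidate_paths):
    """One round over all TMs; the reward penalizes the most loaded link."""
    worst = 0.0
    for tm in tms:
        all_actions = {a: random_policy(tm, a, candidate_paths) for a in agents}
        load = simulate_links(tm, all_actions, candidate_paths)
        worst = max(worst, max(load.values(), default=0.0))
    return -worst

# Toy example: two agents, one TM, fixed candidate paths per source/destination pair.
paths = {("R1", "R2"): [["R1", "R2"], ["R1", "R3", "R2"]],
         ("R2", "R1"): [["R2", "R1"]]}
print(training_round([{("R1", "R2"): 1000.0, ("R2", "R1"): 500.0}], ["R1", "R2"], paths))
```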
  • the intra-domain controller can generate simulated network elements and interface configurations according to the topology relationship when simulating and reproducing the historical operating conditions of the network.
  • the training data information in the domain can be sent in response to changes in the topological relationship of the domain.
  • the intra-domain training subsystem 10a may use the historical traffic demand matrix to drive the training system to perform multiple rounds of training.
  • the in-domain training subsystem 10a may perform processing such as algorithm mining, algorithm exploration, and big data analysis on the execution of the historical traffic demand matrix.
  • the intra-domain path model 403 obtained through training may be parameters of machine learning intelligent models representing edge routers that have obtained good evaluations in each traffic demand matrix under a given topology environment.
  • In step S300B, the edge router 101 determines a routing path based on the received inter-domain path scheme and the generated intra-domain path scheme. In some embodiments of the present application, after each edge router 101 obtains its own model, it performs inference and decision-making tasks according to its local network traffic to generate a traffic split ratio and drives packet routing according to that split ratio.
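  • A minimal sketch of this per-router inference step, assuming the on-router model is reduced to a scoring function whose outputs are normalized into a split ratio by a softmax; the path lists and scores are illustrative:

```python
# Hypothetical per-router inference: model scores -> split ratio -> route choice.
import math

def infer_split_ratio(path_scores: list) -> list:
    """Softmax over model scores; a higher score gets a larger share of the traffic."""
    exps = [math.exp(s) for s in path_scores]
    total = sum(exps)
    return [e / total for e in exps]

def choose_route(inter_domain_path: list, intra_domain_paths: list, split_ratio: list) -> dict:
    """Combine the received inter-domain scheme with the locally generated
    intra-domain scheme (candidate paths plus split ratio)."""
    best = max(range(len(intra_domain_paths)), key=lambda i: split_ratio[i])
    return {"domains": inter_domain_path,
            "intra_domain_path": intra_domain_paths[best],
            "split_ratio": split_ratio}

ratio = infer_split_ratio([1.2, 0.3])        # scores from the on-router model
print(choose_route(["D1", "D2"], [["a", "b", "d"], ["a", "c", "d"]], ratio))
```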
  • the edge router 101 periodically collects training information, uploads it to the intra-domain training subsystem 10 a of the domain where it is located, and forwards it to the inter-domain training subsystem 1 .
  • The trained intra-domain path model 403 and inter-domain path model 301 are sent to the corresponding edge router 101 and the inter-domain inference subsystem 10b respectively, and the routing path adjustment scheme is determined by distributed inference, thereby solving the problem of untimely inference under dynamic traffic changes, ensuring load balance of network links, and responding to dynamic burst traffic in real time.
  • The application scenario of the intelligent routing training system described in the embodiments of this application is intended to illustrate the technical solution of the embodiments more clearly and does not constitute a limitation to the technical solution provided by the embodiments of the application.
  • the intelligent routing training system can be applied to networks such as IP networks, SDH networks, PTN networks, IPRAN networks, and OTN networks.
  • The sub-method for inter-domain path training shown in FIG. 8 may include, but is not limited to, the following steps.
  • Step S202: the delay data collected by the collection module 101a of each edge router 101 are aggregated into a raw data set at a specific time, so that the inter-domain training subsystem 1 performs reinforcement learning training based on this raw data set to generate an inter-domain path model 301 for each edge router 101.
  • Step S204: after the collection module 101a of the edge router 101 collects the delay data, it can upload the data to the delay collection interface of the corresponding inter-domain reasoning subsystem 10b, which forwards them to the inter-domain training subsystem 1 for delay aggregation.
  • the aggregated data can be saved as the original data set at a specific time.
  • the raw data set at a specific moment may be analyzed, and based on the analysis result, inter-domain training data information may be sent to trigger reinforcement learning training of the inter-domain path model 301 .
  • The trigger condition for the reinforcement learning training of the inter-domain path model 301 is that the analysis result shows that the delay change in the collection cycle exceeds a threshold relative to the delay of the previous training cycle (the threshold can be configured according to training experience); in that case the training process is triggered, and the delay of this cycle is recorded as the current training cycle delay.
  • the trigger condition may also be a change in path asset data (for example, addition or deletion of routing nodes). Otherwise, the training process ends.
  • The trained model is delivered by the inter-domain training subsystem 1 to the inter-domain reasoning subsystem 10b of each domain.
  • The inter-domain reasoning subsystem 10b of each domain saves the model and triggers the inference process.
  • The inter-domain path scheme 303 formed based on the inter-domain path model 301 and the inference policy is delivered to all edge routers 101 in the domain where the inter-domain inference subsystem 10b is located.
  • The sub-method for intra-domain path training shown in FIG. 9 may include, but is not limited to, the following steps.
  • Step S212: the collection module 101a of each edge router 101 can collect the real historical traffic demand data of its domain over a past period of time (usually one month), and periodically send these data to the intra-domain training subsystem 10a through the traffic collection interface.
  • Step S214: the traffic data of each flow on each router in the domain are decompressed and aggregated to form the traffic of every path in the domain in each period, which is saved to the file system as raw data.
  • the collection module 101a of each edge router 101 regularly collects real historical traffic demand data and aggregates it into a form of traffic demand matrix 203 .
  • The data collected by all edge routers 101 in the same domain can be used for training and inference with reinforcement learning algorithms.
  • The data collected by the collection module 101a of the edge router 101 can be aggregated along the time dimension to form a traffic demand matrix, so that only data with representative characteristics are selected; this alleviates both the slowness of machine learning training and the burden of collecting a large amount of data.
  • the aggregation of collected data may adopt a principal component analysis method to select typical characteristic data.
  • the intra-domain training data information may be sent based on the elapse of a preset training cycle time by the timer of the intra-domain training subsystem.
  • the trigger condition may also be an intra-domain configuration data change.
  • Step S216: the historical traffic demand data are replayed to simulate and reproduce the historical operating conditions of the network.
  • the intra-domain path model 403 can be trained interactively and iteratively in this simulated network.
  • After training, the intra-domain training subsystem 10a sends the intra-domain path model to the inference module 101c of the edge router 101, which forms the intra-domain path scheme shown in the corresponding figure and selects a path.
  • The intra-domain training subsystem 10a may be deployed in a distributed manner.
  • The intra-domain training subsystem 10a may perform multi-GPU or multi-machine cooperative training based on a parameter server (for example, based on a microservice for GPU training supported by nvidia-docker).
  • each training and inference algorithm can be packaged into docker, and it can be scheduled with parameter server and nvidia-docker to complete distributed training.
  • the router device and the routing training system jointly build an intelligent routing system.
  • the intelligent routing function is only deployed on the edge router 101 .
  • the routing training system adopts a group management and control method, so that when there are a large number of router nodes across domains, they can be divided into multiple router groups.
  • Each group has an in-domain training subsystem 10a to perform machine learning on the data plane in the group, so as to avoid processing overload caused by only setting up a single machine learning center.
  • the intra-domain controller can generate simulated network elements and interface configurations according to the topology relationship when simulating and reproducing the historical operating conditions of the network.
  • the training data information in the domain can be sent in response to changes in the topological relationship of the domain.
  • the intra-domain training subsystem 10a may use the historical traffic demand matrix to drive the training system to perform multiple rounds of training.
  • the in-domain training subsystem 10a may perform processing such as algorithm mining, algorithm exploration, and big data analysis on the execution of the historical traffic demand matrix.
  • the intra-domain path model 403 obtained through training may be parameters of machine learning intelligent models representing edge routers that have obtained good evaluations in each traffic demand matrix under a given topology environment.
  • the computing device of this embodiment includes: a processing unit 1000 , a memory 1010 , and a computer program stored in the memory and operable on the processor, such as a distributed routing determination program.
  • the processing unit 1000 executes the computer program, it implements the steps in the embodiments of the above-mentioned distributed route determination methods, for example, steps S100 to S300 shown in FIG. 7 .
  • the computer program may be divided into one or more modules/units, and the one or more modules/units are stored in the memory and executed by the processor to complete the present application.
  • the one or more modules/units may be a series of computer program instruction segments capable of accomplishing specific functions, and the instruction segments are used to describe the execution process of the computer program in the distributed route determination device.
  • the distributed routing determination device is shown in the form of a dedicated computer system.
  • Components of the distributed route determination device may include, but are not limited to, one or more processors or processing units 1000 , a system memory 1010 , and a bus 1015 coupling various system components including memory 1010 to processor 1000 .
  • Bus 1015 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures .
  • bus architectures include Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (Video Electronics Standards Association, VESA) local bus and peripheral component interconnect (Peripheral Component Interconnect, PCI) bus.
  • One or more processing units 1000 may execute computer programs stored in memory 1010 .
  • the routines of particular embodiments may be implemented using any suitable programming language, including C, C++, Java, assembly language, and the like. Different programming techniques can be used, such as procedural or object-oriented.
  • a routine can execute on a single computing device or on multiple computing devices. Furthermore, multiple processing units 1000 may be used.
  • Computing devices typically include various computer system readable media. Such media can be any available media that can be accessed by a computing device, and such media includes both volatile and nonvolatile media, and both removable and non-removable media.
  • System memory 1010 may include computer system readable media in the form of volatile memory, such as random access memory (RAM) 1020 and/or cache memory 1030 .
  • the computing device may also include other removable/non-removable, volatile/nonvolatile computer system storage media.
  • storage system 1040 may be provided for reading from and writing to non-removable, non-volatile magnetic media (not shown and commonly referred to as a "hard drive”).
  • A disk drive may be provided for reading from and writing to removable non-volatile magnetic disks (e.g., "floppy disks"), and an optical drive may be provided for reading from or writing to removable non-volatile optical discs or other optical media.
  • memory 1010 may include at least one program product having a set (eg, at least one) of program modules configured to perform the functions of various embodiments described in this disclosure.
  • a program/utility having a set (at least one) of program modules 1055 may be stored in memory 1010, along with an operating system, one or more software applications, other program modules, and program data.
  • Each of the operating system, one or more application programs, other program modules, and program data, or some combination thereof, may comprise an implementation of a networked environment.
  • The distributed route determination device may also communicate with one or more external devices 1070 (such as a keyboard, a pointing device, or a display), with one or more devices that enable a user to interact with the computing device, and/or with any device (e.g., a network card or modem) that enables the computing device to communicate with one or more other computing devices. Such communication may occur via one or more input/output interfaces 1060.
  • The distributed route determination device may communicate via the network adapter 1080 with one or more networks, such as a local area network (LAN), a general wide area network (WAN), and/or a public network (e.g., the Internet). As depicted, the network adapter 1080 communicates with the other components of the computing device via the bus 1015. It should be understood that, although not shown, other hardware components and/or software components may be used in conjunction with the distributed route determination device. Examples include, but are not limited to, microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data archival storage systems, among others.
  • An embodiment of the present application includes a distributed routing determination method, in which the training system receives training information sent by multiple edge routers, the training information including intra-domain traffic data of the domain where the edge router is located and inter-domain delay data corresponding to the edge router; trains an intra-domain path model according to the intra-domain traffic data and sends the intra-domain path model to the corresponding edge router, so that the edge router adjusts intra-domain paths according to the intra-domain path model to generate an intra-domain path scheme; trains an inter-domain path model according to the inter-domain delay data and generates an inter-domain path scheme according to the inter-domain path model; and sends the inter-domain path scheme to the corresponding edge router, so that the edge router determines a routing path according to the inter-domain path scheme and the intra-domain path scheme.
  • An embodiment of the present application also includes a distributed routing determination method, in which the router acquires training information and sends the training information to the training system, the training information including intra-domain traffic data of the domain where the edge router is located and inter-domain delay data corresponding to the edge router; receives the intra-domain path model sent by the training system and adjusts intra-domain paths according to the intra-domain path model to generate an intra-domain path scheme, the intra-domain path model being generated by the training system through training on the intra-domain traffic data; receives the inter-domain path scheme sent by the training system, the inter-domain path scheme being generated by an inter-domain path model and the inter-domain path model being generated by the training system through training on the inter-domain delay data; and determines a routing path according to the inter-domain path scheme and the intra-domain path scheme.
  • In these embodiments, the training of the routing path selection model is logically centralized offline training, and the trained models are distributed to the inter-domain reasoning subsystem of each domain or to each edge router, which perform distributed inference to determine the preferred routing path. Therefore, the technical solution provided by the embodiments of the present application exploits the fact that machine learning training is slow while inference with a trained model is fast: by separating the training and inference processes, the model can be continuously updated and learn from historical data, and respond to dynamic changes in network traffic in a timely manner.
  • the application may be a system, method and/or computer program product at any possible level of integration of technical details.
  • a computer program product may include a computer-readable storage medium (or multiple media) having computer-readable program instructions thereon for causing a processor to perform aspects of the present application.
  • a computer readable storage medium may be a tangible device that can retain and store instructions for use by an instruction execution device.
  • a computer readable storage medium may be, for example and without limitation, electronic storage, magnetic storage, optical storage, electromagnetic storage, semiconductor storage, or any suitable combination of the foregoing.
  • A non-exhaustive list of more specific examples of computer-readable storage media includes the following: portable computer diskettes, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), digital versatile discs (DVD), memory sticks, floppy disks, mechanically encoded devices such as punch cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing.
  • Computer-readable storage media should not be construed as transient signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., light pulses passing through fiber-optic cables), or electrical signals transmitted through a wire.
  • the computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a corresponding computing device/processing device, or to an external computer or external storage device via a network (such as the Internet, a local area network, a wide area network, and/or a wireless network).
  • the network may use copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers.
  • a network adapter card or network interface in each computing device/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing device/processing device.
  • the computer readable program instructions for carrying out the operations of the present application may be assembler instructions, instruction set architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state setting data, configuration data for integrated circuits , or source or object code written in any combination of one or more programming languages (including object-oriented programming languages such as Smalltalk, C++, etc.; and procedural programming languages such as the "C" programming language or similar programming languages ).
  • the computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
  • the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider) .
  • An electronic circuit including, for example, programmable logic circuitry, a field-programmable gate array (FPGA), or a programmable logic array (PLA) may be personalized by utilizing the state information of the computer-readable program instructions, and the electronic circuit may execute the computer-readable program instructions to carry out aspects of the present application.
  • These computer-readable program instructions can be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing device to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing device, create means for implementing the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
  • These computer-readable program instructions can also be stored in a computer-readable storage medium that can direct a computer, a programmable data processing device, and/or other apparatus to function in a specific manner, such that the computer-readable storage medium storing the instructions comprises an article of manufacture including instructions that implement aspects of the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
  • Computer-readable program instructions can also be loaded onto a computer, other programmable data processing equipment, or other devices, so that a series of operation steps can be executed on the computer, other programmable devices, or other devices, thereby producing a computer-implemented process, making the Instructions executed on computers, other programmable devices, or other devices implement the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
  • each block in a flowchart or block diagram may represent a module, section, or portion of instructions, which includes one or more executable instructions for implementing the specified logical function or functions.
  • the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.

Abstract

A distributed route determination method, an electronic device, and a storage medium. The method includes: receiving training information sent by a plurality of edge routers, the training information including intra-domain traffic data of the domain in which each edge router is located and inter-domain delay data corresponding to the edge router; training an intra-domain path model according to the intra-domain traffic data, and sending the intra-domain path model to the corresponding edge router, so that the edge router adjusts intra-domain paths according to the intra-domain path model to generate an intra-domain path scheme; training an inter-domain path model according to the inter-domain delay data, and generating an inter-domain path scheme according to the inter-domain path model; and sending the inter-domain path scheme to the corresponding edge router, so that the edge router determines a routing path according to the inter-domain path scheme and the intra-domain path scheme.

Description

Distributed route determination method, electronic device, and storage medium
CROSS-REFERENCE TO RELATED APPLICATIONS
This application is filed on the basis of, and claims priority to, Chinese patent application No. 202110738084.9 filed on June 30, 2021, the entire contents of which are incorporated herein by reference.
TECHNICAL FIELD
The present application relates to routing technology, and in particular to a distributed machine-learning route determination method, an electronic device, and a storage medium.
BACKGROUND
With the explosive growth of network traffic, the number of routing and switching devices at the bottom of the network has increased exponentially. The services carried by routing devices are correspondingly becoming more and more complex, so efficient routing control methods are needed to forward data and improve the quality and efficiency of network communication.
At present, routing control on the Internet mainly adopts topology-based static routing and planning-based traffic engineering. Because static routing does not take the traffic state into account, such schemes generally need to reserve sufficient network bandwidth to cope with bursty traffic, so the peak link utilization of current networks is very low. Planning-based traffic engineering needs to collect traffic state information of the entire network, and solving the planning problem is time-consuming, so such schemes can hardly respond in real time to highly dynamic network traffic.
In recent years, the application of machine learning to networks has attracted wide attention. To address problems such as the unpredictability of network states and the limited generality of existing models, machine-learning-based routing control makes full use of the strong modeling capability and fast inference of machine learning algorithms to generate routing paths quickly. In addition, an intelligent router on which a machine learning model is deployed can adjust routing rules in time according to the machine-learning-based routing model and the local network traffic state, thereby alleviating load imbalance on network links and slow reaction to dynamic bursty traffic.
A machine-learning-based routing algorithm needs to be trained continuously on historical network traffic data. However, existing machine learning training systems place both training and inference at a central node, which trains and infers on the data of the entire network, while the dynamic variation of network traffic requires the routing model to be updated in time so as to respond promptly to traffic changes. Specifically, the inference node first collects real-time data from every node and aggregates the data by time to form feature data; then the inference node runs the inference logic and delivers the inference results to each node; finally, each node sets packet routes according to the inference results. This process can hardly meet the real-time requirements of traffic changes.
SUMMARY
The following is a summary of the subject matter described in detail herein. This summary is not intended to limit the scope of protection of the claims.
Embodiments of the present application provide a distributed route determination method, an electronic device, and a storage medium, which separate the training of routing models from the routing inference process, so as to respond to network traffic changes in time.
In a first aspect, an embodiment of the present application provides a distributed route determination method, including the following steps: receiving training information sent by a plurality of edge routers, where the training information includes intra-domain traffic data of the domain in which each edge router is located and inter-domain delay data corresponding to the edge router; training an intra-domain path model according to the intra-domain traffic data, and sending the intra-domain path model to the corresponding edge router, so that the edge router adjusts intra-domain paths according to the intra-domain path model to generate an intra-domain path scheme; training an inter-domain path model according to the inter-domain delay data, and generating an inter-domain path scheme according to the inter-domain path model; and sending the inter-domain path scheme to the corresponding edge router, so that the edge router determines a routing path according to the inter-domain path scheme and the intra-domain path scheme.
In a second aspect, an embodiment of the present application further provides a distributed route determination method, including the following steps: acquiring training information and sending the training information to a training system, where the training information includes intra-domain traffic data of the domain in which the edge router is located and inter-domain delay data corresponding to the edge router; receiving an intra-domain path model sent by the training system, and adjusting intra-domain paths according to the intra-domain path model to generate an intra-domain path scheme, where the intra-domain path model is trained by the training system according to the intra-domain traffic data; receiving an inter-domain path scheme sent by the training system, where the inter-domain path scheme is generated by an inter-domain path model, and the inter-domain path model is trained by the training system according to the inter-domain delay data; and determining a routing path according to the inter-domain path scheme and the intra-domain path scheme.
In a third aspect, an embodiment of the present application further provides an electronic device, including a memory and a processor. The memory stores a computer program. When executing the computer program, the processor implements the foregoing route determination method.
An embodiment of the present application further provides a computer-readable storage medium storing computer-executable instructions for performing the distributed route determination method described above.
Other features and advantages of the present application will be set forth in the following description, and will partly become apparent from the description or be understood by implementing the present application. The objectives and other advantages of the present application can be realized and obtained by the structures particularly pointed out in the description, the claims, and the drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
The drawings are provided for a further understanding of the technical solutions of the present application and constitute a part of the description. Together with the embodiments of the present application, they serve to explain the technical solutions of the present application and do not limit them.
FIG. 1 is a network topology diagram of a distributed route determination system according to an embodiment of the present application;
FIG. 2 is a schematic diagram of data exchange during inter-domain path training according to an embodiment of the present application;
FIG. 3 is a schematic diagram of data exchange during intra-domain path training according to an embodiment of the present application;
FIG. 4 is a schematic diagram of data exchange during intra-domain path inference according to an embodiment of the present application;
FIG. 5 is a schematic diagram of data exchange during inter-domain path inference according to an embodiment of the present application;
FIG. 6 is a data exchange logic diagram of the distributed route determination system according to an embodiment of the present application;
FIG. 7A is a training-system-side flowchart of the distributed route determination method according to an embodiment of the present application;
FIG. 7B is a router-side flowchart of the distributed route determination method according to an embodiment of the present application;
FIG. 8 is a flowchart of an inter-domain path training sub-method according to an embodiment of the present application;
FIG. 9 is a flowchart of an intra-domain path training sub-method according to an embodiment of the present application;
FIG. 10 is a block diagram of a representative computing device that can be used to implement the various features and processes described herein.
DETAILED DESCRIPTION
The concept, specific structure, and technical effects of the present application are described clearly and completely below with reference to the embodiments and the drawings, so that the objectives, solutions, and effects of the present application can be fully understood. It should be noted that, in the absence of conflict, the embodiments of the present application and the features in the embodiments may be combined with each other. The same reference numerals used throughout the drawings indicate the same or similar parts.
It should be noted that, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from that in the flowcharts. The terms "first", "second", and the like in the description, the claims, and the drawings are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence.
The present application provides a distributed route determination system, method, and storage medium. Edge routers periodically collect training information and upload it to the intra-domain training subsystem of the domain in which they are located, from where it is forwarded to the inter-domain training subsystem; the intra-domain training subsystem and the inter-domain training subsystem train the intra-domain path model and the inter-domain path model respectively, so as to respond to dynamic bursty traffic in time. Meanwhile, the trained intra-domain path model and inter-domain path model are delivered to the corresponding edge routers and the inter-domain inference subsystem respectively, and routing path adjustment schemes are determined by distributed inference. Therefore, in view of the fact that machine learning training is slow whereas inference based on a trained model is fast, the technical solution of the present application separates model training from model-based inference, so that the models can be continuously updated and learned on the basis of historical data and respond in time to dynamic changes in network traffic.
The embodiments of the present application are further described below with reference to the drawings.
FIG. 1 is a network topology diagram of a distributed route determination system according to an embodiment of the present application. In the embodiment shown in FIG. 1, the system includes an inter-domain training subsystem 1, an intra-domain training subsystem 10a, an inter-domain inference subsystem 10b, and edge routers 101. In some embodiments, the inter-domain training subsystem 1 may be arranged in a federation controller (not shown) between multiple domains 2, while the intra-domain training subsystem 10a and the inter-domain inference subsystem 10b may be arranged in the intra-domain controller (not shown) of each domain 2. In other embodiments, one inter-domain training subsystem 1 and one intra-domain training subsystem 10a may each be deployed in every domain. The latter, federated deployment ensures the timeliness of the inter-domain path model 301 and the intra-domain path model 403 and reduces the transmission of intermediate data. In these embodiments, the federation controller and the intra-domain controllers may provide REST services as external interfaces, so that users can configure and manage the entire training system and browse training results through the web. Further, the federation controller and the intra-domain controllers may communicate on the basis of REST APIs and the gRPC protocol, while an intra-domain controller and the edge routers of its domain may communicate using the gRPC protocol. Alternatively or additionally, the edge routers may exchange data with the training subsystems through unified topology, traffic, delay, and model transceiving interfaces. These embodiments achieve logically centralized routing training and physically distributed inference, so that the intelligent routing system can respond in time to dynamic changes in network traffic.
In some embodiments, only a single inter-domain training subsystem 1 is deployed in a central equipment room for the entire network. The edge routers 101 are arranged at the edge of each domain as the traffic entrances of application ends and are connected to the intra-domain controller of their domain. Each edge router 101 is therefore an intelligent router on which a machine learning model is deployed, so that it can use machine learning methods to periodically perform routing-decision inference tasks online according to the local network traffic state and output traffic split ratios over the candidate paths between nodes, thereby ensuring routing efficiency. The overhead of adjusting the candidate paths themselves on a router is large, whereas the overhead of adjusting only the traffic split over fixed candidate paths is small. Therefore, in some embodiments, before the training process starts, the candidate paths may be initialized by algorithms such as K-Shortest-Paths (KSP) or oblivious routing. The routing module 101b of the edge router 101 can then adjust packet routing in time on the basis of the traffic split ratios. The other routers in a domain 2 are intermediate routers 102. In some embodiments, no machine learning model is deployed on the intermediate routers 102, so that they only need to execute segment-routing logic according to the path information in the packet header and do not perform any routing-decision inference task.
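By way of illustration only, the following minimal Python sketch shows how a fixed candidate path set could be initialized with KSP and how a traffic split ratio could later be applied over those paths. The topology, node names, and the use of the networkx library are illustrative choices and are not part of the disclosure.

```python
# Minimal sketch: initialize K candidate paths with KSP, then split traffic over them.
# networkx's shortest_simple_paths yields loop-free paths in order of increasing cost.
from itertools import islice
import networkx as nx

def k_shortest_paths(graph: nx.Graph, src: str, dst: str, k: int):
    """Return up to k loop-free shortest paths from src to dst (the fixed candidate set)."""
    return list(islice(nx.shortest_simple_paths(graph, src, dst, weight="delay"), k))

def split_demand(demand_mb: float, split_ratio: list[float]):
    """Divide a traffic demand over the fixed candidate paths according to the split ratio."""
    total = sum(split_ratio)
    return [demand_mb * r / total for r in split_ratio]

# Illustrative topology
g = nx.Graph()
g.add_weighted_edges_from(
    [("R1", "R2", 2), ("R2", "R4", 2), ("R1", "R3", 3), ("R3", "R4", 1), ("R2", "R3", 1)],
    weight="delay",
)
candidates = k_shortest_paths(g, "R1", "R4", k=3)
print(candidates)                              # candidate paths, kept fixed afterwards
print(split_demand(1000.0, [0.5, 0.3, 0.2]))   # the model later only adjusts this ratio
```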
With reference to FIG. 1 and FIG. 3, in some embodiments of the present application, in response to intra-domain training data information, the intra-domain training subsystem 10a performs reinforcement learning training offline on the basis of the delay data 201 collected by the edge routers of its domain (Reinforcement Learning, RL, i.e., machine learning by trial and error, in which rewards are obtained through interaction with the environment and learning is guided so as to maximize the obtained rewards) to generate the intra-domain path model 403, and delivers the intra-domain path model 403 to each edge router 101 in the domain after the reinforcement learning training is completed. The corresponding edge router 101 generates an intra-domain path scheme 404 by distributed inference on the basis of the intra-domain path model 403 so as to adjust intra-domain paths. With reference to FIG. 1 and FIG. 2, in some embodiments of the present application, in response to inter-domain training data information, the inter-domain training subsystem 1 trains offline on the basis of the delay data 401 forwarded by each domain to generate the inter-domain path model 301, and delivers the inter-domain path model 301 to the inter-domain inference subsystem 10b of each domain. After obtaining the inter-domain path model 301, the inter-domain inference subsystem 10b performs one-shot inference, generates an inter-domain path scheme 303 according to a policy, and sends these paths to the edge routers 101 in its domain for end-to-end path selection. In the embodiment shown in FIG. 1, the edge routers 101 periodically collect training information and upload it to the intra-domain training subsystem 10a of their domain, from where it is forwarded to the inter-domain training subsystem 1; the intra-domain training subsystem 10a and the inter-domain training subsystem 1 train the intra-domain path model 403 and the inter-domain path model 303 respectively, so as to respond to dynamic bursty traffic in time. Meanwhile, the trained intra-domain path model 403 and inter-domain path model 303 are delivered to the corresponding edge routers 101 and the inter-domain inference subsystem 10b respectively, and routing path adjustment schemes are determined by distributed inference. The application scenario of the intelligent routing training system described in the embodiments of the present application is intended to explain the technical solutions of the embodiments more clearly and does not limit the technical solutions provided by the embodiments. Those skilled in the art will appreciate that, as new application network scenarios of intelligent routing emerge, the technical solutions provided by the embodiments of the present application are equally applicable to similar technical problems. For example, as non-limiting examples, the intelligent routing training system may be applied to networks such as IP networks, SDH networks, PTN networks, IPRAN networks, and OTN networks.
Those skilled in the art can understand that the topology shown in FIG. 1 does not limit the embodiments of the present application, and that more or fewer components than shown, a combination of certain components, or a different arrangement of components may be included.
On the basis of the above network topology, various embodiments of the distributed route determination system of the present application are described below.
With reference to the schematic diagram of data exchange during inter-domain path training shown in FIG. 2, the schematic diagram of data exchange during inter-domain path inference shown in FIG. 5, and the data exchange logic diagram shown in FIG. 6, in some embodiments of the present application, after the collection module 101a of an edge router 101 collects delay data, the data may be uploaded to the delay collection interface of the corresponding inter-domain inference subsystem 10b and forwarded to the inter-domain training subsystem 1 for delay aggregation. After the delay data of each domain is aggregated, the aggregated data may be saved as a raw data set for a specific moment.
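For illustration, a possible shape of this aggregation step is sketched below. The field names and the in-memory layout are assumptions made for the example and are not specified by the disclosure.

```python
# Hypothetical sketch of the delay-aggregation step: delay reports forwarded by the
# inter-domain inference subsystems are grouped into one raw data set per collection moment.
from collections import defaultdict

def aggregate_delay_reports(reports):
    """reports: iterable of dicts such as
    {"moment": "2021-06-30T12:00", "domain": "D1", "edge_router": "R1",
     "peer": "D2:R7", "delay_ms": 13.4}
    Returns {moment: {(domain, edge_router, peer): delay_ms}},
    i.e. the "raw data set for a specific moment"."""
    datasets = defaultdict(dict)
    for r in reports:
        datasets[r["moment"]][(r["domain"], r["edge_router"], r["peer"])] = r["delay_ms"]
    return dict(datasets)
```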
In some embodiments of the present application, the raw data set for the specific moment may be analyzed, and inter-domain training data information may be sent on the basis of the analysis result to trigger the reinforcement learning training of the inter-domain path model 301. In some embodiments of the present application, the trigger condition of the reinforcement learning training of the inter-domain path model 301 is that the analysis result shows that the delay variation within the collection period exceeds a threshold relative to the delay of the previous training period (the threshold may be configured according to training experience); in this case the training process is triggered, and the delay of this period is recorded as the delay of the current training period. Alternatively or additionally, the trigger condition may be a change in path asset data (for example, the addition or deletion of a routing node). At that point, the inter-domain training data information is sent.
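A minimal sketch of this trigger check follows, assuming a relative threshold and a simple state record; the threshold value and state handling are illustrative assumptions only.

```python
# Hypothetical sketch of the training trigger: start training when the delay observed in the
# current collection period deviates from the delay recorded for the previous training period
# by more than a configurable threshold, or when path asset data (e.g. routing nodes) changed.
def should_trigger_training(current_delay_ms: float,
                            last_training_delay_ms: float,
                            asset_data_changed: bool,
                            threshold_ratio: float = 0.2) -> bool:
    if asset_data_changed:
        return True
    if last_training_delay_ms <= 0:
        return True  # no baseline recorded yet
    change = abs(current_delay_ms - last_training_delay_ms) / last_training_delay_ms
    return change > threshold_ratio

# Usage: if triggered, the delay of this period becomes the new "current training period" delay.
state = {"last_training_delay_ms": 10.0}
if should_trigger_training(13.0, state["last_training_delay_ms"], asset_data_changed=False):
    state["last_training_delay_ms"] = 13.0  # record this period's delay, then start training
```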
The trained model is delivered by the inter-domain training subsystem 1 to the inter-domain inference subsystem 10b of each domain. In some embodiments, after receiving the model, the inter-domain inference subsystem 10b of each domain saves the model and triggers the inference procedure. The inter-domain path scheme 303 formed on the basis of the inter-domain path model 301 and the inference policy is delivered to all edge routers 101 in the domain where the inter-domain inference subsystem 10b is located.
With reference to the schematic diagram of data exchange during intra-domain path training shown in FIG. 3, the schematic diagram of data exchange during intra-domain path inference shown in FIG. 4, and the data exchange logic diagram shown in FIG. 6, in some embodiments of the present application, the collection module 101a of each edge router 101 may collect the real historical traffic demand data of its domain over a past period of time (generally one month) and periodically send these data to the intra-domain training subsystem 10a through a traffic collection interface (for example, the traffic data of each flow on each router in the domain is decompressed and aggregated to form the whole-domain path traffic of each period, which is saved to the file system as raw data).
The real historical traffic demand data periodically collected by the collection module 101a of each edge router 101 is aggregated into the form of a traffic demand matrix 203. The historical traffic demand data is then replayed to simulate and reproduce the historical operation of the network. In some embodiments of the present application, the data collected by all edge routers 101 in the same domain may be used for training and inference with a reinforcement learning algorithm. In other embodiments of the present application, however, the data collected by the collection modules 101a of the edge routers 101 may be aggregated along the time dimension to form traffic demand matrices, so that only data with representative features is selected, in order to solve the problem that machine learning training is slow while the amount of collected data is large. For example, in some embodiments of the present application, principal component analysis may be used to aggregate the collected data and select typical feature data.
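The disclosure only states that principal component analysis may be used to select typical feature data; one possible reading, shown below purely as an illustration, is to project the flattened demand matrices onto the leading principal components and then keep a spread-out subset. The projection dimension, the farthest-point selection rule, and the sample sizes are assumptions.

```python
# Hypothetical sketch of selecting representative traffic demand matrices with PCA.
import numpy as np

def select_representative_tms(tms: np.ndarray, n_keep: int, n_components: int = 2) -> list[int]:
    """tms: array of shape (num_samples, num_pairs); each row is a flattened demand matrix."""
    centered = tms - tms.mean(axis=0)
    # principal axes from SVD; project the samples onto the top components
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    proj = centered @ vt[:n_components].T
    # farthest-point sampling in the reduced space keeps diverse, representative samples
    chosen = [int(np.argmax(np.linalg.norm(proj, axis=1)))]
    while len(chosen) < n_keep:
        dists = np.min([np.linalg.norm(proj - proj[c], axis=1) for c in chosen], axis=0)
        chosen.append(int(np.argmax(dists)))
    return chosen

# Example: keep 24 representative hourly matrices out of 240 collected ones
tms = np.abs(np.random.default_rng(0).normal(size=(240, 12)))
print(select_representative_tms(tms, n_keep=24))
```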
In some embodiments of the present application, the intra-domain training data information may be sent when a timer of the intra-domain training subsystem has run through a preset training period. Alternatively or additionally, the trigger condition may be a change in intra-domain configuration data.
Subsequently, the intra-domain path model 403 may be trained in this simulated network in an interactive and iterative manner. After the reinforcement learning training of the intra-domain path model 403 is completed, it is delivered by the intra-domain training subsystem 10a to the inference module 101c of the edge router 101 to form the intra-domain path scheme shown in FIG. 4, where the dashed connections between routers indicate intra-domain candidate paths. In some embodiments of the present application, the intra-domain training subsystem 10a may be deployed in a distributed manner. For example, as a non-limiting example, the intra-domain training subsystem 10a may perform multi-GPU or multi-machine cooperative training on the basis of a parameter server (for example, micro-services for GPU training supported by nvidia-docker). In this case, each training and inference algorithm may be packaged into a docker container and provided with a unified access method (for example, PV/PVC volume-mapped access), so that it can be scheduled together with the parameter server and nvidia-docker to complete the distributed training.
In some embodiments of the present application, when simulating and reproducing the historical operation of the network, the intra-domain controller may generate simulated network elements and interface configurations according to the topology relationship. The intra-domain training data information may be sent in response to a change in the topology relationship of the domain. In this case, the intra-domain training subsystem 10a may use the historical traffic demand matrices to drive the training system through multiple rounds of training. For example, as non-limiting examples, the intra-domain training subsystem 10a may perform processing such as algorithm mining, algorithm exploration, and big data analysis on the historical traffic demand matrices. Accordingly, the trained intra-domain path model 403 may be the parameters of the machine learning models of the representative edge routers that obtain good evaluations on the traffic demand matrices in the given topology environment.
Still with reference to the structure of the network topology shown in FIG. 1, various embodiments of the distributed route determination method of the present application are described below.
FIG. 7A is a training-system-side flowchart of the distributed route determination method provided by an embodiment of the present application. The method may include, but is not limited to, steps S100A to S300A.
Step S100A: receive training information sent by a plurality of edge routers 101. The training information includes intra-domain traffic data of the domain 2 in which each edge router 101 is located and inter-domain delay data corresponding to the edge router 101.
In this step, the edge routers 101 may periodically collect the training information and upload it to the intra-domain training subsystem 10a of their domain, from where it is forwarded to the inter-domain training subsystem 1, so that the intra-domain training subsystem 10a and the inter-domain training subsystem 1 train the intra-domain path model 403 and the inter-domain path model 303 respectively and respond to dynamic bursty traffic in time. In some embodiments of the present application, each edge router 101 may also collect the real historical traffic demand data of its domain over a past period of time (generally one month) and periodically send these data to the intra-domain training subsystem 10a through the traffic collection interface (for example, the traffic data of each flow on each router in the domain is decompressed and aggregated to form the whole-domain path traffic of each period, which is saved to the file system as raw data).
In some embodiments of the present application, after the collection module 101a of an edge router 101 collects delay data, the data may be uploaded to the delay collection interface of the corresponding inter-domain inference subsystem 10b and forwarded to the inter-domain training subsystem 1 for delay aggregation. After the delay data of each domain is aggregated, the aggregated data may be saved as a raw data set for a specific moment.
In some embodiments of the present application, the raw data set for the specific moment may be analyzed, and inter-domain training data information may be sent on the basis of the analysis result to trigger the reinforcement learning training of the inter-domain path model 301. In some embodiments of the present application, the trigger condition of the reinforcement learning training of the inter-domain path model 301 is that the analysis result shows that the delay variation within the collection period exceeds a threshold relative to the delay of the previous training period (the threshold may be configured according to training experience); in this case the training process is triggered, and the delay of this period is recorded as the delay of the current training period. Alternatively or additionally, the trigger condition may be a change in path asset data (for example, the addition or deletion of a routing node). At that point, the inter-domain training data information is sent.
Step S200A: train an intra-domain path model 403 according to the intra-domain traffic data, and send the intra-domain path model 403 to the corresponding edge router 101, so that the edge router 101 adjusts intra-domain paths according to the intra-domain path model 403 to generate an intra-domain path scheme 404; meanwhile, train an inter-domain path model 301 according to the inter-domain delay data, and generate an inter-domain path scheme 303 according to the inter-domain path model 301.
In this step, the training subsystems and the inference side cooperate with each other, so that the inter-domain path model 301 and the intra-domain path model 403 are each trained in a logically centralized manner, and the trained models are sent to the corresponding inter-domain inference subsystem 10b and edge routers 101 respectively.
Specifically, with reference to the network topology diagram of FIG. 1 and the data exchange schematic diagrams of FIG. 2 and FIG. 3, in some embodiments of the present application, in response to the intra-domain training data information, the intra-domain training subsystem 10a performs reinforcement learning training offline on the basis of the delay data 201 collected by the edge routers of its domain to generate the intra-domain path model 403, and delivers the intra-domain path model 403 to each edge router 101 in the domain after the reinforcement learning training is completed.
With reference to FIG. 1 and FIG. 2, in some embodiments of the present application, in response to the inter-domain training data information, the inter-domain training subsystem 1 trains offline on the basis of the delay data 401 forwarded by each domain to generate the inter-domain path model 301, and delivers the inter-domain path model 301 to the inter-domain inference subsystem 10b of each domain. After obtaining the inter-domain path model 301, the inter-domain inference subsystem 10b performs one-shot inference, generates the inter-domain path scheme 303 according to a policy, and sends these paths to the edge routers 101 in its domain for end-to-end path selection.
Step S300A: send the inter-domain path scheme 303 to the corresponding edge router 101, so that the edge router 101 determines a routing path according to the inter-domain path scheme 303 and the intra-domain path scheme 404.
With reference to FIG. 1 and FIG. 2, in some embodiments of the present application, in response to the inter-domain training data information, the inter-domain training subsystem 1 trains offline on the basis of the delay data 401 forwarded by each domain to generate the inter-domain path model 301, and delivers the inter-domain path model 301 to the inter-domain inference subsystem 10b of each domain. After obtaining the inter-domain path model 301, the inter-domain inference subsystem 10b performs one-shot inference, generates the inter-domain path scheme 303 according to a policy, and sends these paths to the edge routers 101 in its domain for end-to-end path selection.
In the embodiment shown in FIG. 7A, the intra-domain training subsystem 10a and the inter-domain training subsystem 1 train the intra-domain path model 403 and the inter-domain path model 303 respectively, so as to respond to dynamic bursty traffic in time. Meanwhile, the trained intra-domain path model 403 and inter-domain path model 303 are delivered to the corresponding edge routers 101 and the inter-domain inference subsystem 10b respectively, and routing path adjustment schemes are determined by distributed inference.
FIG. 7B is a router-side flowchart of the distributed route determination method provided by an embodiment of the present application. The method may include, but is not limited to, steps S100B to S300B.
Step S100B: acquire training information and send the training information to the training system. The training information includes intra-domain traffic data of the domain in which the edge router 101 is located and inter-domain delay data corresponding to the edge router.
In this step, the edge routers 101 may periodically collect the training information and upload it to the intra-domain training subsystem 10a of their domain, from where it is forwarded to the inter-domain training subsystem 1, so that the intra-domain training subsystem 10a and the inter-domain training subsystem 1 train the intra-domain path model 403 and the inter-domain path model 303 respectively and respond to dynamic bursty traffic in time. In some embodiments of the present application, each edge router 101 may also collect the real historical traffic demand data of its domain over a past period of time (generally one month) and periodically send these data to the intra-domain training subsystem 10a through the traffic collection interface (for example, the traffic data of each flow on each router in the domain is decompressed and aggregated to form the whole-domain path traffic of each period, which is saved to the file system as raw data). For example, after the training workflow management is triggered, training data generation is started to form the data sets required for training. For example, the real historical traffic demand data may be the hourly traffic matrices of 10 days, forming 240 training samples (referred to as TMs; each TM contains the end-to-end traffic information of all edge routers in the domain, for example, the traffic between a first end and a second end during the traffic collection period is 1000 MB, the traffic between the first end and a third end during the traffic collection period is 2000 MB, and so on).
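To make the TM construction concrete, the following sketch rolls hourly per-flow records up into one end-to-end demand matrix per hour, so that 10 days of hourly data yield 240 samples. The record fields and units (MB) are assumptions for the example.

```python
# Hypothetical sketch of assembling the TM training samples described above.
from collections import defaultdict

def build_tm_samples(flow_records, edge_routers):
    """flow_records: iterable of dicts such as
    {"hour": "2021-06-20T03", "src": "R1", "dst": "R2", "bytes_mb": 1000.0}
    Returns {hour: {(src, dst): traffic_mb}} with every edge-router pair present."""
    samples = defaultdict(lambda: {(s, d): 0.0
                                   for s in edge_routers for d in edge_routers if s != d})
    for rec in flow_records:
        samples[rec["hour"]][(rec["src"], rec["dst"])] += rec["bytes_mb"]
    return dict(samples)

records = [
    {"hour": "2021-06-20T03", "src": "R1", "dst": "R2", "bytes_mb": 1000.0},
    {"hour": "2021-06-20T03", "src": "R1", "dst": "R3", "bytes_mb": 2000.0},
]
tms = build_tm_samples(records, edge_routers=["R1", "R2", "R3"])
print(tms["2021-06-20T03"][("R1", "R3")])  # 2000.0
```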
With reference to the schematic diagram of data exchange during inter-domain path training shown in FIG. 2, the schematic diagram of data exchange during inter-domain path inference shown in FIG. 5, and the data exchange logic diagram shown in FIG. 6, in some embodiments of the present application, after the collection module 101a of an edge router 101 collects delay data, the data may be uploaded to the delay collection interface of the corresponding inter-domain inference subsystem 10b and forwarded to the inter-domain training subsystem 1 for delay aggregation. After the delay data of each domain is aggregated, the aggregated data may be saved as a raw data set for a specific moment. The collection module 101a of each edge router 101 may also periodically collect real historical traffic demand data, which is aggregated into the form of a traffic demand matrix 203. The historical traffic demand data is then replayed to simulate and reproduce the historical operation of the network.
In some embodiments of the present application, the raw data set for the specific moment may be analyzed, and inter-domain training data information may be sent on the basis of the analysis result to trigger the reinforcement learning training of the inter-domain path model 301. In some embodiments of the present application, the trigger condition of the reinforcement learning training of the inter-domain path model 301 is that the analysis result shows that the delay variation within the collection period exceeds a threshold relative to the delay of the previous training period (the threshold may be configured according to training experience); in this case the training process is triggered, and the delay of this period is recorded as the delay of the current training period. Alternatively or additionally, the trigger condition may be a change in path asset data (for example, the addition or deletion of a routing node). At that point, the inter-domain training data information is sent.
Step S200B: receive the intra-domain path model 403 sent by the training system, and adjust intra-domain paths according to the intra-domain path model 403 to generate an intra-domain path scheme 404, where the intra-domain path model 403 is trained by the training system according to the intra-domain traffic data; meanwhile, receive the inter-domain path scheme 303 sent by the training system, where the inter-domain path scheme 303 is generated by the inter-domain path model 301, and the inter-domain path model 301 is trained by the training system according to the inter-domain delay data.
In this step, the intra-domain path model 403 may be trained in an interactive and iterative manner in the simulated network formed in the previous step. After the reinforcement learning training of the intra-domain path model 403 is completed, it is delivered by the intra-domain training subsystem 10a to the inference module 101c of the edge router 101 to form the intra-domain path scheme shown in FIG. 4, where the dashed connections between routers indicate intra-domain candidate paths. In some embodiments of the present application, the intra-domain training subsystem 10a may be deployed in a distributed manner. For example, as a non-limiting example, the intra-domain training subsystem 10a may perform multi-GPU or multi-machine cooperative training on the basis of a parameter server (for example, micro-services for GPU training supported by nvidia-docker). In this case, each training and inference algorithm may be packaged into a docker container and provided with a unified access method (for example, PV/PVC volume-mapped access), so that it can be scheduled together with the parameter server and nvidia-docker to complete the distributed training. Specifically, in some embodiments, the distributed training may be performed as follows: first, a plurality of agents are generated according to the number of edge routers, each representing a corresponding edge router; then, for each TM, an agent may iteratively apply a plurality of policies over multiple iterations so as to split the traffic of the TM belonging to this agent, and input the actions into the network simulation environment; next, the network simulation environment calculates, according to the actions of all agents, the allocation of the TM onto each link, obtains the maximum utilization of each link, and returns the decrease of the global maximum link utilization to the agents as a reward, driving further iterations by the agents until the maximum reward or the configured maximum number of iteration rounds is reached; finally, the agents obtain their respective models through multiple groups of TMs and multiple rounds of training. The models trained by the agents are delivered by the training system to the corresponding edge routers 101.
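The interaction loop and the reward definition described above are sketched below in a greatly simplified form. One agent per edge router proposes traffic split ratios over its fixed candidate paths; the environment maps the TM onto links and returns the drop in the global maximum link utilization as the reward. The "policy update" here is only a random perturbation accepted on improvement, a stand-in for the actual reinforcement learning algorithm, which the disclosure does not specify; the topology, capacities, and demand values are illustrative.

```python
# Simplified sketch of the agent / simulation-environment training loop.
import random

def max_link_utilization(actions, candidate_paths, tm, capacity):
    """actions[agent][pair] = split ratios; map each demand onto links, return max utilization."""
    load = {}
    for agent, ratios in actions.items():
        for (src, dst), demand in tm[agent].items():       # tm[agent]: demands sourced at agent
            paths = candidate_paths[(src, dst)]
            for path, ratio in zip(paths, ratios[(src, dst)]):
                for link in zip(path, path[1:]):
                    load[link] = load.get(link, 0.0) + demand * ratio
    return max(load[l] / capacity[l] for l in load)

def train(agents, candidate_paths, tms, capacity, rounds=50):
    # one agent per edge router, initialized with an even split over its candidate paths
    actions = {a: {pair: [1.0 / len(ps) for _ in ps]
                   for pair, ps in candidate_paths.items() if pair[0] == a}
               for a in agents}
    for tm in tms:                                          # multiple groups of TMs
        best = max_link_utilization(actions, candidate_paths, tm, capacity)
        for _ in range(rounds):                             # bounded number of iterations
            agent = random.choice(agents)
            pair = random.choice(list(actions[agent]))
            old = actions[agent][pair][:]
            perturbed = [max(1e-6, r + random.uniform(-0.1, 0.1)) for r in old]
            total = sum(perturbed)
            actions[agent][pair] = [r / total for r in perturbed]
            new = max_link_utilization(actions, candidate_paths, tm, capacity)
            if best - new > 0:                              # reward: drop in global max utilization
                best = new                                  # keep the improving action
            else:
                actions[agent][pair] = old                  # revert
    return actions                                          # per-agent "model" (split ratios here)

# Illustrative single-agent domain: R1 sends to R3 over two candidate paths
agents = ["R1"]
candidate_paths = {("R1", "R3"): [["R1", "R2", "R3"], ["R1", "R3"]]}
tms = [{"R1": {("R1", "R3"): 80.0}}]
capacity = {("R1", "R2"): 100.0, ("R2", "R3"): 100.0, ("R1", "R3"): 100.0}
print(train(agents, candidate_paths, tms, capacity))
```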
In some embodiments of the present application, when simulating and reproducing the historical operation of the network, the intra-domain controller may generate simulated network elements and interface configurations according to the topology relationship. The intra-domain training data information may be sent in response to a change in the topology relationship of the domain. In this case, the intra-domain training subsystem 10a may use the historical traffic demand matrices to drive the training system through multiple rounds of training. For example, as non-limiting examples, the intra-domain training subsystem 10a may perform processing such as algorithm mining, algorithm exploration, and big data analysis on the historical traffic demand matrices. Accordingly, the trained intra-domain path model 403 may be the parameters of the machine learning models of the representative edge routers that obtain good evaluations on the traffic demand matrices in the given topology environment.
Step S300B: the edge router 101 determines a routing path on the basis of the received inter-domain path scheme and the generated intra-domain path scheme. In some embodiments of the present application, after obtaining its own model, each edge router 101 performs the inference decision task according to the local network traffic of that edge router 101 to generate traffic split ratios, and drives packet routing according to the traffic split ratios.
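As an illustration of driving packet routing by the split ratio, the sketch below hashes a flow onto one of the fixed candidate paths with probability proportional to the ratio produced by the inference module, so packets of the same flow stay on one path. The hashing scheme and names are assumptions, not part of the disclosure.

```python
# Hypothetical sketch: map a flow to a candidate path according to the traffic split ratio.
import hashlib

def pick_path(flow_id: str, candidate_paths: list, split_ratio: list) -> list:
    """Deterministically assign a flow to a candidate path in proportion to the split ratio."""
    h = int(hashlib.md5(flow_id.encode()).hexdigest(), 16) % 10_000
    point = (h / 10_000) * sum(split_ratio)
    acc = 0.0
    for path, share in zip(candidate_paths, split_ratio):
        acc += share
        if point < acc:
            return path
    return candidate_paths[-1]

paths = [["R1", "R2", "R4"], ["R1", "R3", "R4"]]
print(pick_path("10.0.0.5:443->10.0.1.9:80", paths, split_ratio=[0.7, 0.3]))
```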
In the embodiment shown in FIG. 7B, the edge routers 101 periodically collect training information and upload it to the intra-domain training subsystem 10a of their domain, from where it is forwarded to the inter-domain training subsystem 1. Meanwhile, the trained intra-domain path model 403 and inter-domain path model 303 are delivered to the corresponding edge routers 101 and the inter-domain inference subsystem 10b respectively, and routing path adjustment schemes are determined by distributed inference, thereby solving the problem that inference cannot keep up with dynamic traffic changes, ensuring load balance of the network links, and responding immediately to dynamic bursty traffic.
The application scenario of the intelligent routing training system described in the embodiments of the present application is intended to explain the technical solutions of the embodiments more clearly and does not limit the technical solutions provided by the embodiments. Those skilled in the art will appreciate that, as new application network scenarios of intelligent routing emerge, the technical solutions provided by the embodiments of the present application are equally applicable to similar technical problems. For example, as non-limiting examples, the intelligent routing training system may be applied to networks such as IP networks, SDH networks, PTN networks, IPRAN networks, and OTN networks.
With reference to the flowchart of the inter-domain path training sub-method shown in FIG. 8 and the data exchange logic diagram shown in FIG. 6, in some embodiments of the present application, the sub-method may include, but is not limited to, the following steps.
Step S202: the delay data collected by the collection module 101a of each edge router 101 is aggregated into a raw data set for a specific moment, so that the inter-domain training subsystem 1 performs reinforcement learning training on the basis of the raw data set for the specific moment to generate an inter-domain path model 301 for each edge router 101.
Step S204: after the collection module 101a of an edge router 101 collects delay data, the data may be uploaded to the delay collection interface of the corresponding inter-domain inference subsystem 10b and forwarded to the inter-domain training subsystem 1 for delay aggregation.
After the delay data of each domain is aggregated, the aggregated data may be saved as a raw data set for a specific moment.
In some embodiments of the present application, the raw data set for the specific moment may be analyzed, and inter-domain training data information may be sent on the basis of the analysis result to trigger the reinforcement learning training of the inter-domain path model 301.
In some embodiments of the present application, the trigger condition of the reinforcement learning training of the inter-domain path model 301 is that the analysis result shows that the delay variation within the collection period exceeds a threshold relative to the delay of the previous training period (the threshold may be configured according to training experience); in this case the training process is triggered, and the delay of this period is recorded as the delay of the current training period. Alternatively or additionally, the trigger condition may be a change in path asset data (for example, the addition or deletion of a routing node). Otherwise, the training procedure ends.
Step S206: the trained model is delivered by the inter-domain training subsystem 1 to the inter-domain inference subsystem 10b of each domain. In some embodiments, after receiving the model, the inter-domain inference subsystem 10b of each domain saves the model and triggers the inference procedure. The inter-domain path scheme 303 formed on the basis of the inter-domain path model 301 and the inference policy is delivered to all edge routers 101 in the domain where the inter-domain inference subsystem 10b is located.
With reference to the flowchart of the intra-domain path training sub-method shown in FIG. 9 and the data exchange logic diagram shown in FIG. 6, in some embodiments of the present application, the sub-method may include, but is not limited to, the following steps.
Step S212: the collection module 101a of each edge router 101 may collect the real historical traffic demand data of its domain over a past period of time (generally one month) and periodically send these data to the intra-domain training subsystem 10a through the traffic collection interface.
Step S214: the traffic data of each flow on each router in the domain is decompressed and aggregated to form the whole-domain path traffic of each period, which is saved to the file system as raw data.
The real historical traffic demand data periodically collected by the collection module 101a of each edge router 101 is aggregated into the form of a traffic demand matrix 203. In some embodiments of the present application, the data collected by all edge routers 101 in the same domain may be used for training and inference with a reinforcement learning algorithm. In other embodiments of the present application, however, the data collected by the collection modules 101a of the edge routers 101 may be aggregated along the time dimension to form traffic demand matrices, so that only data with representative features is selected, in order to solve the problem that machine learning training is slow while the amount of collected data is large. For example, in some embodiments of the present application, principal component analysis may be used to aggregate the collected data and select typical feature data.
In some embodiments of the present application, the intra-domain training data information may be sent when a timer of the intra-domain training subsystem has run through a preset training period. Alternatively or additionally, the trigger condition may be a change in intra-domain configuration data.
Step S216: the historical traffic demand data is replayed to simulate and reproduce the historical operation of the network. The intra-domain path model 403 may be trained in this simulated network in an interactive and iterative manner. After the reinforcement learning training of the intra-domain path model 403 is completed, it is delivered by the intra-domain training subsystem 10a to the inference module 101c of the edge router 101 to form the intra-domain path scheme shown in FIG. 4, where the dashed connections between routers indicate intra-domain candidate paths. In some embodiments of the present application, the intra-domain training subsystem 10a may be deployed in a distributed manner. For example, as a non-limiting example, the intra-domain training subsystem 10a may perform multi-GPU or multi-machine cooperative training on the basis of a parameter server (for example, micro-services for GPU training supported by nvidia-docker). In this case, each training and inference algorithm may be packaged into a docker container and scheduled together with the parameter server and nvidia-docker to complete the distributed training. In this way, the router devices and the routing training system jointly constitute an intelligent routing system. The intelligent routing function is deployed only on the edge routers 101. Accordingly, the routing training system adopts group-based management and control, so that when domains are crossed or there is a large number of router nodes, the routers can be divided into multiple router groups. For each group, one intra-domain training subsystem 10a performs machine learning on the data plane within the group, which avoids overloading a single machine learning center.
In some embodiments of the present application, when simulating and reproducing the historical operation of the network, the intra-domain controller may generate simulated network elements and interface configurations according to the topology relationship. The intra-domain training data information may be sent in response to a change in the topology relationship of the domain. In this case, the intra-domain training subsystem 10a may use the historical traffic demand matrices to drive the training system through multiple rounds of training. For example, as non-limiting examples, the intra-domain training subsystem 10a may perform processing such as algorithm mining, algorithm exploration, and big data analysis on the historical traffic demand matrices. Accordingly, the trained intra-domain path model 403 may be the parameters of the machine learning models of the representative edge routers that obtain good evaluations on the traffic demand matrices in the given topology environment.
FIG. 10 is a block diagram of a representative computing device that can be used to implement the various features and processes described herein. As shown in FIG. 10, the computing device of this embodiment includes a processing unit 1000, a memory 1010, and a computer program stored in the memory and executable on the processor, for example a distributed route determination program. When executing the computer program, the processing unit 1000 implements the steps in the above embodiments of the distributed route determination method, for example steps S100A to S300A shown in FIG. 7A or steps S100B to S300B shown in FIG. 7B.
Exemplarily, the computer program may be divided into one or more modules/units, and the one or more modules/units are stored in the memory and executed by the processor to accomplish the present application. The one or more modules/units may be a series of computer program instruction segments capable of completing specific functions, and the instruction segments are used to describe the execution process of the computer program in the distributed route determination device.
As shown in FIG. 10, the distributed route determination device is shown in the form of a special-purpose computer system. The components of the distributed route determination device may include, but are not limited to, one or more processors or processing units 1000, a system memory 1010, and a bus 1015 that couples the various system components including the memory 1010 to the processor 1000.
The bus 1015 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example and not limitation, such architectures include an Industry Standard Architecture (ISA) bus, a Micro Channel Architecture (MCA) bus, an Enhanced ISA (EISA) bus, a Video Electronics Standards Association (VESA) local bus, and a Peripheral Component Interconnect (PCI) bus.
The one or more processing units 1000 may execute the computer program stored in the memory 1010. Any suitable programming language may be used to implement the routines of particular embodiments, including C, C++, Java, assembly language, and the like. Different programming techniques, such as procedural or object-oriented techniques, may be employed. The routines may execute on a single computing device or on multiple computing devices. Furthermore, multiple processing units 1000 may be used.
The computing device typically includes a variety of computer-system-readable media. Such media may be any available media accessible by the computing device and include both volatile and non-volatile media as well as removable and non-removable media.
The system memory 1010 may include computer-system-readable media in the form of volatile memory, such as a random access memory (RAM) 1020 and/or a cache memory 1030. The computing device may also include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, a storage system 1040 may be provided for reading from and writing to non-removable, non-volatile magnetic media (not shown and typically called a "hard drive"). Although not shown, a magnetic disk drive for reading from and writing to a removable, non-volatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, non-volatile optical disk such as a CD-ROM, DVD-ROM, or other optical media may be provided. In such instances, each may be connected to the bus 1015 by one or more data media interfaces. As will be further depicted and described below, the memory 1010 may include at least one program product having a set (e.g., at least one) of program modules configured to carry out the functions of the embodiments described in this disclosure.
By way of example and not limitation, a program/utility having a set (at least one) of program modules 1055 may be stored in the memory 1010, as may an operating system, one or more application programs, other program modules, and program data. Each of the operating system, the one or more application programs, the other program modules, and the program data, or some combination thereof, may include an implementation of a networking environment.
As shown, the distributed route determination device may also communicate with one or more external devices 1070, such as a keyboard, a pointing device, a display, and the like; with one or more devices that enable a user to interact with the computing device; and/or with any device (e.g., a network card, a modem, etc.) that enables the computing device to communicate with one or more other computing devices. Such communication may occur via one or more input/output interfaces 1060.
In addition, as described above, the distributed route determination device may communicate with one or more networks, such as a local area network (LAN), a general wide area network (WAN), and/or a public network (e.g., the Internet), via a network adapter 1080. As depicted, the network adapter 1080 communicates with the other components of the computing device via the bus 1015. It should be understood that, although not shown, other hardware and/or software components may be used in conjunction with the distributed route determination device. Examples include, but are not limited to, microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data archival storage systems.
Embodiments of the present application include a distributed route determination method in which a training system receives training information sent by a plurality of edge routers, the training information including intra-domain traffic data of the domain in which each edge router is located and inter-domain delay data corresponding to the edge router; trains an intra-domain path model according to the intra-domain traffic data, and sends the intra-domain path model to the corresponding edge router, so that the edge router adjusts intra-domain paths according to the intra-domain path model to generate an intra-domain path scheme; trains an inter-domain path model according to the inter-domain delay data, and generates an inter-domain path scheme according to the inter-domain path model; and sends the inter-domain path scheme to the corresponding edge router, so that the edge router determines a routing path according to the inter-domain path scheme and the intra-domain path scheme. Embodiments of the present application further include a distributed route determination method in which a router acquires training information and sends the training information to the training system, the training information including intra-domain traffic data of the domain in which the edge router is located and inter-domain delay data corresponding to the edge router; receives an intra-domain path model sent by the training system and adjusts intra-domain paths according to the intra-domain path model to generate an intra-domain path scheme, the intra-domain path model being trained by the training system according to the intra-domain traffic data; receives an inter-domain path scheme sent by the training system, the inter-domain path scheme being generated by an inter-domain path model and the inter-domain path model being trained by the training system according to the inter-domain delay data; and determines a routing path according to the inter-domain path scheme and the intra-domain path scheme. According to the solutions provided by the embodiments of the present application, the routing path selection models are trained offline in a logically centralized manner, and the trained models are correspondingly distributed to the inter-domain inference subsystem of each domain or to each edge router, which perform distributed inference to determine preferred routing paths. Therefore, in view of the fact that machine learning training is slow whereas inference based on a trained model is fast, the technical solutions provided by the embodiments of the present application separate the training process from the inference process, so that the models can be continuously updated and learned on the basis of historical data and respond in time to dynamic changes in network traffic.
The present application may be a system, a method, and/or a computer program product at any possible level of technical detail of integration. The computer program product may include a computer-readable storage medium (or media) having computer-readable program instructions thereon for causing a processor to carry out aspects of the present application.
The computer-readable storage medium may be a tangible device that can retain and store instructions for use by an instruction execution device. The computer-readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer-readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as a punch card or a raised structure in a groove on which instructions have been recorded, and any suitable combination of the foregoing. As used herein, a computer-readable storage medium is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
The computer-readable program instructions described herein can be downloaded from a computer-readable storage medium to the respective computing/processing device, or downloaded to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network, and/or a wireless network. The network may use copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter card or network interface in each computing/processing device receives the computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium within the respective computing/processing device.
The computer-readable program instructions for carrying out the operations of the present application may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, configuration data for integrated circuits, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk, C++, and the like, and procedural programming languages such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry including, for example, programmable logic circuitry, a field-programmable gate array (FPGA), or a programmable logic array (PLA) may execute the computer-readable program instructions by utilizing state information of the computer-readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present application.
Aspects of the present application are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatuses (systems), and computer program products according to embodiments of the present application. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable storage medium having instructions stored therein comprises an article of manufacture including instructions that implement aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The computer-readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus, or other devices to produce a computer-implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other devices implement the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
It should be understood that, although this disclosure includes a detailed description relating to cloud computing, implementation of the teachings recited herein is not limited to a cloud computing environment. Rather, embodiments of the present application can be implemented in conjunction with any other type of computing environment now known or later developed.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present application. In this regard, each block in the flowcharts or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by special-purpose hardware-based systems that perform the specified functions or acts or carry out combinations of special-purpose hardware and computer instructions.
The descriptions of the various embodiments of the present application have been presented for purposes of illustration and are not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application, or the technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (12)

  1. A distributed route determination method, comprising:
    receiving training information sent by a plurality of edge routers, the training information comprising intra-domain traffic data of the domain in which each edge router is located and inter-domain delay data corresponding to the edge router;
    training an intra-domain path model according to the intra-domain traffic data, and sending the intra-domain path model to the corresponding edge router, so that the edge router adjusts intra-domain paths according to the intra-domain path model to generate an intra-domain path scheme;
    training an inter-domain path model according to the inter-domain delay data, and generating an inter-domain path scheme according to the inter-domain path model; and
    sending the inter-domain path scheme to the corresponding edge router, so that the edge router determines a routing path according to the inter-domain path scheme and the intra-domain path scheme.
  2. The determination method according to claim 1, wherein the training an inter-domain path model according to the inter-domain delay data comprises:
    aggregating the inter-domain delay data sent by each edge router into a raw data set for a specific moment; and
    performing reinforcement learning training on the basis of the raw data set for the specific moment to generate the inter-domain path model.
  3. The determination method according to claim 1, wherein the training an intra-domain path model according to the intra-domain traffic data comprises:
    forming a simulated network based on a historical operating state on the basis of real historical traffic demand data periodically collected by each edge router; and
    training the intra-domain path model in an offline, interactive, and iterative manner on the basis of the simulated network.
  4. The determination method according to claim 3, wherein the sending the intra-domain path model to the corresponding edge router, so that the edge router adjusts intra-domain paths according to the intra-domain path model to generate an intra-domain path scheme, comprises:
    sending the intra-domain path model to the corresponding edge router, so that the edge router determines the intra-domain path scheme by distributed inference through the intra-domain path model on the basis of a local network state.
  5. The determination method according to any one of claims 1 to 4, wherein the inter-domain path model is trained according to the inter-domain delay data in response to at least one of the following:
    a change between the delay data of two successive collection periods exceeding a preset threshold; and
    a change in intra-domain path asset data.
  6. The determination method according to any one of claims 1 to 4, wherein the intra-domain path model is trained according to the intra-domain traffic data in response to at least one of the following:
    a timer of an intra-domain training subsystem having run through a preset training period; and
    a change in intra-domain configuration data.
  7. A distributed route determination method, comprising:
    acquiring training information and sending the training information to a training system, the training information comprising intra-domain traffic data of the domain in which the edge router is located and inter-domain delay data corresponding to the edge router;
    receiving an intra-domain path model sent by the training system, and adjusting intra-domain paths according to the intra-domain path model to generate an intra-domain path scheme, the intra-domain path model being trained by the training system according to the intra-domain traffic data;
    receiving an inter-domain path scheme sent by the training system, the inter-domain path scheme being generated by an inter-domain path model, and the inter-domain path model being trained by the training system according to the inter-domain delay data; and
    determining a routing path according to the inter-domain path scheme and the intra-domain path scheme.
  8. The determination method according to claim 7, wherein the acquiring training information and sending the training information to a training system comprises:
    aggregating the acquired training information into a raw data set for a specific moment.
  9. The determination method according to claim 7, wherein the acquiring training information and sending the training information to a training system comprises:
    periodically collecting real historical traffic demand data to form a simulated network based on a historical operating state; and
    training the intra-domain path model in an offline, interactive, and iterative manner on the basis of the simulated network.
  10. The determination method according to claim 8, wherein the receiving an intra-domain path model sent by the training system and adjusting intra-domain paths according to the intra-domain path model to generate an intra-domain path scheme comprises:
    receiving the intra-domain path model sent by the training system; and
    determining the intra-domain path scheme by distributed inference on the basis of a local network state and the intra-domain path model of each edge router.
  11. An electronic device, comprising a memory and a processor, the memory storing a computer program, wherein, when executing the computer program, the processor implements the route determination method according to any one of claims 1 to 6, or implements the route determination method according to any one of claims 7 to 10.
  12. A computer-readable storage medium storing computer-executable instructions for performing the route determination method according to any one of claims 1 to 6, or for implementing the route determination method according to any one of claims 7 to 10.
PCT/CN2022/102398 2021-06-30 2022-06-29 分布式路由确定方法、电子设备及存储介质 WO2023274304A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110738084.9A CN115550233A (zh) 2021-06-30 2021-06-30 分布式路由确定方法、电子设备及存储介质
CN202110738084.9 2021-06-30

Publications (1)

Publication Number Publication Date
WO2023274304A1 true WO2023274304A1 (zh) 2023-01-05

Family

ID=84691460

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/102398 WO2023274304A1 (zh) 2021-06-30 2022-06-29 分布式路由确定方法、电子设备及存储介质

Country Status (2)

Country Link
CN (1) CN115550233A (zh)
WO (1) WO2023274304A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117592239B (zh) * 2024-01-17 2024-04-26 北京邮电大学 一种多目标优化光缆网络路由智能规划方法及系统

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200359159A1 (en) * 2015-09-02 2020-11-12 Estimote Polska Sp z o.o. System and method for low power data routing
CN106375214A (zh) * 2016-11-10 2017-02-01 北京邮电大学 一种基于sdn的层次化路由路径确定方法及装置
WO2019182590A1 (en) * 2018-03-21 2019-09-26 Visa International Service Association Automated machine learning systems and methods
CN110167054A (zh) * 2019-05-20 2019-08-23 天津理工大学 一种面向边缘计算节点能量优化的QoS约束路由方法
CN112583725A (zh) * 2019-09-27 2021-03-30 中国电信股份有限公司 Sdn网络的路由确定方法、系统、sdn网络系统
CN111445026A (zh) * 2020-03-16 2020-07-24 东南大学 面向边缘智能应用的深度神经网络多路径推理加速方法

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117240774A (zh) * 2023-11-15 2023-12-15 云南省地矿测绘院有限公司 一种跨域智能sdn路由方法
CN117240774B (zh) * 2023-11-15 2024-01-23 云南省地矿测绘院有限公司 一种跨域智能sdn路由方法

Also Published As

Publication number Publication date
CN115550233A (zh) 2022-12-30

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22832116

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE