CN114221691A - Software-defined air-space-ground integrated network route optimization method based on deep reinforcement learning - Google Patents

Info

Publication number: CN114221691A
Application number: CN202111558363.3A, filed 2021-12-17 by Nanjing Tech University
Authority: CN (China)
Original language: Chinese (zh)
Inventors: 孙永亮, 廖森山, 陈沁柔
Assignee (current and original): Nanjing Tech University
Legal status: Pending
Classifications

    • H04B7/18513 Transmission in a satellite or space-based system
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods
    • G06N7/01 Probabilistic graphical models, e.g. probabilistic networks
    • H04L45/12 Shortest path evaluation
    • H04L45/70 Routing based on monitoring results
    • H04L47/125 Avoiding congestion; Recovering from congestion by balancing the load, e.g. traffic engineering


Abstract

The invention discloses a software-defined air-space-ground integrated network routing optimization method based on deep reinforcement learning. A deep reinforcement learning model is established according to the network's characteristics; the collected state data serve as the model's input, and a link weight matrix of the network is output through training. During data forwarding, K paths are calculated with the K Shortest Paths (KSP) algorithm to form a candidate path set, a suitable path is selected for forwarding according to the controller's real-time monitoring of link states, and the convergence speed of the deep reinforcement learning model is improved by calculating the reward value of the current state, thereby optimizing software-defined air-space-ground integrated network routing. The invention not only adapts effectively to dynamically changing network topology, but also significantly improves average end-to-end delay and throughput compared with existing methods, raising the data transmission efficiency of the air-space-ground integrated network.

Description

Software-defined air-space-ground integrated network route optimization method based on deep reinforcement learning
Technical Field
The invention relates to the field of wireless communication, and in particular to a software-defined air-space-ground integrated network route optimization method based on deep reinforcement learning.
Background
With the development of communication technology and the growth of Internet service demand, users' requirements for both the range and the quality of network communication keep increasing. Traditional ground-based networks offer good communication quality but cannot cover areas with harsh environments, such as forests, mountains and oceans. Space-based networks use satellites as relay nodes to guarantee global signal coverage, but owing to the space environment they suffer from problems such as long delay and high error rates. As user demand continues to grow, the air-space-ground integrated network, which combines ground-based and space-based networks, has become one of the effective solutions. It features wide coverage, high communication speed and high reliability, and can satisfy the network communication needs of different fields. However, because of problems such as dynamically changing network topology and poor link quality, the air-space-ground integrated network needs an effective route optimization strategy to improve network performance.
Owing to the complex characteristics of the air-space-ground integrated network, such as dynamic topology changes, high error rates and large transmission delays, it is difficult for the network to build a stable end-to-end transmission path while guaranteeing quality of service. Because they cannot track topology changes in real time, traditional static-topology routing algorithms cannot adjust their routing strategies according to real-time changes in node and link states. Dynamic-topology routing algorithms, in turn, place high demands on network hardware and occupy a large amount of node resources, so they cannot fully adapt to the limited node resources of the air-space-ground integrated network. Realizing data forwarding while adapting to the dynamically changing air-space-ground integrated network topology has therefore become an urgent problem.
In recent years, deep reinforcement learning algorithms have been widely applied in various scenarios. Deep reinforcement learning combines deep learning with reinforcement learning, improving the perception of the environment while preserving decision-making capability, and can directly control the whole process from raw input to output. According to how actions are selected during optimization, deep reinforcement learning can be divided into value-function-based and policy-gradient-based methods. Thanks to the development of network communication technology and the emergence of new network architectures, deep reinforcement learning has become feasible, in both software and hardware, for realizing dynamic routing under the air-space-ground integrated network architecture. Applying a deep reinforcement learning algorithm in the network routing module therefore offers a new idea for optimizing air-space-ground integrated network routing.
Disclosure of Invention
Aiming at the above problems, the invention provides a software-defined air-space-ground integrated network route optimization method based on deep reinforcement learning, which satisfies dynamic network quality-of-service requirements, improves data transmission efficiency, and finally realizes the optimization of network routing.
Different from existing approaches, the invention improves on them as follows: (1) a deep reinforcement learning model is established according to the characteristics of the air-space-ground integrated network, so that the network environment is better perceived and the stability and reliability of network routing are improved; (2) the long short-term memory (LSTM) network's ability to capture the internal relations between states is used to extract temporal features between adjacent states, improving the time sensitivity of the deep reinforcement learning model; (3) a candidate path set is calculated with the KSP algorithm and a suitable path is selected according to the network state detected by the controller, avoiding the local congestion caused by frequently using a single path. Compared with the prior art, the invention realizes an adaptive routing strategy according to the network link state, and achieves global network load balancing while obtaining the shortest path.
Beneficial effects of the method: (1) a deep reinforcement learning model is established according to the characteristics of the air-space-ground integrated network, improving the routing algorithm's adaptability to dynamic topology; (2) average end-to-end delay and throughput are significantly improved, raising the data transmission efficiency of the air-space-ground integrated network, with high theoretical value and practical significance.
Drawings
FIG. 1 is a flow chart of the present invention.
Fig. 2 is a diagram showing a neural network structure in the present invention.
Fig. 3 is a diagram of the software-defined air-space-ground integrated network structure in the embodiment.
Fig. 4 is a simulation comparison graph of the normalized throughput of the route optimization algorithm of the present invention and other route optimization algorithms in the embodiment.
Fig. 5 is a simulation comparison graph of the average end-to-end delay of the route optimization algorithm of the present invention and other route optimization algorithms in the embodiment.
Detailed Description
The following describes a specific embodiment of the software-defined air-space-ground integrated network route optimization method based on deep reinforcement learning in detail, with reference to the flow of fig. 1.
As shown in fig. 1, the software-defined air-space-ground integrated network route optimization method based on deep reinforcement learning provided by the present invention includes:
Step 1: establish a software-defined air-space-ground integrated network topology according to the software-defined networking concept and the air-space-ground integrated network node parameters, and initialize a topology discovery module, a network perception module and a routing decision module.
Step 2: through the topology discovery module and the network perception module initialized in step 1, the controller monitors the network topology in the current state and the state data of each link in the current network, and stores the collected link state data as a state matrix L.
Step 3: model the network transmission process as a Markov decision process, establish a deep reinforcement learning model based on the twin delayed deep deterministic policy gradient (TD3) algorithm, input the state matrix L from step 2 into the deep reinforcement learning model, and output the link weight matrix W of the network topology through training.
The deep reinforcement learning model is built according to the characteristics of the air-space-ground integrated network; the specific building process is as follows:
First, the network transmission process is modeled as a Markov decision process, whose parameters to be designed include the state S, the action A and the reward value r.
a. The state matrix L collected by the controller serves as the state S of the Markov decision process, expressed as:

S = L = [(b_ij, d_ij)]_{N×N}    (1)

where b_ij and d_ij are the bandwidth and delay of link l_ij, respectively, and N is the total number of nodes.
b. The link weight matrix W output by the deep reinforcement learning model serves as the action A of the Markov decision process, expressed as:

A = W = [w_ij]_{N×N}    (2)

where w_ij is the weight of link l_ij output by the Actor network.
c. The bandwidth and delay of each link in the network are introduced to calculate the reward value r of the Markov decision process. The reward formula (equation (3), rendered as an image in the original) rewards high link bandwidth and low link delay, where α and β are adjustment factors determined according to the routing policy; a sketch of these MDP elements follows below.
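As an illustration, below is a minimal Python sketch of assembling the state matrix L and a bandwidth-versus-delay reward. The mean-based reward combination, and names such as build_state and reward, are assumptions made here for illustration; the patent's exact equation (3) is not reproduced.

```python
import numpy as np

def build_state(bandwidth: np.ndarray, delay: np.ndarray) -> np.ndarray:
    """Stack the per-link bandwidth b_ij and delay d_ij matrices into the
    state matrix L; bandwidth and delay are (N, N) arrays indexed by the
    node pair (i, j), with zeros where no link exists."""
    return np.stack([bandwidth, delay])          # shape (2, N, N)

def reward(bandwidth: np.ndarray, delay: np.ndarray,
           alpha: float = 1.0, beta: float = 1.0) -> float:
    """Reward value r: rewards high link bandwidth and penalizes high link
    delay, weighted by the adjustment factors alpha and beta.  This
    mean-based form is an assumed stand-in for the patent's equation (3)."""
    links = bandwidth > 0                        # consider existing links only
    return float(alpha * bandwidth[links].mean() - beta * delay[links].mean())
```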
Then, because the deep reinforcement learning model is based on the twin delayed deep deterministic policy gradient (TD3) algorithm, a suitably designed neural network structure is needed to update the value function. In the present embodiment, the neural network structure is shown in fig. 2.
The neural network module of the TD3 algorithm comprises an Actor module, a Critic_1 module and a Critic_2 module. Each module consists of an online network and a target network with identical neural network structures. The network structures of the Actor, Critic_1 and Critic_2 modules are as follows:
a. The Actor module comprises a Long Short-Term Memory (LSTM) layer and a Fully Connected (FC) layer; its input is the state matrix L and its output is the weight matrix W formed by the weights of the links in the network.
b. The Critic_1 and Critic_2 modules each comprise one LSTM layer followed by three fully connected layers. Their input has two parts, the state matrix L (the same input as the Actor network) and the weight matrix W output by the Actor network, and their output is the Q value of the corresponding action in the current state, as sketched below.
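To make this structure concrete, here is a minimal PyTorch sketch of the Actor and Critic modules (one LSTM layer feeding fully connected layers). The layer widths, the flattening of L and W into vectors per time step, and all names are illustrative assumptions rather than the patent's exact configuration.

```python
import torch
import torch.nn as nn

class Actor(nn.Module):
    """One LSTM layer + one FC layer: maps a sequence of state matrices L
    to the link weight matrix W, as in the Actor module described above."""
    def __init__(self, n_nodes: int, hidden: int = 128):
        super().__init__()
        self.n = n_nodes
        in_dim = 2 * n_nodes * n_nodes            # flattened (bandwidth, delay)
        self.lstm = nn.LSTM(in_dim, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, n_nodes * n_nodes)

    def forward(self, state_seq):                 # (batch, time, 2*N*N)
        out, _ = self.lstm(state_seq)
        w = torch.sigmoid(self.fc(out[:, -1]))    # last time step -> weights
        return w.view(-1, self.n, self.n)         # link weight matrix W

class Critic(nn.Module):
    """One LSTM layer + three FC layers: maps (L, W) to the Q value of the
    action in the current state; instantiated twice for Critic_1/Critic_2."""
    def __init__(self, n_nodes: int, hidden: int = 128):
        super().__init__()
        in_dim = 3 * n_nodes * n_nodes            # state (2 channels) + action
        self.lstm = nn.LSTM(in_dim, hidden, batch_first=True)
        self.fc = nn.Sequential(
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state_seq, action_seq):     # both flattened per step
        x = torch.cat([state_seq, action_seq], dim=-1)
        out, _ = self.lstm(x)
        return self.fc(out[:, -1])                # Q value
```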
Step 4: input the link weight matrix W output by the deep reinforcement learning model in step 3 into the routing decision module initialized in step 1, obtain the optimal path by executing the K Shortest Paths (KSP) algorithm, forward the data, and calculate the reward value r_t of the current state.
a. Calculate k available paths using the KSP algorithm to form the candidate path set P, expressed as:

P = {p_i | i = 1, 2, …, k}    (4)

where p_i is the i-th path in the candidate path set.
b. Sort the paths in the candidate path set by path weight from low to high, and select the path with the lowest weight as the optimal path for packet transmission.
c. During transmission, when the controller detects that the available bandwidth of some node on the path is smaller than the packet size, select a suboptimal path from the candidate path set for forwarding; a sketch of this selection logic follows the list.
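A minimal sketch of this selection logic, using networkx's shortest_simple_paths (Yen's algorithm) as the KSP step; the available_bw callback and the other names are illustrative assumptions, not the patent's reference implementation.

```python
import itertools
import networkx as nx

def select_path(g: nx.DiGraph, src, dst, k: int,
                packet_size: float, available_bw) -> list:
    """KSP-based path selection: take the k lowest-weight paths
    (shortest_simple_paths already yields them in increasing weight order)
    and return the first one whose nodes all have enough available bandwidth
    for the packet; available_bw is a callable node -> remaining bandwidth
    supplied by the controller's real-time monitoring."""
    candidates = list(itertools.islice(
        nx.shortest_simple_paths(g, src, dst, weight="weight"), k))
    for path in candidates:                       # optimal first, then suboptimal
        if all(available_bw(node) >= packet_size for node in path):
            return path
    return candidates[0]                          # fall back to the shortest path
```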
Step 5: take the state matrix L_t of the current time slot t obtained in step 2, the state matrix L_{t+1} of the next time slot, the link weight matrix W_t of the current time slot t output in step 3, and the reward value r_t of the current state calculated in step 4; input (L_t, W_t, r_t, L_{t+1}) into the experience replay pool of the deep reinforcement learning model, update the parameters with a random sampling strategy, and iterate the training with the updated parameters until the model converges.
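A compact sketch of the replay pool and one random-sampling critic update. The hyperparameters and the twin-minimum TD3 target are standard choices assumed for illustration; the delayed actor and target-network updates of full TD3, and the LSTM sequence handling from the earlier sketch, are elided, so actor and critics are treated here as black-box callables with matching shapes.

```python
import random
from collections import deque
import torch
import torch.nn.functional as F

replay_pool = deque(maxlen=100_000)               # experience replay pool

def store(L_t, W_t, r_t, L_next):
    """Deposit one transition (L_t, W_t, r_t, L_{t+1}) into the pool."""
    replay_pool.append((L_t, W_t,
                        torch.as_tensor(r_t, dtype=torch.float32), L_next))

def critic_update(actor_tgt, critic1, critic2, critic1_tgt, critic2_tgt,
                  optimizer, batch_size=64, gamma=0.99):
    """One random-sampling update of the twin critics: TD3 bootstraps on
    the minimum of the two target critics to curb Q overestimation."""
    batch = random.sample(replay_pool, batch_size)
    L, W, r, L2 = (torch.stack(x) for x in zip(*batch))
    with torch.no_grad():
        W2 = actor_tgt(L2)                        # target policy's next action
        q_next = torch.min(critic1_tgt(L2, W2), critic2_tgt(L2, W2))
        target = r.unsqueeze(-1) + gamma * q_next
    loss = F.mse_loss(critic1(L, W), target) + F.mse_loss(critic2(L, W), target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```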
The invention is analyzed below by way of an example. The air-space-ground integrated network topology consists of 3 geosynchronous orbit (GEO) satellites, 70 low Earth orbit (LEO) satellites and 16 ground stations. The GEO satellites have an orbital altitude of 36000 km, 1 orbital plane and an inclination of 0 degrees. The LEO satellites have an orbital altitude of 550 km, 7 orbital planes and an inclination of 53 degrees. Virtual switches are deployed on the LEO satellites and controllers on the GEO satellites to realize the software-defined air-space-ground integrated network, as shown in fig. 3.
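For reference, a small sketch of standing this topology up as a graph for simulation. The constellation sizes follow the embodiment (3 GEO, 70 LEO in 7 orbits of 10, 16 ground stations), while the link wiring pattern and all names are illustrative assumptions.

```python
import networkx as nx

def build_topology() -> nx.Graph:
    """Graph model of the embodiment's topology: 3 GEO satellites (hosting
    the controllers), 70 LEO satellites in 7 orbits of 10 (hosting virtual
    switches) and 16 ground stations.  The wiring below is an illustrative
    ring/grid pattern, not the patent's exact connectivity."""
    g = nx.Graph()
    geo = [f"geo{i}" for i in range(3)]
    leo = [[f"leo{o}_{s}" for s in range(10)] for o in range(7)]
    gs = [f"gs{i}" for i in range(16)]
    g.add_nodes_from(geo + [sat for orbit in leo for sat in orbit] + gs)
    for o in range(7):
        for s in range(10):
            g.add_edge(leo[o][s], leo[o][(s + 1) % 10])   # intra-orbit ISL
            g.add_edge(leo[o][s], leo[(o + 1) % 7][s])    # inter-orbit ISL
    for i, station in enumerate(gs):
        g.add_edge(station, leo[i % 7][i % 10])           # ground-satellite link
    return g
```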
A total of 150 data transmission simulation experiments were carried out; during the experiments, the bwm-ng and ping tools were used to monitor the normalized throughput and average end-to-end delay of the proposed method and the comparison methods, with the results shown in fig. 4 and fig. 5. Compared with a routing algorithm based on the deep deterministic policy gradient (DDPG) algorithm, the proposed algorithm achieves lower average end-to-end delay and higher normalized throughput. The proposed algorithm adaptively adjusts the routing strategy according to the network state, adapts well to dynamically changing air-space-ground integrated network scenarios, and has high theoretical value and practical significance.
The above description is only an embodiment of the present invention and is not intended to limit its scope; all equivalent structural or process transformations made using the contents of the specification and drawings, or applied directly or indirectly in related technical fields, are likewise included within the scope of protection of the present invention.

Claims (4)

1. A software-defined air-space-ground integrated network routing optimization method based on deep reinforcement learning, characterized by comprising the following steps:
Step one: according to the software-defined networking concept and the air-space-ground integrated network node parameters, build a software-defined air-space-ground integrated network topology, and initialize a topology discovery module, a network perception module and a routing decision module;
Step two: through the topology discovery module and the network perception module initialized in step one, the controller monitors the network topology in the current state and the state data of each link in the current network, and stores the collected link state data as a state matrix L;
Step three: model the network transmission process as a Markov decision process, establish a deep reinforcement learning model based on the twin delayed deep deterministic policy gradient (TD3) algorithm, input the state matrix L from step two into the deep reinforcement learning model, and output the link weight matrix W of the network topology through training;
Step four: input the link weight matrix W output by the deep reinforcement learning model established in step three into the routing decision module initialized in step one, obtain the optimal path by executing the K Shortest Paths (KSP) algorithm, forward the data, and calculate the reward value r_t of the current state;
Step five: take the state matrix L_t of the current time slot t obtained in step two, the state matrix L_{t+1} of the next time slot, the link weight matrix W_t of the current time slot t output in step three, and the reward value r_t of the current state calculated in step four; input (L_t, W_t, r_t, L_{t+1}) into the experience replay pool of the deep reinforcement learning model, update the parameters with a random sampling strategy, and iterate the training with the updated parameters until the model converges.
2. The software-defined air-space-ground integrated network routing optimization method based on deep reinforcement learning according to claim 1, wherein the process of modeling the network transmission process as a Markov decision process in step three is as follows:
a. the state matrix L collected by the controller serves as the state S of the Markov decision process, expressed as:

S = L = [(b_ij, d_ij)]_{N×N}    (1)

where b_ij and d_ij are the bandwidth and delay of link l_ij, respectively, and N is the total number of nodes;
b. the link weight matrix W output by the deep reinforcement learning model serves as the action A of the Markov decision process, expressed as:

A = W = [w_ij]_{N×N}    (2)

where w_ij is the weight of link l_ij output by the Actor network;
c. the bandwidth and delay of each link in the network are introduced to calculate the reward value r of the Markov decision process; the reward formula (equation (3), rendered as an image in the original) rewards high link bandwidth and low link delay, with α and β as adjustment factors determined according to the routing policy.
3. The software-defined air-space-ground integrated network routing optimization method based on deep reinforcement learning according to claim 1, wherein the deep reinforcement learning model in step three is established based on the twin delayed deep deterministic policy gradient (TD3) algorithm; the neural network module of the TD3 algorithm comprises an Actor module, a Critic_1 module and a Critic_2 module, each composed of an online network and a target network with identical neural network structures, and the network structures of the Actor, Critic_1 and Critic_2 modules are as follows:
a. the Actor module comprises a Long Short-Term Memory (LSTM) layer and a Fully Connected (FC) layer; its input is the state matrix L and its output is the weight matrix W formed by the weights of all links in the network;
b. the Critic_1 and Critic_2 modules each comprise one LSTM layer followed by three fully connected layers; their input has two parts, the state matrix L (the same input as the Actor network) and the weight matrix W output by the Actor network, and their output is the Q value of the corresponding action in the current state.
4. The software-defined air-space-ground integrated network routing optimization method based on deep reinforcement learning according to claim 1, wherein the KSP algorithm in step four calculates the optimal path in the topology as follows:
Step one: calculate k available paths using the KSP algorithm to form the candidate path set P, expressed as:

P = {p_i | i = 1, 2, …, k}    (4)

where p_i is the i-th path in the candidate path set;
Step two: sort the paths in the candidate path set by path weight from low to high, and select the path with the lowest weight as the optimal path for packet transmission;
Step three: during transmission, when the controller detects that the available bandwidth of some node on the path is smaller than the packet size, select a suboptimal path from the candidate path set for forwarding.
CN202111558363.3A 2021-12-17 2021-12-17 Software-defined air-space-ground integrated network route optimization method based on deep reinforcement learning Pending CN114221691A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111558363.3A CN114221691A (en) 2021-12-17 2021-12-17 Software-defined air-space-ground integrated network route optimization method based on deep reinforcement learning


Publications (1)

Publication Number Publication Date
CN114221691A true CN114221691A (en) 2022-03-22

Family

ID=80704169

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111558363.3A Pending CN114221691A (en) 2021-12-17 2021-12-17 Software-defined air-space-ground integrated network route optimization method based on deep reinforcement learning

Country Status (1)

Country Link
CN (1) CN114221691A (en)



Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111988225A (en) * 2020-08-19 2020-11-24 西安电子科技大学 Multi-path routing method based on reinforcement learning and transfer learning
CN113328938A (en) * 2021-05-25 2021-08-31 电子科技大学 Network autonomous intelligent management and control method based on deep reinforcement learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
KANG Chaohai: "TD3 algorithm based on dynamic delayed policy update", Journal of Jilin University, vol. 38, no. 4, pages 1 *
XIAO Yang: "A dynamic routing algorithm based on deep reinforcement learning", Information and Communications Technology and Policy, pages 1 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115022231A (en) * 2022-06-30 2022-09-06 武汉烽火技术服务有限公司 Optimal path planning method and system based on deep reinforcement learning
CN115150335A (en) * 2022-06-30 2022-10-04 武汉烽火技术服务有限公司 Optimal flow segmentation method and system based on deep reinforcement learning
CN115150335B (en) * 2022-06-30 2023-10-31 武汉烽火技术服务有限公司 Optimal flow segmentation method and system based on deep reinforcement learning
CN115022231B (en) * 2022-06-30 2023-11-03 武汉烽火技术服务有限公司 Optimal path planning method and system based on deep reinforcement learning
CN116366529A (en) * 2023-04-20 2023-06-30 哈尔滨工业大学 Adaptive routing method based on deep reinforcement learning in SDN (software defined network) background
CN117395188A (en) * 2023-12-07 2024-01-12 南京信息工程大学 Deep reinforcement learning-based heaven-earth integrated load balancing routing method
CN117395188B (en) * 2023-12-07 2024-03-12 南京信息工程大学 Deep reinforcement learning-based heaven-earth integrated load balancing routing method

Similar Documents

Publication Publication Date Title
CN110012516B (en) Low-orbit satellite routing strategy method based on deep reinforcement learning architecture
CN109714219B (en) Virtual network function rapid mapping method based on satellite network
CN114221691A (en) Software-defined air-space-ground integrated network route optimization method based on deep reinforcement learning
CN107294592B (en) Satellite network based on distributed SDN and construction method thereof
Rischke et al. QR-SDN: Towards reinforcement learning states, actions, and rewards for direct flow routing in software-defined networks
CN112821940B (en) Satellite network dynamic routing method based on inter-satellite link attribute
CN113572686A (en) Heaven and earth integrated self-adaptive dynamic QoS routing method based on SDN
CN113315569B (en) Satellite reliability routing method and system with weighted link survival time
CN109586785B (en) Low-orbit satellite network routing strategy based on K shortest path algorithm
CN108307435A (en) A kind of multitask route selection method based on SDSIN
CN112600609B (en) Network capacity estimation method of satellite network system
Wang et al. Fuzzy-CNN based multi-task routing for integrated satellite-terrestrial networks
Han et al. Time-varying topology model for dynamic routing in LEO satellite constellation networks
CN113258982A (en) Satellite information transmission method, device, equipment, medium and product
Ebrahim et al. A deep learning approach for task offloading in multi-UAV aided mobile edge computing
CN117579126A (en) Satellite mobile edge calculation unloading decision method based on deep reinforcement learning
Qiao et al. A service function chain deployment scheme of the software defined satellite network
Mao et al. Digital Twin Satellite Networks Towards 6G: Motivations, Challenges, and Future Perspectives
CN114513241B (en) SDN-based high-performance QoS guaranteed low-orbit satellite inter-satellite routing method
CN112020085B (en) Node failure sweep effect analysis method for aviation ad hoc network
Liu et al. A routing model based on multiple-user requirements and the optimal solution
Wu et al. QoS provisioning in space information networks: Applications, challenges, architectures, and solutions
Wei et al. Dynamic controller placement for software-defined LEO network using deep reinforcement learning
CN116938322B (en) Networking communication method, system and storage medium of space-based time-varying topology
Shi et al. Heterogeneous satellite network routing algorithm based on reinforcement learning and mobile agent

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination