CN111555907B - Data center network energy consumption and service quality optimization method based on reinforcement learning - Google Patents

Data center network energy consumption and service quality optimization method based on reinforcement learning

Info

Publication number
CN111555907B
CN111555907B (application CN202010308862.6A)
Authority
CN
China
Prior art keywords
data center
network
reinforcement learning
center network
link
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010308862.6A
Other languages
Chinese (zh)
Other versions
CN111555907A (en)
Inventor
郭泽华
孙鹏浩
窦松石
张云天
韩宁
夏元清
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Technology BIT
Original Assignee
Beijing Institute of Technology BIT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Institute of Technology BIT filed Critical Beijing Institute of Technology BIT
Priority to CN202010308862.6A
Publication of CN111555907A
Application granted
Publication of CN111555907B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/50 Network service management, e.g. ensuring proper service fulfilment according to agreements
    • H04L41/5003 Managing SLA; Interaction between SLA and QoS
    • H04L41/5019 Ensuring fulfilment of SLA
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26 Power supply means, e.g. regulation thereof
    • G06F1/32 Means for saving power
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/50 Network service management, e.g. ensuring proper service fulfilment according to agreements

Abstract

The invention discloses a reinforcement-learning-based method for optimizing data center network energy consumption and quality of service. An optimization model is built on a deep reinforcement learning framework, with the link utilization as the model's state, a completion-time-related metric computed from network performance as its reward, and the link margin ratio as its action. The traffic of the data center network is then adjusted according to the link margin ratios output by the model, so that the adjustment process simultaneously accounts for the temporal volatility and the spatial distribution of data flows, improving the energy efficiency of the data center network while guaranteeing FCT.

Description

Data center network energy consumption and service quality optimization method based on reinforcement learning
Technical Field
The invention belongs to the technical field of computer networks, and particularly relates to a data center network energy consumption and service quality optimization method based on reinforcement learning.
Background
The high power consumption of data centers has become a major problem for data center operators. Recent studies projected that the power consumption of U.S. data centers would reach 139 billion kilowatt-hours in 2020. Within a data center, the Data Center Network (DCN), consisting of switches and links, accounts for about 10% to 20% of the total energy consumption. Traffic in DCNs can generally be divided into two categories. Delay sensitive traffic primarily serves delay sensitive services (e.g., web searches); it is typically small (from a few KB to a few MB) and has an explicit data Flow Completion Time (FCT) limit. Delay tolerant traffic is typically large (hundreds of MB or more) and is mainly used for data synchronization or backup between servers, with no strict FCT requirement. When consolidating traffic in a DCN, delay sensitive traffic must be allocated carefully to ensure quality of service (QoS); since delay tolerant traffic makes up most of the total, finding a method that improves energy efficiency on this front is important.
Since traffic in DCNs exhibits volatility, some recent studies propose power-efficient DCNs that reduce power consumption through traffic consolidation using Software Defined Networking (SDN). In such DCNs, traffic is consolidated onto a minimal-power subset of switches and links, while the unused switches and links can enter sleep mode or be shut down to save power.
However, existing traffic scheduling schemes have two disadvantages. First, a coarse consolidation scheme may degrade QoS. Data centers run many delay sensitive applications, and FCT is a typical QoS metric in this context. In a power-efficient DCN, shutting down a portion of the network devices may increase the FCT of certain data flows because flows may congest the remaining active links; yet much prior work consolidates data flow transmission paths without regard to FCT requirements. Second, existing schemes do not fully account for the diversity of DCN traffic types and therefore adapt poorly. The distribution of different kinds of traffic in a DCN fluctuates over time and is uneven in space: the load on a given link differs at different times, and the loads on different links differ at the same time. Exploiting these two characteristics to compute an optimal consolidation scheme, however, entails high computational complexity. Most existing schemes provide only a few fixed patterns and do not generalize well to most situations.
Disclosure of Invention
In view of this, the invention provides a data center network energy consumption and service quality optimization method based on reinforcement learning, which can dynamically schedule traffic of a data center network according to the performance of the data center network, and reduce the power consumption of the data center network on the premise of ensuring the FCT requirement.
The invention provides a data center network energy consumption and service quality optimization method based on reinforcement learning, which comprises the following steps:
step 1, establishing a data center network energy consumption and service quality optimization model by adopting a deep reinforcement learning framework; the optimization model comprises a deep reinforcement learning agent and an environment, wherein the environment is a data center network to be optimized;
step 2, counting and calculating historical flow and network performance of each link in the data center network to be optimized, wherein the network performance comprises time for completing transmission of delay sensitive streams, the number of currently used links, the number of closed or dormant links and the number of streams violating the time for completing data streams; calculating the network performance to obtain the performance related to the completion time as the initial reward of the deep reinforcement learning agent; calculating the link utilization rate of each link in the data center network to be optimized according to the flow and the network performance, wherein the link utilization rate is used as an initial input state of the deep reinforcement learning agent;
step 3, inputting the initial input state and the initial reward into the deep reinforcement learning agent, and calculating by the deep reinforcement learning agent to obtain an action, wherein the action is used as a link margin ratio of each link in the data center network to be optimized; applying the link margin ratio to the environment, namely adjusting the flow path in the data center network to be optimized according to the link margin ratio, calculating the flow and the network performance under the current adjustment, and updating the deep reinforcement learning agent; iteratively executing the step 3 until the maximum iteration times are reached, and finishing the training of the deep reinforcement learning agent;
and 4, in the actual deployment process, calculating the flow and the network performance of each link in the data center network to be optimized to obtain the link utilization rate and the completion time related performance, inputting the obtained link utilization rate and the completion time related performance into a deep reinforcement learning agent obtained by training as a state and a reward respectively to obtain the link margin ratio of the output data center network to be optimized, adjusting the flow path in the data center network to be optimized according to the link margin ratio, and completing the optimization of the energy consumption and the service quality of the data center network to be optimized.
Further, the reward r_t is calculated using the following formula: r_t = D[U_t] - λC_t, where D[U_t] is the number of closed and dormant links in the data center network to be optimized at time t, C_t is the number of flows violating the data flow completion time in the data center network to be optimized at time t, and λ is the penalty weight.
Further, the process of calculating the historical traffic and the network performance of each link in the data center network to be optimized through statistics in the step 2 and the process of adjusting the path of the flow in the data center network to be optimized according to the link margin ratio in the step 4 are both realized through a Software Defined Network (SDN) controller.
Further, the adjustment of the flow paths in the data center network to be optimized according to the link margin ratio is realized by a best-fit decreasing bin-packing algorithm.
Further, the deep reinforcement learning framework is an AC algorithm framework.
Further, the deep reinforcement learning agent in step 2 and step 3 serves as the target network of the deep reinforcement learning agent in step 4, and the target network and the step-4 agent have the same network structure and network parameters.
Beneficial effects:
the invention constructs a data center network energy consumption and service quality optimization model based on a deep reinforcement learning framework, compares the link utilization rate, the completion time related performance obtained by network performance calculation and the link margin as the state, reward and action of the optimization model, and then adjusts the flow of the data center network according to the link margin comparison output by the optimization model. Test data show that the scheme provided by the invention can save more than 12.2% of power of a data center network compared with the existing scheme under the condition of ensuring FCT constraint.
The invention adopts the Actor-Critic framework, whose ability to generate continuous actions matches the continuous-valued nature of the link margin ratio, thereby improving the effectiveness of the optimization model.
The method separates the training process from the deployment process: the reinforcement learning agent obtained in training serves as the target network of the deep reinforcement learning agent used in deployment to compute the link margin ratios of the data center network to be optimized. Adopting a target network during training improves smoothness and thus the overall stability of the training process of the deep reinforcement learning agent.
Detailed Description
The present invention will be described in detail below with reference to examples.
The data center network energy consumption and service quality optimization method based on reinforcement learning provided by the invention follows this basic idea: a deep reinforcement learning (DRL) framework is used to establish an optimization model of data center network energy consumption and quality of service; a training sample set for the model is constructed from historical data on the traffic and network performance of each link in the data center network to be optimized, and the model is trained on this sample set. In actual deployment, the current traffic and network performance characteristics of the data center network to be optimized are input to the trained model to obtain the link margin ratios, which are then used to adjust the flow paths in the network, completing the optimization of energy consumption and quality of service.
The invention provides a data center network energy consumption and service quality optimization method based on reinforcement learning, which specifically comprises the following steps:
step 1, establishing a data center network energy consumption and service quality optimization model by adopting a DRL framework, wherein the optimization model comprises a deep reinforcement learning agent (DRL agent) and an environment, and the environment is a data center network to be optimized.
In the prior art, a DRL framework is designed on top of the classic Reinforcement Learning (RL) framework; it comprises a deep reinforcement learning agent and its working environment, and training is accomplished through their interaction. In the DRL framework, the action-generation policy is implemented by a deep neural network that maps a state s_t to an action a_t: at each interaction step t, the agent observes the state s_t of the network environment and generates an action a_t based on the current policy μ. The action changes the environment, from which the agent's reward r_t is computed (e.g., the energy saving rate in the network). The agent then updates its policy function based on this reward. The objective of the DRL agent is to maximize the discounted cumulative reward.
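For concreteness, the following Python sketch mirrors the interaction loop just described. Everything in it is an illustrative assumption: ToyDCNEnv and RandomAgent are stand-ins for the real environment (the SDN-controlled data center network) and the real DRL agent, and all numeric constants are arbitrary.

    import random

    class ToyDCNEnv:
        """Stand-in for the real environment: state is per-link utilization,
        action is a per-link margin ratio, reward follows r_t = D[U_t] - lambda*C_t."""

        def __init__(self, num_links: int, lam: float = 0.5):
            self.num_links = num_links
            self.lam = lam  # penalty weight lambda (assumed value)

        def observe(self):
            # The real system measures link utilization via the SDN controller.
            return [random.random() for _ in range(self.num_links)]

        def step(self, margin_ratios):
            # A real step re-routes flows and sleeps idle links; here the two
            # reward ingredients are faked purely for illustration.
            closed_or_dormant = sum(1 for m in margin_ratios if m > 0.5)
            fct_violations = random.randint(0, 1)
            return closed_or_dormant - self.lam * fct_violations

    class RandomAgent:
        """Placeholder agent; a real DRL agent maps state to action with a
        neural network policy mu and learns from (s, a, r, s')."""

        def act(self, state):
            return [random.uniform(0.05, 0.3) for _ in state]

        def update(self, s, a, r, s_next):
            pass  # a real DRL agent performs a gradient step here

    env, agent = ToyDCNEnv(num_links=8), RandomAgent()
    state = env.observe()
    for t in range(100):                     # max iteration count is a free parameter
        action = agent.act(state)            # link margin ratio per link
        reward = env.step(action)            # environment change yields r_t
        next_state = env.observe()
        agent.update(state, action, reward, next_state)
        state = next_state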
In the invention, the data center network to be optimized is the working environment in the DRL framework, and the training of the deep reinforcement learning agent is completed in the interaction process between the deep reinforcement learning agent established by the invention and the data center network to be optimized.
Step 2, counting and calculating historical data of flow and network performance of each link in the data center network to be optimized, wherein the network performance comprises time for completing transmission of delay sensitive streams, the number of currently used links, the number of closed or dormant links and the number of streams violating the time for completing data streams; calculating the network performance to obtain the performance related to the completion time as the initial reward of the deep reinforcement learning agent; and calculating the link utilization rate of each link in the data center network to be optimized according to the flow and the network performance, wherein the link utilization rate is used as an initial input state of the deep reinforcement learning agent.
The data center network energy consumption and service quality optimization method based on reinforcement learning can be realized by adopting a Software Defined Network (SDN), a data center network energy consumption and service quality deep reinforcement learning model is realized by a deep reinforcement learning agent (DRL agent) deployed on an SDN controller, and the DRL agent is communicated with the SDN controller through a northbound interface. The SDN controller is used for collecting the flow and network performance statistical data of the data center network to be optimized, and calculating the flow data and network performance of each link in the network, wherein the flow data and network performance include the time FCT for completing the transmission of the delay sensitive streams, the number of currently used links, the number of closed or dormant links and the number of streams violating the time for completing the data streams.
Rewards in deep reinforcement learning are used to evaluate the effectiveness of the algorithm; they reflect the user's overall goal, which for the present invention is to minimize power consumption while guaranteeing FCT. In the invention, the SDN controller computes a completion-time-related metric from the network performance as the initial reward r_t of the deep reinforcement learning agent. The reward r_t is calculated using the following formula:

r_t = D[U_t] - λC_t

where D[U_t] is the number of closed or dormant links in the data center network to be optimized at time t, C_t is the number of flows violating the data flow completion time, and λ is the penalty weight.
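As a worked example of this formula, the sketch below computes r_t from its three inputs; the λ value in the example is an assumed illustration, not a value fixed by the text.

    def reward(closed_or_dormant_links: int,
               fct_violations: int,
               penalty_weight: float) -> float:
        """r_t = D[U_t] - lambda * C_t.

        D[U_t]: links closed or dormant at time t.
        C_t:    flows violating their completion-time (FCT) requirement at time t.
        lambda: penalty weight trading power savings against FCT violations
                (its concrete value is a tuning choice, not fixed by the text).
        """
        return closed_or_dormant_links - penalty_weight * fct_violations

    # e.g. 8 links asleep and 1 FCT violation with lambda = 3.0 gives r_t = 5.0
    assert reward(8, 1, 3.0) == 5.0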
And the SDN controller calculates the link utilization rate of each link in the data center network to be optimized according to the flow data and the network performance obtained by statistics, and the link utilization rate is used as the state of the input DRL agent.
Step 3, inputting the link utilization rate into a deep reinforcement learning agent, and calculating by the deep reinforcement learning agent to obtain an action which is used as a link margin ratio of each link in the data center network to be optimized; the link margin ratio is used in the environment, namely, the flow path in the data center network to be optimized is adjusted according to the link margin ratio, the flow and the network performance under the current adjustment are calculated, and the deep reinforcement learning agent is updated; and (5) iteratively executing the step (3) until the maximum iteration times are reached, and finishing the training of the deep reinforcement learning agent.
The link utilization and the completion-time-related metric calculated in step 2 are sent to the DRL agent, which generates an action list from the input state and returns it to the SDN controller; each element of the list is the link margin ratio of one link in the data center network to be optimized. The SDN controller adjusts the flow paths according to the link margin ratios and then updates the flow tables in the corresponding switches. A specific adjustment is: when no traffic passes over a link or a device, it is put into sleep mode or turned off directly to save energy. Finally, the SDN controller recalculates the network performance of the adjusted data center network. Iterating this process up to the maximum number of iterations completes the training of the DRL agent.
Considering that unpredictable bursts of traffic can congest a link, the invention introduces the concept of a link margin ratio: part of each link's capacity is held in reserve to absorb bursts, thereby guaranteeing the transmission completion time (FCT) requirement of the network's delay sensitive flows.
And 4, in the actual deployment process, calculating the flow and the network performance of each link in the data center network to be optimized to obtain the link utilization rate and the completion time related performance, inputting the obtained link utilization rate and the completion time related performance into a deep reinforcement learning agent obtained by training as a state and a reward respectively to obtain the link margin ratio of the output data center network to be optimized, adjusting the flow path in the data center network to be optimized according to the link margin ratio, and completing the optimization of the energy consumption and the service quality of the data center network to be optimized.
In the actual deployment process, the trained DRL agent computes and outputs the link margin ratios of the data center network to be optimized from its link utilization and completion-time-related metric. The SDN controller then derives the flow-path adjustment policy from these ratios, i.e., it allocates flows to the links according to the link capacities left after reserving the margin. The allocation can use a best-fit decreasing bin-packing algorithm (sketched below): flows are consolidated, and idle links or devices are put into sleep mode or turned off directly to save energy, achieving the goal of merging traffic onto fewer links and using as few devices as possible.
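The text names the best-fit decreasing heuristic but does not list its steps; the sketch below shows the generic algorithm under assumed flow/link representations, with each link's usable capacity reduced by the margin ratio output by the DRL agent.

    def best_fit_decreasing(flows, links, margins):
        """Consolidate flows onto as few links as possible, reserving a margin.

        flows:   list of (flow_id, demand) pairs, e.g. demand in Mb/s
        links:   list of (link_id, capacity) pairs
        margins: {link_id: margin_ratio} as output by the DRL agent; the
                 reserved fraction absorbs traffic bursts
        Returns {link_id: [flow_id, ...]}; links left empty are candidates
        for sleep mode or shutdown.
        """
        residual = {lid: cap * (1.0 - margins[lid]) for lid, cap in links}
        placement = {lid: [] for lid, _ in links}

        # "Decreasing": place the largest flows first.
        for fid, demand in sorted(flows, key=lambda f: f[1], reverse=True):
            # "Best fit": pick the feasible link whose residual is tightest.
            feasible = [lid for lid, r in residual.items() if r >= demand]
            if not feasible:
                raise ValueError(f"flow {fid} fits on no link")
            best = min(feasible, key=lambda lid: residual[lid] - demand)
            residual[best] -= demand
            placement[best].append(fid)
        return placement

    flows = [("f1", 40), ("f2", 70), ("f3", 10)]
    links = [("l1", 100), ("l2", 100)]
    print(best_fit_decreasing(flows, links, {"l1": 0.2, "l2": 0.2}))
    # {'l1': ['f2', 'f3'], 'l2': ['f1']} -- both links stay active here;
    # with lighter load, one link would stay empty and could be slept.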
In the present invention, the action of the deep reinforcement learning agent is the link margin ratio. Since the link margin ratio takes continuous values, a deep reinforcement learning model that produces continuous actions should be preferred. The Actor-Critic framework (e.g., DDPG) is a typical method for generating continuous actions; it comprises two kinds of neural networks, the Actor network and the Critic network. In this invention, the Critic network evaluates the quality of the tentative adjustment policy, and the Actor network generates actions from the input state.
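The patent does not disclose concrete network definitions, so the following is a minimal DDPG-style sketch of the two networks; the layer sizes are assumptions, and the Sigmoid output keeps the action continuous in (0, 1) to match the link margin ratio.

    import torch
    import torch.nn as nn

    class Actor(nn.Module):
        """Maps the link-utilization state to continuous link margin ratios."""
        def __init__(self, num_links: int, hidden: int = 64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(num_links, hidden), nn.ReLU(),
                nn.Linear(hidden, num_links), nn.Sigmoid(),  # ratios in (0, 1)
            )

        def forward(self, state):
            return self.net(state)

    class Critic(nn.Module):
        """Scores a (state, action) pair: how good is this tentative adjustment."""
        def __init__(self, num_links: int, hidden: int = 64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(2 * num_links, hidden), nn.ReLU(),
                nn.Linear(hidden, 1),
            )

        def forward(self, state, action):
            return self.net(torch.cat([state, action], dim=-1))

    # actor = Actor(8); critic = Critic(8)
    # a = actor(torch.rand(1, 8)); q = critic(torch.rand(1, 8), a)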
Meanwhile, to further improve the stability of the training process, a target network can be adopted during training to improve smoothness: the agent obtained in training serves as the target network of the deep reinforcement learning agent used in deployment to compute the link margin ratios of the data center network to be optimized. The target network has the same neural network structure as the Actor and Critic networks of the original network.
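The text does not spell out how the target network tracks the trained network; a DDPG-style Polyak soft update, sketched below, is one conventional way to obtain the smoothing described here (both the rule and the τ value are assumptions, not taken from the patent).

    import torch

    @torch.no_grad()
    def soft_update(target, online, tau: float = 0.005):
        """Polyak-average the online Actor/Critic into its target copy.

        A small tau makes the target track the online network slowly, which
        is the smoothing effect referred to above; tau = 0.005 is a
        conventional DDPG choice, not a value from the patent.
        """
        for t_p, o_p in zip(target.parameters(), online.parameters()):
            t_p.mul_(1.0 - tau).add_(tau * o_p)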
To further refine the neural network portion of the DRL algorithm, the invention combines a gated recurrent unit (GRU) network with a feed-forward (FF) neural network. The GRU derives from the Recurrent Neural Network (RNN) and extracts temporal information from the input data. In our application scenario the GRU achieves performance similar to the popular LSTM while consuming fewer computational resources, so the GRU is selected. The state list s = [s_1, s_2, ..., s_L] enters the input layer of the GRU; after GRU processing, the output list h = [h_1, h_2, ..., h_L] is obtained and serves as the input to the FF network. The output of the FF network is the final output of the whole neural network.
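A minimal sketch of this GRU-plus-feed-forward arrangement follows, assuming PyTorch and illustrative layer sizes: the state list enters the GRU, the per-step outputs h_1 ... h_L feed the FF head, and the FF output is the network's final output.

    import torch
    import torch.nn as nn

    class GruFF(nn.Module):
        """GRU over the state list s = [s_1, ..., s_L], then a feed-forward head.

        Input  shape: (batch, L, 1) -- the L per-link states as a sequence
        Output shape: (batch, L)    -- one value per link (e.g. margin ratios)
        Hidden sizes are illustrative assumptions, not values from the patent.
        """
        def __init__(self, hidden: int = 64):
            super().__init__()
            self.gru = nn.GRU(input_size=1, hidden_size=hidden, batch_first=True)
            self.ff = nn.Sequential(
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 1),
            )

        def forward(self, s):
            h, _ = self.gru(s)             # h = [h_1, ..., h_L], per-step outputs
            out = self.ff(h).squeeze(-1)   # FF head consumes each h_l
            return torch.sigmoid(out)      # squash to (0, 1)

    # net = GruFF(); ratios = net(torch.rand(2, 10, 1))  # -> shape (2, 10)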
In practical deployment, the invention pays particular attention to the traffic statistics collection frequency. On one hand, a higher collection frequency gives the DRL algorithm more accurate network state information and allows traffic to be reallocated more promptly; on the other hand, it occupies more bandwidth on the control channel. The method is implemented as a SmartFCT application system on a Software Defined Network (SDN), and experiments were carried out with SmartFCT. Weighing the cost of statistics collection against its benefit, a collection frequency parameter σ is introduced, and simulation and evaluation determined its best value to be 100 ms, the point at which the pros and cons balance. Experimental simulation of the embodiment shows that SmartFCT, built on the proposed method, guarantees the FCT constraints of flows while saving more than 12.2% of power compared with existing schemes.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (6)

1. The data center network energy consumption and service quality optimization method based on reinforcement learning is characterized by comprising the following steps:
step 1, establishing a data center network energy consumption and service quality optimization model by adopting a deep reinforcement learning framework; the optimization model comprises a deep reinforcement learning agent and an environment, wherein the environment is a data center network to be optimized;
step 2, counting and calculating historical flow and network performance of each link in the data center network to be optimized, wherein the network performance comprises time for completing transmission of delay sensitive streams, the number of currently used links, the number of closed or dormant links and the number of streams violating the time for completing data streams; calculating the network performance to obtain the performance related to the completion time as the initial reward of the deep reinforcement learning agent; calculating the link utilization rate of each link in the data center network to be optimized according to the flow and the network performance, wherein the link utilization rate is used as an initial input state of the deep reinforcement learning agent;
step 3, inputting the initial input state and the initial reward into the deep reinforcement learning agent, and calculating by the deep reinforcement learning agent to obtain an action, wherein the action is used as a link margin ratio of each link in the data center network to be optimized; applying the link margin ratio to the environment, namely adjusting the flow path in the data center network to be optimized according to the link margin ratio, calculating the flow and the network performance under the current adjustment, and updating the deep reinforcement learning agent; iteratively executing the step 3 until the maximum iteration times are reached, and finishing the training of the deep reinforcement learning agent;
and 4, in the actual deployment process, calculating the flow and the network performance of each link in the data center network to be optimized to obtain the link utilization rate and the completion time related performance, inputting the obtained link utilization rate and the completion time related performance into a deep reinforcement learning agent obtained by training as a state and a reward respectively to obtain the link margin ratio of the output data center network to be optimized, adjusting the flow path in the data center network to be optimized according to the link margin ratio, and completing the optimization of the energy consumption and the service quality of the data center network to be optimized.
2. The method of claim 1, wherein the reward r_t is calculated using the following formula: r_t = D[U_t] - λC_t, where D[U_t] is the number of closed and dormant links in the data center network to be optimized at time t, C_t is the number of flows violating the data flow completion time in the data center network to be optimized at time t, and λ is the penalty weight.
3. The method according to claim 1, wherein the statistical calculation of the historical traffic and network performance of each link in the data center network to be optimized in step 2 and the adjustment of the path of the flow in the data center network to be optimized according to the link margin ratio in step 4 are both implemented by a Software Defined Network (SDN) controller.
4. The method of claim 3, wherein the adjustment of the flow paths in the data center network to be optimized according to the link margin ratio is implemented by a best-fit decreasing bin-packing algorithm.
5. The method of claim 1, wherein the deep reinforcement learning framework is an AC algorithm framework.
6. The method of claim 5, wherein the reinforcement learning agent in step 2 and step 3 is the target network of the reinforcement learning agent in step 4, and the target network and the step-4 reinforcement learning agent have the same network structure and network parameters.
CN202010308862.6A 2020-04-19 2020-04-19 Data center network energy consumption and service quality optimization method based on reinforcement learning Active CN111555907B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010308862.6A CN111555907B (en) 2020-04-19 2020-04-19 Data center network energy consumption and service quality optimization method based on reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010308862.6A CN111555907B (en) 2020-04-19 2020-04-19 Data center network energy consumption and service quality optimization method based on reinforcement learning

Publications (2)

Publication Number Publication Date
CN111555907A CN111555907A (en) 2020-08-18
CN111555907B (en) 2021-04-23

Family

ID=72002535

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010308862.6A Active CN111555907B (en) 2020-04-19 2020-04-19 Data center network energy consumption and service quality optimization method based on reinforcement learning

Country Status (1)

Country Link
CN (1) CN111555907B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11275429B2 (en) * 2020-06-29 2022-03-15 Dell Products L.P. Reducing power consumption of a data center utilizing reinforcement learning framework
CN112822109B (en) * 2020-12-31 2023-04-07 上海缔安科技股份有限公司 SDN core network QoS route optimization method based on reinforcement learning
CN112801303A (en) * 2021-02-07 2021-05-14 中兴通讯股份有限公司 Intelligent pipeline processing method and device, storage medium and electronic device
CN112953844B (en) * 2021-03-02 2023-04-28 中国农业银行股份有限公司 Network traffic optimization method and device
CN113783720B (en) * 2021-08-20 2023-06-27 华东师范大学 Network energy consumption two-stage control method based on parameterized action space
CN114745337B (en) * 2022-03-03 2023-11-28 武汉大学 Real-time congestion control method based on deep reinforcement learning
CN114710439A (en) * 2022-04-22 2022-07-05 南京南瑞信息通信科技有限公司 Network energy consumption and throughput joint optimization routing method based on deep reinforcement learning

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108880888A (en) * 2018-06-20 2018-11-23 浙江工商大学 A kind of SDN network method for predicting based on deep learning
CN109324875A (en) * 2018-09-27 2019-02-12 杭州电子科技大学 A kind of data center server power managed and optimization method based on intensified learning
CN109656702A (en) * 2018-12-20 2019-04-19 西安电子科技大学 A kind of across data center network method for scheduling task based on intensified learning
CN109818786A (en) * 2019-01-20 2019-05-28 北京工业大学 A kind of cloud data center applies the more optimal choosing methods in combination of resources path of appreciable distribution
CN109922004A (en) * 2019-04-24 2019-06-21 清华大学 The traffic engineering method and device of IPv6 network based on partial deployment Segment routing
CN110278149A (en) * 2019-06-20 2019-09-24 南京大学 Multi-path transmission control protocol data packet dispatching method based on deeply study
CN110662238A (en) * 2019-10-24 2020-01-07 南京大学 Reinforced learning scheduling method and device for burst request under edge network

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104661291A (en) * 2015-03-12 2015-05-27 厦门大学 Energy saving method for WiFi access device based on traffic filtering and Web cache prefetching
KR102355678B1 (en) * 2017-05-08 2022-01-26 삼성전자 주식회사 METHOD AND APPARATUS FOR CONFIGURATING QoS FLOW IN WIRELESS COMMUNICATION
CN109614215B (en) * 2019-01-25 2020-10-02 广州大学 Deep reinforcement learning-based stream scheduling method, device, equipment and medium

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108880888A (en) * 2018-06-20 2018-11-23 浙江工商大学 A kind of SDN network method for predicting based on deep learning
CN109324875A (en) * 2018-09-27 2019-02-12 杭州电子科技大学 A kind of data center server power managed and optimization method based on intensified learning
CN109656702A (en) * 2018-12-20 2019-04-19 西安电子科技大学 A kind of across data center network method for scheduling task based on intensified learning
CN109818786A (en) * 2019-01-20 2019-05-28 北京工业大学 A kind of cloud data center applies the more optimal choosing methods in combination of resources path of appreciable distribution
CN109922004A (en) * 2019-04-24 2019-06-21 清华大学 The traffic engineering method and device of IPv6 network based on partial deployment Segment routing
CN110278149A (en) * 2019-06-20 2019-09-24 南京大学 Multi-path transmission control protocol data packet dispatching method based on deeply study
CN110662238A (en) * 2019-10-24 2020-01-07 南京大学 Reinforced learning scheduling method and device for burst request under edge network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"QOS-Aware Flow Control for Power-Efficient Data Center Networks with Deep Reinforcement Learning";Penghao Sun,etc.;《2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)》;20200514;第3552-3556页 *

Also Published As

Publication number Publication date
CN111555907A (en) 2020-08-18

Similar Documents

Publication Publication Date Title
CN111555907B (en) Data center network energy consumption and service quality optimization method based on reinforcement learning
Ghadimi et al. A reinforcement learning approach to power control and rate adaptation in cellular networks
CN107819840A (en) Distributed mobile edge calculations discharging method in the super-intensive network architecture
CN106411770B (en) A kind of data center network energy-saving routing algorithm based on SDN framework
Zhang et al. Joint offloading and resource allocation in mobile edge computing systems: An actor-critic approach
CN111556572B (en) Spectrum resource and computing resource joint allocation method based on reinforcement learning
CN110351754A (en) Industry internet machinery equipment user data based on Q-learning calculates unloading decision-making technique
CN110798858A (en) Distributed task unloading method based on cost efficiency
CN111148131A (en) Wireless heterogeneous network terminal access control method based on energy consumption
CN114884895B (en) Intelligent flow scheduling method based on deep reinforcement learning
Gao et al. Reinforcement learning based cooperative coded caching under dynamic popularities in ultra-dense networks
CN109982434A (en) Wireless resource scheduling integrated intelligent control system and method, wireless communication system
Zuo et al. An intelligent routing algorithm for LEO satellites based on deep reinforcement learning
CN106604288B (en) Wireless sensor network interior joint adaptively covers distribution method and device on demand
Fan et al. Delay-aware resource allocation in fog-assisted IoT networks through reinforcement learning
CN108322274B (en) Greedy algorithm based energy-saving and interference optimization method for W L AN system AP
Wang et al. Task allocation mechanism of power internet of things based on cooperative edge computing
CN115066006A (en) Base station dormancy method, equipment and medium based on reinforcement learning
CN111191955B (en) Power CPS risk area prediction method based on dependent Markov chain
Sun et al. QoS-aware flow control for power-efficient data center networks with deep reinforcement learning
Qin et al. Traffic optimization in satellites communications: A multi-agent reinforcement learning approach
Zhao et al. Reinforcement learning for resource mapping in 5G network slicing
Li et al. Deep reinforcement learning-based resource allocation and seamless handover in multi-access edge computing based on SDN
Li et al. DQN-based computation-intensive graph task offloading for internet of vehicles
Peng et al. Real-time transmission optimization for edge computing in industrial cyber-physical systems

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant