CN109802964A - DQN-based HTTP adaptive flow control energy consumption optimization method - Google Patents
DQN-based HTTP adaptive flow control energy consumption optimization method
- Publication number
- CN109802964A CN109802964A CN201910060941.7A CN201910060941A CN109802964A CN 109802964 A CN109802964 A CN 109802964A CN 201910060941 A CN201910060941 A CN 201910060941A CN 109802964 A CN109802964 A CN 109802964A
- Authority
- CN
- China
- Prior art keywords
- state
- energy consumption
- value
- network
- dqn
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D30/00—Reducing energy consumption in communication networks
Landscapes
- Information Transfer Between Computers (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Abstract
A DQN-based HTTP adaptive flow control energy consumption optimization method. The method considers different network conditions, the load state of the buffer, and the remaining battery level of the client device, and simulates service behaviour in this environment. During the interaction between client and server, a DQN learning system switches the streaming media quality of the multimedia file and switches between high-frequency and low-frequency CPU cores, thereby achieving the goal of energy consumption optimization.
Description
Technical field
The invention belongs to the field of computer network communication technology, and in particular relates to a DQN-based HTTP adaptive flow control energy consumption optimization method.
Background art
In recent years, the multimedia field has developed rapidly, and the transmission of multimedia content has received increasing attention. Since the popularization of the Internet, HTTP video protocols have become a mainstream way of watching video online. HTTP transmission of multimedia files falls broadly into two stages. The first stage is progressive download, which lets the user start playback while the file is still downloading, instead of waiting for the entire file to finish; this is not true streaming, however, and is no different from an ordinary file download. The second stage is HTTP streaming, which divides a media file into small slices on the server side; on receiving a request, the server sends the requested slice of the media file in an HTTP response. During the interaction between server and client, the client adjusts the slice bitrate in real time according to the network state: a high bitrate is used when the network is in good condition, and a low bitrate is used when the network is congested, with automatic switching between them. This is mainly realized by the server annotating the bitrate in each manifest file it provides, so that the client player can adjust automatically according to the playback progress and the download speed, improving the user experience as much as possible while guaranteeing continuity and fluency of playback.
What we aim to do is to carry out a deeper optimization of the client device's energy consumption while guaranteeing all of the above. When a client plays online video, the network state, the cache state, and the remaining phone battery are aspects that are often ignored. HTTP adaptive streaming also suffers from low flexibility in bitrate selection and cannot cope well with complex network conditions; frequently switching the bitrate of the video stream not only gives the viewer an uncomfortable experience, it also ignores the energy overhead brought by switching. We therefore propose an energy consumption optimization model based on reinforcement learning and neural networks: deep Q-learning.
Q-learning is a classical method of reinforcement learning. The essential core idea of reinforcement learning is that an agent, through continuous interaction with the environment, takes suitable actions, receives reward values, and enters the next state. The core of Q-learning is the Q-table, whose rows and columns represent states and actions respectively; the Q value in the Q-table measures how good it is to take action a in state s. The neural network used here can be regarded as a black box: its input is a state value, and its output is the value of the actions in that state. The training data are generated while the whole system runs; these data are corrected during the calculation of returns, and the corrected values are then used as input for a second round of neural network training, which finally converges and selects the optimal policy.
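As an illustration of the Q-table update described above, the following is a minimal tabular Q-learning sketch. The state/action sizes and the learning parameters are assumed values for illustration, not taken from the patent:

```python
# A minimal tabular Q-learning update. N_STATES, N_ACTIONS, ALPHA and GAMMA
# are illustrative assumptions, not values from the patent.
N_STATES, N_ACTIONS = 4, 2
ALPHA, GAMMA = 0.1, 0.9  # learning rate and discount factor (assumed)

# Q-table: rows are states s, columns are actions a; Q[s][a] measures how
# good it is to take action a in state s.
Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]

def q_update(s, a, reward, s_next):
    """Move Q(s, a) toward the bootstrapped target r + gamma * max_a' Q(s', a')."""
    target = reward + GAMMA * max(Q[s_next])
    Q[s][a] += ALPHA * (target - Q[s][a])

q_update(0, 1, reward=1.0, s_next=2)
print(Q[0][1])  # -> 0.1
```

In DQN the Q-table is replaced by a neural network that maps a state to one Q value per action, which is what the black-box description above refers to.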
Summary of the invention
In order to overcome the above deficiencies of the prior art, the object of the present invention is to provide a DQN-based HTTP adaptive flow control energy consumption optimization method, which uses Q-learning combined with a BP (back propagation) neural network, i.e. reinforcement learning interacting with the environment. While the user watches video online, the environment changes constantly: the network varies and battery power is consumed. Under this changing environment, the system dynamically matches and switches the video quality in the video player and dynamically schedules the different CPU cores, obtaining the most suitable media quality level and the most suitable CPU core, and finally achieving the goal of reducing energy consumption.
To achieve the above goals, the technical solution adopted by the present invention is as follows:
A DQN-based HTTP adaptive flow control energy consumption optimization method, comprising the following steps:
1) Environment acquisition and modeling: Dummynet is used to simulate the networks used in daily life, and the client is run under 3G, 4G and WiFi network environments. The current environment information is collected: the client data cache state B, i.e. the fragment length currently in the buffer; the network state N; and the battery level E. These three states form the set S = (B, N, E). Time is divided into multiple time points, the states are put into one-to-one correspondence with them, and the data are saved;
2) Definition of the client action set and reward function: the environment data collected in step 1) serve as the state set, establishing the state space of Q-learning, and the action set of the model is established. The system selects a suitable action to enter the next state according to the network state, the buffer state and the battery level. The action set consists mainly of two kinds of actions: switching the video quality, and switching between high-frequency and low-frequency cores. For the switching of video fragment quality, the sum of the energy consumption level and the switching overhead is defined as the reward function. The reward function has the following two components. The first is the energy consumption level value: the energy consumption level, the different network levels, the different video qualities and the different CPU cores in use form a mapping relation, and the energy consumption level value here is selected from this mapping table. The second value is the overhead brought by video switching and big/little core switching; this value is a negative feedback. The reward function expression is therefore R = C1·R_energy + C2·R_switch, where C1 and C2 are the weights of the two return values, set according to what the user's preference emphasizes; both weight values may be 1;
3) Algorithm realization: a Deep Q-Learning algorithm is used, i.e. a Q-learning algorithm combined with a BP neural network, which chooses the best action through continuous interaction with the environment. The main function of the neural network is to convert a high-dimensional state into a low-dimensional output: the network takes the environment state s as input, converts it into a low-dimensional state value, and outputs the Q value corresponding to each action in the form of a vector. An ε-greedy algorithm is used: in each state, an action is selected at random with small probability ε, and the optimal action is selected by the BP neural network with probability 1 − ε. The randomly selected actions and the actions selected by the neural network are then added to the replay_buffer experience pool of our neural network for secondary training; an action is taken and the next state is reached. Neural network training optimizes the input state, and the output value follows the optimal solution strategy, outputting the optimal solution;
4) In the practical problem, the device obtains the environment state value through the system, and DQN selects the best-matching video quality and the most power-saving core that does not affect the user experience.
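The reward defined in step 2) can be sketched as follows. The energy-level mapping table, the sign convention and the concrete numbers are placeholder assumptions; the patent only fixes the form R = C1·R_energy + C2·R_switch and notes that both weights may be 1:

```python
# Placeholder mapping from (network level, video quality, CPU core) to an
# energy consumption level, standing in for the measured table in the patent.
ENERGY_LEVEL = {
    (6, "hd", "A7"): 2,
    (6, "hd", "A15"): 4,
    (4, "sd", "A7"): 3,
}
C1, C2 = 1.0, 1.0  # weights of the two return values; the patent allows both to be 1

def reward(network, quality, core, switch_cost):
    """R = C1 * R_energy + C2 * R_switch (signs are an assumed convention)."""
    r_energy = -ENERGY_LEVEL[(network, quality, core)]  # lower energy level -> higher reward
    r_switch = -switch_cost  # quality/core switching overhead is negative feedback
    return C1 * r_energy + C2 * r_switch

print(reward(6, "hd", "A7", switch_cost=0.5))  # -> -2.5
```

The negative switching term discourages frequent quality and core switches, matching the "negative feedback" role described in step 2).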
The environment information contains the network level in the defined state set S, which is divided into six grades from low to high; measurements show, however, that at levels 1 and 2, or under 3G, the test video cannot be loaded normally even at the lowest quality. The remaining phone battery value and the cached fragment length are also collected; a script is written here to read the cache information and select the buffer state, i.e. the fragment length, at each unit time point.
The system in the present invention interacts with the changing state of the environment to allocate to each segment a reasonable streaming media quality and a reasonable CPU core. Experimental results show that, without affecting the user experience, this optimization method can effectively reduce the energy consumed by mobile streaming media on the device; the energy consumption of the loading stage is reduced by 21 percent.
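The interaction loop of steps 1)-4), including the replay_buffer experience pool of step 3), might look as follows in outline. This is a pure-Python sketch: the Q estimator is a deterministic stub in place of the BP neural network, and the environment transition is a placeholder; both are assumptions for illustration:

```python
import random
from collections import deque

replay_buffer = deque(maxlen=1000)  # experience pool from step 3)

def q_estimate(state):
    """Stub for the BP network: one Q value per action (6 actions assumed)."""
    return [(state * a) % 7 for a in range(6)]

def env_step(state, action):
    """Placeholder environment: next state and reward for (state, action)."""
    return (state + 1) % 10, -float(action)

state, epsilon = 0, 0.8  # initial exploration threshold from the embodiment
for _ in range(200):
    q = q_estimate(state)
    if random.random() < epsilon:
        action = random.randrange(len(q))               # explore at random
    else:
        action = max(range(len(q)), key=q.__getitem__)  # exploit the estimate
    next_state, r = env_step(state, action)
    replay_buffer.append((state, action, r, next_state))  # store the transition
    state = next_state
    epsilon *= 0.99  # exploration shrinks as learning proceeds

print(len(replay_buffer))  # -> 200
```

In a full implementation, minibatches sampled from `replay_buffer` would be used to train the network, which is the secondary training described in step 3).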
Brief description of the drawings
Fig. 1 is a system flow chart of the present invention.
Fig. 2 is a DQN learning process diagram of the present invention.
Fig. 3 is an application scenario diagram of the present invention.
Specific embodiment
The present invention is further described below with reference to embodiments, but the present invention is not limited to the following embodiments.
A DQN-based HTTP adaptive flow control energy consumption optimization method. As shown in Figs. 1 and 3, HTTP adaptive streaming basically works by dividing the streaming media file into small segments, one by one, and performing HTTP requests and transmission on each of them. The client therefore first receives the slices of the streaming media file, while the system collects the network environment and the current battery situation and processes the data. The specific process is as follows:
Define the state set S. The network level is divided into six grades from low to high, but measurements show that at levels 1 and 2, or under 3G, the test video cannot be loaded normally even at the lowest quality, so the return value in those cases is calculated as 0. The remaining phone battery value and the cached fragment length are also collected; a script is written here to read the cache information and select the buffer state, i.e. the fragment length, at each unit time point.
Define the action set. The development board used here is an Odroid XU3, whose main cores are Cortex-A15 high-frequency cores and Cortex-A7 low-frequency cores. The actions here mainly adjust, according to changes in the environment, which core works and which core sleeps; the main actions are assigning the task to the A15 or assigning it to the A7. The streaming media quality is divided into lossless, high definition and standard definition, limited here to the video collection used for testing.
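The state set S = (B, N, E) and the action set described above might be encoded as follows. The three quality levels and the two cores follow the embodiment; the field types and the product encoding of actions are assumptions:

```python
from dataclasses import dataclass
from itertools import product

QUALITIES = ("lossless", "hd", "sd")       # quality levels of the test video set
CORES = ("A15", "A7")                      # Cortex-A15 big core, Cortex-A7 little core
ACTIONS = list(product(QUALITIES, CORES))  # an action picks a quality and a core

@dataclass(frozen=True)
class State:
    buffer_len: float  # B: fragment length currently in the buffer
    network: int       # N: network level, 1..6
    battery: float     # E: remaining battery

s = State(buffer_len=8.0, network=5, battery=0.73)
print(len(ACTIONS))  # -> 6
```

Encoding actions as (quality, core) pairs keeps the two switch types of step 2) in a single discrete action space, which is convenient for a DQN output layer.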
3) Construction of the reward function and model selection.
First, the neural network is initialized. The main function of the BP neural network here is to estimate the value of each action in each state and to reduce the dimension of the vector. Values are assigned to the learning rate α and the discount factor γ in the Q-value iteration formula, and to the exploration probability ε used in action selection. As shown in Fig. 2, after initialization is complete, the following process is performed in each iteration period: the state of the system is input, and the output is the value produced by the current action. We use this estimated output to replace the previous output, optimizing step by step to find the optimal solution. After obtaining the value of each action, we use the ε-greedy strategy to find the optimal solution. We initialize a threshold here, with an initial value of 0.8; that is, when selecting an action, eighty percent of the time an action is chosen at random, and twenty percent of the time the action returns are computed by the neural network and the most suitable one is chosen. With continuous learning, the initialized threshold becomes lower and lower, until actions are no longer selected at random.
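The ε-greedy selection with a decaying threshold, initialized to 0.8 as in the embodiment, can be sketched like so. The decay factor and the stand-in Q values are assumptions; the patent only says the threshold shrinks with learning until actions are no longer chosen at random:

```python
import random

def select_action(q_values, epsilon):
    """With probability epsilon pick a random action, else the greedy one."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

epsilon = 0.8   # eighty percent random at the start, per the embodiment
DECAY = 0.995   # assumed per-step decay factor
for _ in range(1000):
    q_values = [0.1, 0.5, 0.2]  # stand-in for the BP network's output
    select_action(q_values, epsilon)
    epsilon *= DECAY            # exploration probability decreases over time

print(epsilon < 0.01)  # -> True
```

A multiplicative decay like this never reaches exactly zero; an implementation following the embodiment could also clamp ε to 0 once it falls below some floor.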
Claims (2)
1. A DQN-based HTTP adaptive flow control energy consumption optimization method, characterized by comprising the following steps:
1) Environment acquisition and modeling: Dummynet is used to simulate the networks used in daily life, and the client is run under 3G, 4G and WiFi network environments. The current environment information is collected: the client data cache state B, i.e. the fragment length currently in the buffer; the network state N; and the battery level E. These three states form the set S = (B, N, E). Time is divided into multiple time points, the states are put into one-to-one correspondence with them, and the data are saved;
2) Definition of the client action set and reward function: the environment data collected in step 1) serve as the state set, establishing the state space of Q-learning, and the action set of the model is established. The system selects a suitable action to enter the next state according to the network state, the buffer state and the battery level. The action set consists mainly of two kinds of actions: switching the video quality, and switching between high-frequency and low-frequency cores. For the switching of video fragment quality, the sum of the energy consumption level and the switching overhead is defined as the reward function. The reward function has the following two components. The first is the energy consumption level value: the energy consumption level, the different network levels, the different video qualities and the different CPU cores in use form a mapping relation, and the energy consumption level value here is selected from this mapping table. The second value is the overhead brought by video switching and big/little core switching; this value is a negative feedback. The reward function expression is therefore R = C1·R_energy + C2·R_switch, where C1 and C2 are the weights of the two return values, set according to what the user's preference emphasizes; both weight values may be 1;
3) Algorithm realization: a Deep Q-Learning algorithm is used, i.e. a Q-learning algorithm combined with a BP neural network, which chooses the best action through continuous interaction with the environment. The main function of the neural network is to convert a high-dimensional state into a low-dimensional output: the network takes the environment state s as input, converts it into a low-dimensional state value, and outputs the Q value corresponding to each action in the form of a vector. An ε-greedy algorithm is used: in each state, an action is selected at random with small probability ε, and the optimal action is selected by the BP neural network with probability 1 − ε. The randomly selected actions and the actions selected by the neural network are then added to the replay_buffer experience pool of our neural network for secondary training; an action is taken and the next state is reached. Neural network training optimizes the input state, and the output value follows the optimal solution strategy, outputting the optimal solution;
4) In the practical problem, the device obtains the environment state value through the system, and DQN selects the best-matching video quality and the most power-saving core that does not affect the user experience.
2. The DQN-based HTTP adaptive flow control energy consumption optimization method according to claim 1, characterized in that the environment information contains the network level in the defined state set S, which is divided into six grades from low to high; measurements show, however, that at levels 1 and 2, or under 3G, the test video cannot be loaded normally even at the lowest quality. The remaining phone battery value and the cached fragment length are also collected; a script is written here to read the cache information and select the buffer state, i.e. the fragment length, at each unit time point.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910060941.7A CN109802964B (en) | 2019-01-23 | 2019-01-23 | DQN-based HTTP adaptive flow control energy consumption optimization method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109802964A true CN109802964A (en) | 2019-05-24 |
CN109802964B CN109802964B (en) | 2021-09-28 |
Family
ID=66560085
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910060941.7A Active CN109802964B (en) | 2019-01-23 | 2019-01-23 | DQN-based HTTP adaptive flow control energy consumption optimization method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109802964B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180129974A1 (en) * | 2016-11-04 | 2018-05-10 | United Technologies Corporation | Control systems using deep reinforcement learning |
CN108063961A (en) * | 2017-12-22 | 2018-05-22 | 北京联合网视文化传播有限公司 | A kind of self-adaption code rate video transmission method and system based on intensified learning |
CN108737382A (en) * | 2018-04-23 | 2018-11-02 | 浙江工业大学 | SVC based on Q-Learning encodes HTTP streaming media self-adapting methods |
AU2017268276A1 (en) * | 2016-05-16 | 2018-12-06 | Wi-Tronix, Llc | Video content analysis system and method for transportation system |
CN108966330A (en) * | 2018-09-21 | 2018-12-07 | 西北大学 | A kind of mobile terminal music player dynamic regulation energy consumption optimization method based on Q-learning |
Non-Patent Citations (3)
Title |
---|
HONGFENG XU: "Live Streaming with Content Centric Networking", 《2012 THIRD INTERNATIONAL CONFERENCE ON NETWORKING AND DISTRIBUTED COMPUTING》 * |
VIRGINIA MARTIN: "Evaluation of Q-Learning approach for HTTP Adaptive Streaming", 《2016 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS》 * |
熊丽荣: "基于Q-learning的HTTP自适应流码率控制方法研究", 《通信学报》 * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110414725A (en) * | 2019-07-11 | 2019-11-05 | 山东大学 | The integrated wind power plant energy-storage system dispatching method of forecast and decision and device |
CN114885208A (en) * | 2022-03-21 | 2022-08-09 | 中南大学 | Dynamic self-adapting method, equipment and medium for scalable streaming media transmission under NDN (named data networking) |
CN114885208B (en) * | 2022-03-21 | 2023-08-08 | 中南大学 | Dynamic self-adapting method, equipment and medium for scalable streaming media transmission under NDN (network discovery network) |
Also Published As
Publication number | Publication date |
---|---|
CN109802964B (en) | 2021-09-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109639760B (en) | Caching policy method in D2D networks based on deep reinforcement learning | |
CN111835827B (en) | Internet of things edge computing task unloading method and system | |
CN113434212B (en) | Cache auxiliary task cooperative unloading and resource allocation method based on meta reinforcement learning | |
CN107690176B (en) | Network selection method based on Q learning algorithm | |
CN110062357B (en) | D2D auxiliary equipment caching system and caching method based on reinforcement learning | |
Gaing | Constrained dynamic economic dispatch solution using particle swarm optimization | |
CN112218337B (en) | Cache strategy decision method in mobile edge calculation | |
CN109814951A (en) | Joint optimization method of task offloading and resource allocation in mobile edge computing networks | |
CN110351754A (en) | Q-learning-based computation offloading decision method for user data of industrial internet machinery equipment | |
CN113114756A (en) | Video cache updating method for self-adaptive code rate selection in mobile edge calculation | |
CN110312277B (en) | Mobile network edge cooperative cache model construction method based on machine learning | |
Zhu et al. | Computation offloading for workflow in mobile edge computing based on deep Q-learning | |
CN109802964A (en) | DQN-based HTTP adaptive flow control energy consumption optimization method | |
Yan et al. | Distributed edge caching with content recommendation in fog-rans via deep reinforcement learning | |
CN114205791A (en) | Depth Q learning-based social perception D2D collaborative caching method | |
CN107949007A (en) | Game-theory-based resource allocation algorithm in wireless caching systems | |
CN116321307A (en) | Bidirectional cache placement method based on deep reinforcement learning in non-cellular network | |
Lin et al. | Vehicle-to-cloudlet: Game-based computation demand response for mobile edge computing through vehicles | |
CN111314960A (en) | Social awareness-based collaborative caching method in fog wireless access network | |
US11570063B2 (en) | Quality of experience optimization system and method | |
CN111643901A (en) | Method and device for intelligently rendering cloud game interface | |
CN112822727B (en) | Self-adaptive edge content caching method based on mobility and popularity perception | |
Mowafi et al. | Energy efficient fuzzy-based DASH adaptation algorithm | |
CN113672372B (en) | Multi-edge collaborative load balancing task scheduling method based on reinforcement learning | |
Lin et al. | Knn-q learning algorithm of bitrate adaptation for video streaming over http |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||