CN108965949B - Code rate self-adaption method for satisfying user personalized experience in video service - Google Patents
- Publication number
- CN108965949B (application number CN201810844053.XA)
- Authority
- CN
- China
- Prior art keywords
- code rate
- value
- video
- user
- qoe
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/25—Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
- H04N21/266—Channel or content management, e.g. generation and management of keys and entitlement messages in a conditional access system, merging a VOD unicast channel into a multicast channel
- H04N21/2662—Controlling the complexity of the video stream, e.g. by scaling the resolution or bitrate of the video stream based on the client capabilities
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/60—Network streaming of media packets
- H04L65/75—Media network packet handling
- H04L65/762—Media network packet handling at the source
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Molecular Biology (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Databases & Information Systems (AREA)
- Computer Networks & Wireless Communication (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Abstract
Personalized user experience in video services is a technique for improving the viewing experience during video playback. The method constructs a function approximator with a neural network and predicts the influence of each code rate selection on subsequent video playback performance indexes, thereby meeting different user experience requirements. The design flow is: 1) Evaluation: a neural network evaluates the influence of each code rate selection on the different meta-performance indexes. 2) Decision: the evaluation values of the meta-performance indexes obtained in the evaluation process are explicitly multiplied by the optimization target g, and the code rate corresponding to the maximum value is selected. The invention maximizes user experience under different optimization targets and, when the optimization target of the user experience changes, generalizes to the new user target quickly and at low cost.
Description
Technical Field
The invention belongs to the technical field of streaming media video, relates to user experience optimization, and particularly relates to a code rate self-adaption method for meeting user personalized experience in video service.
Background
In recent years, video traffic on the Internet has surged, and video is expected to account for nearly 80% of all Internet traffic in 2019. Video performance therefore matters more and more: it directly affects user experience, which in turn affects how long users watch and, ultimately, the revenue of the content provider. Users expect video that is sharp, plays without stalling, switches quality smoothly, and has low delay. However, these performance indexes conflict with and constrain one another. New scenarios and presentation forms, such as live streaming and virtual reality (VR), make meeting user experience requirements even more challenging.
Quality of experience (QoE) is the tool used to describe and quantify user experience and user demand for video. An adaptive bitrate (ABR) algorithm is a common way to improve user QoE: it selects an appropriate code rate for the next video block to be played so as to maximize user experience. User QoE generally comprises the following meta-metrics: code rate, video stall time, code rate switching, and delay. Different users and different viewing scenarios place different requirements on each QoE performance index. For example, when watching a live game stream, a user prefers high-definition video without stalls and is less demanding about delay. In a highly interactive scenario, a user is more demanding about delay, while sharpness matters less. It is therefore meaningful to provide a way to meet each user's personalized experience needs. Balancing the different performance indexes to maximize user experience has become a focus of academic and industrial research.
Disclosure of Invention
To address the inherent difficulty of improving user experience in video services and the desire to meet users' personalized experience, the invention provides a code rate self-adaption method that satisfies user personalized experience in video services. The method is a model with generalization capability, used to achieve personalized user experience in video playback. It is a code rate adaptive algorithm based on reinforcement learning that selects the most suitable code rate for the current network environment and optimizes the various performance indexes of the video service so as to meet users' individual experience requirements. The algorithm outperforms prior code rate adaptive algorithms, i.e., it provides the best user experience for a specific user QoE target. Moreover, when the user or the playing content changes, the algorithm generalizes to the new user preference quickly and at low cost, ultimately improving the viewing experience during playback and maximizing user experience under different optimization targets.
In order to achieve the purpose, the invention adopts the technical scheme that:
A code rate self-adaption method for satisfying user personalized experience in video services uses a neural network as an evaluation function Q(s, a, m, g) to evaluate the influence of each code rate selection a on the different meta-performance indexes m. The evaluation values of the meta-performance indexes obtained in the evaluation process are explicitly multiplied by the optimization target weight value, i.e., the given user preference g, and the code rate corresponding to the maximum value is selected, thereby meeting different user experience requirements. Here the evaluation function Q(s, a, m, g) represents how each code rate selection a affects each meta-performance index m under different network states s and a given user preference g.
The input of the evaluation process consists of a state value s and an optimization target weight value g. The state value s describes the network condition and the buffer occupancy; the optimization target weight value g represents the video performance requirements of different users.
The output of the evaluation process is the cumulative sum of the QoE observations up to the end of video playback, denoted $Q_\infty(s, a, m, g)$, where ∞ indicates the end of video playback.
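The cumulative-sum output described above can be illustrated with a minimal sketch. It assumes per-block observations are already available as rows of a matrix with one column per meta-performance index; the function name `cumulative_targets` and the sample trace are hypothetical and are not part of the patent.

```python
import numpy as np

def cumulative_targets(per_block_metrics):
    """For each block n, sum the metric vectors from block n to the last block."""
    m = np.asarray(per_block_metrics, dtype=float)  # shape (N, M)
    # Reverse, take a running sum, reverse back: suffix sums along the block axis.
    return np.cumsum(m[::-1], axis=0)[::-1]

# Hypothetical 3-block trace; columns: quality, stall time, switching, delay.
trace = [[1.0, 0.0, 0.0, 0.2],
         [2.0, 0.5, 1.0, 0.3],
         [2.0, 0.0, 0.0, 0.1]]
targets = cumulative_targets(trace)
```

Row n of `targets` is the kind of quantity a network could be trained to predict when block n is about to be chosen, under these assumptions.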
The user experience QoE is represented by a linear combination of the meta-performance indexes m and the user preference g:

$$\mathrm{QoE} = \alpha \sum_{n=1}^{N} q(R_n) - \beta \sum_{n=1}^{N} T_n - \gamma \sum_{n=1}^{N-1} \left| q(R_{n+1}) - q(R_n) \right| - \mu \sum_{n=1}^{N} D_n$$

where N is the number of blocks in the video being played, $R_n$ is the code rate of the n-th block, $q(R_n)$ is the quality of the n-th video block, $T_n$ is the stall time of the n-th block, $|q(R_{n+1}) - q(R_n)|$ is the code rate difference of two adjacent blocks during playback, which represents the smoothness of the video, $D_n$ is the delay of downloading the n-th block, and α, β, γ, μ are the four terms of the optimization target g.
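As a worked illustration of this linear combination, the following sketch computes a QoE value from a per-block trace. It assumes that quality enters positively and that stall time, switching, and delay enter as penalties, and it assumes the weight order (quality, stall, switching, delay); the function name `qoe` and all numbers are hypothetical.

```python
import numpy as np

def qoe(quality, stall, delay, g):
    """Linear QoE: weights g applied to summed meta-performance indexes."""
    quality = np.asarray(quality, dtype=float)
    stall = np.asarray(stall, dtype=float)
    delay = np.asarray(delay, dtype=float)
    # Smoothness term: absolute quality change between adjacent blocks.
    switching = np.abs(np.diff(quality)).sum()
    # Assumed sign convention: penalties are negated so that higher is better.
    m = np.array([quality.sum(), -stall.sum(), -switching, -delay.sum()])
    return float(np.dot(g, m))

# Hypothetical trace of three blocks and a preference that penalises stalls.
value = qoe(quality=[1.0, 2.0, 2.0],
            stall=[0.0, 0.5, 0.0],
            delay=[0.2, 0.3, 0.1],
            g=[1.0, 4.0, 1.0, 1.0])
```

With these numbers the terms are 5.0 for quality, 2.0 for stalls, 1.0 for switching, and 0.6 for delay, so `value` is 1.4.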
The evaluation process has two inputs, the state value s and the optimization target weight value g, each processed by its own neural network. The outputs of these two modules are concatenated and used as the input of the next neural network, so the future QoE values are predicted from the concatenated input, and the network outputs the future observed values for all actions simultaneously. This network is divided into two modules. One is an expectation module, whose prediction is the average of the future QoE observed values; these values depend only on the state value s and not on the action. The other is an action module, which predicts the QoE observed values corresponding to the different actions taken in a given state. The outputs of the two modules are added to form the output of the whole neural network, i.e., the values of the four QoE meta-performance indexes, accumulated until the end of video playback, for each action taken in a given state.
When online, the evaluation values of the meta-performance indexes obtained in the evaluation process are explicitly multiplied by the optimization target weight value g, using the following formula:

$$a = \arg\max_{a}\; g^{\top} Q_\infty(s, a, m, g)$$

According to this formula, the optimal code rate for a specific target can be selected: the optimum is reached when the product of the Q value and the optimization target g is maximal, and the corresponding code rate a is the code rate to select for the block.
When training the neural network model, randomly generated optimization target weight values g are used.
Compared with the prior art, the invention has the following beneficial effects:
The output dimension of the neural network is increased. The output of a conventional reinforcement learning algorithm is a scalar reward value representing the reward obtained after an action is taken, and a scalar carries little information. The higher-dimensional output makes the algorithm more flexible to operate. In addition, the personalized QoE requirements of different users can be met by setting different values of g.
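The random generation of g during training could be sketched as follows. The patent does not state which distribution is used; drawing non-negative weights that sum to one from a flat Dirichlet distribution is an assumption made here for illustration, and `sample_preferences` is a hypothetical name.

```python
import numpy as np

def sample_preferences(batch_size, num_metrics=4, seed=None):
    """Draw random preference vectors g, one row per training episode."""
    rng = np.random.default_rng(seed)
    # Assumption: a flat Dirichlet gives non-negative weights that sum to 1.
    return rng.dirichlet(np.ones(num_metrics), size=batch_size)

g_batch = sample_preferences(batch_size=8, seed=0)
```

Training across many such vectors is what would let a single model serve a new preference at run time without retraining, which is the generalization benefit the text describes.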
Drawings
FIG. 1 is a model of the evaluation process, where the inputs are the state and the optimization target, and the output is the cumulative influence of selecting each code rate on the meta-performance indexes.
Detailed Description
The embodiments of the present invention will be described in detail below with reference to the drawings and examples.
The invention relates to a method for improving user experience in video services that achieves personalized user experience by using a model with generalization capability. User QoE generally comprises the following meta-metrics: code rate, video stall time, code rate switching, and delay. Different users have different demands on video performance indexes when watching video. When the video optimization target differs, the invention can optimize performance quickly and at low cost.
The design idea of the invention is as follows:
(1) Design overview: the design is carried out under a deep reinforcement learning framework. By explicitly introducing the user preference g, the evaluation process and the decision process of ordinary reinforcement learning are decoupled. A neural network is used as the evaluation function Q(s, a, m, g), which represents the influence of each code rate selection a on each meta-performance index m under different network states and a given user preference g; the code rate of the next block is selected using this evaluation function.
(2) Evaluation process: using the idea of a universal value estimation function, a function approximator is constructed to predict the future values of the meta-performance indexes.
Evaluation process input: the input consists of two parts, the state s and the optimization target weight value g. The state value describes the network condition and the buffer occupancy. g is the weight value corresponding to the optimization target and represents the different preferences of different users for video performance.
Evaluation process output: the output is the QoE observed values accumulated up to the end of video playback. The traditional scalar reward value Q(s, a) is split, for each action a, into a vector of meta-metric values Q(s, a, m), where the number of actions equals the number of selectable code rates. The user experience QoE can be represented by a linear combination of the meta-performance index values m and the user preference g, which is written compactly as:

$$\mathrm{QoE} = g^{\top} Q$$

Thus, the QoE of each action under any preference g can be obtained by calculation.
Description of the evaluation process model: the two inputs, the state and the optimization target, are each processed by their own neural network, and the outputs of the two modules are concatenated as the input of the next neural network layer. The future QoE observed values are predicted from this concatenated input, and the network outputs the future observed values for all actions simultaneously. The network is divided into two modules. One is an expectation module, whose prediction is the average of the future QoE observed values; these values depend only on the state value and not on the action. The other is an action module, which predicts the QoE observed values corresponding to the different actions taken in a given state. The outputs of the two modules are added to form the output of the whole neural network, i.e., the values of the four QoE meta-performance indexes, accumulated until the end of video playback, for each action taken in a given state.
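The structure described above can be sketched as a toy forward pass. This is not the patented network: the layer sizes, the random weights, the ReLU activations, and the centring of the action stream (so that the expectation stream carries the mean over actions) are all assumptions for illustration, and `EvaluationNetwork` is a hypothetical name.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

class EvaluationNetwork:
    """Toy forward pass; sizes and random weights are illustrative assumptions."""

    def __init__(self, state_dim, num_metrics=4, num_actions=6, hidden=32, seed=0):
        rng = np.random.default_rng(seed)
        init = lambda i, o: rng.normal(0.0, 0.1, size=(i, o))
        self.num_metrics, self.num_actions = num_metrics, num_actions
        self.w_state = init(state_dim, hidden)     # processes the state s
        self.w_pref = init(num_metrics, hidden)    # processes the preference g
        self.w_joint = init(2 * hidden, hidden)    # consumes the concatenation
        self.w_expect = init(hidden, num_metrics)  # expectation module
        self.w_action = init(hidden, num_actions * num_metrics)  # action module

    def forward(self, s, g):
        # Two input streams, concatenated before the shared layer.
        h = np.concatenate([relu(s @ self.w_state), relu(g @ self.w_pref)])
        h = relu(h @ self.w_joint)
        expectation = h @ self.w_expect  # shape (M,), independent of the action
        action = (h @ self.w_action).reshape(self.num_actions, self.num_metrics)
        # Assumption: centre the action stream across actions.
        action = action - action.mean(axis=0, keepdims=True)
        return expectation[None, :] + action  # shape (A, M)

net = EvaluationNetwork(state_dim=8)
q = net.forward(np.ones(8), np.array([0.4, 0.3, 0.2, 0.1]))
```

The returned array has one row per selectable code rate and one column per meta-performance index, which is the shape the decision step needs.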
(3) Decision process: when online, the algorithm combines the meta-performance indexes (sharpness, stalling, smoothness, and delay) predicted up to the end of video playback in the evaluation process with the optimization target:

$$a = \arg\max_{a}\; g^{\top} Q_\infty(s, a, m, g)$$

and selects the optimal code rate for the specific target according to this formula.
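The decision rule can be illustrated with a short sketch. It assumes the predicted values are arranged with one row per selectable code rate and one column per meta-performance index, and that penalty indexes are already signed so that larger is better; `select_bitrate` and the numbers are hypothetical.

```python
import numpy as np

def select_bitrate(q_values, g):
    """Return the index of the code rate maximising g^T Q, plus all scores."""
    scores = np.asarray(q_values, dtype=float) @ np.asarray(g, dtype=float)
    return int(np.argmax(scores)), scores

# Hypothetical predictions for three code rates.
# Columns: quality, -stall, -switching, -delay (penalties pre-negated).
q_values = [[10.0,  0.0,  0.0, -1.0],
            [20.0, -1.0, -1.0, -2.0],
            [30.0, -5.0, -2.0, -4.0]]

choice_balanced, _ = select_bitrate(q_values, g=[1.0, 1.0, 1.0, 1.0])
choice_stall_averse, _ = select_bitrate(q_values, g=[1.0, 6.0, 1.0, 1.0])
```

The same predictions yield index 2 (highest code rate) under equal weights and index 1 when stalls are weighted six times more heavily, which shows how changing g alone changes the decision.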
In summary, the invention provides a code rate adaptive algorithm that achieves personalized user experience. A function approximator is constructed with a neural network to predict the influence of code rate selection on subsequent video playback performance indexes, thereby meeting different user experience requirements. The scheme can select different code rates for different playing contents, users, and user behaviors, maximizing user experience under different optimization targets; when the optimization target of the user experience changes, it generalizes to the new user target quickly and at low cost, thereby meeting the requirement of personalized user experience.
Claims (3)
1. A code rate self-adaption method for satisfying user personalized experience in video service utilizes a neural network as an evaluation function Q (s, a, m, g), evaluates the influence of each code rate selection a on different element performance indexes m, utilizes the evaluation value of the element performance indexes obtained in the evaluation process to be explicitly multiplied by an optimized target weight value, namely a given user preference g, and selects a code rate corresponding to the maximum value, thereby satisfying different user experience requirements, wherein the evaluation function Q (s, a, m, g) represents how each code rate selection a influences each element performance index m under the conditions of different network states s and the given user preference g, the input of the evaluation process is composed of a state value s and the optimized target weight value g, wherein the state value s describes the network condition and the buffer area occupation condition; the optimization target weight value g represents different user video performance requirements;
the output of the evaluation process is the cumulative sum of the QoE observations up to the end of video playback, denoted $Q_\infty(s, a, m, g)$, where ∞ in the formula represents the end of video playback;
the method is characterized in that the QoE (quality of experience) of the user is expressed by a linear combination of the meta-performance index m and the user preference g:

$$\mathrm{QoE} = \alpha \sum_{n=1}^{N} q(R_n) - \beta \sum_{n=1}^{N} T_n - \gamma \sum_{n=1}^{N-1} \left| q(R_{n+1}) - q(R_n) \right| - \mu \sum_{n=1}^{N} D_n$$

where N is the number of blocks in the video being played, $R_n$ is the code rate of the n-th block, $q(R_n)$ is the quality of the n-th video block, $T_n$ is the stall time of the n-th block, $|q(R_{n+1}) - q(R_n)|$ is the code rate difference of two adjacent blocks during playback, which represents the smoothness of the video, $D_n$ is the delay of downloading the n-th block, and α, β, γ, μ are the four terms of the optimization target g;
the two parts of input of the evaluation process are a state value s and an optimized target weight value g, which are respectively processed by two neural networks, the output of the two modules is connected as the input of the next neural network, the future QoE value is based on the connected input, the neural network simultaneously outputs the future QoE observed value corresponding to each action, the neural network is divided into two modules, one module is an expected module, the predicted future QoE observed value is the average value of the future QoE observed values, and the future QoE observed values are only related to the state value s and are unrelated to the actions; the other is an action module, which predicts that in a certain state, different actions are taken to correspond to future QoE observed values; the two parts of output are added to be used as the output of the whole neural network, namely, under a certain specific state, different QoE four-element performance index values corresponding to different actions are taken until the video playing is finished.
2. The code rate adaptation method satisfying the personalized experience of the user in the video service according to claim 1, wherein, when online, the evaluation value of the meta-performance index obtained in the evaluation process is explicitly multiplied by the optimization target weight value g using the following calculation formula:

$$a = \arg\max_{a}\; g^{\top} Q_\infty(s, a, m, g)$$

according to the formula, the optimal code rate under a specific target can be selected; when the product of the Q value and the optimization target g is maximal, the optimal target value is obtained, and the corresponding code rate a is the code rate to be selected for the block.
3. The code rate adaptation method according to claim 1, wherein the optimization target weight value g is randomly generated when training the neural network model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810844053.XA CN108965949B (en) | 2018-07-27 | 2018-07-27 | Code rate self-adaption method for satisfying user personalized experience in video service |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810844053.XA CN108965949B (en) | 2018-07-27 | 2018-07-27 | Code rate self-adaption method for satisfying user personalized experience in video service |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108965949A CN108965949A (en) | 2018-12-07 |
CN108965949B CN108965949B (en) | 2020-06-16 |
Family
ID=64465739
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810844053.XA Active CN108965949B (en) | 2018-07-27 | 2018-07-27 | Code rate self-adaption method for satisfying user personalized experience in video service |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108965949B (en) |
Families Citing this family (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111698536B (en) * | 2019-03-15 | 2023-03-28 | 瑞昱半导体股份有限公司 | Video processing method and system |
CN110191362B (en) * | 2019-05-29 | 2021-03-16 | 鹏城实验室 | Data transmission method and device, storage medium and electronic equipment |
CN110324621B (en) * | 2019-07-04 | 2021-05-18 | 北京达佳互联信息技术有限公司 | Video encoding method, video encoding device, electronic equipment and storage medium |
CN111432246B (en) * | 2020-03-23 | 2022-11-15 | 广州市百果园信息技术有限公司 | Method, device and storage medium for pushing video data |
CN111447471B (en) * | 2020-03-26 | 2022-03-22 | 广州市百果园信息技术有限公司 | Model generation method, play control method, device, equipment and storage medium |
CN113810089B (en) * | 2020-06-11 | 2023-09-29 | 华为技术有限公司 | Communication method and device |
CN111669627B (en) * | 2020-06-30 | 2022-02-15 | 广州市百果园信息技术有限公司 | Method, device, server and storage medium for determining video code rate |
CN112202802B (en) * | 2020-10-10 | 2021-10-01 | 中国科学技术大学 | VR video multi-level caching method and system based on reinforcement learning in C-RAN architecture |
CN112911408B (en) * | 2021-01-25 | 2022-03-25 | 电子科技大学 | Intelligent video code rate adjustment and bandwidth allocation method based on deep learning |
EP4284054A4 (en) * | 2021-03-02 | 2024-03-20 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Wireless communication method, terminal device and network device |
CN115515161A (en) * | 2021-06-23 | 2022-12-23 | 华为技术有限公司 | Data transmission method and communication device |
CN115776590A (en) * | 2021-09-07 | 2023-03-10 | 北京字跳网络技术有限公司 | Dynamic image quality video playing method and device, electronic equipment and storage medium |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102802089A (en) * | 2012-09-13 | 2012-11-28 | 浙江大学 | Shifting video code rate regulation method based on experience qualitative forecast |
CN106604026A (en) * | 2016-12-16 | 2017-04-26 | 浙江工业大学 | Quality-of-experience (QoE) evaluation method of mobile streaming media user |
- 2018-07-27: CN application CN201810844053.XA, patent CN108965949B, status Active
Non-Patent Citations (1)
Title |
---|
CFA: A practical prediction system for video QoE Optimization; Junchen Jiang, et al.; 《The Proceedings of the 13th USENIX Symposium on Networked Systems Design and Implementation》; 20160318; full text * |
Also Published As
Publication number | Publication date |
---|---|
CN108965949A (en) | 2018-12-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108965949B (en) | Code rate self-adaption method for satisfying user personalized experience in video service | |
Zhang et al. | DRL360: 360-degree video streaming with deep reinforcement learning | |
KR102254579B1 (en) | System and method for streaming personalized media content | |
CN107038213B (en) | Video recommendation method and device | |
CN107205178A (en) | Direct broadcasting room recommends method and device | |
CN108989847B (en) | System and method for encoding and streaming video | |
CN107454446A (en) | Video frame management method and its device based on Quality of experience analysis | |
US20170169040A1 (en) | Method and electronic device for recommending video | |
CN108419134B (en) | Channel recommendation method based on fusion of individual history and group current behaviors | |
US11463538B2 (en) | Adapting playback settings based on change history | |
Gao et al. | Content-aware personalised rate adaptation for adaptive streaming via deep video analysis | |
CN109086822A (en) | A kind of main broadcaster's user classification method, device, equipment and storage medium | |
US20180139501A1 (en) | Optimized delivery of sequential content by skipping redundant segments | |
Li et al. | An apprenticeship learning approach for adaptive video streaming based on chunk quality and user preference | |
Sun et al. | Live 360 degree video delivery based on user collaboration in a streaming flock | |
JP2004519902A (en) | Television viewer profile initializer and related methods | |
Ye et al. | VRCT: A viewport reconstruction-based 360 video caching solution for tile-adaptive streaming | |
CN112866756B (en) | Code rate control method, device, medium and equipment for multimedia file | |
CN114747225B (en) | Method, system and medium for selecting a format of a streaming media content item | |
Lu et al. | Deep-reinforcement-learning-based user-preference-aware rate adaptation for video streaming | |
Chen et al. | Energy-efficient and QoE-aware 360-degree video streaming on mobile devices | |
Xie et al. | Deep Curriculum Reinforcement Learning for Adaptive 360° Video Streaming With Two-Stage Training | |
Ran et al. | SSR: Joint optimization of recommendation and adaptive bitrate streaming for short-form video feed | |
Huo et al. | TS360: A two-stage deep reinforcement learning system for 360-degree video streaming | |
Li et al. | Improving ABR performance for short video streaming using multi-agent reinforcement learning with expert guidance |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||