CN108965949B - Code rate self-adaption method for satisfying user personalized experience in video service - Google Patents

Info

Publication number
CN108965949B
CN108965949B CN201810844053.XA CN201810844053A CN108965949B CN 108965949 B CN108965949 B CN 108965949B CN 201810844053 A CN201810844053 A CN 201810844053A CN 108965949 B CN108965949 B CN 108965949B
Authority
CN
China
Prior art keywords
code rate
value
video
user
qoe
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810844053.XA
Other languages
Chinese (zh)
Other versions
CN108965949A (en)
Inventor
崔勇
王莫为
左旭彤
杨啖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2018-07-27
Filing date
2018-07-27
Publication date
2020-06-16
Application filed by Tsinghua University
Priority to CN201810844053.XA
Publication of CN108965949A
Application granted
Publication of CN108965949B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25 Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/266 Channel or content management, e.g. generation and management of keys and entitlement messages in a conditional access system, merging a VOD unicast channel into a multicast channel
    • H04N21/2662 Controlling the complexity of the video stream, e.g. by scaling the resolution or bitrate of the video stream based on the client capabilities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60 Network streaming of media packets
    • H04L65/75 Media network packet handling
    • H04L65/762 Media network packet handling at the source

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The scheme for personalized user experience in video services is a technique for improving the viewing experience during video playback. The method constructs a function approximator with a neural network and predicts the influence of each code rate selection on subsequent video playback performance indexes, thereby meeting different user experience requirements. The design flow is: 1) Evaluation: a neural network evaluates the influence of each code rate selection on the different meta performance indexes. 2) Decision: the evaluated meta performance index values obtained in the evaluation process are explicitly multiplied by the optimization target g, and the code rate corresponding to the maximum value is selected. The invention can maximize user experience under different optimization targets and, when the optimization target of the user experience changes, can generalize to the new user target quickly and at low cost.

Description

Code rate self-adaption method for satisfying user personalized experience in video service
Technical Field
The invention belongs to the technical field of streaming media video, relates to user experience optimization, and particularly relates to a code rate self-adaption method for meeting user personalized experience in video service.
Background
In recent years, video traffic on the internet has surged, and video was expected to account for nearly 80% of all internet traffic in 2019. Video performance is becoming more and more important, because it directly affects the user's experience, which in turn affects how long the user watches and ultimately the revenue of the content provider. Users expect video to be sharp, to play without stalling, to be smooth, and to have low latency. However, these performance indexes conflict with and constrain one another. With the advent of new scenarios and new forms of presentation, such as live streaming and Virtual Reality (VR), meeting user experience requirements becomes even more challenging.
A tool that describes and quantifies user experience and user demand for video is user quality of experience (QoE). The adaptive bitrate (ABR) algorithm is a common method for improving user QoE: it selects an appropriate code rate for the next video block to be played so as to maximize user experience. User QoE generally includes the following meta performance indexes: code rate, video stall time, code rate switching, and delay. Different users and different viewing scenarios place different requirements on each QoE performance index. For example, when watching a live game stream, the user prefers high-definition video without stalls, while the requirement on delay is lower. In a highly interactive scenario, the user places a higher requirement on delay, while the requirement on definition may be lower. It is therefore meaningful to provide a way to meet personalized experience requirements across different users. Balancing the different performance indexes to maximize user experience has become a focus of attention and research in both academia and industry.
Disclosure of Invention
To address the inherent difficulty of improving user experience in video services and the desire to satisfy personalized user experience, the invention provides a code rate self-adaption method that satisfies personalized user experience in video services. The method is a model with generalization capability that achieves the goal of personalized user experience in video playback. The invention is a code rate adaptive algorithm based on reinforcement learning; it selects the most suitable code rate for the current network environment and optimizes the various performance indexes of the video service so as to meet the personalized experience requirements of users. The algorithm outperforms prior code rate adaptive algorithms, that is, it provides the best user experience under a given user QoE target. Moreover, when the user or the playback content changes, the algorithm can generalize to the new user preference quickly and at low cost, ultimately improving the viewing experience during video playback and maximizing user experience under different optimization targets.
In order to achieve the purpose, the invention adopts the technical scheme that:
a code rate self-adaption method that satisfies personalized user experience in video services, in which a neural network is used as an evaluation function Q(s, a, m, g) to evaluate the influence of each code rate selection a on the different meta performance indexes m. The evaluated meta performance index values obtained in the evaluation process are explicitly multiplied by the optimization target weight value, that is, the given user preference g, and the code rate corresponding to the maximum value is selected, thereby meeting different user experience requirements. Here the evaluation function Q(s, a, m, g) represents how each meta performance index m is influenced by each code rate selection a under different network states s and a given user preference g.
The input of the evaluation process consists of a state value s and an optimization target weight value g. The state value s describes the network condition and the buffer occupancy; the optimization target weight value g represents the video performance requirements of different users.
The output of the evaluation process, Q(s, a, m, g), is the cumulative sum of the QoE observations up to the end of video playback, where ∞ in the formula denotes the end of video playback.
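The cumulative output described above can be illustrated with a minimal Python sketch. It assumes that one vector of meta performance index observations is recorded per block and that the value for block n is the sum of observations from block n to the last block. The function name `cumulative_targets`, the column order, and the sample trace are hypothetical and are not taken from the patent.

```python
from typing import List, Sequence


def cumulative_targets(per_chunk: Sequence[Sequence[float]]) -> List[List[float]]:
    """Return suffix sums: targets[n][m] is metric m summed over blocks n..N-1."""
    if not per_chunk:
        return []
    running = [0.0] * len(per_chunk[0])
    targets: List[List[float]] = []
    # Walk backwards from the last block so each step adds one more block.
    for chunk in reversed(per_chunk):
        running = [r + x for r, x in zip(running, chunk)]
        targets.append(list(running))
    targets.reverse()
    return targets


# Hypothetical 3-block trace; columns: quality, stall (s), quality switch, delay (s).
trace = [
    [1.0, 0.0, 0.0, 0.5],
    [2.0, 0.5, 1.0, 0.5],
    [2.0, 0.0, 0.0, 0.25],
]
targets = cumulative_targets(trace)
```

Under these assumptions, `targets[0]` holds the totals for the whole video and `targets[-1]` equals the last block's own observation.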
The user experience QoE is represented by a linear combination of the meta performance indexes m and the user preference g:
$$QoE = \alpha \sum_{n=1}^{N} q(R_n) - \beta \sum_{n=1}^{N} T_n - \gamma \sum_{n=1}^{N-1} \left| q(R_{n+1}) - q(R_n) \right| - \mu \sum_{n=1}^{N} D_n$$
where N is the number of blocks in the video being played, $R_n$ is the code rate of the nth block, $q(R_n)$ is the quality of the nth video block, $T_n$ is the stall time of the nth block, $|q(R_{n+1}) - q(R_n)|$ is the code rate difference between two adjacent blocks during playback, which represents the smoothness of the video, $D_n$ is the delay of downloading the nth block, and α, β, γ, μ are the four terms of the optimization objective g.
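As an illustration of the linear QoE combination, the following Python sketch computes QoE from per-block values. The sign convention is an assumption: quality counts positively, while stall time, switching, and delay count as penalties. The function name `qoe`, the weight values, and the sample data are hypothetical.

```python
from typing import Sequence


def qoe(quality: Sequence[float], stall: Sequence[float], delay: Sequence[float],
        alpha: float, beta: float, gamma: float, mu: float) -> float:
    """Weighted QoE over N blocks; penalty signs are an assumption of this sketch."""
    quality_term = sum(quality)                      # sum of q(R_n)
    stall_term = sum(stall)                          # sum of T_n
    # Smoothness: |q(R_{n+1}) - q(R_n)| over adjacent blocks.
    switch_term = sum(abs(b - a) for a, b in zip(quality, quality[1:]))
    delay_term = sum(delay)                          # sum of D_n
    return (alpha * quality_term - beta * stall_term
            - gamma * switch_term - mu * delay_term)


# Hypothetical 3-block session and preference weights (alpha, beta, gamma, mu).
score = qoe(quality=[1.0, 2.0, 2.0], stall=[0.0, 0.5, 0.0],
            delay=[0.5, 0.5, 0.25], alpha=1.0, beta=4.0, gamma=1.0, mu=2.0)
```

Changing only the weights re-scores the same session for a different user preference, which is the property the method relies on.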
The two inputs of the evaluation process, the state value s and the optimization target weight value g, are processed by two separate neural network modules, and the outputs of the two modules are concatenated as the input of the next neural network. The future QoE values are predicted from this concatenated input, and the neural network outputs the future observed values for all actions simultaneously. This network is divided into two modules. One is an expectation module, whose predicted value is the average of the future QoE observed values; this value depends only on the state value s and not on the action. The other is an action module, which predicts the QoE observed values corresponding to the different actions taken in a given state. The outputs of the two modules are added to form the output of the whole neural network, namely the values of the four QoE meta performance indexes, accumulated to the end of video playback, for each action taken in a given state.
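The structure described above can be sketched as an untrained forward pass in plain Python. This is only an illustration of the data flow: two input branches, concatenation, an expectation head, an action head, and their sum. The layer sizes, the ReLU activation, the random initialization, and the class name `PreferenceConditionedEvaluator` are all assumptions, since the source does not specify them.

```python
import random
from typing import List

Matrix = List[List[float]]


def dense(x: List[float], w: Matrix, b: List[float]) -> List[float]:
    """Fully connected layer: one output per weight row."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]


def relu(x: List[float]) -> List[float]:
    return [max(0.0, v) for v in x]


def init(rng: random.Random, n_out: int, n_in: int) -> Matrix:
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]


class PreferenceConditionedEvaluator:
    """Hypothetical evaluator returning Q[a][m] for every action a and metric m."""

    def __init__(self, state_dim: int, num_actions: int,
                 num_metrics: int = 4, hidden: int = 8, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.num_actions, self.num_metrics = num_actions, num_metrics
        self.w_s, self.b_s = init(rng, hidden, state_dim), [0.0] * hidden
        self.w_g, self.b_g = init(rng, hidden, num_metrics), [0.0] * hidden
        self.w_e, self.b_e = init(rng, num_metrics, 2 * hidden), [0.0] * num_metrics
        out = num_actions * num_metrics
        self.w_a, self.b_a = init(rng, out, 2 * hidden), [0.0] * out

    def forward(self, s: List[float], g: List[float]) -> Matrix:
        # Separate branches for state and preference, then concatenation.
        joint = relu(dense(s, self.w_s, self.b_s)) + relu(dense(g, self.w_g, self.b_g))
        expectation = dense(joint, self.w_e, self.b_e)   # one value per metric
        action_flat = dense(joint, self.w_a, self.b_a)   # one value per (action, metric)
        m = self.num_metrics
        # Add the action-independent expectation to each action's prediction.
        return [[expectation[j] + action_flat[a * m + j] for j in range(m)]
                for a in range(self.num_actions)]


# Hypothetical state: recent throughput (Mbps), buffer level (s), last bitrate index.
model = PreferenceConditionedEvaluator(state_dim=3, num_actions=6)
q_values = model.forward([2.5, 8.0, 1.0], [0.4, 0.3, 0.2, 0.1])
```

The result is a table with one row per selectable code rate and one column per meta performance index, which is the shape the decision step consumes.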
When online, the evaluated values of the meta performance indexes obtained in the evaluation process are explicitly multiplied by the optimization target weight value g, according to the following formula:
$$a^{*} = \arg\max_{a} \; g^{\mathsf{T}} Q(s, a, m, g)$$
According to this formula, the optimal code rate under a specific target can be selected: when the product of the Q value and the optimization target g is maximal, the optimal target value is obtained, and the corresponding code rate $a^{*}$ is the code rate to be selected for the block.
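The selection rule can be sketched in a few lines of Python. The sketch assumes the evaluator's output is available as a table with one row per code rate and one column per meta performance index, with penalty metrics stored as negative numbers so that a plain dot product with g yields the score. The function name `select_bitrate` and all numbers are hypothetical.

```python
from typing import List, Sequence, Tuple


def select_bitrate(q_values: Sequence[Sequence[float]],
                   g: Sequence[float]) -> Tuple[int, List[float]]:
    """Return the index of the code rate maximizing g^T Q, plus all scores."""
    scores = [sum(gi * qi for gi, qi in zip(g, row)) for row in q_values]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Hypothetical predictions for 3 code rates.
# Columns: quality, -stall, -switching, -delay (penalties stored as negatives).
q_table = [
    [10.0, -0.5, -1.0, -2.0],
    [20.0, -2.0, -2.0, -3.0],
    [30.0, -8.0, -3.0, -5.0],
]

balanced_choice, _ = select_bitrate(q_table, [1.0, 1.0, 1.0, 1.0])
stall_averse_choice, _ = select_bitrate(q_table, [1.0, 4.0, 1.0, 1.0])
```

With these numbers the balanced preference picks the highest code rate (index 2), while the stall-averse preference picks the middle one (index 1), using the same predictions in both cases.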
When training the neural network model, randomly generated optimization target weight values g are used.
Compared with the prior art, the invention has the following beneficial effects:
The output dimension of the neural network is increased. The output of a conventional reinforcement learning algorithm is a scalar reward value representing the reward obtained after an action is taken, and a scalar carries little information. Increasing the output dimension improves the operability of the algorithm. In addition, the personalized QoE requirements of different users can be met by setting different values of g.
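Training with randomly generated weight values g can be sketched as follows. The source states only that g is generated randomly; drawing each weight uniformly from [0, 1) and normalizing the vector to sum to 1 are assumptions of this sketch, as is the function name `sample_preference`.

```python
import random
from typing import List


def sample_preference(rng: random.Random, num_metrics: int = 4) -> List[float]:
    """Draw one random non-negative preference vector that sums to 1."""
    raw = [rng.random() for _ in range(num_metrics)]
    total = sum(raw)
    if total == 0.0:
        # Degenerate draw: fall back to equal weights.
        return [1.0 / num_metrics] * num_metrics
    return [r / total for r in raw]


# One hypothetical preference per training episode, reproducible via the seed.
rng = random.Random(42)
episode_preferences = [sample_preference(rng) for _ in range(5)]
```

Feeding a fresh g to the evaluator in each episode exposes it to many preference vectors, which is what would let a single trained model serve different users without retraining.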
Drawings
FIG. 1 is a model of an evaluation process, where the inputs are state, optimization objectives, and the output is the cumulative impact of selecting each code rate on the meta-performance index.
Detailed Description
The embodiments of the present invention will be described in detail below with reference to the drawings and examples.
The invention relates to a method for improving user experience in video services, which aims to realize personalized user experience by using a model with generalization capability. User QoE generally includes the following meta performance indexes: code rate, video stall time, code rate switching, and delay. Different users have different requirements for these video performance indexes. When different video optimization targets exist, the invention can optimize performance quickly and at low cost.
The design idea of the invention is as follows:
(1) Design overview: the method is designed within a deep reinforcement learning framework. By explicitly introducing the user preference g, the evaluation process and the decision process of ordinary reinforcement learning are decoupled. A neural network is used as the evaluation function Q(s, a, m, g), which represents the influence of each code rate selection a on each meta performance index m under different network states s and a given user preference g; the code rate of the next block is selected using this evaluation function.
(2) Evaluation process: using the idea of a universal value function, a function approximator is constructed to predict the future values of the meta performance indexes.
Evaluation process inputs: the input consists of two parts, the state s and the optimization target weight value g. The state value describes the network condition and the buffer occupancy. g is the weight value corresponding to the optimization target and represents the different preferences of different users for video performance.
Evaluation process outputs: the output is the QoE observations accumulated up to the end of video playback. The traditional scalar reward value Q(s, a) is split into action-metric values Q(s, a, m) for each of the A actions, where A is the number of selectable code rates. The user experience QoE can be expressed as a linear combination of the meta performance index values m and the user preference g, i.e.
$$QoE(s, a) = \sum_{m} g_m \, Q(s, a, m, g)$$
In compact form:
$$QoE = g^{\mathsf{T}} Q$$
Thus, the QoE of each action under any preference g can be obtained by calculation.
Description of the evaluation process model: the two inputs, the state and the optimization target, are processed by two separate neural network modules, and the outputs of the two modules are concatenated as the input of the next layer of the neural network. The future QoE observations are predicted from this concatenated input, and the neural network outputs the future observed values for all actions simultaneously. This network is divided into two modules. One is an expectation module, whose predicted value is the average of the future QoE observed values; this value depends only on the state value and not on the action. The other is an action module, which predicts the QoE observed values corresponding to the different actions taken in a given state. The outputs of the two modules are added to form the output of the whole neural network, namely the values of the four QoE meta performance indexes, accumulated to the end of video playback, for each action taken in a given state.
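The sum of the two module outputs can be written out as a short derivation. The symbols E (expectation module output) and A (action module output) are introduced here only for illustration and do not appear in the source. Under this reading, the expectation term is the same for every action, so it shifts all scores equally and the ranking of code rates is determined by the action module alone.

```latex
% E and A are illustrative names for the two module outputs (assumption).
Q(s, a, m, g) = E(s, m, g) + A(s, a, m, g)

% The term \sum_m g_m E(s, m, g) does not depend on a, so it drops out of the arg max:
\arg\max_{a} \sum_{m} g_m \, Q(s, a, m, g)
  = \arg\max_{a} \Big[ \sum_{m} g_m \, E(s, m, g) + \sum_{m} g_m \, A(s, a, m, g) \Big]
  = \arg\max_{a} \sum_{m} g_m \, A(s, a, m, g)
```

The expectation module still matters for training, because the network is fitted to the full cumulative observations rather than to differences between actions.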
(3) Decision process: when online, the algorithm uses the meta performance indexes (definition, stall, smoothness, and delay) accumulated to the end of video playback, as obtained in the evaluation process, together with the optimization target g,
$$a^{*} = \arg\max_{a} \; g^{\mathsf{T}} Q(s, a, m, g)$$
and selects the optimal code rate under a specific target according to this formula.
In summary, the present invention provides a code rate adaptive algorithm capable of realizing personalized user experience. A function approximator is constructed by utilizing a neural network, and the influence of code rate selection on the subsequent video playing performance index is predicted, so that different user experience requirements are met. According to the scheme, different code rates can be selected according to different playing contents, users and user behaviors, the maximization of user experience under different optimization targets is achieved, and when the optimization target of the user experience is changed, generalization on the user target can be achieved rapidly and at low cost, so that the requirement of personalized user experience is met.

Claims (3)

1. A code rate self-adaption method for satisfying user personalized experience in video service utilizes a neural network as an evaluation function Q (s, a, m, g), evaluates the influence of each code rate selection a on different element performance indexes m, utilizes the evaluation value of the element performance indexes obtained in the evaluation process to be explicitly multiplied by an optimized target weight value, namely a given user preference g, and selects a code rate corresponding to the maximum value, thereby satisfying different user experience requirements, wherein the evaluation function Q (s, a, m, g) represents how each code rate selection a influences each element performance index m under the conditions of different network states s and the given user preference g, the input of the evaluation process is composed of a state value s and the optimized target weight value g, wherein the state value s describes the network condition and the buffer area occupation condition; the optimization target weight value g represents different user video performance requirements;
the output of the evaluation process is the cumulative sum of the QoE observations by the end of the video playback, output Q(s, a, m, g), where the end of video playback is represented by ∞ in the formula;
the method is characterized in that the user QoE (quality of experience) is expressed by a linear combination of the meta performance index m and the user preference g:
$$QoE = \alpha \sum_{n=1}^{N} q(R_n) - \beta \sum_{n=1}^{N} T_n - \gamma \sum_{n=1}^{N-1} \left| q(R_{n+1}) - q(R_n) \right| - \mu \sum_{n=1}^{N} D_n$$
where N is the number of blocks in the video being played, $R_n$ is the code rate of the nth block, $q(R_n)$ is the quality of the nth video block, $T_n$ is the stall time of the nth block, $|q(R_{n+1}) - q(R_n)|$ is the code rate difference between two adjacent blocks during playback, which represents the smoothness of the video, $D_n$ is the delay of downloading the nth block, and α, β, γ, μ are the four terms of the optimization objective g;
the two inputs of the evaluation process, the state value s and the optimization target weight value g, are processed by two separate neural network modules, and the outputs of the two modules are concatenated as the input of the next neural network; the future QoE values are predicted from the concatenated input, and the neural network simultaneously outputs the future QoE observed values corresponding to each action; the neural network is divided into two modules, one being an expectation module, whose predicted value is the average of the future QoE observed values and depends only on the state value s and not on the action, and the other being an action module, which predicts the future QoE observed values corresponding to the different actions taken in a given state; the outputs of the two modules are added to form the output of the whole neural network, namely the values of the four QoE meta performance indexes, accumulated to the end of video playback, for each action taken in a given state.
2. The code rate self-adaption method for satisfying user personalized experience in video service according to claim 1, wherein, when online, the evaluated values of the meta performance indexes obtained in the evaluation process are explicitly multiplied by the optimization target weight value g, according to the following formula:
$$a^{*} = \arg\max_{a} \; g^{\mathsf{T}} Q(s, a, m, g)$$
according to this formula, the optimal code rate under a specific target can be selected: when the product of the Q value and the optimization target g is maximal, the optimal target value is obtained, and the corresponding code rate $a^{*}$ is the code rate to be selected for the block.
3. The code rate self-adaption method for satisfying user personalized experience in video service according to claim 1, wherein the optimization target weight value g is randomly generated when training the neural network model.
CN201810844053.XA 2018-07-27 2018-07-27 Code rate self-adaption method for satisfying user personalized experience in video service Active CN108965949B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810844053.XA CN108965949B (en) 2018-07-27 2018-07-27 Code rate self-adaption method for satisfying user personalized experience in video service

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810844053.XA CN108965949B (en) 2018-07-27 2018-07-27 Code rate self-adaption method for satisfying user personalized experience in video service

Publications (2)

Publication Number Publication Date
CN108965949A CN108965949A (en) 2018-12-07
CN108965949B true CN108965949B (en) 2020-06-16

Family

ID=64465739

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810844053.XA Active CN108965949B (en) 2018-07-27 2018-07-27 Code rate self-adaption method for satisfying user personalized experience in video service

Country Status (1)

Country Link
CN (1) CN108965949B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111698536B (en) * 2019-03-15 2023-03-28 瑞昱半导体股份有限公司 Video processing method and system
CN110191362B (en) * 2019-05-29 2021-03-16 鹏城实验室 Data transmission method and device, storage medium and electronic equipment
CN110324621B (en) * 2019-07-04 2021-05-18 北京达佳互联信息技术有限公司 Video encoding method, video encoding device, electronic equipment and storage medium
CN111432246B (en) * 2020-03-23 2022-11-15 广州市百果园信息技术有限公司 Method, device and storage medium for pushing video data
CN111447471B (en) * 2020-03-26 2022-03-22 广州市百果园信息技术有限公司 Model generation method, play control method, device, equipment and storage medium
CN113810089B (en) * 2020-06-11 2023-09-29 华为技术有限公司 Communication method and device
CN111669627B (en) * 2020-06-30 2022-02-15 广州市百果园信息技术有限公司 Method, device, server and storage medium for determining video code rate
CN112202802B (en) * 2020-10-10 2021-10-01 中国科学技术大学 VR video multi-level caching method and system based on reinforcement learning in C-RAN architecture
CN112911408B (en) * 2021-01-25 2022-03-25 电子科技大学 Intelligent video code rate adjustment and bandwidth allocation method based on deep learning
EP4284054A4 (en) * 2021-03-02 2024-03-20 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Wireless communication method, terminal device and network device
CN115515161A (en) * 2021-06-23 2022-12-23 华为技术有限公司 Data transmission method and communication device
CN115776590A (en) * 2021-09-07 2023-03-10 北京字跳网络技术有限公司 Dynamic image quality video playing method and device, electronic equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102802089A (en) * 2012-09-13 2012-11-28 浙江大学 Shifting video code rate regulation method based on experience qualitative forecast
CN106604026A (en) * 2016-12-16 2017-04-26 浙江工业大学 Quality-of-experience (QoE) evaluation method of mobile streaming media user

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
CFA: A practical prediction system for video QoE Optimization; Junchen Jiang, et al; 《The Proceedings of the 13th USENIX Symposium on Networked Systems Design and Implementation》; 20160318; full text *

Also Published As

Publication number Publication date
CN108965949A (en) 2018-12-07

Similar Documents

Publication Publication Date Title
CN108965949B (en) Code rate self-adaption method for satisfying user personalized experience in video service
Zhang et al. DRL360: 360-degree video streaming with deep reinforcement learning
KR102254579B1 (en) System and method for streaming personalized media content
CN107038213B (en) Video recommendation method and device
CN107205178A (en) Direct broadcasting room recommends method and device
CN108989847B (en) System and method for encoding and streaming video
CN107454446A (en) Video frame management method and its device based on Quality of experience analysis
US20170169040A1 (en) Method and electronic device for recommending video
CN108419134B (en) Channel recommendation method based on fusion of individual history and group current behaviors
US11463538B2 (en) Adapting playback settings based on change history
Gao et al. Content-aware personalised rate adaptation for adaptive streaming via deep video analysis
CN109086822A (en) A kind of main broadcaster's user classification method, device, equipment and storage medium
US20180139501A1 (en) Optimized delivery of sequential content by skipping redundant segments
Li et al. An apprenticeship learning approach for adaptive video streaming based on chunk quality and user preference
Sun et al. Live 360 degree video delivery based on user collaboration in a streaming flock
JP2004519902A (en) Television viewer profile initializer and related methods
Ye et al. VRCT: A viewport reconstruction-based 360 video caching solution for tile-adaptive streaming
CN112866756B (en) Code rate control method, device, medium and equipment for multimedia file
CN114747225B (en) Method, system and medium for selecting a format of a streaming media content item
Lu et al. Deep-reinforcement-learning-based user-preference-aware rate adaptation for video streaming
Chen et al. Energy-efficient and QoE-aware 360-degree video streaming on mobile devices
Xie et al. Deep Curriculum Reinforcement Learning for Adaptive 360° Video Streaming With Two-Stage Training
Ran et al. SSR: Joint optimization of recommendation and adaptive bitrate streaming for short-form video feed
Huo et al. TS360: A two-stage deep reinforcement learning system for 360-degree video streaming
Li et al. Improving ABR performance for short video streaming using multi-agent reinforcement learning with expert guidance

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant