CN114866461A - RTC (real-time communication) streaming media adaptive transmission method, apparatus, device, and storage medium
- Publication number
- CN114866461A (application CN202210471454.1A)
- Authority
- CN
- China
- Prior art keywords
- strategy
- streaming media
- state
- transmission
- rtc
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L45/00—Routing or path finding of packets in data switching networks
- H04L45/12—Shortest path evaluation
- H04L45/124—Shortest path evaluation using a combination of metrics
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computing arrangements based on specific mathematical models
- G06N7/01—Probabilistic graphical models, e.g. probabilistic networks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/80—Responding to QoS
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D30/00—Reducing energy consumption in communication networks
- Y02D30/70—Reducing energy consumption in communication networks in wireless communication networks
Abstract
The application provides an RTC streaming media adaptive transmission method, apparatus, device, and storage medium. The method comprises: acquiring state parameters of all streaming media nodes in a target RTC transmission scene, and establishing a streaming media transmission strategy model based on Markov decision process theory and a preset user quality-of-experience condition; and, based on the streaming media transmission strategy model, solving the optimal transmission strategy of the target streaming media in the target RTC transmission scene through strategy iteration. The application searches for the optimal solution from a dynamic programming perspective, so as to minimize time delay and improve the transmission effect.
Description
Technical Field
The present application relates to the field of communications technologies, and in particular, to an RTC streaming media adaptive transmission method and apparatus, a computer device, and a computer-readable storage medium.
Background
At present, RTC transmission often suffers from large time delay depending on which streaming media nodes are selected, and the delay and jitter of long-distance transmission are especially large. The common approach in the industry is static path planning: a static routing table is configured, streaming media nodes are selected according to the static routes during RTC transmission, and the static routing table is updated at intervals. Alternatively, a path is searched among a limited number of selected nodes. Because of the limitations of these methods, the resulting path is not the optimal path and cannot adapt to real-time RTC path conditions to achieve minimum delay.
Disclosure of Invention
In view of the foregoing problems in the prior art, embodiments of the present application provide an RTC streaming media adaptive transmission method, apparatus, computer device, and computer-readable storage medium.
In a first aspect, the present application provides an RTC streaming media adaptive transmission method, including:
acquiring state parameters of all streaming media nodes in a target RTC transmission scene, and establishing a streaming media transmission strategy model based on a Markov decision process theory and a preset user experience quality condition;
and based on the streaming media transmission strategy model, solving the optimal transmission strategy of the target streaming media in the target RTC transmission scene through strategy iteration, and transmitting the target streaming media based on the optimal transmission strategy.
In some embodiments, the obtaining the state parameters of all streaming media nodes in the target RTC transmission scenario includes:
acquiring original state data of all streaming media nodes in a target RTC transmission scene, wherein the original state data comprises the number of users, bandwidth and CPU (Central processing Unit) resources of each streaming media node;
and normalizing the number of users, the bandwidth and the CPU resource of each streaming media node, and determining the state parameter corresponding to each streaming media node according to preset different weight coefficients.
In some embodiments, the obtaining state parameters of all streaming media nodes in a target RTC transmission scenario, and establishing a streaming media transmission policy model based on a markov decision process theory and a preset user experience quality condition includes:
defining a four-tuple (S, A, Psa, R) based on a Markov decision process theory according to the state parameters of all streaming media nodes under the target RTC transmission scene;
constructing a condition function F according to the preset user experience quality condition;
constructing an optimal state value function and an optimal action value function based on a Bellman equation;
wherein S represents the state set of all streaming media nodes, with s_i ∈ S, where s_i represents the state parameter of the i-th streaming media node in the target RTC transmission scene; A represents the action set, with a_i ∈ A, where a_i represents the action by which the streaming media node at step i selects the next streaming media node; Psa represents the state transition probability of moving from the current state s_i to the next state after taking action a_i; R represents the return function for the node state transitions of the strategy to be transmitted under path selection; and F is the return value of the state transition when the strategy to be transmitted takes the action that enters the absorbing state.
In some embodiments, the condition function F comprises:
taking the last streaming media node of the strategy to be transmitted as the absorbing state; when the strategy to be transmitted takes the action that enters the absorbing state, the return value of that state transition is 0 if the time delay of the strategy is smaller than a preset time delay threshold and its code rate is smaller than a preset code rate threshold, and -1 otherwise.
In some embodiments, based on the streaming media transmission policy model, solving an optimal transmission policy in the target RTC transmission scenario through policy iteration includes:
aiming at a plurality of strategy selections of an initial node under a target RTC transmission scene, establishing a strategy estimation formula based on a Bellman equation;
performing policy estimation on all states of the current policy according to the policy estimation formula to update a state value function of the current policy;
changing an action for the state of the initial node by the current strategy through a strategy improvement principle to enable the action value function of the current strategy to be larger than the corresponding state value function, and traversing the states and all actions of all streaming media nodes through a greedy algorithm to carry out strategy improvement to obtain a new strategy;
and performing strategy estimation and strategy improvement on the new strategy, and obtaining a maximum state value function and a corresponding optimal transmission strategy through iterative computation.
In some embodiments, the policy estimation formula is as follows:
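Assuming the standard Bellman expectation equation, a form of the policy estimation formula consistent with the symbols defined in this section is:

$$
V^{\pi}(s) = \sum_{a} \pi(a \mid s) \sum_{s'} p(s' \mid s, a)\,\bigl[\, r(s' \mid s, a) + \gamma\, V^{\pi}(s') \,\bigr]
$$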
wherein p(s'|s, π(s)) represents the probability of transitioning to state s' after executing the action a corresponding to the current strategy π(s) in the current node state s; r(s'|s, π(s)) represents the return function for transitioning to state s' after executing the action a corresponding to the current strategy π(s) in the current node state s; γ represents the discount factor; and the action a corresponding to π(s) has multiple possibilities, each denoted π(a|s).
In some embodiments, the performing policy estimation and policy improvement on the new policy by iterative computation to obtain a maximum state value function and a corresponding optimal transmission policy includes:
and when the calculated change value of the current state value function is smaller than a preset threshold value, determining that the current state value function is the maximum state value function, and determining that the strategy corresponding to the maximum state value function is the optimal transmission strategy.
In a second aspect, the present application provides an RTC streaming media adaptive transmission apparatus, including:
the modeling module is used for acquiring state parameters of all streaming media nodes in a target RTC transmission scene and establishing a streaming media transmission strategy model based on a Markov decision process theory and a preset user experience quality condition;
and the path planning module is used for solving the optimal transmission strategy of the target streaming media in the target RTC transmission scene through strategy iteration based on the streaming media transmission strategy model, and transmitting the target streaming media based on the optimal transmission strategy.
In a third aspect, an embodiment of the present application provides a computer device, including:
at least one processor, at least one memory, and a communication interface; wherein,
the processor, the memory and the communication interface are communicated with each other;
the memory stores program instructions executable by the processor, and the processor calls the program instructions to execute the RTC streaming media adaptive transmission method provided by any of the various implementations of the first aspect.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium storing computer instructions that, when executed on a computer device, cause the computer device to perform the RTC streaming media adaptive transmission method provided in any one of the various implementations of the first aspect.
In this embodiment, user quality-of-experience conditions are added on the basis of Markov decision process theory to construct the streaming media transmission strategy model, which makes it easier to find effective strategies, satisfies the user's real-time experience and viewing quality, and reduces time delay. The optimal transmission strategy of the target streaming media in the target RTC transmission scene is then solved through strategy iteration based on this model, which improves the adaptability of RTC transmission and reduces transmission delay.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, a brief description will be given below to the drawings required for the description of the embodiments or the prior art, and it is obvious that the drawings in the following description are some embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
Fig. 1 is a flowchart of an RTC streaming media adaptive transmission method according to an embodiment of the present application;
Fig. 2 is a flowchart of step S101 of an RTC streaming media adaptive transmission method according to an embodiment of the present application;
Fig. 3 is a flowchart of step S101 of an RTC streaming media adaptive transmission method according to another embodiment of the present application;
Fig. 4 is a flowchart of step S102 of an RTC streaming media adaptive transmission method according to an embodiment of the present application;
Fig. 5 is a schematic diagram of an RTC streaming media adaptive transmission apparatus according to an embodiment of the present application;
Fig. 6 is a schematic diagram of a computer device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application. In addition, the technical features of the various embodiments or individual embodiments provided in this application may be arbitrarily combined with each other to form a feasible technical solution, and such combination is not limited by the sequence of steps and/or the structural composition mode, but must be based on the realization of the capability of a person skilled in the art, and when the technical solution combination is contradictory or cannot be realized, the technical solution combination should be considered to be absent and not within the protection scope of the present application.
Referring to fig. 1, the RTC streaming media adaptive transmission method according to the embodiment of the present application may include the following steps:
S101, acquiring state parameters of all streaming media nodes in a target RTC transmission scene, and establishing a streaming media transmission strategy model based on a Markov decision process theory and a preset user experience quality condition.
It should be noted that the target RTC transmission scenario may be a streaming media transmission scenario based on an RTC (Real-Time Communication) protocol, such as live streaming, in which the media to be transmitted is sent from the source node to the target node through the RTC protocol. The preset user experience quality condition refers to QoE (quality of experience) index conditions, which include a delay threshold and a code rate threshold. The streaming media node, source node, target node, and initial node mentioned in this embodiment all refer to streaming media servers.
In some embodiments, referring to fig. 2, the obtaining of the state parameters of all streaming media nodes in the target RTC transmission scenario in step S101 may include:
S201, acquiring original state data of all streaming media nodes in a target RTC transmission scene, wherein the original state data comprises the number of users, bandwidth and CPU (central processing unit) resources of each streaming media node;
S202, normalizing the number of users, the bandwidth and the CPU resource of each streaming media node, and determining the state parameter corresponding to each streaming media node according to different preset weight coefficients.
For the parameters after normalization of the number of users, the bandwidth, and the CPU resources, different weight coefficients may be set according to actual conditions, such as 0.3, and 0.4, or 0.4, and 0.2, and the state parameter corresponding to each streaming media node is calculated.
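Steps S201 and S202 can be sketched as follows. This is a minimal illustration, not the method's actual implementation: the text does not say which normalization is used, so min-max scaling is an assumption, and the field names (`users`, `bandwidth`, `cpu`), the node names, the sample figures, and the weight triple `(0.3, 0.3, 0.4)` are all hypothetical.

```python
from typing import Dict, List, Tuple


def min_max_normalize(values: List[float]) -> List[float]:
    """Scale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def node_state_parameters(
    raw: Dict[str, Dict[str, float]],
    weights: Tuple[float, float, float] = (0.3, 0.3, 0.4),  # assumed example weights
) -> Dict[str, float]:
    """Combine normalized users, bandwidth and CPU into one state parameter per node."""
    names = list(raw)
    users = min_max_normalize([raw[n]["users"] for n in names])
    bandwidth = min_max_normalize([raw[n]["bandwidth"] for n in names])
    cpu = min_max_normalize([raw[n]["cpu"] for n in names])
    w_users, w_bandwidth, w_cpu = weights
    return {
        n: w_users * u + w_bandwidth * b + w_cpu * c
        for n, u, b, c in zip(names, users, bandwidth, cpu)
    }


# Hypothetical raw state data for three streaming media nodes.
raw_state = {
    "node_a": {"users": 120, "bandwidth": 800.0, "cpu": 0.35},
    "node_b": {"users": 40, "bandwidth": 200.0, "cpu": 0.10},
    "node_c": {"users": 80, "bandwidth": 500.0, "cpu": 0.60},
}
state_parameters = node_state_parameters(raw_state)
```

Whether a larger state parameter should mean a more loaded or a more attractive node is left open by the text, so the sketch only computes the weighted sum.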
In some embodiments, referring to fig. 3, step S101 may include the following steps:
S301, defining a four-tuple (S, A, Psa, R) based on a Markov decision process theory according to the state parameters of all streaming media nodes in the target RTC transmission scene;
S302, constructing a condition function F according to the preset user experience quality condition;
S303, constructing an optimal state value function and an optimal action value function based on the Bellman equation;
wherein S represents the state set of all streaming media nodes, with s_i ∈ S, where s_i represents the state parameter of the streaming media node at step i in the target RTC transmission scene; A represents the action set, with a_i ∈ A, where a_i represents the action by which the streaming media node at step i selects the next streaming media node; Psa represents the state transition probability of moving from the current state s_i to the next state after taking action a_i; R represents the return function for the node state transitions of the strategy to be transmitted under path selection; and F is the return value of the state transition when the strategy to be transmitted takes the action that enters the absorbing state.
In some embodiments, the optimal state value function V*(s) and the optimal action value function Q*(s, a) are expressed as follows:
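Assuming the standard Bellman optimality form, expressions consistent with the symbols defined in this section are given below. The per-step return r_t is notation introduced here for illustration, and the right-hand sides are written with V* and Q* as in the textbook form.

$$
V^{*}(s) = \max_{\pi} \mathbb{E}\Bigl[\sum_{t=0}^{\infty} \gamma^{t} r_{t} \,\Bigm|\, s_{0} = s, \pi\Bigr]
         = \max_{a} \sum_{s'} p(s' \mid s, a)\,\bigl[\, r(s' \mid s, a) + \gamma\, V^{*}(s') \,\bigr]
$$

$$
Q^{*}(s, a) = \max_{\pi} \mathbb{E}\Bigl[\sum_{t=0}^{\infty} \gamma^{t} r_{t} \,\Bigm|\, s_{0} = s, a_{0} = a, \pi\Bigr]
            = \sum_{s'} p(s' \mid s, a)\,\bigl[\, r(s' \mid s, a) + \gamma \max_{a'} Q^{*}(s', a') \,\bigr]
$$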
In the formula, p(s'|s, π(s)) represents the probability of transitioning to state s' after the action a corresponding to the current strategy π(s) is executed in the current node state s; r(s'|s, π(s)) represents the return function for transitioning to state s' after the action a corresponding to the current strategy π(s) is executed in the current node state s, and may also be written as r(s'|s, a); γ represents the discount factor; s_0 represents the initial state and a_0 the initial action; V^π(s') represents the state value function of the state s' of the subsequent streaming media node; and Q*(s', a') represents the maximum action value function after action a' is performed in the state s' of the subsequent streaming media node. The above expressions for the optimal state value function V*(s) and the optimal action value function Q*(s, a) are called the Bellman optimality equations.
Based on Markov decision process theory, the goal is to find a strategy π that maximizes the state value function and the action value function for any initial state s_0.
In some embodiments, the condition function F comprises: taking the last streaming media node of the strategy to be transmitted as the absorbing state; when the strategy to be transmitted takes the action that enters the absorbing state, the return value of that state transition is 0 if the time delay of the strategy is smaller than a preset time delay threshold and its code rate is smaller than a preset code rate threshold, and -1 otherwise.
It should be noted that, when searching for a strategy, this embodiment incorporates two QoE index conditions, namely the delay threshold and the code rate threshold, into the streaming media transmission strategy model and uses them to determine the return value of the state transition when the strategy to be transmitted enters the absorbing state. A return value of -1 indicates that the strategy to be transmitted is invalid, and a return value of 0 indicates that it is valid. As a result, the optimal transmission strategy that is found has low delay and can satisfy the user's real-time experience and viewing quality.
In a specific embodiment, for a domestic RTC transmission scenario, the delay threshold may be set to 90 to 100 ms and the code rate threshold to 1.8 to 2.2 Mbps; for an overseas RTC transmission scenario, the delay threshold may be set to 200 to 300 ms and the code rate threshold to 1.8 to 2.2 Mbps. The specific values of the delay threshold and the code rate threshold may be set according to actual conditions.
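The condition function F described above reduces to a two-threshold check. The sketch below is illustrative only: the function name, the parameter names, the units (milliseconds and Mbps), and the default thresholds are assumptions chosen from within the ranges quoted for the domestic scenario.

```python
def condition_reward(
    delay_ms: float,
    code_rate_mbps: float,
    delay_threshold_ms: float = 100.0,    # assumed value within the quoted 90-100 ms range
    code_rate_threshold_mbps: float = 2.0,  # assumed value within the quoted 1.8-2.2 range
) -> int:
    """Return value F for the transition that enters the absorbing state.

    0 marks a valid strategy (both measurements below their thresholds),
    -1 marks an invalid strategy.
    """
    within_delay = delay_ms < delay_threshold_ms
    within_code_rate = code_rate_mbps < code_rate_threshold_mbps
    return 0 if (within_delay and within_code_rate) else -1


# A path with 80 ms delay and a 1.5 Mbps code rate is accepted; one with 120 ms is not.
valid_path_reward = condition_reward(80.0, 1.5)
slow_path_reward = condition_reward(120.0, 1.5)
```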
S102, based on the streaming media transmission strategy model, solving an optimal transmission strategy of the target streaming media in the target RTC transmission scene through strategy iteration, and transmitting the target streaming media based on the optimal transmission strategy; the target streaming media can be an audio-video resource for live viewing by a user.
In some embodiments, referring to fig. 4, step S102 may include:
S401, for the multiple strategy selections of an initial node in the target RTC transmission scene, establishing a strategy estimation formula based on the Bellman equation;
S402, performing strategy estimation on all states of the current strategy according to the strategy estimation formula to update the state value function of the current strategy;
S403, changing an action for the state of the initial node under the current strategy according to a strategy improvement principle, so that the action value function of the current strategy is larger than the corresponding state value function, and traversing the states and all actions of all streaming media nodes by a greedy algorithm to perform strategy improvement and obtain a new strategy;
S404, performing strategy estimation and strategy improvement on the new strategy, and obtaining a maximum state value function and a corresponding optimal transmission strategy through iterative calculation.
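Steps S401 to S404 follow the shape of textbook policy iteration, which the sketch below illustrates on a toy graph. It is not the patent's implementation: the node names, the deterministic transitions, the use of negative normalized link delay as the return, and the discount factor 0.9 are all assumptions made for the example, and the terminal return F is taken to be 0 (a valid path).

```python
from typing import Dict, List, Tuple

# transitions[state][action] -> list of (probability, next_state, return)
Transitions = Dict[str, Dict[str, List[Tuple[float, str, float]]]]


def q_value(state: str, action: str, values: Dict[str, float],
            transitions: Transitions, gamma: float) -> float:
    """Expected return of taking `action` in `state`, then following `values`."""
    return sum(p * (r + gamma * values[nxt]) for p, nxt, r in transitions[state][action])


def evaluate_policy(policy: Dict[str, str], transitions: Transitions,
                    gamma: float, tol: float = 1e-9) -> Dict[str, float]:
    """S402: sweep all states until the state values stop changing."""
    values = {s: 0.0 for s in transitions}  # absorbing states keep value 0
    while True:
        delta = 0.0
        for s in policy:
            new_v = q_value(s, policy[s], values, transitions, gamma)
            delta = max(delta, abs(new_v - values[s]))
            values[s] = new_v  # in-place overwrite of the old value
        if delta < tol:
            return values


def policy_iteration(transitions: Transitions, gamma: float = 0.9):
    """S401-S404: alternate evaluation and greedy improvement until stable."""
    policy = {s: next(iter(acts)) for s, acts in transitions.items() if acts}
    while True:
        values = evaluate_policy(policy, transitions, gamma)
        stable = True
        for s in policy:  # S403: greedy improvement over every state and action
            current = q_value(s, policy[s], values, transitions, gamma)
            best = max(transitions[s],
                       key=lambda a: q_value(s, a, values, transitions, gamma))
            if q_value(s, best, values, transitions, gamma) > current + 1e-12:
                policy[s] = best
                stable = False
        if stable:
            return policy, values


# Hypothetical topology: source -> relay_a or relay_b -> target (absorbing state).
example: Transitions = {
    "source": {"relay_b": [(1.0, "relay_b", -0.2)], "relay_a": [(1.0, "relay_a", -0.5)]},
    "relay_a": {"target": [(1.0, "target", -0.1)]},
    "relay_b": {"target": [(1.0, "target", -0.6)]},
    "target": {},
}
best_policy, best_values = policy_iteration(example)
```

In this toy case the first hop through `relay_b` looks cheaper locally, but the improvement step switches the source to `relay_a` because the total discounted return of that path is higher.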
In some embodiments, the policy estimation formula is specified as follows:
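Assuming the standard Bellman expectation equation, the policy estimation formula can be written with the symbols of this section as follows. The second line is the special case of a deterministic strategy, where π(s) names a single action.

$$
V^{\pi}(s) = \sum_{a} \pi(a \mid s) \sum_{s'} p(s' \mid s, a)\,\bigl[\, r(s' \mid s, a) + \gamma\, V^{\pi}(s') \,\bigr]
$$

$$
V^{\pi}(s) = \sum_{s'} p\bigl(s' \mid s, \pi(s)\bigr)\,\bigl[\, r\bigl(s' \mid s, \pi(s)\bigr) + \gamma\, V^{\pi}(s') \,\bigr]
$$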
The strategy estimation formula means the following: in the current node state s, if the next streaming media node is selected according to the current strategy π, the action a corresponding to π(s) has multiple possibilities, each denoted π(a|s). In the formula, p(s'|s, π(s)) represents the probability of transitioning to state s' after the action a corresponding to the current strategy π(s) is executed in the current node state s; r(s'|s, π(s)) represents the return function for transitioning to state s' after the action a corresponding to the current strategy π(s) is executed in the current node state s; and γ represents the discount factor.
In this embodiment, an iterative method is used for policy estimation, and the (k + 1) th iteration may be represented as:
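Assuming the standard iterative policy evaluation backup, the (k+1)-th iteration can be sketched with the symbols of this section as:

$$
V_{k+1}(s) = \sum_{a} \pi(a \mid s) \sum_{s'} p(s' \mid s, a)\,\bigl[\, r(s' \mid s, a) + \gamma\, V_{k}(s') \,\bigr]
$$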
In the iterative process, each iteration sweeps every state once. In the (k+1)-th iteration, a newly obtained larger V^π(s) can be assigned directly to V_{k+1}(s). Specifically, an array may be used to store the state value function of each state, and whenever a new, larger state value is obtained, it overwrites the old, smaller one. The array may take the form [V_{k+1}(s_1), V_{k+1}(s_2), V_{k+1}(s_3), ..., V_{k+1}(s_n)]. Through the iterative process, the strategy estimation formula is driven steadily toward convergence.
In some embodiments, step S404 includes: when the calculated change in the current state value function is smaller than a preset threshold, determining that the current state value function is the maximum state value function and that the strategy corresponding to it is the optimal transmission strategy, so that the iteration can exit and the amount of computation is reduced. In this embodiment, the change in the current state value function may refer to a rate of change, and the threshold may be set according to actual conditions, for example 1% to 5%.
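The exit condition can be sketched as a relative-change test between two successive value arrays. Reading "rate of change" as the largest per-state relative change is an assumption of this example, as are the function names and the default threshold of 1%.

```python
from typing import Sequence


def max_relative_change(old_values: Sequence[float], new_values: Sequence[float]) -> float:
    """Largest per-state relative change between two successive value arrays."""
    worst = 0.0
    for old, new in zip(old_values, new_values):
        denominator = max(abs(old), 1e-12)  # guard against division by zero
        worst = max(worst, abs(new - old) / denominator)
    return worst


def has_converged(old_values: Sequence[float], new_values: Sequence[float],
                  threshold: float = 0.01) -> bool:
    """True when the change rate falls below the preset threshold (1% assumed here)."""
    return max_relative_change(old_values, new_values) < threshold


# Hypothetical successive value arrays [V(s_1), V(s_2)].
settled = has_converged([-0.60, -0.10], [-0.597, -0.10])
still_moving = has_converged([-0.60, -0.10], [-0.50, -0.10])
```

A looser threshold such as 5% exits earlier at the cost of a less precise value function, which is the trade-off the text alludes to when it mentions reducing the amount of computation.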
In this embodiment, user quality-of-experience conditions are added on the basis of Markov decision process theory to construct the streaming media transmission strategy model, which makes it easier to find effective strategies, satisfies the user's real-time experience and viewing quality, and reduces time delay. The optimal transmission strategy of the target streaming media in the target RTC transmission scene is then solved through strategy iteration based on this model, which improves the adaptability of RTC transmission and reduces transmission delay.
The implementation basis of the various embodiments of the present application is realized by a programmed process performed by a device having a processor function. Therefore, in engineering practice, the technical solutions and functions thereof of the embodiments of the present application can be packaged into various modules.
Based on this reality, on the basis of the foregoing embodiments, embodiments of the present application provide an apparatus for adaptive RTC streaming, where the apparatus is configured to execute the method for adaptive RTC streaming in the foregoing method embodiments. Referring to fig. 5, the apparatus for adaptive transmission of RTC streaming media includes:
the modeling module 501 is configured to acquire state parameters of all streaming media nodes in a target RTC transmission scene, and establish a streaming media transmission policy model based on a markov decision process theory and preset user experience quality conditions;
and the path planning module 502 is configured to solve an optimal transmission strategy of the target streaming media in the target RTC transmission scene through strategy iteration based on the streaming media transmission strategy model, and transmit the target streaming media based on the optimal transmission strategy.
For specific limitations of each module of the RTC streaming media adaptive transmission apparatus, reference may be made to the above limitations on the RTC streaming media adaptive transmission method, which is not described herein again. In addition, it should be noted that all or part of the modules in the RTC streaming media adaptive transmission apparatus may be implemented by software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
Referring to fig. 6, the present embodiment further provides a computer device, which may be a computing device such as a mobile terminal, a desktop computer, a notebook, a palmtop computer, and a server. The computer device comprises a processor 601, a memory 602 and a display 603. FIG. 6 shows some of the components of a computer device, but it is understood that not all of the shown components are required to be implemented, and that more or fewer components may be implemented instead.
The memory 602 may be, in some embodiments, an internal storage unit of the computer device, such as a hard disk or a memory of the computer device. The memory 602 may also be an external storage device of the computer device in other embodiments, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), etc. provided on the computer device. Further, the memory 602 may also include both internal and external storage units of the computer device. The memory 602 is used for storing application software installed on the computer device and various data, such as program codes for installing the computer device. The memory 602 may also be used to temporarily store data that has been output or is to be output. In one embodiment, the memory 602 stores an RTC streaming adaptive transmission program 604.
The processor 601 may be a Central Processing Unit (CPU), microprocessor or other data Processing chip in some embodiments, and is used for executing program codes stored in the memory 602 or Processing data, such as executing an RTC streaming media adaptive transmission method.
The display 603 may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch panel, or the like in some embodiments. The display 603 is used for displaying information at the computer device and for displaying a visualized user interface. The components 601 and 603 of the computer device communicate with each other via a system bus.
In one embodiment, when the processor 601 executes the RTC streaming media adaptive transmission program 604 in the memory 602, the following steps are implemented:
acquiring state parameters of all streaming media nodes in a target RTC transmission scene, and establishing a streaming media transmission strategy model based on a Markov decision process theory and a preset user experience quality condition;
and based on the streaming media transmission strategy model, solving the optimal transmission strategy of the target streaming media in the target RTC transmission scene through strategy iteration, and transmitting the target streaming media based on the optimal transmission strategy.
The present embodiment further provides a computer-readable storage medium, on which an RTC streaming media adaptive transmission program is stored, and when executed by a processor, the RTC streaming media adaptive transmission program implements the following steps:
acquiring state parameters of all streaming media nodes in a target RTC transmission scene, and establishing a streaming media transmission strategy model based on a Markov decision process theory and a preset user experience quality condition;
and based on the streaming media transmission strategy model, solving the optimal transmission strategy of the target streaming media in the target RTC transmission scene through strategy iteration, and transmitting the target streaming media based on the optimal transmission strategy.
Those skilled in the art will understand that all or part of the processes of the methods in the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can carry out the processes of the method embodiments described above.
Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include Read-Only Memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), Rambus Direct RAM (RDRAM), Direct Rambus Dynamic RAM (DRDRAM), and Rambus Dynamic RAM (RDRAM).
It will be evident to those skilled in the art that the present application is not limited to the details of the foregoing illustrative embodiments, and that the present application may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the application being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.
Claims (10)
1. An adaptive transmission method for RTC streaming media, comprising:
acquiring state parameters of all streaming media nodes in a target RTC transmission scene, and establishing a streaming media transmission strategy model based on a Markov decision process theory and a preset user experience quality condition;
and based on the streaming media transmission strategy model, solving the optimal transmission strategy of the target streaming media in the target RTC transmission scene through strategy iteration, and transmitting the target streaming media based on the optimal transmission strategy.
2. The RTC streaming media adaptive transmission method according to claim 1, wherein the acquiring the state parameters of all streaming media nodes in the target RTC transmission scenario includes:
acquiring original state data of all streaming media nodes in the target RTC transmission scene, wherein the original state data comprises the number of users, the bandwidth and the central processing unit (CPU) resources of each streaming media node;
and normalizing the number of users, the bandwidth and the CPU resources of each streaming media node, and determining the state parameter corresponding to each streaming media node according to different preset weight coefficients.
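The normalization and weighting step of claim 2 can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the min-max normalization scheme, the weight values in `WEIGHTS`, the field names `users`, `bandwidth` and `cpu`, and the sample node data. The claim itself only states that the three quantities are normalized and combined with different preset weight coefficients.

```python
from typing import Dict, List

# Hypothetical preset weights; the claim only says the coefficients are preset and differ.
WEIGHTS: Dict[str, float] = {"users": 0.4, "bandwidth": 0.3, "cpu": 0.3}


def min_max(values: List[float]) -> List[float]:
    """Scale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def state_parameters(nodes: List[Dict[str, float]],
                     weights: Dict[str, float] = WEIGHTS) -> List[float]:
    """Return one weighted, normalized state parameter per node."""
    # Normalize each metric across all nodes (assumed scheme: min-max).
    normalized = {k: min_max([n[k] for n in nodes]) for k in weights}
    # Combine the normalized metrics of node i with the preset weights.
    return [sum(weights[k] * normalized[k][i] for k in weights)
            for i in range(len(nodes))]


# Made-up sample data for two streaming media nodes.
nodes = [
    {"users": 10, "bandwidth": 100.0, "cpu": 0.2},
    {"users": 30, "bandwidth": 50.0, "cpu": 0.8},
]
params = state_parameters(nodes)
```

Whether a larger value of a given metric should raise or lower the state parameter is a design choice the claim leaves open; the sketch simply adds the weighted terms.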
3. The RTC streaming media adaptive transmission method according to claim 1, wherein the acquiring the state parameters of all streaming media nodes in the target RTC transmission scene and establishing a streaming media transmission policy model based on a Markov decision process theory and a preset user experience quality condition comprises:
defining a four-tuple (S, A, Psa, R) based on a Markov decision process theory according to the state parameters of all streaming media nodes under the target RTC transmission scene;
constructing a condition function F according to the preset user experience quality condition;
constructing an optimal state value function and an optimal action value function based on a Bellman equation;
wherein S represents the state set of all streaming media nodes, with s_i ∈ S, where s_i represents the state parameter of the i-th streaming media node in the target RTC transmission scene; A represents the set of actions, with a_i ∈ A, where a_i represents the action by which the streaming media node in the i-th step selects the next streaming media node; Psa represents the state transition probability of moving to the next state after taking action a_i in the current state s_i; R represents the return function for the node state transitions of the strategy to be transmitted, based on path selection; and F is the return value of the state transition of the strategy to be transmitted when the strategy takes the action that enters the absorption state.
4. The RTC streaming adaptive transmission method according to claim 3, characterized in that the conditional function F comprises:
taking the last streaming media node of the strategy to be transmitted as the absorption state; when the strategy to be transmitted takes the action that enters the absorption state, the return value of that state transition is 0 if the time delay of the strategy to be transmitted is smaller than a preset time delay threshold and the code rate of the strategy to be transmitted is smaller than a preset code rate threshold, and is -1 otherwise.
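The condition function F of claim 4 reduces to a two-threshold check, sketched below. The function name, the parameter names and the units (milliseconds, kbit/s) are assumptions for illustration; the return values 0 and -1 and the two "smaller than" comparisons follow the claim wording.

```python
def condition_reward(delay_ms: float, bitrate_kbps: float,
                     delay_threshold_ms: float,
                     bitrate_threshold_kbps: float) -> int:
    """Return value F for the transition that enters the absorption state."""
    # Both quantities must be strictly below their preset thresholds.
    if delay_ms < delay_threshold_ms and bitrate_kbps < bitrate_threshold_kbps:
        return 0
    return -1


# Hypothetical thresholds: 200 ms delay, 2000 kbit/s code rate.
ok = condition_reward(120.0, 1500.0, 200.0, 2000.0)
too_slow = condition_reward(250.0, 1500.0, 200.0, 2000.0)
```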
5. The RTC streaming media adaptive transmission method according to claim 2, wherein the step of solving the optimal transmission strategy in the target RTC transmission scenario through strategy iteration based on the streaming media transmission strategy model comprises:
aiming at a plurality of strategy selections of an initial node under a target RTC transmission scene, establishing a strategy estimation formula based on a Bellman equation;
performing policy estimation on all states of the current policy according to the policy estimation formula to update a state value function of the current policy;
changing, according to a strategy improvement principle, the action that the current strategy takes in the state of the initial node, so that the action value function of the current strategy is larger than the corresponding state value function, and traversing the states of all streaming media nodes and all actions with a greedy algorithm to carry out strategy improvement and obtain a new strategy;
and performing strategy estimation and strategy improvement on the new strategy, and obtaining a maximum state value function and a corresponding optimal transmission strategy through iterative computation.
6. The RTC streaming adaptive transmission method according to claim 5, characterized in that the policy estimation formula comprises:
p(s' | s, π(s)) represents the probability of transferring to the state s' after the action a corresponding to the current strategy π(s) is executed from the current node state s; r(s' | s, π(s)) represents the return function for transferring to the state s' after the action a corresponding to the current strategy π(s) is executed from the current node state s; γ represents a discount factor; there are multiple possibilities for the action a corresponding to π(s), each possibility being denoted as π(a | s).
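For orientation, the standard Bellman expectation equation used for policy evaluation can be written with the symbols defined in this claim. This is the textbook form and is an assumption here; the claimed formula may differ in detail. V^π denotes the state value function of the current strategy π, a symbol introduced only for this illustration.

```latex
V^{\pi}(s) \;=\; \sum_{a} \pi(a \mid s) \sum_{s'} p(s' \mid s, a)\,
\bigl[\, r(s' \mid s, a) + \gamma\, V^{\pi}(s') \,\bigr]
```

When the strategy is deterministic, so that π(a | s) = 1 for the single action a = π(s), the outer sum disappears and the update uses p(s' | s, π(s)) and r(s' | s, π(s)) directly.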
7. The RTC streaming media adaptive transmission method according to claim 5, wherein the performing strategy estimation and strategy improvement on the new strategy, and obtaining a maximum state value function and a corresponding optimal transmission strategy through iterative computation, comprises:
and when the calculated change value of the current state value function is smaller than a preset threshold value, determining that the current state value function is the maximum state value function, and determining that the strategy corresponding to the maximum state value function is the optimal transmission strategy.
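The loop of claims 5 to 7 (strategy estimation, greedy strategy improvement, and a stopping threshold on the change of the state value function) can be sketched as textbook policy iteration. This is a minimal sketch under stated assumptions, not the claimed implementation: the four-node topology in `TOY`, its link rewards, the discount factor `gamma`, the threshold `theta` and all function names are hypothetical. Transitions are deterministic for brevity, and node 3 plays the role of the absorption state.

```python
from typing import Dict, List, Tuple

# transitions[s][a] -> list of (probability, next_state, reward)
Transitions = Dict[int, Dict[int, List[Tuple[float, int, float]]]]


def q_value(tr: Transitions, values: Dict[int, float],
            s: int, a: int, gamma: float) -> float:
    """Action value of taking action a in state s under the current values."""
    return sum(p * (r + gamma * values[s2]) for p, s2, r in tr[s][a])


def evaluate_policy(tr, policy, values, gamma, theta):
    """Strategy estimation: sweep until the largest value change is below theta."""
    while True:
        delta = 0.0
        for s in policy:
            new_v = q_value(tr, values, s, policy[s], gamma)
            delta = max(delta, abs(new_v - values[s]))
            values[s] = new_v
        if delta < theta:
            return


def improve_policy(tr, policy, values, gamma) -> bool:
    """Greedy strategy improvement; returns True when no action changed."""
    stable = True
    for s in policy:
        best = max(tr[s], key=lambda a: q_value(tr, values, s, a, gamma))
        current = q_value(tr, values, s, policy[s], gamma)
        if q_value(tr, values, s, best, gamma) > current + 1e-12:
            policy[s] = best
            stable = False
    return stable


def policy_iteration(tr: Transitions, terminal: int,
                     gamma: float = 0.9, theta: float = 1e-6):
    policy = {s: next(iter(actions)) for s, actions in tr.items()}
    values = {s: 0.0 for s in tr}
    values[terminal] = 0.0  # the absorption state has no further return
    while True:
        evaluate_policy(tr, policy, values, gamma, theta)
        if improve_policy(tr, policy, values, gamma):
            return policy, values


# Hypothetical topology: an action is the id of the next node, rewards are
# negative link costs, and node 3 is the last (absorbing) node.
TOY: Transitions = {
    0: {2: [(1.0, 2, -4.0)], 1: [(1.0, 1, -1.0)]},
    1: {3: [(1.0, 3, -5.0)], 2: [(1.0, 2, -1.0)]},
    2: {3: [(1.0, 3, -1.0)]},
}
best_policy, best_values = policy_iteration(TOY, terminal=3)
```

On this toy graph the iteration settles on the path 0 → 1 → 2 → 3. The threshold `theta` stands in for the preset threshold of claim 7 inside the estimation step, while the outer loop ends once greedy improvement no longer changes any action.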
8. An apparatus for adaptive transmission of RTC streaming media, comprising:
the modeling module is used for acquiring state parameters of all streaming media nodes in a target RTC transmission scene and establishing a streaming media transmission strategy model based on a Markov decision process theory and a preset user experience quality condition;
and the path planning module is used for solving the optimal transmission strategy of the target streaming media in the target RTC transmission scene through strategy iteration based on the streaming media transmission strategy model, and transmitting the target streaming media based on the optimal transmission strategy.
9. A computer device, comprising:
at least one processor, at least one memory, and a communication interface; wherein,
the processor, the memory and the communication interface are communicated with each other;
the memory stores program instructions executable by the processor, the processor calling the program instructions to perform the RTC streaming adaptive transmission method of any of claims 1 to 7.
10. A computer readable storage medium storing computer instructions which, when run on a computer device, cause the computer device to perform the RTC streaming adaptive transmission method of any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210471454.1A CN114866461A (en) | 2022-04-28 | 2022-04-28 | RTC (real time clock) streaming media self-adaptive transmission method, device, equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114866461A (en) | 2022-08-05 |
Family
ID=82636365
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210471454.1A Pending CN114866461A (en) | 2022-04-28 | 2022-04-28 | RTC (real time clock) streaming media self-adaptive transmission method, device, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114866461A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116668364A (en) * | 2022-09-29 | 2023-08-29 | 中兴通讯股份有限公司 | Route planning method and device for real-time audio and video network |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101677289A (en) * | 2008-09-17 | 2010-03-24 | 华为技术有限公司 | Method and device for optimizing route |
CN103326946A (en) * | 2013-07-02 | 2013-09-25 | 中国(南京)未来网络产业创新中心 | SVC streaming media transmission optimization method based on OpenFlow |
CN108391143A (en) * | 2018-04-24 | 2018-08-10 | 南京邮电大学 | A kind of wireless network transmission of video self-adaptation control method based on Q study |
CN109511123A (en) * | 2018-12-27 | 2019-03-22 | 沈阳航空航天大学 | A kind of software definition vehicle network adaptive routing method based on temporal information |
CN109587519A (en) * | 2018-12-28 | 2019-04-05 | 南京邮电大学 | Heterogeneous network Multipath Video control system and method based on Q study |
CN109743600A (en) * | 2019-01-15 | 2019-05-10 | 国网河南省电力公司 | Based on wearable live O&M adaptive video stream transmission rate control |
CN111107602A (en) * | 2019-12-24 | 2020-05-05 | 杭州电子科技大学 | Safe routing method with minimum energy consumption and time delay weighting for wireless body area network |
CN112153716A (en) * | 2019-09-24 | 2020-12-29 | 中兴通讯股份有限公司 | Transmission path selection method and device and storage medium |
CN113472671A (en) * | 2020-03-30 | 2021-10-01 | 中国电信股份有限公司 | Method and device for determining multicast route and computer readable storage medium |
CN114124823A (en) * | 2021-10-18 | 2022-03-01 | 西安电子科技大学 | Self-adaptive routing method, system and equipment oriented to high-dynamic network topology |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20220805 |