CN114866461A - RTC (real-time communication) streaming media adaptive transmission method, apparatus, device and storage medium - Google Patents

RTC (real-time communication) streaming media adaptive transmission method, apparatus, device and storage medium

Info

Publication number
CN114866461A
Authority
CN
China
Prior art keywords
strategy
streaming media
state
transmission
rtc
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210471454.1A
Other languages
Chinese (zh)
Inventor
田昌
刘莉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jitter Technology Shenzhen Co ltd
Original Assignee
Jitter Technology Shenzhen Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/12 Shortest path evaluation
    • H04L45/124 Shortest path evaluation using a combination of metrics
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computing arrangements based on specific mathematical models
    • G06N7/01 Probabilistic graphical models, e.g. probabilistic networks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/80 Responding to QoS
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/70 Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Optimization (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Algebra (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The application provides an RTC streaming media adaptive transmission method, apparatus, device, and storage medium. The method comprises: acquiring state parameters of all streaming media nodes in a target RTC transmission scene, and establishing a streaming media transmission strategy model based on Markov decision process theory and a preset user experience quality condition; and, based on the streaming media transmission strategy model, solving for the optimal transmission strategy of the target streaming media in the target RTC transmission scene through strategy iteration. The method and apparatus search for the optimal solution from the perspective of intelligent dynamic programming, so as to achieve the lowest delay and the best transmission effect.

Description

RTC (real-time communication) streaming media adaptive transmission method, apparatus, device and storage medium
Technical Field
The present application relates to the field of communications technologies, and in particular, to an RTC streaming media adaptive transmission method and apparatus, a computer device, and a computer-readable storage medium.
Background
At present, RTC transmission often suffers large delay because of the streaming media nodes selected, and delay and jitter are especially large for long-distance transmission. The common approach in the industry is static path planning: a static routing table is configured, streaming media nodes are selected according to the static routes during RTC transmission, and the static routing table is updated at intervals. Alternatively, a path is searched for among a limited number of selected nodes. Because of the limitations of these methods, the resulting path is not the optimal path, and it cannot adapt to the real-time path conditions of the RTC so as to minimize delay.
Disclosure of Invention
In view of the foregoing problems in the prior art, embodiments of the present application provide an RTC streaming media adaptive transmission method, apparatus, computer device, and computer-readable storage medium.
In a first aspect, the present application provides an RTC streaming media adaptive transmission method, including:
acquiring state parameters of all streaming media nodes in a target RTC transmission scene, and establishing a streaming media transmission strategy model based on a Markov decision process theory and a preset user experience quality condition;
and based on the streaming media transmission strategy model, solving the optimal transmission strategy of the target streaming media in the target RTC transmission scene through strategy iteration, and transmitting the target streaming media based on the optimal transmission strategy.
In some embodiments, the obtaining the state parameters of all streaming media nodes in the target RTC transmission scenario includes:
acquiring original state data of all streaming media nodes in a target RTC transmission scene, wherein the original state data comprises the number of users, bandwidth and CPU (Central processing Unit) resources of each streaming media node;
and normalizing the number of users, the bandwidth and the CPU resource of each streaming media node, and determining the state parameter corresponding to each streaming media node according to preset different weight coefficients.
In some embodiments, the obtaining state parameters of all streaming media nodes in a target RTC transmission scenario, and establishing a streaming media transmission policy model based on a Markov decision process theory and a preset user experience quality condition includes:
defining a four-tuple (S, A, Psa, R) based on a Markov decision process theory according to the state parameters of all streaming media nodes under the target RTC transmission scene;
constructing a condition function F according to the preset user experience quality condition;
constructing an optimal state value function and an optimal action value function based on a Bellman equation;
wherein S denotes the state set of all streaming media nodes, with s_i ∈ S, where s_i denotes the state parameter of the i-th streaming media node in the target RTC transmission scene; A denotes the action set, with a_i ∈ A, where a_i denotes the action by which the streaming media node at step i selects the next streaming media node; Psa denotes the probability of transitioning to the next state after taking action a_i in the current state s_i; R denotes the return function for the node-state transitions that the strategy to be transmitted makes through path selection; and F is the return value of the state transition when the strategy to be transmitted takes the action that enters the absorption state.
In some embodiments, the condition function F is defined as follows:
the last streaming media node of the strategy to be transmitted is taken as the absorption state; when the strategy to be transmitted takes the action that enters the absorption state, the return value of that state transition is 0 if the delay of the strategy is smaller than a preset delay threshold and its code rate is smaller than a preset code rate threshold, and -1 otherwise.
In some embodiments, based on the streaming media transmission policy model, solving an optimal transmission policy in the target RTC transmission scenario through policy iteration includes:
for the multiple strategy choices available at the initial node in the target RTC transmission scene, establishing a policy estimation formula based on the Bellman equation;
performing policy estimation on all states of the current strategy according to the policy estimation formula, so as to update the state value function of the current strategy;
according to the strategy improvement principle, changing the action that the current strategy takes in the state of the initial node so that the action value function of the current strategy is larger than the corresponding state value function, and traversing the states of all streaming media nodes and all actions with a greedy algorithm to perform strategy improvement and obtain a new strategy;
and performing policy estimation and strategy improvement on the new strategy, and obtaining the maximum state value function and the corresponding optimal transmission strategy through iterative computation.
In some embodiments, the policy estimation formula is as follows:
Figure BDA0003622618760000031
wherein p(s' | s, π(s)) denotes the probability of transitioning to state s' after executing, in the current node state s, the action a corresponding to the current strategy π(s); r(s' | s, π(s)) denotes the return function for transitioning to state s' after executing that action; γ denotes the discount factor; and the action a corresponding to π(s) has multiple possibilities, each denoted π(a | s).
In some embodiments, the performing policy estimation and policy improvement on the new policy by iterative computation to obtain a maximum state value function and a corresponding optimal transmission policy includes:
and when the calculated change value of the current state value function is smaller than a preset threshold value, determining that the current state value function is the maximum state value function, and determining that the strategy corresponding to the maximum state value function is the optimal transmission strategy.
In a second aspect, the present application provides an RTC streaming media adaptive transmission apparatus, including:
the modeling module is used for acquiring state parameters of all streaming media nodes in a target RTC transmission scene and establishing a streaming media transmission strategy model based on a Markov decision process theory and a preset user experience quality condition;
and the path planning module is used for solving the optimal transmission strategy of the target streaming media in the target RTC transmission scene through strategy iteration based on the streaming media transmission strategy model, and transmitting the target streaming media based on the optimal transmission strategy.
In a third aspect, an embodiment of the present application provides a computer device, including:
at least one processor, at least one memory, and a communication interface; wherein,
the processor, the memory and the communication interface are communicated with each other;
the memory stores program instructions executable by the processor, and the processor calls the program instructions to execute the RTC streaming media adaptive transmission method provided by any of the various implementations of the first aspect.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium storing computer instructions that, when executed on a computer device, cause the computer device to perform the RTC streaming media adaptive transmission method provided in any one of the various implementations of the first aspect.
According to the embodiment, on the basis of a Markov decision process theory, user experience quality conditions are added to construct a streaming media transmission strategy model, so that more effective strategies are conveniently searched, the real-time experience and viewing quality of a user are met, and the time delay is reduced; and then based on the streaming media transmission strategy model, solving the optimal transmission strategy of the target streaming media in the target RTC transmission scene through strategy iteration, thereby improving the adaptability of RTC transmission and reducing the transmission time delay.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, a brief description will be given below to the drawings required for the description of the embodiments or the prior art, and it is obvious that the drawings in the following description are some embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
Fig. 1 is a flowchart of an RTC streaming media adaptive transmission method according to an embodiment of the present application;
Fig. 2 is a flowchart of step S101 of an RTC streaming media adaptive transmission method according to an embodiment of the present application;
Fig. 3 is a flowchart of step S101 of an RTC streaming media adaptive transmission method according to another embodiment of the present application;
Fig. 4 is a flowchart of step S102 of an RTC streaming media adaptive transmission method according to an embodiment of the present application;
Fig. 5 is a schematic diagram of an RTC streaming media adaptive transmission apparatus according to an embodiment of the present application;
Fig. 6 is a schematic diagram of a computer device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application. In addition, the technical features of the various embodiments or individual embodiments provided in this application may be arbitrarily combined with each other to form a feasible technical solution, and such combination is not limited by the sequence of steps and/or the structural composition mode, but must be based on the realization of the capability of a person skilled in the art, and when the technical solution combination is contradictory or cannot be realized, the technical solution combination should be considered to be absent and not within the protection scope of the present application.
Referring to fig. 1, the RTC streaming media adaptive transmission method according to the embodiment of the present application may include the following steps:
S101, acquiring state parameters of all streaming media nodes in a target RTC transmission scene, and establishing a streaming media transmission strategy model based on a Markov decision process theory and a preset user experience quality condition.
It should be noted that the target RTC transmission scene may be a streaming media transmission scene based on an RTC (Real-Time Communication) protocol, such as live streaming, in which the media to be transmitted is sent from a source node to a target node over the RTC protocol. The preset user experience quality condition refers to QoE (quality of experience) index conditions and includes a delay threshold and a code rate threshold. The streaming media nodes, source node, target node, and initial node mentioned in this embodiment all refer to streaming media servers.
In some embodiments, referring to fig. 2, the obtaining of the state parameters of all streaming media nodes in the target RTC transmission scenario in step S101 may include:
S201, acquiring original state data of all streaming media nodes in a target RTC transmission scene, wherein the original state data comprises the number of users, bandwidth and CPU (central processing unit) resources of each streaming media node;
S202, normalizing the number of users, the bandwidth and the CPU resource of each streaming media node, and determining the state parameter corresponding to each streaming media node according to different preset weight coefficients.
Different weight coefficients may be assigned to the normalized number of users, bandwidth, and CPU resources according to actual conditions, such as 0.3, and 0.4, or 0.4, and 0.2, and the state parameter corresponding to each streaming media node is then calculated from the weighted values.
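Steps S201 and S202 can be illustrated with a minimal Python sketch. The source does not specify the normalization method or how the weighted terms are combined, so min-max scaling and a weighted sum are assumptions here; the class `NodeRaw`, the function names, the sample figures, and the weights (0.3, 0.3, 0.4) are likewise hypothetical values chosen only so that the weights sum to 1.

```python
from dataclasses import dataclass


@dataclass
class NodeRaw:
    """Raw state data of one streaming media node (hypothetical container)."""
    users: float      # number of connected users
    bandwidth: float  # bandwidth figure reported by the node
    cpu: float        # CPU resource figure reported by the node


def min_max(values):
    """Scale numbers to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def state_parameters(nodes, weights=(0.3, 0.3, 0.4)):
    """Return one weighted state parameter per node (assumed weighted sum)."""
    w_users, w_bw, w_cpu = weights
    users = min_max([n.users for n in nodes])
    bw = min_max([n.bandwidth for n in nodes])
    cpu = min_max([n.cpu for n in nodes])
    return [w_users * u + w_bw * b + w_cpu * c
            for u, b, c in zip(users, bw, cpu)]


# Three illustrative nodes with made-up measurements.
nodes = [NodeRaw(100, 50.0, 0.2), NodeRaw(300, 100.0, 0.8), NodeRaw(200, 75.0, 0.5)]
params = state_parameters(nodes)
```

Whether a larger value should mean a more or less loaded node depends on how each raw quantity is defined, which the source leaves open.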
In some embodiments, referring to fig. 3, step S101 may include the following steps:
S301, defining a four-tuple (S, A, Psa, R) based on a Markov decision process theory according to the state parameters of all streaming media nodes in the target RTC transmission scene;
S302, constructing a condition function F according to the preset user experience quality condition;
S303, constructing an optimal state value function and an optimal action value function based on the Bellman equation;
wherein S denotes the state set of all streaming media nodes, with s_i ∈ S, where s_i denotes the state parameter of the streaming media node at step i in the target RTC transmission scene; A denotes the action set, with a_i ∈ A, where a_i denotes the action by which the streaming media node at step i selects the next streaming media node; Psa denotes the probability of transitioning to the next state after taking action a_i in the current state s_i; R denotes the return function for the node-state transitions that the strategy to be transmitted makes through path selection; and F is the return value of the state transition when the strategy to be transmitted takes the action that enters the absorption state.
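One possible in-memory representation of the four-tuple (S, A, Psa, R) is sketched below in Python. The class `StreamingMDP`, its field names, the node names, and every probability and return value are hypothetical and serve only to show the shape of the data; the source does not prescribe a data structure.

```python
from dataclasses import dataclass


@dataclass
class StreamingMDP:
    """Hypothetical container for the four-tuple (S, A, Psa, R)."""
    states: list        # S: one entry per streaming media node
    actions: dict       # A: state -> list of next nodes that can be selected
    transitions: dict   # Psa: (state, action) -> {next_state: probability}
    rewards: dict       # R: (state, action, next_state) -> return value
    absorbing: str      # last node of a path (absorption state)

    def check(self):
        """Verify that every transition distribution sums to 1."""
        for (s, a), dist in self.transitions.items():
            total = sum(dist.values())
            if abs(total - 1.0) > 1e-9:
                raise ValueError(f"P({s}, {a}) sums to {total}")
        return True


# A toy scene: one source, two relay nodes, one destination (all made up).
mdp = StreamingMDP(
    states=["src", "relay_a", "relay_b", "dst"],
    actions={"src": ["relay_a", "relay_b"], "relay_a": ["dst"],
             "relay_b": ["dst"], "dst": []},
    transitions={
        ("src", "relay_a"): {"relay_a": 1.0},
        ("src", "relay_b"): {"relay_b": 1.0},
        ("relay_a", "dst"): {"dst": 1.0},
        ("relay_b", "dst"): {"dst": 1.0},
    },
    rewards={
        ("src", "relay_a", "relay_a"): -0.03,
        ("src", "relay_b", "relay_b"): -0.05,
        ("relay_a", "dst", "dst"): -0.09,
        ("relay_b", "dst", "dst"): -0.02,
    },
    absorbing="dst",
)
mdp.check()
```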
In some embodiments, the optimal state value function V*(s) and the optimal action value function Q*(s, a) are expressed as follows:
Figure BDA0003622618760000061
Figure BDA0003622618760000062
In the formulas, p(s' | s, π(s)), which appears under the summation sign Σ, denotes the probability of transitioning to state s' after executing, in the current node state s, the action a corresponding to the current strategy π(s); r(s' | s, π(s)) denotes the return function for that transition and may also be written r(s' | s, a); γ denotes the discount factor; s_0 denotes the initial state and a_0 the initial action; V^π(s') denotes the state value function of the state s' of the subsequent streaming media node; and Q*(s', a') denotes the maximum action value function obtained when action a' is executed in the state s' of the subsequent streaming media node. The expressions above for the optimal state value function V*(s) and the optimal action value function Q*(s, a) are called the Bellman optimality equations.
Based on the Markov decision process theory, the task is to find, for any initial state s_0, the strategy π that maximizes the state value function and the action value function.
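The formula images are not reproduced in this text. For orientation, the textbook form of the Bellman optimality equations, written with the symbols defined above, is sketched below. This is not a transcription of the patent's figures, which may differ in notation or arrangement; writing the action explicitly as a instead of π(s) is an assumption of this sketch.

```latex
\begin{aligned}
V^{*}(s)    &= \max_{a} \sum_{s'} p(s' \mid s, a)\,\bigl[\, r(s' \mid s, a) + \gamma\, V^{*}(s') \,\bigr] \\
Q^{*}(s, a) &= \sum_{s'} p(s' \mid s, a)\,\bigl[\, r(s' \mid s, a) + \gamma \max_{a'} Q^{*}(s', a') \,\bigr]
\end{aligned}
```

The two are linked by V*(s) = max_a Q*(s, a), so a strategy that picks the maximizing action in every state maximizes both functions.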
In some embodiments, the condition function F is defined as follows: the last streaming media node of the strategy to be transmitted is taken as the absorption state; when the strategy to be transmitted takes the action that enters the absorption state, the return value of that state transition is 0 if the delay of the strategy is smaller than a preset delay threshold and its code rate is smaller than a preset code rate threshold, and -1 otherwise.
It should be noted that, when searching for a strategy, this embodiment incorporates two QoE index conditions, the delay threshold and the code rate threshold, into the streaming media transmission strategy model and uses them to determine the return value of the state transition into the absorption state. A return value of -1 indicates that the strategy to be transmitted is an invalid strategy, and a return value of 0 indicates that it is a valid strategy. As a result, the optimal transmission strategy that is found has low delay and can satisfy the user's real-time experience and viewing quality.
In a specific embodiment, for a domestic RTC transmission scenario, the delay threshold may be set to 90-100 ms and the code rate threshold to 1.8-2.2 million; for an overseas RTC transmission scenario, the delay threshold may be set to 200-300 ms and the code rate threshold to 1.8-2.2 million. The specific values of both thresholds may be set according to actual conditions.
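The condition function F described above reduces to a two-way test, sketched here in Python. The function name and parameter names are hypothetical. The default thresholds (100 ms and 2.0 million) are picked from inside the domestic ranges quoted above, and reading the code rate figure as bits per second is an assumption.

```python
def condition_reward(delay_ms, bitrate,
                     delay_threshold_ms=100.0, bitrate_threshold=2.0e6):
    """Return value on entering the absorption state.

    0  -> both QoE conditions hold (valid strategy)
    -1 -> at least one condition fails (invalid strategy)
    """
    if delay_ms < delay_threshold_ms and bitrate < bitrate_threshold:
        return 0
    return -1


# A path with 80 ms delay passes; 120 ms delay fails the delay condition.
ok = condition_reward(80.0, 1.9e6)
too_slow = condition_reward(120.0, 1.9e6)
```

For the overseas scenario the same function could be called with, for example, `delay_threshold_ms=300.0`.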
S102, based on the streaming media transmission strategy model, solving an optimal transmission strategy of the target streaming media in the target RTC transmission scene through strategy iteration, and transmitting the target streaming media based on the optimal transmission strategy; the target streaming media can be an audio-video resource for live viewing by a user.
In some embodiments, referring to fig. 4, step S102 may include:
S401, for the multiple strategy choices available at the initial node in the target RTC transmission scene, establishing a policy estimation formula based on the Bellman equation;
S402, performing policy estimation on all states of the current strategy according to the policy estimation formula, so as to update the state value function of the current strategy;
S403, according to the strategy improvement principle, changing the action that the current strategy takes in the state of the initial node so that the action value function of the current strategy is larger than the corresponding state value function, and traversing the states of all streaming media nodes and all actions with a greedy algorithm to perform strategy improvement and obtain a new strategy;
S404, performing policy estimation and strategy improvement on the new strategy, and obtaining the maximum state value function and the corresponding optimal transmission strategy through iterative computation.
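The loop formed by steps S401 to S404 corresponds to the generic policy iteration scheme, sketched below in Python on a toy graph. This is an illustration under stated assumptions, not the patent's implementation: the node names, the deterministic transitions, the per-hop returns (negative delay in seconds), the discount factor 0.9, and all function names are hypothetical.

```python
def q_value(P, V, s, a, gamma):
    """Action value of taking action a in state s under value table V."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])


def policy_evaluation(P, policy, gamma=0.9, tol=1e-8):
    """Sweep all states until the value table stops changing (S402)."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            if not P[s]:              # absorption state: value stays 0
                continue
            v_new = q_value(P, V, s, policy[s], gamma)
            delta = max(delta, abs(v_new - V[s]))
            V[s] = v_new              # overwrite the old value in place
        if delta < tol:
            return V


def policy_iteration(P, gamma=0.9):
    """Alternate evaluation and greedy improvement until stable (S403, S404)."""
    policy = {s: next(iter(P[s])) for s in P if P[s]}   # arbitrary start
    while True:
        V = policy_evaluation(P, policy, gamma)
        stable = True
        for s in policy:
            best = max(P[s], key=lambda a: q_value(P, V, s, a, gamma))
            if q_value(P, V, s, best, gamma) > q_value(P, V, s, policy[s], gamma) + 1e-12:
                policy[s] = best
                stable = False
        if stable:
            return policy, V


# P[state][action] = list of (probability, next_state, return value).
P = {
    "src": {"relay_a": [(1.0, "relay_a", -0.03)],
            "relay_b": [(1.0, "relay_b", -0.05)]},
    "relay_a": {"dst": [(1.0, "dst", -0.09)]},
    "relay_b": {"dst": [(1.0, "dst", -0.02)]},
    "dst": {},
}
policy, values = policy_iteration(P)
```

In this toy graph the first hop through `relay_a` looks cheaper, but the full path through `relay_b` has the higher value, so the improvement step switches the source node's action. The return value F for entering the absorption state could be added to the return of the final hop.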
In some embodiments, the policy estimation formula is specified as follows:
Figure BDA0003622618760000081
The policy estimation formula expresses the following: in the current node state s, if the next streaming media node is selected according to the current strategy π, the action a corresponding to π(s) has multiple possibilities, each denoted π(a | s). In the formula, p(s' | s, π(s)) denotes the probability of transitioning to state s' after executing, in the current node state s, the action a corresponding to the current strategy π(s); r(s' | s, π(s)) denotes the return function for transitioning to state s' after executing that action; and γ denotes the discount factor.
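The formula image is not reproduced in this text. Assuming the standard Bellman expectation form and the symbols just defined, the policy estimation formula would read roughly as follows; this is a reconstruction for orientation, not a transcription of the patent's figure, and the explicit action argument a is an assumption.

```latex
V^{\pi}(s) = \sum_{a} \pi(a \mid s) \sum_{s'} p(s' \mid s, a)\,\bigl[\, r(s' \mid s, a) + \gamma\, V^{\pi}(s') \,\bigr]
```

The outer sum averages over the possible actions π(a | s), and the inner sum averages over the states s' that each action may lead to.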
In this embodiment, an iterative method is used for policy estimation, and the (k + 1) th iteration may be represented as:
Figure BDA0003622618760000082
In the iterative process, every state is scanned once per iteration. In the (k+1)-th iteration, a larger V^π(s), once obtained, can be assigned directly to V_{k+1}(s). Specifically, an array may be used to store the state value functions, and each time a new, larger state value is obtained it overwrites the old, smaller one. The array may take the form [V_{k+1}(s_1), V_{k+1}(s_2), V_{k+1}(s_3), ..., V_{k+1}(s_n)]. Through this iterative process, the policy estimation formula is driven steadily toward convergence.
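The iteration formula image is likewise not reproduced here. Under the same assumptions as the standard Bellman expectation form, the (k+1)-th sweep would be written roughly as:

```latex
V_{k+1}(s) = \sum_{a} \pi(a \mid s) \sum_{s'} p(s' \mid s, a)\,\bigl[\, r(s' \mid s, a) + \gamma\, V_{k}(s') \,\bigr]
```

With the single-array storage described above, values already updated in the current sweep are reused immediately for the remaining states, which is the usual in-place variant of iterative policy evaluation.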
In some embodiments, step S404 includes: when the calculated change in the current state value function is smaller than a preset threshold, determining that the current state value function is the maximum state value function and that the strategy corresponding to it is the optimal transmission strategy, so that the iteration can be exited and the amount of computation reduced. In this embodiment, the change in the current state value function may refer to its rate of change, and the threshold may be set according to actual conditions, for example 1% to 5%.
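A stopping rule based on the rate of change could look like the Python sketch below. The source does not say how the rate of change is computed, so taking the largest relative change across all states is an assumption, as are the function name and the small constant `eps` that guards against division by zero.

```python
def has_converged(old_values, new_values, threshold=0.01, eps=1e-12):
    """True when the largest relative change between two sweeps is below threshold."""
    worst = 0.0
    for old, new in zip(old_values, new_values):
        change = abs(new - old) / max(abs(old), eps)
        worst = max(worst, change)
    return worst < threshold


# 0.5% change on both states: stop at a 1% threshold.
done = has_converged([1.0, 2.0], [1.005, 2.01])
# 10% change: keep iterating.
not_done = has_converged([1.0], [1.1])
```

A looser threshold such as 5% exits earlier at the cost of a less accurate value function.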
According to the embodiment, on the basis of a Markov decision process theory, user experience quality conditions are added to construct a streaming media transmission strategy model, so that more effective strategies are conveniently searched, the real-time experience and viewing quality of a user are met, and the time delay is reduced; and then based on the streaming media transmission strategy model, solving the optimal transmission strategy of the target streaming media in the target RTC transmission scene through strategy iteration, thereby improving the adaptability of RTC transmission and reducing the transmission time delay.
The implementation basis of the various embodiments of the present application is realized by a programmed process performed by a device having a processor function. Therefore, in engineering practice, the technical solutions and functions thereof of the embodiments of the present application can be packaged into various modules.
Based on this reality, on the basis of the foregoing embodiments, embodiments of the present application provide an apparatus for adaptive RTC streaming, where the apparatus is configured to execute the method for adaptive RTC streaming in the foregoing method embodiments. Referring to fig. 5, the apparatus for adaptive transmission of RTC streaming media includes:
the modeling module 501 is configured to acquire state parameters of all streaming media nodes in a target RTC transmission scene, and establish a streaming media transmission policy model based on a Markov decision process theory and preset user experience quality conditions;
and the path planning module 502 is configured to solve an optimal transmission strategy of the target streaming media in the target RTC transmission scene through strategy iteration based on the streaming media transmission strategy model, and transmit the target streaming media based on the optimal transmission strategy.
For specific limitations of each module of the RTC streaming media adaptive transmission apparatus, reference may be made to the above limitations on the RTC streaming media adaptive transmission method, which is not described herein again. In addition, it should be noted that all or part of the modules in the RTC streaming media adaptive transmission apparatus may be implemented by software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
Referring to fig. 6, the present embodiment further provides a computer device, which may be a computing device such as a mobile terminal, a desktop computer, a notebook, a palmtop computer, and a server. The computer device comprises a processor 601, a memory 602 and a display 603. FIG. 6 shows some of the components of a computer device, but it is understood that not all of the shown components are required to be implemented, and that more or fewer components may be implemented instead.
The memory 602 may be, in some embodiments, an internal storage unit of the computer device, such as a hard disk or a memory of the computer device. The memory 602 may also be an external storage device of the computer device in other embodiments, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), etc. provided on the computer device. Further, the memory 602 may also include both internal and external storage units of the computer device. The memory 602 is used for storing application software installed on the computer device and various data, such as program codes for installing the computer device. The memory 602 may also be used to temporarily store data that has been output or is to be output. In one embodiment, the memory 602 stores an RTC streaming adaptive transmission program 604.
The processor 601 may be a Central Processing Unit (CPU), microprocessor or other data Processing chip in some embodiments, and is used for executing program codes stored in the memory 602 or Processing data, such as executing an RTC streaming media adaptive transmission method.
The display 603 may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch panel, or the like in some embodiments. The display 603 is used for displaying information at the computer device and for displaying a visualized user interface. The components 601 to 603 of the computer device communicate with each other via a system bus.
In one embodiment, when the processor 601 executes the RTC streaming media adaptive transmission program 604 in the memory 602, the following steps are implemented:
acquiring state parameters of all streaming media nodes in a target RTC transmission scene, and establishing a streaming media transmission strategy model based on a Markov decision process theory and a preset user experience quality condition;
and based on the streaming media transmission strategy model, solving the optimal transmission strategy of the target streaming media in the target RTC transmission scene through strategy iteration, and transmitting the target streaming media based on the optimal transmission strategy.
The present embodiment further provides a computer-readable storage medium, on which an RTC streaming media adaptive transmission program is stored, and when executed by a processor, the RTC streaming media adaptive transmission program implements the following steps:
acquiring state parameters of all streaming media nodes in a target RTC transmission scene, and establishing a streaming media transmission strategy model based on a Markov decision process theory and a preset user experience quality condition;
and based on the streaming media transmission strategy model, solving the optimal transmission strategy of the target streaming media in the target RTC transmission scene through strategy iteration, and transmitting the target streaming media based on the optimal transmission strategy.
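The strategy iteration step above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the implementation of this application: the node names, the `NEXT_HOPS` topology, the per-hop values in `STEP_RETURN`, the discount factor `GAMMA` and the threshold `THETA` are all hypothetical, and transitions are assumed deterministic (choosing a next node always reaches it).

```python
from typing import Dict, List, Tuple

# Hypothetical relay topology: each node lists the next nodes it may forward to.
# "dst" is the last node of the path and is treated as the absorbing state.
NEXT_HOPS: Dict[str, List[str]] = {
    "src": ["a", "b"],
    "a": ["dst"],
    "b": ["a", "dst"],
    "dst": [],
}

# Hypothetical return for each (state, action) pair; an action is the chosen next node.
STEP_RETURN: Dict[Tuple[str, str], float] = {
    ("src", "a"): -0.2,
    ("src", "b"): -0.1,
    ("a", "dst"): -0.5,
    ("b", "a"): -0.1,
    ("b", "dst"): -0.2,
}

GAMMA = 0.9   # discount factor (assumed value)
THETA = 1e-6  # preset threshold on the change of the state value function (assumed value)


def evaluate(policy: Dict[str, str], values: Dict[str, float]) -> Dict[str, float]:
    """Strategy estimation: sweep all states until the largest change is below THETA."""
    while True:
        delta = 0.0
        for state, action in policy.items():
            new_value = STEP_RETURN[(state, action)] + GAMMA * values[action]
            delta = max(delta, abs(new_value - values[state]))
            values[state] = new_value
        if delta < THETA:
            return values


def improve(values: Dict[str, float]) -> Dict[str, str]:
    """Strategy improvement: greedy choice over every state and every available action."""
    return {
        state: max(hops, key=lambda a: STEP_RETURN[(state, a)] + GAMMA * values[a])
        for state, hops in NEXT_HOPS.items()
        if hops
    }


def policy_iteration() -> Tuple[Dict[str, str], Dict[str, float]]:
    """Alternate estimation and improvement until the strategy no longer changes."""
    policy = {state: hops[0] for state, hops in NEXT_HOPS.items() if hops}
    values = {state: 0.0 for state in NEXT_HOPS}
    while True:
        values = evaluate(policy, values)
        new_policy = improve(values)
        if new_policy == policy:
            return policy, values
        policy = new_policy
```

Calling `policy_iteration()` returns the next node selected in every non-absorbing state together with the converged state values; the absorbing state keeps the value 0 because no action leaves it.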
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the method embodiments described above.
Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), Rambus Direct RAM (RDRAM), Direct Rambus Dynamic RAM (DRDRAM), and Rambus Dynamic RAM (RDRAM).
It will be evident to those skilled in the art that the present application is not limited to the details of the foregoing illustrative embodiments, and that the present application may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the application being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.

Claims (10)

1. An adaptive transmission method for RTC streaming media, comprising:
acquiring state parameters of all streaming media nodes in a target RTC transmission scene, and establishing a streaming media transmission strategy model based on a Markov decision process theory and a preset user experience quality condition;
and based on the streaming media transmission strategy model, solving the optimal transmission strategy of the target streaming media in the target RTC transmission scene through strategy iteration, and transmitting the target streaming media based on the optimal transmission strategy.
2. The RTC streaming media adaptive transmission method according to claim 1, wherein the acquiring the state parameters of all streaming media nodes in the target RTC transmission scenario includes:
acquiring original state data of all streaming media nodes in a target RTC transmission scene, wherein the original state data comprises the number of users, the bandwidth and the CPU (Central Processing Unit) resources of each streaming media node;
and normalizing the number of users, the bandwidth and the CPU resource of each streaming media node, and determining the state parameter corresponding to each streaming media node according to preset different weight coefficients.
3. The RTC streaming media adaptive transmission method according to claim 1, wherein the acquiring the state parameters of all streaming media nodes in the target RTC transmission scene and establishing a streaming media transmission policy model based on a Markov decision process theory and a preset user experience quality condition comprises:
defining a four-tuple (S, A, Psa, R) based on a Markov decision process theory according to the state parameters of all streaming media nodes under the target RTC transmission scene;
constructing a condition function F according to the preset user experience quality condition;
constructing an optimal state value function and an optimal action value function based on a Bellman equation;
wherein S represents the state set of all streaming media nodes, with $s_i \in S$, where $s_i$ represents the state parameter of the i-th streaming media node in the target RTC transmission scene; A represents the action set, with $a_i \in A$, where $a_i$ represents the action by which the streaming media node at the i-th step selects the next streaming media node; Psa represents the state transition probability of transitioning to the next state after taking action $a_i$ in the current state $s_i$; R represents the return function for the node state transitions of the strategy to be transmitted, based on path selection; and F is the return value of the state transition when the strategy to be transmitted takes the action that enters the absorbing state.
4. The RTC streaming adaptive transmission method according to claim 3, characterized in that the conditional function F comprises:
taking the last streaming media node of the strategy to be transmitted as the absorbing state; when the strategy to be transmitted takes the action that enters the absorbing state, the return value of that state transition is 0 if the time delay of the strategy to be transmitted is smaller than a preset time delay threshold and the code rate of the strategy to be transmitted is smaller than a preset code rate threshold, and is -1 otherwise.
5. The RTC streaming media adaptive transmission method according to claim 2, wherein the step of solving the optimal transmission strategy in the target RTC transmission scenario through strategy iteration based on the streaming media transmission strategy model comprises:
aiming at a plurality of strategy selections of an initial node under a target RTC transmission scene, establishing a strategy estimation formula based on a Bellman equation;
performing policy estimation on all states of the current policy according to the policy estimation formula to update a state value function of the current policy;
changing, according to the strategy improvement principle, the action that the current strategy takes in the state of the initial node, so that the action value function of the current strategy is larger than the corresponding state value function, and traversing the states of all streaming media nodes and all actions with a greedy algorithm to perform strategy improvement and obtain a new strategy;
and performing strategy estimation and strategy improvement on the new strategy, and obtaining a maximum state value function and a corresponding optimal transmission strategy through iterative computation.
6. The RTC streaming adaptive transmission method according to claim 5, characterized in that the policy estimation formula comprises:
Figure FDA0003622618750000021
wherein $p(s' \mid s, \pi(s))$ represents the probability of transitioning to the state $s'$ after executing, from the current node state $s$, the action $a$ corresponding to the current strategy $\pi(s)$; $r(s' \mid s, \pi(s))$ represents the return function for transitioning to the state $s'$ after executing, from the current node state $s$, the action $a$ corresponding to the current strategy $\pi(s)$; $\gamma$ represents the discount factor; and the action $a$ corresponding to $\pi(s)$ has multiple possibilities, each denoted as $\pi(a \mid s)$.
7. The RTC streaming media adaptive transmission method according to claim 5, wherein the performing strategy estimation and strategy improvement on the new strategy, and obtaining a maximum state value function and a corresponding optimal transmission strategy through iterative computation, comprises:
and when the calculated change value of the current state value function is smaller than a preset threshold value, determining that the current state value function is the maximum state value function, and determining that the strategy corresponding to the maximum state value function is the optimal transmission strategy.
8. An apparatus for adaptive transmission of RTC streaming media, comprising:
the modeling module is used for acquiring state parameters of all streaming media nodes in a target RTC transmission scene and establishing a streaming media transmission strategy model based on a Markov decision process theory and a preset user experience quality condition;
and the path planning module is used for solving the optimal transmission strategy of the target streaming media in the target RTC transmission scene through strategy iteration based on the streaming media transmission strategy model, and transmitting the target streaming media based on the optimal transmission strategy.
9. A computer device, comprising:
at least one processor, at least one memory, and a communication interface; wherein,
the processor, the memory and the communication interface are communicated with each other;
the memory stores program instructions executable by the processor, the processor calling the program instructions to perform the RTC streaming adaptive transmission method of any of claims 1 to 7.
10. A computer readable storage medium storing computer instructions which, when run on a computer device, cause the computer device to perform the RTC streaming adaptive transmission method of any one of claims 1 to 7.
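The state parameter computation recited in claim 2 and the condition function F recited in claim 4 can be illustrated with a short Python sketch. It is a sketch under stated assumptions only: the claims say the data are normalized and combined with preset weight coefficients, but min-max scaling, the weighted sum, the values in `WEIGHTS`, the function names and the units in the parameter names are all hypothetical choices made here.

```python
from typing import Dict, List

# Hypothetical weight coefficients; the claims only state that they are preset and differ.
WEIGHTS: Dict[str, float] = {"users": 0.3, "bandwidth": 0.4, "cpu": 0.3}


def min_max(values: List[float]) -> List[float]:
    """Scale values to [0, 1] (assumed normalization); a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def state_parameters(nodes: List[Dict[str, float]]) -> List[float]:
    """Weighted sum of the normalized user count, bandwidth and CPU resource per node."""
    columns = {key: min_max([node[key] for node in nodes]) for key in WEIGHTS}
    return [
        sum(WEIGHTS[key] * columns[key][i] for key in WEIGHTS)
        for i in range(len(nodes))
    ]


def condition_return(delay_ms: float, code_rate_kbps: float,
                     max_delay_ms: float, max_code_rate_kbps: float) -> int:
    """Condition function F: 0 if both values are below their thresholds, otherwise -1."""
    if delay_ms < max_delay_ms and code_rate_kbps < max_code_rate_kbps:
        return 0
    return -1
```

For example, `state_parameters` takes one dictionary per streaming media node with the keys `users`, `bandwidth` and `cpu`, and returns one scalar state parameter per node; `condition_return` gives the return value applied when a candidate path enters the absorbing state.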

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210471454.1A CN114866461A (en) 2022-04-28 2022-04-28 RTC (real time clock) streaming media self-adaptive transmission method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210471454.1A CN114866461A (en) 2022-04-28 2022-04-28 RTC (real time clock) streaming media self-adaptive transmission method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114866461A true CN114866461A (en) 2022-08-05

Family

ID=82636365

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210471454.1A Pending CN114866461A (en) 2022-04-28 2022-04-28 RTC (real time clock) streaming media self-adaptive transmission method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114866461A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116668364A (en) * 2022-09-29 2023-08-29 中兴通讯股份有限公司 Route planning method and device for real-time audio and video network

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101677289A (en) * 2008-09-17 2010-03-24 华为技术有限公司 Method and device for optimizing route
CN103326946A (en) * 2013-07-02 2013-09-25 中国(南京)未来网络产业创新中心 SVC streaming media transmission optimization method based on OpenFlow
CN108391143A (en) * 2018-04-24 2018-08-10 南京邮电大学 A kind of wireless network transmission of video self-adaptation control method based on Q study
CN109511123A (en) * 2018-12-27 2019-03-22 沈阳航空航天大学 A kind of software definition vehicle network adaptive routing method based on temporal information
CN109587519A (en) * 2018-12-28 2019-04-05 南京邮电大学 Heterogeneous network Multipath Video control system and method based on Q study
CN109743600A (en) * 2019-01-15 2019-05-10 国网河南省电力公司 Based on wearable live O&M adaptive video stream transmission rate control
CN111107602A (en) * 2019-12-24 2020-05-05 杭州电子科技大学 Safe routing method with minimum energy consumption and time delay weighting for wireless body area network
CN112153716A (en) * 2019-09-24 2020-12-29 中兴通讯股份有限公司 Transmission path selection method and device and storage medium
CN113472671A (en) * 2020-03-30 2021-10-01 中国电信股份有限公司 Method and device for determining multicast route and computer readable storage medium
CN114124823A (en) * 2021-10-18 2022-03-01 西安电子科技大学 Self-adaptive routing method, system and equipment oriented to high-dynamic network topology

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20220805
