WO2016119560A1 - Adaptive method and apparatus for audio transmission - Google Patents

Adaptive method and apparatus for audio transmission

Info

Publication number
WO2016119560A1
Authority
WO
WIPO (PCT)
Prior art keywords
audio
audio stream
encoded
frame
transmission
Prior art date
Application number
PCT/CN2015/099813
Other languages
English (en)
French (fr)
Inventor
刘霖
赵旭
刘聪
Original Assignee
中国移动通信集团公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国移动通信集团公司
Publication of WO2016119560A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/08 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters

Definitions

  • the present disclosure relates to the field of streaming media transmission, and in particular, to an adaptive method and apparatus for audio transmission.
  • the audio stream delay includes: network transmission delay and coding equipment delay.
  • In order to reduce the audio stream delay, two approaches are usually taken: one is to reduce the network delay by optimizing the network structure, for example by establishing an end-to-end direct physical connection, adopting an efficient transmission control protocol, and optimizing the network environment; the other is to optimize device processing efficiency by increasing the device's computing speed, optimizing the processing logic, and improving program efficiency.
  • the present disclosure provides an adaptive method and apparatus for audio transmission, which solves the problem that the audio stream delay exceeds the standard when the network is jittered.
  • an adaptive method for audio transmission is provided, which is applied to a streaming media server, and includes:
  • if the transmission delay threshold is not exceeded, the audio stream is encoded according to a predetermined coding strategy and sent to the streaming media client;
  • if it is exceeded, the predetermined coding strategy is adjusted, the number of bits of the audio frame after the audio stream is encoded is reduced, and the encoded audio stream is sent to the streaming media client.
  • the step of obtaining the transmission rate of the current network with the streaming media client includes:
  • the transmission rate with the streaming client is calculated based on the time difference between the second time and the first time, and the number of bits of the network probe message.
  • the step of adjusting the predetermined coding strategy and reducing the number of bits of the encoded audio frame includes:
  • the number of bits of the encoded audio frame is reduced according to the first coding strategy, and it is determined whether the transmission time of the reduced audio frame exceeds the transmission delay threshold; if not, the encoded audio stream is sent to the streaming media client; if it is exceeded, it is detected whether the encoding feature supports the second coding strategy;
  • if the second coding strategy is not supported, some of the encoded audio frames are discarded; if the second coding strategy is supported, the number of bits of the encoded audio frame is reduced according to the second coding strategy, and it is determined whether the transmission time of the reduced audio frame exceeds the transmission delay threshold; if not, the encoded audio stream is sent to the streaming media client; if it is exceeded, some of the encoded audio frames are discarded;
  • the first coding strategy is one of a framing strategy and a multi-code rate policy, and the second coding strategy is the other.
  • the first coding strategy is a framing strategy, and the framing strategy includes multiple frame lengths.
  • the step of reducing the number of bits of the audio stream encoded audio frame according to the first coding strategy includes:
  • the audio stream is divided into a plurality of first audio streams, and the first audio stream is encoded according to a current encoding rate, wherein the length of the first audio stream is the shortest frame length in the framing strategy.
  • the second coding strategy is a multi-code rate policy, and the multi-code rate policy includes a set of supported encoding code rates; the step of reducing the number of bits of the encoded audio frame according to the second coding strategy comprises:
  • the first audio stream is re-encoded using an encoding code rate in the encoded code rate set that is lower than the current encoding rate.
  • the first coding strategy is a multi-code rate policy, and the multi-code rate policy includes: a supported code rate set; and the step of reducing the number of bits of the audio stream encoded audio frame according to the first coding strategy includes:
  • the audio stream is re-encoded using an encoding code rate in the encoded code rate set that is lower than the current encoding rate.
  • the second coding strategy is a framing policy, and the framing strategy includes multiple frame lengths.
  • the step of reducing the number of bits of the audio stream encoded audio frame according to the second coding strategy includes:
  • the encoded audio frame is divided into a plurality of first audio frames, and the length of the first audio frame is the shortest frame length in the framing strategy.
  • an adaptive device for audio transmission is further provided, which is applied to a streaming media server, and includes:
  • an obtaining module configured to acquire a transmission rate of the current network with the streaming media client;
  • a calculating module configured to calculate a transmission time of the audio frame according to the number of bits of the audio frame and the transmission rate of the audio stream to be transmitted under a predetermined coding policy
  • a determining module configured to determine whether a preset transmission delay threshold is exceeded
  • a first adjusting module configured to: when the transmission time does not exceed the transmission delay threshold, encode the audio stream according to a predetermined coding policy, and send the audio stream to the streaming media client;
  • the second adjusting module is configured to: when the transmission time exceeds the transmission delay threshold, adjust the predetermined coding strategy, reduce the number of bits of the audio frame after the audio stream is encoded, and send the encoded audio stream to the streaming client.
  • the acquisition module includes:
  • a sending unit configured to send a network probe message to the streaming media client, where the network probe message carries a first time to send the network probe message;
  • a receiving unit configured to receive a probe response message sent by the streaming media client after responding to the network probe message, where the probe response message carries a second time when the streaming media client receives the network probe message;
  • a calculating unit configured to calculate the transmission rate with the streaming media client according to the time difference between the second time and the first time, and the number of bits of the network probe message.
  • the second adjustment module includes:
  • a first detecting unit configured to detect whether the encoding feature supports the first encoding strategy
  • a first adjusting unit configured to: when the first encoding policy is supported, reduce the number of bits of the audio stream encoded audio frame according to the first encoding strategy
  • a first determining unit configured to determine whether the transmission time of the audio frame after the number of bits is reduced exceeds the transmission delay threshold; if not, the encoded audio stream is sent to the streaming media client; if it is exceeded, it is detected whether the encoding feature supports the second coding strategy;
  • a second detecting unit configured to detect whether the encoding feature supports the second encoding strategy when the first encoding policy is not supported
  • a second adjusting unit configured to discard a partial frame in the audio stream encoded audio frame when the second encoding policy is not supported, and reduce the audio stream encoded audio frame according to the second encoding strategy when the second encoding strategy is supported Number of bits;
  • a second determining unit configured to determine whether the transmission time of the audio frame after the number of bits is reduced exceeds a transmission delay threshold; if not, the encoded audio stream is sent to the streaming client; if it is exceeded, the audio stream is discarded a partial frame in the encoded audio frame;
  • the first coding strategy is one of a framing strategy and a multi-code rate policy, and the second coding strategy is the other.
  • the first coding strategy is a framing strategy, and the framing strategy includes multiple frame lengths; the first adjustment unit includes:
  • a first adjusting subunit configured to divide the audio stream into a plurality of first audio streams, and encode the first audio stream according to the current encoding code rate, where the length of the first audio stream is the shortest frame length in the framing strategy.
  • the second coding strategy is a multi-code rate policy, and the multi-code rate policy includes a set of supported code rates; the second adjustment unit includes:
  • a second adjusting subunit configured to re-encode the first audio stream by using an encoding code rate lower than a current encoding rate in the encoded code rate set.
  • the first coding strategy is a multi-code rate policy, and the multi-rate policy includes: a set of supported code rates; the first adjustment unit further includes:
  • a third adjustment subunit configured to re-encode the audio stream by using an encoding code rate in the supported code rate set that is lower than the current encoding rate.
  • the second coding strategy is a framing strategy, the framing strategy includes multiple frame lengths, and the second adjustment unit further includes:
  • a fourth adjustment subunit configured to divide the encoded audio frame into a plurality of first audio frames, where the length of the first audio frame is the shortest frame length in the framing strategy.
  • With the adaptive method and apparatus for audio transmission of the present disclosure, a network probe message is sent to calculate the transmission rate of the current network; the transmission time of an audio frame is calculated from that rate and the number of bits of the audio frame of the to-be-transmitted audio stream under a predetermined coding strategy; and it is determined whether the transmission time exceeds a preset transmission delay threshold. If not, the audio stream is encoded according to the predetermined coding strategy and sent to the streaming media client; if so, the predetermined coding strategy is adjusted, the number of bits of the encoded audio frame is reduced, and the encoded audio stream is sent to the streaming media client.
  • In this way the coding strategy of the audio stream is adjusted to adapt the audio transmission to the current network transmission rate, which solves the problem that network jitter or network instability causes a delay higher than the standard requirement and, in turn, abnormal cooperation between devices.
  • Figure 1 is a flow chart showing an adaptive method of audio transmission of the present disclosure
  • Figure 2 is a flow chart showing the first embodiment of the present disclosure
  • Figure 3 shows a flow chart of the second embodiment of the present disclosure
  • FIG. 4 is a block diagram showing the adaptive device of the audio transmission of the present disclosure.
  • An embodiment of the present disclosure provides an adaptive method for audio transmission, which is applied to a streaming media server and adjusts the encoding strategy of an audio stream according to the current network state. The method mainly includes:
  • Step 10: Obtain the transmission rate of the current network with the streaming media client.
  • The streaming media server sends a network probe message to the streaming media client, where the network probe message carries a first time, at which the streaming media server sends the network probe message. After the streaming media client receives the network probe message, it responds to the message and feeds back a probe response message to the streaming media server; the probe response message carries a second time, at which the streaming media client received the network probe message, and a third time, at which the probe response message is fed back.
  • The downlink transmission time of the network probe message may be calculated as the time difference between the second time and the first time. The number of bytes or bits of the network probe message is determined from the data volume of the message, and the downlink transmission rate of the current network is calculated as the ratio of that number to the calculated transmission time.
  • The network probe message is determined according to the type of the network protocol.
  • Take the RTSP message as an example. Because an RTSP message carries few bits, it is preferable, in order to obtain an accurate transmission rate, to send multiple RTSP messages and use the average of their measured transmission rates as the estimate of the network's downlink transmission rate. Although each RTSP message carries few bits, frequent transmission still places some load on the network. The network condition is therefore probed once every predetermined interval, and the coding strategy of the audio stream is adjusted according to the current network condition.
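  • The rate estimate in Step 10 can be sketched as follows. This is a minimal illustration, not the disclosure's implementation: the function names are hypothetical, times are taken in seconds, and the sketch assumes the server and client clocks are synchronized so that the first and second times are directly comparable.

```python
from statistics import mean


def probe_rate_bps(probe_bits, t_sent, t_received):
    """Downlink rate from one probe: message bits / (second time - first time)."""
    elapsed = t_received - t_sent
    if elapsed <= 0:
        raise ValueError("second time must be later than first time")
    return probe_bits / elapsed


def average_rate_bps(samples):
    """Average the per-probe rates; samples are (bits, t_sent, t_received) tuples."""
    return mean(probe_rate_bps(bits, sent, received) for bits, sent, received in samples)


# Two hypothetical 1600-bit probes that took about 10 ms and 20 ms to arrive.
samples = [(1600, 0.000, 0.010), (1600, 1.000, 1.020)]
rate = average_rate_bps(samples)
```

  • Averaging several probes smooths out the measurement noise that a single small message would otherwise introduce.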
  • Step 20: Calculate the transmission time of the audio frame according to the transmission rate and the number of bits of the audio frame of the to-be-transmitted audio stream under the predetermined coding strategy.
  • The encoder of the streaming media server is initially configured, for example with an encoding code rate and an encoding frame length, so that the streaming media server has a predetermined coding strategy. The transmission time of the audio stream, that is, the time required to transmit the audio stream under the current network conditions, is calculated as the ratio of the number of coded bits of the to-be-transmitted audio stream under the predetermined coding strategy to the calculated transmission rate.
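  • The ratio described in Step 20 amounts to a one-line calculation. The sketch below uses a hypothetical function name and assumes the rate is given in bits per second and the result is wanted in milliseconds.

```python
def transmission_time_ms(frame_bits, rate_bps):
    """Time needed to send one encoded frame: coded bits divided by link rate."""
    if rate_bps <= 0:
        raise ValueError("transmission rate must be positive")
    return frame_bits / rate_bps * 1000.0


# A 477-bit frame over a hypothetical 50 kbit/s downlink.
t_ms = transmission_time_ms(477, 50_000)
```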
  • Step 30: Determine whether the transmission time exceeds a preset transmission delay threshold.
  • The preset transmission delay threshold is calculated as follows.
  • The standard required delay comprises the time to establish the transmission channel, the processing time of the streaming media server, the processing time of the streaming media client, and the time to transmit the audio stream. The upper limit of the transmission delay threshold is therefore the standard required delay minus the time to establish the transmission channel, the processing time of the streaming media server, and the processing time of the streaming media client. For example, if the standard required delay is 40 ms, the time to establish a transmission channel between the streaming media server and the streaming media client is 20 ms, and the streaming media server and the streaming media client each take 3 ms to process one frame of the audio signal, then the time available for transmitting one frame of the audio signal is 40 - 20 - 3 - 3 = 14 ms, that is, the transmission delay threshold is 14 ms.
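  • The threshold arithmetic above can be written out directly. The function name is hypothetical, and all values are in milliseconds and taken from the example in the text.

```python
def transmission_delay_threshold_ms(required_delay, channel_setup, server_processing, client_processing):
    """Upper limit on per-frame transmission time: what is left of the delay budget."""
    return required_delay - channel_setup - server_processing - client_processing


# 40 ms budget, 20 ms channel setup, 3 ms processing on each side.
threshold_ms = transmission_delay_threshold_ms(40, 20, 3, 3)
```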
  • Step 40: If the threshold is not exceeded, encode the audio stream according to the predetermined coding strategy and send it to the streaming media client; if it is exceeded, adjust the predetermined coding strategy, reduce the number of bits of the encoded audio frame, and send the encoded audio stream to the streaming media client.
  • If the transmission time calculated in step 20 does not exceed the transmission delay threshold, the current network status is good, and the audio stream to be transmitted is encoded according to the initially set predetermined coding strategy to implement reliable transmission of the audio stream.
  • If the transmission time exceeds the transmission delay threshold, the coding strategy of the audio stream needs to be adjusted to reduce the number of bits of the encoded audio frame, so that each audio frame can be reliably transmitted.
  • Adjusting the predetermined coding strategy and reducing the number of bits of the encoded audio frame specifically includes the following steps:
  • It is detected whether the encoding feature supports the first coding strategy. If it is supported, the number of bits of the encoded audio frame is reduced according to the first coding strategy, and it is determined whether the transmission time of the reduced audio frame exceeds the transmission delay threshold; if not, the encoded audio stream is sent to the streaming media client; if it is exceeded, it is detected whether the encoding feature supports the second coding strategy;
  • If the first coding strategy is not supported, it is likewise detected whether the encoding feature supports the second coding strategy. If the second coding strategy is supported, the number of bits of the encoded audio frame is reduced according to the second coding strategy, and it is determined whether the transmission time of the reduced audio frame exceeds the transmission delay threshold; if not, the encoded audio stream is sent to the streaming media client; if it is exceeded, some of the encoded audio frames are discarded. If the second coding strategy is not supported, some of the encoded audio frames are discarded.
  • The first coding strategy is one of a framing strategy and a multi-code rate policy, and the second coding strategy is the other.
  • Encoders supporting the framing strategy include existing variable-frame-length encoders, a typical representative being the AMR-WB+ encoder. Such an encoder provides a plurality of frame length modes, including a 20 ms frame length, a 40 ms frame length, and an 80 ms frame length. With the 20 ms frame length, 20 ms of the audio stream is encapsulated into one frame, that is, one audio frame carries a 20 ms audio signal.
  • Encoders supporting the multi-code rate policy include existing variable-code-rate encoders, typical representatives being AMR-NB and AMR-WB. In these encoders the length of the audio signal carried by each audio frame is fixed, but the number of coded bits per audio frame is variable. For example, AMR-WB has multiple code rates, with the number of coded bits per audio frame being, for example, 477, 461, 397, or 365. Whether a shorter frame length or a lower encoding rate is used, the number of bits per audio frame is reduced, which shortens the transmission time of each audio frame so that the audio stream can be transmitted reliably in the current network state.
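  • As an illustration of the multi-code rate idea, the sketch below picks a frame size from the four bit counts quoted above. The selection rule (largest frame that still fits the delay budget) is an assumption made for this example only; the disclosure itself only requires a code rate lower than the current one, and its second embodiment prefers the lowest available rate. The names are hypothetical.

```python
# Per-frame bit counts quoted in the text for some AMR-WB code rates.
AMR_WB_FRAME_BITS = [477, 461, 397, 365]


def pick_frame_bits(rate_bps, threshold_ms, candidates=AMR_WB_FRAME_BITS):
    """Return the largest candidate frame size whose transmission time fits the threshold.

    Returns None when even the smallest candidate would exceed the threshold.
    """
    budget_bits = rate_bps * threshold_ms / 1000.0
    fitting = [bits for bits in candidates if bits <= budget_bits]
    return max(fitting) if fitting else None


# Hypothetical 30 kbit/s downlink with the 14 ms threshold from the earlier example:
# the budget is 420 bits per frame, so 397 is the largest size that fits.
chosen = pick_frame_bits(30_000, 14)
```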
  • The period during which the network jitters or is unstable is usually short. Therefore, only the current frame of the audio stream to be transmitted is adjusted, that is, only the first frame of the audio stream is encoded according to the first coding strategy or the second coding strategy,
  • while the second frame of the audio stream is encoded by using the predetermined coding strategy.
  • Likewise, discarding some of the encoded audio frames means discarding the first frame of the audio stream and starting encoding and transmission from the second frame. Because the length of an audio frame usually does not exceed 100 ms, changing the encoding strategy of one frame of the audio signal or discarding one frame of the audio signal has a negligible effect on audio quality. The impact of the adaptive method on the quality of the audio stream can therefore be neglected, while reliable transmission of the audio stream is ensured during network jitter or network instability.
  • the framing strategy includes multiple frame lengths
  • the multi-code rate policy includes the supported coding rate set.
  • Obtain the transmission rate of the current network, that is, send a network probe message from the streaming media server to the streaming media client to detect the downlink transmission rate of the current network.
  • If the transmission time does not exceed the transmission delay threshold, the audio stream is encoded according to the predetermined coding strategy and sent to the streaming media client.
  • If the transmission time exceeds the threshold and the framing strategy is supported, the audio stream is split according to the shortest frame length and encoded at the current encoding rate, and it is then detected whether the transmission time of each frame of the encoded audio stream exceeds the preset transmission delay threshold. If not, the encoded audio stream is sent to the streaming media client. If it is exceeded, it is detected whether the encoding feature supports the multi-code rate policy. If not, some of the encoded audio frames are discarded. If it is supported, the split audio stream is encoded at an encoding rate lower than the current encoding rate, and it is again detected whether the transmission time of each frame of the encoded audio stream exceeds the preset transmission delay threshold. If not, the encoded audio stream is sent to the streaming media client; if the transmission time of the adjusted audio frame still exceeds the transmission delay threshold, some of the encoded audio frames are discarded.
  • If the framing strategy is not supported, it is detected whether the coding characteristic of the encoder supports the multi-code rate policy. If not, some of the encoded audio frames are discarded. If it is supported, the audio stream is encoded at an encoding rate lower than the current encoding rate, and it is then detected whether the transmission time of the encoded audio frame exceeds the preset transmission delay threshold. If not, the encoded audio stream is sent to the streaming media client; if the transmission time of the adjusted audio frame still exceeds the transmission delay threshold, some of the encoded audio frames are discarded.
  • In this embodiment, the coding strategy of the audio stream is adjusted according to the relationship between the current network transmission rate and the preset transmission delay threshold. If, at the current network transmission rate, the transmission time of the current audio frame does not exceed the transmission delay threshold, the audio stream is encoded according to the predetermined coding strategy and the encoded audio frame is transmitted to the streaming media client. If the transmission time of the current audio frame exceeds the transmission delay threshold, it is detected in turn whether the framing strategy and the multi-code rate policy are supported, and the audio stream is encoded according to the corresponding coding strategy. The audio transmission thus adapts to the transmission rate of the current network, which solves the problem that network jitter or network instability causes a delay higher than the standard requirement and, in turn, abnormal cooperation between devices.
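  • The decision cascade of this embodiment can be sketched as below. This is a simplified model under stated assumptions, not the disclosure's implementation: the `Strategy` class, the `adapt_frame` function, and the two bit-reduction lambdas (quartering an 80 ms frame to 20 ms, and scaling 477 bits down to 365) are hypothetical stand-ins for a real encoder. Passing the strategies in the opposite order gives the ordering used by the other embodiment.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass
class Strategy:
    name: str
    supported: bool                     # whether the encoder supports this strategy
    reduce_bits: Callable[[int], int]   # maps current frame bits to reduced frame bits


def adapt_frame(frame_bits: int, rate_bps: float, threshold_ms: float,
                strategies: Sequence[Strategy]) -> Tuple[str, int]:
    """Return ("send", bits) if a frame fits the threshold, else ("drop", bits)."""
    def fits(bits: int) -> bool:
        return bits / rate_bps * 1000.0 <= threshold_ms

    if fits(frame_bits):
        return "send", frame_bits       # predetermined strategy is good enough
    for strategy in strategies:         # try first, then second coding strategy
        if not strategy.supported:
            continue
        frame_bits = strategy.reduce_bits(frame_bits)
        if fits(frame_bits):
            return "send", frame_bits
    return "drop", frame_bits           # nothing fits: discard the frame


framing = Strategy("framing", True, lambda bits: bits // 4)
multi_rate = Strategy("multi-code rate", True, lambda bits: bits * 365 // 477)

# Hypothetical 1908-bit frame, 30 kbit/s downlink, 14 ms threshold.
result = adapt_frame(1908, 30_000, 14, [framing, multi_rate])
```

  • In this run the framing step alone leaves a 477-bit frame that still exceeds the 420-bit budget, so the multi-code rate step is applied as well and the 365-bit frame is sent.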
  • the multi-code rate policy includes a supported code rate set
  • the framing strategy includes multiple frame lengths.
  • Obtain the transmission rate of the current network, that is, send a network probe message from the streaming media server to the streaming media client to detect the downlink transmission rate of the current network.
  • If the transmission time does not exceed the transmission delay threshold, the audio stream is encoded according to the predetermined coding strategy and sent to the streaming media client.
  • If the transmission time exceeds the threshold and the multi-code rate policy is supported, the audio stream is encoded at an encoding rate lower than the current encoding rate, preferably the lowest available encoding rate, and it is then detected whether the transmission time of each frame of the encoded audio stream exceeds the preset transmission delay threshold. If not, the encoded audio stream is sent to the streaming media client. If it is exceeded, it is detected whether the encoding feature supports the framing strategy. If not, some of the encoded audio frames are discarded. If it is supported, the audio stream is split according to the shortest frame length and encoded at the current encoding rate, that is, the audio stream is encoded with the shortest frame length and the lowest encoding rate.
  • If the multi-code rate policy is not supported, it is detected whether the coding characteristic of the encoder supports the framing strategy. If not, some of the encoded audio frames are discarded. If it is supported, the audio stream is split according to the shortest frame length and encoded at the current encoding rate, and it is then detected whether the transmission time of the encoded audio frame exceeds the preset transmission delay threshold. If not, the encoded audio stream is sent to the streaming media client; if the transmission time of the adjusted audio frame still exceeds the transmission delay threshold, some of the encoded audio frames are discarded.
  • In this embodiment, the coding strategy of the audio stream is adjusted according to the relationship between the current network transmission rate and the preset transmission delay threshold. If, at the current network transmission rate, the transmission time of the current audio frame does not exceed the transmission delay threshold, the audio stream is encoded according to the predetermined coding strategy and the encoded audio frame is transmitted to the streaming media client. If the transmission time of the current audio frame exceeds the transmission delay threshold, it is detected in turn whether the multi-code rate policy and the framing strategy are supported, and the audio stream is encoded according to the corresponding coding strategy. The audio transmission thus adapts to the transmission rate of the current network, which solves the problem that network jitter or network instability causes a delay higher than the standard requirement and, in turn, abnormal cooperation between devices.
  • an embodiment of the present disclosure further provides an adaptive device for audio transmission.
  • The adaptive device for audio transmission is applied to the streaming media server side and includes:
  • an obtaining module configured to acquire a transmission rate of the current network with the streaming media client;
  • a calculating module configured to calculate a transmission time of the audio frame according to the number of bits of the audio frame and the transmission rate of the audio stream to be transmitted under a predetermined coding policy
  • a determining module configured to determine whether a preset transmission delay threshold is exceeded
  • a first adjusting module configured to: when the transmission time does not exceed the transmission delay threshold, encode the audio stream according to a predetermined coding policy, and send the audio stream to the streaming media client;
  • the second adjusting module is configured to: when the transmission time exceeds the transmission delay threshold, adjust the predetermined coding strategy, reduce the number of bits of the audio frame after the audio stream is encoded, and send the encoded audio stream to the streaming client.
  • the acquisition module includes:
  • a sending unit configured to send a network probe message to the streaming media client, where the network probe message carries a first time to send the network probe message;
  • a receiving unit configured to receive a probe response message sent by the streaming media client after responding to the network probe message, where the probe response message carries a second time when the streaming media client receives the network probe message;
  • a calculating unit configured to calculate the transmission rate with the streaming media client according to the time difference between the second time and the first time, and the number of bits of the network probe message.
  • the second adjustment module includes:
  • a first detecting unit configured to detect whether the encoding feature supports the first encoding strategy
  • a first adjusting unit configured to: when the first encoding policy is supported, reduce the number of bits of the audio stream encoded audio frame according to the first encoding strategy
  • a first determining unit configured to determine whether the transmission time of the audio frame after the number of bits is reduced exceeds the transmission delay threshold; if not, the encoded audio stream is sent to the streaming media client; if it is exceeded, it is detected whether the encoding feature supports the second coding strategy;
  • a second detecting unit configured to detect whether the encoding feature supports the second encoding strategy when the first encoding policy is not supported
  • a second adjusting unit configured to discard some of the encoded audio frames when the second encoding policy is not supported, and to reduce the number of bits of the encoded audio frame according to the second encoding strategy when the second encoding strategy is supported;
  • a second determining unit configured to determine whether the transmission time of the audio frame after the number of bits is reduced exceeds a transmission delay threshold; if not, the encoded audio stream is sent to the streaming client; if it is exceeded, the audio stream is discarded a partial frame in the encoded audio frame;
  • the first coding strategy is one of a framing strategy and a multi-code rate policy, and the second coding strategy is the other.
  • the first coding strategy is a framing strategy, and the framing strategy includes multiple frame lengths; the first adjustment unit includes:
  • a first adjusting subunit configured to divide the audio stream into a plurality of first audio streams, and encode the first audio stream according to the current encoding code rate, where the length of the first audio stream is the shortest frame length in the framing strategy.
  • the second coding strategy is a multi-code rate policy, and the multi-code rate policy includes a set of supported code rates; the second adjustment unit includes:
  • a second adjusting subunit configured to re-encode the first audio stream by using an encoding code rate lower than the current encoding rate in the encoded code rate set, and notify the streaming media client of the current encoding bit rate.
  • the first coding strategy is a multi-code rate policy, and the multi-rate policy includes: a set of supported code rates; the first adjustment unit further includes:
  • a third adjusting subunit configured to re-encode the audio stream by using an encoding code rate lower than the current encoding rate in the encoded code rate set, and notify the streaming media client of the current encoding bit rate.
  • the second coding strategy is a framing strategy, the framing strategy includes multiple frame lengths, and the second adjustment unit further includes:
  • a fourth adjustment subunit configured to divide the encoded audio frame into a plurality of first audio frames, where the length of the first audio frame is the shortest frame length in the framing strategy.
  • the device is a device corresponding to the above-mentioned adaptive method for audio transmission. All the implementations in the foregoing method embodiments are applicable to the embodiment of the device, and the same technical effects can be achieved.

Abstract

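An adaptive method and apparatus for audio transmission. The method includes: acquiring the transmission rate of the current network to a streaming media client (S10); calculating the transmission time of an audio frame from the transmission rate and the number of bits of the audio frame of the audio stream to be transmitted under a predetermined encoding strategy (S20); determining whether the transmission time exceeds a preset transmission delay threshold (S30); if it does not, encoding the audio stream according to the predetermined encoding strategy and sending it to the streaming media client; if it does, adjusting the predetermined encoding strategy to reduce the number of bits of the encoded audio frames of the audio stream, and sending the encoded audio stream to the streaming media client (S40).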

Description

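Adaptive Method and Apparatus for Audio Transmission
Cross-Reference to Related Applications
This application claims priority to Chinese Patent Application No. 201510047890.6, filed in China on January 29, 2015, the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of streaming media transmission, and in particular to an adaptive method and apparatus for audio transmission.
Background
Streaming media transmission has become a common function in network communication. Streaming media frameworks impose certain constraints on audio streams, and one of the most common is a requirement on audio stream delay. Audio stream delay consists of network transmission delay and encoding device delay, and it is usually reduced in two ways. One is to reduce network delay by optimizing the network structure, for example by establishing an end-to-end direct physical connection, using an efficient transmission control protocol, and optimizing the network environment. The other is to improve device processing efficiency by increasing the device's computing speed, optimizing the processing logic, and improving program efficiency.
Although both approaches can largely solve the audio stream delay problem, neither addresses the randomness of the network environment. In the related art, whatever network topology or network medium the physical layer is based on, network jitter and network instability cannot be avoided. Under such conditions the delay may exceed the standard requirement, which may cause devices to cooperate abnormally.
Summary
To solve the above technical problem, the present disclosure provides an adaptive method and apparatus for audio transmission, which address the problem that the audio stream delay exceeds the limit when the network jitters.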
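According to one aspect of the present disclosure, an adaptive method for audio transmission is provided, applied to a streaming media server side, including:
acquiring the transmission rate of the current network to a streaming media client;
calculating the transmission time of an audio frame from the transmission rate and the number of bits of the audio frame of the audio stream to be transmitted under a predetermined encoding strategy;
determining whether the transmission time exceeds a preset transmission delay threshold;
if it does not, encoding the audio stream according to the predetermined encoding strategy and sending it to the streaming media client;
if it does, adjusting the predetermined encoding strategy to reduce the number of bits of the encoded audio frames of the audio stream, and sending the encoded audio stream to the streaming media client.
The step of acquiring the transmission rate of the current network to the streaming media client includes:
sending a network probe message to the streaming media client, where the network probe message carries a first time at which the network probe message is sent;
receiving a probe response message sent by the streaming media client in response to the network probe message, where the probe response message carries a second time at which the streaming media client received the network probe message;
calculating the transmission rate to the streaming media client from the difference between the second time and the first time and the number of bits of the network probe message.
The step of adjusting the predetermined encoding strategy to reduce the number of bits of the encoded audio frames includes:
detecting whether the encoding characteristic supports a first encoding strategy;
if it does, reducing the number of bits of the encoded audio frames of the audio stream according to the first encoding strategy, and determining whether the transmission time of the audio frame with the reduced number of bits exceeds the transmission delay threshold; if it does not, sending the encoded audio stream to the streaming media client; if it does, detecting whether the encoding characteristic supports a second encoding strategy;
if it does not, detecting whether the encoding characteristic supports the second encoding strategy; if the second encoding strategy is not supported, discarding some of the encoded audio frames of the audio stream; if the second encoding strategy is supported, reducing the number of bits of the encoded audio frames of the audio stream according to the second encoding strategy, and determining whether the transmission time of the audio frame with the reduced number of bits exceeds the transmission delay threshold; if it does not, sending the encoded audio stream to the streaming media client; if it does, discarding some of the encoded audio frames of the audio stream;
where the first encoding strategy is one of a framing strategy and a multi-rate strategy, and the second encoding strategy is the other.
Where the first encoding strategy is a framing strategy that includes multiple frame lengths, the step of reducing the number of bits of the encoded audio frames of the audio stream according to the first encoding strategy includes:
dividing the audio stream into a plurality of first audio streams and encoding each first audio stream at the current encoding bit rate, where the length of each first audio stream is the shortest frame length in the framing strategy.
Where the second encoding strategy is a multi-rate strategy that includes a set of supported encoding bit rates, the step of reducing the number of bits of the encoded audio frames of the audio stream according to the second encoding strategy includes:
re-encoding the first audio stream at an encoding bit rate from the set that is lower than the current encoding bit rate.
Where the first encoding strategy is a multi-rate strategy that includes a set of supported encoding bit rates, the step of reducing the number of bits of the encoded audio frames of the audio stream according to the first encoding strategy includes:
re-encoding the audio stream at an encoding bit rate from the set that is lower than the current encoding bit rate.
Where the second encoding strategy is a framing strategy that includes multiple frame lengths, the step of reducing the number of bits of the encoded audio frames of the audio stream according to the second encoding strategy includes:
dividing the encoded audio frame into a plurality of first audio frames, where the length of each first audio frame is the shortest frame length in the framing strategy.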
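According to another aspect of the present disclosure, an adaptive apparatus for audio transmission is also provided, applied to a streaming media server side, including:
an acquiring module configured to acquire the transmission rate of the current network to a streaming media client;
a calculating module configured to calculate the transmission time of an audio frame from the transmission rate and the number of bits of the audio frame of the audio stream to be transmitted under a predetermined encoding strategy;
a determining module configured to determine whether the transmission time exceeds a preset transmission delay threshold;
a first adjusting module configured to, when the transmission time does not exceed the transmission delay threshold, encode the audio stream according to the predetermined encoding strategy and send it to the streaming media client;
a second adjusting module configured to, when the transmission time exceeds the transmission delay threshold, adjust the predetermined encoding strategy to reduce the number of bits of the encoded audio frames of the audio stream, and send the encoded audio stream to the streaming media client.
The acquiring module includes:
a sending unit configured to send a network probe message to the streaming media client, where the network probe message carries a first time at which the network probe message is sent;
a receiving unit configured to receive a probe response message sent by the streaming media client in response to the network probe message, where the probe response message carries a second time at which the streaming media client received the network probe message;
a calculating unit configured to calculate the transmission rate to the streaming media client from the difference between the second time and the first time and the number of bits of the network probe message.
The second adjusting module includes:
a first detecting unit configured to detect whether the encoding characteristic supports a first encoding strategy;
a first adjusting unit configured to reduce the number of bits of the encoded audio frames of the audio stream according to the first encoding strategy when the first encoding strategy is supported;
a first determining unit configured to determine whether the transmission time of the audio frame with the reduced number of bits exceeds the transmission delay threshold; if it does not, to send the encoded audio stream to the streaming media client; if it does, to detect whether the encoding characteristic supports a second encoding strategy;
a second detecting unit configured to detect whether the encoding characteristic supports the second encoding strategy when the first encoding strategy is not supported;
a second adjusting unit configured to discard some of the encoded audio frames of the audio stream when the second encoding strategy is not supported, and to reduce the number of bits of the encoded audio frames of the audio stream according to the second encoding strategy when the second encoding strategy is supported;
a second determining unit configured to determine whether the transmission time of the audio frame with the reduced number of bits exceeds the transmission delay threshold; if it does not, to send the encoded audio stream to the streaming media client; if it does, to discard some of the encoded audio frames of the audio stream;
where the first encoding strategy is one of a framing strategy and a multi-rate strategy, and the second encoding strategy is the other.
Where the first encoding strategy is a framing strategy that includes multiple frame lengths, the first adjusting unit includes:
a first adjusting subunit configured to divide the audio stream into a plurality of first audio streams and to encode each first audio stream at the current encoding bit rate, where the length of each first audio stream is the shortest frame length in the framing strategy.
Where the second encoding strategy is a multi-rate strategy that includes a set of supported encoding bit rates, the second adjusting unit includes:
a second adjusting subunit configured to re-encode the first audio stream at an encoding bit rate from the set that is lower than the current encoding bit rate.
Where the first encoding strategy is a multi-rate strategy that includes a set of supported encoding bit rates, the first adjusting unit further includes:
a third adjusting subunit configured to re-encode the audio stream at an encoding bit rate from the set that is lower than the current encoding bit rate.
Where the second encoding strategy is a framing strategy that includes multiple frame lengths, the second adjusting unit further includes:
a fourth adjusting subunit configured to divide the encoded audio frame into a plurality of first audio frames, where the length of each first audio frame is the shortest frame length in the framing strategy.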
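The beneficial effects of the embodiments of the present disclosure are as follows. The adaptive method and apparatus for audio transmission calculate the transmission rate of the current network by sending network probe messages, calculate the transmission time of an audio frame from the transmission rate and the number of bits of the audio frame of the audio stream to be transmitted under the predetermined encoding strategy, and then determine whether that transmission time exceeds the preset transmission delay threshold. If it does not, the audio stream is encoded according to the predetermined encoding strategy and sent to the streaming media client. If it does, the predetermined encoding strategy is adjusted to reduce the number of bits of the encoded audio stream, and the encoded audio stream is sent to the streaming media client. Adjusting the encoding strategy of the audio stream according to the relationship between the current network transmission rate and the preset transmission delay threshold adapts the audio transmission to the transmission rate of the current network. This addresses the problem that network jitter or network instability causes delays above the standard requirement and, in turn, abnormal cooperation between devices.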
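Brief Description of the Drawings
FIG. 1 is a flowchart of the adaptive method for audio transmission of the present disclosure;
FIG. 2 is a flowchart of Embodiment 1 of the present disclosure;
FIG. 3 is a flowchart of Embodiment 2 of the present disclosure;
FIG. 4 is a schematic block diagram of the adaptive apparatus for audio transmission of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the present disclosure, it should be understood that the present disclosure can be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the present disclosure can be understood more thoroughly and its scope can be fully conveyed to those skilled in the art.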
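Embodiment 1
As streaming media transmission has developed and spread, the delay requirements on it have become increasingly strict. At present, however, whatever network topology or network medium the physical layer is based on, network jitter and network instability cannot be avoided. Even with optimized device processing efficiency and an optimized network topology, the delay may still exceed the standard requirement and cause devices to cooperate abnormally. To solve this problem, as shown in FIG. 1, an embodiment of the present disclosure provides an adaptive method for audio transmission, applied to a streaming media server side, which adjusts the encoding strategy of the audio stream according to the current network state. The method mainly includes:
Step 10: acquire the transmission rate of the current network to the streaming media client.
The streaming media server sends a network probe message to the streaming media client. The network probe message carries a first time at which the streaming media server sends it. On receiving the network probe message, the streaming media client responds and feeds a probe response message back to the streaming media server. The probe response message carries a second time at which the streaming media client received the network probe message and a third time at which it fed back the probe response message. The difference between the second time and the first time gives the downlink transmission time of the network probe message. Because the number of bytes or bits in the network probe message is known, the downlink transmission rate of the current network is calculated as the ratio of the data volume of the network probe message to the calculated transmission time. The network probe message depends on the type of network protocol; this embodiment uses an RTSP message as an example. Because an RTSP message carries very few bits, multiple RTSP messages are preferably sent to obtain an accurate transmission rate, and the average of the transmission rates of the RTSP probe messages is used as the basis for assessing the downlink transmission rate of the network. Although an RTSP message carries very few bits, sending such messages frequently still places some load on the network. The network condition is therefore probed once every predetermined interval, and the encoding strategy of the audio stream is adjusted according to the network condition at that time.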
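The rate estimate in Step 10 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the names `ProbeSample`, `downlink_rate_bps` and `average_downlink_rate_bps` are hypothetical, and the sketch assumes the server and client clocks are synchronized, which the text does not discuss.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Sequence


@dataclass
class ProbeSample:
    """One probe exchange (hypothetical container, not from the source)."""
    first_time_s: float   # server send time carried in the probe message
    second_time_s: float  # client receive time carried in the response
    probe_bits: int       # known size of the probe message in bits


def downlink_rate_bps(sample: ProbeSample) -> float:
    """Rate = probe size / (second time - first time)."""
    elapsed = sample.second_time_s - sample.first_time_s
    if elapsed <= 0:
        # Assumption: a non-positive difference means unsynchronized clocks.
        raise ValueError("second time must be later than first time")
    return sample.probe_bits / elapsed


def average_downlink_rate_bps(samples: Sequence[ProbeSample]) -> float:
    """Average the per-probe rates, as the text suggests for small probes."""
    return mean(downlink_rate_bps(s) for s in samples)
```

Averaging the per-probe rates follows the text literally; averaging the elapsed times instead would weight the probes differently and is an equally plausible reading.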
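Step 20: calculate the transmission time of an audio frame from the transmission rate and the number of bits of the audio frame of the audio stream to be transmitted under the predetermined encoding strategy.
When the streaming media server is configured, the encoder on the server side is given an initial configuration, for example its encoding bit rate and encoding frame length, so the streaming media server side has a predetermined encoding strategy. The transmission time of the audio stream, that is, the time needed to transmit it under the current network conditions, is calculated as the ratio of the number of encoded bits of the audio stream under the predetermined encoding strategy to the transmission rate calculated above.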
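The ratio in Step 20 is a one-line calculation. The sketch below uses a hypothetical helper name and assumes the rate is expressed in bits per second and the result in milliseconds, so that it can be compared directly with a threshold given in milliseconds.

```python
def transmission_time_ms(frame_bits: int, rate_bps: float) -> float:
    """Time to send one encoded audio frame over the measured link."""
    if rate_bps <= 0:
        raise ValueError("transmission rate must be positive")
    return frame_bits / rate_bps * 1000.0
```

For example, a 477-bit frame on a 47,700 bit/s link takes about 10 ms.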
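Step 30: determine whether the transmission time exceeds a preset transmission delay threshold.
The preset transmission delay threshold is calculated as follows. The delay required by the standard covers the time to establish the transmission channel, the processing time at the streaming media server side, the processing time at the streaming media client, and the time to transmit the audio stream. The upper limit of the transmission delay threshold is therefore the required delay minus the time to establish the transmission channel, minus the processing times at the streaming media server side and at the streaming media client. For example, if the required delay is 40 ms, establishing the transmission channel between the streaming media server side and the streaming media client takes 20 ms, and the server side and the client each take 3 ms to process one frame of the audio signal, then the transmission delay threshold for transmitting one frame of the audio signal is 14 ms.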
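The threshold arithmetic above can be written out as a small helper. The function and parameter names are illustrative assumptions; the numbers in the usage note are the ones given in the text.

```python
def transmission_delay_threshold_ms(
    required_delay_ms: float,
    channel_setup_ms: float,
    server_processing_ms: float,
    client_processing_ms: float,
) -> float:
    """Delay budget left for sending one frame after fixed costs are removed."""
    threshold = (
        required_delay_ms
        - channel_setup_ms
        - server_processing_ms
        - client_processing_ms
    )
    if threshold <= 0:
        # Assumption: a non-positive budget means the requirement cannot be met.
        raise ValueError("fixed delays already consume the required delay")
    return threshold
```

With the values from the text, `transmission_delay_threshold_ms(40, 20, 3, 3)` gives 14.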
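Step 40: if the threshold is not exceeded, encode the audio stream according to the predetermined encoding strategy and send it to the streaming media client; if it is exceeded, adjust the predetermined encoding strategy to reduce the number of bits of the encoded audio frames of the audio stream, and send the encoded audio stream to the streaming media client.
If the transmission time calculated in Step 20 does not exceed the transmission delay threshold, the current network state is good, and encoding the audio stream to be transmitted according to the initially configured predetermined encoding strategy is enough to transmit it reliably.
If the transmission time calculated in Step 20 exceeds the transmission delay threshold, the current network state is poor or unstable, and the encoding strategy of the audio stream must be adjusted to reduce the number of bits of the encoded audio frames so that each audio frame can be transmitted reliably.
Adjusting the predetermined encoding strategy to reduce the number of bits of the encoded audio frames includes the following steps:
detect whether the encoding characteristic of the streaming media server side itself supports a first encoding strategy;
if the first encoding strategy is supported, reduce the number of bits of the encoded audio frames of the audio stream according to the first encoding strategy, and determine whether the transmission time of the audio frame with the reduced number of bits exceeds the transmission delay threshold; if it does not, send the encoded audio stream to the streaming media client; if it does, detect whether the server's own encoding characteristic supports a second encoding strategy;
if the first encoding strategy is not supported, detect whether the server's own encoding characteristic supports the second encoding strategy;
if the second encoding strategy is not supported, discard some of the encoded audio frames of the audio stream;
if the second encoding strategy is supported, reduce the number of bits of the encoded audio frames of the audio stream according to the second encoding strategy, and determine whether the transmission time of the audio frame with the reduced number of bits exceeds the transmission delay threshold; if it does not, send the encoded audio stream to the streaming media client; if it does, discard some of the encoded audio frames of the audio stream.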
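The fallback order in the steps above can be sketched as a loop over an ordered list of strategies. This is an illustrative reading under stated assumptions: each strategy is modelled as a hypothetical `(supported, reduce_bits)` pair, where `reduce_bits` is a caller-supplied function that returns the smaller frame size, and a return value of `None` stands for "discard the frame".

```python
from typing import Callable, Optional, Sequence, Tuple

Reducer = Callable[[int], int]  # maps current frame bits to reduced frame bits


def adapt_frame_bits(
    frame_bits: int,
    rate_bps: float,
    threshold_ms: float,
    strategies: Sequence[Tuple[bool, Reducer]],
) -> Optional[int]:
    """Return the frame size to send, or None if the frame should be dropped."""

    def fits(bits: int) -> bool:
        return bits / rate_bps * 1000.0 <= threshold_ms

    if fits(frame_bits):
        return frame_bits  # predetermined strategy is good enough

    bits = frame_bits
    for supported, reduce_bits in strategies:
        if not supported:
            continue  # fall through to the next strategy
        bits = reduce_bits(bits)  # second strategy builds on the first
        if fits(bits):
            return bits
    return None  # no supported strategy brought the frame under the threshold
```

Passing the strategies as a list keeps the first/second ordering explicit, so the same function covers both assignments of framing and multi-rate to "first" and "second".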
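The first encoding strategy is one of a framing strategy and a multi-rate strategy, and the second encoding strategy is the other. Encoders that support the framing strategy include existing variable-frame-length encoders, typified by the AMR-WB+ encoder, which provides multiple frame-length modes, including 20 ms, 40 ms and 80 ms frame lengths. A 20 ms frame length means that 20 ms of the audio stream is packed into one frame, that is, one audio frame carries 20 ms of the audio signal. Encoders that support the multi-rate strategy include existing variable-bit-rate encoders, typified by AMR-NB and AMR-WB. Their characteristic is that, although the duration of the audio signal carried by each audio frame is fixed, the number of encoded bits per audio frame is variable. For example, AMR-WB has multiple encoding bit rates, with 477, 461, 397 or 365 encoded bits per audio frame, among others. Whether a shorter frame length or a lower encoding bit rate is used, the aim is to reduce the number of bits in each audio frame and so shorten its transmission time, so that the audio stream is transmitted reliably in the current network state.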
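As a worked illustration of the multi-rate case, the sketch below picks a smaller per-frame bit count from the values quoted in the text. The selection rule (the largest candidate that still fits) is an assumption made for this example; the text only requires a bit rate lower than the current one, and Embodiment 2 prefers the lowest. The tuple is also not a complete list of AMR-WB modes.

```python
from typing import Optional, Sequence

# Per-frame bit counts quoted in the text for AMR-WB (not exhaustive).
QUOTED_FRAME_BITS = (477, 461, 397, 365)


def pick_lower_rate_frame_bits(
    current_bits: int,
    rate_bps: float,
    threshold_ms: float,
    candidates: Sequence[int] = QUOTED_FRAME_BITS,
) -> Optional[int]:
    """Largest candidate below current_bits whose send time fits the threshold."""
    for bits in sorted(candidates, reverse=True):
        if bits >= current_bits:
            continue  # only lower-rate modes are considered
        if bits / rate_bps * 1000.0 <= threshold_ms:
            return bits
    return None  # no quoted mode is small enough
```

On a 30,000 bit/s link with a 14 ms threshold, a 477-bit frame (about 15.9 ms) does not fit, and neither does 461 bits (about 15.4 ms), but 397 bits (about 13.2 ms) does.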
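Because network jitter or network instability lasts only a very short time, the encoding strategy is adjusted only for the current frame of the audio stream to be transmitted. That is, only the first frame of the audio stream is encoded according to the first or second encoding strategy, and the second frame is encoded with the predetermined encoding strategy. Discarding some of the encoded audio frames of the audio stream means discarding the first frame of the audio stream and starting encoding and transmission from the second frame. Because an audio frame is usually no longer than 100 ms, changing the encoding strategy of one frame of the audio signal or discarding one frame has no perceptible effect on audio quality. The effect of this adaptive method on audio stream quality is therefore negligible, and reliable transmission of the audio stream is maintained during network jitter or instability.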
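The per-frame scope described above can be made concrete with a small planning helper. The function name and the three labels are hypothetical; the sketch only shows that the decision applies to the first frame and that later frames fall back to the predetermined strategy.

```python
from typing import List

DECISIONS = ("predetermined", "adjusted", "drop")  # illustrative labels


def plan_frames(frame_count: int, first_frame_decision: str) -> List[str]:
    """Apply the decision to frame 0 only; later frames use the default."""
    if first_frame_decision not in DECISIONS:
        raise ValueError(f"unknown decision: {first_frame_decision}")
    if frame_count <= 0:
        return []
    return [first_frame_decision] + ["predetermined"] * (frame_count - 1)
```

For instance, `plan_frames(3, "drop")` yields `["drop", "predetermined", "predetermined"]`, which matches the description of discarding the first frame and continuing from the second.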
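When the first encoding strategy is the framing strategy and the second encoding strategy is the multi-rate strategy, the framing strategy includes multiple frame lengths and the multi-rate strategy includes a set of supported encoding bit rates. A specific implementation of the adaptive method for audio transmission is shown in FIG. 2:
Acquire the transmission rate of the current network: the streaming media server sends network probe messages to the streaming media client to probe the downlink transmission rate of the current network.
Check against the preset transmission delay threshold: determine whether the audio-frame transmission time calculated from the current network transmission rate exceeds the preset transmission delay threshold.
If it does not, encode the audio stream according to the predetermined encoding strategy and send it to the streaming media client.
If it does, detect whether the encoding characteristic of the encoder on the streaming media server side supports the framing strategy.
If the framing strategy is supported, split the audio stream using the shortest frame length and encode it at the current encoding bit rate, then check whether the transmission time of each audio frame of the encoded audio stream exceeds the preset transmission delay threshold. If it does not, send the encoded audio stream to the streaming media client. If it does, detect whether the encoding characteristic supports the multi-rate strategy. If the multi-rate strategy is not supported, discard some of the encoded audio frames of the audio stream. If it is supported, encode the split audio stream at an encoding bit rate lower than the current one, then check whether the transmission time of each audio frame of the encoded audio stream exceeds the preset transmission delay threshold; if it does not, send the encoded audio stream to the streaming media client. If the transmission time of the audio frame still exceeds the transmission delay threshold after both encoding strategy adjustments, discard some of the encoded audio frames of the audio stream.
If the framing strategy is not supported, detect whether the encoding characteristic of the encoder supports the multi-rate strategy. If it is not supported, discard some of the encoded audio frames of the audio stream. If it is supported, encode the audio stream at an encoding bit rate lower than the current one, then check whether the transmission time of the encoded audio frame exceeds the preset transmission delay threshold; if it does not, send the encoded audio stream to the streaming media client. If the transmission time of the audio frame still exceeds the transmission delay threshold after the encoding strategy adjustment, discard some of the encoded audio frames of the audio stream.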
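The FIG. 2 flow can be sketched with concrete encoder parameters. This is an illustrative model, not the patent's implementation: it assumes a frame carries `bit rate × frame length` bits, treats an empty list as "strategy not supported", and picks the lowest available bit rate, where the text for this embodiment only says "lower than the current one". All names and numbers are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def frame_tx_ms(bitrate_bps: float, frame_ms: float, link_bps: float) -> float:
    """Send time of one frame holding bitrate_bps * frame_ms / 1000 bits."""
    frame_bits = bitrate_bps * frame_ms / 1000.0
    return frame_bits / link_bps * 1000.0


def embodiment_one(
    link_bps: float,
    threshold_ms: float,
    frame_ms: float,
    bitrate_bps: float,
    frame_lengths_ms: Sequence[float] = (),  # empty: framing unsupported
    bitrates_bps: Sequence[float] = (),      # empty: multi-rate unsupported
) -> Optional[Tuple[float, float]]:
    """Return (frame_ms, bitrate_bps) to use, or None to drop the frame."""
    if frame_tx_ms(bitrate_bps, frame_ms, link_bps) <= threshold_ms:
        return frame_ms, bitrate_bps
    if frame_lengths_ms:  # first strategy: shortest frame, same bit rate
        frame_ms = min(frame_lengths_ms)
        if frame_tx_ms(bitrate_bps, frame_ms, link_bps) <= threshold_ms:
            return frame_ms, bitrate_bps
    lower = [b for b in bitrates_bps if b < bitrate_bps]
    if lower:  # second strategy: lower bit rate (lowest is an assumption)
        bitrate_bps = min(lower)
        if frame_tx_ms(bitrate_bps, frame_ms, link_bps) <= threshold_ms:
            return frame_ms, bitrate_bps
    return None
```

With a 20,000 bit/s link, a 14 ms threshold, and a starting point of 40 ms frames at 23,850 bit/s, shortening to 20 ms frames still needs about 23.9 ms, so the sketch falls through to the lower bit rate and returns `(20, 6600)`.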
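In this embodiment, the encoding strategy of the audio stream is adjusted according to the relationship between the current network transmission rate and the preset transmission delay threshold. If, at the current network transmission rate, the transmission time of the current audio frame does not exceed the transmission delay threshold, the audio stream is encoded according to the predetermined encoding strategy and the encoded audio frames are transmitted to the streaming media client. If the transmission time of the current audio frame exceeds the transmission delay threshold, support for the framing strategy and the multi-rate strategy is checked in turn, and the audio stream is encoded according to the corresponding encoding strategy so that the audio transmission adapts to the transmission rate of the current network. This addresses the problem that network jitter or network instability causes delays above the standard requirement and, in turn, abnormal cooperation between devices.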
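Embodiment 2
When the first encoding strategy is the multi-rate strategy and the second encoding strategy is the framing strategy, the multi-rate strategy includes a set of supported encoding bit rates and the framing strategy includes multiple frame lengths. A specific implementation of the adaptive method for audio transmission is shown in FIG. 3:
Acquire the transmission rate of the current network: the streaming media server sends network probe messages to the streaming media client to probe the downlink transmission rate of the current network.
Check against the preset transmission delay threshold: determine whether the audio-frame transmission time calculated from the current network transmission rate exceeds the preset transmission delay threshold.
If it does not, encode the audio stream according to the predetermined encoding strategy and send it to the streaming media client.
If it does, detect whether the encoding characteristic of the encoder on the streaming media server side supports the multi-rate strategy.
If the multi-rate strategy is supported, encode the audio stream at an encoding bit rate lower than the current one, preferably the lowest encoding bit rate, then check whether the transmission time of each audio frame of the encoded audio stream exceeds the preset transmission delay threshold. If it does not, send the encoded audio stream to the streaming media client. If it does, detect whether the encoding characteristic supports the framing strategy. If the framing strategy is not supported, discard some of the encoded audio frames of the audio stream. If it is supported, split the audio stream using the shortest frame length and encode it at the current encoding bit rate, that is, encode the audio stream with the shortest frame length and the lowest encoding bit rate, then check whether the transmission time of each audio frame of the encoded audio stream exceeds the preset transmission delay threshold; if it does not, send the encoded audio stream to the streaming media client. If the transmission time of the audio frame still exceeds the transmission delay threshold after both encoding strategy adjustments, discard some of the encoded audio frames of the audio stream.
If the multi-rate strategy is not supported, detect whether the encoding characteristic of the encoder supports the framing strategy. If it is not supported, discard some of the encoded audio frames of the audio stream. If it is supported, split the audio stream using the shortest frame length and encode it at the current encoding bit rate, then check whether the transmission time of the encoded audio frame exceeds the preset transmission delay threshold; if it does not, send the encoded audio stream to the streaming media client. If the transmission time of the audio frame still exceeds the transmission delay threshold after the encoding strategy adjustment, discard some of the encoded audio frames of the audio stream.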
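The FIG. 3 flow reverses the order: the bit rate is lowered first and the frame is shortened only if that is not enough. The sketch below is an illustrative model under the same assumptions as before: a frame carries `bit rate × frame length` bits, an empty list means the strategy is unsupported, and all names and numbers are hypothetical. Choosing the lowest bit rate follows the stated preference in this embodiment.

```python
from typing import Optional, Sequence, Tuple


def embodiment_two(
    link_bps: float,
    threshold_ms: float,
    frame_ms: float,
    bitrate_bps: float,
    bitrates_bps: Sequence[float] = (),      # empty: multi-rate unsupported
    frame_lengths_ms: Sequence[float] = (),  # empty: framing unsupported
) -> Optional[Tuple[float, float]]:
    """Return (frame_ms, bitrate_bps) to use, or None to drop the frame."""

    def tx_ms(f_ms: float, br_bps: float) -> float:
        # bits = br_bps * f_ms / 1000; time in ms = bits / link_bps * 1000
        return br_bps * f_ms / link_bps

    if tx_ms(frame_ms, bitrate_bps) <= threshold_ms:
        return frame_ms, bitrate_bps
    lower = [b for b in bitrates_bps if b < bitrate_bps]
    if lower:  # first strategy: lowest supported bit rate
        bitrate_bps = min(lower)
        if tx_ms(frame_ms, bitrate_bps) <= threshold_ms:
            return frame_ms, bitrate_bps
    if frame_lengths_ms:  # second strategy: shortest frame at current rate
        frame_ms = min(frame_lengths_ms)
        if tx_ms(frame_ms, bitrate_bps) <= threshold_ms:
            return frame_ms, bitrate_bps
    return None
```

With a 20,000 bit/s link, a 14 ms threshold, and a starting point of 40 ms frames at 23,850 bit/s, lowering the rate to 6,600 bit/s already fits (13.2 ms), so the frame length stays at 40 ms. The order of the two strategies can therefore change which parameters end up in use under the same network conditions.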
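In this embodiment, the encoding strategy of the audio stream is adjusted according to the relationship between the current network transmission rate and the preset transmission delay threshold. If, at the current network transmission rate, the transmission time of the current audio frame does not exceed the transmission delay threshold, the audio stream is encoded according to the predetermined encoding strategy and the encoded audio frames are transmitted to the streaming media client. If the transmission time of the current audio frame exceeds the transmission delay threshold, support for the multi-rate strategy and the framing strategy is checked in turn, and the audio stream is encoded according to the corresponding encoding strategy so that the audio transmission adapts to the transmission rate of the current network. This addresses the problem that network jitter or network instability causes delays above the standard requirement and, in turn, abnormal cooperation between devices.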
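As shown in FIG. 4, an embodiment of the present disclosure also provides an adaptive apparatus for audio transmission, applied to a streaming media server side, including:
an acquiring module configured to acquire the transmission rate of the current network to a streaming media client;
a calculating module configured to calculate the transmission time of an audio frame from the transmission rate and the number of bits of the audio frame of the audio stream to be transmitted under a predetermined encoding strategy;
a determining module configured to determine whether the transmission time exceeds a preset transmission delay threshold;
a first adjusting module configured to, when the transmission time does not exceed the transmission delay threshold, encode the audio stream according to the predetermined encoding strategy and send it to the streaming media client;
a second adjusting module configured to, when the transmission time exceeds the transmission delay threshold, adjust the predetermined encoding strategy to reduce the number of bits of the encoded audio frames of the audio stream, and send the encoded audio stream to the streaming media client.
The acquiring module includes:
a sending unit configured to send a network probe message to the streaming media client, where the network probe message carries a first time at which the network probe message is sent;
a receiving unit configured to receive a probe response message sent by the streaming media client in response to the network probe message, where the probe response message carries a second time at which the streaming media client received the network probe message;
a calculating unit configured to calculate the transmission rate to the streaming media client from the difference between the second time and the first time and the number of bits of the network probe message.
The second adjusting module includes:
a first detecting unit configured to detect whether the encoding characteristic supports a first encoding strategy;
a first adjusting unit configured to reduce the number of bits of the encoded audio frames of the audio stream according to the first encoding strategy when the first encoding strategy is supported;
a first determining unit configured to determine whether the transmission time of the audio frame with the reduced number of bits exceeds the transmission delay threshold; if it does not, to send the encoded audio stream to the streaming media client; if it does, to detect whether the encoding characteristic supports a second encoding strategy;
a second detecting unit configured to detect whether the encoding characteristic supports the second encoding strategy when the first encoding strategy is not supported;
a second adjusting unit configured to discard some of the encoded audio frames of the audio stream when the second encoding strategy is not supported, and to reduce the number of bits of the encoded audio frames of the audio stream according to the second encoding strategy when the second encoding strategy is supported;
a second determining unit configured to determine whether the transmission time of the audio frame with the reduced number of bits exceeds the transmission delay threshold; if it does not, to send the encoded audio stream to the streaming media client; if it does, to discard some of the encoded audio frames of the audio stream;
where the first encoding strategy is one of a framing strategy and a multi-rate strategy, and the second encoding strategy is the other.
Where the first encoding strategy is a framing strategy that includes multiple frame lengths, the first adjusting unit includes:
a first adjusting subunit configured to divide the audio stream into a plurality of first audio streams and to encode each first audio stream at the current encoding bit rate, where the length of each first audio stream is the shortest frame length in the framing strategy.
Where the second encoding strategy is a multi-rate strategy that includes a set of supported encoding bit rates, the second adjusting unit includes:
a second adjusting subunit configured to re-encode the first audio stream at an encoding bit rate from the set that is lower than the current encoding bit rate, and to notify the streaming media client of the current encoding bit rate.
Where the first encoding strategy is a multi-rate strategy that includes a set of supported encoding bit rates, the first adjusting unit further includes:
a third adjusting subunit configured to re-encode the audio stream at an encoding bit rate from the set that is lower than the current encoding bit rate, and to notify the streaming media client of the current encoding bit rate.
Where the second encoding strategy is a framing strategy that includes multiple frame lengths, the second adjusting unit further includes:
a fourth adjusting subunit configured to divide the encoded audio frame into a plurality of first audio frames, where the length of each first audio frame is the shortest frame length in the framing strategy.
It should be noted that the apparatus corresponds to the adaptive method for audio transmission described above; all implementations in the foregoing method embodiments apply to the apparatus embodiment and achieve the same technical effects.
The above are preferred embodiments of the present disclosure. It should be noted that a person of ordinary skill in the art can make improvements and refinements without departing from the principles described in the present disclosure, and these improvements and refinements also fall within the scope of protection of the present disclosure.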

Claims (14)

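  1. An adaptive method for audio transmission, applied to a streaming media server side, the method comprising:
    acquiring the transmission rate of the current network to a streaming media client;
    calculating the transmission time of an audio frame from the transmission rate and the number of bits of the audio frame of the audio stream to be transmitted under a predetermined encoding strategy;
    determining whether the transmission time exceeds a preset transmission delay threshold;
    if it does not, encoding the audio stream according to the predetermined encoding strategy and sending it to the streaming media client;
    if it does, adjusting the predetermined encoding strategy to reduce the number of bits of the encoded audio frames of the audio stream, and sending the encoded audio stream to the streaming media client.
  2. The adaptive method for audio transmission according to claim 1, wherein the step of acquiring the transmission rate of the current network to the streaming media client comprises:
    sending a network probe message to the streaming media client, wherein the network probe message carries a first time at which the network probe message is sent;
    receiving a probe response message sent by the streaming media client in response to the network probe message, wherein the probe response message carries a second time at which the streaming media client received the network probe message;
    calculating the transmission rate to the streaming media client from the difference between the second time and the first time and the number of bits of the network probe message.
  3. The adaptive method for audio transmission according to claim 1 or 2, wherein the step of adjusting the predetermined encoding strategy to reduce the number of bits of the encoded audio frames comprises:
    detecting whether the encoding characteristic supports a first encoding strategy;
    if it does, reducing the number of bits of the encoded audio frames of the audio stream according to the first encoding strategy, and determining whether the transmission time of the audio frame with the reduced number of bits exceeds the transmission delay threshold; if it does not, sending the encoded audio stream to the streaming media client; if it does, detecting whether the encoding characteristic supports a second encoding strategy;
    if it does not, detecting whether the encoding characteristic supports the second encoding strategy; if the second encoding strategy is not supported, discarding some of the encoded audio frames of the audio stream; if the second encoding strategy is supported, reducing the number of bits of the encoded audio frames of the audio stream according to the second encoding strategy, and determining whether the transmission time of the audio frame with the reduced number of bits exceeds the transmission delay threshold; if it does not, sending the encoded audio stream to the streaming media client; if it does, discarding some of the encoded audio frames of the audio stream;
    wherein the first encoding strategy is one of a framing strategy and a multi-rate strategy, and the second encoding strategy is the other.
  4. The adaptive method for audio transmission according to claim 3, wherein the first encoding strategy is a framing strategy, and the framing strategy includes multiple frame lengths; the step of reducing the number of bits of the encoded audio frames of the audio stream according to the first encoding strategy comprises:
    dividing the audio stream into a plurality of first audio streams and encoding each first audio stream at the current encoding bit rate, wherein the length of each first audio stream is the shortest frame length in the framing strategy.
  5. The adaptive method for audio transmission according to claim 4, wherein the second encoding strategy is a multi-rate strategy, and the multi-rate strategy includes a set of supported encoding bit rates; the step of reducing the number of bits of the encoded audio frames of the audio stream according to the second encoding strategy comprises:
    re-encoding the first audio stream at an encoding bit rate from the set that is lower than the current encoding bit rate.
  6. The adaptive method for audio transmission according to claim 3, wherein the first encoding strategy is a multi-rate strategy, and the multi-rate strategy includes a set of supported encoding bit rates; the step of reducing the number of bits of the encoded audio frames of the audio stream according to the first encoding strategy comprises:
    re-encoding the audio stream at an encoding bit rate from the set that is lower than the current encoding bit rate.
  7. The adaptive method for audio transmission according to claim 6, wherein the second encoding strategy is a framing strategy, and the framing strategy includes multiple frame lengths; the step of reducing the number of bits of the encoded audio frames of the audio stream according to the second encoding strategy comprises:
    dividing the encoded audio frame into a plurality of first audio frames, wherein the length of each first audio frame is the shortest frame length in the framing strategy.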
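  8. An adaptive apparatus for audio transmission, applied to a streaming media server side, comprising:
    an acquiring module configured to acquire the transmission rate of the current network to a streaming media client;
    a calculating module configured to calculate the transmission time of an audio frame from the transmission rate and the number of bits of the audio frame of the audio stream to be transmitted under a predetermined encoding strategy;
    a determining module configured to determine whether the transmission time exceeds a preset transmission delay threshold;
    a first adjusting module configured to, when the transmission time does not exceed the transmission delay threshold, encode the audio stream according to the predetermined encoding strategy and send it to the streaming media client;
    a second adjusting module configured to, when the transmission time exceeds the transmission delay threshold, adjust the predetermined encoding strategy to reduce the number of bits of the encoded audio frames of the audio stream, and send the encoded audio stream to the streaming media client.
  9. The adaptive apparatus for audio transmission according to claim 8, wherein the acquiring module comprises:
    a sending unit configured to send a network probe message to the streaming media client, wherein the network probe message carries a first time at which the network probe message is sent;
    a receiving unit configured to receive a probe response message sent by the streaming media client in response to the network probe message, wherein the probe response message carries a second time at which the streaming media client received the network probe message;
    a calculating unit configured to calculate the transmission rate to the streaming media client from the difference between the second time and the first time and the number of bits of the network probe message.
  10. The adaptive apparatus for audio transmission according to claim 8 or 9, wherein the second adjusting module comprises:
    a first detecting unit configured to detect whether the encoding characteristic supports a first encoding strategy;
    a first adjusting unit configured to reduce the number of bits of the encoded audio frames of the audio stream according to the first encoding strategy when the first encoding strategy is supported;
    a first determining unit configured to determine whether the transmission time of the audio frame with the reduced number of bits exceeds the transmission delay threshold; if it does not, to send the encoded audio stream to the streaming media client; if it does, to detect whether the encoding characteristic supports a second encoding strategy;
    a second detecting unit configured to detect whether the encoding characteristic supports the second encoding strategy when the first encoding strategy is not supported;
    a second adjusting unit configured to discard some of the encoded audio frames of the audio stream when the second encoding strategy is not supported, and to reduce the number of bits of the encoded audio frames of the audio stream according to the second encoding strategy when the second encoding strategy is supported;
    a second determining unit configured to determine whether the transmission time of the audio frame with the reduced number of bits exceeds the transmission delay threshold; if it does not, to send the encoded audio stream to the streaming media client; if it does, to discard some of the encoded audio frames of the audio stream;
    wherein the first encoding strategy is one of a framing strategy and a multi-rate strategy, and the second encoding strategy is the other.
  11. The adaptive apparatus for audio transmission according to claim 10, wherein the first encoding strategy is a framing strategy, and the framing strategy includes multiple frame lengths; the first adjusting unit comprises:
    a first adjusting subunit configured to divide the audio stream into a plurality of first audio streams and to encode each first audio stream at the current encoding bit rate, wherein the length of each first audio stream is the shortest frame length in the framing strategy.
  12. The adaptive apparatus for audio transmission according to claim 11, wherein the second encoding strategy is a multi-rate strategy, and the multi-rate strategy includes a set of supported encoding bit rates; the second adjusting unit comprises:
    a second adjusting subunit configured to re-encode the first audio stream at an encoding bit rate from the set that is lower than the current encoding bit rate.
  13. The adaptive apparatus for audio transmission according to claim 10, wherein the first encoding strategy is a multi-rate strategy, and the multi-rate strategy includes a set of supported encoding bit rates; the first adjusting unit further comprises:
    a third adjusting subunit configured to re-encode the audio stream at an encoding bit rate from the set that is lower than the current encoding bit rate.
  14. The adaptive apparatus for audio transmission according to claim 13, wherein the second encoding strategy is a framing strategy, and the framing strategy includes multiple frame lengths; the second adjusting unit further comprises:
    a fourth adjusting subunit configured to divide the encoded audio frame into a plurality of first audio frames, wherein the length of each first audio frame is the shortest frame length in the framing strategy.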
PCT/CN2015/099813 2015-01-29 2015-12-30 Adaptive method and apparatus for audio transmission WO2016119560A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510047890.6A CN105989844B (zh) 2015-01-29 2015-01-29 Adaptive method and apparatus for audio transmission
CN201510047890.6 2015-01-29

Publications (1)

Publication Number Publication Date
WO2016119560A1 true WO2016119560A1 (zh) 2016-08-04

Family

ID=56542364

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/099813 WO2016119560A1 (zh) 2015-01-29 2015-12-30 Adaptive method and apparatus for audio transmission

Country Status (2)

Country Link
CN (1) CN105989844B (zh)
WO (1) WO2016119560A1 (zh)

Also Published As

Publication number Publication date
CN105989844A (zh) 2016-10-05
CN105989844B (zh) 2019-12-13

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15879758

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15879758

Country of ref document: EP

Kind code of ref document: A1