GB2521883A - Media controller - Google Patents


Publication number
GB2521883A
Authority
GB
United Kingdom
Prior art keywords
media
processing device
data packet
data
timestamp
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
GB1407782.0A
Other versions
GB2521883B (en)
GB201407782D0 (en)
Inventor
Senthil Kumar Mani
Harish Rajamani
Bala Manikya Prasad Puram
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Imagination Technologies Ltd
Original Assignee
Imagination Technologies Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Imagination Technologies Ltd
Priority to GB1407782.0A (GB2521883B)
Publication of GB201407782D0
Priority to GB1417242.3A (GB2524349B)
Priority to GB1512312.8A (GB2524430B)
Priority to US14/703,558 (US10778257B2)
Priority to US14/703,479 (US9985660B2)
Publication of GB2521883A
Application granted
Publication of GB2521883B
Priority to US15/975,883 (US10680657B2)
Priority to US16/866,829 (US20200266839A1)
Priority to US16/999,540 (US11323136B2)
Priority to US17/709,206 (US11750227B2)
Legal status: Active
Anticipated expiration


Classifications

    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M13/00 Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
    • H03M13/35 Unequal or adaptive error protection, e.g. by providing a different level of protection according to significance of source information or by adapting the coding according to the change of transmission channel characteristics
    • H03M13/353 Adaptation to the channel
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/28 Flow control; Congestion control in relation to timing considerations
    • H04L47/283 Flow control; Congestion control in relation to timing considerations in response to processing delays, e.g. caused by jitter or round trip time [RTT]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/26 Flow control; Congestion control using explicit feedback to the source, e.g. choke packets
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M13/00 Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
    • H03M13/37 Decoding methods or techniques, not specific to the particular type of coding provided for in groups H03M13/03 - H03M13/35
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M13/00 Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
    • H03M13/65 Purpose and implementation aspects
    • H03M13/6502 Reduction of hardware complexity or efficient processing
    • H03M13/6505 Memory efficient implementations
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00 Arrangements for detecting or preventing errors in the information received
    • H04L1/004 Arrangements for detecting or preventing errors in the information received by using forward error control
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00 Arrangements for detecting or preventing errors in the information received
    • H04L1/004 Arrangements for detecting or preventing errors in the information received by using forward error control
    • H04L1/0045 Arrangements at the receiver end
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00 Arrangements for detecting or preventing errors in the information received
    • H04L1/12 Arrangements for detecting or preventing errors in the information received by using return channel
    • H04L1/16 Arrangements for detecting or preventing errors in the information received by using return channel in which the return channel carries supervisory signals, e.g. repetition request signals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0823 Errors, e.g. transmission errors
    • H04L43/0829 Packet loss
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0852 Delays
    • H04L43/087 Jitter
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/10 Active monitoring, e.g. heartbeat, ping or trace-route
    • H04L43/106 Active monitoring, e.g. heartbeat, ping or trace-route using time related information in packets, e.g. by adding timestamps
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/34 Flow control; Congestion control ensuring sequence integrity, e.g. using sequence numbers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/50 Queue scheduling
    • H04L47/62 Queue scheduling characterised by scheduling criteria
    • H04L47/625 Queue scheduling characterised by scheduling criteria for service slots or service orders
    • H04L47/626 Queue scheduling characterised by scheduling criteria for service slots or service orders channel conditions
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60 Network streaming of media packets
    • H04L65/65 Network streaming protocols, e.g. real-time transport protocol [RTP] or real-time control protocol [RTCP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60 Network streaming of media packets
    • H04L65/75 Media network packet handling
    • H04L65/762 Media network packet handling at the source
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60 Network streaming of media packets
    • H04L65/75 Media network packet handling
    • H04L65/764 Media network packet handling at the destination
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M7/00 Arrangements for interconnection between switching centres
    • H04M7/006 Networks other than PSTN/ISDN providing telephone service, e.g. Voice over Internet Protocol (VoIP), including next generation networks with a packet-switched transport layer
    • H04M7/0081 Network operation, administration, maintenance, or provisioning
    • H04M7/0084 Network monitoring; Error detection; Error recovery; Network testing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60 Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
    • H04N21/63 Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
    • H04N21/643 Communication protocols
    • H04N21/6437 Real-time Transport Protocol [RTP]

Abstract

A data processing device (100) comprising: a jitter buffer (112) for receiving data packets; a media decoder (104) configured to decode the data packets so as to form a stream of media frames, each frame comprising a plurality of samples; a media consumer (111) having an input buffer (102) for receiving the stream of media frames and being configured to play media frames from the input buffer (102) according to a first frame rate; a buffer interface (113) configured to monitor the input buffer (102) so as to detect when the number of samples at the input buffer (102) of the media consumer (111) falls below a predetermined level and, in response, generate a play-out request (109); and a media controller (105) configured to, responsive to each of the generated play-out requests (109) play-out one or more data packets from the jitter buffer (112) to the media decoder (104) so as to cause media frames of the stream to be delivered into the input buffer (102) at a rate commensurate with the first frame rate.

Description

Media Controller
BACKGROUND OF THE INVENTION
This invention relates to a data processing device for playing out buffered media data packets to a media consumer.
Expectation of voice over internet protocol (VoIP) services is growing rapidly due to improvements in high-speed wireless internet technology and more powerful mobile devices. In packet-switched networks, the regularity of a VoIP stream is however naturally impaired by routing, queuing, scheduling and serialization effects, which result in loss and jitter (including delays) to data packets. The main factors affecting voice quality are in fact delay and loss which cannot generally be known in advance to the receiving device because they depend on the real-time behaviour of connections throughout the network.
Achieving high quality real-time voice transmission between VoIP devices requires mechanisms for smoothing out the jitter inherent in a received stream of network data packets. This is generally done by means of an Adaptive Jitter Buffer (AJB).
Most of the existing jitter buffer algorithms calculate play-out times of data packets to a media decoder using adaptive estimation of network jitter. The adaptive algorithm typically uses adaptive dual alpha or other relevant weighting factors, for example as is described in "Perceptual optimisation of playout buffer in voip applications", Chun-Feng Wu and Wen-Whei Chang, First International Conference on Communications and Networking in China, ChinaCom 2006. Network statistics and a history of measurements may also be used for controlling the adaptation, for example as described in "Jitter Buffer Loss Estimate for Effective Equipment Impairment Factor", Pavol Partila et al., International journal of mathematics and computers in simulation.
Such conventional algorithms can sometimes work under slightly impaired network conditions, but the behaviour of bursty traffic, self-similar traffic and long range dependent traffic often differs from the ideal stochastic models of absolutely independent packets which these techniques use when trying to assess or describe traffic inter-arrival times (e.g. using standard distributions such as Markov models, Poisson distributions, exponential distributions, neural network modelling, etc.). These algorithms therefore suffer from suboptimal performance as these models can give wrong or inaccurate predictions on the inter-frame dependency between consecutive packets.
Recently, EMOS (Equivalent Mean Opinion Score) based algorithms have become more popular because they perform better than adaptive estimation algorithms. EMOS algorithms for predicting the subjective quality of packetized voice have been standardised in ITU-T G.107. Examples of EMOS algorithms are described in "E-model MOS estimate precision improvement and modelling of jitter effects", Information and Communication Technologies and Services, Vol. 10, 2012. However, EMOS algorithms are sensitive to network delay and can often discard a significant number of packets even under slightly poor network conditions, for example if a gateway or media server adds considerable fixed delay.
Both adaptive estimation and EMOS algorithms suffer severely when streams of network packets experience significant jitter and bunching effects.
BRIEF SUMMARY OF THE INVENTION
According to a first aspect of the present invention there is provided a data processing device comprising: a jitter buffer for receiving data packets; a media decoder configured to decode the data packets so as to form a stream of media frames, each frame comprising a plurality of samples; a media consumer having an input buffer for receiving the stream of media frames and being configured to play media frames from the input buffer according to a first frame rate; a buffer interface configured to monitor the input buffer so as to detect when the number of samples at the input buffer of the media consumer falls below a predetermined level and, in response, generate a play-out request; and a media controller configured to, responsive to each of the generated play-out requests, play-out one or more data packets to the media decoder so as to cause media frames of the stream to be delivered into the input buffer at a rate commensurate with the first frame rate.
The buffer interface may be supported at the media consumer.
The buffer interface may be supported at the media controller.
The predetermined level may be at least the number of samples comprised in a media frame.
The buffer interface may be configured to periodically check the number of samples at the input buffer at a rate commensurate with the first frame rate.
The data processing device may further comprise a receive queue for receiving data packets from a network, the media controller being configured to periodically store in the jitter buffer all of the data packets available at the receive queue whose timestamps are greater than the timestamp of the last data packet played out by the media controller.
The media controller may be configured to, on storing one or more data packets at the jitter buffer, increase the size of the jitter buffer by the size of those data packets.
The media controller may be configured to maintain a histogram representing a distribution of time periods between the timestamps of successive packets stored at the jitter buffer, the histogram indicating for each of a predetermined range of time periods a measure of the number of successive data packets separated by that time period.
The media controller may be arranged to update the histogram on storing each of the data packets.
The media controller may be configured to estimate a minimum size for the jitter buffer by identifying the lowest time period between the timestamps of successive packets for which the measure of the number of successive data packets separated by that time period is zero.
The media controller may be configured to cause the size of the jitter buffer to adapt so as to be at least the estimated minimum size.
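By way of illustration only, the histogram and the minimum-size estimate described above might be sketched as follows. The class name `GapHistogram`, the list of bin periods, the use of the absolute timestamp difference, and the fallback when every bin is populated are all assumptions made for this sketch and are not taken from the specification.

```python
from bisect import bisect_right


class GapHistogram:
    """Counts gaps between the timestamps of successively stored packets."""

    def __init__(self, periods):
        # Predetermined range of time periods, one bin per period (assumed layout).
        self.periods = sorted(periods)
        self.counts = [0] * len(self.periods)
        self.last_ts = None

    def on_store(self, timestamp):
        """Update the histogram each time a packet is stored in the jitter buffer."""
        if self.last_ts is not None:
            gap = abs(timestamp - self.last_ts)
            # Assign the gap to the largest bin period not exceeding it,
            # clamping to the first and last bins.
            idx = bisect_right(self.periods, gap) - 1
            idx = max(0, min(idx, len(self.counts) - 1))
            self.counts[idx] += 1
        self.last_ts = timestamp

    def estimate_min_size(self):
        """Return the lowest time period whose count is zero."""
        for period, count in zip(self.periods, self.counts):
            if count == 0:
                return period
        # Assumption: if every bin is populated, fall back to the largest period.
        return self.periods[-1]


hist = GapHistogram([20, 40, 60, 80, 100])
for ts in [0, 20, 40, 60, 120]:
    hist.on_store(ts)
min_size = hist.estimate_min_size()
```

In this sketch the jitter buffer would then be adapted so that its size is at least `min_size`, expressed in the same units as the timestamps.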
The media controller may be configured to, responsive to each of the play-out requests, estimate the timestamp of the next packet to be played out from the jitter buffer based on the timestamp of the preceding data packet played out from the jitter buffer and the size of that preceding data packet.
The media controller may be further configured to estimate the timestamp of the next packet to be played out from the jitter buffer based on a measure of the number of media samples added or discarded in accordance with time scale modification algorithms operating at the data processing device.
The media controller may be configured to search the jitter buffer for a best match data packet having a timestamp equal to the estimated timestamp or within the size of one media frame of the estimated timestamp according to the codec in use at the decoder, and if such a best match data packet is identified, play out the best match data packet.
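A minimal sketch of the timestamp estimate and best match search is given below, purely as an illustration. The `Packet` type, the function names, the sign convention of `tsm_adjust` (positive for samples added by time scale modification), and the strict "less than one frame" comparison are assumptions; timestamps and sizes are taken to be in samples.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    timestamp: int  # send timestamp, in samples (assumed unit)
    size: int       # number of media samples carried by the packet


def estimate_next_timestamp(last_played, tsm_adjust=0):
    """Expected timestamp of the next packet: previous timestamp plus its size.

    tsm_adjust is a hypothetical signed correction for samples added or
    discarded by time scale modification.
    """
    return last_played.timestamp + last_played.size + tsm_adjust


def find_best_match(jitter_buffer, estimated_ts, frame_size):
    """Return the packet closest to estimated_ts within one frame, else None."""
    candidates = [p for p in jitter_buffer
                  if abs(p.timestamp - estimated_ts) < frame_size]
    if not candidates:
        return None
    return min(candidates, key=lambda p: abs(p.timestamp - estimated_ts))


last = Packet(timestamp=1000, size=160)
expected = estimate_next_timestamp(last)
stored = [Packet(1480, 160), Packet(1170, 160), Packet(840, 160)]
best = find_best_match(stored, expected, frame_size=160)
```

Here the packet stamped 1170 is selected because it lies within one 160-sample frame of the expected timestamp, while the packets stamped 1480 and 840 do not.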
The media controller may be configured to decrease the size of the jitter buffer by the size of the best match data packet.
The media controller may be further configured to search the jitter buffer for the data packet having the lowest timestamp and, if that lowest timestamp is not equal to the timestamp of the best match data packet, discard the data packet having that lowest timestamp.
The media controller may be configured to play out each best match data packet only if the last data packet played out by the jitter buffer was a SPEECH, DTX, or SID data packet.
The media controller may be configured to, if the size of the jitter buffer was zero on the preceding play-out request being received, play out a synthetic data packet selected in accordance with a time scale modification algorithm and irrespective of the presence or otherwise of a best match data packet.
The media controller may be configured to, if a best match data packet is not identified, play out: if the lowest timestamp is lower than the timestamp of the latest data packet played-out by the media controller, the data packet having the lowest timestamp provided that the latest data packet played out by the media controller was a DTX, LOST, EXP, or DTMF data packet, and otherwise discard the data packet having the lowest timestamp and play-out an EXP data packet; if the lowest timestamp is greater than the timestamp of the latest data packet played-out by the media controller then play-out a synthetic data packet selected in accordance with a time scale modification algorithm.
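The fallback behaviour just described can be summarised, as a sketch only, by a small decision function. The function name, the string labels returned, and the handling of an empty buffer or of equal timestamps (both treated here as the synthetic-packet case) are assumptions, since the text does not specify them.

```python
# Packet types after which a late packet may still be played out.
RESUMABLE_TYPES = {"DTX", "LOST", "EXP", "DTMF"}


def fallback_action(lowest_ts, last_played_ts, last_played_type):
    """Choose what to play out when no best match packet was found.

    lowest_ts is the lowest timestamp in the jitter buffer, or None if the
    buffer is empty (assumed convention).
    """
    if lowest_ts is not None and lowest_ts < last_played_ts:
        if last_played_type in RESUMABLE_TYPES:
            return "play_lowest"
        return "discard_lowest_and_play_exp"
    # Lowest timestamp is ahead of the last played packet: fill the gap with
    # a synthetic packet chosen by the time scale modification algorithm.
    return "play_synthetic"
```

For example, `fallback_action(900, 1000, "SPEECH")` selects `"discard_lowest_and_play_exp"`, whereas `fallback_action(900, 1000, "LOST")` selects `"play_lowest"`.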
The media controller may be configured to, on each data packet being played out, iteratively search for each next best match data packet until an amount of data has been played-out to the decoder to satisfy a number of samples indicated in or represented by the play-out request.
The data processing device may further comprise a frame processor between the decoder and the input buffer, the frame processor configured to perform one or more of noise cancellation, automatic gain control, delay adjustment, sample rate conversion, and multiplexing of media streams.
The data processing device may further comprise packet concealment logic at the decoder or at a packet concealment module between the media controller and decoder, the packet concealment logic being configured to generate media samples in accordance with synthetic packets received from the jitter buffer.
The media controller may be configured to, on storing a data packet whose timestamp precedes the timestamp of the latest played-out data packet by less than the size of the data packet, store only that part of the data packet representing media samples subsequent to the timestamp of the latest played-out data packet, and discard that part of the data packet representing media samples preceding the timestamp of the latest played-out data packet.
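As an illustrative sketch of this partial storage, assume one timestamp unit per media sample and a payload that can be sliced per sample (which holds for a codec such as G.711 but not in general). The function name and the return convention are hypothetical, and treating a packet that lies wholly before the latest played-out timestamp as not stored is an assumption.

```python
def trim_late_packet(timestamp, samples, last_played_ts):
    """Return (timestamp, samples) to store, or None if nothing is storable."""
    overlap = last_played_ts - timestamp
    if overlap <= 0:
        # Packet is not late; store it unchanged (equality handled here by assumption).
        return timestamp, samples
    if overlap < len(samples):
        # Late by less than its own size: keep only the samples that follow
        # the timestamp of the latest played-out packet.
        return last_played_ts, samples[overlap:]
    # Entirely older than the latest played-out timestamp: not stored.
    return None


stored = trim_late_packet(900, list(range(160)), last_played_ts=1000)
```

In this example the first 100 samples are discarded and the remaining 60 are stored with timestamp 1000.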
Each of the said timestamps may be a send timestamp indicative of the time at which each respective data packet was sent over the network.
The data processing device may further comprise a first timer, the media controller being configured to, on receiving the play-out request, calculate an overflow size of the jitter buffer and to: if the overflow size of the jitter buffer exceeds a first threshold, increment the first timer by a measure of the number of samples requested in the play-out request; and otherwise, reset the first timer to zero.
The media controller may be configured to, when the first timer exceeds a second threshold, generate one or more data packets so as to cause the decoder to perform compression by an amount selected in dependence on the overflow size.
The data processing device may further comprise a second timer, the media controller being configured to, when the first timer exceeds a third threshold: if the overflow size of the jitter buffer exceeds the first threshold, increment the second timer by a measure of the number of samples requested in the play-out request; and otherwise, not increment the second timer.
The media controller may be configured to, when the second timer exceeds a fourth threshold, generate one or more data packets so as to cause the decoder to perform compression by an amount selected in dependence on the overflow size.
The fourth threshold may be an adaptive threshold selected in dependence on the overflow size.
The media controller may be configured to reset the second timer to zero on generating the one or more data packets so as to cause the decoder to perform compression.
The overflow size may be the difference between a measure of the size of the jitter buffer on receiving the play-out request and the estimated minimum size of the jitter buffer.
The measure of the size of the jitter buffer may be an average size of the jitter buffer calculated in dependence on the size of the jitter buffer at one or more preceding play-out requests.
The media controller may be configured to select the amount of compression to be around 25% of the overflow size.
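One possible reading of the two-timer arrangement is sketched below for illustration. It combines the first timer, the second timer gated by the third threshold, compression when the second timer exceeds the fourth threshold, and a compression amount of about 25% of the overflow size. The class name, the threshold values, and the use of a fixed rather than adaptive fourth threshold are assumptions; the variant in which the first timer alone triggers compression is not shown.

```python
class OverflowMonitor:
    """Tracks sustained jitter buffer overflow using two sample-count timers."""

    def __init__(self, first_threshold, third_threshold, fourth_threshold):
        self.first_threshold = first_threshold
        self.third_threshold = third_threshold
        self.fourth_threshold = fourth_threshold  # fixed here; may be adaptive
        self.first_timer = 0
        self.second_timer = 0

    def on_playout_request(self, buffer_size, min_size, requested_samples):
        """Return the number of samples to compress by (0 for no compression)."""
        overflow = buffer_size - min_size
        overflowing = overflow > self.first_threshold
        if overflowing:
            self.first_timer += requested_samples
        else:
            self.first_timer = 0
        if self.first_timer > self.third_threshold and overflowing:
            self.second_timer += requested_samples
        if self.second_timer > self.fourth_threshold:
            self.second_timer = 0
            return round(0.25 * overflow)
        return 0


monitor = OverflowMonitor(first_threshold=160, third_threshold=320,
                          fourth_threshold=320)
decisions = [monitor.on_playout_request(960, 320, 160) for _ in range(5)]
```

With these hypothetical thresholds, a persistent overflow of 640 samples leads to a compression of 160 samples on the fifth play-out request, after which the second timer starts again from zero.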
According to a second aspect of the present invention there is provided a method for controlling a stream of data packets received over a network for a media consumer, the media consumer having an input buffer for receiving media frames decoded from the stream of data packets and being configured to play the media frames according to a first frame rate, the method comprising: receiving data packets into a jitter buffer; generating a play-out request when the number of samples comprised in media frames at the input buffer of the media consumer falls below a predetermined level; receiving the play-out request at the media controller; and responsive to that request, the media controller playing-out one or more data packets to a media decoder so as to cause media frames decoded from the stream of data packets to be delivered into the input buffer at a rate commensurate with the first frame rate.
Receiving data packets into the jitter buffer may comprise periodically storing in the jitter buffer all of the data packets available at a network receive queue whose timestamps are greater than the timestamp of the last data packet played out by the media controller.
The method may further comprise: estimating the timestamp of the next packet to be played out from the jitter buffer based on the timestamp of the preceding data packet played out from the jitter buffer and the size of that preceding data packet; searching the jitter buffer for a best match data packet having a timestamp equal to the estimated timestamp or within the size of one media frame of the estimated timestamp according to the codec in use at the decoder; and if such a best match data packet is identified, the media controller playing-out the best match data packet.
The method may further comprise iteratively searching for each next best match data packet and playing-out each such best match data packet until an amount of data has been played-out to the decoder to satisfy a number of samples indicated in or represented by the play-out request.
In embodiments of the invention, machine readable code may be provided for generating the data processing device or media controller. In embodiments of the invention, a machine readable storage medium having encoded thereon non-transitory machine readable code may be provided for generating the data processing device or media controller.
In embodiments of the invention, machine readable code may be provided for implementing the method of controlling a stream of data packets. In embodiments of the invention, a machine readable storage medium having encoded thereon non-transitory machine readable code may be provided for implementing the method of controlling a stream of data packets.
BRIEF DESCRIPTION OF THE DRAWINGS
The present invention will now be described by way of example with reference to the accompanying drawings. In the drawings:
Figure 1 shows a schematic diagram of a data processing device according to an example of the present invention.
Figure 2 is a flowchart illustrating a method performed by the data processing device.
Figure 3 is a schematic diagram of a frame processor of the media controller.
Figure 4 illustrates the performance of the data processing device operating on a simulated network.
DETAILED DESCRIPTION
The following description is presented by way of example to enable any person skilled in the art to make and use the invention. The present invention is not limited to the embodiments described herein and various modifications to the disclosed embodiments will be readily apparent to those skilled in the art.
There is a need for a jitter control mechanism which provides improved performance over conventional techniques when a received network data stream experiences significant jitter, including in the face of high packet delays.
A data processing device is provided that includes a media controller and jitter buffer configured to play out data packets in response to play-out requests from a buffer interface. A data processing device configured as described herein could consume any kind of media, including audio and video, and the media frames generated by a decoder of the data processing device can be any grouping of samples or media data appropriate to the particular implementation (e.g. for an audio consumer, each media frame can be an audio frame comprising a fixed or variable number of media samples).
The data processing device can receive data packets according to any suitable network protocol. Since the play-out of data packets from the buffer is not performed according to a timer of the media controller but in response to play-out requests generated according to frame consumption by the media consumer, the device does not suffer from the problems associated with clock skew between a clock supporting a timer-controlled adaptive jitter buffer and a clock at the media consumer controlling the play rate of media frames.
Figure 1 shows a receive path of a data processing device 100 comprising a media consumer 111, media processor 110 and a receive queue 106 for receiving packet data 108 from a network 107. The media processor 110 includes a decoder 104 for decoding data packets received from the network at the receive queue and forming a stream of media frames. The media consumer receives the decoded media stream into an input buffer 102 from which a media interface 101 reads media frames for playing by the media consumer. In the example shown in Figure 1, the media consumer is an audio device comprising an audio interface 101, the data packets 108 carrying media data are RTP (Real-time Transport Protocol) data packets, and the media stream is a PCM (Pulse Code Modulation) audio stream. The receive queue 106 would in this example typically be a receive queue of an RTP socket. Decoder 104 decodes data packets according to the appropriate codec for the received media streams. Audio codecs typically used for speech compression and decompression include ITU-T G.711, ITU-T G.729AB, ITU-T G.723 and ITU-T G.722.
The media processor 110 includes a play-out media controller (PMC) 105 configured in accordance with the teaching herein and can optionally include a frame processor 103 for processing media data decoded by the decoder 104. The PMC 105 comprises a jitter buffer 112 and logic for controlling the buffer in accordance with the principles described herein. The PMC, decoder and frame processor need not be provided at a common processing entity and are shown grouped together in Figure 1 at a media processor 110 for illustrative purposes only. Generally, the PMC, decoder and frame processor can be provided by any suitable combination of hardware and/or software functions. Only one media stream is shown in Figure 1 but in practice there could be multiple media streams on the receive path stemming from one or more receive queues, each stream terminating at the media interface 101.
Only the receive path of the data processing device is shown in Figure 1. The data processing device could further provide a transmit path for processing a media stream and generating media data packets for transmission over the network. For example, if the data processing device is capable of VoIP (Voice-over-Internet Protocol) communication, the device could comprise a speaker coupled to the media consumer for playing decoded PCM data received over the network and a microphone coupled to a media source arranged to generate PCM data for encoding into data packets that are transmitted over the network. In this manner the data processing device could provide an endpoint for a two-way VoIP conversation.
Conventionally, the receive path of a data processing device for consuming media data received over a network would be arranged to (a) periodically read data packets from the receive queue into an adaptive jitter buffer, and to (b) periodically play-out packets from the adaptive jitter buffer for decoding according to a timer of the jitter buffer. Both (a) and (b) would be performed according to algorithms for jitter control (e.g. that estimate network jitter or use an EMOS measure of network conditions) which operate with reference to a timer available to the jitter buffer. This approach often leads to accumulation or depletion of packets at the jitter buffer due to one or more of the following reasons.
a. A media consumer typically plays media data at a rate controlled by a hardware timer of the media consumer. Since the adaptive jitter buffer will generally be supported by a different clock, clock skew can become a significant problem over time as a media stream is played-out. This is due to drift between the clock of the media consumer and the clock accessible to the adaptive jitter buffer (often a low accuracy system timer).
b. Where a media consumer further receives media data from a jitter buffer by means of one or more intermediate application layers, those layers can introduce further sources of skew due to clock drift between the timers on which the application layers are based and the hardware timer of the media consumer.
c. In certain instances a media consumer can require multiple media frames in quick succession (e.g. when playing media at an accelerated rate); such instances cannot be efficiently serviced by an architecture in which the jitter buffer is arranged to continuously push out data packets at a given rate.
The data processing device shown in Figure 1 overcomes these problems by arranging that play-out requests are sent to the PMC 105 at a rate commensurate with the frame rate at which decoded frames are being consumed by the media consumer 111. This is achieved by arranging that a buffer interface 113 signals play-out requests to the PMC. The PMC does not therefore play-out data packets at a rate determined by its timer based on estimates of jitter or EMOS measures of network conditions. In this manner, the play-out of data packets from the PMC is independent of the time at which the data packets are received at the data processing device; rather the play-out of data packets depends on their send timestamps. The buffer interface could be configured to generate play-out requests at a rate proportional to the frame rate at which the media consumer plays frames from its input buffer such that new frames are delivered into the input buffer at the appropriate rate. For example, the buffer interface can be configured to ensure that new samples/frames are delivered into the input buffer at the appropriate rate by detecting when the number of samples (or frames) at the input buffer drops below a predetermined level; e.g. for a VoIP implementation, a level of fewer than 20 ms of samples has been found to offer good performance.
In Figure 1, the buffer interface 113 is shown as being part of the media consumer, but more generally it could be supported at any suitable aspect of the data processing device. For example, the buffer interface could form part of the PMC itself, with the interface being arranged to monitor the input buffer 102 of the media consumer in order to detect when the number of samples (or frames) at the input buffer drops below a predetermined level. The buffer interface could be a software thread tasked with polling the input buffer in order to identify when the number of samples (or frames) drops below a predetermined level. The buffer interface could be one and the same as the media consumer, with the buffer interface being that aspect of the media consumer configured to detect whether the number of samples at the input buffer has dropped below the predetermined level; this could be done when the media consumer accesses the input buffer to play out samples.
Allowing the consumption of media samples by the media consumer to control the play-out of data packets from the PMC avoids the complexity associated with jitter buffers of having to estimate the level of jitter in the network 107, and, in the case of EMOS mechanisms, avoids the poor performance encountered during periods of network delay.
The data processing device of Figure 1 is configured as follows. The buffer interface 113 is configured to signal media controller (PMC) 105 by means of a play-out request 109 so as to cause the PMC to play-out data packets from its jitter buffer 112 to the decoder 104. For example, each play-out request could cause the PMC to play-out one or more data packets, a sufficient number of data packets to correspond to a given number of samples, or data packets corresponding to a given set of samples (this could be specified by the sequence numbers of those samples). The data for each media frame (such as an audio or video frame) would typically be carried in multiple data packets. The buffer interface can be arranged to signal multiple play-out requests in order to cause the PMC to play-out sufficient data packets to form each media frame for delivery into input buffer 102, or to service a play event performed by the media consumer which comprises a given number of samples/frames.
Each play-out request could indicate to the PMC to play-out a predetermined number of one or more data packets from jitter buffer 112, or to play-out data packets carrying data in respect of an indicated number of samples/period of time (e.g. a range of one or more media frame sequence numbers). For instance, each play-out request could include an indication of the number of samples required to satisfy a play event to be performed by the media consumer, or each play-out request could itself represent to the media controller that some predefined number of samples are to be played-out.
By arranging that play-out requests are generated until there is sufficient data in the input buffer 102 to service a play event, the PMC is not required to estimate the rate at which it must pass data packets onto the decoder. Such estimates are not straightforward because, as well as the fact that with many media codecs there can be a complex relationship between the size of data packet payload and the play length that payload represents, typically there will be gaps in a data packet stream due to late or missing packets. These gaps can be filled by expansion techniques provided by time scale modification algorithms (e.g. playing synthetic data in place of the missing packet), but these techniques further complicate the estimation of the rate at which data packets might be required at the media consumer.
By arranging that play-out requests are sent to the PMC when the media consumer requires data, any skew between the clock of the media consumer and the clock available to the PMC becomes irrelevant in terms of controlling the play-out rate of the data packets. Furthermore, the rate at which media frames are provided to the input buffer of the media consumer can be maintained commensurate with the rate at which the media consumer consumes those frames. This is true irrespective of the processing performed between the PMC and the input buffer 102, such as processing to compensate for delays and gaps between frames, as well as decoder processing according to a given codec. Thus, the media consumer receives data at the correct rate in a manner that is platform-independent and insensitive to the implementation-specific packet and frame processing performed at a given data processing device.
Figure 2 illustrates an exemplary method performed by the receive path of the data processing device of Figure 1 in order to play a media stream received over network 107 in a series of data packets. On a play event 301 being scheduled at the media consumer 111, the buffer interface 113 of the media consumer checks at 302 whether sufficient media data is present at its input buffer 102 to service that play event. If there is, that media data is read from the input buffer 102 and played by the media consumer 303 at the appropriate time in accordance with the scheduled play event. If there is not sufficient media data at its input buffer to service the play event, the buffer interface generates a play-out request and signals the PMC 105 so as to cause the PMC to play-out the appropriate data packets for decoding 304. Independently of the operation of the media consumer, at 305 the receive queue 106 accepts data packets carrying media data from the network.
On receiving packets from the network, the PMC reads data packets from the receive queue 306 into its jitter buffer 112 and calculates one or more buffer control parameters 307, as appropriate to the particular implementation and as described in more detail below. This step can be termed the storing process and is performed independently of the play-out of data packets from the jitter buffer. The PMC could read all of the data packets in the receive queue, a predetermined number of data packets, or a number of data packets determined in dependence on one or more parameters of the system (such as a measure of available space in a data store at which the jitter buffer is supported).
In response to each play-out request, the PMC plays out 308 one or more data packets from its buffer 112 in sequence to decoder 104 for decoding 309. The number of data packets played out can depend on an amount of data indicated in each play-out request, e.g. a number of samples, or length of samples in milliseconds required to service a play event. Step 308 can be termed the packet pick-up process. Typically, various processing steps 310 will be performed on the frame data generated by the decoder at the frame processor 103. These are described in more detail with reference to Figure 3. Finally, at 311 the processed frame data is written to the input buffer 102.
According to its schedule, the buffer interface again checks whether there is sufficient data at the input buffer to service the play event. If there is now sufficient data, the play event is performed 303; if not, a further play-out request is sent to the PMC.
It should be appreciated that Figure 2 does not suggest that the buffer interface 113 must wait for the PMC to cause frame data to actually be posted into the input buffer 102 before it checks again whether there is sufficient data in the input buffer to service the play event. In fact, the buffer interface would preferably periodically check the input buffer for data independently of the progress of the PMC, decoder and frame processor. This avoids the media consumer becoming stalled and ensures that the buffer interface checks for data at a rate appropriate to its play rate of the media stream. For example, if the media consumer requires media frames at a certain rate, the buffer interface 113 can be configured to check its input buffer 102 at a frequency such that, through the signalling of play-out requests, the PMC is caused to deliver data packets into the receive chain at a rate appropriate to meet the demand for data from the media consumer. The media consumer would typically check its input buffer for data at a frequency commensurate with the rate at which the media consumer requires new frames for playing.
It is advantageous if the buffer interface is configured to poll the input buffer at a rate which is greater than the rate at which samples are being consumed by the media consumer from the input buffer 102. For example, if each frame is 20 ms in length then the buffer interface is preferably configured to poll the input buffer at least as frequently as every 10 ms. This helps to ensure that play-out requests are generated at an early opportunity when the number of samples at the input buffer drops below a predetermined level.
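The polling behaviour described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function name `simulate` and its parameters are hypothetical, the consumer is assumed to drain exactly 1 ms of audio per millisecond, and each play-out request is assumed to deliver one whole frame into the input buffer immediately (decoding and frame processing latency are ignored).

```python
def simulate(duration_ms=200, poll_every_ms=10, low_water_ms=20, frame_ms=20):
    """Simulate a buffer interface that polls the input buffer periodically.

    Hypothetical model: one play-out request adds one frame (frame_ms) to
    the input buffer; the media consumer drains 1 ms of audio per tick.
    Returns (number of play-out requests, number of underrun ticks).
    """
    buffered_ms = 0
    requests = 0
    underruns = 0
    for t in range(duration_ms):
        # Poll: issue a play-out request when below the predetermined level.
        if t % poll_every_ms == 0 and buffered_ms < low_water_ms:
            requests += 1
            buffered_ms += frame_ms
        # Consume: the media consumer plays 1 ms of audio if available.
        if buffered_ms > 0:
            buffered_ms -= 1
        else:
            underruns += 1
    return requests, underruns


if __name__ == "__main__":
    print(simulate(poll_every_ms=10))  # polling faster than the frame period
    print(simulate(poll_every_ms=40))  # polling slower than the frame period
```

In this toy model, polling every 10 ms against 20 ms frames keeps the consumer supplied, whereas polling every 40 ms leaves the input buffer empty for part of each interval, which is consistent with the preference for a polling rate above the consumption rate.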
Decoded frame data provided by decoder 104 could optionally be subject to various types of processing, as appropriate to the codecs used and the particular implementation. An example of a frame processor 103 adapted for performing such processing is illustrated in Figure 3 for concurrently handling a plurality of decoded RTP media streams 201 of the type decoded at decoder 104. The frame processor could perform any suitable processing of media frames as is known in the art, including noise cancellation (NC) and/or automatic gain control (AGC) 202, delay adjustment 203, sample rate conversion 204/206 for converting multiple media streams into a common sample rate (e.g. ITU-T G.711 uses an 8 kHz sample rate, G.722 a 16 kHz sample rate, etc), and a mixer for multiplexing of a plurality of media streams 205 into a single stream of media frames 207 for provision to the input buffer 102.
Lost packets, packets discarded by the PMC due to high jitter variation, or problems with higher order out of sequence packets can be concealed through the use of Packet Loss Concealment (PLC) algorithms. This improves voice quality. In the event that the codec being used supports PLC, the decoder supports the Packet Loss Concealment algorithms. In the event that the codec does not support PLC, the Packet Loss Concealment algorithms can be supported at a Packet Loss Concealment module as shown in Figure 1. The Packet Loss Concealment algorithms defined in ITU-T G.711 Appendix I provide good quality with very low complexity. It is the Packet Loss Concealment algorithms that support the synthesis of media samples in response to LOST data packets generated by the PMC, as will be explained below. For example, in response to a LOST data packet generated by the PMC, the PLC algorithms can be configured to repeat the preceding data packet.
A data processing device as described herein could be implemented according to any suitable combination of hardware and software functionalities. For example, the receive path of the data processing device shown in Figure 1 could be implemented entirely in hardware, entirely in software, or as a combination of both hardware and software. In one example, audio interface 101 is a kernel driver of an operating system supported at the data processing device, and frame processor 103, decoder 104, PMC and buffer interface 113 are implemented in software at application-level. Kernel driver 101 would in this case provide a software interface to audio hardware configured to effect the playing of media frames received into input buffer 102. For instance, the audio hardware could include a DAC (digital to analogue converter) to which the PCM frames are directed by the interface for conversion into analogue signals for driving a speaker. In a software implementation of the PMC, the play-out request could be a call to an API provided by the PMC.
As well as playing out packets at the request of the buffer interface, the PMC 105 performs ordering of data packets received at the receive queue 106 into their proper play sequence. This can be performed on play-out of the packets from the PMC according to a packet pick-up process.
On packet data being received into the receive queue 106, the PMC stores at the jitter buffer all the packets which have been received into receive queue 106. The packet storing process involves unpacking the data packets (e.g. their RTP headers) into the receive queue. Packet ordering need not be performed at this stage. It is further advantageous to form one or more buffer control parameters to aid in the proper ordering of data packets and to permit packet validation, overlap time stamp correction and the formation of out-of-order distributions in dependence on which the size of buffer 112 can be adapted.
On receiving each play-out request, the PMC invokes a packet pick-up process in order to play-out packets held at buffer 112 to the decoder. The packet pick-up process is preferably independent of the packet storing process. The packet pick-up process searches jitter buffer 112 of the PMC for the next packet to provide to the decoder given, at each instance, the last packet provided to the decoder. The searching can be facilitated by the buffer control parameters generated by the storing process.
If on receiving a play-out request the jitter buffer 112 is empty, synthetic packets can be generated in the manner described below so as to trigger packet concealment mechanisms in the receive chain.
Examples of the storing and packet pick-up processes will now be described in more detail.
Storing process and buffer control parameters
In the present example, the storing process involves unpacking the media payload of received data packets from their RTP headers and storing the data packets in buffer 112. The PMC does not however store packets having timestamps earlier than those data packets it has already played out; such packets are discarded since they represent missing late packets from an earlier media frame. A late gap parameter can be formed to indicate which data packets are to be discarded, as follows.
Late gap parameter
For each packet read from the receive queue, a time stamp gap $ts_{lg}(k)$ between the last played packet and the current received packet is estimated by subtracting their send time stamps, which represent the time those packets were sent:

$$ts_{lg}(k) = ts_{r}(k) - ts_{p}(k)$$

where $ts_{p}(k)$ corresponds to the timestamp of the latest played packet and $ts_{r}(k)$ corresponds to the timestamp of the received packet.
A negative value of the time stamp gap $ts_{lg}(k)$ indicates that the packet received is a late arrival and should be discarded. The late gap parameter in milliseconds is estimated from the time stamp gap $ts_{lg}(k)$ as:

$$late\_gap(k) = 2^{-ms\_convf(k)} \, ts_{lg}(k)$$

where $ms\_convf(k)$ is a millisecond conversion factor for the codec used. Suitable values for the conversion factor are 3 for narrowband or 8 kHz sampling rate codecs, 4 for wideband or 16 kHz sampling rate codecs, and 5 for super wideband or 32 kHz sampling rate codecs (and so on). Thus, received packets with a corresponding negative late gap are discarded.
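The late gap calculation can be sketched as follows. This is an illustrative sketch only: the names `MS_CONVF`, `late_gap_ms` and `is_late` are hypothetical, timestamps are assumed to be plain integers in sample units, and RTP timestamp wrap-around is not handled.

```python
# Millisecond conversion factors from the text: dividing a timestamp
# difference (in samples) by 2**factor gives milliseconds.
MS_CONVF = {8000: 3, 16000: 4, 32000: 5}


def late_gap_ms(ts_received, ts_last_played, sample_rate_hz):
    """Return the late gap in milliseconds (negative for a late packet)."""
    ts_gap = ts_received - ts_last_played            # gap in samples
    return ts_gap / (1 << MS_CONVF[sample_rate_hz])  # gap in milliseconds


def is_late(ts_received, ts_last_played, sample_rate_hz):
    """A packet with a negative late gap would be discarded on storing."""
    return late_gap_ms(ts_received, ts_last_played, sample_rate_hz) < 0


if __name__ == "__main__":
    # 160 samples after the last played packet at 8 kHz is 20 ms ahead.
    print(late_gap_ms(1160, 1000, 8000))
    # 160 samples before the last played packet is a late arrival.
    print(is_late(840, 1000, 8000))
```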
Histogram parameters
Further buffer control parameters can be formed during the storing process, including one or more histograms which can be used to adapt the size of buffer 112. The first histogram is a short term out of order distribution which can be updated as packets are read from the receive queue as follows. Firstly, the time stamp gap $ts_{g}(k)$ between successive packets received can be estimated by subtracting the previous packet's time stamp, $ts(k-1)$, from the time stamp of the current packet received, $ts(k)$:

$$ts_{g}(k) = ts(k) - ts(k-1)$$

A negative value of the time stamp gap $ts_{g}(k)$ indicates that the packet received is out of order. The out of order gap $o(k)$ is estimated by imposing a ceiling of 0 on the time stamp gap $ts_{g}(k)$ as given below:

$$o(k) = \begin{cases} 0 & \text{if } ts_{g}(k) > 0 \\ -ts_{g}(k) & \text{otherwise} \end{cases}$$

The out of order gap $o(k)$ is converted into milliseconds and quantized into 10 ms segments to give a quantized gap $o_{q}(k)$, by:

$$o_{q}(k) = \left(2^{-ms\_convf(k)} \, o(k) + 9\right) / 10$$

where again $ms\_convf(k)$ is the millisecond conversion factor of the codec.
The PMC is configured to maintain an array representing a histogram of the quantized out of order gaps. A packet segment number sg_no, indicated by a counter that is supported at the PMC and incremented every time the PMC plays out a packet, can be used as an index for the array as follows, with the quantized out of order value $o_{q}(k)$ of the kth packet being stored against its corresponding packet segment number:

$$ofo_{hist}(sg\_no, o_{q}(k)) = ofo_{hist}(sg\_no, o_{q}(k)) + 1$$

The short term histogram $ofo_{hist}$ can be configured to store quantized out of order values for a predetermined length of time, e.g. 15 seconds is typically appropriate in data processing devices for VoIP implementations. In the present example, the duration of each segment is 100 ms and hence the histogram contains 150 segments, each segment containing 20 locations to store quantized out of order information of late received packets (typically up to some maximum delay, such as 200 ms).
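The out of order gap, its quantization into 10 ms steps and the short term histogram update can be sketched as below. All function names are hypothetical. The sketch assumes integer arithmetic (so the "+ 9, divide by 10" step rounds up to the next 10 ms step), assumes that in-order packets (quantized gap 0) are not recorded, and assumes that gaps beyond 200 ms are clamped into the last of the 20 locations; none of these details is stated in the text.

```python
N_SEGMENTS = 150  # 15 s of history at 100 ms per segment (from the text)
N_BINS = 20       # quantized gaps 1..20, i.e. 10 ms .. 200 ms


def new_histogram():
    """Create an empty short term histogram: N_SEGMENTS rows of N_BINS counts."""
    return [[0] * N_BINS for _ in range(N_SEGMENTS)]


def out_of_order_gap(ts_curr, ts_prev):
    """Gap in samples; zero unless the current packet precedes the previous one."""
    ts_gap = ts_curr - ts_prev
    return 0 if ts_gap > 0 else -ts_gap


def quantize_gap(gap_samples, ms_convf):
    """Convert samples to milliseconds, then round up to 10 ms steps."""
    gap_ms = gap_samples >> ms_convf
    return (gap_ms + 9) // 10


def record_gap(hist, sg_no, q):
    """Increment the count for quantized gap q in segment sg_no."""
    if q == 0:
        return  # assumption: in-order packets are not recorded
    row = hist[sg_no % N_SEGMENTS]      # assumption: segments wrap around
    row[min(q, N_BINS) - 1] += 1        # assumption: clamp gaps above 200 ms


if __name__ == "__main__":
    hist = new_histogram()
    gap = out_of_order_gap(ts_curr=1000, ts_prev=1320)  # 320 samples late
    q = quantize_gap(gap, ms_convf=3)                   # 40 ms at 8 kHz
    record_gap(hist, sg_no=0, q=q)
    print(gap, q, hist[0][:5])
```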
A long term histogram of the out of order distribution can also be updated as packets are read from the receive queue during the storing process. The long term histogram represents the distribution of quantized out of order gaps for a predetermined number of packet segments (e.g. a sum of the quantized out of order gaps for the last 120 packet segments). For example, let M be the number of segments and N be the maximum quantized out of order gap considered, then the long term out of order distribution, $ofo_{lt}(k)$, can be estimated as:

$$ofo_{lt}(k) = \sum_{j=1}^{M} ofo_{hist}(j, k), \quad k = 1, 2, \ldots, N$$

Limiting this calculation to a maximum number of segments N can help to avoid high memory usage and delay. Typically only a small number of packets will have a delay which exceeds N. In other examples, the time stamp gap can be used in place of the out of order gap to calculate the short and long term histograms.
Minimum size of jitter buffer
The long term histogram can be used to estimate a minimum size for buffer 112. The long term histogram $ofo_{lt}$ is 2D filtered using a window $\beta_{f} = [1\ 1\ 1\ 1\ 1\ 1]$. The index corresponding to the first zero ($z_{indx}$) in the filtered output is used as an estimate of the minimum jitter buffer size. Let $y$ be the filtered output and $y_{z_{indx}}$ be the first index of the filtered output that has zero output, then the minimum jitter buffer size, $jb^{e}_{ms}(k)$, is:

$$jb^{e}_{ms}(k) = (y_{z_{indx}} - 1) \times 10$$

The size of the buffer can then be adapted so as to be at least the minimum buffer size.
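One possible reading of the minimum size estimate is sketched below. The text does not state how the six-tap window is aligned, so this sketch assumes a forward-looking moving sum: the filtered output at a given quantized gap is the sum of the long term counts for that gap and the five larger gaps, so a zero means no out of order packets were seen at or beyond that delay. The function names are hypothetical, and list index 0 is assumed to hold the count for a quantized gap of 1 (10 ms).

```python
def long_term_histogram(short_term):
    """Sum the per-segment counts column by column."""
    return [sum(column) for column in zip(*short_term)]


def min_jitter_buffer_ms(long_term, window=6):
    """Estimate the minimum jitter buffer size in milliseconds.

    Assumption: the filter is a forward-looking moving sum of `window`
    ones. With 0-based index i, the 1-based index of the first zero is
    i + 1, so (index - 1) * 10 reduces to i * 10.
    """
    for i in range(len(long_term)):
        if sum(long_term[i:i + window]) == 0:
            return i * 10
    # No zero found: fall back to the largest delay the histogram covers.
    return len(long_term) * 10


if __name__ == "__main__":
    short_term = [
        [2, 1, 0, 0, 0, 0, 0, 0, 0, 0],
        [3, 2, 1, 0, 0, 0, 0, 0, 0, 0],
    ]
    long_term = long_term_histogram(short_term)
    print(long_term, min_jitter_buffer_ms(long_term))
```

In this example, out of order gaps of up to 30 ms were observed, and the estimate is correspondingly 30 ms.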
Once the buffer control parameters have been updated, the respective data packet is stored in the buffer. The buffer therefore includes all valid data packets whose timestamps are later than the timestamp of the last data packet played out by the PMC.
For each data packet stored in the buffer, the size of the buffer is incremented by the size of that data packet.
Loss impact on jitter buffer size
When there is no loss in the network, the size of the jitter buffer maintained according to the mechanisms described above represents the correct value. However, when packet loss occurs, the size of the jitter buffer will not be correct. Whenever there is significant loss in the network, the jitter buffer size $jb_{ms}(k)$ is lower than the true value.
Hence, it is possible that $jb_{ms}(k)$ is greater than the minimum jitter buffer size $jb^{e}_{ms}(k)$ and yet compression will not be invoked where in fact compression should be performed in order to control the buffer size. It is therefore important to account for the impact of packet loss on $jb_{ms}(k)$ for proper delay control.
The impact of packet loss on jitter buffer size can be accounted for as follows. One or more counters are established to keep track of the number/duration of samples lost as a result of partial or complete packet loss. For example, each time the number of samples lost exceeds a defined segment of time, a value representing that segment can be stored in a corresponding buffer. The corrected size of the jitter buffer, $jb_{corr}$, can then be given by:

$$jb_{corr} = \left(1 + \frac{\text{total length of segments lost}}{\text{length of a segment}}\right) jb_{ms}$$

In other words, the minimum jitter buffer size can be scaled by the number of segments of time lost.
Packet pick-up process
If the buffer 112 is not of zero size (i.e. contains at least one data packet) and the late gap of the last played-out packet is zero, the PMC performs the packet pick-up process which plays out data packets from the jitter buffer in sequence to the decoder.
Preferably the packet pick-up process is performed independently of the storing process because this means the next packet for play-out is more likely to be found when the buffer is searched (due to network jitter, packets may be received out of order into the receive queue). The packet pick-up process could be performed concurrently such that the pick-up process overlaps the storing process. If the buffer is of zero size (i.e. is empty) and the late gap of the last played-out packet is greater than zero (it cannot be less than zero since those packets are discarded), then the buffer can generate one or more EXP (expansion) packets to cause subsequent entities in the receive chain to generate synthetic samples to fill the missing period in the stream of media data. EXP packets can cause the decoder to generate synthetic samples to fill a time gap of the length indicated by the EXP packet. This is performed in accordance with a time scale modification (TSM) scheme in place at the decoder.
The packet pick-up process then estimates the expected timestamp and sequence number of the next packet which is expected to be played. In the present example, the expected timestamp, $ts^{e}(k)$, and sequence number, $sqn^{e}(k)$, of the kth packet can be calculated as:

$$ts^{e}(k) = ts(k-1) + 2^{ms\_convf(k-1)} \, pkt\_sz(k-1) + dtx(k-1) + phcd(k-1)$$

$$sqn^{e}(k) = sqn(k) + 1 + phcd\_pkt(k-1)$$

where $ts(k-1)$ is the timestamp of the previous, (k-1)th, packet, $pkt\_sz(k-1)$ is the size of that previous packet, and $dtx(k-1)$ and $phcd(k-1)$ represent corrections in the event that the data processing device supports the use of discontinuous transmission (DTX) frames and phase synchronization. $dtx(k-1)$ is a measure of the number of samples comprised in DTX frames played following the previous packet played out by the PMC, and $phcd(k-1)$ represents the number of samples discarded for phase synchronization following the previous packet. $sqn(k)$ is the next sequence number after the sequence number of the previous packet played out by the PMC.
$phcd\_pkt(k-1)$ represents the number of best match packets discarded for phase synchronization following the previous packet played out by the PMC.
The PMC then searches in its buffer for a packet having the expected timestamp $ts^{e}(k)$ or sequence number $sqn^{e}(k)$ using an appropriate search algorithm. A linear search algorithm has been found to offer good performance. This packet is referred to as the best match packet (BMP). The PMC also searches for the minimum timestamp packet (MTP) or minimum sequence number packet (MSP) (i.e. those packets having the minimum timestamp or sequence number) in its buffer.
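The expected timestamp calculation and the linear search for the best match and minimum timestamp packets can be sketched as follows. This is a hedged illustration: packets are modelled as dictionaries with a hypothetical `"ts"` key, the previous packet size is assumed to be expressed in milliseconds (so that shifting by the conversion factor yields samples), and the DTX and phase synchronization corrections are assumed to default to zero when unused.

```python
def expected_timestamp(ts_prev, pkt_sz_prev_ms, ms_convf, dtx=0, phcd=0):
    """Timestamp expected for the next packet, in samples.

    ts_prev:        timestamp of the previously played packet
    pkt_sz_prev_ms: size of that packet in milliseconds (assumption)
    ms_convf:       codec conversion factor (3 for 8 kHz, 4 for 16 kHz, ...)
    dtx, phcd:      sample corrections for DTX frames and phase sync
    """
    return ts_prev + (pkt_sz_prev_ms << ms_convf) + dtx + phcd


def find_best_match(buffer, ts_expected):
    """Linear search for the packet carrying the expected timestamp."""
    for pkt in buffer:
        if pkt["ts"] == ts_expected:
            return pkt
    return None


def find_min_timestamp(buffer):
    """Return the packet with the smallest timestamp, or None if empty."""
    return min(buffer, key=lambda p: p["ts"]) if buffer else None


if __name__ == "__main__":
    # Packets stored in arrival order, which need not be play order.
    jitter_buffer = [{"ts": 1320}, {"ts": 1160}, {"ts": 1480}]
    ts_e = expected_timestamp(ts_prev=1000, pkt_sz_prev_ms=20, ms_convf=3)
    print(ts_e, find_best_match(jitter_buffer, ts_e),
          find_min_timestamp(jitter_buffer))
```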
If a valid best match packet having the expected timestamp or sequence number is found, it is played out unless at the last attempt to play out a packet the jitter buffer was empty. It can be useful to check whether the best match packet is valid according to the mechanism described below. If at the last attempt to play out a packet the buffer was empty, the PMC is configured to play out a synthetic packet such as a DTX, DTMF (Dual-tone multi-frequency) or expansion packet that will cause the decoder and/or frame processor to generate synthetic samples. The selection of a DTX, DTMF, or EXP packet would be made according to the communication protocols in operation at the data processing device; generally the selection will be dependent on the type of one or more preceding data packets played out by the PMC. Furthermore, if at the last attempt to play out a packet the buffer was empty, it can be advantageous to increase the size of the buffer, for example by playing an expansion frame. This is because an empty buffer is a sign of a high level of network jitter for which a larger buffer would be appropriate.
On playing out the best match packet, the size of the buffer 112 is correspondingly decreased:

$$pkt\_sz(k) = bm\_sz(k)$$

$$jb_{ms}(k) = jb_{ms}(k-1) - pkt\_sz(k)$$

where $pkt\_sz(k)$ is a working packet size parameter, $bm\_sz(k)$ is the size of the best match packet, $jb_{ms}(k)$ is the size of the buffer, and k is the packet index.
Minimum timestamp packet validation
The PMC searches for both the best match packet (BMP) and the packet having the minimum timestamp (MTP) or, equivalently, the minimum sequence number (MSP).
When a best match packet is found, the MTP/MSP and the BMP should be the same packet.
Otherwise the MTP/MSP packet is an invalid or old packet and is discarded from the buffer. The size of the buffer is then also updated:

$$jb_{ms}(k) = jb_{ms}(k) - mtp\_sz(k)$$

where $mtp\_sz(k)$ is the size of the MTP/MSP packet.
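A minimal sketch of this check is given below. The function name `drop_stale_mtp` and the dictionary keys `"ts"` and `"size"` are hypothetical, and the sketch assumes a best match packet has already been found in the buffer (so the buffer is not empty). It removes a single stale packet per call, as described in the text.

```python
def drop_stale_mtp(buffer, bmp, jb_size):
    """Discard the minimum timestamp packet if it is not the best match.

    buffer:  list of packet dicts with "ts" and "size" keys (assumption)
    bmp:     the best match packet already found in `buffer`
    jb_size: current buffer size, in the same units as packet sizes
    Returns the updated buffer size.
    """
    mtp = min(buffer, key=lambda p: p["ts"])
    if mtp["ts"] != bmp["ts"]:
        # The MTP is older than the expected packet: treat it as invalid.
        buffer.remove(mtp)
        jb_size -= mtp["size"]
    return jb_size


if __name__ == "__main__":
    jitter_buffer = [{"ts": 840, "size": 20}, {"ts": 1160, "size": 20}]
    best_match = jitter_buffer[1]
    print(drop_stale_mtp(jitter_buffer, best_match, jb_size=40), jitter_buffer)
```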
Best match packet validation
If the last played packet is a packet of type SPEECH (i.e. a regular packet carrying media samples for a frame), DTX or SID (Silence Insertion Descriptor), then the best match packet is considered a valid packet and is played out.
If the last packet played is a SYNTHETIC packet of type EXP or LOST but the packet before that was a SPEECH packet, then an estimate of the expected timestamp for the best match packet is formed. LOST packets can be generated by the PMC so as to trigger the operation of packet loss concealment algorithms in the receive chain. The number of samples concealed by a sequence of one or more LOST frames can be determined through the use of a counter $lost(k-1)$ which is incremented by the appropriate number of samples when a LOST packet is played out and reset to 0 when another packet type is played out. Let $lost(k-1)$ be the number of samples concealed by the LOST frames played after the (k-1)th packet, and $lost\_pks(k-1)$ be the number of LOST packets played after the (k-1)th GOOD packet. Then the expected timestamp of the BMP is:

$$ts^{e}(k) = bm\_ts(k-1) + lost(k-1) + pkt\_sz(k-1)$$

Then the timestamp gap $ts^{bm}_{g}(k)$ between the time stamp of the BMP, $ts^{bm}(k)$, and the expected timestamp of the BMP, $ts^{e}(k)$, is calculated. If the timestamp gap $ts^{bm}_{g}(k)$ is zero or lower than the codec frame size, the BMP is considered valid and played out. Similarly, if the timestamp gap $ts^{bm}_{g}(k)$ is greater than 10 times the codec frame size, or the packet type of the BMP is either SID or DTX, the BMP is considered valid and played out.
Otherwise the BMP is treated as invalid and discarded.
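The validation rule applied after a synthetic packet can be sketched as follows. The function names are hypothetical, all quantities are assumed to be in timestamp units (samples), and the timestamp gap is assumed to be non-negative, so a candidate whose timestamp falls before the expected value is treated as invalid here; the text does not address that case.

```python
def expected_ts_after_loss(bm_ts_prev, lost_samples, pkt_sz_prev):
    """Expected timestamp of the next best match packet after LOST/EXP frames."""
    return bm_ts_prev + lost_samples + pkt_sz_prev


def bmp_is_valid(bm_ts, ts_expected, codec_frame, bmp_type="SPEECH"):
    """Apply the validity rules for a best match packet.

    Valid if the packet is SID or DTX, if the gap is below one codec
    frame, or if the gap exceeds ten codec frames.
    """
    if bmp_type in ("SID", "DTX"):
        return True
    gap = bm_ts - ts_expected
    if 0 <= gap < codec_frame:          # assumption: gap is non-negative
        return True
    return gap > 10 * codec_frame


if __name__ == "__main__":
    frame = 80  # 10 ms at 8 kHz
    ts_e = expected_ts_after_loss(bm_ts_prev=1000, lost_samples=160,
                                  pkt_sz_prev=160)
    for candidate in (1320, 1360, 1720, 2200):
        print(candidate, bmp_is_valid(candidate, ts_e, frame))
```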
Validation of buffer size
If the timestamp gap $ts^{bm}_{g}(k)$ between the timestamp of the BMP and its expected timestamp is smaller than the size of the BMP, the size of the buffer is considered optimal. When the timestamp gap $ts^{bm}_{g}(k)$ is greater than the size of the BMP, the size of the buffer is checked using the following conditions:

$$jb_{ms}(k) \ge jb^{e}_{ms}(k) + codec\_frsz + 10$$

$$ts^{bm}_{g}(k) \le 2 \, pkt\_sz(k)$$

where $jb_{ms}(k)$ is the size of the buffer, $jb^{e}_{ms}(k)$ is the minimum jitter buffer size, and codec_frsz is the frame size of the codec according to which the packet is formed. It depends on the codec used. Its value for general codecs like the G.711 or G.722 codec is 10 ms and for the AMR codec is 20 ms.
If both of the above conditions are satisfied, the jitter buffer size is not optimal and the best match packet is discarded. The next best match packet is then searched for in the jitter buffer. Otherwise the buffer size is optimal and the best match packet is played-out.
Frame Prediction
In the case that the best match packet is not found in the jitter buffer, a frame prediction algorithm is invoked by the PMC. The algorithm provides two prediction methods which are selected using a lag parameter. The lag parameter is formed from the timestamp gap $ts^{mtp}_{g}(k)$, which is the gap between the timestamp of the minimum timestamp packet (MTP) and the timestamp of the last played packet:

$$lag(k) = \begin{cases} 1 & \text{if } ts^{mtp}_{g}(k) > 0 \\ 2 & \text{otherwise} \end{cases}$$

When $lag(k) = 2$ the timestamp of the MTP is lower than the timestamp of the last played packet, which indicates that the MTP is a late packet. If the last packet played out was an EXP, DTMF, DTX or LOST packet, the minimum timestamp packet is played out in the place of the best match packet. Otherwise, the MTP is discarded and an EXP packet is played out.
The value of $lag(k) = 1$ indicates that the expected packet is not available but one or more future packets are available, i.e. the expected packet might be lost in the network or is going to arrive late. The selection of the frame type during this case can be controlled in accordance with any suitable algorithm for coping with missing packets or gaps between packets. These can include the use of EXP packets to cause the decoder to replace the missing packet with synthetic samples, a DTMF packet to cause the decoder to replace the missing packet with one or more tones, an SID or DTX packet to cause the decoder to insert silence, or a LOST packet to cause a subsequent entity of the receive chain (such as a Packet Loss Concealment module as described above) to replace the missing packet in accordance with a concealment algorithm at that entity. The choice of synthetic packet can depend on many factors, such as the past frame played, buffer size, and the timestamp gap between the last played out packet and the next immediate available packet's timestamp.
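The frame prediction decision can be sketched as below. The function names and the returned action labels are hypothetical. For a lag of 1 the text leaves the choice of synthetic packet to the implementation, so the sketch only reports that a synthetic packet is required rather than choosing its type.

```python
SYNTHETIC_TYPES = {"EXP", "DTMF", "DTX", "LOST"}


def lag(mtp_ts, last_played_ts):
    """Return 1 if the MTP lies ahead of the last played packet, else 2."""
    return 1 if mtp_ts - last_played_ts > 0 else 2


def predict_frame(mtp_ts, last_played_ts, last_played_type):
    """Choose an action when no best match packet is in the buffer."""
    if lag(mtp_ts, last_played_ts) == 2:
        # The MTP is a late packet.
        if last_played_type in SYNTHETIC_TYPES:
            return "PLAY_MTP"
        return "DISCARD_MTP_AND_PLAY_EXP"
    # Future packets exist but the expected one is missing or late.
    return "PLAY_SYNTHETIC"


if __name__ == "__main__":
    print(predict_frame(900, 1000, "LOST"))
    print(predict_frame(900, 1000, "SPEECH"))
    print(predict_frame(1320, 1000, "SPEECH"))
```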
By performing the packet pick-up process for each packet held at its jitter buffer, the PMC 105 achieves the play-out of data packets in sequence and allows synthetic packets generated according to algorithms running at the PMC to be generated. This ensures that the decoder 104 is provided with a continuous stream of data packets from which it can generate a continuous stream of samples for media frames, without gaps between frames or missing data. The decoder can therefore be optimised purely for decoding and is not required to perform packet concealment on the fly.
Overlap timestamp correction
One of the most commonly used methods to sustain voice quality during bad network conditions is to resend missing payloads by piggybacking the missing payloads at the transmitter with subsequent payloads. RFC 2198 provides the interoperability requirements for such schemes. However, interleaving payloads is not possible in the data processing device described herein since each payload should represent a continuous segment of data. In order to address this, the PMC is configured to detect timestamp overlaps between the media data carried in data packet payloads and discard those parts of payloads that have already been received in preceding data packets.
Correction of overlapping timestamps can be achieved by configuring the packet storing process of the PMC to discard in their entirety those packets whose timestamp precedes the timestamp of the last played-out data packet by at least the size of a data packet (see the late gap parameter above). If during the storing process the timestamp of a data packet read from the receive queue precedes the timestamp of the last played-out data packet by less than the size of a data packet, then that part of the payload of the received data packet which falls subsequent to the timestamp of the last played-out data packet is stored in the buffer and the earlier portion is discarded. A data packet payload can be divided into segments each representing a certain length of samples according to the codec used (e.g. 10 ms for many audio codecs).
This ensures that a data packet payload is not split at an inappropriate point and maintains the integrity of the payload media data.
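The overlap correction can be illustrated with the following sketch. It assumes a payload is already available as a list of equal-length codec segments, that timestamps and segment lengths share one unit, and that a segment straddling the last played-out timestamp is dropped rather than split; the function name and return convention are hypothetical.

```python
from typing import Optional, Sequence


def trim_overlap(ts: int,
                 segments: Sequence[bytes],
                 seg_len: int,
                 last_played_ts: int) -> Optional[tuple[int, list[bytes]]]:
    """Return (new_timestamp, kept_segments), or None if nothing is stored."""
    size = len(segments) * seg_len
    late_by = last_played_ts - ts
    if late_by <= 0:
        # Packet does not precede the last played-out timestamp: keep it all.
        return ts, list(segments)
    if late_by >= size:
        # Packet precedes it by at least its own size: discard entirely.
        return None
    # Count the leading segments that start before last_played_ts
    # (integer ceiling of late_by / seg_len) and drop them whole.
    drop = -(-late_by // seg_len)
    kept = list(segments[drop:])
    if not kept:
        return None
    return ts + drop * seg_len, kept
```

Cutting only at segment boundaries is what keeps each stored payload decodable; the choice to drop a straddling segment instead of keeping it is an assumption of this example, not something the description specifies.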
Buffer compression management
In order to cope with overflow conditions at the jitter buffer (e.g. a larger than expected buffer size due to network jitter), buffer compression algorithms can be used to maintain a smooth flow of data packets to the decoder whilst reducing the size of the buffer back to its desired level. A buffer compression algorithm suitable for use at the PMC will now be described.
The overflow size of the jitter buffer can be given by the difference between the current (potentially average) size of the buffer (i.e. for the current data packet k required for play-out) and the calculated minimum size of the buffer $jb_{min}(k)$ described above, as potentially modified by any loss impact mechanisms in operation at the media processor 110. Thus, the overflow size can be defined as:

$$of_{ms}(k) = jb_{size}(k) - jb_{min}(k)$$

where $jb_{size}(k)$ is the current size of the jitter buffer.

Using the average buffer size in place of the current size, the average overflow size is defined as:

$$of_{ms}(k) = jb_{avg}(k) - jb_{min}(k)$$

where $jb_{avg}(k)$ is the average size of the jitter buffer. In order to avoid sharp changes in overflow size, it is advantageous if $jb_{avg}(k)$ is an average value calculated from the current and the previous, (k-1)th, data packet. In one example, the average size of the buffer can be calculated from:

$$\delta(k) = \begin{cases} 0 & \text{if } jb_{size}(k) \ge jb_{[\ldots]}(k) \\ 0 & \text{else if } lag(k) = 1 \\ 0 & \text{else if the last packet was a DTX or SID packet} \\ jb_{size}(k) - jb_{avg}(k-1) & \text{otherwise} \end{cases}$$

$$jb_{avg}(k) = jb_{avg}(k-1) + \alpha(k)\,\delta(k)$$

where $jb_{size}(k)$ is the current size of the jitter buffer for data packet k, and

$$\alpha(k) = \frac{pkt\_sz(k)}{2^{[\ldots]} \cdot 10}$$

is an averaging factor with $pkt\_sz(k)$ the size in milliseconds of data packet k.
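The recursive averaging above can be sketched as below. This is an illustration under stated assumptions: the averaging factor is passed in as a plain parameter `alpha` rather than derived from the packet size, the first (size comparison) condition of the update is omitted, and the function names are hypothetical.

```python
from typing import Optional


def update_average_size(jb_avg_prev: float,
                        jb_size: float,
                        lag: Optional[int],
                        last_packet_type: str,
                        alpha: float) -> float:
    """Return the new average buffer size jb_avg(k) in milliseconds."""
    if lag == 1 or last_packet_type in ("DTX", "SID"):
        # No update while a packet is missing or during silence.
        delta = 0.0
    else:
        delta = jb_size - jb_avg_prev
    return jb_avg_prev + alpha * delta


def overflow_ms(jb_avg: float, jb_min: float) -> float:
    """Average overflow size: average size minus the calculated minimum size."""
    return jb_avg - jb_min
```

A small `alpha` makes the average, and hence the overflow size, respond slowly to short bursts, which is the stated purpose of averaging.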
A two-stage timer mechanism can be used with the PMC in order to control fluctuations in the size of the jitter buffer. An overflow detection timer can provide the first stage of control. The overflow detection timer is configured to, when the average overflow size is greater than zero, increment from a starting point of zero on each play request being received at the PMC. The timer is incremented by the size of the play request received.
For example, if a play request is received for 10 ms of packet data, the overflow detection timer will increase by 10 ms. It will be appreciated that metrics other than the time represented by data packets could be used by the overflow detection timer, such as a number of samples or an amount of data. The overflow detection timer is reset to zero whenever the size of the jitter buffer is equal to or smaller than the calculated minimum size, i.e. when the average overflow size is zero or negative.
The overflow detection timer is arranged to trigger the second-stage CMP triggering timer to start when the overflow size of the jitter buffer exceeds some predefined level.
This avoids compression being triggered by small fluctuations in the size of the jitter buffer.
The CMP triggering timer increments in the same manner as the overflow detection timer: when the average overflow size is greater than zero, the timer increments from a starting point of zero on each play request being received at the PMC. The timer is incremented by the size of the play request received, or by some other suitable metric.
The CMP triggering timer is arranged to trigger when it reaches an adaptive threshold $T_a$ which can be selected in dependence on the overflow size. For example:

$$T_a(k) = \begin{cases} 2 \cdot \min\left(T_4,\ 2^{[\ldots]}\, of_{ms}(k)\right) & \text{when } of_{ms}(k) > 10\ \text{ms and } jb_{[\ldots]}(k) \le 10 \\ \min\left(T_4,\ 2^{[\ldots]}\, of_{ms}(k)\right) & \text{otherwise} \end{cases}$$

Suitable values for $T_4$ can be around 3500 ms for a packet size of 20 ms.
The adaptive threshold can be recalculated on each play request being received so as to constantly adapt the threshold in dependence on the overflow size. By arranging that the adaptive threshold is capped at T4, the threshold can increase as the overflow size increases at smaller values of overflow, but the threshold does not exceed the cap so as to ensure that the PMC can rapidly respond to large overflow values.
Once the CMP triggering timer reaches its threshold (whether adaptive or otherwise), compression of the samples carried in the data packets at the jitter buffer can be performed. This can be achieved by configuring the PMC 105 to generate a CMP packet for indicating to the decoder 104 that compression by a certain length of samples (e.g. a certain number of milliseconds) is required. The decoder can be configured to perform such compression according to any suitable compression technique. The PMC can be configured to cause compression by some proportion of the buffer overflow size. For example, the PMC can be configured to generate CMP packets each requesting compression by 20%, 25%, 30% or 35% of the overflow size. The particular choice of the amount of compression to perform by each compression operation can depend on the characteristics of the particular compression mechanisms performed by the decoder.
Following the generation of each CMP packet, the CMP triggering timer (and potentially the overflow detection timer) can be reset to zero. This ensures that compression operations are only performed when both timers indicate that compression is required.
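The two-stage timer mechanism can be sketched as follows. The sketch makes several assumptions that are not taken from the description: the second stage starts once the detection timer exceeds a level called `start_level_ms`, the trigger threshold is fixed rather than adaptive, both timers are reset after a CMP request, and the class and attribute names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CompressionTimers:
    start_level_ms: float        # detection-timer level that starts stage two
    trigger_threshold_ms: float  # stage-two threshold (fixed in this sketch)
    fraction: float = 0.25       # share of the overflow to compress per CMP
    detect_ms: float = 0.0       # overflow detection timer
    trigger_ms: float = 0.0      # CMP triggering timer

    def on_play_request(self, request_ms: float, overflow_ms: float) -> float:
        """Return the compression to request in ms (0.0 if none)."""
        if overflow_ms <= 0:
            # Buffer is at or below its minimum size: restart detection.
            self.detect_ms = 0.0
            return 0.0
        self.detect_ms += request_ms
        if self.detect_ms > self.start_level_ms:
            self.trigger_ms += request_ms
        if self.trigger_ms >= self.trigger_threshold_ms:
            # Emit a CMP request and reset both timers.
            self.detect_ms = 0.0
            self.trigger_ms = 0.0
            return self.fraction * overflow_ms
        return 0.0
```

With `start_level_ms=20`, `trigger_threshold_ms=30` and a sustained 40 ms overflow, five consecutive 10 ms play requests are needed before a 10 ms compression is requested, so brief excursions in buffer size do not cause compression.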
It will be appreciated that other examples are possible which utilise only a single timer to trigger compression. For example, a single timer with an adaptive threshold could be used in the manner described above.
Performance
The performance of a data processing device configured in the manner described herein is illustrated in the plots shown in Figure 4. The plots relate to a data processing device implemented as a VoIP endpoint and supporting a VoIP communication link over a network simulated using NistNet and the Linux TC network simulator. Network parameters such as packet loss, network jitter and delay have been taken from widely accepted statistical models.
Figure 4 shows the jitter tracking behavior of the data processing device as the simulated network jitter is varied. It can be seen from the figure that the data processing device closely estimates the jitter applied and correspondingly changes the buffer size to store the out of order packets. This provides smooth voice quality with optimum end to end delay for the VoIP link.
The data processing device of Figure 1 and the frame processor of Figure 3 are shown as comprising a number of functional blocks. This is for illustrative purposes only and is not intended to define a strict division between different parts of hardware on a chip or between different programs, procedures or functions in software. The term logic as used herein can refer to any kind of software, hardware, or combination of hardware and software.
Data processing devices configured in accordance with the present invention could be embodied in hardware, software or any suitable combination of hardware and software.
A data processing device of the present invention could comprise, for example, software for execution at one or more processors (such as at a CPU and/or GPU), and/or one or more dedicated processors (such as ASICs), and/or one or more programmable processors (such as FPGAs) suitably programmed so as to provide functionalities of the data processing device, and/or heterogeneous processors comprising one or more dedicated, programmable and general purpose processing functionalities. In preferred embodiments of the present invention, the data processing device comprises one or more processors and one or more memories having program code stored thereon, the data processors and the memories being such as to, in combination, provide the claimed data processing device and/or perform the claimed methods.
The term software as used herein includes executable code for processors (e.g. CPUs and/or GPUs), firmware, bytecode, programming language code such as C or OpenCL, and modules for reconfigurable logic devices such as FPGAs. Machine-readable code includes software and code for defining hardware, such as register transfer level (RTL) code as might be generated in Verilog or VHDL.
Any one or more of the algorithms and methods described herein could be performed by one or more physical processing units executing program code that causes the unit(s) to perform the algorithms/methods. The or each physical processing unit could be any suitable processor, such as a CPU or GPU (or a core thereof), or fixed function or programmable hardware. The program code could be stored in non-transitory form at a machine readable medium such as an integrated circuit memory, or optical or magnetic storage. A machine readable medium might comprise several memories, such as on-chip memories, computer working memories, and non-volatile storage devices.
The applicant hereby discloses in isolation each individual feature described herein and any combination of two or more such features, to the extent that such features or combinations are capable of being carried out based on the present specification as a whole in the light of the common general knowledge of a person skilled in the art, irrespective of whether such features or combinations of features solve any problems disclosed herein, and without limitation to the scope of the claims. The applicant indicates that aspects of the present invention may consist of any such individual feature or combination of features. In view of the foregoing description it will be evident to a person skilled in the art that various modifications may be made within the scope of the invention.

Claims (42)

  1. A data processing device comprising: a jitter buffer for receiving data packets; a media decoder configured to decode the data packets so as to form a stream of media frames, each frame comprising a plurality of samples; a media consumer having an input buffer for receiving the stream of media frames and being configured to play media frames from the input buffer according to a first frame rate; a buffer interface configured to monitor the input buffer so as to detect when the number of samples at the input buffer of the media consumer falls below a predetermined level and, in response, generate a play-out request; and a media controller configured to, responsive to each of the generated play-out requests, play-out one or more data packets to the media decoder so as to cause media frames of the stream to be delivered into the input buffer at a rate commensurate with the first frame rate.
  2. A data processing device as claimed in claim 1, wherein the buffer interface is supported at the media consumer.
  3. A data processing device as claimed in claim 1, wherein the buffer interface is supported at the media controller.
  4. A data processing device as claimed in any preceding claim, wherein the predetermined level is at least the number of samples comprised in a media frame.
  5. A data processing device as claimed in any preceding claim, the buffer interface being configured to periodically check the number of samples at the input buffer at a rate commensurate with the first frame rate.
  6. A data processing device as claimed in any preceding claim, further comprising a receive queue for receiving data packets from the data packets from a network, the media controller being configured to periodically store in the jitter buffer all of the data packets available at the receive queue whose timestamps are greater than the timestamp of the last data packet played out by the media controller.
  7. A data processing device as claimed in any preceding claim, the media controller being configured to, on storing one or more data packets at the jitter buffer, increase the size of the jitter buffer by the size of those data packets.
  8. A data processing device as claimed in any preceding claim, the media controller being configured to maintain a histogram representing a distribution of time periods between the timestamps of successive packets stored at the jitter buffer, the histogram indicating for each of a predetermined range of time periods a measure of the number of successive data packets separated by that time period.
  9. A data processing device as claimed in claim 8, the media controller being arranged to update the histogram on storing each of the data packets.
  10. A data processing device as claimed in claim 8 or 9, the media controller being configured to estimate a minimum size for the jitter buffer by identifying the lowest time period between the timestamps of successive packets for which the measure of the number of successive data packets separated by that time period is zero.
  11. A data processing device as claimed in claim 10, the media controller being configured to cause the size of the jitter buffer to adapt so as to be at least the estimated minimum size.
  12. A data processing device as claimed in any preceding claim, the media controller being configured to, responsive to each of the play-out requests, estimate the timestamp of the next packet to be played out from the jitter buffer based on the timestamp of the preceding data packet played out from the jitter buffer and the size of that preceding data packet.
  13. A data processing device as claimed in claim 12, the media controller being further configured to estimate the timestamp of the next packet to be played out from the jitter buffer based on a measure of the number of media samples added or discarded in accordance with time scale modification algorithms operating at the data processing device.
  14. A data processing device as claimed in claim 12 or 13, the media controller being configured to search the jitter buffer for a best match data packet having a timestamp equal to the estimated timestamp or within the size of one media frame of the estimated timestamp according to the codec in use at the decoder, and if such a best match data packet is identified, play out the best match data packet.
  15. A data processing device as claimed in claim 14, the media controller being configured to decrease the size of the jitter buffer by the size of the best match data packet.
  16. A data processing device as claimed in any of claims 12 to 15, the media controller being further configured to search the jitter buffer for the data packet having the lowest timestamp and, if that lowest timestamp is not equal to the timestamp of the best match data packet, discarding the data packet having that lowest timestamp.
  17. A data processing device as claimed in any of claims 12 to 16, the media controller being configured to play out each best match data packet only if the last data packet played out by the jitter buffer was a SPEECH, DTX, or SID data packet.
  18. A data processing device as claimed in any of claims 12 to 17, the media controller being configured to, if the size of the jitter buffer was zero on the preceding play-out request being received, play out a synthetic data packet selected in accordance with a time scale modification algorithm and irrespective of the presence or otherwise of a best match data packet.
  19. A data processing device as claimed in any of claims 12 to 18, the media controller being configured to, if a best match data packet is not identified, play out: if the lowest timestamp is lower than the timestamp of the latest data packet played-out by the media controller, the data packet having the lowest timestamp provided that the latest data packet played out by the media controller was a DTX, LOST, EXP, or DTMF data packet, and otherwise discard the data packet having the lowest timestamp and play-out an EXP data packet; if the lowest timestamp is greater than the timestamp of the latest data packet played-out by the media controller then play-out a synthetic data packet selected in accordance with a time scale modification algorithm.
  20. A data processing device as claimed in any of claims 12 to 19, the media controller being configured to, on each data packet being played out, iteratively search for each next best match data packet until an amount of data has been played-out to the decoder to satisfy a number of samples indicated in or represented by the play-out request.
  21. A data processing device as claimed in any preceding claim, further comprising a frame processor between the decoder and the input buffer, the frame processor configured to perform one or more of noise cancellation, automatic gain control, delay adjustment, sample rate conversion, and multiplexing of media streams.
  22. A data processing device as claimed in any preceding claim, further comprising packet concealment logic at the decoder or at a packet concealment module between the media controller and decoder, the packet concealment logic being configured to generate media samples in accordance with synthetic packets received from the jitter buffer.
  23. A data processing device as claimed in any preceding claim, the media controller being configured to, on storing a data packet whose timestamp precedes the timestamp of the latest played-out data packet by less than the size of the data packet, store only that part of the data packet representing media samples subsequent to the timestamp of the latest played-out data packet, and discarding that part of the data packet representing media samples preceding the timestamp of the latest played out data packet.
  24. A data processing device as claimed in any of claims 6 to 20 and 23, wherein each of the said timestamps is a send timestamp indicative of the time at which each respective data packet was sent over the network.
  25. A data processing device as claimed in any preceding claim, further comprising a first timer and the media controller being configured to, on receiving the play-out request, calculate an overflow size of the jitter buffer and to: if the overflow size of the jitter buffer exceeds a first threshold, increment the first timer by a measure of the number of samples requested in the play-out request; and otherwise, reset the first timer to zero.
  26. A data processing device as claimed in claim 25, the media controller being configured to, when the first timer exceeds a second threshold, generate one or more data packets so as to cause the decoder to perform compression by an amount selected in dependence on the overflow size.
  27. A data processing device as claimed in claim 25, further comprising a second timer and the media controller being configured to, when the first timer exceeds a third threshold: if the overflow size of the jitter buffer exceeds the first threshold, increment the second timer by a measure of the number of samples requested in the play-out request; and otherwise, not increment the second timer.
  28. A data processing device as claimed in claim 27, the media controller being configured to, when the second timer exceeds a fourth threshold, generate one or more data packets so as to cause the decoder to perform compression by an amount selected in dependence on the overflow size.
  29. A data processing device as claimed in claim 28, wherein the fourth threshold is an adaptive threshold selected in dependence on the overflow size.
  30. A data processing device as claimed in claim 28 or 29, the media controller being configured to reset the second timer to zero on generating the one or more data packets so as to cause the decoder to perform compression.
  31. A data processing device as claimed in any of claims 25 to 30 as dependent on claim 10, wherein the overflow size is the difference between a measure of the size of the jitter buffer on receiving the play-out request and the estimated minimum size of the jitter buffer.
  32. A data processing device as claimed in claim 31, wherein the measure of the size of the jitter buffer is an average size of the jitter buffer calculated in dependence on the size of the jitter buffer at one or more preceding play-out requests.
  33. A data processing device as claimed in claim 26 or 28, wherein the media controller is configured to select the amount of compression to be around 25% of the overflow size.
  34. A method for controlling a stream of data packets received over a network for a media consumer, the media consumer having an input buffer for receiving media frames decoded from the stream of data packets and being configured to play the media frames according to a first frame rate, the method comprising: receiving data packets into a jitter buffer; generating a play-out request when the number of samples comprised in media frames at the input buffer of the media consumer falls below a predetermined level; receiving the play-out request at the media controller; and responsive to that request, the media controller playing-out one or more data packets to a media decoder so as to cause media frames decoded from the stream of data packets to be delivered into the input buffer at a rate commensurate with the first frame rate.
  35. A method as claimed in claim 34, the receiving data packets into the jitter buffer comprising periodically storing in the jitter buffer all of the data packets available at a network receive queue whose timestamps are greater than the timestamp of the last data packet played out by the media controller.
  36. A method as claimed in claim 34 or 35, further comprising: estimating the timestamp of the next packet to be played out from the jitter buffer based on the timestamp of the preceding data packet played out from the jitter buffer and the size of that preceding data packet; searching the jitter buffer for a best match data packet having a timestamp equal to the estimated timestamp or within the size of one media frame of the estimated timestamp according to the codec in use at the decoder; and if such a best match data packet is identified, the media controller playing-out the best match data packet.
  37. A method as claimed in claim 36, further comprising iteratively searching for each next best match data packet and playing-out each such best match data packet until an amount of data has been played-out to the decoder to satisfy a number of samples indicated in or represented by the play-out request.
  38. Machine readable code for generating a data processing device according to any of claims 1 to 33.
  39. A machine readable storage medium having encoded thereon non-transitory machine readable code for generating a data processing device according to any of claims 1 to 33.
  40. Machine readable code for implementing a method as claimed in any of claims 34 to 37.
  41. A machine readable storage medium having encoded thereon non-transitory machine-readable code for implementing a method as claimed in any of claims 34 to 37.
  42. A data processing device substantially as described herein with reference to any of Figures 1 to 4.

Amendments to the claims have been filed as follows

CLAIMS

  1. A data processing device comprising: a jitter buffer for receiving data packets; a media decoder configured to decode the data packets so as to form a stream of media frames, each frame comprising a plurality of samples; a media consumer having an input buffer for receiving the stream of media frames and being configured to play media frames from the input buffer according to a first frame rate; a buffer interface configured to monitor the input buffer so as to detect when the number of samples at the input buffer of the media consumer falls below a predetermined level and, in response, generate a play-out request; and a media controller configured to, responsive to each of the generated play-out requests: estimate the timestamp of the next packet to be played out from the jitter buffer based on the preceding data packet played out from the jitter buffer; search the jitter buffer for a best match data packet having a timestamp in accordance with the estimated timestamp; and if such a best match data packet is identified, play out the best match data packet to the media decoder so as to cause media frames of the stream to be delivered into the input buffer at a rate commensurate with the first frame rate.
  2. A data processing device as claimed in claim 1, wherein the buffer interface is part of the media consumer.
  3. A data processing device as claimed in claim 1, wherein the buffer interface is part of the media controller.
  4. A data processing device as claimed in any preceding claim, wherein the predetermined level is at least the number of samples comprised in a media frame.
  5. A data processing device as claimed in any preceding claim, the buffer interface being configured to periodically check the number of samples at the input buffer at a rate commensurate with the first frame rate.
  6. A data processing device as claimed in any preceding claim, further comprising a receive queue for receiving data packets from the data packets from a network, the media controller being configured to periodically store in the jitter buffer all of the data packets available at the receive queue whose timestamps are greater than the timestamp of the last data packet played out by the media controller.
  7. A data processing device as claimed in any preceding claim, the media controller being configured to, on storing one or more data packets at the jitter buffer, increase the size of the jitter buffer by the size of those data packets.
  8. A data processing device as claimed in any preceding claim, the media controller being configured to maintain a histogram representing a distribution of time periods between the timestamps of successive packets stored at the jitter buffer, the histogram indicating for each of a predetermined range of time periods a measure of the number of successive data packets separated by that time period.
  9. A data processing device as claimed in claim 8, the media controller being arranged to update the histogram on storing each of the data packets.
  10. A data processing device as claimed in claim 8 or 9, the media controller being configured to estimate a minimum size for the jitter buffer by identifying the lowest time period between the timestamps of successive packets for which the measure of the number of successive data packets separated by that time period is zero.
  11. A data processing device as claimed in claim 10, the media controller being configured to cause the size of the jitter buffer to adapt so as to be at least the estimated minimum size.
  12. A data processing device as claimed in any preceding claim, the media controller being configured to, responsive to each of the play-out requests, estimate the timestamp of the next packet to be played out from the jitter buffer based on the timestamp of the preceding data packet played out from the jitter buffer and the size of that preceding data packet.
  13. A data processing device as claimed in claim 12, the media controller being further configured to estimate the timestamp of the next packet to be played out from the jitter buffer based on a measure of the number of media samples added or discarded in accordance with time scale modification algorithms operating at the data processing device.
  14. A data processing device as claimed in any preceding claim, the media controller being configured to identify as the best match data packet a data packet having a timestamp equal to the estimated timestamp or within the size of one media frame of the estimated timestamp according to the codec in use at the decoder.
  15. A data processing device as claimed in claim 14, the media controller being configured to decrease the size of the jitter buffer by the size of the best match data packet.
  16. A data processing device as claimed in any of claims 12 to 15, the media controller being further configured to search the jitter buffer for the data packet having the lowest timestamp and, if that lowest timestamp is not equal to the timestamp of the best match data packet, discarding the data packet having that lowest timestamp.
  17. A data processing device as claimed in any of claims 12 to 16, the media controller being configured to play out each best match data packet only if the last data packet played out by the jitter buffer was a SPEECH, DTX, or SID data packet.
  18. A data processing device as claimed in any of claims 12 to 17, the media controller being configured to, if the size of the jitter buffer was zero on the preceding play-out request being received, play out a synthetic data packet selected in accordance with a time scale modification algorithm and irrespective of the presence or otherwise of a best match data packet.
  19. A data processing device as claimed in any of claims 12 to 18, the media controller being configured to, if a best match data packet is not identified, play out: if the lowest timestamp is lower than the timestamp of the latest data packet played-out by the media controller, the data packet having the lowest timestamp provided that the latest data packet played out by the media controller was a DTX, LOST, EXP, or DTMF data packet, and otherwise discard the data packet having the lowest timestamp and play-out an EXP data packet; if the lowest timestamp is greater than the timestamp of the latest data packet played-out by the media controller then play-out a synthetic data packet selected in accordance with a time scale modification algorithm.
  20. A data processing device as claimed in any of claims 12 to 19, the media controller being configured to, on each data packet being played out, iteratively search for each next best match data packet until an amount of data has been played-out to the decoder to satisfy a number of samples indicated in or represented by the play-out request.
  21. A data processing device as claimed in any preceding claim, further comprising a frame processor between the decoder and the input buffer, the frame processor configured to perform one or more of noise cancellation, automatic gain control, delay adjustment, sample rate conversion, and multiplexing of media streams.
  22. A data processing device as claimed in any preceding claim, further comprising packet concealment logic at the decoder or at a packet concealment module between the media controller and decoder, the packet concealment logic being configured to generate media samples in accordance with synthetic packets received from the jitter buffer.
  23. A data processing device as claimed in any preceding claim, the media controller being configured to, on storing a data packet whose timestamp precedes the timestamp of the latest played-out data packet by less than the size of the data packet, store only that part of the data packet representing media samples subsequent to the timestamp of the latest played-out data packet, and discarding that part of the data packet representing media samples preceding the timestamp of the latest played out data packet.
  24. A data processing device as claimed in any of claims 6 to 20 and 23, wherein each of the said timestamps is a send timestamp indicative of the time at which each respective data packet was sent over the network.
  25. A data processing device as claimed in any preceding claim, further comprising a first timer and the media controller being configured to, on receiving the play-out request, calculate an overflow size of the jitter buffer and to: if the overflow size of the jitter buffer exceeds a first threshold, increment the first timer by a measure of the number of samples requested in the play-out request; and otherwise, reset the first timer to zero.
  26. A data processing device as claimed in claim 25, the media controller being configured to, when the first timer exceeds a second threshold, generate one or more data packets so as to cause the decoder to perform compression by an amount selected in dependence on the overflow size.
  27. A data processing device as claimed in claim 25, further comprising a second timer and the media controller being configured to, when the first timer exceeds a third threshold: if the overflow size of the jitter buffer exceeds the first threshold, increment the second timer by a measure of the number of samples requested in the play-out request; and otherwise, not increment the second timer.
  28. A data processing device as claimed in claim 27, the media controller being configured to, when the second timer exceeds a fourth threshold, generate one or more data packets so as to cause the decoder to perform compression by an amount selected in dependence on the overflow size.
  29. A data processing device as claimed in claim 28, wherein the fourth threshold is an adaptive threshold selected in dependence on the overflow size.
  30. A data processing device as claimed in claim 28 or 29, the media controller being configured to reset the second timer to zero on generating the one or more data packets so as to cause the decoder to perform compression.
  31. A data processing device as claimed in any of claims 25 to 30 as dependent on claim 10, wherein the overflow size is the difference between a measure of the size of the jitter buffer on receiving the play-out request and the estimated minimum size of the jitter buffer.
  32. A data processing device as claimed in claim 31, wherein the measure of the size of the jitter buffer is an average size of the jitter buffer calculated in dependence on the size of the jitter buffer at one or more preceding play-out requests.
  33. A data processing device as claimed in claim 26 or 28, wherein the media controller is configured to select the amount of compression to be around 25% of the overflow size.
  34. 
A method for controlling a stream of data packets received over a network for a media consumer, the media consumer having an input buffer for receiving media frames decoded from the stream of data packets and being configured to play the media frames according to a first frame rate, the method comprising: receiving data packets into a jitter buffer; generating a play-out request when the number of samples comprised in media frames at the input buffer of the media consumer falls below a predetermined level; receiving the play-out request at a media controller; and responsive to that request, the media controller: estimating the timestamp of the next packet to be played out from the jitter buffer based on the preceding data packet played out from the jitter buffer; searching the jitter buffer for a best match data packet having a timestamp in accordance with the estimated timestamp; and to if such a best match data packet is identified, playing out the best match data packet to the media decoder so as to cause media cv) frames decoded from the stream of data packets to be delivered into 0 the input buffer at a rate commensurate with the first frame rate.O 35. A method as claimed in claim 34, the receiving data packets into the jitter buffer comprising periodically storing in the jitter buffer all of the data packets available at a network receive queue whose timestamps are greater than the timestamp of the last data packet played out by the media controller.36. 
A method as claimed in claim 34 or 35, wherein the estimating the timestamp of the next packet to be played out from the jitter buffer is performed based on the timestamp of the preceding data packet played out from the jitter buffer and the size of that preceding data packet and the searching the jitter buffer for a best match data packet comprises identifying a data packet having a timestamp equal to the estimated timestamp or within the size of one media frame of the estimated timestamp according to the codec in use at the decoder.37. A method as claimed in claim 36, further comprising iteratively searching for each next best match data packet and playing-out each such best match data packet until an amount of data has been played-out to the decoder to satisfy a number of samples indicated in or represented by the play-out request.38. Machine readable code for implementing a method as claimed in any of claims 34 to 37.39. A data processing device substantially as described herein with reference to any of Figures ito 4. IC)COCO
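As a rough illustration of the play-out procedure recited in claims 12, 14, 15 and 20 (and in method claims 36 and 37), the Python sketch below estimates the next timestamp from the preceding packet's timestamp and size, searches the jitter buffer for a best match within one media frame of that estimate, and repeats until the requested number of samples has been satisfied. It is not taken from the patent. The names `Packet`, `estimate_next_timestamp`, `find_best_match` and `play_out` are hypothetical, timestamps and sizes are assumed to be counted in media samples, "within the size of one media frame" is read here as a strict inequality, and preferring the closest candidate is a further assumption. The fallback behaviour of claims 17 to 19 (synthetic, EXP and similar packets) is not modelled.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Packet:
    timestamp: int  # send timestamp, assumed to be in media samples
    size: int       # number of media samples the packet represents


def estimate_next_timestamp(last_played: Packet) -> int:
    """Claim 12: preceding packet's timestamp plus that packet's size."""
    return last_played.timestamp + last_played.size


def find_best_match(buffer: List[Packet], estimate: int,
                    frame_size: int) -> Optional[Packet]:
    """Claim 14: timestamp equal to the estimate or within one media frame.

    Assumption: 'within' is strict, and the closest candidate wins.
    """
    candidates = [p for p in buffer if abs(p.timestamp - estimate) < frame_size]
    if not candidates:
        return None
    return min(candidates, key=lambda p: abs(p.timestamp - estimate))


def play_out(buffer: List[Packet], last_played: Packet,
             samples_requested: int,
             frame_size: int) -> Tuple[List[Packet], Packet]:
    """Claim 20: keep searching until the request is satisfied.

    Stops early when no best match exists; the fallbacks of claims 18
    and 19 are deliberately left out of this sketch.
    """
    played: List[Packet] = []
    delivered = 0
    while delivered < samples_requested:
        estimate = estimate_next_timestamp(last_played)
        match = find_best_match(buffer, estimate, frame_size)
        if match is None:
            break
        buffer.remove(match)  # claim 15: buffer shrinks by the packet size
        played.append(match)
        delivered += match.size
        last_played = match
    return played, last_played
```

With 160-sample frames and a buffer holding packets stamped 320, 160 and 800, a request for 480 samples after a packet stamped 0 plays out the packets stamped 160 and 320 and then stops, because nothing lies within one frame of the estimate 480.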
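The overflow handling of claims 25, 26, 31 and 33 can be sketched in the same hedged way. The class below keeps a "first timer" counted in samples, increments it on each play-out request while the overflow size exceeds a first threshold, resets it otherwise, and reports a compression amount of about 25% of the overflow once the timer exceeds a second threshold. The class name `OverflowMonitor`, its field names and the choice of samples as the unit are assumptions made for illustration. The claims do not say what happens to the first timer after compression is triggered, so this sketch leaves it running. The second timer of claims 27 to 30 is omitted.

```python
from dataclasses import dataclass


@dataclass
class OverflowMonitor:
    first_threshold: int   # overflow size (samples) treated as excess
    second_threshold: int  # timer value (samples) that triggers compression
    timer: int = 0         # the "first timer" of claim 25, in samples

    def on_play_out_request(self, buffer_size: int, min_buffer_size: int,
                            samples_requested: int) -> int:
        """Return the number of samples to compress, or 0 for none."""
        # Claim 31: overflow is buffer size minus estimated minimum size.
        overflow = buffer_size - min_buffer_size
        if overflow > self.first_threshold:
            # Claim 25: advance the timer by the requested sample count.
            self.timer += samples_requested
        else:
            # Claim 25: otherwise reset the timer to zero.
            self.timer = 0
            return 0
        if self.timer > self.second_threshold:
            # Claims 26 and 33: compress by around 25% of the overflow.
            return overflow // 4
        return 0
```

For example, with a first threshold of 320 and a second threshold of 480, four consecutive 160-sample requests at an overflow of 800 samples return 0, 0, 0 and then 200, and a later request at an overflow of 100 resets the timer.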
GB1407782.0A 2014-05-02 2014-05-02 Media controller Active GB2521883B (en)

Priority Applications (9)

Application Number Priority Date Filing Date Title
GB1407782.0A GB2521883B (en) 2014-05-02 2014-05-02 Media controller
GB1417242.3A GB2524349B (en) 2014-05-02 2014-09-30 Adaptive span control
GB1512312.8A GB2524430B (en) 2014-05-02 2014-09-30 Adaptive span control
US14/703,558 US10778257B2 (en) 2014-05-02 2015-05-04 Of invention. fees transmitted by check or draft are
US14/703,479 US9985660B2 (en) 2014-05-02 2015-05-04 Media controller
US15/975,883 US10680657B2 (en) 2014-05-02 2018-05-10 Media controller with jitter buffer
US16/866,829 US20200266839A1 (en) 2014-05-02 2020-05-05 Media Controller with Buffer Interface
US16/999,540 US11323136B2 (en) 2014-05-02 2020-08-21 Method and apparatus for processing a received sequence of data packets by removing unsuitable error correction packets from the sequence
US17/709,206 US11750227B2 (en) 2014-05-02 2022-03-30 Method and device for transmitting a data stream with selectable ratio of error correction packets to data packets

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB1407782.0A GB2521883B (en) 2014-05-02 2014-05-02 Media controller

Publications (3)

Publication Number Publication Date
GB201407782D0 GB201407782D0 (en) 2014-06-18
GB2521883A GB2521883A (en) 2015-07-08
GB2521883B GB2521883B (en) 2016-03-30

Family

ID=50980504

Family Applications (3)

Application Number Status Publication Number Priority Date Filing Date Title
GB1407782.0A Active GB2521883B (en) 2014-05-02 2014-05-02 Media controller
GB1512312.8A Active GB2524430B (en) 2014-05-02 2014-09-30 Adaptive span control
GB1417242.3A Active GB2524349B (en) 2014-05-02 2014-09-30 Adaptive span control

Family Applications After (2)

Application Number Status Publication Number Priority Date Filing Date Title
GB1512312.8A Active GB2524430B (en) 2014-05-02 2014-09-30 Adaptive span control
GB1417242.3A Active GB2524349B (en) 2014-05-02 2014-09-30 Adaptive span control

Country Status (2)

Country Link
US (6) US10778257B2 (en)
GB (3) GB2521883B (en)

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103594103B (en) * 2013-11-15 2017-04-05 腾讯科技(成都)有限公司 Audio-frequency processing method and relevant apparatus
GB2521883B (en) * 2014-05-02 2016-03-30 Imagination Tech Ltd Media controller
US9985887B2 (en) * 2015-08-27 2018-05-29 Cavium Inc. Method and apparatus for providing a low latency transmission system using adaptive buffering estimation
KR102422794B1 (en) * 2015-09-04 2022-07-20 삼성전자주식회사 Playout delay adjustment method and apparatus and time scale modification method and apparatus
EP4033770A1 (en) 2015-11-17 2022-07-27 Livestreaming Sweden AB Video distribution synchronization
US10148582B2 (en) * 2016-05-24 2018-12-04 Samsung Electronics Co., Ltd. Managing buffers for rate pacing
CN107888540B (en) * 2016-09-29 2020-12-25 华为技术有限公司 Network anti-attack method and network equipment
US10360598B2 (en) 2017-04-12 2019-07-23 Engine Media, Llc Efficient translation and load balancing of openrtb and header bidding requests
JP6866870B2 (en) * 2018-03-30 2021-04-28 横河電機株式会社 Data acquisition system, data acquisition device, and data synthesizer
EP3553777B1 (en) * 2018-04-09 2022-07-20 Dolby Laboratories Licensing Corporation Low-complexity packet loss concealment for transcoded audio signals
CN108495142B (en) * 2018-04-11 2021-05-25 腾讯科技(深圳)有限公司 Video coding method and device
US11957975B2 (en) * 2018-05-24 2024-04-16 Microsoft Technology Licensing, Llc Dead reckoning and latency improvement in 3D game streaming scenario
US11563644B2 (en) 2019-01-04 2023-01-24 GoTenna, Inc. Method and apparatus for modeling mobility and dynamic connectivity on a stationary wireless testbed
US11924506B2 (en) * 2019-02-18 2024-03-05 Creative Technology Ltd. System and method for data management in a media device
CN113099272A (en) * 2021-04-12 2021-07-09 上海商汤智能科技有限公司 Video processing method and device, electronic equipment and storage medium
EP4106275A1 (en) * 2021-06-18 2022-12-21 Rohde & Schwarz GmbH & Co. KG Jitter determination method, jitter determination module, and packet-based data stream receiver
US11943125B2 (en) * 2022-01-26 2024-03-26 Dish Network Technologies India Private Limited Discontinuity detection in transport streams

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1995022233A1 (en) * 1994-02-11 1995-08-17 Newbridge Networks Corporation Method of dynamically compensating for variable transmission delays in packet networks
WO2003029990A1 (en) * 2001-10-03 2003-04-10 Global Ip Sound Ab Network media playout
WO2011163042A2 (en) * 2010-06-21 2011-12-29 Motorola Solutions, Inc. Jitter buffer management for power savings in a wireless communication device

Family Cites Families (39)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6765904B1 (en) * 1999-08-10 2004-07-20 Texas Instruments Incorporated Packet networks
US8032808B2 (en) * 1997-08-08 2011-10-04 Mike Vargo System architecture for internet telephone
US6434606B1 (en) * 1997-10-01 2002-08-13 3Com Corporation System for real time communication buffer management
US20020101885A1 (en) * 1999-03-15 2002-08-01 Vladimir Pogrebinsky Jitter buffer and methods for control of same
US7606164B2 (en) * 1999-12-14 2009-10-20 Texas Instruments Incorporated Process of increasing source rate on acceptable side of threshold
US6580694B1 (en) * 1999-08-16 2003-06-17 Intel Corporation Establishing optimal audio latency in streaming applications over a packet-based network
DE60018927T2 (en) * 2000-09-07 2005-07-28 Matsushita Electric Industrial Co. Ltd., Kadoma Method and apparatus for data packet transmission
US20030112758A1 (en) * 2001-12-03 2003-06-19 Pang Jon Laurent Methods and systems for managing variable delays in packet transmission
US7017102B1 (en) 2001-12-27 2006-03-21 Network Equipment Technologies, Inc. Forward Error Correction (FEC) for packetized data networks
US8520519B2 (en) * 2002-09-20 2013-08-27 Broadcom Corporation External jitter buffer in a packet voice system
JP2004187099A (en) * 2002-12-04 2004-07-02 Shinko Electric Ind Co Ltd Communication control method, communication system and communication equipment
US7426221B1 (en) * 2003-02-04 2008-09-16 Cisco Technology, Inc. Pitch invariant synchronization of audio playout rates
JP2006518948A (en) * 2003-02-13 2006-08-17 ノキア コーポレイション Streaming quality adaptation and control mechanism signaling method in multimedia streaming
EP1754327A2 (en) * 2004-03-16 2007-02-21 Snowshore Networks, Inc. Jitter buffer management
US20060007943A1 (en) * 2004-07-07 2006-01-12 Fellman Ronald D Method and system for providing site independent real-time multimedia transport over packet-switched networks
US7970020B2 (en) * 2004-10-27 2011-06-28 Telefonaktiebolaget Lm Ericsson (Publ) Terminal having plural playback pointers for jitter buffer
US8218439B2 (en) * 2004-11-24 2012-07-10 Sharp Laboratories Of America, Inc. Method and apparatus for adaptive buffering
US7525903B2 (en) * 2005-05-19 2009-04-28 Alcatel-Lucent Usa Inc. Method for improved packet 1+1 protection
US20070081471A1 (en) * 2005-10-06 2007-04-12 Alcatel Usa Sourcing, L.P. Apparatus and method for analyzing packet data streams
US20080040759A1 (en) * 2006-03-06 2008-02-14 George Geeyaw She System And Method For Establishing And Maintaining Synchronization Of Isochronous Audio And Video Information Streams in Wireless Multimedia Applications
US20090016333A1 (en) * 2006-06-14 2009-01-15 Derek Wang Content-based adaptive jitter handling
US8279884B1 (en) * 2006-11-21 2012-10-02 Pico Mobile Networks, Inc. Integrated adaptive jitter buffer
CN101548500A (en) * 2006-12-06 2009-09-30 艾利森电话股份有限公司 Jitter buffer control
KR100787314B1 (en) * 2007-02-22 2007-12-21 광주과학기술원 Method and apparatus for adaptive media playout for intra-media synchronization
JP5141197B2 (en) * 2007-11-13 2013-02-13 富士通株式会社 Encoder
US8787153B2 (en) * 2008-02-10 2014-07-22 Cisco Technology, Inc. Forward error correction based data recovery with path diversity
JP5109787B2 (en) * 2008-05-02 2012-12-26 富士通株式会社 Data transmission system, program and method
US8325608B2 (en) * 2008-08-07 2012-12-04 Qualcomm Incorporated Efficient packet handling for timer-based discard in a wireless communication system
EP2256991A1 (en) * 2009-05-25 2010-12-01 Canon Kabushiki Kaisha Method and device for determining types of packet loss in a communication network
US8326980B2 (en) * 2010-04-28 2012-12-04 Microsoft Corporation Using DNS reflection to measure network performance
WO2011153194A1 (en) * 2010-06-02 2011-12-08 Onmobile Global Limited Method and apparatus for adapting media
US8885729B2 (en) * 2010-12-13 2014-11-11 Microsoft Corporation Low-latency video decoding
GB201112177D0 (en) * 2011-07-15 2011-08-31 Tracker Network Uk Ltd Data receiver
WO2013083840A1 (en) * 2011-12-09 2013-06-13 Cinemo Gmbh Media playback component comprising playback queue and queue bypass
DK2823616T3 (en) * 2012-03-06 2021-01-11 Appear Tv As METHOD, DEVICE AND SYSTEM FOR PACKET TRANSMISSION OVER IP NETWORKS
US8861932B2 (en) * 2012-05-18 2014-10-14 At&T Mobility Ii Llc Video service buffer management
CN105324813A (en) * 2013-04-25 2016-02-10 诺基亚通信公司 Speech transcoding in packet networks
CN104282309A (en) * 2013-07-05 2015-01-14 杜比实验室特许公司 Packet loss shielding device and method and audio processing system
GB2521883B (en) * 2014-05-02 2016-03-30 Imagination Tech Ltd Media controller

Also Published As

Publication number Publication date
GB201512312D0 (en) 2015-08-19
GB2521883B (en) 2016-03-30
GB2524430A (en) 2015-09-23
US20150318872A1 (en) 2015-11-05
US20220224360A1 (en) 2022-07-14
US20210006270A1 (en) 2021-01-07
GB2524349A (en) 2015-09-23
GB201407782D0 (en) 2014-06-18
US11750227B2 (en) 2023-09-05
US10778257B2 (en) 2020-09-15
US20200266839A1 (en) 2020-08-20
US11323136B2 (en) 2022-05-03
US20150319212A1 (en) 2015-11-05
GB201417242D0 (en) 2014-11-12
US10680657B2 (en) 2020-06-09
US9985660B2 (en) 2018-05-29
GB2524430B (en) 2016-02-17
US20180262213A1 (en) 2018-09-13
GB2524349B (en) 2016-02-17

Similar Documents

Publication Publication Date Title
US20200266839A1 (en) Media Controller with Buffer Interface
KR101590972B1 (en) A method of scheduling transmission in a communication network, corresponding communication node and computer program product
US9380100B2 (en) Real-time VoIP transmission quality predictor and quality-driven de-jitter buffer
US7881284B2 (en) Method and apparatus for dynamically adjusting the playout delay of audio signals
US10805196B2 (en) Packet loss and bandwidth coordination
US7450601B2 (en) Method and communication apparatus for controlling a jitter buffer
AU2008330261B2 (en) Play-out delay estimation
CN107534589B (en) De-jitter buffer update
EP3466001B1 (en) Media buffering
CN107852348B (en) Method for identifying network state, data processing device and machine readable storage medium
CN100525281C (en) Method of realizing dynamic adjusting dithered buffer in procedure of voice transmission
EP3125498B1 (en) Estimating processor load
CN103888381A (en) Device and method used for controlling jitter buffer
WO2003079620A1 (en) Clock skew compensation for a jitter buffer
TW201644239A (en) Methods and devices for controlling speech quality
US7283548B2 (en) Dynamic latency management for IP telephony
US20040190494A1 (en) Systems and methods for voice quality testing in a non-real-time operating system environment
AU2002310383A1 (en) Dynamic latency management for IP telephony
CN108933768B (en) Method and device for acquiring sending frame rate of video frame
JP5806719B2 (en) Voice packet reproducing apparatus, method and program thereof
JP2007241030A (en) Server device and buffer control method of same device