US20070206645A1 - Method of dynamically adapting the size of a jitter buffer - Google Patents

Info

Publication number
US20070206645A1
Authority
US
Grant status
Application
Patent type
Prior art keywords
buffer
jitter
speech
size
decoder
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US11745210
Inventor
Jim Sundqvist
Hakan Lennestal
Anders Nohlgren
Morgan Lindqvist
Wei Wang
Original Assignee
Jim Sundqvist
Hakan Lennestal
Anders Nohlgren
Morgan Lindqvist
Wei Wang
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/64 Hybrid switching systems
    • H04L12/6418 Hybrid transport
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/64 Hybrid switching systems
    • H04L12/6418 Hybrid transport
    • H04L2012/6472 Internet
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/64 Hybrid switching systems
    • H04L12/6418 Hybrid transport
    • H04L2012/6481 Speech, voice
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/64 Hybrid switching systems
    • H04L12/6418 Hybrid transport
    • H04L2012/6489 Buffer Management, Threshold setting, Scheduling, Shaping

Abstract

The present invention relates to a receiver system in a communication system supporting packet-based communication (e.g., an IP-network), including a receiver, a speech decoder (40) and a jitter buffer (20) for handling delay variations in the reception of a speech signal consisting of packets containing frames with encoded speech. A jitter buffer controller (50) is provided for keeping information about the functional size of the jitter buffer (20) and for providing the speech decoder (40) with control information, such that the speech decoder (40), based on that information, provides a dynamic adaptation of the size of the jitter buffer (20) using the received encoded, packetized speech signal. The invention also relates to a method of adapting the functional size of the jitter buffer of a receiver system.

Description

    PRIORITY CLAIM
  • [0001]
    This patent application claims priority to PCT patent application number PCT/SE01/01140 filed May 31, 2000.
  • FIELD OF THE INVENTION
  • [0002]
    The present invention relates, generally, to a receiver system in a communication system supporting packet-based communication of speech and data (e.g., an IP-network). More specifically, the present invention relates to a system comprising a receiver, a speech decoder and a jitter buffer for handling variations in the reception of a speech signal consisting of packets containing frames with encoded speech.
  • BACKGROUND OF THE INVENTION
  • [0003]
    One area undergoing fast development within telecommunications is voice over IP (VOIP) (e.g., Transmission Control Protocol/Internet Protocol-TCP/IP). Developments in this area initially focused on making phone calls at very low costs. Now, developments within this telecommunications area seem promising for new and different business applications. Because both speech and data use the same network and the same transmission protocol, it should be much easier to implement different information applications (e.g., call center, call screen, unified messages, etc.) with VOIP than with traditional telecommunication technologies. However, VOIP applications typically group voice or speech data to form packets, which are sent over shared common networks. Due to the nature of such networks, specific technical problems like the loss of packets, delay of packets, and jitter often occur.
  • [0004]
    Jitter can be described as the variation in arrival of consecutive packets. Typically, in real-time services such as in voice transmission, a packet encoded with speech is sent every 20 ms, which corresponds to 160 samples when using a sampling frequency of 8 kHz. Since delays vary throughout a network, different packets are delayed differently. Moreover, clocks of transmitting and receiving terminal units are not synchronized to one another. In order to smooth out delay variations, a receiving system or the receiving module of a terminal (i.e., a terminal that generally functions both as a receiver and a transmitter) is usually provided with a jitter buffer.
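    The figures in this paragraph can be checked with a short sketch. It assumes the 8 kHz sampling frequency and 20 ms packet spacing stated above; the function names and the arrival times are hypothetical and serve only to show jitter as the deviation of inter-arrival gaps from the nominal spacing.

```python
SAMPLE_RATE_HZ = 8000   # sampling frequency from the text
FRAME_MS = 20           # one speech packet every 20 ms


def samples_per_frame(sample_rate_hz=SAMPLE_RATE_HZ, frame_ms=FRAME_MS):
    """Number of samples represented by one frame of the given duration."""
    return sample_rate_hz * frame_ms // 1000


def interarrival_deviation_ms(arrival_times_ms, frame_ms=FRAME_MS):
    """Deviation of each inter-arrival gap from the nominal packet spacing."""
    gaps = [b - a for a, b in zip(arrival_times_ms, arrival_times_ms[1:])]
    return [gap - frame_ms for gap in gaps]


# Hypothetical arrival times (ms) of five consecutive packets.
arrivals = [0, 20, 45, 60, 85]
deviations = interarrival_deviation_ms(arrivals)
print(samples_per_frame(), deviations)
```

    A deviation of zero means the packet arrived exactly one frame after its predecessor; positive and negative values are what the jitter buffer has to absorb.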
  • [0005]
    The relative size of the jitter buffer bears an important relation to resulting speech quality. If the size of the jitter buffer is too large, the one-way delay from mouth to ear will be too large, and the perceived quality will be degraded. For example, ITU-T Recommendations state that the one-way delay should be less than 150 ms for a regular telephony service.
  • [0006]
    If the jitter buffer is too small, however, packets delayed more than the size of the jitter buffer will arrive too late for any speech synthesis, and will be seen as lost. Therefore, an adaptive jitter buffer is needed to balance the size of the jitter buffer (i.e., delay at the receiving side) against packet loss.
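    The trade-off between buffer size and packet loss can be illustrated with a minimal sketch. It assumes a simplified model in which a packet counts as lost when its extra network delay exceeds the buffer depth; the delay values and the function name are hypothetical.

```python
def late_loss_ratio(extra_delays_ms, buffer_ms):
    """Fraction of packets whose extra delay exceeds the buffer depth.

    Simplified model: such packets arrive too late for speech synthesis
    and are treated as lost.
    """
    late = sum(1 for delay in extra_delays_ms if delay > buffer_ms)
    return late / len(extra_delays_ms)


# Hypothetical extra delays (ms) relative to the fastest packet.
extra_delays = [0, 5, 12, 30, 8, 45, 3, 18]

for buffer_ms in (10, 20, 40, 60):
    ratio = late_loss_ratio(extra_delays, buffer_ms)
    print(f"buffer {buffer_ms} ms -> added delay {buffer_ms} ms, loss {ratio:.3f}")
```

    A larger buffer lowers the loss ratio but adds the same amount to the one-way delay, which is the balance an adaptive jitter buffer tries to strike.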
  • [0007]
    Delay may also vary with time. In order to handle such variations, the size of the jitter buffer (i.e., the number of samples that the speech parameters within it would represent) needs to be adaptable. The jitter can be measured, and the jitter buffer adapted, in different ways. One conventional method measures jitter by checking the maximum variations in arrival times for the received packets. There are also various methods for performing the actual jitter buffer adaptation. For example, one conventional method performs a jitter buffer adaptation by using the beginning of a talk-spurt to reset the jitter buffer to a specified level. The distance, in number of samples, between two consecutive talk-spurts is increased at the receiving side (e.g., during silence) if the jitter buffer is too small. Likewise, the number of samples is decreased if the jitter buffer is too large. Through this action the size of the jitter buffer is adaptable. In IP telephony solutions using, for example, RTP (Real-time Transport Protocol), the marker flag in the RTP header identifies the beginning of a talk-spurt. Accordingly, the size of the jitter buffer can be changed when such a packet arrives at the receiving side.
  • [0008]
    However, the above-mentioned conventional solution statically resets the jitter buffer to a certain level at the beginning of each talk-spurt. It does not, for example, cover the case when network conditions change or if a wrong decision has been taken. Furthermore, if the jitter buffer size becomes too small, packets will be lost. Similarly, if the jitter buffer becomes too large, an unnecessary delay is introduced. In both cases, the perceived speech quality will be affected. This is undesirable. Moreover, because the jitter buffer is adapted only when there is a speech silence period, the problems will be even more severe during periods of long speech where no jitter buffer adaptations occur.
  • SUMMARY OF THE INVENTION
  • [0009]
    Therefore, a receiver, or a receiving module or function within a terminal unit, hereinafter commonly referred to as a receiver system, in a communication system is needed providing efficient handling of the variation in arrival time (i.e., packets that arrive irregularly) between consecutive packets. A receiver system is also needed through which the perceived speech quality will be good—particularly improved over conventional receiving systems. A receiver system is also needed to smooth out delay variations—especially by handling jitter in such a way that the size of the jitter buffer can be adapted in an efficient manner. A receiver system is also needed which contains a jitter buffer adapted to balance the delay at the receiving side while still optimizing packet loss. A receiver system including a jitter buffer is also needed to fulfill any requirements and recommendations relating to regular telephony services, and for which the jitter buffer can adapt in an easy manner.
  • [0010]
    A jitter buffer, according to the present invention, is given a broad interpretation. It refers to a jitter buffer where packets are stored before fetching by a decoder. It may, however, also relate to storing or buffering channels, wherein speech is stored in its decoded form (i.e., after decoding in the decoder). This is applicable when decoding is substantially performed as soon as packets are received in the receiver. This is how the functional size of the jitter buffer is determined.
  • [0011]
    Moreover, a system is needed for adapting the functional size of a jitter buffer in a receiver system, within a communication system supporting packet-based communication of speech and data, through which jitter buffer adaptations can be performed in an efficient and easy manner. A system is also needed to smooth out delay variations in the reception of consecutive packets containing speech. Moreover, a system is needed to provide good speech quality while reducing the risk of lost packets. Still further, a system is needed for improved perceived speech quality at the receiver. A system is also needed through which the jitter buffer handling capability within a receiver system can be improved, facilitated and, particularly, optimized.
  • [0012]
    Therefore, a receiver system is provided which includes a receiver, speech decoder, and a jitter buffer, in which packets containing frames with encoded speech are received (i.e., the buffer receives packets from the network comprising speech parameters representing one or more speech frames comprising a number of samples). The present invention further provides a jitter buffer controller for keeping information about the size of the jitter buffer and for providing speech decoding with control information such that speech decoding, based on such information, includes information about the received (encoded or decoded) speech signal, and provides for a dynamic adaptation of the size of the jitter buffer by modification of the received, packetized speech signal. According to different embodiments, the speech signal may be modified during the decoding step, or after decoding.
  • [0013]
    In one embodiment, the jitter buffer controller uses information on the current size of the jitter buffer and on the desired (default) size of the jitter buffer to determine if, and how, the jitter buffer size needs to be adapted.
  • [0014]
    A packet particularly contains at least one speech frame, each frame containing a number of parameters representing speech. In a particular implementation, a received packet consists of one speech frame representing, for example, 160 samples when decoded. According to the invention, the speech signal is compressed or extended in time to adapt the functional size of the jitter buffer. The number of samples that remain in the receiver until the received packet has to be fetched by the decoder represents the current size of the jitter buffer. The desired, or default, size of the jitter buffer is represented by the number of samples that should remain in the receiver until the received packet has to be fetched by the decoder.
  • [0015]
    The present invention also adjusts the functional size of the jitter buffer, which is relevant when the speech decoder actually fetches packets (substantially as soon as they arrive), decodes them, and stores them before delivery to the D/A converter. This storing after decoding corresponds to the actual storing in the jitter buffer, hence the term functional size of the jitter buffer. Then, however, adaptation is somewhat delayed since adaptation of the size is done in relation to the subsequent packet.
  • [0016]
    Particularly, for controlling the size of the jitter buffer, a number of samples are added or removed upon decoding a packet in the decoder. In one embodiment, for adapting the size of the jitter buffer, a number of pitch periods (or a number of samples representing one or more pitch periods) are added or removed when decoding. Still further, the number of frames that a packet contains may be increased or decreased depending on whether the size of the jitter buffer needs to be increased or reduced.
  • [0017]
    The jitter buffer controller detects the arrival times of packets at the jitter buffer and the times at which packets are fetched by the decoder for determining if, and how, the jitter buffer size needs to be adapted. A feedback channel is provided to inform the jitter buffer controller about the current jitter buffer size such that the controller is always provided with updated information about jitter buffer size. Via the feedback channel, information is provided to the controller about how many samples/frames/pitch periods were added to or removed from a packet at the latest adaptation of the jitter buffer size. Alternatively, if all the intelligence resides in the controller, there is no need to transfer the information, since it is already available to the controller. In one embodiment, for altering the number of samples or pitch periods, the currently received packet is used. Alternatively, for adding a frame to a packet, the parameters of a number of the preceding frames are used to synthesize a new frame. Alternatively, for insertion of a frame, parameters of a preceding frame and of a subsequent frame can be used in an interpolation step to provide a new frame. For deleting a frame, the subsequent frame is advantageously deleted from the jitter buffer.
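    The interpolation-based frame insertion described above can be sketched as follows. This is only an illustration under strong assumptions: each frame is modeled as a plain list of numeric parameters that can be meaningfully averaged, which is not true of every codec parameter (codebook indices, for instance). The function names and the parameter values are hypothetical.

```python
def interpolate_frame(prev_params, next_params, weight=0.5):
    """Build a new frame by linear interpolation of two neighbouring frames."""
    if len(prev_params) != len(next_params):
        raise ValueError("frames must carry the same number of parameters")
    return [(1 - weight) * p + weight * q
            for p, q in zip(prev_params, next_params)]


def insert_frame(frames, index):
    """Insert an interpolated frame between frames[index] and frames[index + 1]."""
    new_frame = interpolate_frame(frames[index], frames[index + 1])
    return frames[:index + 1] + [new_frame] + frames[index + 1:]


def delete_subsequent_frame(frames, index):
    """Drop the frame following frames[index], shortening the sequence."""
    return frames[:index + 1] + frames[index + 2:]


# Two hypothetical frames with two parameters each.
frames = [[0.0, 2.0], [1.0, 4.0]]
extended = insert_frame(frames, 0)
print(extended)
```

    Inserting a frame lengthens the played-out signal by one frame duration, and deleting one shortens it by the same amount.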
  • [0018]
    In one embodiment, the decoder comprises a CELP or other similar decoder, and an already existing controller therein may be used as the jitter buffer controller. In an alternative embodiment, the decoder performs an LPC-analysis to provide an LPC-residual and conduct an LPC-synthesis.
  • [0019]
    In one embodiment, a method of adapting the (functional) size of a jitter buffer in a terminal unit in a communication system supporting packet-based communication of speech and data is provided which comprises the steps of: receiving a speech signal, with encoded speech in packets, from a transmitting side in a jitter buffer of a receiver; storing the packets in a jitter buffer; fetching packets from the jitter buffer to a speech decoder. The method may further comprise the steps of detecting if the functional size of the jitter buffer needs to be increased or decreased and, if it does, extending or compressing the received speech signal in time through controlling the number of samples the speech frames stored in the jitter buffer would represent when decoded. This method may, in order to extend or compress the speech signal, increase or decrease the rate at which the decoder fetches packets from the actual jitter buffer or, alternatively, the rate at which the decoder outputs frames, in the case when decoding is done substantially as soon as a packet arrives to the jitter buffer.
  • [0020]
    In another embodiment, the method comprises adapting the size of the jitter buffer by adding or deleting one or more samples when a packet is decoded in the decoder. In an alternative embodiment the method includes adding or removing one or more pitch periods upon decoding in the decoder. In still another embodiment, the method includes adding or removing one or more frames to or from a packet. In this instance, an additional frame is then introduced between two packets. The additional frame will generate a speech frame comprising N samples, e.g. 160 samples.
  • [0021]
    The present invention may further comprise the step of detecting, in an automatic control, the arrival times of packets to the jitter buffer and the times at which packets are fetched by the decoder, to determine if and how the jitter buffer size needs to be adapted. Another embodiment, instead of detecting packets fetched by the decoder, determines when the decoder has to output a packet to a D/A converter (e.g., from a play-out buffer associated with the decoder).
  • [0022]
    The present invention may further comprise dynamically adapting the size of the jitter buffer through increasing or decreasing the rate at which the decoder fetches packets from the jitter buffer. According to one embodiment, a CELP or CELP-like decoder is used for adaptation control, thus using existing LPC parameters, giving an LPC-residual. In another embodiment, an LPC-analysis is performed to provide an LPC-residual before adding or removing samples/frames/pitch periods or performing an LPC-synthesis.
  • [0023]
    In one embodiment, a system handling delay variations of a jitter buffer in a receiver system in a communication system supporting packet-based communication of speech and data is provided, wherein packets with frames (one or more) of encoded speech are received in the jitter buffer from a transmitting terminal unit at a varying first frequency, and wherein the speech decoder fetches packets from the jitter buffer with a second frequency. The system comprises a jitter buffer controller to dynamically control the second frequency with which the decoder fetches packets from the jitter buffer, such that the size of the jitter buffer can be changed or adapted. The frequency at which fetching of packets is performed is controlled through increasing or decreasing the number of samples/pitch periods/frames contained in a packet, when decoded in the decoder.
  • [0024]
    In summary, the present invention provides a number of advantages over conventional systems and methods. With the present invention, receiver systems in a communication system efficiently handle variations in arrival time (i.e., packets that arrive irregularly) between consecutive packets while providing improved perceived speech quality. They also smooth out delay variations at the receiver while efficiently adapting the size of the jitter buffer, balancing delays at the receiving side and optimizing packet loss. Other advantages offered by the present invention will be readily appreciated upon reading the below detailed description of the embodiments of the invention.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • [0025]
    The above and further advantages of the invention may be better understood by referring to the following description in conjunction with the accompanying drawings, in which:
  • [0026]
    FIG. 1 is a speech signal in the time domain;
  • [0027]
    FIG. 2 is an LPC-residual of a speech signal in the time domain;
  • [0028]
    FIG. 3 is an analysis-by-synthesis speech encoder with an LTP-filter;
  • [0029]
    FIG. 4 is an analysis-by-synthesis speech encoder with an adaptive codebook;
  • [0030]
    FIG. 5 is a block diagram of two telecommunication units, one of which acts as a transmitter transmitting a signal over the network to the other acting as a receiver;
  • [0031]
    FIG. 6 is a block diagram illustrating parts of a receiver system for adaptation of the jitter buffer according to the invention;
  • [0032]
    FIG. 7A is an illustration of an original sequence of two speech frames;
  • [0033]
    FIG. 7B is an illustration of the insertion of a frame to the sequence of FIG. 7A;
  • [0034]
    FIG. 8A is an illustration of an original waveform sequence;
  • [0035]
    FIG. 8B is an illustration of the insertion of a pitch based waveform segment to the sequence of FIG. 8A;
  • [0036]
    FIG. 9A is an original waveform;
  • [0037]
    FIG. 9B is an illustration of the insertion of a waveform segment to the waveform of FIG. 9A; and
  • [0038]
    FIG. 10 is a flow diagram describing the functioning of a jitter buffer controller.
  • DETAILED DESCRIPTION OF THE INVENTION
  • [0039]
    The present invention provides a receiver system in a communication system supporting packet-based communication of speech and data (e.g., an IP-network). The receiver system includes a receiver, a speech decoder and a jitter buffer for handling delay variations in the reception of a speech signal consisting of packets containing frames with encoded speech. The present invention also provides a system for improving the handling of delay variations in a jitter buffer within the receiver system in a communication system supporting packet-based communication of speech and data. The present invention also provides a system for adapting the size of a jitter buffer in a receiver system in a communication system supporting packet-based communication of speech and data.
  • [0040]
    While the making and using of various embodiments of the present invention are discussed in detail below, it should be appreciated that the present invention provides many applicable inventive concepts that can be embodied in a wide variety of specific contexts. The specific embodiments discussed herein are merely illustrative of specific ways to make and use the invention and do not delimit the scope of the invention. The discussion herein relates to communication systems supporting packet-based communication of speech and data. Specifically, illustrative embodiments are described as adapted for use in communication systems supporting packet-based communication of speech and data, where a transmitting communication unit samples input speech, comprising a number of speech parameters, which are then packetized and transmitted over the network to a receiving unit. The receiving unit, or system, plays the recreated, decoded speech signal with a sampling rate; and, at the receiving side, an adaptable jitter buffer handles the delay variations when the packets are unpacked. However, it shall be understood that the invention is not restricted to these exemplifying embodiments, or to such an application. The basic principles of the invention may likewise be applied to other communication systems.
  • [0041]
    In a packet-based communication system supporting packet-based communication of speech and data, a first terminal unit, when acting as a transmitting communication unit, samples input speech and represents it by a number of speech parameters (e.g., three types of parameters), which here represent a speech frame comprising a number of samples (e.g., 160 samples for a 20 ms speech frame if a sampling rate of 8 kHz is used). Those parameters are packetized and transmitted over the network to a receiver system, which plays out the recreated, decoded speech signal (e.g., with a sampling rate of 8 kHz). At the receiving side, the packets are thus unpacked. However, as referred to earlier, the delays with which packets are received on the receiving side may differ from one packet to another, since some of the packets are transmitted quickly over the network whereas others are transmitted slowly. Moreover, the clocks of the transmitting and receiving sides, respectively, are usually not synchronized. If a subsequent packet has not arrived when the preceding packet already has been played out, the user of the receiver will perceive this as a disturbance. The jitter buffer handles such delay variations because it is important that there is something to play out at all times.
  • [0042]
    Referring now to the drawings, FIG. 1 illustrates a typical segment of a speech signal in the time domain. This speech signal shows a short-term correlation corresponding to the vocal tract and a long-term correlation corresponding to the vocal cords. The short-term correlation can be predicted by using an LPC-filter and the long-term correlation can be predicted through the use of an LTP-filter. LPC is linear predictive coding. Similarly, LTP is long-term prediction. Linear, in this case, implies that the prediction is a linear combination of previous samples of the speech signal.
  • [0043]
    The LPC-filter is usually denoted: $H(z) = \frac{1}{A(z)} = \frac{1}{1 - \sum_{i=1}^{n} a_i z^{-i}}$
  • [0044]
    By feeding a speech signal through the inverse of the LPC-filter H(z), i.e. through A(z), the LPC residual, $r_{LPC}(n)$, is found.
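    The residual computation can be sketched as follows. This is a minimal illustration, assuming the common convention that the residual is the prediction error obtained with A(z) = 1 - sum of a_i z^-i, and that filtering the residual with H(z) = 1/A(z) resynthesizes the signal. The function names, the first-order coefficient and the sample values are hypothetical.

```python
def lpc_residual(signal, coeffs):
    """Prediction error r(n) = s(n) - sum_i a_i * s(n - i), i.e. filtering with A(z)."""
    residual = []
    for n, sample in enumerate(signal):
        prediction = sum(a * signal[n - i]
                         for i, a in enumerate(coeffs, start=1) if n - i >= 0)
        residual.append(sample - prediction)
    return residual


def lpc_synthesis(residual, coeffs):
    """Resynthesis s(n) = r(n) + sum_i a_i * s(n - i), i.e. filtering with H(z) = 1/A(z)."""
    signal = []
    for n, r in enumerate(residual):
        prediction = sum(a * signal[n - i]
                         for i, a in enumerate(coeffs, start=1) if n - i >= 0)
        signal.append(r + prediction)
    return signal


# A decaying signal that a first-order predictor with a_1 = 0.5 predicts exactly.
speech = [1.0, 0.5, 0.25, 0.125]
coeffs = [0.5]
residual = lpc_residual(speech, coeffs)
print(residual, lpc_synthesis(residual, coeffs))
```

    Because the predictor matches this toy signal exactly, all the energy of the residual sits in the first sample, and synthesis recovers the original sequence.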
  • [0045]
    The LPC residual shown in FIG. 2 contains pitch pulses P generated by the vocal cords. The distance between two pitch pulses is a so-called lag. The pitch pulses P are also predictable, and since they represent the long-term correlation of the speech signal, they are predicted through a long-term predictor, i.e. an LTP-filter given by the distance L between the pitch pulses P and the gain b of a pitch pulse P. The LTP filter is commonly denoted:
    $F(z) = b \cdot z^{-L}$
  • [0046]
    When the LPC-residual is fed through the inverse of the LTP-filter F(z), an LTP-residual rLTP(n) is created. In the LTP-residual the long-term correlation in the LPC-residual is removed, giving the LTP-residual a noise-like appearance.
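    The removal of the long-term correlation can be sketched in a few lines. The sketch assumes the usual long-term prediction error form, r_LTP(n) = r_LPC(n) - b * r_LPC(n - L), with lag L and gain b as defined above; the function name and the pulse train are hypothetical.

```python
def ltp_residual(lpc_res, lag, gain):
    """Long-term prediction error: r_LTP(n) = r_LPC(n) - gain * r_LPC(n - lag)."""
    out = []
    for n, value in enumerate(lpc_res):
        past = lpc_res[n - lag] if n - lag >= 0 else 0.0
        out.append(value - gain * past)
    return out


# Hypothetical LPC residual: pitch pulses every 4 samples, each half the previous one.
lpc_res = [1.0, 0.0, 0.0, 0.0,
           0.5, 0.0, 0.0, 0.0,
           0.25, 0.0, 0.0, 0.0]
ltp_res = ltp_residual(lpc_res, lag=4, gain=0.5)
print(ltp_res)
```

    With the lag and gain matched to the pulse train, every pulse after the first is predicted away, which is the sense in which the long-term correlation is removed.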
  • [0047]
    Many low bit rate speech coders are so-called hybrid coders. FIG. 3 illustrates an analysis-by-synthesis speech encoder 100 with an LTP-filter 140. The vocal tract is described with an LPC-filter 150 and the vocal cords are described with an LTP-filter 140, while the LTP-residual $\hat{r}_{LTP}(n)$ is waveform-compared with a set of more or less stochastic codebook vectors from the fixed codebook 130. The input signal $I_{IN}$ is divided into frames 110 with a typical length of 10-30 ms. For each frame, an LPC-filter (a short-term predictor) 150 is calculated through an LPC-analysis 120, and the LPC-filter 150 is included in a closed loop to find the parameters of the LTP-filter 140 (i.e., the lag and the LTP gain) and of the fixed codebook (i.e., the codebook index and the codebook gain). The speech decoder 180 is included in the encoder and consists of the fixed codebook 130, whose output $\hat{r}_{LTP}(n)$ is connected to the LTP-filter 140, the output of which, $\hat{r}_{LPC}(n)$, is connected to the LPC-filter 150, generating an estimate $\hat{s}(n)$ of the original speech signal $S_{IN}(n)$. In the analysis to find the best set of parameters to represent the original sequence, the parameters of the fixed codebook, the gain and the long-term predictor are thus combined in different ways to generate different synthesized signals. Each estimated signal $\hat{s}(n)$ is compared with the original speech signal $S_{IN}(n)$ and a difference signal $e(n)$ is calculated. The difference signal $e(n)$ is then weighted in 160 for calculation of a perceptually weighted error measure $e_w(n)$. The set of parameters giving the least perceptually weighted error measure $e_w(n)$ is transmitted to the receiving side 170. This method, analysis by synthesis, thus consists of comparing different synthesized signals and selecting the best match.
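    The closed-loop selection step can be sketched as an exhaustive search. This is a deliberately reduced illustration, not the encoder of FIG. 3: it leaves out the LTP and LPC synthesis filters and the perceptual weighting, and uses a plain squared error instead. The codebook, the gain set and the function names are hypothetical.

```python
def squared_error(target, candidate):
    """Unweighted squared error between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(target, candidate))


def best_codebook_entry(target, codebook, gains):
    """Try every (vector, gain) pair and keep the one with the least error.

    Returns a tuple (error, codebook_index, gain).
    """
    best = None
    for index, vector in enumerate(codebook):
        for gain in gains:
            synthesized = [gain * v for v in vector]
            error = squared_error(target, synthesized)
            if best is None or error < best[0]:
                best = (error, index, gain)
    return best


# Hypothetical target segment, fixed codebook and quantized gains.
target = [1.0, -1.0, 0.5, 0.0]
codebook = [
    [1.0, 0.0, 0.0, 0.0],
    [2.0, -2.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, -1.0],
]
gains = [0.5, 1.0]
error, index, gain = best_codebook_entry(target, codebook, gains)
print(error, index, gain)
```

    Only the winning index and gain would need to be transmitted; the receiver, holding the same codebook, can regenerate the same synthesized segment.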
  • [0048]
    FIG. 4 illustrates another type of analysis-by-synthesis speech encoder 100 in which the LTP-filter 140 of FIG. 3 is replaced by an adaptive codebook 135. The LPC-residual $\hat{r}_{LPC}(n)$ is the sum of the outputs of the adaptive and the fixed codebooks 135 and 130. All other elements have the same functionality as in FIG. 3, which illustrates the analysis-by-synthesis speech encoder with the LTP-filter 140.
  • [0049]
    The above brief summary of the functioning of speech coding with reference to FIGS. 3 and 4, is included merely for the purposes of giving an understanding of the methods that will be described in relation to the present invention. In practice, much work is spent on reducing the complexity and on increasing the perceived speech quality—some of which is beyond the scope of the present invention. Any suitable type of speech coding may be implemented in accordance with the present invention.
  • [0050]
    FIG. 5 schematically illustrates block diagrams of two telephone units, e.g. IP-phones, each of which of course is able to act both as a transmitter and a receiver, but in this figure one of the telephone terminal units 11 is supposed to act as a transmitter transmitting a signal to another telephone terminal unit 12 acting as a receiver system. The terminal units 11, 12 include a microphone 1, A/D converter 2, voice encoder 3, voice packager 4 and voice decryptor 5, for transmitting a signal over the network to another terminal unit 12, or transceiver. The voice decryptors are not mandatory.
  • [0051]
    The receiver comprises a voice buffer 10 for receiving signals from the network, a jitter buffer 20 for handling the delay variations in the reception of packets over the network, voice decryptor 30 and voice decoder 40, a D/A converter 60 and loudspeaker 70. The voice buffer and the jitter buffer may also consist of one common unit, simply denoted jitter buffer. Jitter buffer controller 50 communicates with the jitter buffer 20, and voice decoder 40 is used to control adaptation of the size of jitter buffer 20, as further explained with reference to FIG. 6.
  • [0052]
    According to one embodiment of the present invention, the decoder at the receiving side comprises a CELP-decoder or a CELP-like decoder. In that case, the adaptation control can be performed using the decoder. According to other embodiments, where other decoders are used, the characteristics of the input speech signal are used for the adaptation. In both cases, the present invention processes the input speech signal to either expand or compress it in time, in order to provide for the appropriate adaptation of the jitter buffer (i.e. the size of the jitter buffer).
  • [0053]
    FIG. 6 is a block diagram which is simplified to illustrate only those parts of the receiver system that are actually involved in adaptation of the jitter buffer according to the present invention. Thus, input to the jitter buffer consists of packets from a network (e.g., from a terminal unit acting as transmitter), and the packets from the network comprise speech parameters, representing one or more speech frames each comprising a number, N, of samples (e.g., 160 samples if a sampling frequency of 8 kHz is used). Thus, packets are stored in jitter buffer 20 and the jitter buffer must have a certain size in order to enable a continuous flow to speech decoder 40. Other sampling frequencies can be used.
  • [0054]
    If, for example, the size of the buffer is too small, there is nothing to play out, which may be the case if the delay in the network is too long. According to the present invention, the jitter buffer size is then increased. If the speech decoder wants to play out with a frequency of 8 kHz, it fetches packets (e.g., normally comprising 160 samples) from the jitter buffer, synthesizes speech, and plays it out. If, for example, the size of the jitter buffer is fixed and corresponds to 160 samples, it is not possible to change the size of the jitter buffer and a packet will be fetched every 20 ms. If, however, according to the present invention, it is detected that the jitter buffer size needs to be increased, then, for example, the speech decoder does not fetch the packet after 20 ms but instead after 22 ms (i.e., it waits two more milliseconds until it fetches the next packet). According to one embodiment, samples are then added in the decoder, which means that the speech signal is extended by 2 ms. For example, one or more pitch periods may be added or, for example, 16 samples during a given period. Thus, a packet is fetched which contains a number of speech parameters. Normally the parameters would be converted to a number of samples, typically 160 samples. If adaptation of the size of the jitter buffer is needed, the number of samples that the parameters represent can be changed and then samples, pitch periods or frames can be added or removed in the decoder, or frames can be added or removed in the jitter buffer. In other words, the time period until the subsequent packet is fetched is prolonged or shortened. If the time period is prolonged, the size of the jitter buffer is increased, and vice versa: if the time period is shortened, the size of the jitter buffer is reduced.
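    The arithmetic in this paragraph, and one possible way of adding a pitch period, can be sketched as follows. The 8 kHz rate, the 160-sample frame and the 16 added samples come from the text; the function names are hypothetical, and repeating the last pitch period is only one simple, assumed way of extending a waveform.

```python
SAMPLE_RATE_HZ = 8000  # sampling frequency from the text


def fetch_interval_ms(frame_samples, added_samples=0, sample_rate_hz=SAMPLE_RATE_HZ):
    """Time until the next packet must be fetched when one packet is played out."""
    return (frame_samples + added_samples) * 1000 / sample_rate_hz


def add_pitch_period(samples, lag):
    """Extend a decoded segment by repeating its last pitch period (lag samples)."""
    if not 0 < lag <= len(samples):
        raise ValueError("lag must be between 1 and the segment length")
    return samples + samples[-lag:]


def remove_pitch_period(samples, lag):
    """Shorten a decoded segment by dropping its last pitch period (lag samples)."""
    if not 0 < lag < len(samples):
        raise ValueError("lag must be between 1 and the segment length - 1")
    return samples[:-lag]


print(fetch_interval_ms(160))        # unmodified frame
print(fetch_interval_ms(160, 16))    # 16 samples added in the decoder
print(fetch_interval_ms(160, -16))   # 16 samples removed in the decoder

segment = list(range(8))             # stand-in for decoded samples
longer = add_pitch_period(segment, lag=3)
```

    Each added sample postpones the next fetch by 1/8 ms at 8 kHz, so the jitter buffer has that much more time to fill; removing samples has the opposite effect.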
  • [0055]
    Returning to FIG. 6, the jitter buffer controller 50 keeps information about the current size of the jitter buffer corresponding to the number of samples the parameters in the jitter buffer would represent after decoding, and of the default jitter buffer size corresponding to the default or desired size of the jitter buffer 20 corresponding to the number of samples the parameters should represent after decoding.
  • [0056]
    In one embodiment, jitter buffer controller 50 detects the times at which packets arrive and the time when packets are fetched by the speech decoder 40. At the input side, the frequency varies due to the delay variations. A solution is then to fetch packets more or less often to dynamically adapt the rate at which packets are fetched. Thus, using the information on arrival times of packets to the jitter buffer, and the times when packets are fetched by the decoding, the jitter buffer controller 50 determines if, and how, the jitter buffer 20 size needs to be adapted.
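One way to picture this bookkeeping is the sketch below. It is not taken from the application: the class `TimingTracker`, its method names, and the use of sequence numbers as keys are all assumptions made for illustration. It records when each packet arrives and how long it waited before the decoder fetched it.

```python
class TimingTracker:
    """Hypothetical helper that records packet arrival and fetch times."""

    def __init__(self):
        self._arrivals = {}   # sequence number -> arrival time in ms
        self.waits_ms = []    # time each fetched packet spent in the buffer

    def on_arrival(self, seq: int, time_ms: float) -> None:
        self._arrivals[seq] = time_ms

    def on_fetch(self, seq: int, time_ms: float) -> float:
        """Return how long packet `seq` waited between arrival and fetch."""
        if seq not in self._arrivals:
            raise KeyError(f"packet {seq} was never recorded as arrived")
        wait = time_ms - self._arrivals.pop(seq)
        self.waits_ms.append(wait)
        return wait


tracker = TimingTracker()
tracker.on_arrival(1, 100.0)
tracker.on_arrival(2, 125.0)   # arrived 5 ms late because of jitter
first_wait = tracker.on_fetch(1, 140.0)
second_wait = tracker.on_fetch(2, 160.0)
```

A shrinking wait, as between the two packets here, suggests the buffer is running low; a controller could use such values as input to its adaptation decision.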
  • [0057]
    Thus, in one embodiment, information about the speech signal is provided from the speech decoder 40 to the jitter buffer controller 50 containing, for example, an analysis of the speech signal content, which information comprises input data to be used in the decision as to when, if, and how the adaptation is to be performed. The adaptation, or modification, may be done during the decoding step, or after decoding. Information relating to the speech signal, on which the modification decision is based, may relate, in different embodiments, to the encoded or to the decoded speech signal. This procedure will be further described below. Control information relating to the decision that has been made, is forwarded to speech decoder 40, in this implementation. Thus, whenever the speech decoder needs a speech analysis frame to process for speech synthesis generation, speech decoder 40 extracts the frame from the jitter buffer 20 together with data information. The data information may, for example, be estimated bit rate within the frame, etc.
  • [0058]
    As referred to above, in parallel, the speech decoder 40 also gets the control information from the jitter buffer controller 50 as to whether or not it should modify the current frame. If the control information says that a modification is needed, information is also contained relating to how the frame should be modified (i.e., expanded or compressed), and the kind of modification (e.g., if a single sample is to be added or removed, if one or more pitch periods are to be added or removed, or if a frame is to be added or removed).
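The control information described here could be represented as a small record. The following is a sketch only; the type names `Direction`, `Unit`, and `ControlInfo` and the `count` field are assumptions and do not appear in the application.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Direction(Enum):
    EXPAND = "expand"       # buffer size should increase
    COMPRESS = "compress"   # buffer size should decrease


class Unit(Enum):
    SAMPLE = "sample"
    PITCH_PERIOD = "pitch_period"
    FRAME = "frame"


@dataclass(frozen=True)
class ControlInfo:
    """Hypothetical message from the jitter buffer controller to the decoder."""
    modify: bool
    direction: Optional[Direction] = None
    unit: Optional[Unit] = None
    count: int = 1  # how many units to add or remove (assumed field)

    def __post_init__(self):
        # A modification request must say both how and what to modify.
        if self.modify and (self.direction is None or self.unit is None):
            raise ValueError("a modification needs a direction and a unit")


no_change = ControlInfo(modify=False)
add_pitch = ControlInfo(True, Direction.EXPAND, Unit.PITCH_PERIOD)
```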
  • [0059]
    In one embodiment, a modification is performed within the current frame or packet in the most appropriate way. According to another embodiment, the intelligence lies in the jitter buffer controller. Alternatively, the functionality of the jitter buffer controller may be provided in the speech decoder. Still further, the control functionality and the intelligence may be distributed between the speech decoder and the jitter buffer controller. Then, for example, information about the result of an intended modification is provided from the speech decoder to the jitter buffer controller.
  • [0060]
    A modification may be done in different ways, for example, through extending or compressing the speech signal, depending on whether the size of the jitter buffer needs to be increased or decreased, and the speech signal can be extended or compressed in different ways. One way is to insert or delete one or more samples. It is also possible to insert or delete one or more pitch periods, which actually is also an insertion or deletion of a number of samples. Still further it is possible to insert or delete one or more frames (i.e., that also being an insertion/deletion of a number of samples, which is the most general definition). There may also be other ways a speech signal can be extended or compressed in accordance with the present invention.
  • [0061]
    As soon as the speech signal is extended or compressed in time, jitter buffer size is affected indirectly, since the playback point of the next speech frame is adjusted. Methods for inserting or deleting samples are given in “Method and Apparatus in a Telecommunication System”, which is incorporated herein by reference. However, the methods disclosed therein were evaluated for compensation of clock drift in the sampling frequency between the sending and the receiving side to avoid starvation in the play out buffer or to avoid an increasing delay. Starvation will occur if the receiving side has a higher sampling rate than the sending side and the delay will increase if the receiving side has a lower sampling rate than the sending side. According to the present invention, such methods can be used for jitter buffer adaptation. Hence, jitter buffer size can be adapted in different ways using sample, pitch or frame insertion/deletion.
  • [0062]
    If the modification is a frame insertion (e.g., as decided by controller 50), such modification can be provided for in different ways. In one embodiment, the parameters of a previous frame are repeated and used during the synthesis of the inserted frame. Alternatively, corresponding to an advantageous implementation, a set of parameters is used during the synthesis of the inserted frame that has been derived from interpolation of parameters from the previous frame and the next frame, respectively. Such an approach is similar to that used for concealing lost frames (Error Concealment Units).
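The two ways of obtaining parameters for an inserted frame can be sketched as follows. This is an illustration under stated assumptions: frame parameters are modeled as plain lists of floats and interpolated linearly, whereas a real speech codec would interpolate each parameter type in a suitable domain. The function names are hypothetical.

```python
def repeat_parameters(prev):
    """Inserted frame reuses the previous frame's parameters."""
    return list(prev)


def interpolate_parameters(prev, nxt, weight=0.5):
    """Inserted frame uses a linear blend of the previous and next frames."""
    if len(prev) != len(nxt):
        raise ValueError("parameter vectors must have the same length")
    return [(1.0 - weight) * p + weight * n for p, n in zip(prev, nxt)]


def insert_frame(frames, index, interpolate=True):
    """Return a new sequence with one frame inserted after frames[index]."""
    prev, nxt = frames[index], frames[index + 1]
    new = interpolate_parameters(prev, nxt) if interpolate else repeat_parameters(prev)
    return frames[: index + 1] + [new] + frames[index + 1 :]


frames = [[0.0, 2.0], [1.0, 4.0]]
with_inserted = insert_frame(frames, 0)
```

With the default weight of 0.5 the inserted frame sits midway between its neighbours, which tends to give a smoother transition than plain repetition.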
  • [0063]
    FIGS. 7A, 7B illustrate insertion of a frame. In FIG. 7A, a portion of an original frame sequence is illustrated in which a previous frame is indicated as well as a next frame. FIG. 7B illustrates how the inserted frame is introduced between the previous frame and the next frame.
  • [0064]
    If a frame is to be deleted, then the next subsequent frame may be deleted from the jitter buffer and a smoothing action is performed in the next frame. From the speech decoder 40, speech information and synthesized speech are output. The speech information comprises the number of samples that have been generated, information about how many samples are comprised in a pitch period, as well as other characteristics of the speech (e.g., whether the speech is voiced or unvoiced). The jitter buffer controller 50 uses the speech information to decide which modification, if any, needs to be made. Typically, the pitch period will be between approximately 20 and 140 samples if a sampling frequency of 8 kHz is used. Since the pitch period is quasi-stationary, at least during voiced segments of the speech, jitter buffer controller 50 obtains a rough estimate of the pitch period by considering the pitch periods of previous frames. Based on this information, the jitter buffer controller 50 is able to decide whether a pitch based action or a frame based action is the most appropriate. The speech decoder 40 gives the result of the modification after the modification action has been done. The size of the speech synthesis frame will vary, depending on which action is taken. For a single sample insertion/deletion the value will be (framesize +1) and (framesize −1), respectively. For a frame insertion/deletion, the value will be (2×framesize) and (0), respectively. For the action pitch insertion/deletion, the size of the speech synthesis frame will vary depending on which pitch period actually has been inserted or removed.
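The resulting speech synthesis frame sizes can be tabulated in a short sketch. The action names are hypothetical labels, and the pitch cases assume that exactly one pitch period of the given length is inserted or removed, which is one possible reading of the text.

```python
def synthesis_frame_size(action: str, frame_size: int = 160, pitch_period: int = 0) -> int:
    """Number of samples the decoder outputs for one fetched frame."""
    sizes = {
        "none": frame_size,
        "sample_insert": frame_size + 1,
        "sample_delete": frame_size - 1,
        "frame_insert": 2 * frame_size,
        "frame_delete": 0,
        # Assumption: a single pitch period is added or removed.
        "pitch_insert": frame_size + pitch_period,
        "pitch_delete": frame_size - pitch_period,
    }
    if action not in sizes:
        raise ValueError(f"unknown action: {action}")
    return sizes[action]


# With a pitch period of 50 samples, a pitch insertion yields 210 samples.
size_after_pitch_insert = synthesis_frame_size("pitch_insert", 160, 50)
```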
  • [0065]
    The jitter buffer controller 50 uses two values to make a decision relating to adaptation, as briefly mentioned in the foregoing. The first value is the current size of the jitter buffer, which is represented by the number of samples that remain in the receiver system until the received packet has to be fetched by the decoder. The second value is the default size, which is represented by the number of samples that should remain in the receiver system until the received packet has to be fetched by the decoder. If the current size differs too greatly from the default size to allow adaptation within one and the same speech frame, subsequent frames will be used to adapt the size of the jitter buffer 20.
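Spreading a large correction over several frames might look like the sketch below. The per-frame limit `max_step` is an assumed parameter that the text does not specify, and the function name is hypothetical. A positive step means samples are inserted (the buffer grows); a negative step means samples are removed.

```python
def plan_adaptation(current_size: int, default_size: int, max_step: int):
    """Split the difference between default and current size into per-frame steps."""
    if max_step <= 0:
        raise ValueError("max_step must be positive")
    remaining = default_size - current_size
    steps = []
    while remaining != 0:
        # Clamp each step to what a single frame is allowed to absorb.
        step = max(-max_step, min(max_step, remaining))
        steps.append(step)
        remaining -= step
    return steps


# Growing from 320 to 480 samples with at most 64 samples per frame
# takes three frames.
growth_plan = plan_adaptation(320, 480, 64)
```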
  • [0066]
    In embodiments in which the receiving system contains a non-CELP-type decoder, the methods of the present invention can be implemented by introducing some additional steps, such as performing an LPC analysis to obtain an LPC residual. The same actions as described above (inserting or deleting one or more samples, frames or pitch periods) are then performed, followed by an LPC synthesis. In the patent application “Method and Apparatus in a Telecommunication System” referred to above, such sample rate conversion methods are described.
  • [0067]
    The present invention may be implemented even when no CELP-like decoder is available, utilizing the general methods of the present invention. In one embodiment, a single sample is inserted or deleted on a raw speech signal. In this case, a framing of the speech signal is made. Samples to remove are selected so as to avoid segments with more information, i.e., where the signal varies rapidly. However, caution should be used when implementing this method because, if insertion/deletion is made too often, the speech quality risks deteriorating.
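A minimal sketch of single-sample insertion and deletion on a raw frame follows. The selection rule (pick the interior sample whose differences to both neighbours are smallest) is an assumption chosen to illustrate "avoid where the signal varies rapidly"; the text does not prescribe a specific measure. Function names are hypothetical.

```python
def flattest_index(frame):
    """Interior index where the signal changes least relative to its neighbours."""
    if len(frame) < 3:
        raise ValueError("need at least three samples")
    return min(
        range(1, len(frame) - 1),
        key=lambda i: abs(frame[i] - frame[i - 1]) + abs(frame[i + 1] - frame[i]),
    )


def delete_sample(frame):
    """Remove one sample from the flattest part of the frame."""
    i = flattest_index(frame)
    return frame[:i] + frame[i + 1 :]


def insert_sample(frame):
    """Insert one sample after the flattest point, as the mean of its neighbours."""
    i = flattest_index(frame)
    return frame[: i + 1] + [(frame[i] + frame[i + 1]) / 2] + frame[i + 1 :]


frame = [0, 10, 10, 10, 0, -10]
shorter = delete_sample(frame)   # one sample fewer
longer = insert_sample(frame)    # one sample more
```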
  • [0068]
    If there is a need for a faster adaptation on a raw speech signal, a segment of the waveform can be repeated as illustrated in FIGS. 8A and 8B, wherein FIG. 8A illustrates an original waveform sequence and FIG. 8B illustrates how a pitch based waveform is inserted.
  • [0069]
    The repetition may be a full pitch period, but it can also be limited to a single wave as illustrated schematically in FIG. 9B. FIG. 9A illustrates the original waveform whereas FIG. 9B illustrates the modified waveform wherein a waveform segment has been inserted. Thus, according to the invention, the concept can be implemented when CELP or CELP-like decoders are available, when other decoders are available using a pseudo-CELP approach as described above, or, insertions/deletions can be done to/from the speech signal itself.
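Repeating a waveform segment on the raw signal can be sketched as below. The function `repeat_segment` and its arguments are hypothetical; the segment length may be a full pitch period or a shorter single wave, and how the start position and length are chosen is left open here.

```python
def repeat_segment(signal, start: int, length: int):
    """Return a copy of `signal` with signal[start:start+length] played twice."""
    if length <= 0 or start < 0 or start + length > len(signal):
        raise ValueError("segment must lie inside the signal")
    end = start + length
    # Original up to the end of the segment, the repeated segment, then the rest.
    return signal[:end] + signal[start:end] + signal[end:]


signal = [1, 2, 3, 4, 5, 6]
stretched = repeat_segment(signal, start=1, length=2)
```

In practice the segment boundaries would be aligned with the pitch so that the repetition joins smoothly; this sketch only shows the splice itself.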
  • [0070]
    FIG. 10 is a flow diagram describing processing flow of the jitter buffer controller 50. It functions as follows: from the start, 200, it determines if there has been a network event, 210. In other words it determines whether a packet has arrived. If not, the speech decoder is updated as far as the extraction times are concerned, 211 (i.e., no packet has arrived). If, however, a packet has arrived, the jitter buffer calculations are updated, 220.
  • [0071]
    The size of the jitter buffer can be calculated in different ways. One way to calculate the size of the jitter buffer is to use a sliding average, where for every packet the time left until an incoming packet is to be used for speech synthesis is measured. For consecutive packets, this will of course vary, but by taking a number of values, for example, the last ten values, and then forming the average, it is possible to see whether the jitter buffer tends to be too large or too small or whether it appears to have the appropriate size. It is then established whether the packet loss probability exceeds a maximum threshold value THRmax, 230. This value also depends on the situation. THRmax is the probability threshold that should be observed, i.e., it should not be exceeded, in order to provide sufficient speech quality, but it varies depending on which speech decoder is actually used.
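The sliding average over the last ten values can be sketched with a bounded queue. The class name `SlidingBufferEstimate` is hypothetical; the window of ten follows the example in the text.

```python
from collections import deque


class SlidingBufferEstimate:
    """Average of the most recent 'time left until playout' measurements."""

    def __init__(self, window: int = 10):
        if window <= 0:
            raise ValueError("window must be positive")
        # deque with maxlen drops the oldest value automatically.
        self._values = deque(maxlen=window)

    def add(self, time_left_ms: float) -> float:
        """Record one measurement and return the updated average."""
        self._values.append(time_left_ms)
        return self.average()

    def average(self) -> float:
        if not self._values:
            raise ValueError("no measurements yet")
        return sum(self._values) / len(self._values)


estimate = SlidingBufferEstimate(window=3)
for value in (10.0, 20.0, 30.0, 40.0):
    latest = estimate.add(value)   # after 40.0 only 20, 30, 40 remain
```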
  • [0072]
    The jitter buffer minimum size JBmin is the smallest size the jitter buffer should have, depending on the variance in the delay variation of incoming packets. The jitter buffer also has a maximum value JBmax that should not be exceeded; otherwise the speech quality would be negatively affected. Delay variation is the calculated variation of the interval between two consecutive packets. If the packet loss probability exceeds THRmax, it is examined whether the jitter buffer size is smaller than the maximum jitter buffer size JBmax, 231. If it is, the jitter buffer size should be increased, 232, and then the procedure is stopped, 233, until being repeated again from 200 above. If, however, the jitter buffer size is not smaller than the maximum, nothing is done, 231A, until the procedure is repeated again from 200 above. If, however, it was established that the probability of losing packets was smaller than the maximum threshold value THRmax (230), it is examined whether the jitter buffer size exceeds the minimum jitter buffer size JBmin, 240. If yes, the jitter buffer size is decreased, 250, and then nothing is done, 260, until the procedure is repeated again from 200 above. If it was established that the jitter buffer size did not exceed the minimum size, nothing is done, 241, until the procedure is repeated from 200 above.
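The decision steps 230 to 260 of FIG. 10 can be condensed into a short sketch. The function and argument names are hypothetical, and the handling of the boundary case where the loss probability equals THRmax (treated here as not exceeding it) is an assumption, since the text does not address it.

```python
def decide_adaptation(loss_probability: float, size: int,
                      thr_max: float, jb_min: int, jb_max: int) -> str:
    """Return 'increase', 'decrease', or 'none' for the jitter buffer size."""
    if loss_probability > thr_max:          # step 230: too many late packets
        if size < jb_max:                   # step 231
            return "increase"               # step 232
        return "none"                       # step 231A: already at the maximum
    if size > jb_min:                       # step 240
        return "decrease"                   # step 250
    return "none"                           # step 241: already at the minimum


# Loss probability above the threshold with room to grow: increase.
action = decide_adaptation(0.05, size=320, thr_max=0.01, jb_min=160, jb_max=960)
```

Running this on each packet arrival mirrors the loop back to step 200, with the buffer drifting toward the smallest size that keeps the loss probability under the threshold.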
  • [0073]
    It should be clear that the invention is not limited to the explicitly described embodiments, but that it can be varied in a number of ways within the scope of the appended claims. It is, for example, applicable to a number of decoders, not just so-called CELP decoders or CELP-like decoders. Moreover, packet storing can be effected in the physical jitter buffer or after decoding in the decoder, hence the reference to the functional size of the jitter buffer, or functional jitter buffer.
  • [0074]
    Although preferred embodiments of the present invention have been described in detail, it will be understood by those skilled in the art that various modifications can be made therein without departing from the spirit and scope of the invention as set forth in the amended claims.

Claims (12)

  1-19. (canceled)
  20. A method of dynamically adapting the size of a jitter buffer in a receiving system within a communication system supporting a packet-based communication, comprising the steps of:
    receiving by the receiver system, a signal encoded in packets;
    storing the packets in the jitter buffer;
    fetching packets from the jitter buffer by a decoder;
    decoding the packets in the decoder;
    sending information about the received signal from the jitter buffer and the decoder to a jitter buffer controller;
    using information about the received signal to determine by the jitter buffer controller, whether the size of the jitter buffer needs to be adapted; and
    responsive to the determination, adapting the size of the jitter buffer by extending or compressing the received signal in time by controlling the number of samples the frames stored in the jitter buffer would represent when decoded.
  21. The method of claim 20, wherein the step of extending or compressing the signal further comprises increasing or decreasing the rate at which the decoder fetches packets from the jitter buffer.
  22. The method of claim 20, wherein the step of adapting the size of the jitter buffer further comprises adding or removing a sample when a packet is decoded in the decoder.
  23. The method of claim 20, wherein the step of adapting the size of the jitter buffer further comprises the steps of:
    adding one or more pitch periods upon decoding in the decoder if the size of the jitter buffer needs to be increased; or
    removing one or more pitch periods in the decoder, if the size of the jitter buffer needs to be reduced.
  24. The method of claim 20, wherein the step of adapting the size of the jitter buffer further comprises adding one or more frames to a packet or removing one or more frames from a packet.
  25. The method of claim 20, wherein the step of using information about the received signal to determine whether the size of the jitter buffer needs to be adapted includes determining a time difference between an arrival time of a packet to the jitter buffer and a time at which the packet is fetched by the decoder.
  26. The method of claim 20, wherein the step of decoding further comprises using a CELP-decoder for adaptation control.
  27. The method of claim 20, wherein the step of adapting the size of the jitter buffer by extending or compressing further comprises the steps of:
    performing an LPC-analysis; and
    performing an LPC-synthesis.
  28-29. (canceled)
  30. The method of claim 20, wherein the step of using information about the received signal to determine whether the size of the jitter buffer needs to be adapted includes the steps of:
    determining whether a probability of packet loss exceeds a maximum threshold value; and
    increasing the size of the jitter buffer if the probability of packet loss exceeds the maximum threshold value and the current size of the jitter buffer is less than a maximum allowable size.
  31. The method of claim 20, wherein the step of using information about the received signal to determine whether the size of the jitter buffer needs to be adapted includes the steps of:
    determining whether a probability of packet loss exceeds a maximum threshold value; and
    increasing the size of the jitter buffer if the probability of packet loss is less than the maximum threshold value and the current size of the jitter buffer is greater than a minimum allowable size.
US11745210 2000-05-31 2007-05-07 Method of dynamically adapting the size of a jitter buffer Abandoned US20070206645A1 (en)

Priority Applications (5)

Application Number Priority Date Filing Date Title
SE0002016 2000-05-31
SE0002016-4 2000-05-31
PCT/SE2001/001140 WO2001093516A1 (en) 2000-05-31 2001-05-22 Arrangement and method relating to communication of speech
US29685903 2003-10-06 2003-10-06
US11745210 US20070206645A1 (en) 2000-05-31 2007-05-07 Method of dynamically adapting the size of a jitter buffer

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US11745210 US20070206645A1 (en) 2000-05-31 2007-05-07 Method of dynamically adapting the size of a jitter buffer

Publications (1)

Publication Number Publication Date
US20070206645A1 (en) 2007-09-06

Family

ID=20279894

Family Applications (2)

Application Number Title Priority Date Filing Date
US10296859 Active 2022-04-24 US7246057B1 (en) 2000-05-31 2000-05-31 System for handling variations in the reception of a speech signal consisting of packets
US11745210 Abandoned US20070206645A1 (en) 2000-05-31 2007-05-07 Method of dynamically adapting the size of a jitter buffer

Country Status (5)

Country Link
US (2) US7246057B1 (en)
EP (1) EP1293072B1 (en)
DE (2) DE60129327D1 (en)
ES (1) ES2287133T3 (en)
WO (1) WO2001093516A1 (en)

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030091160A1 (en) * 2001-10-03 2003-05-15 Global Ip Sound Ab Network media playout
US20060045139A1 (en) * 2004-08-30 2006-03-02 Black Peter J Method and apparatus for processing packetized data in a wireless communication system
US20060077994A1 (en) * 2004-10-13 2006-04-13 Spindola Serafin D Media (voice) playback (de-jitter) buffer adjustments base on air interface
US20060171373A1 (en) * 2005-02-02 2006-08-03 Dunling Li Packet loss concealment for voice over packet networks
US20060206318A1 (en) * 2005-03-11 2006-09-14 Rohit Kapoor Method and apparatus for phase matching frames in vocoders
US20060206334A1 (en) * 2005-03-11 2006-09-14 Rohit Kapoor Time warping frames inside the vocoder by modifying the residual
US20070263672A1 (en) * 2006-05-09 2007-11-15 Nokia Corporation Adaptive jitter management control in decoder
US20080013451A1 (en) * 2004-03-22 2008-01-17 Jean-Luc Soulard Temporal Slaving Device
US20100034332A1 (en) * 2006-12-06 2010-02-11 Enstroem Daniel Jitter buffer control
US20100082851A1 (en) * 2008-09-30 2010-04-01 Microsoft Corporation Balancing usage of hardware devices among clients
US20100083256A1 (en) * 2008-09-30 2010-04-01 Microsoft Corporation Temporal batching of i/o jobs
US20100083274A1 (en) * 2008-09-30 2010-04-01 Microsoft Corporation Hardware throughput saturation detection
US20100220677A1 (en) * 2007-10-09 2010-09-02 Hang Li Method and device for transmitting voice in wireless system
US20120095758A1 (en) * 2010-10-15 2012-04-19 Motorola Mobility, Inc. Audio signal bandwidth extension in celp-based speech coder
US20120123774A1 (en) * 2010-09-30 2012-05-17 Electronics And Telecommunications Research Institute Apparatus, electronic apparatus and method for adjusting jitter buffer
CN103987009A (en) * 2013-02-13 2014-08-13 森海塞尔通信公司 Method for operating a hearing device and hearing device
US20140334484A1 (en) * 2012-06-24 2014-11-13 Oren KLIMKER System, device, and method of voice-over-ip communication
US20160180857A1 (en) * 2013-06-21 2016-06-23 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Jitter Buffer Control, Audio Decoder, Method and Computer Program
WO2017058815A1 (en) * 2015-09-29 2017-04-06 Dolby Laboratories Licensing Corporation Method and system for handling heterogeneous jitter

Families Citing this family (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7047190B1 (en) * 1999-04-19 2006-05-16 At&Tcorp. Method and apparatus for performing packet loss or frame erasure concealment
US7117156B1 (en) 1999-04-19 2006-10-03 At&T Corp. Method and apparatus for performing packet loss or frame erasure concealment
US7246057B1 (en) 2000-05-31 2007-07-17 Telefonaktiebolaget Lm Ericsson (Publ) System for handling variations in the reception of a speech signal consisting of packets
US7110422B1 (en) * 2002-01-29 2006-09-19 At&T Corporation Method and apparatus for managing voice call quality over packet networks
DE50201096D1 (en) * 2002-03-28 2004-10-28 Siemens Schweiz Ag Zuerich A method for adjusting a jitter buffer size in a media gateway
US7564810B2 (en) * 2002-05-08 2009-07-21 Microsoft Corporation Method and system for managing power consumption of a network interface module in a wireless computing device
DE502004011378D1 (en) 2003-03-20 2010-08-26 Siemens Ag A method for controlling a jitter buffer and Jitterpufferregelschaltung
US20050049853A1 (en) * 2003-09-01 2005-03-03 Mi-Suk Lee Frame loss concealment method and device for VoIP system
GB2405773B (en) * 2003-09-02 2006-11-08 Siemens Ag A method of controlling provision of audio communication on a network
US7596488B2 (en) * 2003-09-15 2009-09-29 Microsoft Corporation System and method for real-time jitter control and packet-loss concealment in an audio signal
US7674096B2 (en) * 2004-09-22 2010-03-09 Sundheim Gregroy S Portable, rotary vane vacuum pump with removable oil reservoir cartridge
US7418013B2 (en) * 2004-09-22 2008-08-26 Intel Corporation Techniques to synchronize packet rate in voice over packet networks
US7924711B2 (en) * 2004-10-20 2011-04-12 Qualcomm Incorporated Method and apparatus to adaptively manage end-to-end voice over internet protocol (VolP) media latency
US7599399B1 (en) * 2005-04-27 2009-10-06 Sprint Communications Company L.P. Jitter buffer management
US7916742B1 (en) * 2005-05-11 2011-03-29 Sprint Communications Company L.P. Dynamic jitter buffer calibration
US7831421B2 (en) * 2005-05-31 2010-11-09 Microsoft Corporation Robust decoder
EP1946293A1 (en) 2005-11-07 2008-07-23 Telefonaktiebolaget L M Ericsson (PUBL) Method and arrangement in a mobile telecommunication network
US8213444B1 (en) 2006-02-28 2012-07-03 Sprint Communications Company L.P. Adaptively adjusting jitter buffer characteristics
US7796626B2 (en) 2006-09-26 2010-09-14 Nokia Corporation Supporting a decoding of frames
KR101418354B1 (en) * 2007-10-23 2014-07-10 삼성전자주식회사 Apparatus and method for playout scheduling in voice over internet protocol system
US20090157396A1 (en) * 2007-12-17 2009-06-18 Infineon Technologies Ag Voice data signal recording and retrieving
US8612242B2 (en) * 2010-04-16 2013-12-17 St-Ericsson Sa Minimizing speech delay in communication devices
US20110257964A1 (en) * 2010-04-16 2011-10-20 Rathonyi Bela Minimizing Speech Delay in Communication Devices
US9177570B2 (en) 2011-04-15 2015-11-03 St-Ericsson Sa Time scaling of audio frames to adapt audio processing to communications network timing
US8831001B2 (en) * 2012-06-24 2014-09-09 Audiocodes Ltd. Device, system, and method of voice-over-IP communication
US9787416B2 (en) 2012-09-07 2017-10-10 Apple Inc. Adaptive jitter buffer management for networks with varying conditions

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5812965A (en) * 1995-10-13 1998-09-22 France Telecom Process and device for creating comfort noise in a digital speech transmission system
US6366959B1 (en) * 1997-10-01 2002-04-02 3Com Corporation Method and apparatus for real time communication system buffer size and error correction coding selection
US6377931B1 (en) * 1999-09-28 2002-04-23 Mindspeed Technologies Speech manipulation for continuous speech playback over a packet network

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1995022233A1 (en) 1994-02-11 1995-08-17 Newbridge Networks Corporation Method of dynamically compensating for variable transmission delays in packet networks
US5825771A (en) 1994-11-10 1998-10-20 Vocaltec Ltd. Audio transceiver
US6466550B1 (en) * 1998-11-11 2002-10-15 Cisco Technology, Inc. Distributed conferencing system utilizing data networks
US6452950B1 (en) * 1999-01-14 2002-09-17 Telefonaktiebolaget Lm Ericsson (Publ) Adaptive jitter buffering
US6526139B1 (en) * 1999-11-03 2003-02-25 Tellabs Operations, Inc. Consolidated noise injection in a voice processing system
US7027989B1 (en) * 1999-12-17 2006-04-11 Nortel Networks Limited Method and apparatus for transmitting real-time data in multi-access systems
US6975629B2 (en) * 2000-03-22 2005-12-13 Texas Instruments Incorporated Processing packets based on deadline intervals
US7246057B1 (en) 2000-05-31 2007-07-17 Telefonaktiebolaget Lm Ericsson (Publ) System for handling variations in the reception of a speech signal consisting of packets
EP1359698B1 (en) * 2002-04-30 2005-01-12 Psytechnics Ltd Method and apparatus for transmission error characterisation

Cited By (43)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7453897B2 (en) * 2001-10-03 2008-11-18 Global Ip Solutions, Inc. Network media playout
US20030091160A1 (en) * 2001-10-03 2003-05-15 Global Ip Sound Ab Network media playout
US8068420B2 (en) * 2004-03-22 2011-11-29 Thomson Licensing Temporal slaving device
US20080013451A1 (en) * 2004-03-22 2008-01-17 Jean-Luc Soulard Temporal Slaving Device
US8331385B2 (en) 2004-08-30 2012-12-11 Qualcomm Incorporated Method and apparatus for flexible packet selection in a wireless communication system
US20060050743A1 (en) * 2004-08-30 2006-03-09 Black Peter J Method and apparatus for flexible packet selection in a wireless communication system
US7830900B2 (en) * 2004-08-30 2010-11-09 Qualcomm Incorporated Method and apparatus for an adaptive de-jitter buffer
US7826441B2 (en) 2004-08-30 2010-11-02 Qualcomm Incorporated Method and apparatus for an adaptive de-jitter buffer in a wireless communication system
US7817677B2 (en) 2004-08-30 2010-10-19 Qualcomm Incorporated Method and apparatus for processing packetized data in a wireless communication system
US20060045138A1 (en) * 2004-08-30 2006-03-02 Black Peter J Method and apparatus for an adaptive de-jitter buffer
US20060045139A1 (en) * 2004-08-30 2006-03-02 Black Peter J Method and apparatus for processing packetized data in a wireless communication system
US20110222423A1 (en) * 2004-10-13 2011-09-15 Qualcomm Incorporated Media (voice) playback (de-jitter) buffer adjustments based on air interface
US20060077994A1 (en) * 2004-10-13 2006-04-13 Spindola Serafin D Media (voice) playback (de-jitter) buffer adjustments base on air interface
US8085678B2 (en) 2004-10-13 2011-12-27 Qualcomm Incorporated Media (voice) playback (de-jitter) buffer adjustments based on air interface
US7359409B2 (en) * 2005-02-02 2008-04-15 Texas Instruments Incorporated Packet loss concealment for voice over packet networks
US20060171373A1 (en) * 2005-02-02 2006-08-03 Dunling Li Packet loss concealment for voice over packet networks
US8355907B2 (en) 2005-03-11 2013-01-15 Qualcomm Incorporated Method and apparatus for phase matching frames in vocoders
US8155965B2 (en) 2005-03-11 2012-04-10 Qualcomm Incorporated Time warping frames inside the vocoder by modifying the residual
US20060206334A1 (en) * 2005-03-11 2006-09-14 Rohit Kapoor Time warping frames inside the vocoder by modifying the residual
US20060206318A1 (en) * 2005-03-11 2006-09-14 Rohit Kapoor Method and apparatus for phase matching frames in vocoders
US20070263672A1 (en) * 2006-05-09 2007-11-15 Nokia Corporation Adaptive jitter management control in decoder
US20100034332A1 (en) * 2006-12-06 2010-02-11 Enstroem Daniel Jitter buffer control
US8472320B2 (en) * 2006-12-06 2013-06-25 Telefonaktiebolaget Lm Ericsson (Publ) Jitter buffer control
US8331269B2 (en) * 2007-10-09 2012-12-11 Beijing Xinwei Telecom Technology Inc. Method and device for transmitting voice in wireless system
US20100220677A1 (en) * 2007-10-09 2010-09-02 Hang Li Method and device for transmitting voice in wireless system
US8479214B2 (en) * 2008-09-30 2013-07-02 Microsoft Corporation Hardware throughput saturation detection
US8245229B2 (en) 2008-09-30 2012-08-14 Microsoft Corporation Temporal batching of I/O jobs
US8645592B2 (en) 2008-09-30 2014-02-04 Microsoft Corporation Balancing usage of hardware devices among clients
US20100083274A1 (en) * 2008-09-30 2010-04-01 Microsoft Corporation Hardware throughput saturation detection
US8346995B2 (en) 2008-09-30 2013-01-01 Microsoft Corporation Balancing usage of hardware devices among clients
US20100083256A1 (en) * 2008-09-30 2010-04-01 Microsoft Corporation Temporal batching of i/o jobs
US20100082851A1 (en) * 2008-09-30 2010-04-01 Microsoft Corporation Balancing usage of hardware devices among clients
US20120123774A1 (en) * 2010-09-30 2012-05-17 Electronics And Telecommunications Research Institute Apparatus, electronic apparatus and method for adjusting jitter buffer
US8843379B2 (en) * 2010-09-30 2014-09-23 Electronics And Telecommunications Research Institute Apparatus, electronic apparatus and method for adjusting jitter buffer
US20120095758A1 (en) * 2010-10-15 2012-04-19 Motorola Mobility, Inc. Audio signal bandwidth extension in celp-based speech coder
US8924200B2 (en) * 2010-10-15 2014-12-30 Motorola Mobility Llc Audio signal bandwidth extension in CELP-based speech coder
US9313338B2 (en) * 2012-06-24 2016-04-12 Audiocodes Ltd. System, device, and method of voice-over-IP communication
US20140334484A1 (en) * 2012-06-24 2014-11-13 Oren KLIMKER System, device, and method of voice-over-ip communication
US20140226830A1 (en) * 2013-02-13 2014-08-14 Sennheiser Communications A/S Method for operating a hearing device and hearing device
US9894445B2 (en) * 2013-02-13 2018-02-13 Sennheiser Communications A/S Method for operating a hearing device and hearing device
CN103987009A (en) * 2013-02-13 2014-08-13 森海塞尔通信公司 Method for operating a hearing device and hearing device
US20160180857A1 (en) * 2013-06-21 2016-06-23 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Jitter Buffer Control, Audio Decoder, Method and Computer Program
WO2017058815A1 (en) * 2015-09-29 2017-04-06 Dolby Laboratories Licensing Corporation Method and system for handling heterogeneous jitter

Also Published As

Publication number    Publication date   Type
WO2001093516A1 (en)   2001-12-06         application
DE60129327D1 (en)     2007-08-23         grant
EP1293072B1 (en)      2007-07-11         grant
ES2287133T3 (en)      2007-12-16         grant
US7246057B1 (en)      2007-07-17         grant
DE60129327T2 (en)     2008-03-20         grant
EP1293072A1 (en)      2003-03-19         application

Similar Documents

Publication Publication Date Title
US6606593B1 (en) Methods for generating comfort noise during discontinuous transmission
US5995923A (en) Method and apparatus for improving the voice quality of tandemed vocoders
US7117156B1 (en) Method and apparatus for performing packet loss or frame erasure concealment
US5835889A (en) Method and apparatus for detecting hangover periods in a TDMA wireless communication system using discontinuous transmission
US6658027B1 (en) Jitter buffer management
US6898566B1 (en) Using signal to noise ratio of a speech signal to adjust thresholds for extracting speech parameters for coding the speech signal
US20040081106A1 (en) Delay trading between communication links
US7117152B1 (en) System and method for speech recognition assisted voice communications
US20040120309A1 (en) Methods for changing the size of a jitter buffer and for time alignment, communications system, receiving end, and transcoder
US20050154584A1 (en) Method and device for efficient frame erasure concealment in linear predictive based speech codecs
US7047190B1 (en) Method and apparatus for performing packet loss or frame erasure concealment
US7124079B1 (en) Speech coding with comfort noise variability feature for increased fidelity
US7002913B2 (en) Packet loss compensation method using injection of spectrally shaped noise
US20050243846A1 (en) Method and apparatus providing continuous adaptive control of voice packet buffer at receiver terminal
US6526139B1 (en) Consolidated noise injection in a voice processing system
US20060187970A1 (en) Method and apparatus for handling network jitter in a Voice-over IP communications network using a virtual jitter buffer and time scale modification
US7092875B2 (en) Speech transcoding method and apparatus for silence compression
US20050058145A1 (en) System and method for real-time jitter control and packet-loss concealment in an audio signal
US20020075857A1 (en) Jitter buffer and lost-frame-recovery interworking
US20030212548A1 (en) Apparatus and method for improved voice activity detection
US20040258047A1 (en) Clock difference compensation for a network
US6889187B2 (en) Method and apparatus for improved voice activity detection in a packet voice network
US6356545B1 (en) Internet telephone system with dynamically varying codec
US20040076271A1 (en) Audio signal quality enhancement in a digital network
US20070050189A1 (en) Method and apparatus for comfort noise generation in speech communication systems