WO2002091691A1 - Multiplexed coding - Google Patents

Info

Publication number
WO2002091691A1
Authority
WO
WIPO (PCT)
Prior art keywords
segments
segment
communication node
digital
samples
Prior art date
Application number
PCT/EP2002/004713
Other languages
French (fr)
Inventor
Sören Vang ANDERSEN
Alan Duric
Gernot Kubin
Original Assignee
Global Ip Sound Ab
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Global Ip Sound Ab
Publication of WO2002091691A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/64 Hybrid switching systems
    • H04L12/6418 Hybrid transport
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/64 Hybrid switching systems
    • H04L12/6418 Hybrid transport
    • H04L2012/6481 Speech, voice

Definitions

  • the present invention relates to signal processing in connection with digital signal transmission over, and reception from, a packet switched network.
  • Real-time transmissions over packet switched networks such as speech, audio or video over Internet Protocol based networks (mainly the Internet or Intranet networks)
  • features include such things as relatively low operating costs, easy integration of new services, and one network for both non-real-time and real-time data.
  • Real-time data are in packet switched systems converted into a digital signal, i.e. into a bitstream, which is divided in portions of suitable size in order to be transmitted in data packets over the packet switched network from a transmitter end to a receiver end.
  • As packet switched networks were originally designed for transmission of non-real-time data, transmission of real-time data over such networks causes some problems.
  • Data packets can be lost during transmission, as they can be deliberately discarded by the network due to congestion problems or transmission errors. In non-realtime applications this is not a problem since a lost packet can be retransmitted. However, retransmission is not a possible solution for real-time applications that are delay sensitive. A packet that arrives too late to a real-time application cannot be used to reconstruct the corresponding signal since this signal already has been, or should have been, delivered to the receiving end, e.g. for playback by a speaker or for visualisation on a display screen. Therefore, a packet that arrives too late is equivalent to a lost packet.
  • the main problem with lost or delayed data packets is the introduction of distortion in the reconstructed signal.
  • the distortion results from the fact that signal segments conveyed by lost or delayed data packets cannot be reconstructed.
  • a predictive coding method encodes a signal pattern based on dependencies between the pattern representations. It encodes the signal for transmission with a fixed bit rate and with a trade-off between the signal quality and the transmitted bit rate.
  • Examples of predictive coding methods used for speech are Adaptive Predictive Coding (APC) and Code Excited Linear Prediction (CELP), both of which are well known to a person skilled in the art.
  • a first class adds redundancy and significant delay at the transmitter side, e.g., with loss-resilient codes or with multiple-description source/channel coding.
  • a second class attempts perceptual concealment of the packet loss through signal interpolation at the receiver side.
  • the path sharing traffic can be the path between a base station and another node, such as a Mobile Switching Centre, in a mobile communication system, in which the sources or destinations are Mobile Stations, or other radio transceivers, connected to the base station via radio links.
  • the invention is also applicable when traffic shares the same connection between two interconnected routers, or communication nodes, in which case the paths are separated before and after those routers.
  • traffic may be transmitted from the same gateway or telecommunications switch and share the same path to a router.
  • traffic may be received by the same gateway or telecommunications switch after having shared the same path from a router.
  • the multiplexing is preferably performed by multiplexing representations, e.g. quantized representations, from each one of the multiple segments, i.e. from each segment of a set of segments, or by multiplexing parts from each one of the multiple segments based on linear combinations of digital samples from all of the multiple segments.
  • loss-resilient codes are added to the segments in order to be able to regenerate parts of segments that have been lost due to packet losses. These loss-resilient codes further alleviate the effect of packet losses in a system operating in accordance with the present invention.
  • the present invention is suitable for use in packet switched networks where packet loss may occur.
  • the invention provides robust coding with no redundancy overhead, even though redundancy can be added in a
  • Fig. 1 shows the basic principle of the multiplexing scheme in accordance with an embodiment of the invention in an exemplified, and schematically illustrated, system
  • Fig. 2 shows two Media Gateways adapted for transmitting and receiving multiplexed segment representations in accordance with an embodiment of the invention
  • Fig. 3 shows a multiplexed encoding arrangement transmitting multiple segments to a multiplexed decoding arrangement, in accordance with embodiments of the invention.
  • a first communication node (Media Gateway) 100 receives a set of multiple segments.
  • Each segment belongs to a respective encoded real-time signal.
  • Signals S', S'' and S''' with respective segments are depicted in Fig. 1; however, any number (greater than one) of signals can be subject to the practising of the present invention.
  • each segment includes a very limited number of elements in its representation. All three signals share a common path from the first
  • In Fig. 1 the incoming segments of signals S', S'' and S''' to the Media Gateway 100 are either already encoded by their respective sources, or they are encoded by the Media Gateway 100.
  • In Fig. 1 it can also be seen that if a data packet and its payload are lost, every third part of a segment representation will be lost.
  • loss resilient coding may advantageously be added to the multiplexing scheme.
  • the resulting redundancy bits (r1, r2 and r3) of such loss resilient coding have been included in the segments and packet payloads shown in Fig. 1. Again, the number of redundancy bits indicated in each segment/payload is merely chosen for ease of description; any number of redundancy bits can be used.
  • the loss resilient coding adding the redundancy bits is in Fig. 1 indicated as having been performed by the respective sources, however, this can alternatively be performed by the Media Gateway 100.
  • the loss resilient coding is preferably added by applying a coding method appreciated by a person skilled in the art, such as Reed-Solomon coding.
  • a person skilled in the art will also appreciate that by means of the segment parts and redundancy bits that are received by Media Gateway 110, lost segment representation parts in a lost packet payload can be reconstructed.
  • Each Media Gateway is controlled by its respective Media Gateway
  • the MGCP is in essence a master/slave protocol where the Media Gateway executes the commands that are sent by the Media Gateway Controller.
  • the transmitting Media Gateway 200 is controlled by Media Gateway Controller 210 and includes a MUX server function 220.
  • the MUX server function is vendor specific and does not affect other architecture elements or any standardized protocol.
  • the receiving Media Gateway 250 is controlled by Media Gateway Controller 260 and includes a vendor specific DeMUX server function 270.
  • Inputted to each Coder 230 at the transmitting MG 200 are multiple real-time data streams.
  • Upon setting up a multiplexed connection between the two MGs 200 and 250, the MGC 210 of MG 200 will issue a call set-up to MGC 260 of MG 250.
  • the MUX server function 220 includes a table of which Coders are included in MG 200 and the destinations of their encoded data streams. This table is built by listening to the signalling between the MGC 210 and MG 200 when setting up the Coders 230 and the destinations of their respective data streams.
  • the MUX server function 220 is implemented as a process which receives segment parts outputted from the Coder 230 processes. Using its included table the MUX server function 220 multiplexes the segment parts received from the different Coders 230. When a data packet payload has been assembled, this payload will be outputted to one of the waiting Coder processes. This Coder will in turn output the assembled payload to a Packetizer 240. In case the path between the two MG's is over an Internet Protocol network, the Packetizer 240 will add IP headers etc. to the packet streams.
  • the DeMUX server function 270 will listen to this signalling and build a table to be used for de-multiplexing of received data packet payloads that include multiplexed segment representation parts.
  • When packet payloads are received by the Decoders 280 from a Depacketizer 290, these payloads will be outputted to the DeMUX server function process 270.
  • After the DeMUX process has disassembled a number of payloads and extracted segment representation parts therefrom, parts that together form a complete segment will be transferred as a full segment representation to one of the waiting Decoders 280 for further transmission to its destination.
  • Media Gateway 200 and Media Gateway 250 act as a first and a second communication node, respectively.
  • Media Gateway 200 encodes and assembles multiple segments and Media Gateway 250 performs disassembling and decoding with respect to the segments.
  • the encoding performed by the set of Coders 230 could be located at the respective signal sources and the decoding performed by the set of Decoders 280 at the respective signal destinations.
  • In Fig. 3 the system is based on K scalar adaptive predictive speech coders with noise feedback coding of the kind proposed by Atal and Schroeder in "Predictive coding of speech signals and subjective error criteria", B.S. Atal and M.R. Schroeder, IEEE Trans. Acoust., Speech, Signal Processing, vol. 27, no. 3, pp. 247-254, 1979.
  • the k-th predictive encoder and decoder are shown.
  • the traditional single-input single-output scalar quantizer has been replaced with what is herein referred to as a multiplexed quantizer 300.
  • the multiplexed quantizer 300 takes at each sampling instant n the K inputs $q_n^1$ to $q_n^K$ from all K predictive encoders and outputs quantized representations $\hat{q}_n^1$ to $\hat{q}_n^K$ back to the individual predictive encoders. In doing this, the multiplexed quantizer 300 generates K quantization indices $i_n^1$ to $i_n^K$, each in the range 1 to $2^b$, where b is the number of bits allocated for each index. These indices are packetized and transmitted over the packet-switched network in K independent packet streams 310..320. At the decoder side, the received indices $\tilde{i}_n^1$ to $\tilde{i}_n^K$ are available.
  • the demultiplexer 330 resolves K representations of the quantized information from the available indices $\tilde{i}_n^1$ to $\tilde{i}_n^K$. These representations are subsequently input to the K predictive decoders.
  • the side information is conveniently encoded with a loss-resilient coding such as the Reed-Solomon code to enable transmission of this information in a robust manner using the same K packet streams as the multiplexed quantized information.
  • the Reed-Solomon code is described in "Reed-Solomon Codes and their Applications", S.B. Wicker and V.K. Bhargava, IEEE Press, New York, 1994.
  • With further reference to Fig. 3, three different multiplexing schemes in accordance with different embodiments of the invention will now be described: packet hopping, Hadamard multiplexing, and an extension of the Hadamard multiplexing that exploits a nonlinear preprocessing and estimation method.
  • the multiplexed quantizer block is a system in which indices from scalar quantizers are hopped, i.e., cycled from one packet stream to the next as the sample instant n increments.
  • the packet hopping described above can be expressed as a particular orthogonal transformation of the input to the multiplexed quantizer followed by a quantization of the transform output. If we define column vectors with elements equal to the K scalar input, output, or index values for the multiplexed quantizer and demultiplexer, such that e.g.,
  • the multiplexed quantizer is defined by the following equations:
  • the demultiplexer on the receiver side is defined by:
  • The equivalence of these equations with the packet hopping described by Equations 1 to 3 is obtained by letting the transform matrix $M_n$ equal an appropriate time-varying row or column permutation of an identity matrix.
  • With this formulation of the multiplexing, it is relevant to introduce, in alternative embodiments, other transform matrices than the row or column permuted identity matrix.
  • a simple, yet relevant, transform for this purpose is the normalized Hadamard transform disclosed in "Orthogonal Designs; Quadratic Forms and Hadamard Matrices", A.V. Geramita and J. Seberry, Marcel Dekker, 1979.
  • the general method suitable to use is to zero $K_z$ of the K inputs to the multiplexed quantizer prior to applying the transform $M_n$.
  • the $K_z$ inputs with lowest amplitudes are set to zero.
  • no information about the position of zero valued elements in $q_n$ is conveyed to the decoder; only the knowledge that $K_z$ of the elements were zero is exploited.
  • the method is described below. Suppose that the number of lost packet streams is $K_{lps}$ and the lost packet streams are indexed by an integer set $k_{lps}$. Then we may formulate a set of equations relating the received coefficients $c_n$ to the encoded scaled prediction errors $q_n$.
  • a is a $K_{lps}$ dimensional unknown vector.
  • M ⁇ ( k Zi . , k lps ) has full rank . Furthermore,
  • Equation 5 the method applicable in the decoder is to calculate a for all C x possible choices of k z; . and select the a vector that occurred C 2 times.
  • $q_n$ is obtained as the right-hand side of Equation 4.
  • the speech files were encoded at 40 kbps and decoded after a percentage of the data packets had been randomly dropped. Random packet loss rates between 0% and 40% were simulated.
  • a 64 kbps μ-law quantization and packet loss concealment (PLC) according to the ITU-T G.711 standard was used. The reference system was simulated for the same speech files and packet losses as the multiplexed predictive coders.
  • multiplexed predictive coding was preferred over G.711 PLC for packet loss rates in the range from 10% to 40%.
  • the multiplexed predictive coders can be designed to operate at significantly lower bit rates than the G.711 standard.

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)
  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)

Abstract

The present invention relates to signal processing in connection with digital signal transmission over, and reception from, a packet switched network. The invention is applicable when a set of digital signals share a common transmission path. The information about a segment of each digital signal among the set of digital signals is distributed over several data packets. Data packet payloads are assembled in such a way that each payload includes a part of the segment representation of several segments of a set of multiple signal segments, i.e. any data packet will contain a part of the complete transmitted information about several of the multiple signal segments. Effects on a reconstructed signal due to lost packets are alleviated since a lost packet only leads to a partial loss of information for each segment of a set of multiple segments.

Description

MULTIPLEXED CODING
Technical Field of the Invention
The present invention relates to signal processing in connection with digital signal transmission over, and reception from, a packet switched network.
Technical Background and Prior Art
Real-time transmissions over packet switched networks, such as speech, audio or video over Internet Protocol based networks (mainly the Internet or Intranet networks), have become increasingly attractive due to a number of features. These features include such things as relatively low operating costs, easy integration of new services, and one network for both non-real-time and real-time data. Real-time data, typically a speech, an audio or a video signal, are in packet switched systems converted into a digital signal, i.e. into a bitstream, which is divided into portions of suitable size in order to be transmitted in data packets over the packet switched network from a transmitter end to a receiver end. As packet switched networks were originally designed for transmission of non-real-time data, transmission of real-time data over such networks causes some problems. Data packets can be lost during transmission, as they can be deliberately discarded by the network due to congestion problems or transmission errors. In non-real-time applications this is not a problem since a lost packet can be retransmitted. However, retransmission is not a possible solution for real-time applications that are delay sensitive. A packet that arrives too late to a real-time application cannot be used to reconstruct the corresponding signal since this signal already has been, or should have been, delivered to the receiving end, e.g. for playback by a speaker or for visualisation on a display screen. Therefore, a packet that arrives too late is equivalent to a lost packet.
When transferring a real-time signal as packets, the main problem with lost or delayed data packets is the introduction of distortion in the reconstructed signal. The distortion results from the fact that signal segments conveyed by lost or delayed data packets cannot be reconstructed.
When transferring a signal it is most often desired to use as little bandwidth as possible. As is well known, many signals have patterns containing redundancies. Appropriate coding methods can avoid the transmission of the redundant information, thereby enabling a more bandwidth-efficient transmission of the signal. Typical coding methods taking advantage of such redundancies are predictive coding methods. A predictive coding method encodes a signal pattern based on dependencies between the pattern representations. It encodes the signal for transmission with a fixed bit rate and with a trade-off between the signal quality and the transmitted bit rate. Examples of predictive coding methods used for speech are Adaptive Predictive Coding (APC) and Code Excited Linear Prediction (CELP), both of which are well known to a person skilled in the art.
To alleviate the effects that lost packets have on the reconstructed signal with respect to lost signal information, two classes of signal processing methods have been proposed in the prior art. A first class adds redundancy and significant delay at the transmitter side, e.g., with loss-resilient codes or with multiple-description source/channel coding. A second class attempts perceptual concealment of the packet loss through signal interpolation at the receiver side. Thus, the drawbacks with these methods are either increased network payload data rate and increased transmission delay, or degraded perceptual quality of the received signal, or both.
interconnects two circuit-switched networks with gateways performing dedicated point-to-point communications.
As another example, the path sharing traffic can be the path between a base station and another node, such as a Mobile Switching Centre, in a mobile communication system, in which the sources or destinations are Mobile Stations, or other radio transceivers, connected to the base station via radio links.
Yet other applications are so called streaming and multicasting in packet data networks.
As mentioned, the invention is also applicable when traffic shares the same connection between two interconnected routers, or communication nodes, in which case the paths are separated before and after those routers.
Furthermore, traffic may be transmitted from the same gateway or telecommunications switch and share the same path to a router. Correspondingly, traffic may be received by the same gateway or telecommunications switch after having shared the same path from a router.
The multiplexing is preferably performed by multiplexing representations, e.g. quantized representations, from each one of the multiple segments, i.e. from each segment of a set of segments, or by multiplexing parts from each one of the multiple segments based on linear combinations of digital samples from all of the multiple segments.
According to an embodiment, loss-resilient codes are added to the segments in order to be able to regenerate parts of segments that have been lost due to packet losses. These loss-resilient codes further alleviate the effect of packet losses in a system operating in accordance with the present invention.
Thus, the present invention is suitable for use in packet switched networks where packet loss may occur. The invention provides robust coding with no redundancy overhead, even though redundancy can be added in a
should by no means be interpreted as limiting on the scope of the claims.
The above mentioned and further features of, and advantages with, the present invention, will be more fully understood from the following description, with reference to the accompanying drawings, of exemplifying embodiments thereof .
Brief Description of the Drawings
Exemplifying embodiments of the present invention will be described by way of example with reference to the accompanying drawings, in which:
Fig. 1 shows the basic principle of the multiplexing scheme in accordance with an embodiment of the invention in an exemplified, and schematically illustrated, system;
Fig. 2 shows two Media Gateways adapted for transmitting and receiving multiplexed segment representations in accordance with an embodiment of the invention; and
Fig. 3 shows a multiplexed encoding arrangement transmitting multiple segments to a multiplexed decoding arrangement, in accordance with embodiments of the invention.
Detailed Description of the Invention
With reference to Fig. 1 the principle of multiplexing in accordance with an embodiment of the invention is shown. A first communication node (Media Gateway) 100 receives a set of multiple segments. Each segment belongs to a respective encoded real-time signal. For ease of description, only three signals S', S'' and S''' with respective segments are depicted in Fig. 1; however, any number (greater than one) of signals can be subject to the practising of the present invention. Furthermore, for ease of description, each segment includes a very limited number of elements in its representation. All three signals share a common path from the first
In Fig. 1 the incoming segments of signals S', S'' and S''' to the Media Gateway 100 are either already encoded by their respective sources, or they are encoded by the Media Gateway 100. In Fig. 1 it can also be seen that if a data packet and its payload are lost, every third part of a segment representation will be lost. To further alleviate the effect of lost data packets, loss resilient coding may advantageously be added to the multiplexing scheme. The resulting redundancy bits (r1, r2 and r3) of such loss resilient coding have been included in the segments and packet payloads shown in Fig. 1. Again, the number of redundancy bits indicated in each segment/payload is merely chosen for ease of description; any number of redundancy bits can be used. Moreover, the loss resilient coding adding the redundancy bits is in Fig. 1 indicated as having been performed by the respective sources; however, this can alternatively be performed by the Media Gateway 100. The loss resilient coding is preferably added by applying a coding method appreciated by a person skilled in the art, such as Reed-Solomon coding. Furthermore, a person skilled in the art will also appreciate that by means of the segment parts and redundancy bits that are received by Media Gateway 110, lost segment representation parts in a lost packet payload can be reconstructed.
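The payload assembly of Fig. 1 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the rule that part j of segment s goes to payload (s + j) mod P, and the (segment, position) tags carried with each part are all assumptions made for the example. Redundancy bits are left out.

```python
from typing import List, Optional, Tuple

Tagged = Tuple[int, int, str]  # (segment number, part position, part) - hypothetical tagging


def multiplex(segments: List[List[str]], num_packets: int) -> List[List[Tagged]]:
    """Spread the parts of every segment over num_packets payloads."""
    payloads: List[List[Tagged]] = [[] for _ in range(num_packets)]
    for s, parts in enumerate(segments):
        for j, part in enumerate(parts):
            # Assumed placement rule: consecutive parts go to consecutive payloads.
            payloads[(s + j) % num_packets].append((s, j, part))
    return payloads


def demultiplex(payloads: List[Optional[List[Tagged]]],
                num_segments: int,
                parts_per_segment: int) -> List[List[Optional[str]]]:
    """Rebuild the segments; parts carried by lost payloads (None) stay None."""
    segments: List[List[Optional[str]]] = [
        [None] * parts_per_segment for _ in range(num_segments)
    ]
    for payload in payloads:
        if payload is None:  # this packet was lost on the network
            continue
        for s, j, part in payload:
            segments[s][j] = part
    return segments


if __name__ == "__main__":
    signals = [[f"{name}{j}" for j in range(6)] for name in ("a", "b", "c")]
    sent = multiplex(signals, num_packets=3)
    received: List[Optional[List[Tagged]]] = list(sent)
    received[1] = None  # drop the second packet
    for segment in demultiplex(received, 3, 6):
        print(segment)
```

With three payloads, dropping one of them removes every third part from each of the three segments instead of removing one segment entirely.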
With reference to Fig. 2, two Media Gateways, or communication nodes, at respective ends of a path shared by multiple connections are shown. Each Media Gateway (MG) is controlled by its respective Media Gateway Controller (MGC) over a Media Gateway Control Protocol (MGCP). The MGCP is in essence a master/slave protocol where the Media Gateway executes the commands that are sent by the Media Gateway Controller. The transmitting Media Gateway 200 is controlled by Media Gateway Controller 210 and includes a MUX server function 220. The MUX server function is vendor specific and does not affect other architecture elements or any standardized protocol. Correspondingly, the receiving Media Gateway 250 is controlled by Media Gateway Controller 260 and includes a vendor specific DeMUX server function 270.
Inputted to each Coder 230 at the transmitting MG 200 are multiple real-time data streams. Upon setting up a multiplexed connection between the two MGs 200 and 250, the MGC 210 of MG 200 will issue a call set-up to MGC 260 of MG 250. The MUX server function 220 includes a table of which Coders are included in MG 200 and the destinations of their encoded data streams. This table is built by listening to the signalling between the MGC 210 and MG 200 when setting up the Coders 230 and the destinations of their respective data streams.
The MUX server function 220 is implemented as a process which receives segment parts outputted from the Coder 230 processes. Using its included table, the MUX server function 220 multiplexes the segment parts received from the different Coders 230. When a data packet payload has been assembled, this payload will be outputted to one of the waiting Coder processes. This Coder will in turn output the assembled payload to a Packetizer 240. In case the path between the two MGs is over an Internet Protocol network, the Packetizer 240 will add IP headers etc. to the packet streams.
Correspondingly, upon reception of the call set-up from MGC 210, signalling will occur between the MGC 260 and the Media Gateway 250. The DeMUX server function 270 will listen to this signalling and build a table to be used for de-multiplexing of received data packet payloads that include multiplexed segment representation parts. When packet payloads are received by the Decoders 280 from a Depacketizer 290, these payloads will be outputted to the DeMUX server function process 270. After the DeMUX process has disassembled a number of payloads and extracted segment representation parts therefrom, parts that together form a complete segment will be transferred as a full segment representation to one of the waiting Decoders 280 for further transmission to its destination.
A person skilled in the art will appreciate other ways of implementing the assembling/disassembling of packet payloads, e.g. by having the MUX server function 220 and DeMUX server function 270 interact with the Packetizer 240 and Depacketizer 290, respectively.
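One possible shape for the table-driven MUX server function is sketched below. The class name, method names, the fixed number of parts per payload and the string identifiers are hypothetical; the source only states that a table of Coders and destinations is built from the call set-up signalling and used to assemble payloads.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Part = Tuple[str, bytes]  # (coder id, segment part) - hypothetical representation


class MuxServer:
    """Illustrative MUX server process that groups segment parts by destination."""

    def __init__(self, parts_per_payload: int) -> None:
        self.parts_per_payload = parts_per_payload
        # Table built while listening to the set-up signalling: coder id -> destination.
        self.table: Dict[str, str] = {}
        # Parts waiting to be assembled, keyed by destination.
        self.pending: Dict[str, List[Part]] = defaultdict(list)

    def register_coder(self, coder_id: str, destination: str) -> None:
        """Record a Coder and the destination of its encoded data stream."""
        self.table[coder_id] = destination

    def push_part(self, coder_id: str, part: bytes) -> Optional[Tuple[str, List[Part]]]:
        """Queue one segment part; return (destination, payload) once a payload is full."""
        destination = self.table[coder_id]
        queue = self.pending[destination]
        queue.append((coder_id, part))
        if len(queue) < self.parts_per_payload:
            return None
        n = self.parts_per_payload
        payload, self.pending[destination] = queue[:n], queue[n:]
        return destination, payload


if __name__ == "__main__":
    mux = MuxServer(parts_per_payload=2)
    mux.register_coder("coder-1", "MG-250")
    mux.register_coder("coder-2", "MG-250")
    print(mux.push_part("coder-1", b"\x01"))  # payload not yet complete
    print(mux.push_part("coder-2", b"\x02"))  # assembled payload for MG-250
```

A DeMUX server function would hold the mirror-image table and hand completed segment representations to the waiting Decoders.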
Thus, in Fig. 2, Media Gateway 200 and Media Gateway 250 act as a first and a second communication node, respectively. Media Gateway 200 encodes and assembles multiple segments and Media Gateway 250 performs disassembling and decoding with respect to the segments. Alternatively, the encoding performed by the set of Coders 230 could be located at the respective signal sources and the decoding performed by the set of Decoders 280 at the respective signal destinations.
With reference to Fig. 3, the multiplexing/demultiplexing of signal segment parts in accordance with exemplifying embodiments of the invention will be described in greater detail.
In Fig. 3 the system is based on K scalar adaptive predictive speech coders with noise feedback coding of the kind proposed by Atal and Schroeder in "Predictive coding of speech signals and subjective error criteria", B.S. Atal and M.R. Schroeder, IEEE Trans. Acoust., Speech, Signal Processing, vol. 27, no. 3, pp. 247-254, 1979. In the figure only the k-th predictive encoder and decoder are shown. In this system the traditional single-input single-output scalar quantizer has been replaced with what is herein referred to as a multiplexed quantizer 300. The multiplexed quantizer 300 takes at each sampling instant n the K inputs $q_n^1$ to $q_n^K$ from all K predictive encoders and outputs quantized representations $\hat{q}_n^1$ to $\hat{q}_n^K$ back to the individual predictive encoders. In doing this, the multiplexed quantizer 300 generates K quantization indices $i_n^1$ to $i_n^K$, each in the range 1 to $2^b$, where b is the number of bits allocated for each index. These indices are packetized and transmitted over the packet-switched network in K independent packet streams 310..320. At the decoder side, the received indices $\tilde{i}_n^1$ to $\tilde{i}_n^K$ are available. These indices differ from the indices $i_n^1$ to $i_n^K$ in the encoder whenever an index has been lost with a data packet on the network, in which case the index value is replaced by zero. The demultiplexer 330 resolves K representations $\tilde{q}_n^1$ to $\tilde{q}_n^K$ of the quantized information from the available indices $\tilde{i}_n^1$ to $\tilde{i}_n^K$. These representations are subsequently input to the K predictive decoders. The resulting encoding and decoding system is shown in Figure 1. Not included in this figure is the LPC analysis and side information quantization which leads to coefficients for the predictors P(z), noise-feedback filters F(z), and scaling factors $\sigma^k$ for k = 1..K. The side information is conveniently encoded with a loss-resilient coding such as the Reed-Solomon code to enable transmission of this information in a robust manner using the same K packet streams as the multiplexed quantized information. The Reed-Solomon code is described in "Reed-Solomon Codes and their Applications", S.B. Wicker and V.K. Bhargava, IEEE Press, New York, 1994.
With further reference to Fig. 3, three different multiplexing schemes in accordance with different embodiments of the invention will now be described: packet hopping, Hadamard multiplexing, and an extension of the Hadamard multiplexing that exploits a nonlinear preprocessing and estimation method.
In a packet hopping scheme the multiplexed quantizer block is a system in which indices from scalar quantizers are hopped, i.e., cycled from one packet stream to the next as the sample instant n increments. One version of this system is specified by the equations
$i_n^k = Q\left(q_n^{(n+k)\bmod(K)+1}\right)$ (1)
$\hat{q}_n^{(n+k)\bmod(K)+1} = Q^{-1}\left(i_n^k\right)$ (2)
and
$\tilde{q}_n^{(n+k)\bmod(K)+1} = Q^{-1}\left(\tilde{i}_n^k\right)$ (3)
for k = 1..K. Here $Q(\cdot)$ and $Q^{-1}(\cdot)$ are the mappings from quantizer input to quantization index and from quantization index to the quantized representation of the quantizer input, respectively. For an adequate response to packet losses, $Q^{-1}(0) = 0$. The notation $(\cdot)\bmod(K)$ denotes the modulo K operation.
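The hopping rule of Equations 1 to 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the uniform quantizer, the step size `STEP`, the bit count `B`, and the helper names `hop_source`, `hop_encode` and `hop_decode` are hypothetical and are not specified in the text. Index 0 is reserved for a lost index so that $Q^{-1}(0) = 0$.

```python
import numpy as np

B = 3        # bits per index (illustrative value, not from the text)
STEP = 0.5   # step size of an assumed uniform scalar quantizer


def Q(x):
    """Map quantizer inputs to indices in 1..2**B (0 is reserved for 'lost')."""
    levels = 2 ** B
    idx = np.round(np.asarray(x) / STEP).astype(int) + levels // 2
    return np.clip(idx, 1, levels)


def Q_inv(i):
    """Map indices to quantized values; a lost index (0) decodes to 0.0."""
    i = np.asarray(i)
    return np.where(i == 0, 0.0, (i - 2 ** B // 2) * STEP)


def hop_source(n, K):
    """0-based coder feeding each stream k = 1..K at instant n.

    This is the 0-based form of the superscript (n+k) mod K + 1.
    """
    k = np.arange(1, K + 1)
    return (n + k) % K


def hop_encode(q, n):
    """Equation 1: the index on stream k quantizes the hopped coder input."""
    return Q(np.asarray(q)[hop_source(n, len(q))])


def hop_decode(indices, n):
    """Equation 3: route each received index back to the coder it came from."""
    q_rec = np.empty(len(indices))
    q_rec[hop_source(n, len(indices))] = Q_inv(indices)
    return q_rec


if __name__ == "__main__":
    q = np.array([0.5, -1.0, 1.5, 0.0])
    for n in range(4):
        received = hop_encode(q, n)
        received[0] = 0            # stream 1 is lost at every instant
        print(n, hop_decode(received, n))
```

With one stream lost at every instant, the erased sample moves from coder to coder as n increments, so no single speech signal absorbs the whole loss.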
The packet hopping described above can be expressed as a particular orthogonal transformation of the input to the multiplexed quantizer followed by a quantization of the transform output. If we define column vectors with elements equal to the K scalar input, output, or index values for the multiplexed quantizer and demultiplexer, such that e.g.,
$q_n = [q_n^1, q_n^2, \ldots, q_n^K]^T$
then the multiplexed quantizer is defined by the following equations:
$c_n = M_n q_n$
$i_n = Q(c_n)$
and
$\hat{q}_n = M_n^T Q^{-1}(i_n)$
The demultiplexer on the receiver side is defined by:
$\tilde{c}_n = Q^{-1}(\tilde{i}_n)$
and
$\tilde{q}_n = M_n^T \tilde{c}_n$
The equivalence of these equations with the packet hopping described by Equations 1 to 3 is obtained by letting the transform matrix $M_n$ equal a suitable time-varying row or column permutation of an identity matrix. With this formulation of the multiplexing, it is relevant to introduce, in alternative embodiments, transform matrices other than the row or column permuted identity matrix. A simple, yet relevant, transform for this purpose is the normalized Hadamard transform disclosed in "Orthogonal Designs; Quadratic Forms and Hadamard Matrices", A.V. Geramita and J. Seberry, Marcel Dekker, 1979. The Hadamard multiplexing holds advantages over the packet hopping method. These advantages are explained in the following. One advantage is that whenever fewer than K packets are lost in the network there are no full erasures of any sample in any of the quantized prediction error signals. This advantage can be exploited when the Hadamard transform is combined with a nonlinear preprocessing and estimation scheme as described below.
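The difference between the two transforms can be illustrated with a small Python sketch. It rests on several assumptions: quantization is left out, K is a power of two so that the Sylvester construction applies (the text does not restrict K in this way), and the function names are hypothetical.

```python
import numpy as np


def normalized_hadamard(K):
    """Sylvester construction of an orthonormal Hadamard matrix (K a power of two)."""
    if K < 1 or K & (K - 1):
        raise ValueError("this construction needs K to be a power of two")
    H = np.array([[1.0]])
    while H.shape[0] < K:
        H = np.block([[H, H], [H, -H]])
    return H / np.sqrt(K)


def hopping_matrix(n, K):
    """Permutation matrix for packet hopping: c[k] = q[(n + k + 1) % K], 0-based k."""
    M = np.zeros((K, K))
    k = np.arange(K)
    M[k, (n + k + 1) % K] = 1.0
    return M


def transmit(M, q, lost):
    """Transform, zero the coefficients carried by lost streams, transform back."""
    c_rx = M @ q
    c_rx[list(lost)] = 0.0      # a lost index decodes to zero
    return M.T @ c_rx


if __name__ == "__main__":
    q = np.array([1.0, 0.0, 0.0, 0.0])
    print("hopping :", transmit(hopping_matrix(3, 4), q, lost=[0]))
    print("hadamard:", transmit(normalized_hadamard(4), q, lost=[0]))
```

In this example the permutation matrix loses the only non-zero sample outright, whereas the Hadamard matrix spreads the loss over all four outputs as a smaller error.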
Let us assume the elements of $q_n$ to be uncorrelated and neglect the impact of quantization noise. Then the matrix $M_n^T$ is the linear minimum mean-squared error estimator for $q_n$ given the coefficient vector $c_n$. This estimator is the mean-square optimum for Gaussian $q_n$. However, the gain-scaled linear prediction errors for voiced speech signals are known to be non-Gaussian. Thus, a nonlinear estimator can result in lower mean-squared error. Indeed, we observed in preliminary experiments that nonlinear estimation could lead to a significant decrease of the mean-squared error. However, the nonlinear estimation led to very high computational complexity. Therefore we adopt an alternative method in which a well-defined nonlinearity is applied to the input of the multiplexed quantizer. Knowledge of this nonlinearity can then subsequently be exploited to improve the reconstructed quantized prediction errors in the case of packet losses.
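The estimator statement above can be checked with a short derivation. It is a sketch under added assumptions that the text does not spell out: the elements of $q_n$ are zero-mean with a common variance $\sigma_q^2$, $M_n$ is orthogonal, and $S$ is a hypothetical selection matrix whose rows pick the received coefficients, so that $r_n = S c_n$ and $S S^T = I$.

```latex
\begin{align*}
c_n &= M_n q_n, \qquad r_n = S c_n, \qquad
       \mathrm{E}\!\left[q_n q_n^T\right] = \sigma_q^2 I \\
\hat{q}_n &= \mathrm{E}\!\left[q_n r_n^T\right]
             \left(\mathrm{E}\!\left[r_n r_n^T\right]\right)^{-1} r_n \\
          &= \sigma_q^2\, M_n^T S^T
             \left(\sigma_q^2\, S M_n M_n^T S^T\right)^{-1} r_n \\
          &= M_n^T S^T r_n \;=\; M_n^T \tilde{c}_n
\end{align*}
```

Here $\tilde{c}_n = S^T r_n$ is the coefficient vector with the lost entries set to zero, so applying $M_n^T$ to the zero-filled coefficients gives the linear estimate under these assumptions.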
The general method suitable to use is to zero $K_z$ of the K inputs to the multiplexed quantizer prior to applying the transform $M_n$. Advantageously, the $K_z$ inputs with lowest amplitudes are set to zero. In the method, no information about the position of zero-valued elements in $q_n$ is conveyed to the decoder; only the knowledge that $K_z$ of the elements were zero is exploited. The method is described below. Suppose that the number of lost packet streams is $K_{lps}$ and the lost packet streams are indexed by an integer set $k_{lps}$. Then we may formulate a set of equations relating the received coefficients $\tilde{c}_n$ (in which the lost coefficients are zero) to the encoded scaled prediction errors $q_n$:
$q_n = M_n^T \tilde{c}_n + M_n^T(:, k_{lps})\, a$ (4)
In this equation a is a $K_{lps}$-dimensional unknown vector. We have used a Matlab-style notation $M_n^T(:, k_{lps})$ to denote a matrix consisting of the columns of $M_n^T$ that are indexed by $k_{lps}$. Now assume that $q_n$ had $K_{lps}$ zero-valued elements indexed by the set $k_{z_i}$: $q_n(k_{z_i}) = 0$, then
$a = -\left[M_n^T(k_{z_i}, k_{lps})\right]^{-1} M_n^T(k_{z_i}, :)\, \tilde{c}_n$
provided that $M_n^T(k_{z_i}, k_{lps})$ has full rank. Furthermore,
$a = c_n(k_{lps})$ (5)
The indexing for the zero-valued elements of $q_n$ is not known by the decoder; however, there are
$C_1 = \binom{K}{K_{lps}}$
ways in which the decoder can assume $K_{lps}$ elements of $q_n$ to be zero. Of these,
$C_2 = \binom{K_z}{K_{lps}}$
will be true assumptions. Whenever $K_z > K_{lps}$ there are multiple true assumptions and all true assumptions will result in the same value for a, i.e., the one given in Equation 5. Thus, the method applicable in the decoder is to calculate a for all $C_1$ possible choices of $k_{z_i}$ and select the a vector that occurred $C_2$ times. Hereafter $q_n$ is obtained as the right-hand side of Equation 4.
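The preprocessing and the decoder search described above can be sketched end to end as follows. This is a minimal illustration, not the patented implementation: quantization is neglected, K is assumed to be a power of two, the function names are hypothetical, and the sketch picks the most frequent candidate for a rather than checking for exactly $C_2$ occurrences.

```python
from itertools import combinations
from math import comb

import numpy as np


def normalized_hadamard(K):
    """Sylvester construction of an orthonormal Hadamard matrix (K a power of two)."""
    H = np.array([[1.0]])
    while H.shape[0] < K:
        H = np.block([[H, H], [H, -H]])
    return H / np.sqrt(K)


def zero_smallest(q, k_z):
    """Encoder preprocessing: set the k_z lowest-amplitude inputs to zero."""
    q = np.asarray(q, dtype=float).copy()
    q[np.argsort(np.abs(q), kind="stable")[:k_z]] = 0.0
    return q


def recover(M, c_rx, lost, tol=1e-9):
    """Decoder search: try every choice of len(lost) zero positions.

    Returns the reconstruction from Equation 4 and the number of
    choices that agreed on the selected vector a.
    """
    lost = list(lost)
    Mt = M.T
    base = Mt @ c_rx                        # first term of Equation 4
    candidates = []                         # entries are [a, count]
    for kz in combinations(range(M.shape[0]), len(lost)):
        A = Mt[np.ix_(kz, lost)]
        if np.linalg.matrix_rank(A) < len(lost):
            continue                        # rank-deficient choice, skipped here
        a = -np.linalg.solve(A, base[list(kz)])
        for cand in candidates:
            if np.allclose(cand[0], a, atol=tol):
                cand[1] += 1
                break
        else:
            candidates.append([a, 1])
    if not candidates:
        raise ValueError("every choice of zero positions was rank deficient")
    a_best, count = max(candidates, key=lambda cand: cand[1])
    return base + Mt[:, lost] @ a_best, count


if __name__ == "__main__":
    M = normalized_hadamard(8)
    q = zero_smallest([0.9, 0.05, -1.3, -0.1, 0.4, 0.02, 2.1, -0.7], k_z=3)
    c_rx = M @ q
    c_rx[2] = 0.0                           # the stream carrying coefficient 2 is lost
    q_hat, count = recover(M, c_rx, lost=[2])
    print(q_hat, count, comb(3, 1))
```

Two false assumptions can occasionally agree on the same wrong a, so a practical decoder would need a tie-breaking rule; the sketch does not address that case.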
Rank deficiency of the matrix $M_n^T(k_{z_i}, k_{lps})$ limits the use of this method. For example, when $M_n$ equals the permuted identity matrix that follows from the packet hopping, our nonlinear preprocessing and estimation does not apply. In contrast, when $M_n$ is the normalized Hadamard transform, the method applies with no complications for $K_{lps} = 1$. For $K_{lps} > 1$, rank deficiency can occur for some of the possible choices of $k_{z_i}$. In this case heuristics must be introduced in the selection of a.
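The rank condition can be explored numerically. The sketch below counts, for a given transform and set of lost streams, how many choices of assumed zero positions give a rank-deficient submatrix; the function name and the 4-by-4 example matrices are illustrative assumptions.

```python
from itertools import combinations

import numpy as np


def count_rank_deficient(M, lost):
    """Return (deficient, total) over all choices of len(lost) zero positions."""
    Mt = M.T
    lost = list(lost)
    deficient = total = 0
    for kz in combinations(range(M.shape[0]), len(lost)):
        total += 1
        if np.linalg.matrix_rank(Mt[np.ix_(kz, lost)]) < len(lost):
            deficient += 1
    return deficient, total


# Normalized 4x4 Hadamard matrix and the identity (a trivial permutation).
H4 = 0.5 * np.array([[1, 1, 1, 1],
                     [1, -1, 1, -1],
                     [1, 1, -1, -1],
                     [1, -1, -1, 1]], dtype=float)
I4 = np.eye(4)

if __name__ == "__main__":
    print("Hadamard, one lost stream :", count_rank_deficient(H4, [0]))
    print("Hadamard, two lost streams:", count_rank_deficient(H4, [0, 1]))
    print("Identity, one lost stream :", count_rank_deficient(I4, [0]))
```

For the identity matrix the only full-rank choice is the position of the erased sample itself, which carries no extra information; this matches the statement that the method does not apply to packet hopping.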
To verify the benefits of the present invention, a coding experiment was conducted in which 12 speech files, each containing two utterances, were jointly encoded by multiplexed predictive coders (K = 12). The speech files were encoded at 40 kbps and decoded after a percentage of the data packets had been randomly dropped. Random packet loss rates between 0% and 40% were simulated. As a reference system, 64 kbps μ-law quantization and packet loss concealment (PLC) according to the ITU-T G.711 standard was used. The reference system was simulated for the same speech files and packet losses as the multiplexed predictive coders.
A preference test was conducted of the 40 kbps multiplexed predictive coder with packet hopping versus the reference system. In this test listeners were subjected to 24 utterances processed by the two systems in randomized order. For each utterance the listeners made a preference decision. The results are given in Table 3.
Packet loss rate        0%    10%   20%   30%   40%
Preference to MPC       35%   63%   77%   92%   97%
Preference to G.711     65%   37%   23%   8%    3%
Table 3.
We see from these results that multiplexed predictive coding was preferred over G.711 PLC for packet loss rates in the range from 10% to 40%. This indicates that the multiplexed predictive coding system is more robust to packet losses than scalar quantization and packet loss concealment according to the G.711 standard. In addition, the multiplexed predictive coders can be designed to operate at significantly lower bit rates than the G.711 standard.

Claims

1. A method of signal processing in connection with digital signal transmission over a packet switched network, including the steps of: encoding a set of segments of respective digital signals into a set of corresponding segment representations; and assembling a set of data packet payloads for transmission in respective data packets over the packet switched network, wherein each data packet payload, for several segments of said set of segments, includes a part of the full segment representation, in such way that all parts of a segment that are included by respective data packet payloads together form the full representation of said a segment .
2. The method as claimed in claim 1, wherein the number of data packet payloads in said set of data packet payloads either is equal to, or different from, the number of segments in said set of segments.
3. The method as claimed in claims 1 or 2, said encoding step including encoding all segments of said set of segments essentially simultaneously.
4. The method as claimed in any one of claims 1 - 3, wherein said encoding step includes, with respect to each segment, quantization of digital signal samples to quantized representations of the samples, wherein the representation of the segment consists of quantized representations of the digital samples of the segment .
5. The method as claimed in any one of claims 1 - 4, wherein said part of the full segment representation is included in the data packet payload as quantized representations of corresponding digital signal samples.
6. The method as claimed in any one of claims 1 - 3, wherein the full segment representations of the segments of said set of segments include respective parts which are included in said set of data packet payloads as linear combinations of respective sets of digital samples, wherein each set of digital samples includes digital samples from all segments of said set of segments .
7. The method as claimed in any one of claims 1 - 3 or 6, said step of encoding multiple segments including the steps of: transforming a set of digital samples including digital samples from all segments of said set of segments into a set of linear combinations of these digital samples; quantizing the set of linear combinations to a corresponding set of quantized linear combinations, wherein the quantized linear combinations are provided to be included in respective data packet payloads as parts of full segment representations of the segments of said set of segments.
8. The method as claimed in claim 7, wherein the digital signal samples of said transforming step are transformed by means of a Hadamard transform.
9. The method as claimed in claim 7 or 8 , wherein said transforming step is preceded by the step of applying a predefined non-linearity with respect to the digital samples.
10. The method as claimed in claim 9, wherein said step of applying a non-linearity includes setting a number of those digital samples that have the lowest amplitude, among said set of digital samples, to the value zero before performing said transforming step.
11. The method as claimed in any one of the previous claims, wherein the encoding step includes the step of: applying a loss-resilient encoding of said set of segments; and adding redundancy bits obtained from the loss-resilient encoding to the segments of said set of segments, wherein also the redundancy bits are subject to the assembling step.
12. The method as claimed in any one of claims 1 - 11, wherein a first communication node performs said encoding step and said assembling step with respect to digital signals that are to be transmitted to the same second communication node, which digital signals are received from respective sources being directly or indirectly connected to the first communication node.
13. The method as claimed in claim 12, wherein said first communication node is a gateway or a telecommunications switch and said second communication node a gateway, a telecommunications switch, or a router in a telecommunications network.
14. The method as claimed in any one of claims 1 - 11, wherein said encoding step collectively is performed by a set of sources, each source encoding the segment of a corresponding digital signal for further transmission to a first communication node; and said assembling step is performed by said first communication node with respect to digital signals that are received from said set of sources and that are to be transmitted to the same second communication node.
15. The method as claimed in claim 14, wherein said first communication node is a gateway, a telecommunications switch, or a router in a telecommunications network and said second communication node a gateway, a telecommunications switch, or a router in a telecommunications network.
16. The method as claimed in claim 12 or 14, wherein said sources are mobile communication stations, said first communication node a base station in a radio communications network, and said second communication node a node in a radio communications network with which the base station communicates.
17. The method as claimed in any one of claims 1 - 16, wherein a digital signal corresponds to any real-time data traffic signal, such as a speech, audio or video signal .
18. An apparatus for processing digital signals to be transmitted over a packet switched network, wherein the apparatus includes means for performing the steps of the method as claimed in any one of claims 1-17.
19. A computer-readable medium storing computer-executable components for processing digital signals to be transmitted over a packet switched network, wherein the computer-executable components perform the steps of the method as claimed in any one of claims 1-17.
20. A method of signal processing in connection with digital signal reception from a packet switched network, including the steps of : disassembling a set of data packet payloads received in respective data packets from the packet switched network into a set of segment representations corresponding to a set of segments of respective digital signals, wherein, for several segments of said set of segments, a part of the full representation of the segment is extracted from each data packet payload, whereby all parts extracted with respect to a segment forms the full representation of said a segment; and decoding said set of segment representations into the corresponding set of segments .
21. The method as claimed in claim 20, wherein the number of data packet payloads in said set of data packet payloads either is equal to, or different from, the number of segments in said set of segments.
22. The method as claimed in claim 20 or 21, said decoding step including decoding all segment representations of said set of segment representations essentially simultaneously.
23. The method as claimed in any one of claims 20 - 22, wherein said decoding step includes, with respect to each segment, which consists of quantized representations of digital signal samples, dequantization of the sample representations to corresponding digital signal samples.
24. The method as claimed in any one of claims 20 - 23, wherein said part of the full representation of the segment included in each data packet payload includes quantized representations of corresponding digital signal samples.
25. The method as claimed in any one of claims 20 - 22, wherein parts of respective segment representations are extracted from said set of data packet payloads as linear combinations of a set of digital samples which includes samples from all segments of said set of segments .
26. The method as claimed in any one of claims 20 - 21 or 25, said step of decoding including the steps of: dequantizing parts of respective segment representations extracted from said set of data packet payloads to a set of linear combinations; and transforming said set of linear combinations to a set of digital samples which includes samples from all segments of said set of segments.
27. The method as claimed in claim 26, wherein the digital signal samples of said transforming step are transformed by means of a transpose matrix of a Hadamard transform.
28. The method as claimed in claim 26 or 27, wherein said transforming step is followed by the step of reconstructing lost linear combinations of said set of linear combinations, the reconstruction being based on knowledge that a number of the digital samples from which the set of linear combinations was derived at encoding was set to zero during encoding.
29. The method as claimed in any one of the previous claims, wherein the decoding step includes the step of exploiting redundancy bits included by the segment representations for recovering lost parts of the segment representations.
30. The method as claimed in any one of claims 20 - 29, wherein a second communication node performs said disassembling step and said decoding step with respect to packetized digital signals that are received from the same first communication node, which digital signals then are transferred to respective terminating entities being directly or indirectly connected to the second communication node.
31. The method as claimed in claim 30, wherein said second communication node is a gateway or a telecommunications switch and said first communication node a gateway, a telecommunications switch, or a router in a telecommunications network.
32. The method as claimed in any one of claims 20 - 29, wherein said disassembling step is performed by a second communication node with respect to packetized digital signals that are received from the same first communication node, which digital signals then are transmitted to respective entities; and said decoding step collectively is performed by said entities, each entity decoding the segment representation of a corresponding digital signal.
33. The method as claimed in claim 32, wherein said second communication node is a gateway, a telecommunications switch, or a router in a telecommunications network and said first communication node a gateway, a telecommunications switch, or a router in a telecommunications network.
34. The method as claimed in claim 30 or 32, wherein said entities are mobile communication stations, said second communication node a base station in a radio communications network, and said first communication node a node in a radio communications network with which the base station communicates.
35. The method as claimed in any one of claims 20 - 34, wherein a digital signal corresponds to any real-time data traffic signal, such as a speech, audio or video signal .
36. An apparatus for processing digital signals received from a packet switched network, wherein the apparatus includes means for performing the steps of the method as claimed in any one of claims 20-35.
37. A computer-readable medium storing computer-executable components for processing digital signals received from a packet switched network, wherein the computer-executable components perform the steps of the method as claimed in any one of claims 20-35.
PCT/EP2002/004713 2001-05-03 2002-04-29 Multiplexed coding WO2002091691A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
SE0101550-2 2001-05-03
SE0101550A SE0101550D0 (en) 2001-05-03 2001-05-03 Multiplexed coding

Publications (1)

Publication Number Publication Date
WO2002091691A1 (en) 2002-11-14

Family

ID=20283970

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2002/004713 WO2002091691A1 (en) 2001-05-03 2002-04-29 Multiplexed coding

Country Status (2)

Country Link
SE (1) SE0101550D0 (en)
WO (1) WO2002091691A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5249185A (en) * 1990-08-03 1993-09-28 Nippon Telephone And Telegraph Corporation Voice packet assembling/disassembling apparatus
WO1997008861A1 (en) * 1995-08-25 1997-03-06 Terayon Corporation Apparatus and method for digital data transmission

Also Published As

Publication number Publication date
SE0101550D0 (en) 2001-05-03

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ CZ DE DE DK DK DM DZ EC EE EE ES FI FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SI SK SK SL TJ TM TN TR TT TZ UA UG US UZ VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: JP

WWW Wipo information: withdrawn in national office

Country of ref document: JP