MULTIPLEXED CODING
Technical Field of the Invention
The present invention relates to signal processing in connection with digital signal transmission over, and reception from, a packet switched network.
Technical Background and Prior Art
Real-time transmissions over packet switched networks, such as speech, audio or video over Internet Protocol based networks (mainly the Internet or Intranet networks), have become increasingly attractive due to a number of features. These features include relatively low operating costs, easy integration of new services, and one network for both non-real-time and real-time data. Real-time data, typically a speech, an audio or a video signal, are in packet switched systems converted into a digital signal, i.e. into a bitstream, which is divided into portions of suitable size in order to be transmitted in data packets over the packet switched network from a transmitter end to a receiver end. As packet switched networks were originally designed for transmission of non-real-time data, transmission of real-time data over such networks causes some problems. Data packets can be lost during transmission, as they can be deliberately discarded by the network due to congestion problems or transmission errors. In non-real-time applications this is not a problem since a lost packet can be retransmitted. However, retransmission is not a possible solution for real-time applications that are delay sensitive. A packet that arrives too late at a real-time application cannot be used to reconstruct the corresponding signal since this signal already has been, or should have been, delivered to the receiving end, e.g. for playback by a speaker or for visualisation on a display screen. Therefore, a packet that arrives too late is equivalent to a lost packet.
When transferring a real-time signal as packets, the main problem with lost or delayed data packets is the introduction of distortion in the reconstructed signal. The distortion results from the fact that signal segments conveyed by lost or delayed data packets cannot be reconstructed.
When transferring a signal it is most often desired to use as little bandwidth as possible. As is well known, many signals have patterns containing redundancies. Appropriate coding methods can avoid the transmission of the redundant information, thereby enabling a more bandwidth-efficient transmission of the signal. Typical coding methods taking advantage of such redundancies are predictive coding methods. A predictive coding method encodes a signal pattern based on dependencies between the pattern representations. It encodes the signal for transmission with a fixed bit rate and with a trade-off between the signal quality and the transmitted bit rate. Examples of predictive coding methods used for speech are Adaptive Predictive Coding (APC) and Code Excited Linear Prediction (CELP), both of which are well known to a person skilled in the art.
To alleviate the effects that lost packets have on the reconstructed signal with respect to lost signal information, two classes of signal processing methods have been proposed in the prior art. A first class adds redundancy and significant delay at the transmitter side, e.g., with loss-resilient codes or with multiple-description source/channel coding. A second class attempts perceptual concealment of the packet loss through signal interpolation at the receiver side. Thus, the drawbacks with these methods are either increased network payload data rate and increased transmission delay, or degraded perceptual quality of the received signal, or both.
interconnects two circuit-switched networks with gateways performing dedicated point-to-point communications.
As another example, the path sharing traffic can be the path between a base station and another node, such as a Mobile Switching Centre, in a mobile communication system, in which the sources or destinations are Mobile Stations, or other radio transceivers, connected to the base station via radio links.
Yet other applications are so called streaming and multicasting in packet data networks.
As mentioned, the invention is also applicable when traffic shares the same connection between two interconnected routers, or communication nodes, in which case the paths are separated before and after those routers.
Furthermore, traffic may be transmitted from the same gateway or telecommunications switch and share the same path to a router. Correspondingly, traffic may be received by the same gateway or telecommunications switch after having shared the same path from a router.
The multiplexing is preferably performed by multiplexing representations, e.g. quantized representations, from each one of the multiple segments, i.e. from each segment of a set of segments, or by multiplexing parts from each one of the multiple segments based on linear combinations of digital samples from all of the multiple segments.
According to an embodiment, loss-resilient codes are added to the segments in order to be able to regenerate parts of segments that have been lost due to packet losses. These loss-resilient codes further alleviate the effect of packet losses in a system operating in accordance with the present invention.
Thus, the present invention is suitable for use in packet switched networks where packet loss may occur. The invention provides robust coding with no redundancy overhead, even though redundancy can be added in a
should by no means be interpreted as limiting on the scope of the claims .
The above mentioned and further features of, and advantages with, the present invention, will be more fully understood from the following description, with reference to the accompanying drawings, of exemplifying embodiments thereof.
Brief Description of the Drawings
Exemplifying embodiments of the present invention will be described by way of example with reference to the accompanying drawings, in which:
Fig. 1 shows the basic principle of the multiplexing scheme in accordance with an embodiment of the invention in an exemplified, and schematically illustrated, system;
Fig. 2 shows two Media Gateways adapted for transmitting and receiving multiplexed segment representations in accordance with an embodiment of the invention; and
Fig. 3 shows a multiplexed encoding arrangement transmitting multiple segments to a multiplexed decoding arrangement, in accordance with embodiments of the invention.
Detailed Description of the Invention
With reference to Fig. 1 the principle of multiplexing in accordance with an embodiment of the invention is shown. A first communication node (Media Gateway) 100 receives a set of multiple segments. Each segment belongs to a respective encoded real-time signal. For ease of description, only three signals S', S" and S'" with respective segments are depicted in Fig. 1; however, any number (greater than one) of signals can be subject to the practising of the present invention. Furthermore, for ease of description, each segment includes a very limited number of elements in its representation. All the three signals share a common path from the first
In Fig. 1 the incoming segments of signals S', S" and S'" to the Media Gateway 100 are either already encoded by their respective sources, or they are encoded by the Media Gateway 100. In Fig. 1 it can also be seen that if a data packet and its payload are lost, every third part of a segment representation will be lost. To further alleviate the effect of lost data packets, loss resilient coding may advantageously be added to the multiplexing scheme. The resulting redundancy bits (r1, r2 and r3) of such loss resilient coding have been included in the segments and packet payloads shown in Fig. 1. Again, the number of redundancy bits indicated in each segment/payload is merely chosen for ease of description; any number of redundancy bits can be used. Moreover, the loss resilient coding adding the redundancy bits is in Fig. 1 indicated as having been performed by the respective sources; however, this can alternatively be performed by the Media Gateway 100. The loss resilient coding is preferably added by applying a coding method appreciated by a person skilled in the art, such as Reed-Solomon coding. Furthermore, a person skilled in the art will also appreciate that by means of the segment parts and redundancy bits that are received by Media Gateway 110, lost segment representation parts in a lost packet payload can be reconstructed.
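The effect described for Fig. 1, where one lost packet removes only every third part of each segment, can be illustrated with a minimal sketch. The round-robin assignment (part p goes to payload p mod 3), the function names `multiplex` and `demultiplex`, and the (segment, position) tags carried with each part are assumptions made for illustration; the figure's exact payload layout is not reproduced here, and redundancy bits are left out.

```python
def multiplex(segments, num_packets):
    """Spread the parts of every segment over num_packets payloads.

    Part p of each segment goes to payload p mod num_packets (an assumed
    round-robin layout). Each part is tagged with (segment, position) so
    that this toy receiver needs no separate table.
    """
    payloads = [[] for _ in range(num_packets)]
    for s, segment in enumerate(segments):
        for p, part in enumerate(segment):
            payloads[p % num_packets].append((s, p, part))
    return payloads


def demultiplex(payloads, num_segments, segment_len):
    """Rebuild the segments; parts carried by lost payloads stay None."""
    segments = [[None] * segment_len for _ in range(num_segments)]
    for payload in payloads:
        if payload is None:  # this packet was lost in the network
            continue
        for s, p, part in payload:
            segments[s][p] = part
    return segments


# Three signals, six parts per segment (sizes chosen only for the example).
segments = [[f"{name}{p}" for p in range(6)] for name in ("a", "b", "c")]
payloads = multiplex(segments, num_packets=3)
payloads[1] = None  # simulate the loss of the second packet
recovered = demultiplex(payloads, num_segments=3, segment_len=6)
```

With the second packet dropped, every segment in `recovered` is missing exactly the parts at positions 1 and 4, so the loss is shared by all three signals instead of erasing one segment completely.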
With reference to Fig. 2, two Media Gateways, or communication nodes, at respective ends of a path shared by multiple connections are shown. Each Media Gateway (MG) is controlled by its respective Media Gateway Controller (MGC) over a Media Gateway Control Protocol (MGCP). The MGCP is in essence a master/slave protocol where the Media Gateway executes the commands that are sent by the Media Gateway Controller. The transmitting Media Gateway 200 is controlled by Media Gateway Controller 210 and includes a MUX server function 220. The MUX server function is vendor specific and does not affect other architecture elements or any standardized protocol. Correspondingly, the receiving Media Gateway 250 is controlled by Media Gateway Controller 260 and includes a vendor specific DeMUX server function 270.
Inputted to each Coder 230 at the transmitting MG 200 are multiple real-time data streams. Upon setting up a multiplexed connection between the two MGs 200 and 250, the MGC 210 of MG 200 will issue a call set-up to MGC 260 of MG 250. The MUX server function 220 includes a table of which coders are included in MG 200 and the destinations of their encoded data streams. This table is built by listening to the signalling between the MGC 210 and MG 200 when setting up the Coders 230 and the destinations of their respective data streams.
The MUX server function 220 is implemented as a process which receives segment parts outputted from the Coder 230 processes. Using its included table the MUX server function 220 multiplexes the segment parts received from the different Coders 230. When a data packet payload has been assembled, this payload will be outputted to one of the waiting Coder processes. This Coder will in turn output the assembled payload to a Packetizer 240. In case the path between the two MG's is over an Internet Protocol network, the Packetizer 240 will add IP headers etc. to the packet streams.
Correspondingly, upon reception of the call set-up from MGC 210, signalling will occur between the MGC 260 and the Media Gateway 250. The DeMUX server function 270 will listen to this signalling and build a table to be used for de-multiplexing of received data packet payloads that include multiplexed segment representation parts. When packet payloads are received by the Decoders 280 from a Depacketizer 290, these payloads will be outputted to the DeMUX server function process 270. After the DeMUX process has disassembled a number of payloads and extracted segment representation parts therefrom, parts that together form a complete segment will be transferred as a full segment representation to one of the waiting Decoders 280 for further transmission to its destination.
A person skilled in the art will appreciate other ways of implementing the assembling/disassembling of packet payloads, e.g. by having the MUX server function 220 and DeMUX server function 270 interact with the Packetizer 240 and Depacketizer 290, respectively.
Thus, in Fig. 2, Media Gateway 200 and Media Gateway 250 act as a first and a second communication node, respectively. Media Gateway 200 encodes and assembles multiple segments and Media Gateway 250 performs disassembling and decoding with respect to the segments. Alternatively, the encoding performed by the set of Coders 230 could be located at the respective signal sources and the decoding performed by the set of Decoders 280 at the respective signal destinations.
With reference to Fig. 3, the multiplexing/demultiplexing of signal segment parts in accordance with exemplifying embodiments of the invention will be described in greater detail.
In Fig. 3 the system is based on K scalar adaptive predictive speech coders with noise feedback coding of the kind proposed by Atal and Schroeder in "Predictive coding of speech signals and subjective error criteria", B.S. Atal and M.R. Schroeder, IEEE Trans. Acoust. Speech Signal Processing, vol. 27, no. 3, pp. 247-254, 1979. In the figure only the k-th predictive encoder and decoder are shown. In this system the traditional single-input single-output scalar quantizer has been replaced with what is herein referred to as a multiplexed quantizer 300. The multiplexed quantizer 300 takes at each sampling instant n the K inputs $q_n^1$ to $q_n^K$ from all K predictive encoders and outputs quantized representations $\hat{q}_n^1$ to $\hat{q}_n^K$ back to the individual predictive encoders. In doing this, the multiplexed quantizer 300 generates K quantization indices $i_n^1$ to $i_n^K$, each in the range 1 to $2^b$, where b is the number of bits allocated for each index. These indices are packetized and transmitted over the packet-switched network in K independent packet streams 310..320. At the decoder side, the received indices $\tilde{i}_n^1$ to $\tilde{i}_n^K$ are available. These indices differ from the indices $i_n^1$ to $i_n^K$ in the encoder whenever an index has been lost with a data packet on the network, in which case the index value is replaced by zero. The demultiplexer 330 resolves K representations $\tilde{q}_n^1$ to $\tilde{q}_n^K$ of the quantized information from the available indices $\tilde{i}_n^1$ to $\tilde{i}_n^K$. These representations are subsequently input to the K predictive decoders. The resulting encoding and decoding system is shown in Figure 1. Not included in this figure is the LPC analysis and side information quantization which leads to coefficients for the predictors $P^k(z)$, noise-feedback filters $F^k(z)$, and scaling factors $\sigma^k$ for k = 1..K. The side information is conveniently encoded with a loss-resilient coding such as the Reed-Solomon code to enable transmission of this information in a robust manner using the same K packet streams as the multiplexed quantized information. The Reed-Solomon code is described in "Reed-Solomon Codes and their Applications", S.B. Wicker and V.K. Bhargava, IEEE Press, New York, 1994.
With further reference to Fig. 3, three different multiplexing schemes in accordance with different embodiments of the invention will now be described: packet hopping, Hadamard multiplexing, and an extension of the Hadamard multiplexing that exploits a nonlinear preprocessing and estimation method.
In a packet hopping scheme the multiplexed quantizer block is a system in which indices from scalar quantizers are hopped, i.e., cycled from one packet stream to the next as the sample instant n increments. One version of this system is specified by the equations
and
$\tilde{q}_n^{((n+k) \bmod K)+1} = Q^{-1}(\tilde{i}_n^{k})$ (3)
for k = 1..K. Here $Q(\cdot)$ and $Q^{-1}(\cdot)$ are the mappings from quantizer input to quantization index and from quantization index to the quantized representation of the quantizer input, respectively. For an adequate response to packet losses, $Q^{-1}(0) = 0$. The notation $(\cdot) \bmod (K)$ denotes the modulo K operation.
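The hopping can be sketched as follows. This is only an illustration under stated assumptions: the index mapping follows the form of Equation 3 (packet stream k at instant n carries the sample of coder ((n+k) mod K)+1), and the uniform quantizer with step 0.25, b = 4 and the names `quantize`, `dequantize`, `hop_encode` and `hop_decode` are hypothetical choices that merely satisfy the stated requirements (indices in 1 to 2^b, index 0 marking a loss, and Q^{-1}(0) = 0).

```python
B = 4        # bits per index (assumed)
STEP = 0.25  # quantizer step size (assumed)


def quantize(x):
    """Q(.): map a sample to an index in 1..2**B; 0 is reserved for 'lost'."""
    levels = 2 ** B
    return min(max(round(x / STEP) + levels // 2, 1), levels)


def dequantize(idx):
    """Q^-1(.): map an index back to a value, with Q^-1(0) = 0."""
    if idx == 0:
        return 0.0
    return (idx - 2 ** B // 2) * STEP


def hop_encode(q, n):
    """Return the K indices for streams 1..K at sampling instant n."""
    K = len(q)
    # Stream k carries coder ((n + k) mod K) + 1, i.e. list position (n + k) % K.
    return [quantize(q[(n + k) % K]) for k in range(1, K + 1)]


def hop_decode(indices, n):
    """Undo the hopping; a lost index (0) yields a zero sample."""
    K = len(indices)
    out = [0.0] * K
    for k in range(1, K + 1):
        out[(n + k) % K] = dequantize(indices[k - 1])
    return out


# K = 3 coders, three sampling instants, packet stream 2 lost throughout.
frames = [[0.5, -0.25, 1.0], [0.25, 0.75, -0.5], [-1.0, 0.5, 0.25]]
decoded_frames = []
for n, q in enumerate(frames):
    indices = hop_encode(q, n)
    indices[1] = 0  # stream 2 is lost
    decoded_frames.append(hop_decode(indices, n))
```

In this run the erased sample falls on coder 3 at n = 0, coder 1 at n = 1 and coder 2 at n = 2, so the loss of one packet stream is spread over all coders rather than silencing a single one.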
The packet hopping described above can be expressed as a particular orthogonal transformation of the input to the multiplexed quantizer followed by a quantization of the transform output. If we define column vectors with elements equal to the K scalar input, output, or index values for the multiplexed quantizer and demultiplexer, such that e.g.,
$q_n = [q_n^1, q_n^2, \ldots, q_n^K]^T$
then the multiplexed quantizer is defined by the following equations:
$c_n = M_n q_n$
and
$i_n = Q(c_n)$
The demultiplexer on the receiver side is defined by:
$\tilde{c}_n = Q^{-1}(\tilde{i}_n)$
and
$\tilde{q}_n = M_n^T \tilde{c}_n$
The equivalence of these equations with the packet hopping described by Equations 1 to 3 is obtained by letting the transform matrix $M_n$ equal an adequate time-varying row or column permutation of an identity matrix. With this formulation of the multiplexing, it is relevant to introduce, in alternative embodiments, other transform matrices than the row or column permuted identity matrix. A simple, yet relevant, transform for this purpose is the normalized Hadamard transform disclosed in "Orthogonal Designs; Quadratic Forms and Hadamard Matrices", A.V. Geramita and J. Seberry, Marcel Dekker, 1979. The Hadamard multiplexing will hold advantages over the packet hopping method. These advantages are explained in the following. One advantage is that whenever fewer than K packets are lost in the network there are no full erasures of any sample in any of the quantized prediction error signals. This advantage can be exploited when the Hadamard transform is combined with a nonlinear preprocessing and estimation scheme as described below.
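The "no full erasure" property can be checked numerically with a small sketch. It assumes a Sylvester-type construction, which only yields Hadamard matrices whose order K is a power of two, and it leaves out quantization; the helper names `hadamard`, `matvec` and `transpose` are illustrative.

```python
import math


def hadamard(K):
    """Normalized Sylvester-type Hadamard matrix (K must be a power of two)."""
    H = [[1.0]]
    while len(H) < K:
        H = [row + row for row in H] + [row + [-x for x in row] for row in H]
    scale = 1.0 / math.sqrt(K)
    return [[x * scale for x in row] for row in H]


def matvec(M, v):
    """Matrix-vector product for lists of lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


M = hadamard(4)
q = [1.0, -2.0, 0.5, 3.0]            # inputs to the multiplexed quantizer
c = matvec(M, q)                     # transform coefficients, one per stream
q_roundtrip = matvec(transpose(M), c)

c_received = list(c)
c_received[2] = 0.0                  # the third packet stream is lost
q_lossy = matvec(transpose(M), c_received)
```

Here `q_lossy` is [2.125, -0.875, -0.625, 1.875]: every sample is perturbed by the same magnitude (1.125), but none is replaced by zero as it would be under packet hopping.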
Let us assume the elements of $q_n$ to be uncorrelated and neglect the impact of quantization noise. Then the matrix $M_n^T$ is the linear minimum mean-squared error estimator for $q_n$ given the coefficient vector $c_n$. This estimator is the mean-square optimum for Gaussian $q_n$. However, the gain-scaled linear prediction errors for voiced speech signals are known to be non-Gaussian. Thus, a nonlinear estimator can result in lower mean-squared error. Indeed, we observed in preliminary experiments that nonlinear estimation could lead to a significant decrease of the mean-squared error. However, the nonlinear estimation led to very high computational complexity. Therefore we adopt an alternative method in which a well defined nonlinearity is applied to the input of the multiplexed quantizer. Knowledge of this nonlinearity can then subsequently be exploited to improve the reconstructed quantized prediction errors in the case of packet losses.
The general method suitable to use is to zero $K_z$ of the K inputs to the multiplexed quantizer prior to applying the transform $M_n$. Advantageously, the $K_z$ inputs with lowest amplitudes are set to zero. In the method, no information about the position of zero-valued elements in $q_n$ is conveyed to the decoder; only the knowledge that $K_z$ of the elements were zero is exploited. The method is described below. Suppose that the number of lost packet streams is $K_{lps}$ and the lost packet streams are indexed by an integer set $k_{lps}$. Then we may formulate a set of equations relating the received coefficients $\tilde{c}_n$ to the encoded scaled prediction errors $q_n$:
$q_n = M_n^T \tilde{c}_n + M_n^T(:, k_{lps})\, a$ (4)
In this equation a is a $K_{lps}$-dimensional unknown vector. We have used a Matlab-style notation $M_n^T(:, k_{lps})$ to denote a matrix consisting of the columns of $M_n^T$ that are indexed by $k_{lps}$.
Now assume that $q_n$ had $K_{lps}$ zero-valued elements indexed by the set $k_{z_i}$: $q_n(k_{z_i}) = 0$, then
$a = -\left[ M_n^T(k_{z_i}, k_{lps}) \right]^{-1} M_n^T(k_{z_i}, :)\, \tilde{c}_n$
provided that $M_n^T(k_{z_i}, k_{lps})$ has full rank. Furthermore,
$a = c_n(k_{lps})$ (5)
The indexing for the zero-valued elements of $q_n$ is not known by the decoder; however, there are
$C_1 = \binom{K}{K_{lps}}$
ways in which the decoder can assume $K_{lps}$ elements of $q_n$ to be zero. Of these
$C_2 = \binom{K_z}{K_{lps}}$
will be true assumptions. Whenever $K_z > K_{lps}$ there are multiple true assumptions and all true assumptions will result in the same value for a, i.e., the one given in Equation 5. Thus, the method applicable in the decoder is to calculate a for all $C_1$ possible choices of $k_{z_i}$ and select the a vector that occurred $C_2$ times. Hereafter $q_n$ is obtained as the right-hand side of Equation 4.
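For the single-loss case the decoder procedure reduces to scalar arithmetic, which the following sketch illustrates. It is not a full implementation: it assumes exactly one lost packet stream, a 4-by-4 Sylvester-type normalized Hadamard matrix, and no quantization noise, and the function names `zero_smallest` and `estimate_lost` are hypothetical. Candidate values of a are rounded before counting so that floating-point noise does not split equal values.

```python
import math
from collections import Counter


def hadamard(K):
    """Normalized Sylvester-type Hadamard matrix (K a power of two)."""
    H = [[1.0]]
    while len(H) < K:
        H = [row + row for row in H] + [row + [-x for x in row] for row in H]
    return [[x / math.sqrt(K) for x in row] for row in H]


def zero_smallest(q, K_z):
    """Nonlinear preprocessing: zero the K_z inputs of lowest amplitude."""
    order = sorted(range(len(q)), key=lambda r: abs(q[r]))
    out = list(q)
    for r in order[:K_z]:
        out[r] = 0.0
    return out


def estimate_lost(M, c_received, lost, K_z):
    """Recover the lost coefficient a and q_n when one stream is lost."""
    K = len(M)
    Mt = [list(col) for col in zip(*M)]
    # First term of Equation 4: M^T applied to the received coefficients.
    base = [sum(Mt[r][j] * c_received[j] for j in range(K)) for r in range(K)]
    # One candidate a per assumed zero position z (C_1 = K choices here).
    candidates = [round(-base[z] / Mt[z][lost], 9) for z in range(K)]
    c2 = math.comb(K_z, 1)  # number of true assumptions
    counts = Counter(candidates)
    a = next(value for value, count in counts.items() if count == c2)
    q_hat = [base[r] + Mt[r][lost] * a for r in range(K)]
    return a, q_hat


M = hadamard(4)
q = zero_smallest([1.0, -2.0, 0.5, 3.0], K_z=2)   # -> [0, -2, 0, 3]
c = [sum(m * x for m, x in zip(row, q)) for row in M]
c_received = list(c)
c_received[2] = 0.0                                # third stream lost
a, q_hat = estimate_lost(M, c_received, lost=2, K_z=2)
```

In this example the four candidate values are -2.5, 1.5, -2.5 and 3.5; the value -2.5 occurs twice, equals the lost coefficient as stated by Equation 5, and restores `q_hat` to [0, -2, 0, 3]. If no candidate occurs exactly C_2 times, the `next` call raises `StopIteration`, which is where a practical decoder would need a fallback.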
Rank deficiency of the matrix $M_n^T(k_{z_i}, k_{lps})$ limits the use of this method. For example, when $M_n$ equals the permuted identity matrix that follows from the packet hopping, our nonlinear preprocessing and estimation does not apply. In contrast, when $M_n$ is the normalized Hadamard transform, the method applies with no complications for $K_{lps} = 1$. For $K_{lps} > 1$, rank deficiency can occur for some of the possible choices of $k_{z_i}$. In this case heuristics must be introduced in the selection of a.
To verify the benefits of the present invention, a coding experiment was conducted in which 12 speech files, each containing two utterances, were jointly encoded by multiplexed predictive coders (K = 12). The speech files were encoded at 40 kbps and decoded after a percentage of the data packets had been randomly dropped. Random packet loss rates between 0% and 40% were simulated. As a reference system, a 64 kbps μ-law quantization and packet loss concealment (PLC) according to the ITU-T G.711 standard was used. The reference system was simulated for the same speech files and packet losses as the multiplexed predictive coders.
A preference test was conducted of the 40 kbps multiplexed predictive coder with packet hopping versus the reference system. In this test listeners were subjected to 24 utterances processed by the two systems in randomized order. For each utterance the listeners made a preference decision. The results are given in Table 3.
Packet loss rate        0%    10%   20%   30%   40%
Preference to MPC       35%   63%   77%   92%   97%
Preference to G.711     65%   37%   23%   8%    3%
Table 3.
We see from these results that multiplexed predictive coding was preferred over G.711 PLC for packet loss rates in the range from 10% to 40%. Thus, this implies that the multiplexed predictive coding system is more robust to packet losses than scalar quantization and packet loss concealment according to the G.711 standard. In addition, the multiplexed predictive coders can be designed to operate at significantly lower bit rates than the G.711 standard.