WO1995028770A1 - Adpcm signal encoding/decoding system and method - Google Patents
- Publication number
- WO1995028770A1 (application PCT/AU1995/000216)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- signal
- kalman filter
- output
- bit rate
- specialised
- Prior art date
Classifications
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/40—Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code
- H03M7/4006—Conversion to or from arithmetic code
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/3002—Conversion to or from differential modulation
- H03M7/3044—Conversion to or from differential modulation with several bits only, i.e. the difference between successive samples being coded by more than one bit, e.g. differential pulse code modulation [DPCM]
- H03M7/3046—Conversion to or from differential modulation with several bits only, i.e. the difference between successive samples being coded by more than one bit, e.g. differential pulse code modulation [DPCM] adaptive, e.g. adaptive differential pulse code modulation [ADPCM]
Definitions
- the present invention relates to ADPCM (Adaptive Differential Pulse Code Modulation) signal encoding systems and, in particular, to such systems used to encode speech signals to result in a variable bit rate digitised representation of the speech signals.
- ADPCM Adaptive Differential Pulse Code Modulation
- the invention also relates to the reverse decoding of such variable bit rate digitised representations in order to provide a reconstituted speech signal.
- the invention also is applicable to other audio signals such as music.
- the present invention finds applications in the encoding of speech signals in order to enable such speech to be stored. For example, many banks and like financial institutions routinely record telephone conversations of their market dealers in order to provide a permanent record of financial contractual obligations entered into verbally. Substantial savings in the volume of storage can be made if such speech is stored digitally and reconstituted as necessary by the reverse decoding.
- the coding techniques of the present invention also find application in ATM (Asynchronous Transfer Mode) communications applications where a variable bit rate in the coded output is not of great consequence since it is the average bit rate which is important in such applications.
- ADPCM encoding and decoding systems using a predictor which essentially comprises a Kalman filter are known. However, such systems have not found practical application because of the very substantial computational demands made by the Kalman filter predictor, even though the Kalman filter is known to be "optimal" as far as linear prediction is concerned.
- This is essentially achieved by use of a sub-optimal Kalman predictor utilising smoothing sample estimates and arithmetic coding to take advantage of the silent periods within speech utterances and to preserve ADPCM stability at low bit rates.
- an ADPCM audio encoding system to provide a variable bit rate digitised representation of audio signals, said system comprising a digital input for said audio signals connected to a positive input of a summer, the output of said summer being connected to a cascade connected quantizer and dequantizer, the output of said dequantizer forming an input to a 'specialised' Kalman filter having two outputs, the first of said outputs comprising a predicted output which is connected to a negative input of said summer, the other of said filter outputs comprising a smoothed output which is utilized by an autoregressive calculator means to modify the operation of said 'specialised' Kalman filter; and an arithmetic coder having its input connected to said quantizer output, and controlled by a variance estimator means operating on the output of said dequantizer, the output of said arithmetic coder comprising said variable bit rate digitised representation.
- the encoding system further comprises an analog-to-digital converter for receiving audio signals in analog form and converting said signals to digital form.
- audio transducer means to generate said analog audio signals from sound pressure waves.
- an ADPCM decoding system to provide a reconstituted audio signal from a variable bit rate digitised representation of an original audio signal, said system comprising a dequantizer having its input connected to receive a decoded form of said variable bit rate digitised representation, the output of said dequantizer being connected to a 'specialised' Kalman filter, the output of which comprises said reconstituted audio signal and is utilized by an autoregressive calculator means to modify the operation of said 'specialised' Kalman filter; and an arithmetic decoder, controlled by a variance estimator means operating on the output of said dequantizer, interposed between said variable bit rate digitised representation and said dequantizer to generate said decoded form of said variable bit rate digitised representation.
- the audio signals are speech signals.
- the invention further discloses an ADPCM audio encoding/decoding system comprising an encoding system described above, a digital memory storage means for receiving and storing said variable bit rate digitised representation, and a decoding system as described above.
- the invention yet further discloses a method for ADPCM audio encoding to provide a variable bit rate digitised representation of audio signals, said method comprising the steps of: filtering a digital audio signal by 'specialised' Kalman filter means; quantizing the filtered signal; and arithmetic coding the quantized signal to provide said variable bit rate digitised representation of said audio signal.
- the invention yet further discloses a method for ADPCM audio decoding to provide a reconstituted audio signal from a variable bit rate digitised representation of an original audio signal, the method comprising the steps of: arithmetic decoding of said digitised representation; dequantizing of said decoded representation; and filtering said dequantized signal by 'specialised' Kalman filter means to produce said reconstituted audio signal.
- Fig. 1 is a schematic block diagram of an ADPCM speech encoding system of the preferred embodiment
- Fig. 2 is a schematic block diagram of the ADPCM speech decoding system of the preferred embodiment.
- Fig. 3 shows a distributed communications arrangement.
- the encoding system 10 of the preferred embodiment has a microphone 1 at which speech signals are generated and passed to an A to D converter 2.
- a summer 3 is provided together with a quantizer 5 and dequantizer 6.
- the output of the quantizer 5 is input to an arithmetic coder 8 which is controlled, as indicated by dotted lines, by a variance estimator (VarEst) 9 which obtains its input from the dequantizer 6.
- VarEst variance estimator
- the output O/P of the arithmetic coder 8 is the encoded bit stream.
- the output, O/P, can be connected with a digital data memory store, such as CD-ROM, DAT or a disc-drive, or alternatively to a distributed communications network for broadcast.
- the output of the dequantizer 6 is input to a specialised Kalman filter 7 which has two outputs. One of these outputs is a predicted output P/O which is applied to the negative input of the summer 3. The other output is a smoothed output S/O.
- the smoothed output S/O is used by an autoregressive model coefficient calculator (ARcalc) 4 which is in turn used to adapt the specialised Kalman filter 7.
- ARcalc autoregressive model coefficient calculator
- the specialised Kalman filter 7 is again used. However, since the input I/P for the decoder constitutes a variable bit rate stream, this is applied to an arithmetic decoder 18 which is again controlled by a variance estimator (VarEst) 9.
- the input, I/P, can be from a memory store or from a broadcast network as discussed above.
- a dequantizer 16 is provided.
- the specialised Kalman filter 7 is again adapted by the ARcalc 4 and its output O/P is the reconstructed output which is connected to a loudspeaker 15.
- Speech signals have significant amounts of redundancy, or correlation between samples.
- the encoding/decoding problem is to represent efficiently the information contained in the speech signal in a digital form for storage or transmission over a channel. In order to do this, it is desirable to remove the redundancy from the samples to be stored or transmitted. Put another way, it is desirable to remove that which is predictable from the samples to be stored or transmitted.
- speech is typically modelled as a filtered white noise signal which is mathematically represented in a way which incorporates various filter coefficients.
- Forwards adaptation schemes transmit these filter coefficients to the decoder in a quantized form. Then both the encoder and decoder are able to use the quantized coefficients in the prediction process.
- In backwards adaptation, as in the present case, these filter coefficients are not transmitted. Instead, backwards adaptation is based on the availability of the reconstructed signal at the decoder, which can then be used to produce a set of filter coefficients for the prediction process. If the reconstructed signal is close to the input signal, then it is reasonable to expect that backwards adaptation will perform well. It is known that a Kalman filter can be used to provide a good prediction with coarsely quantized measurements. These predictions are based on Kalman filter state estimates which contain smoothed signal values up to the order of the Kalman filter. However, a filter order of 10 to 50 is typically required, and the computational cost is typically of the order of the cube of the filter order.
- a Kalman filter predictor which utilises an adequate filter order gives rise to a substantial computational burden.
- the computational cost of the update equations run every sample is in the vicinity of 120 MFLOPS.
- This computational burden can be reduced by use of a Kalman filter technique with reduced complexity.
- such an approach uses smoothing estimates in the predictor up to a relatively small lag of n samples. This approach utilises the consequence that most of the smoothing gain is to be obtained in the first few smoothing lags.
- Smoothing to n lags is mathematically equivalent to assuming that only the top left hand n x n block of the error covariance matrix is non-zero.
- the error covariance matrix arises because the calculated coefficients are not identical with the actual signal coefficients.
- a specialised form of error covariance sub-matrix can be assumed and a good approximation obtained by simply updating the n x n error covariance matrix in order to provide an appropriate smoothing.
- With a specialised Kalman filter maximum smoothing lag value as low as 4, a reconstituted speech signal which differs from the equivalent signal of the full 50th order Kalman filter by less than 0.2 dB in signal to noise ratio is obtainable.
- a computational load of only 0.8 MFLOPS is required compared to 120 MFLOPS for the 50th order Kalman filter.
- the modified or specialised Kalman filter also results in considerable subjective improvement as it practically eliminates the high frequency "hiss" introduced by large quantization errors. This is especially important for low bit rate signal encoding/decoding systems.
- an arithmetic encoder and corresponding decoder are used in conjunction with the quantizer and dequantizer. This arithmetic encoding/decoding gives a very large increase in the number of effective levels of the quantization. These levels are entropy coded via the arithmetic coder 8 (and its corresponding arithmetic decoder 18) based on the expected probability of the occurrence of each level. The probabilities are calculated using a distribution assumption, for example a Laplacian distribution assumption, and a variance estimate of the prediction difference signal which is input to the quantizer 5.
- a substantial advantage to flow from an effectively larger number of quantization levels is that it practically eliminates overload distortion and hence there is more information present in the quantized data to assist in predictor adaptation and the estimate of the variance.
- Because the arithmetic coder 8 and decoder 18 perform entropy coding based on the expected probabilities of occurrence, the expected silence periods within speech utterances are able to be utilised to advantage.
- the above described preferred embodiment gives good quality reconstituted speech at an average bit rate of 16 kbps. Similarly, good quality speech at low bit rates in the vicinity of 8 kbps is achievable.
- a plurality of the encoding systems 10 and a plurality of decoding systems 20 can be arranged in a distributed communications system 30 as shown in Fig. 3.
- the system 30 has a plurality of sites 25, which can have transmit-only or receive-only capability, or a combination of both. Whilst a ring network configuration is shown, other network configurations are contemplated, including hub.
- Embodiments of the invention can be implemented as software coding alone.
- the encoder takes the speech signal as an input and produces a variable rate compressed digital signal ($I_k$) for storage or transmission (see Figure 1).
- the speech input ($S_k$) to the encoder will be in the form of a (high) fixed bit rate digital signal obtained from a microphone in cascade with an analog-to-digital (A/D) converter or from some other means such as another codec (e.g. PCM).
- the prediction output of the specialised Kalman filter ($\hat{S}_{k|k-1}$) is subtracted from the speech signal ($S_k$) to produce a difference signal ($\tilde{S}_k$),
- the quantiser which is represented by the Q block.
- the quantiser may take many forms including (but not restricted to) uniform, non-uniform, adaptive, etc.
- the output of the quantiser ($Z_k$) is a trivially coded version of the difference signal ($\tilde{S}_k$) and is mathematically specified as
- the quantiser output is then applied to two other blocks, the arithmetic coder (AC) and the dequantiser ($Q^{-1}$).
- the dequantiser converts the coded signal $Z_k$ back to a real value ($Y_k$) which represents the quantised difference signal,
- the quantised difference signal $Y_k$ then drives the specialised Kalman filter (KF) which
- the specialised Kalman filter parameters are backwards adapted by virtue of the fact that the adaptation is performed based on the reconstructed speech signal ($\hat{S}_k$).
- the block ARcalc is driven by the reconstructed speech signal and computes the coefficients of the all-pole or auto-regressive (AR) signal model which in turn is used to adapt the specialised Kalman filter.
- the arithmetic coder takes the trivially coded difference signal ($Z_k$) and converts it to a variable rate entropy coded bit stream ($I_k$).
- the arithmetic coding is adapted to cope with the nonstationary statistics and for this particular application, the adaptation was performed based on the short term variance (although it could be based on some other quantity).
- the block VarEst computes the short term variance of the quantised difference signal ($Y_k$) and this quantity is used to adapt the arithmetic coder.
- the specialised Kalman filter is a simplified version of a full Kalman filter based on an all-pole signal model. It is also an extension of the standard linear predictor that is commonly used in speech coding. Mathematical Details
- the full Kalman filter and the specialised Kalman filter are based on the all-pole signal model (also referred to as an auto-regressive (AR) signal model).
- the all-pole speech signal model is specified by _
- Equation (5) can also be written as
- the state vector $x_k$ consists of N samples, from the kth sample back to the (k − N + 1)th sample.
- $K_k$ is the Kalman gain vector, given by
- $K_k = P_k H^{T} (H P_k H^{T} + R_k)^{-1}$ (11)
- the Kalman filter predicted output is given by
- the state vector estimate for the Kalman Filter is of the form
- $K_k^{i}$ is the ith entry of the Kalman gain vector $K_k$.
- the Kalman filter takes account of the measurement (quantisation) noise in the signal coding situation, by recognising that the reconstructed sample ($\hat{S}_k$) is not a perfect representation of the input speech sample $S_k$.
- the Kalman Filter exploits the correlation between samples, given by the all-pole signal model, to obtain smoothed estimates of the input samples for use in the prediction.
- Smoothing theory specifies that most of the smoothing gain is to be obtained in the first few smoothing lags for this particular problem. Since this is the case, most of the advantage of full Kalman filtering can be obtained by smoothing to the first few n lags (say up to 5 lags), rather than continuing to the full order (eg., 49 lags for a 50th order signal model).
- the reduced complexity Kalman Filter state estimate has the form
- the sub-optimal smoothed estimate is read out of the state vector and mathematically, this is specified by where $H_j$ is defined as the (j + 1)th row of an N x N identity matrix.
- The Kalman Filter in the KF-AC-ADPCM system also results in considerable subjective improvement, as it practically eliminates the high frequency "hiss" introduced by the large quantisation errors. This is especially important for low bit rate signal coding systems.
- the subjective improvement is in fact far greater than the objective SNR measures would indicate from above.
- the specialised Kalman filter utilises backward adaptation in its operation. This involves calculating the filter coefficients from the reconstructed speech signal ($\hat{S}_k$) rather than from the original speech signal ($S_k$), as would be the case in forward adaptation.
- the block ARcalc is driven by the reconstructed speech signal and computes the coefficients of the all-pole or auto-regressive (AR) signal model which in turn is used to adapt the specialised Kalman filter.
- This is a well known procedure, particularly in the Low Delay Code-word Excited Linear Prediction systems (LD-CELP) such as the CCITT G.728 16kbit/sec standard.
- LD-CELP Low Delay Code-word Excited Linear Prediction systems
- the speech signal is broken up into short segments (20-200 samples) and an assumption is made that the signal is stationary over that period, i.e. it is assumed to be piecewise stationary.
- the "optimal" set of AR coefficients can be computed and these are used in the specialised Kalman filter for the duration of the segment.
- Arithmetic Coding is a practically optimal entropy coding scheme that is used here to encode the quantiser output with only the number of bits required by information theoretic considerations, based on the probability of that quantisation level being used.
- the prime difference between arithmetic coding and the more common Huffman coding is that arithmetic coding does not suffer from the disadvantage of requiring each source symbol to be encoded with an integral number of bits. This is particularly advantageous for highly peaked probability distributions as in this case.
- the arithmetic coder takes the trivially coded difference signal ($Z_k$) and converts it to a variable rate entropy coded bit stream ($I_k$). Mathematically, this can be specified by
- the arithmetic coding is adapted to cope with the nonstationary statistics and for this particular application, the adaptation was performed based on the short term variance (although it could be based on some other quantity).
- the block VarEst computes the short term variance of the quantised difference signal ($Y_k$) and uses this quantity to adapt the arithmetic coder.
- the quantised difference signal is allocated to various bins based on the magnitude of the short term variance $\sigma^2$.
- a probability distribution is then either assumed (e.g. Laplacian) or calculated (e.g. using test sentences to compute a look-up table) for each bin and this is then used to arithmetically code the quantised difference signal ($Y_k$).
- Perceptual weighting is a commonly used technique in many speech coding applications. It is used to improve the subjective quality of the decoded (reconstructed) speech by utilising known properties of the human auditory response. It is known that the human hearing is less sensitive to coding distortion in frequency bands where the energy is greatest due to the masking effect of the human ear. Perceptual weighting is found to give significant subjective performance improvements in the KF-AC-ADPCM system.
- Perceptual weighting is the technique of adding appropriate filtering in order to redistribute the coding distortion energy in approximately the same distribution as the speech signal.
- the output of the encoder ($I_k$) is a variable rate bit stream produced by the arithmetic coding block. These data are then either transmitted over a digital telecommunications channel (e.g. an Asynchronous Transfer Mode (ATM) network) or stored on some form of digital storage device (e.g. hard disk).
- ATM Asynchronous Transfer Mode
- the input bit stream to the decoder will be referred to as $\hat{I}_k$.
- $\hat{I}_k = I_k$
- a practical codec must be able to cope with the non-ideal situation that occurs when bit errors are introduced ($\hat{I}_k \neq I_k$) and special techniques (such as periodic resynchronisation) are added in order to achieve this.
- the encoder output bit stream ($I_k$) is buffered in order to achieve the fixed rate average value. This necessarily introduces an additional encoding delay.
- FIG. 2 is a block diagram of the decoder.
- the decoder converts the received compressed variable rate bit stream ($\hat{I}_k$) to the decoder reconstructed speech signal ($\hat{S}_k$).
- This reconstructed speech signal is in a similar form (fixed rate digital signal) to the original speech signal $S_k$.
- the reconstructed signal ($\hat{S}_k$) is then either connected to a cascade of a digital-to-analog (D/A) converter and speaker or to some other device.
- D/A digital-to-analog
- the decoder first of all arithmetic decodes the variable rate bit stream ($\hat{I}_k$) to produce the trivially coded signal at the decoder ($\hat{Z}_k$) and this is represented by the arithmetic decoder block ($AC^{-1}$).
- the output of the arithmetic decoder ($\hat{Z}_k$) is then converted to a decoder quantised difference signal ($\hat{Y}_k$) by the dequantiser block $Q^{-1}$ and this can be written as
- the dequantiser block is identical to the one at the encoder.
- the arithmetic decoder is adapted in an equivalent way to the arithmetic coder, i.e. a variance estimate of the decoder quantised difference signal ($\hat{Y}_k$) is used to adapt the arithmetic decoder.
- the VarEst block is identical to that of the encoder.
- the specialised Kalman filter is identical in form to that of the encoder.
- the specialised Kalman filter is backward adapted in an equivalent manner to the encoder.
- Table 2: SNR and segSNR comparison between G.728 LD-CELP and KF-AC-ADPCM operating at 16 kb/s
- Table 2 shows values of SNR and segSNR for 16 kb/s LD-CELP and KF-AC-ADPCM operating at an average bit rate of 16 kb/s, when tested on the sentence "Cats and dogs each hate the other".
- KF-AC-ADPCM at an average of 16 kb/s has higher SNR and segSNR figures than LD-CELP.
- the informal listening tests confirm that the subjective performance is significantly superior to that of LD-CELP.
- the informal listening tests also indicate that the KF-AC-ADPCM subjective quality at an average bit rate of 12 kb/s is equal to that of 16 kb/s LD-CELP.
- the KF-AC-ADPCM subjective quality is slightly inferior to LD-CELP, but nevertheless is still very good.
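The SNR and segSNR figures referred to in the comparison above are standard objective measures. The following is a minimal sketch of how they are commonly computed, assuming the usual definitions (overall signal-to-noise ratio in dB, and the mean of per-frame SNR values in dB); the function names, the frame length of 128 samples and the test data are hypothetical and are not taken from the patent.

```python
import math


def snr_db(original, reconstructed):
    """Overall SNR in dB: total signal energy over total error energy."""
    signal = sum(x * x for x in original)
    noise = sum((x - y) ** 2 for x, y in zip(original, reconstructed))
    return 10.0 * math.log10(signal / noise)


def seg_snr_db(original, reconstructed, frame=128):
    """Segmental SNR: mean of the per-frame SNR values in dB.

    The frame length is an assumed value; frames with zero signal or
    zero error energy are skipped to avoid taking log of 0 or infinity.
    """
    values = []
    for start in range(0, len(original) - frame + 1, frame):
        o = original[start:start + frame]
        r = reconstructed[start:start + frame]
        signal = sum(x * x for x in o)
        noise = sum((x - y) ** 2 for x, y in zip(o, r))
        if signal > 0.0 and noise > 0.0:
            values.append(10.0 * math.log10(signal / noise))
    return sum(values) / len(values)


# Toy data: one loud frame and one quiet frame with the same error level.
original = [1.0] * 128 + [0.1] * 128
reconstructed = [x + 0.01 for x in original]

overall = snr_db(original, reconstructed)
segmental = seg_snr_db(original, reconstructed)
```

Because segSNR averages in the dB domain, the quiet frame pulls it well below the overall SNR in this toy case, which is why the two figures are usually reported together for speech coders.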
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
An ADPCM signal encoding/decoding system is disclosed. A digital input signal (I/P) is input to a summer (3), the output signal of which is quantized (5) and output to an arithmetic coder (8). The quantized signal is dequantized (6) and passed to a specialized Kalman filter (7). An output of the Kalman filter (7) is applied to the summer (3). The Kalman filter (7) is adapted by an autoregressive model coefficient calculator (4). The Kalman filter (7) is specialised in the sense that there is smoothing of the estimates in the predictor up to a relatively small lag of n full-order samples. The arithmetic coder (8) is adapted by a variance estimator (9) so that arithmetic coding is performed using only the number of bits required by information theoretic considerations, based on the probability of that quantization level being used. Decoding of audio signals so encoded operates essentially in a converse manner.
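The abstract's statement that each quantization level is coded with only the number of bits its probability warrants can be illustrated numerically. The following is a minimal sketch, assuming a uniform quantiser and a Laplacian model of the prediction difference signal (the distribution the description gives as an example); the function names, the step size and the two standard deviations are hypothetical values chosen for illustration.

```python
import math


def laplace_cdf(x, sigma):
    """CDF of a zero-mean Laplacian with standard deviation sigma."""
    b = sigma / math.sqrt(2.0)  # Laplacian scale parameter
    if x < 0.0:
        return 0.5 * math.exp(x / b)
    return 1.0 - 0.5 * math.exp(-x / b)


def level_probability(level, step, sigma):
    """Probability that a sample falls in the cell of a uniform quantiser level."""
    lo = (level - 0.5) * step
    hi = (level + 0.5) * step
    return laplace_cdf(hi, sigma) - laplace_cdf(lo, sigma)


def ideal_bits(level, step, sigma):
    """Information content of a level, the cost an ideal entropy coder approaches."""
    return -math.log2(level_probability(level, step, sigma))


STEP = 0.05  # assumed quantiser step size

# Cost of coding level 0 under a small and a large variance estimate.
quiet = ideal_bits(0, STEP, sigma=0.005)  # near-silent segment
loud = ideal_bits(0, STEP, sigma=0.2)     # active speech segment

# Sanity check: the level probabilities form a distribution.
total = sum(level_probability(i, STEP, 0.2) for i in range(-200, 201))
```

Under these assumed values the zero level costs well under a hundredth of a bit in the near-silent segment and more than two bits in the active segment, which is the kind of saving a variance-adapted arithmetic coder can exploit during silences.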
Description
ADPCM Signal Encoding/Decoding System and Method
Technical Field of the Invention
The present invention relates to ADPCM (Adaptive Differential Pulse Code Modulation) signal encoding systems and, in particular, to such systems used to encode speech signals to result in a variable bit rate digitised representation of the speech signals. The invention also relates to the reverse decoding of such variable bit rate digitised representations in order to provide a reconstituted speech signal. The invention also is applicable to other audio signals such as music.
Background Art
The present invention finds applications in the encoding of speech signals in order to enable such speech to be stored. For example, many banks and like financial institutions routinely record telephone conversations of their market dealers in order to provide a permanent record of financial contractual obligations entered into verbally. Substantial savings in the volume of storage can be made if such speech is stored digitally and reconstituted as necessary by the reverse decoding. The coding techniques of the present invention also find application in ATM (Asynchronous Transfer Mode) communications applications where a variable bit rate in the coded output is not of great consequence since it is the average bit rate which is important in such applications. ADPCM encoding and decoding systems using a predictor which essentially comprises a Kalman filter are known. However, such systems have not found practical application because of the very substantial computational demands made by the Kalman filter predictor, even though the Kalman filter is known to be "optimal" as far as linear prediction is concerned.
There are four aspects required to be traded-off in implementing a low bit rate speech coding system: reconstructed speech quality, average bit rate, encoding/decoding delay and computational complexity. It would be desirable to be able to realise a system that allows these trade-offs to be made with flexibility, even in the course of operation, to allow specific application requirements to be met.
Disclosure of the Invention
It is the object of the present invention to provide such an encoding/decoding system, and the corresponding encoding and decoding methods, which enable some of the virtues of Kalman filtering to be obtained but at very low bit rates and without the vice of substantial computational loads. This is essentially achieved by use of a sub-optimal Kalman predictor utilising smoothing sample estimates and arithmetic coding to take advantage of the silent periods within speech utterances and to preserve ADPCM stability at low bit rates.
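To make the sub-optimal predictor more concrete, the sketch below evaluates the Kalman gain expression (equation (11) in the Definitions above) when only the top left n x n block of the N x N error covariance matrix is non-zero, as the Definitions describe for smoothing to n lags. It is an illustration only: it assumes that H is the first row of the N x N identity matrix (so the measurement is the current sample and the noise variance R is a scalar), and the function name and all numerical values are hypothetical.

```python
def kalman_gain(p_block, r, order):
    """Gain K = P H^T (H P H^T + R)^-1 with H = [1, 0, ..., 0] (assumed).

    p_block is the n x n top-left block of the error covariance matrix;
    every other entry of the full order x order matrix is taken as zero.
    """
    n = len(p_block)
    innovation_var = p_block[0][0] + r  # H P H^T + R reduces to a scalar
    # P H^T is the first column of P, so only its first n entries are non-zero.
    gain = [p_block[i][0] / innovation_var for i in range(n)]
    return gain + [0.0] * (order - n)


# Hypothetical symmetric 4 x 4 covariance block (smoothing lag n = 4).
P4 = [
    [0.50, 0.20, 0.10, 0.05],
    [0.20, 0.40, 0.10, 0.05],
    [0.10, 0.10, 0.30, 0.05],
    [0.05, 0.05, 0.05, 0.20],
]

K = kalman_gain(P4, r=0.25, order=50)
```

With these assumptions only the first four gain entries are non-zero, so a state update driven by this gain touches 4 rather than 50 state entries per sample. The sketch covers the gain computation only, not the covariance update.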
In accordance with the first aspect of the present invention there is disclosed an ADPCM audio encoding system to provide a variable bit rate digitised representation of audio signals, said system comprising a digital input for said audio signals connected to a positive input of a summer, the output of said summer being connected to a cascade connected quantizer and dequantizer, the output of said dequantizer forming an input to a 'specialised' Kalman filter having two outputs, the first of said outputs comprising a predicted output which is connected to a negative input of said summer, the other of said filter outputs comprising a smoothed output which is utilized by an autoregressive calculator means to modify the operation of said 'specialised' Kalman filter; and an arithmetic coder having its input connected to said quantizer output, and controlled by a variance estimator means operating on the output of said dequantizer, the output of said arithmetic coder comprising said variable bit rate digitised representation.
Preferably, the encoding system further comprises an analog-to-digital converter for receiving audio signals in analog form and converting said signals to digital form. There further can be provided audio transducer means to generate said analog audio signals from sound pressure waves.
In accordance with a second aspect of the present invention there is disclosed an ADPCM decoding system to provide a reconstituted audio signal from a variable bit rate digitised representation of an original audio signal, said system comprising a dequantizer having its input connected to receive a decoded form of said variable bit rate digitised representation, the output of said dequantizer being connected to a 'specialised' Kalman filter, the output of which comprises said reconstituted audio signal and is utilized by an autoregressive calculator means to modify the operation of said 'specialised' Kalman filter; and an arithmetic decoder, controlled by a variance estimator means operating on the output of said dequantizer, interposed between said variable bit rate digitised representation and said dequantizer to generate said decoded form of said variable bit rate digitised representation.
Preferably the audio signals are speech signals. The invention further discloses an ADPCM audio encoding/decoding system comprising an encoding system described above, a digital memory storage means for receiving and storing said variable bit rate digitised representation, and a decoding system as described above.
There further can be a plurality of the encoding systems and a plurality of the decoding systems connected to a distributed digital communications network.
The invention yet further discloses a method for ADPCM audio encoding to provide a variable bit rate digitised representation of audio signals, said method comprising the steps of: filtering a digital audio signal by 'specialised' Kalman filter means; quantizing the filtered signal; and arithmetic coding the quantized signal to provide said variable bit rate digitised representation of said audio signal.
The invention yet further discloses a method for ADPCM audio decoding to provide a reconstituted audio signal from a variable bit rate digitised representation of an original audio signal, the method comprising the steps of: arithmetic decoding of said digitised representation; dequantizing of said decoded representation; and filtering said dequantized signal by 'specialised' Kalman filter means to produce said reconstituted audio signal.
Brief Description of the Drawings
A preferred embodiment of the present invention will now be described with reference to the drawings in which:
Fig. 1 is a schematic block diagram of an ADPCM speech encoding system of the preferred embodiment;
Fig. 2 is a schematic block diagram of the ADPCM speech decoding system of the preferred embodiment; and
Fig. 3 shows a distributed communications arrangement.
Detailed Description and Best Mode of Performance
As seen in Fig. 1, the encoding system 10 of the preferred embodiment has a microphone 1 at which speech signals are generated and passed to an A to D converter 2. A summer 3 is provided together with a quantizer 5 and dequantizer 6. The output of the quantizer 5 is input to an arithmetic coder 8 which is controlled, as indicated by dotted lines, by a variance estimator (VarEst) 9 which obtains its input from the dequantizer 6. The output O/P of the arithmetic coder 8 is the encoded bit stream. The output, O/P, can be connected with a digital data memory store, such as CD-ROM, DAT or a disc-drive, or alternatively to a distributed communications network for broadcast. The output of the dequantizer 6 is input to a specialised Kalman filter 7 which has two outputs. One of these outputs is a predicted output P/O which is applied to the negative input of the summer 3. The other output is a smoothed output S/O. The smoothed output S/O is used by an autoregressive model coefficient calculator (ARcalc) 4 which is in turn used to adapt the specialised Kalman filter 7.
For the decoding system 20 as seen in Fig. 2, the specialised Kalman filter 7 is again used. However, since the input I/P for the decoder constitutes a variable bit rate stream, this is applied to an arithmetic decoder 18 which is again controlled by a variance estimator (VarEst) 9. The input, I/P, can be from a memory store or from a broadcast network as discussed above. A dequantizer 16 is provided. The specialised Kalman filter 7 is again adapted by the ARcalc 4 and its output O/P is the reconstructed output which is connected to a loudspeaker 15.
Speech signals have significant amounts of redundancy, or correlation between samples. The encoding/decoding problem is to represent efficiently the information contained in the speech signal in a digital form for storage or transmission over a channel. In order to do this, it is desirable to remove the redundancy from the samples to be stored or transmitted. Put another way, it is desirable to remove that which is predictable from the samples to be stored or transmitted.
For the purposes of analysis, speech is typically modelled as a filtered white noise signal which is mathematically represented in a way which incorporates various filter coefficients. Forwards adaptation schemes transmit these filter coefficients to the decoder in a quantized form. Then both the encoder and decoder are able to use the quantized coefficients in the prediction process.
However, in backwards adaptation as in the present case, these filter coefficients are not transmitted. Instead, backwards adaptation is based on the availability of the reconstructed signal at the decoder which can then be used to produce a set of filter coefficients for the prediction process. If the reconstructed signal is close to the input signal, then it is reasonable to expect that backwards adaptation will perform well. It is known that a Kalman filter can be used to provide a good prediction with coarsely quantized measurements. These predictions are based on Kalman filter state estimates which contain smoothed signal values up to the order of the Kalman filter. However, a filter order of from 10 to 50 typically is required, and the computational cost typically is of an order of magnitude which approximates to the cube power of the filter order. As a consequence, a Kalman filter predictor which utilises an adequate filter order gives rise to a substantial computational burden. For example, for Kalman filtering with a 50th order predictor, the computational cost of the update equations run every sample is in the vicinity of 120 MFLOPS.
This computational burden can be reduced by use of a Kalman filter technique with reduced complexity. Although sub-optimal, such an approach uses smoothing estimates in the predictor up to a relatively small lag of n samples. This approach utilises the consequence that most of the smoothing gain is to be obtained in the first few smoothing lags.
Smoothing to n lags is mathematically equivalent to assuming that only the top left hand n x n block of the error covariance matrix is non-zero. The error covariance matrix arises because the calculated coefficients are not identical with the actual signal coefficients. Thus a specialised form of error covariance sub-matrix can be assumed and a good approximation obtained by simply updating the n x n error covariance matrix in order to provide an appropriate smoothing. With a specialised Kalman filter maximum smoothing lag value as low as 4, a reconstituted speech signal which differs from the equivalent signal of the full 50th order Kalman filter by less than 0.2 dB in signal to noise ratio is obtainable. A computational load of only 0.8 MFLOPS is required compared to 120 MFLOPS for the 50th order Kalman filter.
Furthermore, the modified or specialised Kalman filter also results in considerable subjective improvement as it practically eliminates the high frequency "hiss" introduced by large quantization errors. This is especially important for low bit rate signal encoding/decoding systems. In order to overcome the problems of large quantization errors, an arithmetic encoder and corresponding decoder are used in conjunction with the quantizer and dequantizer. This arithmetic encoding/decoding gives a very large increase in the number of effective levels of the quantization. These levels are entropy coded via the arithmetic coder 8 (and its corresponding arithmetic decoder 18) based on the expected probability of the occurrence of each level. The probabilities are calculated using a distribution assumption, for example a Laplacian distribution assumption, and a variance estimate of the prediction difference signal which is input to the quantizer 5.
A substantial advantage to flow from an effectively larger number of quantization levels is that it practically eliminates overload distortion and hence there is more information present in the quantized data to assist in predictor adaptation and the estimate of the variance.
Furthermore, because the arithmetic coder 8 and decoder 18 are entropy coded based on the expected probabilities of occurrence, the expected silence periods within speech utterances are able to be utilised to advantage.
The above described preferred embodiment gives good quality reconstituted speech at an average bit rate of 16 kbps. Similarly, good quality speech at low bit rates in the vicinity of 8 kbps is achievable.
The foregoing description relates to speech signals, however the invention equally is applicable to other audio signals including music.
To aid the foregoing description, a further detailed mathematically-based description follows, referring again to Figs. 1 and 2.
In another embodiment, a plurality of the encoding systems 10 and a plurality of decoding systems 20 can be arranged in a distributed communications system 30 as shown in Fig. 3. The system 30 has a plurality of sites 25, which can have transmit-only capability, receive-only capability or a combination of both. Whilst a ring network configuration is shown, other network configurations are contemplated, including hub.
Embodiments of the invention can be implemented as software coding alone.
Encoder
The encoder takes the speech signal as an input and produces a variable rate compressed digital signal ($I_k$) for storage or transmission (see Figure 1). The speech input ($S_k$) to the encoder will be in the form of a (high) fixed bit rate digital signal obtained from a microphone in cascade with an analog-to-digital (A/D) converter or from some other means such as another codec (eg. PCM). The prediction output of the specialised Kalman filter ($\hat{S}_{k|k-1}$) is subtracted from the speech signal ($S_k$) to produce a difference signal ($\tilde{S}_k$),
$\tilde{S}_k = S_k - \hat{S}_{k|k-1}$. (1)
This signal is then applied to the quantiser which is represented by the Q block. The quantiser may take many forms including (but not restricted to) uniform, non-uniform, adaptive, etc. The output of the quantiser ($Z_k$) is a trivially coded version of the difference signal ($\tilde{S}_k$) and is mathematically specified as
$Z_k = Q[\tilde{S}_k]$. (2)
The quantiser output is then applied to two other blocks, the arithmetic coder (AC) and the dequantiser ($Q^{-1}$). The dequantiser converts the coded signal $Z_k$ back to a real value ($Y_k$) which represents the quantised difference signal,
$Y_k = Q^{-1}[Z_k] = Q^{-1}[Q[\tilde{S}_k]]$. (3)
The quantised difference signal Yk then drives the specialised Kalman filter (KF) which
The specialised Kalman filter parameters are backwards adapted by virtue of the fact that the adaptation is performed based on the reconstructed speech signal ($\hat{S}_k$). The block ARcalc is driven by the reconstructed speech signal and computes the coefficients of the all-pole or auto-regressive (AR) signal model which in turn is used to adapt the specialised Kalman filter.
The arithmetic coder takes the trivially coded difference signal (Zk) and converts it to a variable rate entropy coded bit stream (Ik). The arithmetic coding is adapted to cope with the nonstationary statistics and for this particular application, the adaptation was performed based on the short term variance (although it could be based on some other quantity). The block VarEst computes the short term variance of the quantised difference signal (Yk) and this quantity is used to adapt the arithmetic coder.
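The signal flow of equations (1) to (3) can be sketched as follows. This is a minimal illustration only, under stated assumptions: a fixed first-order predictor stands in for the specialised Kalman filter, the quantiser is a plain uniform one, and the arithmetic coder is omitted. The names `STEP`, `RHO`, `quantise`, `dequantise`, `encode` and `decode` are hypothetical and do not come from the specification.

```python
STEP = 0.1   # assumed uniform quantiser step size
RHO = 0.9    # assumed fixed predictor coefficient (stand-in for the Kalman filter)


def quantise(diff):
    """Q block: map a real difference value to an integer level index Z_k."""
    return int(round(diff / STEP))


def dequantise(level):
    """Q^-1 block: map a level index back to a real value Y_k."""
    return level * STEP


def encode(samples):
    """Return the level indices Z_k for the input samples S_k."""
    levels = []
    recon_prev = 0.0                   # previous reconstructed sample
    for s in samples:
        pred = RHO * recon_prev        # prediction formed from past output only
        diff = s - pred                # equation (1)
        z = quantise(diff)             # equation (2)
        y = dequantise(z)              # equation (3)
        recon_prev = pred + y          # same reconstruction the decoder will form
        levels.append(z)
    return levels


def decode(levels):
    """Rebuild the signal from the level indices, mirroring the encoder loop."""
    recon = []
    recon_prev = 0.0
    for z in levels:
        pred = RHO * recon_prev
        recon_prev = pred + dequantise(z)
        recon.append(recon_prev)
    return recon


samples = [0.0, 0.31, 0.58, 0.74, 0.69, 0.42, 0.05, -0.33]
decoded = decode(encode(samples))
max_err = max(abs(s - r) for s, r in zip(samples, decoded))
print(max_err <= STEP / 2 + 1e-12)
```

Because the encoder predicts from its own reconstruction rather than from the input, the decoder can follow it exactly, and in this sketch the reconstruction error never exceeds half a quantiser step.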
Specialised Kalman filter
The specialised Kalman filter is a simplified version of a full Kalman filter based on an all-pole signal model. It is also an extension of the standard linear predictor that is commonly used in speech coding.
Mathematical Details
Both the full Kalman filter and the specialised Kalman filter are based on the all-pole signal model (also referred to as an auto-regressive (AR) signal model). The all-pole speech signal model is specified by
$S_k = \sum_{i=1}^{N} a_i S_{k-i} + W_k$, (5)
where $S_k$ is the speech signal, $W_k$ represents the excitation sequence (white noise), and $a_i$ are the all-pole model coefficients. Equation (5) can also be written as
$H = [1, 0, \ldots, 0]$.
In this state-space formulation, the state vector, $x_k$, consists of N samples, from the kth sample back to the (k - N + 1)th sample.
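One common way of putting an all-pole model into state-space form is sketched below. It assumes a companion-form transition matrix F with the AR coefficients in its first row, an input vector G = [1, 0, ..., 0] transposed, and the sign convention in which the coefficients are added; these are assumptions made for illustration. The helper names `companion` and `step` are hypothetical. The check confirms that reading the state with H = [1, 0, ..., 0] reproduces the direct AR recursion.

```python
def companion(a):
    """Build F (N x N): AR coefficients in the first row, a shift register below."""
    size = len(a)
    f = [[0.0] * size for _ in range(size)]
    f[0] = list(a)
    for i in range(1, size):
        f[i][i - 1] = 1.0
    return f


def step(f, x, w):
    """x_{k+1} = F x_k + G w, with G assumed to be [1, 0, ..., 0] transposed."""
    nxt = [sum(fij * xj for fij, xj in zip(row, x)) for row in f]
    nxt[0] += w
    return nxt


a = [0.5, -0.2, 0.1]                       # illustrative AR coefficients
excitation = [1.0, 0.0, 0.5, -0.3, 0.2, 0.0]

# Direct AR recursion: S_k = sum_i a_i S_{k-i} + W_k
direct = []
for k, w in enumerate(excitation):
    past = sum(a[i] * direct[k - 1 - i] for i in range(len(a)) if k - 1 - i >= 0)
    direct.append(w + past)

# The same signal through the state-space model; H = [1, 0, ..., 0] picks x[0].
f = companion(a)
x = [0.0] * len(a)
via_state = []
for w in excitation:
    x = step(f, x, w)
    via_state.append(x[0])

same = all(abs(p - q) < 1e-12 for p, q in zip(direct, via_state))
print(same)
```

After each step the state holds the newest sample in its first entry and progressively older samples below it, which is what later allows smoothed estimates of past samples to be read straight out of the state vector.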
The full Kalman Filter equations for the above all-pole speech signal model are:
$K_k = P_k H^T (H P_k H^T + R_k)^{-1}$ (11)
and
$P_{k+1} = F P_k F^T - F P_k H^T (H P_k H^T + R_k)^{-1} H P_k F^T + Q_k$, (12)
is a Riccati difference equation (RDE) which recursively calculates the error covariance matrix. The quantity
Rk = E[ ] = σlk (13)
is the measurement noise variance related to the quantised prediction error signal, Yk = Q[Sk
$\hat{S}_{k|k-1} = H \hat{x}_{k|k-1}$. (15)
The state vector estimate for the Kalman Filter is of the form
Thus the previous speech sample estimates are present in the state vector to various fixed-lag smoothed values. It is then a simple matter of reading out the smoothed sample from this state vector. Mathematically, the smoothed output is specified by
$\hat{S}_{k-j|k} = H^j \hat{x}_{k|k}$; j = 0, 1, ..., N - 1, (17)
where $H^j$ is defined as the (j + 1)th row of an N x N identity matrix. Another way of writing (9), (10) and (16) is
$\hat{S}_{k|k} = \hat{S}_{k|k-1} + K_k^1 Y_k$
$\hat{S}_{k-1|k} = \hat{S}_{k-1|k-1} + K_k^2 Y_k$
$\vdots$
$\hat{S}_{k-N+1|k} = \hat{S}_{k-N+1|k-1} + K_k^N Y_k$,
where $K_k^i$ is the ith entry of the Kalman gain vector $K_k$.
The Kalman filter takes account of the measurement (quantisation) noise in the signal coding situation, by recognising that the reconstructed sample ($\hat{S}_k$) is not a perfect representation of the input speech sample $S_k$. The Kalman Filter exploits the correlation between samples, given by the all-pole signal model, to obtain smoothed estimates of the input samples for use in the prediction.
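A single recursion of the full Kalman filter for this model can be sketched as below. It is an illustration under assumptions: H = [1, 0, ..., 0], so that H P H^T reduces to the top left entry of P; the process noise matrix is taken to be zero except for a variance `q` in its top left entry; and the quantised difference Y_k is used as the innovation. The function names and numerical values are hypothetical.

```python
def matmul(a, b):
    """Plain-Python matrix product."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def kalman_step(f, p, x_pred, y, r, q):
    """One measurement update and time update for H = [1, 0, ..., 0].

    f: transition matrix, p: error covariance P_k, x_pred: predicted state,
    y: quantised difference Y_k, r: measurement noise variance R_k,
    q: excitation variance (assumed to be the only non-zero entry of Q_k).
    """
    size = len(x_pred)
    denom = p[0][0] + r                               # H P H^T + R
    gain = [p[i][0] / denom for i in range(size)]     # equation (11)
    # Every entry of the state is corrected, so older samples are smoothed.
    x_filt = [x_pred[i] + gain[i] * y for i in range(size)]
    # P - K H P, then propagate: together these give equation (12).
    p_filt = [[p[i][j] - gain[i] * p[0][j] for j in range(size)]
              for i in range(size)]
    p_next = matmul(matmul(f, p_filt), transpose(f))
    p_next[0][0] += q
    x_next = [sum(f[i][j] * x_filt[j] for j in range(size)) for i in range(size)]
    return x_filt, x_next, p_next, gain


f = [[0.9, 0.0], [1.0, 0.0]]          # second-order example, a = [0.9, 0.0]
p = [[1.0, 0.0], [0.0, 1.0]]
x_filt, x_next, p_next, gain = kalman_step(f, p, [0.0, 0.0], y=1.0, r=1.0, q=0.1)
prediction = x_next[0]                # H x_{k+1|k}, the next predicted sample
```

With these illustrative inputs the gain is [0.5, 0.0], so only half of the quantised difference is trusted, which reflects the measurement noise variance being as large as the prior uncertainty.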
Smoothing theory specifies that most of the smoothing gain is to be obtained in the first few smoothing lags for this particular problem. Since this is the case, most of the advantage of full Kalman filtering can be obtained by smoothing to the first few n lags (say up to 5 lags), rather than continuing to the full order (eg., 49 lags for a 50th order signal model).
The reduced complexity (and consequently sub-optimal) specialised Kalman Filter approach uses smoothed estimates in the filter up to a relatively small lag of n samples. Our sub-optimal prediction is based on
$\hat{S}_{k-1|k} = \hat{S}_{k-1|k-1} + K_k^2 Y_k$
The reduced complexity Kalman Filter state estimate has the form
Compare this with the full Kalman Filter shown above, and it can be seen that we now have a "hybrid" approach and this is what we refer to as the specialised Kalman filter. The reconstructed speech signal can be obtained from the sub-optimal smoothed estimate via ($\hat{S}_k = \hat{S}_{k-n|k}$). The sub-optimal smoothed estimate is read out of the state vector and mathematically, this is specified by
where $H^j$ is defined as the (j + 1)th row of an N x N identity matrix.
Smoothing to n lags only is equivalent to assuming that only the top left hand n x n block of the error covariance is non-zero. Thus, we have an error covariance matrix of the form
With $P_k$ of this form, the Riccati equation (12) gives
(24)
Table 1: SNR measures and MFLOPS for the Riccati difference equation
with $F^{11}$ and $Q_k^{11}$ defined as the top left n x n blocks from the F and $Q_k$ matrices, and $H^1$ defined as the vector of size n from the left of the H vector. In fact, for $P_k$ as in (22), the top left n x n block of $P_{k+1}$ is exactly $P_{k+1}^{11}$, and the only non-zero elements in $P_{k+1}$ are in the top left (n + 1) x (n + 1) block. If (22) is a good approximation for $P_k$ obtained during full Kalman Filtering, then we would expect the non-zero elements outside the n x n top left block of $P_{k+1}$ to be close to zero, and hence
$\begin{bmatrix} P_{k+1}^{11} & 0 \\ 0 & 0 \end{bmatrix}$ (25)
is a good approximation for $P_{k+1}$. We thus are able to reduce the computational cost of full Kalman Filtering, through a sub-optimal approach, by simply updating the n x n error covariance matrix $P_k^{11}$.
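The block structure described above can be checked numerically. The sketch below assumes a companion-form F, H = [1, 0, ..., 0], and a process noise matrix that is zero except for its top left entry; the matrices and values are illustrative and the helper names are hypothetical. It applies the Riccati difference equation (12) once to a full N x N covariance that is non-zero only in its top left n x n block, and once to the n x n blocks alone, then compares the results.

```python
def matmul(a, b):
    """Plain-Python matrix product."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def companion(a):
    """Companion-form transition matrix (assumed form of F)."""
    size = len(a)
    f = [[0.0] * size for _ in range(size)]
    f[0] = list(a)
    for i in range(1, size):
        f[i][i - 1] = 1.0
    return f


def riccati(f, p, r, q00):
    """Equation (12) for H = [1, 0, ..., 0]; Q assumed zero except entry (0, 0)."""
    size = len(p)
    denom = p[0][0] + r
    reduced = [[p[i][j] - p[i][0] * p[0][j] / denom for j in range(size)]
               for i in range(size)]
    nxt = matmul(matmul(f, reduced), transpose(f))
    nxt[0][0] += q00
    return nxt


N, n = 5, 2
F = companion([0.6, -0.3, 0.2, -0.1, 0.05])
P = [[0.0] * N for _ in range(N)]                 # zero outside the n x n block
P[0][0], P[0][1], P[1][0], P[1][1] = 1.0, 0.4, 0.4, 0.8

full = riccati(F, P, r=0.5, q00=0.1)              # N x N update
F11 = [row[:n] for row in F[:n]]
P11 = [row[:n] for row in P[:n]]
small = riccati(F11, P11, r=0.5, q00=0.1)         # n x n update only

block_matches = all(abs(full[i][j] - small[i][j]) < 1e-12
                    for i in range(n) for j in range(n))
outside = [(i, j) for i in range(N) for j in range(N)
           if (i > n or j > n) and abs(full[i][j]) > 1e-12]
print(block_matches, outside)
```

In this example the top left n x n block of the full update agrees with the small update, and no non-zero entries appear outside the top left (n + 1) x (n + 1) block, which is the property the approximation relies on. The saving comes from the small update involving only n x n products rather than N x N ones.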
Specialised Kalman Filter Results
Simulations using a 50th order (N = 50) linear predictor within an AC-ADPCM speech coding framework, have shown that even with an upper block as low as n x n in the specialised Kalman filter, the approach can give a reconstructed signal differing from the full order Kalman filter approach by less than 0.2 dB in SNR. The corresponding improvement over the standard linear predictor for this situation is 1.0 dB in SNR. These tabulated results correspond to the output from the variable bit rate ADPCM system, at an average of 1.5 bits/sample, or 12 kbps (at 8 kHz sampling rate).
In Table 1 we present the Signal to Noise Ratio (SNR) and the Segmental SNR (segSNR) for a speech sentence along with the computational cost of the Riccati equation in MFLOPS. The notation LP indicates the standard linear predictor, KF denotes the full 50th order Kalman Filter, and KFn indicates the specialised Kalman filter with an n x n upper block in the Riccati difference equation.
From the table, it is clear that the use of Kalman Filtering techniques can result in significant performance improvement, for extremely low additional complexity. Take particular note of the 0.5 dB improvement from using a first order Riccati equation. The extra computational complexity needed to obtain this improvement is negligible.
The use of the Kalman Filter in the KF-AC-ADPCM system also results in considerable subjective improvement, as it practically eliminates the high frequency "hiss" introduced by the large quantisation errors. This is especially important for low bit rate signal coding systems.
The subjective improvement is in fact far greater than the objective SNR measures above would indicate.
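For reference, the two objective measures quoted above can be computed as in the following sketch. The specification does not define segSNR, so the usual definition is assumed here: the average of the per-segment SNR values in dB. The segment length, the function names and the test signal are illustrative assumptions, and segments with zero energy or zero error are not handled.

```python
import math


def snr_db(ref, test):
    """Overall SNR in dB: signal energy over error energy."""
    signal = sum(x * x for x in ref)
    error = sum((x - y) ** 2 for x, y in zip(ref, test))
    return 10.0 * math.log10(signal / error)


def seg_snr_db(ref, test, seg_len):
    """Segmental SNR: mean of the SNR (in dB) of consecutive segments."""
    values = []
    for start in range(0, len(ref) - seg_len + 1, seg_len):
        values.append(snr_db(ref[start:start + seg_len],
                             test[start:start + seg_len]))
    return sum(values) / len(values)


# A loud segment followed by a quiet one, with the same small error throughout.
ref = [1.0, -1.0, 1.0, -1.0, 0.1, -0.1, 0.1, -0.1]
test = [x + 0.01 for x in ref]

overall = snr_db(ref, test)
segmental = seg_snr_db(ref, test, seg_len=4)
print(round(overall, 2), round(segmental, 2))
```

In this example the loud segment scores 40 dB and the quiet one 20 dB, so the segmental figure is 30 dB while the overall figure is about 37 dB. The segmental measure gives quiet passages equal weight, which is why both figures are usually reported together.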
Adaptation
The specialised Kalman filter utilises backward adaptation in its operation. This involves calculating the filter coefficients from the reconstructed speech signal ($\hat{S}_k$) rather than from the original speech signal ($S_k$), as would be the case in forward adaptation. The block ARcalc is driven by the reconstructed speech signal and computes the coefficients of the all-pole or auto-regressive (AR) signal model which in turn is used to adapt the specialised Kalman filter. This is a well known procedure, particularly in Low-Delay Code Excited Linear Prediction (LD-CELP) systems such as the CCITT G.728 16 kbit/s standard.
The speech signal is broken up into short segments (20-200 samples) and an assumption is made that the signal is stationary over that period, i.e. it is assumed to be piecewise stationary. For this segment, the "optimal" set of $a_i$ coefficients can be computed and these are used in the specialised Kalman filter for the duration of the segment.
There are many standard approaches to computing these $a_i$ coefficients. A common approach is as follows:
First of all a segment of the most recent data (reconstructed samples) is windowed using some form of recursive window. Next this windowed data is used to compute the auto-correlation coefficients. The Levinson-Durbin algorithm then efficiently computes the $a_i$ coefficients in a recursive fashion based on the auto-correlation coefficients.
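The auto-correlation and Levinson-Durbin steps just described can be sketched as follows. This is a minimal illustration: the windowing step is omitted, the predictor convention is assumed to be one in which the current sample is approximated by a weighted sum of past samples, and the function names are hypothetical.

```python
def autocorrelation(x, order):
    """Auto-correlation values r[0..order] of a (windowed) data segment."""
    return [sum(x[t] * x[t - lag] for t in range(lag, len(x)))
            for lag in range(order + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations for a_1..a_order.

    Returns (coefficients, residual_energy) for the predictor
    S_k ~ sum_i a_i S_{k-i}.
    """
    a = [0.0] * (order + 1)            # a[0] is unused
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] - sum(a[i] * r[m - i] for i in range(1, m))
        k = acc / err                  # reflection coefficient for this order
        new_a = a[:]
        new_a[m] = k
        for i in range(1, m):
            new_a[i] = a[i] - k * a[m - i]
        a = new_a
        err *= (1.0 - k * k)           # prediction error shrinks with each order
    return a[1:], err


# Auto-correlation of an ideal first-order process with coefficient 0.5.
coeffs, residual = levinson_durbin([1.0, 0.5, 0.25], order=2)
print(coeffs, residual)

# The same routine driven from reconstructed samples (illustrative data).
segment = [0.0, 0.4, 0.7, 0.6, 0.2, -0.3, -0.6, -0.5, -0.1, 0.3]
segment_coeffs, _ = levinson_durbin(autocorrelation(segment, 2), order=2)
```

For the ideal first-order input the recursion returns a first coefficient of 0.5 and a second coefficient of zero, as expected. The recursion needs on the order of N squared operations for an Nth order model, which is much cheaper than a general matrix inversion.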
Arithmetic Coding
Arithmetic Coding is a practically optimal entropy coding scheme, that is used here to encode the quantiser output with only the number of bits required by information theoretic considerations, based on the probability of that quantisation level being used. The prime difference between arithmetic coding and the more common Huffman coding is that it does not suffer from the disadvantage of requiring each source symbol to be encoded with an integral number of bits. This is particularly advantageous for highly peaked probability distributions as in this case.
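The advantage for peaked distributions can be illustrated numerically. The sketch below compares the entropy of a hypothetical four-level distribution, which is the rate an ideal arithmetic coder approaches, with the average code length of a Huffman code built for the same distribution. The probabilities and function names are assumptions chosen for illustration.

```python
import heapq
import math


def entropy_bits(probs):
    """Shannon entropy in bits per symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def huffman_lengths(probs):
    """Code length in bits assigned to each symbol by a Huffman code."""
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    tie_breaker = len(probs)           # keeps heap entries comparable
    while len(heap) > 1:
        p1, _, group1 = heapq.heappop(heap)
        p2, _, group2 = heapq.heappop(heap)
        for symbol in group1 + group2:
            lengths[symbol] += 1       # each merge adds one bit to its members
        heapq.heappush(heap, (p1 + p2, tie_breaker, group1 + group2))
        tie_breaker += 1
    return lengths


probs = [0.90, 0.05, 0.03, 0.02]       # highly peaked, e.g. mostly the zero level
lengths = huffman_lengths(probs)
huffman_avg = sum(p * n for p, n in zip(probs, lengths))
ideal = entropy_bits(probs)
print(lengths, huffman_avg, ideal)
```

Here the Huffman code cannot spend less than one bit on the most likely level, giving an average of 1.15 bits per symbol, whereas the entropy is roughly 0.62 bits per symbol. That gap is what makes average rates below one bit per sample reachable during quiet passages.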
The arithmetic coder takes the trivially coded difference signal (Zk) and converts it to a variable rate entropy coded bit stream (Ik). Mathematically, this can be specified by
Ik = AC [Zk] (26)
The arithmetic coding is adapted to cope with the nonstationary statistics and for this particular application, the adaptation was performed based on the short term variance (although it could be based on some other quantity). The block VarEst computes the short term variance of the quantised difference signal ($Y_k$) and uses this quantity to adapt the arithmetic coder. The short term variance is calculated by the recursion
$\sigma_{k+1}^2 = \alpha \sigma_k^2 + (1 - \alpha) Y_k^2$ (27)
where $\sigma_k^2$ is a measure of the short term variance and $\alpha$ is the leakage constant with a value between 0 and 1.
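A sketch of this variance tracker is given below, assuming the recursion takes the usual leaky-integrator form in which the previous estimate is scaled by the leakage constant and the squared quantised difference is scaled by one minus that constant. The constant 0.9, the starting value and the input sequence are illustrative assumptions, and the name `update_variance` is hypothetical.

```python
def update_variance(var, y, alpha):
    """One step of the leaky short-term variance recursion."""
    return alpha * var + (1.0 - alpha) * y * y


ALPHA = 0.9          # assumed leakage constant, between 0 and 1
var = 1.0            # assumed initial estimate
history = []
for y in [0.0, 0.0, 0.0, 2.0, 2.0, 2.0]:    # silence followed by activity
    var = update_variance(var, y, ALPHA)
    history.append(var)

print(history)
```

The estimate decays geometrically during the silent samples and climbs again once the difference signal becomes active. Since the recursion uses only the dequantised values, the decoder can run the identical update and stay in step with the encoder without side information.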
The quantised difference signal is allocated to various bins based on the magnitude of the short term variance σ2. A probability distribution is then either assumed (eg. Laplacian) or calculated (eg. using test sentences to compute a look-up table) for each bin and this is then used to arithmetically code the quantised difference signal (Yk).
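Where a Laplacian distribution is assumed, the level probabilities supplied to the arithmetic coder can be derived from the variance estimate roughly as follows. The sketch assumes a uniform quantiser with a given step size, integer level indices from minus to plus a maximum level, and outer levels that absorb the tails; the names `laplace_cdf` and `level_probabilities` and all numerical values are hypothetical.

```python
import math


def laplace_cdf(x, sigma):
    """CDF of a zero-mean Laplacian with standard deviation sigma."""
    b = sigma / math.sqrt(2.0)         # Laplacian scale: variance = 2 b^2
    if x < 0:
        return 0.5 * math.exp(x / b)
    return 1.0 - 0.5 * math.exp(-x / b)


def level_probabilities(sigma, step, max_level):
    """Probability of each quantiser level index under the Laplacian model."""
    probs = {}
    for z in range(-max_level, max_level + 1):
        lower = -math.inf if z == -max_level else (z - 0.5) * step
        upper = math.inf if z == max_level else (z + 0.5) * step
        probs[z] = laplace_cdf(upper, sigma) - laplace_cdf(lower, sigma)
    return probs


quiet = level_probabilities(sigma=0.05, step=0.1, max_level=3)
loud = level_probabilities(sigma=0.5, step=0.1, max_level=3)

# Ideal code length, in bits, of the zero level in each case.
bits_quiet = -math.log2(quiet[0])
bits_loud = -math.log2(loud[0])
print(bits_quiet, bits_loud)
```

When the variance estimate is small the zero level becomes highly probable and costs well under one bit, which is how silence periods within speech reduce the average bit rate.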
Perceptual Weighting
Perceptual weighting is a commonly used technique in many speech coding applications. It is used to improve the subjective quality of the decoded (reconstructed) speech by utilising known properties of the human auditory response. It is known that the human hearing is less sensitive to coding distortion in frequency bands where the energy is greatest due to the masking effect of the human ear. Perceptual weighting is found to give significant subjective performance improvements in the KF-AC-ADPCM system.
Perceptual weighting is the technique of adding appropriate filtering in order to redistribute the coding distortion energy in approximately the same distribution as the speech signal.
Transmission Channel or Storage
The output of the encoder ($I_k$) is a variable rate bit stream produced by the arithmetic coding block. These data are then either transmitted over a digital telecommunications channel (eg. Asynchronous Transfer Mode (ATM) network) or stored on some form of digital storage device (eg. hard disk). There is always the possibility of bit errors occurring during transmission or storage, and so the input bit stream to the decoder will be referred to as $I'_k$. In an ideal situation, $I'_k = I_k$. A practical codec must be able to cope with the non-ideal situation that occurs when bit errors are introduced ($I'_k \neq I_k$) and special techniques (such as periodic resynchronisation) are added in order to achieve this. For fixed rate channel applications, the encoder output bit stream ($I_k$) is buffered in order to achieve the fixed rate average value. This necessarily introduces an additional encoding delay.
Decoder
Figure 2 is a block diagram of the decoder. The decoder converts the received compressed variable rate bit stream ($I'_k$) to the decoder reconstructed speech signal ($\hat{S}'_k$). This reconstructed speech signal is in a similar form (fixed rate digital signal) to the original speech signal $S_k$. The reconstructed signal ($\hat{S}'_k$) is then either connected to a cascade of a digital-to-analog (D/A) converter and speaker or to some other device.
The decoder first of all arithmetic decodes the variable rate bit stream ($I'_k$) to produce the trivially coded signal at the decoder ($Z'_k$) and this is represented by the arithmetic decoder block ($AC^{-1}$). Mathematically we have
$Z'_k = AC^{-1}[I'_k]$. (28)
The output of the arithmetic decoder ($Z'_k$) is then converted to a decoder quantised difference signal ($Y'_k$) by the dequantiser block $Q^{-1}$ and this can be written as
$Y'_k = Q^{-1}[Z'_k]$. (29)
Note that the dequantiser block is identical to the one at the encoder. The arithmetic decoder is adapted in an equivalent way to the arithmetic coder, i.e. a variance estimate of the decoder quantised difference signal ($Y'_k$) is used to adapt the arithmetic decoder. The VarEst block is identical to that of the encoder. The decoder quantised difference signal ($Y'_k$) then drives the specialised Kalman filter to produce the decoder reconstructed signal ($\hat{S}'_k$). This reconstructed signal is taken to be the decoder smoothed output, i.e. $\hat{S}'_k = \hat{S}'_{k-n|k}$. The specialised Kalman filter is identical in form to that of the encoder. The specialised Kalman filter is backward adapted in an equivalent manner to the encoder.
Codec | SNR (dB) | segSNR (dB) |
---|---|---|
LD-CELP | 18.81 | 16.03 |
KF-AC-ADPCM | 28.03 | 17.82 |
Table 2: SNR and segSNR comparison between G.728 LD-CELP and KF-AC-ADPCM operating at 16 kb/s
KF-AC-ADPCM Results
It is difficult to present concrete performance results for speech coding since the ultimate performance criterion is subjective quality and this can only be properly evaluated by expensive formal listening tests. The common approach is to use objective measures such as SNR and segSNR in conjunction with informal listening tests. In addition, it is also usual to compare with an existing well known standard, eg. CCITT G.728 LD-CELP 16 kb/s codec.
Table 2 shows values of SNR and segSNR for 16 kb/s LD-CELP and KF-AC-ADPCM operating at an average bit rate of 16 kb/s, when tested on the sentence "Cats and dogs each hate the other".
Notice the KF-AC-ADPCM at an average of 16 kb/s has higher SNR and segSNR figures than LD-CELP. The informal listening tests confirm that the subjective performance is significantly superior to that of LD-CELP. The informal listening tests also indicate that the KF-AC-ADPCM subjective quality at an average bit rate of 12 kb/s is equal to that of 16 kb/s LD-CELP. At 8 kb/s, the KF-AC-ADPCM subjective quality is slightly inferior to LD-CELP, but nevertheless is still very good.
Claims
1. An ADPCM audio encoding system to provide a variable bit rate digitised representation of audio signals, said system comprising a digital input for said audio signals connected to a positive input of a summer, the output of said summer being connected to a cascade connected quantizer and dequantizer, the output of said dequantizer forming an input to a 'specialised' Kalman filter having two outputs, the first of said outputs comprising a predicted output which is connected to a negative input of said summer, the other of said filter outputs comprising a smoothed output which is utilized by an autoregressive calculator means to modify the operation of said specialised Kalman filter; and an arithmetic coder having its input connected to said quantizer output, and controlled by a variance estimator means operating on the output of said dequantizer, the output of said arithmetic coder comprising said variable bit rate digitised representation.
2. An encoding system as claimed in claim 1, further comprising an analog-to-digital converter for receiving audio signals in analog form and converting said signals to digital form.
3. An encoding system as claimed in claim 2, further comprising audio transducer means to generate said analog audio signals from sound pressure waves.
4. An encoding system as claimed in any one of the preceding claims, wherein said specialised Kalman filter utilises sub-optimal Kalman prediction by smoothing sample estimates only to a reduced number of lags relative to the full order.
5. An encoding system as claimed in any one of the preceding claims, wherein said arithmetic coder generates a variable rate entropy coded bit stream.
6. An encoding system as claimed in any one of the preceding claims, further comprising digital memory storage means for receiving and storing said variable bit rate digitised representation.
7. An encoding system as claimed in any of the preceding claims, wherein said audio signals are speech signals.
8. An ADPCM decoding system to provide a reconstituted audio signal from a variable bit rate digitised representation of an original audio signal, said system comprising a dequantizer having its input connected to receive a decoded form of said variable bit rate digitised representation, the output of said dequantizer being connected to a 'specialised' Kalman filter, the output of which comprises said reconstituted audio signal and is utilized by an autoregressive calculator means to modify the operation of said 'specialised' Kalman filter, and an arithmetic decoder, controlled by a variance estimator means operating on the output of said dequantizer, interposed between said variable bit rate digitised representation and said dequantizer to generate said decoded form of said variable bit rate digitised representation.
9. A decoding system as claimed in claim 8, further comprising transducer means for receiving said reconstituted audio signal to generate sound pressure waves.
10. A decoding circuit as claimed in either of claim 8 or claim 9, wherein said specialised Kalman filter utilises sub-optimal Kalman prediction by smoothing sample estimates only to a reduced number of lags relative to the full order.
11. A decoding system as claimed in any one of claims 8 to 10, wherein said audio signals are speech signals.
12. An ADPCM audio encoding/decoding system comprising: an encoding system as claimed in any one of claims 1 to 5; a digital memory storage means for receiving and storing said variable bit rate digitised representation; and a decoding system as claimed in any one of claims 8 to 10.
13. An ADPCM audio encoding/decoding system comprising: one or more encoding systems as claimed in any one of claims 1 to 5; a distributed digital communications network to which said one or more encoding systems are connected and over which ones of said variable bit rate digitised representations are transmitted; one or more decoding systems as claimed in any one of claims 8 to 10 connected to said network, each said decoding system selectively acting as a destination for said transmitted representations.
14. A method for ADPCM audio encoding to provide a variable bit rate digitized representation of audio signals, said method comprising the steps of: filtering a digital audio signal by 'specialised' Kalman filter means; quantizing the filtered signal; and arithmetic coding the quantized signal to provide said variable bit rate digitised representation of said audio signal.
15. A method as claimed in claim 14, wherein said filtering step includes dequantizing said quantized signal, filtering said dequantized signal by sub-optimal Kalman prediction by smoothing sample estimates only to a reduced number of lags relative to the full order to produce a predicted signal and subtracting the predicted signal from said digital audio signal.
16. A method as claimed in claim 15, whereby said step of arithmetic coding includes generating a variable rate entropy coded bit stream.
17. A method for ADPCM audio decoding to provide a reconstituted audio signal from a variable bit rate digitised representation of an original audio signal, the method comprising the steps of: arithmetic decoding of said digitised representation; dequantizing of said decoded representation; and filtering said dequantized signal by 'specialised' Kalman filter means to produce said reconstituted audio signal.
18. A method as claimed in claim 17, whereby said step of filtering includes filtering said dequantized signal by sub-optimal Kalman prediction by smoothing sample estimates only to a reduced number of lags relative to the full order.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU22098/95A AU2209895A (en) | 1994-04-13 | 1995-04-13 | Adpcm signal encoding/decoding system and method |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AUPM5037 | 1994-04-13 | ||
AUPM5037A AUPM503794A0 (en) | 1994-04-13 | 1994-04-13 | Adpcm signal encoding/decoding system and method |
Publications (1)
Publication Number | Publication Date |
---|---|
WO1995028770A1 true WO1995028770A1 (en) | 1995-10-26 |
Family
ID=3779617
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/AU1995/000216 WO1995028770A1 (en) | 1994-04-13 | 1995-04-13 | Adpcm signal encoding/decoding system and method |
Country Status (2)
Country | Link |
---|---|
AU (1) | AUPM503794A0 (en) |
WO (1) | WO1995028770A1 (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU2217583A (en) * | 1982-12-10 | 1984-06-14 | Nec Corporation | Ad pcm decoder |
EP0206273A2 (en) * | 1985-06-20 | 1986-12-30 | Fujitsu Limited | Adaptive differential pulse code modulation system |
EP0206352A2 (en) * | 1985-06-28 | 1986-12-30 | Fujitsu Limited | Coding transmission equipment for carrying out coding with adaptive quantization |
US4654863A (en) * | 1985-05-23 | 1987-03-31 | At&T Bell Laboratories | Wideband adaptive prediction |
EP0288281A2 (en) * | 1987-04-21 | 1988-10-26 | Oki Electric Industry Company, Limited | ADPCM encoding and decoding systems |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
ES2124664A1 (en) * | 1996-11-12 | 1999-02-01 | Alsthom Cge Alcatel | Method and device for adaptive differential pulse code modulation. |
WO2000041313A1 (en) * | 1999-01-07 | 2000-07-13 | Koninklijke Philips Electronics N.V. | Efficient coding of side information in a lossless encoder |
WO2008058692A1 (en) * | 2006-11-13 | 2008-05-22 | Global Ip Solutions (Gips) Ab | Lossless encoding and decoding of digital data |
FR3018942A1 (en) * | 2014-03-24 | 2015-09-25 | Orange | ESTIMATING CODING NOISE INTRODUCED BY COMPRESSION CODING OF ADPCM TYPE |
WO2015145050A1 (en) * | 2014-03-24 | 2015-10-01 | Orange | Estimation of encoding noise created by compressed micda encoding |
Also Published As
Publication number | Publication date |
---|---|
AUPM503794A0 (en) | 1994-06-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US6593872B2 (en) | | Signal processing apparatus and method, signal coding apparatus and method, and signal decoding apparatus and method |
JP2005534947A (en) | | Scale-factor feedforward prediction based on acceptable distortion of noise formed when compressing on a psychoacoustic basis |
KR101143792B1 (en) | | Signal encoding device and method, and signal decoding device and method |
EP1096476B1 (en) | | Speech signal decoding |
JP2007504503A (en) | | Low bit rate audio encoding |
JPH1097295A (en) | | Coding method and decoding method of acoustic signal |
US7072830B2 (en) | | Audio coder |
JP3266178B2 (en) | | Audio coding device |
EP2560163A1 (en) | | Apparatus and method of enhancing quality of speech codec |
JPH0590974A (en) | | Method and apparatus for processing front echo |
JP4843142B2 (en) | | Use of gain-adaptive quantization and non-uniform code length for speech coding |
JP3357829B2 (en) | | Audio encoding / decoding method |
JP4359949B2 (en) | | Signal encoding apparatus and method, and signal decoding apparatus and method |
JP2001053869A (en) | | Voice storing device and voice encoding device |
US6678653B1 (en) | | Apparatus and method for coding audio data at high speed using precision information |
WO1995028770A1 (en) | | Adpcm signal encoding/decoding system and method |
JP4281131B2 (en) | | Signal encoding apparatus and method, and signal decoding apparatus and method |
JP2003110429A (en) | | Coding method and device, decoding method and device, transmission method and device, and storage medium |
JP5057334B2 (en) | | Linear prediction coefficient calculation device, linear prediction coefficient calculation method, linear prediction coefficient calculation program, and storage medium |
JP5451603B2 (en) | | Digital audio signal encoding |
JP3265726B2 (en) | | Variable rate speech coding device |
JP3417362B2 (en) | | Audio signal decoding method and audio signal encoding / decoding method |
JP3004664B2 (en) | | Variable rate coding method |
JP3496618B2 (en) | | Apparatus and method for speech encoding / decoding including speechless encoding operating at multiple rates |
JP4409733B2 (en) | | Encoding apparatus, encoding method, and recording medium therefor |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AK | Designated states | Kind code of ref document: A1. Designated state(s): AU CA JP US |
| | AL | Designated countries for regional patents | Kind code of ref document: A1. Designated state(s): AT BE CH DE DK ES FR GB GR IE IT LU MC NL PT SE |
| | DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | |
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
| | 122 | Ep: pct application non-entry in european phase | |
| | NENP | Non-entry into the national phase | Ref country code: CA |