WO2000033293A1 - Fixed-point multiplication for ADPCM speech coder

Info

Publication number: WO2000033293A1
Application number: PCT/SG1998/000098
Authority: WO
Inventors: Foo Yuen Leong
Original assignee: Stmicroelectronics Asia Pacific Pte Ltd
Priority date: 1998-12-02
Filing date: 1998-12-02
Publication date: 2000-06-08
Also published as: EP1138037A1

Abstract

The ITU-T Recommendation G.726 specifies an ADPCM algorithm for the encoding of speech signals. In the adaptive predictor block of the algorithm, a floating point multiplication routine is specified for the calculation of the signal estimate. This routine is computationally intensive and accounts for 30% of the MIPS requirement for the algorithm. A fixed-point multiplication is proposed as a replacement, which makes use of the availability of 40-bit accumulators. The new routine provides a significant reduction in the MIPS requirement and also improves speech quality.

Description

FIXED-POINT MULTIPLICATION FOR ADPCM SPEECH CODER

Field of the Invention

This invention relates to the implementation of a digital speech coder for the transmission of speech or voice band data over a communications network.

Background of the Invention

In order to transmit speech or voice band data over a communications network in a digital form, one of the methods that may be used to encode the input data for transmission is Adaptive Differential Pulse Coded Modulation (ADPCM). The ADPCM algorithm achieves speech compression by combining adaptive quantization and differential PCM. Adaptive quantization adjusts the step size of the quantizer as the signal changes. This allows the algorithm to accommodate variations in the signal amplitude. Differential PCM involves transmitting the difference between the current and previous signal sample instead of simply transmitting the current sample itself. The difference signal obtained in this way tends to have a much lower dynamic range compared to the original signal and may therefore be quantized to a specific signal-to-noise ratio with fewer bits.
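The dynamic-range argument can be illustrated with a toy example. The sketch below is not part of G.726: it assumes a hypothetical 200 Hz tone sampled at 8 kHz and uses plain first differences rather than an adaptive predictor, and the function name `first_differences` is invented for illustration.

```python
import math

def first_differences(samples):
    """Return x[k] - x[k-1] for k >= 1 (plain differential PCM, no prediction)."""
    return [cur - prev for prev, cur in zip(samples, samples[1:])]

# Hypothetical test signal: 200 Hz tone, 8 kHz sampling, 16-bit amplitude range.
samples = [round(30000 * math.sin(2 * math.pi * 200 * n / 8000)) for n in range(80)]
diffs = first_differences(samples)

peak_sample = max(abs(x) for x in samples)
peak_diff = max(abs(d) for d in diffs)
print(peak_sample, peak_diff)
```

For a slowly varying signal such as this one, the peak difference is a small fraction of the peak sample value, which is why the difference can be quantized with fewer bits.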

In practice, the difference signal is computed from the current signal sample and a signal estimate determined by an adaptive predictor. The adaptive predictor uses signal estimates of previous samples to obtain an approximation of the current sample. This is performed in both the encoder and decoder so that they are synchronised with each other and there will not be any accumulation of errors in the reconstructed signal at the decoder output.

In ITU-T Recommendation G.726, the adaptive predictor is represented by a two-pole, six-zero adaptive predictive filter. The combination of poles and zeros enables the filter to deal with any general input signal. The sixth-order all-zero filter is needed to stabilise the filter and prevent it from drifting into oscillation. The filter coefficients are updated based on a simplified gradient algorithm.

The signal estimate is computed by:

$$s_e(k) = \sum_{i=1}^{2} a_i(k-1)\, s_r(k-i) + \sum_{i=1}^{6} b_i(k-1)\, d_q(k-i)$$

where
s_e : signal estimate
s_r : reconstructed signal
d_q : quantized difference signal
a_i, b_i : predictor coefficients

The predictor coefficients are limited to the range ±2 and are stored as 16-bit fixed point values. The quantized difference signal and reconstructed signal can vary between -32768 and 32767. Initially 16-bit fixed point values, they are then converted to floating point and stored. The aforementioned ITU-T recommendation specifies that the multiplication operation should be performed in floating point, by converting all inputs to floating point values with 6 bits of mantissa and 4 bits of exponent. The resulting product is then converted back into a 16-bit fixed point number.
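A 16-bit value that spans ±2 is consistent with a Q14 scaling, in which 16384 represents 1.0. The text does not name the format, so the Q14 interpretation is an assumption, inferred from the stated range and from the right shift by 14 in the ACCUM procedure. The helper names `to_q14` and `from_q14` are hypothetical.

```python
Q14_ONE = 1 << 14  # 16384 represents 1.0 under the assumed Q14 scaling

def to_q14(value):
    """Convert a real coefficient to a 16-bit Q14 integer, saturating at the limits."""
    scaled = round(value * Q14_ONE)
    return max(-32768, min(32767, scaled))

def from_q14(raw):
    """Convert a 16-bit Q14 integer back to a real value."""
    return raw / Q14_ONE

print(to_q14(0.5), to_q14(-2.0), from_q14(32767))
```

Under this assumption the representable range is -2.0 up to just below +2.0, which matches the stated coefficient limit.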

Summary of the Invention

In accordance with the present invention, there is provided a method for encoding speech or voice band data by way of adaptive differential pulse coded modulation including an adaptive predictor procedure which implements an adaptive predictive filter for generating a signal estimate from quantized difference signal values, reconstructed signal values and respective predictor coefficients according to a predetermined multiplication and accumulation operation, wherein the quantized difference signal values and reconstructed signal values are represented by single word length fixed point binary values, including performing multiplication in fixed point format between the respective said predictor coefficients and the quantized difference signal values and reconstructed signal values to generate respective double word length fixed point partial product values, summing the double word length fixed point partial product values to form a double word length predictor sum and rounding the predictor sum to a single word length fixed point representation of said signal estimate.

The present invention also provides an adaptive differential pulse coded modulation encoder for encoding speech or voice band data for transmission over a communications network, including an adaptive predictor having an adaptive predictive filter for generating a signal estimate from input quantized difference signal values, input reconstructed signal values and respective predetermined predictor coefficients, wherein the quantized difference signal values and reconstructed signal values are represented by single word length fixed point binary values, the adaptive predictive filter including a multiplier which performs multiplication in fixed point format between the respective said predictor coefficients and the quantized difference signal values and reconstructed signal values to generate respective double word length fixed point partial product values, and an accumulator for summing the double word length fixed point partial product values to form a double word length predictor sum and rounding the predictor sum to a single word length fixed point representation of said signal estimate.

Preferably the single word length representations comprise 16 bit binary values and the double word length representations comprise 32 bit binary values. However, it will be appreciated that other length words are possible within the scope of the invention, depending upon the type of computational processing equipment the invention is to be implemented on.

The invention is described in greater detail hereinafter, by way of example only, through description of a preferred embodiment thereof and with reference to the accompanying drawing which illustrates a generalised block diagram of an ADPCM encoder.

Detailed Description of the Preferred Embodiment

The present invention relates to adaptive differential pulse coded modulation (ADPCM) of speech or voice band data for transmission over a communications network, of the type which is described in ITU-T Recommendation G.726, the disclosure of which is incorporated herein by reference. The ADPCM encoder in the ITU-T recommendation converts a 64 kbit/s PCM input into an ADPCM compressed output for transmission. The accompanying drawing figure illustrates a block diagram of an ADPCM encoder according to the ITU-T recommendation. Referring to the figure, an A-law or μ-law PCM input stream is first converted to uniform PCM. A difference signal is then obtained by subtracting an estimate of the input signal from the input signal itself. An adaptive quantizer is used to assign a quantized value of a predetermined number of binary digits to the value of the difference signal for transmission to the decoder. An inverse adaptive quantizer is arranged to produce a quantized difference signal from the quantized value output from the adaptive quantizer. The input signal estimate is added to the quantized difference signal to produce the reconstructed version of the input signal. Both the reconstructed signal and the quantized difference signal are operated upon by an adaptive predictor which produces the input signal estimate, thus forming a feedback loop.
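The feedback loop described above can be sketched in a few lines. This is a deliberately simplified illustration, not the G.726 algorithm: it assumes a fixed quantizer step and a trivial predictor that uses the previous reconstructed sample as the estimate. All function names and the step size of 64 are hypothetical.

```python
def quantize(diff, step):
    """Toy uniform quantizer: nearest integer multiple of `step`, as a code."""
    return (diff + step // 2) // step

def encode(samples, step=64):
    """Return (codes, reconstructed) using the encoder-side feedback loop."""
    estimate, codes, reconstructed = 0, [], []
    for x in samples:
        code = quantize(x - estimate, step)   # adaptive quantizer (fixed step here)
        dq = code * step                      # inverse quantizer output
        sr = estimate + dq                    # reconstructed signal
        codes.append(code)
        reconstructed.append(sr)
        estimate = sr                         # predictor: previous reconstructed sample
    return codes, reconstructed

def decode(codes, step=64):
    """Decoder repeats the same inverse quantizer and predictor."""
    estimate, out = 0, []
    for code in codes:
        sr = estimate + code * step
        out.append(sr)
        estimate = sr
    return out

pcm = [0, 100, 250, 400, 380, 200, -50]
codes, enc_recon = encode(pcm)
print(codes, decode(codes) == enc_recon)
```

Because the predictor operates on the reconstructed signal rather than on the input, the decoder can reproduce the encoder's estimate exactly and quantization errors do not accumulate.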

The embodiment of the invention herein described is concerned primarily with the adaptive predictor portion of the ADPCM encoder, and in particular the filtering operation of the adaptive predictor. Because of the floating point multiplications, the filtering operation of the adaptive predictor is the most complex block of the ADPCM algorithm. According to the ITU-T recommendation, this involves first converting the fixed point inputs to floating point, multiplying the mantissas and adding the exponents, and finally converting the floating point product back to fixed point representation. In addition to the computational complexity of this operation, when the values d_q and s_r are close to the 16-bit limit, converting them to floating point will result in a loss of precision even before the multiply operation can take place. This is because only 6 bits are retained for the mantissa of the floating point number. The preferred embodiment of the present invention provides a way in which to perform the operations more efficiently and accurately.
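The size of this precision loss can be shown with a short worked example. It assumes that the mantissa holds the six most significant bits of the magnitude and that the remaining bits are truncated. The value 32767 is chosen here for illustration only.

```latex
\begin{aligned}
x &= 32767 = 111\,1111\,1111\,1111_2 \quad (15 \text{ significant bits}) \\
m &= \left\lfloor x \cdot 2^{6} / 2^{15} \right\rfloor = 63 \\
\hat{x} &= m \cdot 2^{15-6} = 63 \cdot 512 = 32256 \\
x - \hat{x} &= 511 \quad (\approx 1.6\% \text{ of } x)
\end{aligned}
```

Under these assumptions the nine least significant bits are discarded before the multiplication takes place.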

Since all the input values are originally available in 16-bit fixed point format, it is possible to perform the multiplication directly in fixed point. This eliminates the need to convert the values between fixed and floating point formats. To reduce the errors due to loss of precision, the full 32-bit intermediate products are kept during accumulation. At the end of the filter operation, the final accumulated product is then rounded off to 16 bits.
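The benefit of keeping full-width intermediate products can be seen by comparing two accumulation strategies. The sketch below is an illustration only: it assumes Q14 coefficients (an inference, not stated in the text), uses hypothetical function names, and contrasts reducing each product to 16 bits against reducing once after accumulation. It does not reproduce the floating point path of G.726.

```python
def estimate_reduce_each(coeffs_q14, samples):
    """Shift every partial product down to 16-bit scale before summing."""
    return sum((c * s) >> 14 for c, s in zip(coeffs_q14, samples))

def estimate_reduce_once(coeffs_q14, samples):
    """Accumulate full-width products, then shift the final sum once."""
    acc = sum(c * s for c, s in zip(coeffs_q14, samples))
    return acc >> 14

coeffs = [8192] * 8   # 0.5 in the assumed Q14 format
inputs = [1] * 8      # small signal values
print(estimate_reduce_each(coeffs, inputs), estimate_reduce_once(coeffs, inputs))
```

With eight products of 0.5 each, reducing every product individually yields 0, while reducing the accumulated sum once yields the exact value 4.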

The preferred embodiment of the invention primarily involves four function blocks that are defined in ITU-T Recommendation G.726, namely the FLOATA, FLOATB, FMULT and ACCUM blocks. The functions of these blocks as utilised in the ITU-T recommendation are described briefly below with reference to the figure and the signal estimate equation mentioned above.

FLOATA: This block receives the quantized difference signal d_q as input, where the quantity d_q is defined as a 15 or 16 bit signed binary magnitude. The quantized difference signal d_q is converted into a floating point value. This is performed by computing the exponent and mantissa and combining the sign bit, 4 exponent bits and 6 mantissa bits into one 11 bit word.

FLOATB: This block receives the reconstructed signal s_r as input, where the quantity s_r is defined as a 16 bit twos-complement quantity. The reconstructed signal s_r is converted into a floating point value. This is performed by computing the exponent and mantissa and combining the sign bit, 4 exponent bits and 6 mantissa bits into one 11 bit word.
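The 11-bit packing described for the FLOATA block can be sketched as follows. The exponent and mantissa rules used here are assumptions based on a reading of G.726 and should be checked against the Recommendation: the exponent is the number of significant bits in the magnitude, the mantissa is the top six bits, and a zero magnitude is given mantissa 32. The function name `floata_g726` is hypothetical.

```python
def floata_g726(dq):
    """Pack a 16-bit signed magnitude word into sign(1) | exponent(4) | mantissa(6)."""
    dqs = (dq >> 15) & 1                 # sign bit
    mag = dq & 32767                     # 15-bit magnitude
    exp = mag.bit_length()               # 0..15, fits in 4 bits
    # Assumed rule: a zero magnitude is encoded with mantissa 32 (binary 100000).
    mant = 32 if mag == 0 else (mag << 6) >> exp
    return (dqs << 10) | (exp << 6) | mant

print(floata_g726(0x7FFF), floata_g726(1), floata_g726(0))
```

For any nonzero magnitude the mantissa falls between 32 and 63, so the leading mantissa bit is always set and only five further bits of the magnitude survive.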

FMULT: This block multiplies predictor coefficients with the corresponding quantized difference signal or reconstructed signal. The multiplication is done in a floating point format, and thus the predictor coefficients, which are defined as 16 bit twos-complement quantities, are first converted into floating point representations. The products of the multiplication operations are signal estimate partial products (WAn, WBn), which are also defined as 16 bit twos-complement quantities, requiring a conversion from the floating point multiplication result.

ACCUM: This block operates on the signal estimate partial products to perform the summing portion of the operation represented by the equation discussed above. The partial products of the signal estimates (WA1, WA2, WB1, WB2, WB3, WB4, WB5 and WB6) are received as input and summed to obtain the complete signal estimate s_e. All of the quantities are twos-complement representations.

The preferred embodiment employs fixed point multiplication rather than floating point computation, which requires a number of modifications as described below. Replacing the floating point multiplication with a fixed point multiplication eliminates the need to convert values between fixed and floating point formats. This significantly reduces the complexity of the overall algorithm. By omitting the fixed-to-floating point conversion, the full precision of the original values is preserved, thereby reducing errors due to loss of precision. As a result, an improvement in the quality of the decoded signal can be achieved. The primary modifications to the ITU-T recommended ADPCM adaptive predictive filter which are implemented in the preferred embodiment are summarised below.

The FLOATA block ordinarily takes the quantized difference signal and converts it from 16-bit signed magnitude format to floating point. This block is re-defined to convert the signed magnitude numbers to 16-bit twos-complement numbers instead. The FLOATB block ordinarily takes the reconstructed signal and converts it from 16-bit twos-complement to floating point. This block is no longer needed and is discarded from the system.

The FMULT block normally performs several functions, namely converting the predictor filter coefficients from 16-bit twos-complement to floating point, performing floating point multiplication by adding the exponents and multiplying the mantissas and finally converting the product back into a 16-bit twos-complement number. The preferred embodiment requires that all these functions be discarded and replaced by a simple fixed-point multiplication which multiplies two 16-bit twos-complement numbers to give a 32-bit product. The full 32 bits of the result are retained. No truncation to 16 bits is performed.
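A 16x16 bit signed multiplication always fits in 32 bits, which the following sketch checks for the extreme case. The helper names `to_int16` and `fmult_fixed` are hypothetical, and Python integers stand in for the hardware registers.

```python
def to_int16(word):
    """Interpret the low 16 bits of `word` as a twos-complement value."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word

def fmult_fixed(coeff_word, sample_word):
    """Multiply two 16-bit twos-complement words and keep the full product."""
    product = to_int16(coeff_word) * to_int16(sample_word)
    # The largest magnitude is (-32768) * (-32768) = 2**30, inside the int32 range.
    assert -(1 << 31) <= product < (1 << 31)
    return product

print(fmult_fixed(0x8000, 0x8000), fmult_fixed(0x4000, 0xFFFE))
```

No exponent alignment or mantissa normalisation is involved, which is where the saving in operations comes from.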

The ACCUM block ordinarily adds the 16-bit predictor outputs together to form the signal estimate. The preferred embodiment requires that the accumulation function be modified to operate on 32-bit inputs. After the final accumulation, the result is then rounded off to 16 bits to give the signal estimate.
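Summing eight 32-bit products can exceed 32 bits in the worst case, which is one reason an accumulator wider than 32 bits, such as the 40-bit accumulators mentioned in the abstract, is useful. The bound below is deliberately loose: it assumes every coefficient and every signal value sits at the 16-bit extreme simultaneously, which the coefficient limits of an actual coder may never allow.

```python
NUM_TAPS = 8                 # two pole terms plus six zero terms
MAX_MAG_16 = 1 << 15         # largest magnitude of a 16-bit twos-complement value

def signed_bits(magnitude):
    """Bits needed to hold +magnitude in twos-complement, including the sign bit."""
    return magnitude.bit_length() + 1

worst_product = MAX_MAG_16 * MAX_MAG_16
worst_sum = NUM_TAPS * worst_product
print(signed_bits(worst_product), signed_bits(worst_sum))
```

Under this loose bound a single product needs 32 bits and the full sum needs 35 bits, which fits within a 40-bit accumulator.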

Thus, the preferred embodiment of the invention requires the ability to perform 16x16 bit fixed-point multiplication and to store the 32-bit result for subsequent arithmetic operations. The following procedures, in pseudocode, implement the preferred embodiment by redefining the FLOATA, FLOATB, FMULT and ACCUM blocks originally specified in ITU-T G.726. The modified procedures in combination retain the functionality of the ITU-T recommendation, although not in strict compliance with the specification. Table 1, below, provides a description of the format of the variables used in the procedures.

TC denotes twos-complement representation SM denotes signed magnitude representation denotes sign bit

Table 1 Format and description of variables

Procedure 1: FLOATA

Function: Convert 16-bit signed magnitude to 16-bit twos-complement

DQS = DQ >> 15            | Get the sign bit.
DQM = DQ & 32767          | Compute magnitude.
if DQS = 0, DQO = DQM     | Convert magnitude to
else DQO = -DQM           | twos-complement.

Procedure 2: FLOATB

Function: Copy 16-bit twos-complement number from input to output

SRO = SR

Procedure 3: FMULT

Function: Multiply predictor coefficients with corresponding quantized difference signal or reconstructed signal. Multiplication is done in fixed point format.

WAn = An x SRn     | Perform fixed point
WBn = Bn x DQn     | multiplication.

Procedure 4: ACCUM

Function: Addition of predictor outputs to form the partial signal estimate (from the sixth order predictor) and the signal estimate.

SEZI = WB1 + WB2 + WB3 + WB4 + WB5 + WB6     | Sum for partial signal estimate.
SEI = SEZI + WA2 + WA1                       | Complete sum for signal estimate.
SEZ = SEZI >> 14
SE = SEI >> 14
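The four procedures can be combined into a short runnable sketch. It is an illustration under stated assumptions rather than a reference implementation: Python integers stand in for the 16-bit and 32-bit registers, a set sign bit is treated as negative (the usual signed magnitude convention), and the coefficient values in the example are hypothetical Q14 numbers.

```python
def floata(dq):
    """Procedure 1: 16-bit signed magnitude word -> signed integer value."""
    dqs = (dq >> 15) & 1          # sign bit
    dqm = dq & 32767              # magnitude
    return -dqm if dqs else dqm   # assumption: sign bit 1 means negative

def floatb(sr):
    """Procedure 2: pass the twos-complement value through unchanged."""
    return sr

def fmult(coeff, sample):
    """Procedure 3: fixed point multiply, full-width product retained."""
    return coeff * sample

def accum(wa, wb):
    """Procedure 4: sum the partial products, then shift down by 14 bits."""
    sezi = sum(wb)                # six zero-section products
    sei = sezi + sum(wa)          # plus two pole-section products
    return sei >> 14, sezi >> 14  # (SE, SEZ)

# Hypothetical Q14 coefficients: a = (0.5, -0.25), b = 1.0 for all six taps.
a, b = [8192, -4096], [16384] * 6
sr = [floatb(v) for v in (1000, 400)]
dq = [floata(w) for w in (10, 0x8003, 0, 0, 0, 0)]   # 0x8003 encodes -3
wa = [fmult(c, s) for c, s in zip(a, sr)]
wb = [fmult(c, s) for c, s in zip(b, dq)]
se, sez = accum(wa, wb)
print(se, sez)
```

In this example the zero section contributes 10 - 3 = 7 and the pole section contributes 500 - 100 = 400, so SEZ is 7 and SE is 407.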

The foregoing detailed description of the preferred embodiment of the invention has been presented by way of example only, and is not intended to be considered limiting to the invention as defined in the claims appended hereto.

Claims

1. A method for encoding speech or voice band data by way of adaptive differential pulse coded modulation including an adaptive predictor procedure which implements an adaptive predictive filter for generating a signal estimate from quantized difference signal values, reconstructed signal values and respective predictor coefficients according to a predetermined multiplication and accumulation operation, wherein the quantized difference signal values and reconstructed signal values are represented by single word length fixed point binary values, including performing multiplication in fixed point format between the respective said predictor coefficients and the quantized difference signal values and reconstructed signal values to generate respective double word length fixed point partial product values, summing the double word length fixed point partial product values to form a double word length predictor sum and rounding the predictor sum to a single word length fixed point representation of said signal estimate.

2. A method as claimed in claim 1, wherein a said single word length representation comprises 16 bits and a said double word length representation comprises 32 bits.

3. An adaptive differential pulse coded modulation encoder for encoding speech or voice band data for transmission over a communications network, including an adaptive predictor having an adaptive predictive filter for generating a signal estimate from input quantized difference signal values, input reconstructed signal values and respective predetermined predictor coefficients, wherein the quantized difference signal values and reconstructed signal values are represented by single word length fixed point binary values, the adaptive predictive filter including a multiplier which performs multiplication in fixed point format between the respective said predictor coefficients and the quantized difference signal values and reconstructed signal values to generate respective double word length fixed point partial product values, and an accumulator for summing the double word length fixed point partial product values to form a double word length predictor sum and rounding the predictor sum to a single word length fixed point representation of said signal estimate.

4. An adaptive differential pulse coded modulation encoder as claimed in claim 3, wherein a said single word length representation comprises 16 bits and a said double word length representation comprises 32 bits.