WO2001050458A1

WO2001050458A1 - Subband adpcm voice encoding and decoding

Info

Publication number: WO2001050458A1
Application number: PCT/US2000/034410
Authority: WO
Inventors: Paul Gothard Knutson; Kumar Ramaswamy; John William Richardson
Original assignee: Thomson Licensing S.A.
Priority date: 1999-12-31
Filing date: 2000-12-19
Publication date: 2001-07-12
Also published as: AU2110001A

Abstract

A wireless telephone system comprises a base transceiver having a base transmitter and base receiver, and a plurality of wireless handsets. Each transmitter of the system comprises a transmitter encoder for encoding an input signal having an input sample rate, to be transmitted to a receiver of the system. The transmitter encoder comprises a filter bank for separating the input signal into a plurality of subband signals, each subband covering a different frequency range; and a plurality of ADPCM encoders, one for each subband, for encoding each subband signal at a number of bits per sample according to the relative importance of the subband.

Description

SUBBAND ADPCM VOICE ENCODING AND DECODING

BACKGROUND OF THE INVENTION Field of the Invention

The present invention relates to speech data encoding and transmission and, in particular, to ADPCM speech compression in multi-line wireless telephone systems with increased data compression.

Description of the Related Art

Digital data transmission from a transmitter to a receiver requires a variety of digital signal processing techniques to allow the data to be transmitted by the transmitter and successfully recovered or acquired by the receiver. In digital wireless telephone systems, for example, a wireless (cordless) telephone handset unit communicates via digital radio signals with a base unit, which is typically connected via a standard telephone line to an external telephone network. Each handset and the base comprise a transceiver, having a transmitter and receiver. In such a system, a user may employ the wireless handset to engage in a telephone call with another user through the base unit and the telephone network. The digital signals thus transmitted may be audio signals. If the audio signals represent human speech, they may be referred to as speech or voice signals.

Multi-line wireless telephone systems are in use in various situations, such as businesses with many telephone users. Such systems employ a base unit that communicates with up to N handsets in real time, typically with a digital communications scheme, such as a spread-spectrum, time division multiplex (TDM) scheme, for example time division multiple access (TDMA). In a spread-spectrum system, bandwidth resources are traded for performance gains, in accordance with the so-called Shannon theory. The advantages of a spread-spectrum system include low power spectral density, improved narrowband interference rejection, built-in selective addressing capability (with code selection), and inherent channel multiple access capability. Spread-spectrum systems employ a variety of techniques, including direct sequencing or sequence (DS), frequency hopping (FH), chirp systems, and hybrid DS/FH systems. DS spread-spectrum systems are sometimes referred to as DSSS systems.

In a TDMA system, a single RF channel is used, and each handset transmits and receives audio data packets as well as non-audio data packets during dedicated time slices or time slots within an overall TDMA cycle or epoch. Other communications schemes include frequency division multiple access (FDMA), code division multiple access (CDMA), and combinations of such schemes. Various modulation schemes are employed, such as carrierless amplitude/phase (CAP) and quadrature amplitude modulation (QAM), although CAP is typically used only in wired channels.

In such communications schemes, data packets are digitally compressed in accordance with a certain encoding scheme, where each packet comprises a number of samples, which represent the data of the data packet in compressed form. Such data is typically audio data representing human speech; this is the case, for example, in wireless telephone systems in which human users communicate via transmitted speech. Adaptive differential pulse code modulation (ADPCM) is often used to compress speech-type data to provide audio data packets which consist of transmitted ADPCM samples. A common ADPCM implementation takes 16-bit linear PCM samples from an ADC and converts them to 4-bit samples, for example, yielding a compression ratio of 4:1. ADPCM is a form of pulse code modulation (PCM), and produces a digital signal with a lower bit rate than standard PCM, by recording only the difference between samples and adjusting the coding scale dynamically to accommodate large and small differences. Some applications use ADPCM to digitize a voice or speech signal so voice and data can be transmitted simultaneously over a digital facility normally used only for one or the other. Thus, ADPCM is frequently used to encode speech signals (i.e., analog signals representing human speech), for example in a wireless telephone system. In such applications, instead of quantizing the speech signal directly, like PCM codecs, ADPCM codecs quantize the difference between the speech signal and a prediction that has been made of the speech signal. If the prediction is accurate then the difference between the real and predicted speech samples will have a lower variance than the real speech samples, and will be accurately quantized with fewer bits than would be needed to quantize the original speech samples. At the decoder receiving such ADPCM speech signal samples, the quantized difference signal is added to the predicted signal to yield the reconstructed speech signal.
The performance of ADPCM codecs is aided by using adaptive prediction and quantization, so that the predictor and difference quantizer adapt to the changing characteristics of the speech being coded. In the mid-1980s, the CCITT (Consultative Committee on International Telephone and Telegraphy, now known as the ITU-T, for Telecommunication Standardization Sector of the International Telecommunications Union; http://www.itu.int/) standardized a 32 kbits/s ADPCM, known as G.721, which gave reconstructed speech almost as good as the 64 kbits/s PCM codecs. Later, in recommendations G.726 and G.727, codecs operating at 40, 32, 24 and 16 kbits/s were standardized.
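
The difference-plus-adaptation principle described above can be illustrated with a minimal sketch. This is not the G.721, G.726 or G.727 algorithm: the previous-sample predictor, the initial step size, and the step adaptation factors (1.5 and 0.9) are assumptions chosen only for illustration, and the function names are hypothetical.

```python
import math

def _update(pred, step, code, levels):
    # Shared state update: encoder and decoder run the same rule,
    # so both sides always hold the same prediction and step size.
    pred = pred + code * step
    if abs(code) >= levels // 2:
        step = min(step * 1.5, 2048.0)   # large differences: coarsen the scale
    else:
        step = max(step * 0.9, 1.0)      # small differences: refine the scale
    return pred, step

def adpcm_encode(samples, bits=4):
    """Quantize the difference between each sample and its prediction."""
    levels = 1 << (bits - 1)             # e.g. codes -8..7 for 4 bits
    pred, step = 0.0, 16.0               # assumed initial state
    codes = []
    for s in samples:
        code = int(round((s - pred) / step))
        code = max(-levels, min(levels - 1, code))
        codes.append(code)
        pred, step = _update(pred, step, code, levels)
    return codes

def adpcm_decode(codes, bits=4):
    """Add each dequantized difference to the prediction."""
    levels = 1 << (bits - 1)
    pred, step = 0.0, 16.0
    out = []
    for code in codes:
        pred, step = _update(pred, step, code, levels)
        out.append(pred)
    return out

# Illustrative input: a 160 Hz tone sampled at 8 kHz.
signal = [1000.0 * math.sin(2 * math.pi * i / 50) for i in range(400)]
codes = adpcm_encode(signal, bits=4)
decoded = adpcm_decode(codes, bits=4)
mean_error = sum(abs(a - b) for a, b in zip(signal, decoded)) / len(signal)
```

Because the decoder repeats the encoder's state updates, only the small integer codes need to be transmitted; the step size and prediction are never sent.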

In a TDM type system employing ADPCM samples, during a time slice, digitally compressed data packets may be transmitted which comprise ADPCM samples, in accordance with recommendation ITU-T G.721 or G.727 with a block code. This allows, for example, 16 ADPCM samples to be transmitted per audio packet. As noted above, the audio packets may represent speech data.

Digital data is typically transmitted as modulated signals over a transmission medium, such as the RF channel. (Other transmission media often used for digital communications include asymmetric digital subscriber loop (ADSL) systems or cable modem systems.) The digital data, in the form of a stream of binary digits (bits), is first mapped to a stream of symbols, each of which may represent multiple bits. A constellation is the set of all possible symbols for a given signaling scheme. Symbols can be a set of real amplitude levels, as in pulse amplitude modulation (PAM), or a set of points on a circle in the complex plane such as in quadrature phase shift keying (QPSK: 4 points on a circle, separated by 90 degrees of phase), or an array of points at different amplitudes and phases on the complex plane, as in QAM. Sets of bits are mapped to symbols by a look-up table (e.g., a ROM). The number of symbols in a signaling constellation depends on the encoding scheme. For example, each QPSK symbol represents 2 bits of the input data stream, with the 4 symbols 1+j, 1-j, -1+j, and -1-j representing the bit patterns 00, 01, 10, and 11, respectively. The real portion of such complex digital symbols is referred to as in-phase, or "I" data, and the imaginary part as quadrature, or "Q" data, yielding I, Q pairs.
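
The look-up-table mapping for QPSK can be sketched as follows. The table contents follow the bit patterns given above; the function names and the sign-based decision rule in the reverse mapping are assumptions added for illustration.

```python
# Bit pair -> complex symbol (I + jQ), per the mapping stated in the text.
QPSK_TABLE = {
    (0, 0): complex(1, 1),
    (0, 1): complex(1, -1),
    (1, 0): complex(-1, 1),
    (1, 1): complex(-1, -1),
}

def bits_to_symbols(bits):
    """Map a flat list of bits to QPSK symbols, two bits per symbol."""
    if len(bits) % 2:
        raise ValueError("QPSK needs an even number of bits")
    return [QPSK_TABLE[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]

def symbols_to_bits(symbols):
    """Reverse mapping: pick the nearest constellation point by the
    signs of I and Q (an assumed, simple decision rule)."""
    bits = []
    for s in symbols:
        bits.append(0 if s.real >= 0 else 1)
        bits.append(0 if s.imag >= 0 else 1)
    return bits

tx_bits = [0, 0, 0, 1, 1, 0, 1, 1]
tx_symbols = bits_to_symbols(tx_bits)
# A mildly distorted copy still decodes to the original bits.
rx_symbols = [s + complex(0.2, -0.3) for s in tx_symbols]
rx_bits = symbols_to_bits(rx_symbols)
```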

To transmit a given input data value in a complex data system, the input data value to be transmitted is mapped to a pair of coordinates I,Q of a corresponding constellation point on a complex signal constellation having real and imaginary axes I and Q. These I,Q symbols, which represent the original data value, are then transmitted as part of data packets by a modulated channel. A receiver can recover the I,Q pairs and determine the constellation location therefrom, and perform a reverse-mapping to provide the original input data value or a close approximation thereof.

In a DSSS type spread spectrum system, each symbol is transmitted by a string of "sub-symbols" or "chips". The string of chips is typically derived by multiplying the symbol (which may be either a 1 or -1, in some schemes) by a pseudo-random number (PN) binary string of a certain length (number of chips C). Such systems are thus characterized by a chip rate, which is related to the symbol rate. Spread spectrum systems may be used to transmit any digital data, whether in complex format or not, and whether or not in a TDMA system.

Thus, in a DSSS system, a signal represents successive symbols, by means of successive "chips" of symbols. A received signal is sampled to provide samples. Samples thus represent a signal, which itself represents chips, which represent symbols.
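
The symbol-to-chip relationship can be sketched as below. The PN string is represented here as +1/-1 values, its length (15 chips) and seed are arbitrary, and recovery by correlating against the same PN string is a common technique assumed for illustration rather than something the text specifies.

```python
import random

def make_pn_sequence(length, seed=1):
    """Hypothetical PN string of +1/-1 chips from a seeded generator."""
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(length)]

def spread(symbols, pn):
    """Multiply each +1/-1 symbol by the whole PN string."""
    return [s * c for s in symbols for c in pn]

def despread(chips, pn):
    """Correlate each block of chips with the PN string and take the sign."""
    n = len(pn)
    symbols = []
    for i in range(0, len(chips), n):
        corr = sum(x * c for x, c in zip(chips[i:i + n], pn))
        symbols.append(1 if corr >= 0 else -1)
    return symbols

pn = make_pn_sequence(15)
symbols = [1, -1, -1, 1]
chips = spread(symbols, pn)      # chip rate = 15 x symbol rate here

# Flip three chips of the first symbol to mimic channel errors.
noisy = list(chips)
for k in (0, 4, 9):
    noisy[k] = -noisy[k]
recovered = despread(noisy, pn)
```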

The receiver side of a transceiver samples a received signal with an analog-to-digital converter (ADC), which provides samples representative of the signal, which in turn represents symbols. The transmitter side of a transceiver converts symbols into analog samples that constitute a signal, with a digital-to-analog converter (DAC).

As noted above, digital data transmission requires a variety of digital signal processing techniques to allow the data to be transmitted by the transmitter (e.g., the transmitter of the base unit transceiver) and successfully recovered by the receiver (e.g., the receiver of a given handset transceiver). For example, the receiver side of a data transmission in a spread-spectrum digital wireless telephone system employs a variety of functions to recover data from a transmitted RF signal. These functions can include: timing recovery for symbol synchronization, carrier recovery (frequency demodulation), and gain control. The receiver thus includes, inter alia, an automatic gain control (AGC) loop, carrier tracking loop (CTL), and timing loop for each link.

Timing recovery is the process by which the receiver clock (timebase) is synchronized to the transmitter clock. This permits the received signal to be sampled at the optimum point in time to reduce the chance of a slicing error associated with decision-directed processing of received symbol values. In some receivers, the received signal is sampled at a multiple of the transmitter symbol (or chip) rate. For example, some receivers sample the received signal at twice the transmitter symbol (or chip) rate. In any event, the sampling clock of the receiver must be synchronized to the symbol clock of the transmitter. Carrier recovery is the process by which a received RF signal, after being frequency shifted to a lower intermediate passband, is frequency shifted to baseband to permit recovery of the modulating baseband information. AGC tracks signal strength and adjusts the gain, for example to help compensate for the effects of transmission channel disturbances upon the received signal. AGC, along with other equalization techniques, can help remove intersymbol interference (ISI) caused by transmission channel disturbances. ISI causes the value of a given symbol to be distorted by the values of preceding and following symbols. These and related functions, and related modulation schemes and systems, are discussed in greater detail in Edward A. Lee & David G. Messerschmitt, Digital Communication, 2d ed. (Boston: Kluwer Academic Publishers, 1994).

In a burst mode or TDM communication system, such as a TDMA-based multi-line wireless telephone system, quick acquisition of carrier loops is required to efficiently utilize available bandwidth. For example, a TDMA-based digital multi-line wireless telephone system may use a TDMA audio packet structure such as structure 200 illustrated in Fig. 2, where a base unit having a transceiver sequentially transmits to and receives from different handsets over the time interval Td, with guard time Tg between packet transmissions. Guard time is established to allow the transmitters to power-down and to allow the receivers to power-up. The receivers must synchronize for each packet.

Various techniques have been used to compress audio and speech data. Code excited linear prediction (CELP) and linear predictive coding (LPC) are compression algorithms often used for low bit rate (2400 and 4800 bps) speech coding. However, such speech data compression techniques can be complex and can require a costly or otherwise undesirably high amount of computer processing bandwidth. ADPCM can be implemented with less complexity or processing requirements than CELP and LPC, and is thus often used for audio or speech data encoding. However, ADPCM does not always compress as well as other techniques such as CELP (i.e., more bits are required in the compressed data to transmit the same quality information). Thus, in systems such as those employing ADPCM encoding, there is a need for improved ADPCM encoding and compression techniques to reduce the number of bits needed to represent a given signal.

SUMMARY

A wireless telephone system comprises a base transceiver having a base transmitter and base receiver, and a plurality of wireless handsets. Each handset comprises a handset transceiver for establishing a wireless link over a shared channel with the base unit via the base transceiver, each handset transceiver having a handset receiver and a handset transmitter. Each transmitter of the system comprises a transmitter encoder for encoding an input signal having an input sample rate, to be transmitted to a receiver of the system. The transmitter encoder comprises a filter bank for separating the input signal into a plurality of subband signals, each subband covering a different frequency range; and a plurality of ADPCM encoders, one for each subband, for encoding each subband signal at a number of bits per sample according to the relative importance of the subband.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 is a block diagram of a TDMA multi-line digital wireless telephone system, in accordance with an embodiment of the present invention;

Fig. 2 is a schematic representation of the TDMA audio packet structure used in the digital wireless telephone system of Fig. 1, in accordance with an embodiment of the present invention;

Fig. 3 is a block diagram of a transmitter/receiver system illustrating the subband ADPCM encoding and decoding architecture of the present invention;

Fig. 4 is a block diagram of an analysis filter bank for use in the subband ADPCM encoding of the present invention;

Fig. 5 is a block diagram of a synthesis filter bank for use in the subband ADPCM decoding of the present invention;

Fig. 6 is a block diagram of an alternative analysis filter bank for use in subband ADPCM encoding;

Fig. 7 is a block diagram of an alternative synthesis filter bank for use in subband ADPCM decoding;

Figs. 8-9 illustrate, respectively, alternative subband ADPCM encoder and decoder architectures, employing filter trees comprising filter banks such as those described in Figs. 4-5, which filter trees are pruned to different decimation levels, for a subband ADPCM transmitter/receiver system in accordance with an alternative embodiment of the present invention;

Fig. 10 is a graph showing exemplary low-pass filter (LPF) and high-pass filter (HPF) frequency responses of LPFs and HPFs of the encoders and decoders of Figs. 3, 4-5, and 8-9;

Fig. 11 is a graph showing exemplary LPF and HPF frequency responses of the LPFs and HPFs of an alternative low delay, near minimum phase filter set; and

Fig. 12 is a graph showing exemplary LPF and HPF group delays of the LPFs and HPFs of the alternative low delay, near minimum phase filter set.

DESCRIPTION OF THE PREFERRED EMBODIMENT

In the present invention, the audio spectrum of an ADPCM-based communication system such as a wireless telephone system is broken into subbands at the encoder (transmitter). In an embodiment, this is done by defining a voice band, and then omitting from encoding upper and lower subbands outside the voice band. The remaining voice band is broken or divided into a number of voice subbands, by appropriate filtering and sampling decimation. ADPCM encoding is used for each such subband signal, where subbands having greater importance by some criteria (e.g., by overall sound quality of the decoded and reconstructed audio signal, that is, perceptive importance) are ADPCM encoded with a greater number of bits per sample relative to the other, less important subbands. For example, subbands in the lower frequencies may be encoded with a greater number of bits relative to subbands in the higher voice frequencies. Analysis and synthesis filter bank and decimation unit trees are used to simplify design and minimize processing resources and delay-related and aliasing problems. The present invention thus augments conventional ADPCM techniques with special filters to allocate compression resources where they are needed most in the audio band. This improves decoded voice quality relative to conventional full band low bit rate ADPCM. These and other details and advantages of the present invention are described in further detail below.

Referring now to Fig. 1, there is shown a block diagram of spread spectrum TDMA multi-line digital wireless telephone system 100, in accordance with an embodiment of the present invention. TDMA system 100 comprises a base unit 110, which has receiver and transmitter units 112 and 111, respectively, and is coupled to external telephone network 116 via telephone line(s) 115. System 100 also comprises N wireless handsets 120₁, 120₂, ..., 120_N. Each has a transmitter and receiver unit (transceiver), such as transmitter 121 and receiver 122 of handset 120₁. In one embodiment, receiver unit 112 comprises N separate receivers, and transmitter unit 111 comprises N separate transmitters, so that receiver and transmitter units 112 and 111 provide N total transceiver units, one for each of N wireless handsets. At any given time, M handsets (0 ≤ M ≤ N) are operating or active (i.e., in the process of conducting a telephone call). In one embodiment, system 100 employs a digital TDMA scheme, in which each operating handset only transmits or receives data during its own "time slice" or slot. System 100 thus provides a wireless network between the base station 110 and each handset 120_i (1 ≤ i ≤ N). System 100 preferably employs block error coding to reduce error. In one embodiment, during a time slice, digitally compressed audio packets employing ADPCM samples are transmitted, for example in accordance with recommendation ITU-T G.721 or G.727 with a block code. This allows, for example, 16 ADPCM samples to be transmitted per audio packet. Block codes and ADPCM are preferred because of their low latency, which allows the wireless phone behavior to mimic that of a standard corded phone. Channel codes such as convolutional codes or turbo codes, or stronger source coding such as LPC (linear predictive coding), transform coding, or formant coding incur more delay, which makes the system less like the equivalent corded telephone.
The transmitters 111, 121 and receivers 112, 122 of the base unit and handsets employ the ADPCM subband encoding and decoding, respectively, of the present invention, as described in further detail below with reference to Figs. 3-11, thereby improving the quality and/or bit compression rate of transmitted signals and data.

Referring now to Fig. 2, there is shown a schematic representation of the TDMA audio packet structure 200 used in the digital wireless telephone system 100 of Fig. 1, in accordance with an embodiment of the present invention. Structure 200 comprises a 2 ms (Td) field 210 of digital data, which comprises eight audio packets, such as audio packet 220. Each audio packet is a set of audio data transmitted either to a given handset from the base unit or vice-versa, during a given time-slice in an overall "epoch" scheme, during which time no other handsets receive or transmit data over the system's data channel. Each packet is labeled Ti or Ri, to indicate whether it is being transmitted from the base unit 110 or received by the base unit 110, to or from a given handset 120_i.

In the present invention, during the 2 msec TDMA field cycle, voice data is exchanged in packets containing 16 samples of voice data. In one mode of operation of system 100, these samples are 4-bit ITU-T G.721 or G.727 ADPCM samples (i.e., a 32Kbps ADPCM signal). By changing to G.727 3 or 2 bit ADPCM samples (24 or 16Kbps ADPCM signals, respectively), an additional 16 or 32 bits are freed up for coding in each packet.
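
The packet arithmetic above can be checked with a short calculation. The sketch assumes one 16-sample packet per direction per 2 msec field for a given handset, which gives 8000 samples per second; the constant and function names are hypothetical.

```python
FIELD_MS = 2                 # TDMA field duration from the text
SAMPLES_PER_PACKET = 16      # voice samples per audio packet

def adpcm_bit_rate(bits_per_sample):
    """Bits per second for one direction of one voice link."""
    samples_per_second = SAMPLES_PER_PACKET * 1000 // FIELD_MS   # 8000
    return samples_per_second * bits_per_sample

def freed_bits_per_packet(bits_per_sample, baseline_bits=4):
    """Bits released in each packet relative to 4-bit ADPCM."""
    return SAMPLES_PER_PACKET * (baseline_bits - bits_per_sample)

rates = {bits: adpcm_bit_rate(bits) for bits in (4, 3, 2)}
freed = {bits: freed_bits_per_packet(bits) for bits in (3, 2)}
```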

As noted previously, in the present invention, the audio spectrum of an ADPCM-based communication system such as a wireless telephone system is broken into bands (subbands) at the encoder (transmitter). ADPCM encoding is used for each such subband, where low frequencies are encoded with a greater number of bits relative to high voice frequencies. The subbands are received by decoders (receivers) which synthesize the original data from the received subbands. This improves decoded voice quality relative to conventional full band low bit rate ADPCM. In accordance with an embodiment of the invention, bands can be uniform or varied in frequency range, and can be built out of folded audio spectra for optimum coverage of the 250-3250Hz phone band.

In an embodiment, the ADPCM subband encoding and decoding technique of the present invention is employed in a wireless telephone system 100, and ITU-T G.727 ADPCM is employed for compression of the data packets transmitted between transceivers in the system. Referring now to Fig. 3, there is shown a block diagram of a transmitter/receiver system 300 illustrating the subband ADPCM encoding and decoding architecture of the present invention. System 300 comprises encoder portion 310, decoder portion 320, and transmission channel 331. Encoder portion or architecture 310 represents the ADPCM encoding performed by any transmitter 111, 121 of any of the transceivers of system 100 of Fig. 1. Decoder portion or architecture 320 represents the ADPCM decoding performed by any receiver 112, 122 of any of the transceivers of system 100 of Fig. 1. As will be appreciated, an encoder 310 may be placed functionally before the transmitter unit, but may be considered to be part of a transmitter comprising both encoding and transmission functions. Similarly, a decoder 320 may be placed functionally after a receiver unit, but may be considered to be part of a receiver comprising both receiving and decoding functionality. In system 300, an input signal (e.g., a sampled audio signal received from a microphone which is receiving speech from a human user) is encoded as follows. First, a voice band, which is to be divided or filtered into subbands which are to be ADPCM encoded and combined for transmission, is defined, and the frequency range outside this voice band is dropped (i.e., not encoded, or omitted). The voice band portion is then separated into a plurality of adjacent subband signals by analysis filter bank and output decimation unit 311. These subbands are adjacent in the sense that each covers a frequency range which is adjacent to the frequency ranges of neighboring subbands.

A subband signal may only be decimated down to the minimum number of samples needed to express the signal. In particular, the amount of sample rate reduction is limited by Nyquist's Sampling Theorem, which provides that, for a discrete time system to adequately represent a given bandwidth, the sample rate must be greater than or equal to 2x the bandwidth. To minimize sample rate, the subbands are preferably decimated by decimators of unit 311, to minimize the number of samples per subband signal while representing the subband signal without aliasing. The decimation unit or decimators may be a separate unit or module that receives the subband output of the filters, or filters and decimators may be combined together, as in the system of Figs. 8 and 9, so that filtering and decimation take place together. In either case, a filter bank as well as a number of decimators are employed to provide decimated subband signals.
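
The Nyquist limit on decimation can be expressed as a small helper. This is a sketch under the assumption that decimation is by an integer factor and that the subband has already been brought to baseband; the function name and the example bandwidths are illustrative only.

```python
def max_decimation(input_rate_hz, subband_bandwidth_hz):
    """Largest integer decimation factor that keeps the output sample
    rate at or above twice the subband bandwidth."""
    min_rate_hz = 2 * subband_bandwidth_hz
    if min_rate_hz > input_rate_hz:
        raise ValueError("input rate is already below the Nyquist minimum")
    return int(input_rate_hz // min_rate_hz)

# Example bandwidths (hypothetical) in an 8 kHz sample rate system.
factors = {bw: max_decimation(8000, bw) for bw in (2000, 1000, 500)}
```

For instance, a 500 Hz wide subband needs at least 1000 samples per second, so an 8000 samples-per-second input can be decimated by at most 8.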

Each decimated subband signal is encoded employing ADPCM, by ADPCM encoders 312. As will be appreciated, the overall bit rate achieved is determined by the number of subbands, the decimation factor for each subband, and the number of bits used to code each subband. ITU G.727 type ADPCM encoding allows for 2 to 5 bits per sample, based on a 14 bit linear PCM input (a sampled signal). The subbands are preferably encoded in order of perceptive importance, allowing for better utilization of the available bit rate. Listening experiments may be performed to empirically determine optimal parameters in this regard, depending upon system constraints and other implementation details and factors.

One approach to subband filtering would be to use specially-designed filters which separate the bands directly, as illustrated in the encoder analysis filter/decimator bank 600 of Fig. 6 and the decoder zero-padder/synthesis filter bank 700 of Fig. 7. Filters 600-700 of Figs. 6-7 permit the direct computation of subbands. Analysis filter bank 600 of Fig. 6 comprises n filters F₀, F₁, ..., F_n, one for each of the n subbands into which the voice band is divided. There are correspondingly n decimators d₀, d₁, ..., d_n, each of which reduces the sample rate for its particular subband. Synthesis filter bank 700 of Fig. 7 correspondingly comprises "inverse-decimators" or zero-padders d₀, d₁, ..., d_n, each of which inserts zeros to bring the subband signal back up to the output sample rate; and synthesis filters S₀, S₁, ..., S_n, each of which isolates the correct, aliased version of its subband to reconstruct the signal at the adder 731. Thus, in this approach, the outputs of the individual filters F₀-F_n are decimated by the decimators d₀-d_n. These decimated subband signals may then be encoded, combined, and transmitted, and then received by a decoder 700, where the subbands are decoded, zero-padded, and filtered by synthesis filters S₀-S_n, and summed by summer 731 to complete the link. Reconstruction filter banks and other aspects of filter banks are described in further detail in Gilbert Strang & Truong Nguyen, Wavelets and Filter Banks (Wellesley Cambridge Press, 1996).

System 300, however, preferably employs an alternative approach in which, instead of directly computing subbands with specially designed, dedicated filters for each subband, as in filters 600-700 of Figs. 6-7, a filter tree formed from filter banks with identical LPF/HPF pairs is utilized, along with appropriate delay matching and decimation to prevent aliasing. In this approach, the filtering of the signal to divide it into subbands is accomplished by a tree of filters and frequency shifts. This approach may require a reduced amount of hardware to perform the equivalent subband separation, since smaller subbands require longer filters to separate them. Referring now to Figs. 4-5, there are shown filter banks 400, 500 which may be utilized to form analysis and synthesis filter trees to implement the analysis and synthesis filter bank units 311, 321 of Fig. 3. Thus, Fig. 4 illustrates an analysis filter bank 400 that can be used as a building block of an analysis filter tree to perform the functionality performed by the analysis filter bank of unit 311 of system 300 of Fig. 3. Fig. 5 illustrates a synthesis filter bank 500 that can be used to form a synthesis filter tree that performs the functionality performed by the synthesis filter bank of unit 321 of system 300 of Fig. 3. In these filter banks, e.g. filter bank 400, rotators (frequency shifters or multipliers) 411 are placed at the output of the decimators of the HPFs, to convert the high band signal back to baseband. This permits LPF/HPF pairs to be used instead of requiring a more complex and expensive bandpass filter. In a sampled system, only a band of frequencies of bandwidth equal to one-half the sample rate can be represented without aliasing. Typically, this range is from 0 to fs/2 (one-half the sample rate). The LPF passes the low half of this region, from 0 to fs/4.
The HPF can be computed by passing all frequencies and subtracting the output of the low pass filter, leaving only the high pass frequency band. Conversely, in the decoder type synthesis filter bank 500, each high band is rotated by rotator 501 to bring the signal back up from baseband to a high band signal, before zero padding it with zero padders 502 and applying the zero-padded high band signal to the corresponding HPF for the LPF/HPF pair for that subband.
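
One LPF/HPF analysis stage of this kind can be sketched with NumPy. The filter design here (a 31-tap Hamming-windowed sinc) is an assumption and is not the filter set of the figures; the "pass all frequencies" path is modeled as a pure delay matching the LPF, and the rotator is modeled as multiplication by (-1)^n, which shifts the decimated high band by half its new sample rate. All function names are hypothetical.

```python
import numpy as np

def lowpass_taps(num_taps=31, cutoff=0.25):
    """Windowed-sinc LPF; cutoff in cycles/sample (0.25 = fs/4)."""
    if num_taps % 2 == 0:
        raise ValueError("use an odd tap count so the delay is a whole sample")
    n = np.arange(num_taps) - (num_taps - 1) / 2
    taps = 2 * cutoff * np.sinc(2 * cutoff * n) * np.hamming(num_taps)
    return taps / taps.sum()                     # unity gain at DC

def complementary_highpass(lpf):
    """HPF = (delayed all-pass path) - (LPF output)."""
    delayed_impulse = np.zeros_like(lpf)
    delayed_impulse[(len(lpf) - 1) // 2] = 1.0   # matches the LPF delay
    return delayed_impulse - lpf

def gain_at(taps, freq):
    """Magnitude response at freq (cycles/sample)."""
    n = np.arange(len(taps))
    return abs(np.sum(taps * np.exp(-2j * np.pi * freq * n)))

def analysis_split(x, lpf, hpf):
    """Filter, decimate by 2, and rotate the high band to baseband."""
    low = np.convolve(x, lpf)[::2]
    high = np.convolve(x, hpf)[::2]
    rotator = (-1.0) ** np.arange(len(high))     # shift by half the new rate
    return low, high * rotator

lpf = lowpass_taps()
hpf = complementary_highpass(lpf)

# A tone at 0.30 fs lies 0.05 fs above the band edge fs/4; after the
# split it should appear at 0.10 cycles/sample of the halved rate.
n = np.arange(4000)
low, high = analysis_split(np.cos(2 * np.pi * 0.30 * n), lpf, hpf)
peak = np.fft.rfftfreq(len(high))[np.argmax(np.abs(np.fft.rfft(high)))]
```

A tree is then formed by applying the same split again to either output, which is why a single LPF/HPF pair can serve every level.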

The filter banks 400, 500 of Figs. 4-5 may be enlarged or expanded into trees with various decimation levels to provide more subbands, as will be appreciated. In one embodiment, the filter bank approach of Figs. 4-5 is applied for five subbands, to provide a system such as that illustrated in Figs. 8-9. In particular, a system using filter trees pruned to different levels is illustrated in Figs. 8 and 9. These filter trees comprise a number of filter banks such as filter banks 400, 500 of Figs. 4-5, along with delay matching units. The system formed from the filter tree banks of Figs. 8 and 9 provides good performance at a (low) rate of 15Kbps. In the encoder 800 of Fig. 8, for example, at each decimation level there are filter tree banks consisting of LPF/HPF pairs, decimators, rotators, and matching delay units, as necessary. The filter banks portion 811 filters and decimates the input signal to provide the five subbands to the ADPCM encoders 812, which encode each subband at the number of bits per sample appropriate for the particular subband and the current bit rate to be achieved by encoder 800. These ADPCM encoded subbands are then assembled by packet assembly unit 813, to provide a compressed signal having a total bit rate such as 15Kbps for a low bit rate encoding, which may be then transmitted via the RF channel and received by decoder 900 of Fig. 9.

Thus, in an embodiment, system 300, employing filter trees 800, 900, operates as follows. An input signal having a given sample rate is separated into subbands and decimated by the analysis filter banks comprising portion 811 (311) of filter tree 800. The decimated subband signals are then encoded (by ADPCM encoders 312/812), and assembled by packet assembly 813, and then transmitted (via channel 331). The individual subbands received by a decoder 320 implementing filter tree 900 are then decoded as follows. First, the packets are disassembled by packet disassembly module or unit 913, and the individual subbands are decoded by ADPCM decoders 322 (922). Synthesis portion 921 (which implements synthesis unit 321) then zero-pads and filters the signals, which are then summed by summer 323 (923), to synthesize (reconstruct) the original signal encoded by the encoder in order to complete the communication link.

An example used herein for illustrative purposes assumes that the telephone voice band of 250-3250Hz is to be preserved and ADPCM subband encoded in an 8KHz sample rate system (that is, in a system in which the input audio signal is sampled at 8KHz), with 14 bits/sample, resulting in 112Kbps. In order to achieve higher compression and higher quality for a given bandwidth or available bit rate, frequency bands outside the defined voice band (250-3250Hz) of the input audio signal are completely omitted from encoding. For example, an input signal resulting from a microphone exposed to audible human speech may have frequencies from 0 to 4000Hz (sampled at 8KHz, with 14 bits/sample). The voice band is 250-3250Hz; thus, the subbands from 0-250Hz and 3250-4000Hz need not be encoded; if they are, processing and transmission channel bandwidth may be wasted or underutilized. Therefore, subbands from 0-250Hz and 3250-4000Hz are not coded at all.

The remaining voice band of 250-3250Hz is divided into five subbands 0-4, as shown in the subband column of Table 1, below. Then, for a given compressed bit rate to be achieved (e.g., very low, low, or high), each subband is ADPCM compressed at a number of bits per sample according to the relative perceptual importance of speech data in that subband. Typically, the first two subbands 0 and 1 are more important because they are in the lower frequency ranges, and thus should be allocated more bits per sample in the ADPCM encoding than the other subbands. This is because the human ear is more sensitive to noise and quantization distortion in the low frequency bands than in the high frequency bands.
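As a rough illustration of how one ADPCM structure can be run at different numbers of bits per sample, the following Python sketch uses a previous-sample predictor and a simple multiplicative step-size rule. The function names, the adaptation factors (1.5 and 0.9), the step limits, and the initial step are all assumptions made for illustration; this is not the algorithm of ADPCM encoders 812 or of any standard.

```python
def _adapt(step, code, bits):
    """Grow the step after a large code, shrink it after a small one (assumed rule)."""
    outer_half = abs(code) >= (1 << (bits - 2))   # requires bits >= 2
    step *= 1.5 if outer_half else 0.9
    return min(max(step, 1.0), 2048.0)            # assumed step limits


def adpcm_encode(samples, bits, step=16.0):
    """Quantize the prediction error of each sample to a signed `bits`-bit code."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    pred, codes = 0.0, []
    for x in samples:
        code = max(lo, min(hi, round((x - pred) / step)))
        codes.append(code)
        pred += code * step            # track what the decoder will reconstruct
        step = _adapt(step, code, bits)
    return codes


def adpcm_decode(codes, bits, step=16.0):
    """Rebuild the signal; state updates mirror the encoder exactly."""
    pred, out = 0.0, []
    for code in codes:
        pred += code * step
        out.append(pred)
        step = _adapt(step, code, bits)
    return out


if __name__ == "__main__":
    signal = [((n * 37) % 200) - 100 for n in range(20)]
    for bits in (4, 2):                # e.g. a low-frequency vs. a high-frequency subband
        codes = adpcm_encode(signal, bits)
        print(bits, codes[:8], [round(v) for v in adpcm_decode(codes, bits)[:8]])
```

Because the decoder repeats the encoder's predictor and step-size updates from the codes alone, no side information is needed beyond the number of bits per sample in use for each subband.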

If the encoder is to encode at an overall low bit rate (e.g., 15Kbps, as per the "Low Rate" columns of Table 1), for example, the low frequency subbands 0 and 1 (i.e., 250-1000Hz) are coded at 4 bits per sample, and high frequency subbands 2-4 (1000-3250Hz) are ADPCM encoded at only 2 bits/sample, as shown in the "Low Rate" columns of Table 1. Thus, by filtering the input signal into separate subbands and employing appropriate decimation, low frequencies may be encoded with a greater number of bits relative to high voice frequencies, to improve overall audio quality of the decoded signal for the same compressed bit rate. Table 1 shows how the different bands are encoded, and different ways to achieve different bit rates for various degrees of voice quality, in various alternative embodiments of the present invention. Table 1 may be used with any embodiment of the present invention, such as the encoder/decoder architecture of system 300 of Fig. 3, or with the encoders and decoders shown in Figs. 4-5, 6-7, and 8-9.
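The 15Kbps figure can be checked from the subband edges and the low-rate bit allocation. The Python sketch below assumes that each subband is critically decimated to a sample rate of twice its bandwidth; that per-subband rate is an assumption, since it is not listed in this text, and the names used are illustrative only.

```python
# Subband edges in Hz and the low-rate bits per sample for subbands 0-4.
SUBBANDS_HZ = [(250, 500), (500, 1000), (1000, 2000), (2000, 3000), (3000, 3250)]
LOW_RATE_BITS = [4, 4, 2, 2, 2]


def total_bit_rate(bands_hz, bits_per_sample):
    """Sum of (assumed sample rate = 2 x bandwidth) x (bits per sample)."""
    return sum(2 * (hi - lo) * bits
               for (lo, hi), bits in zip(bands_hz, bits_per_sample))


UNCOMPRESSED_BPS = 8000 * 14          # 8KHz sampling at 14 bits/sample
LOW_RATE_BPS = total_bit_rate(SUBBANDS_HZ, LOW_RATE_BITS)

if __name__ == "__main__":
    print("uncompressed:", UNCOMPRESSED_BPS, "bit/s")
    print("low rate:    ", LOW_RATE_BPS, "bit/s")
    print("ratio:       ", round(UNCOMPRESSED_BPS / LOW_RATE_BPS, 2))
```

Under this assumption the five subbands contribute 2000, 4000, 4000, 4000, and 1000 bits per second respectively, for a total of 15000 bits per second against 112000 bits per second uncompressed.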

Table 1: Bit Rate Table

As shown in Table 1, various ADPCM compression levels may be employed, corresponding to resulting bit rates of very low rate, low rate, and standard rate. As will be appreciated, a given encoder may select a particular compression rate depending upon current requirements. In one embodiment of wireless telephone system 100, for example, higher compression is employed by the transmitter encoders when a larger number of handsets are in use. Consequently, the transition from one bit rate to another must be performed on the fly. Also, if a given channel 331 is degrading, a lower bit rate (high compression) may be employed to permit additional error correction, which also requires bit rates to be changed dynamically, or on the fly.

To be able to switch compression and hence bit rates dynamically, datapaths with identical delays are employed. One approach calls for switching between a low bit rate and standard bit rate as shown in Table 1, where all datapath delays are common. Alternatively, for example, it may be desired to switch between the standard 4 bit/sample ADPCM on the entire 8Ksps (8000 samples per second) signal and the low rate of Table 1. In this case, the delay of the entire filter tree must be added to the standard full bandwidth ADPCM path. With respect to delay, the compression at the encoder stage in the present invention, e.g. the system illustrated by the filter banks of Figs. 8 and 9, can add considerable delay to the signal. Since there is a decimation at each level of the tree, the filters of level 4 take 8 times longer than the level 1 filters. Delay for the system is calculated in Table 2, below, where d is the delay of the filter (high and low pass filters must have equivalent delay), and ch is the channel delay.

With a 31 tap truncated sin(x)/x filter, this amounts to 450 samples, or 56.25ms of delay for the filters, plus the channel delay. Thus, it may be desirable to employ line echo cancellation with such a compression scheme. In the example, some signals are taken to ADPCM from levels 2 and 3. These are delayed by the "matching delay" blocks or units 801 to match the delay in the path to level 4.

Table 2: Delay Calculation
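The quoted delay figures can be cross-checked with a short calculation. Because the entries of Table 2 are not reproduced in this text, the Python sketch below is only one decomposition that is consistent with the stated numbers: it assumes four tree levels, a per-filter delay d of (31 - 1)/2 = 15 samples for a linear-phase 31 tap filter, a doubling of the delay (in input samples) at each deeper level, and equal analysis and synthesis delays. The channel delay ch is left out, and the function name is hypothetical.

```python
def filter_tree_delay_samples(d, levels=4):
    """Analysis plus synthesis delay, in input-rate samples.

    A filter at level k runs at 1/2**(k-1) of the input rate, so its
    delay of d samples costs d * 2**(k-1) input samples (assumption).
    """
    analysis = sum(d * 2 ** (k - 1) for k in range(1, levels + 1))
    return 2 * analysis                # synthesis assumed to mirror analysis


def samples_to_ms(samples, fs_hz=8000):
    """Convert a delay in samples at fs_hz to milliseconds."""
    return 1000.0 * samples / fs_hz


if __name__ == "__main__":
    total = filter_tree_delay_samples(d=15)
    print(total, "samples =", samples_to_ms(total), "ms")
```

With d = 15 this gives 2 x 15 x (1 + 2 + 4 + 8) = 450 samples, or 56.25ms at 8KHz. Under the same assumptions, an effective per-filter delay of 2 samples would give 60 samples, or 7.5ms.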

With respect to the high-pass and low-pass filters (HPF, LPF, respectively) employed in the present invention, empirical results have shown that good results can be achieved with a simple 31 tap truncated sin(x)/x impulse response type filter. Each LPF divides the frequency band in half, with its 3dB point at 1/4 of the sample rate; the HPF frequency response is a mirror image of the low pass filter. Eqs. (1) and (2) below describe the impulse response of such an LPF/HPF pair, as used in the analysis and synthesis filter banks of the encoders and decoders of the present invention:

An LPF/HPF filter pair designed as truncated sin(x)/x functions has a rectangular passband from zero to fs/4 for the LPF, and a rectangular passband from fs/4 to fs/2 for the HPF. The HPF is formed by subtracting the lowpass impulse response, lpf(n), from an allpass impulse response, which for this example would be [00000000000000010000000000000000]. This filter passes everything, and matches the delay of the sin(x)/x derived filters. Figs. 10, 11, and 12 describe characteristics of a specially designed impulse response which minimizes delay through the filter at the expense of some group delay distortion. This filter pair works together to cancel aliasing caused by signals passing through the transition bands and stopbands of the filters.
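The construction of the HPF from the LPF can be sketched in Python as follows. Since Eqs. (1) and (2) are not reproduced in this text, the sketch uses a common form of a 31 tap truncated sin(x)/x lowpass with cutoff at fs/4, unity passband scaling, and no window; these choices, the omission of the gain that compensates for decimation and zero padding, and all names are assumptions.

```python
import math

TAPS = 31
CENTER = (TAPS - 1) // 2               # delay of the linear-phase filters, in samples


def lowpass_taps():
    """Truncated sin(x)/x impulse response with cutoff at fs/4 (assumed form)."""
    taps = []
    for n in range(TAPS):
        k = n - CENTER
        taps.append(0.5 if k == 0 else math.sin(math.pi * k / 2) / (math.pi * k))
    return taps


def highpass_taps(lpf):
    """Allpass (a unit impulse at the filter delay) minus the lowpass response."""
    return [(1.0 if n == CENTER else 0.0) - h for n, h in enumerate(lpf)]


def gain(taps, omega):
    """Magnitude response at omega radians/sample (pi is half the sample rate)."""
    re = sum(h * math.cos(omega * n) for n, h in enumerate(taps))
    im = sum(h * math.sin(omega * n) for n, h in enumerate(taps))
    return math.hypot(re, im)


if __name__ == "__main__":
    lpf = lowpass_taps()
    hpf = highpass_taps(lpf)
    for name, taps in (("LPF", lpf), ("HPF", hpf)):
        print(name, [round(gain(taps, w * math.pi / 4), 3) for w in range(5)])
```

In this sketch the highpass taps equal the lowpass taps with alternating sign about the center tap, which is why the HPF frequency response is the mirror image of the LPF response about fs/4.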

By controlling the phase (±1) of the frequency shift multipliers 802, the band edges can be configured to combine constructively or destructively. Constructive combinations result in increased alias terms using the filters described below. These aliases can give speech an undesirable "robotic" quality. Thus, it is desirable to configure the phases so that the band edges combine destructively to improve sound quality, although this can cause dips in the voiceband frequency response at the filter band edges. The frequency response 1000 of the filters is shown in Fig. 10. In particular, Fig. 10 shows the LPF and HPF frequency responses 1001, 1002, respectively, for filters of Figs. 3, 4-5, and 8-9. It should be noted that there is gain in the filter to compensate for the decimation and zero padding operation. Filters can be designed which allow perfect reconstruction, where aliasing components created in the analysis bank combine to produce a flat frequency response in the synthesized result. The coefficients of Table 4 are designed with this in mind.

In the embodiments described above, the speech signal is divided into five subbands, after discarding the omitted or unused upper and lower subbands. In alternative embodiments, fewer or more than five subbands may be employed. All things being equal, however, it is preferable to employ fewer, rather than more, subbands. This is because using fewer bands reduces the amount of memory required (in filter states and in ADPCM states). Also, with current filter technology, fewer bands result in fewer dips in the frequency response caused by interference at the band edges. Finally, ADPCM encoding takes time to respond to a sudden change in a signal, which delay becomes evident as the ADPCM compression is operated at lower and lower sample rates. Thus, by minimizing the number of low sample rate bands, ADPCM adaptation artifacts are minimized.

The present invention is suitable in a variety of applications, and, in particular, is applicable wherever modest bit rate digitized speech is needed in a high error rate environment, and where simple speech compression algorithms are needed. Examples of such systems which can benefit from the ADPCM subband encoding of the present invention include wireless telephone systems, digital 2-way radio, and Internet telephony.

In an alternative embodiment, a low delay, near minimum phase filter set may be used in the subband ADPCM encoding and decoding described herein. Such a low delay, near minimum phase filter set reduces encoder-decoder delay from the 56.25ms (encode and decode) described above down to 7.5ms, which is substantially less than typical LPC-type approaches. While the filter group delay is not flat, it does not degrade speech intelligibility, and sounds as good as or better than many long delay filters. Key benefits of using such a filter set are listed in Table 3 below:

Table 3

Low complexity, delay, and sensitivity to errors result from use of this filter set, so that it is useful in a multi-line/multi-handset wireless telephone system in which, for example, the base unit may contain 12 vocoders. Since cable modems have the bandwidth to support multiple calls, the use of the subband ADPCM compression of the present invention can simplify design of multi-"line" cable phone systems for the same reason. Low error sensitivity is also a benefit in wireless applications, since channel coding requirements can be reduced. Some PCS systems use rate 1/2 and 1/3 convolutional codes, which double and triple the channel bit rates required.
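The effect of channel coding on the required channel bit rate can be shown with a brief Python calculation. It applies the general relation that a rate k/n code sends n channel bits for every k information bits; the 15Kbps input is the low rate of the example above, and the function name is hypothetical.

```python
from fractions import Fraction


def channel_bit_rate(info_bps, code_rate):
    """Channel bit rate needed to carry info_bps through a rate-k/n code."""
    return info_bps / code_rate


if __name__ == "__main__":
    for rate in (Fraction(1, 2), Fraction(1, 3)):
        print(f"rate {rate}: {channel_bit_rate(15000, rate)} bit/s on the channel")
```

A compressed speech signal that tolerates errors without such coding therefore saves channel bandwidth in proportion to the code rate that would otherwise be needed.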

The frequency characteristics of such a low delay, near minimum phase filter set are illustrated in Figs. 11 and 12. Fig. 11 shows the LPF and HPF magnitude frequency responses 1101, 1102, respectively, and Fig. 12 shows the LPF and HPF group delays 1201, 1202, respectively. As can be seen in the group delay plots of Fig. 12, most of the passband (0 to 0.5 for the LPF and 0.5 to 1 for the HPF) has delay less than 3, peaking to 7 only at the transition region around a normalized frequency of 0.5, where 1 is one-half the sample rate. These delay distortions are practically imperceptible in speech. Table 4, below, shows the LPF and HPF filter coefficients that characterize such a filter set. Such a filter set may be implemented by suitably configuring a finite impulse response (FIR) type filter in accordance with the filter coefficients listed in Table 4, below.

Table 4: Filter Coefficients

The ADPCM subband encoding and decoding technique of the present invention has been described above with reference to implementation in wireless telephone systems, but may also be applied in other communications systems as well. In answering machine or voice mail systems, for example, the present invention may be employed to reduce the processing complexities compared to CELP and LPC techniques. Reducing processing requirements allows a system to answer multiple lines or play back to multiple handsets. Some applications, such as answering machines, voice mail, or Internet phones, may be able to further reduce bit rates using additional techniques such as: dropping or changing quantization of frequency bands below a threshold energy (which can involve measuring energy in each subband and transmitting a code indicating which bands are coded at what rate); and detecting quiet time and not storing information during quiet periods.
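The band-dropping idea mentioned above (measuring the energy in each subband and transmitting a code indicating which bands are coded) might be sketched in Python as follows. The mean-square energy measure, the single threshold, the one-bit-per-band mask layout, and the function names are all assumptions made for illustration; this text does not specify the code format.

```python
def band_energy(samples):
    """Mean-square energy of one block of subband samples."""
    return sum(x * x for x in samples) / len(samples) if samples else 0.0


def coded_band_mask(subband_blocks, threshold):
    """Bit i of the returned mask is set when subband i is to be coded.

    Subbands whose energy falls below the threshold are dropped, and the
    mask is the code that tells the decoder which subbands are present.
    """
    mask = 0
    for i, block in enumerate(subband_blocks):
        if band_energy(block) >= threshold:
            mask |= 1 << i
    return mask


if __name__ == "__main__":
    blocks = [[100, -100], [0, 0], [1, 1], [50, 50], [0, 1]]   # made-up sample values
    print(format(coded_band_mask(blocks, threshold=10.0), "05b"))
```

A variant of the same idea could select a lower number of bits per sample for weak subbands instead of dropping them, at the cost of a larger code per block.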

One skilled in the art will recognize that the wireless system described above according to the principles of the invention may be a cellular system where base unit 110 represents a base station serving one of the cells in a cellular telephone network.

It will be understood that various changes in the details, materials, and arrangements of the parts which have been described and illustrated above in order to explain the nature of this invention may be made by those skilled in the art without departing from the principle and scope of the invention as recited in the following claims.

Claims

1. An encoder for encoding an input signal having an input sample rate, the encoder comprising:

(a) a filter bank for separating the input signal into a plurality of subband signals, each subband covering a different frequency range; and

(b) a plurality of adaptive differential pulse code modulation (ADPCM) encoders, one for each subband, for encoding each subband signal at a number of bits per sample according to the relative importance of the subband.

2. The encoder of claim 1, wherein the input signal is an audio signal.

3. The encoder of claim 2, wherein the audio signal comprises speech signals.

4. The encoder of claim 2, wherein the filter bank separates a voice band portion of the audio signal into a plurality of adjacent subband signals.

5. The encoder of claim 4, wherein: the voice band is from 250-3250Hz; the plurality of adjacent subbands comprise five subbands 0-5 covering the following frequency ranges, respectively: 250-500Hz, 500-1000Hz, 1000-2000Hz, 2000-3000Hz, and 3000-3250Hz.

6. The encoder of claim 5, wherein: the input signal has an input sample rate of 8KHz and is sampled at 14bits/sample; and at a compressed signal total bit rate of 15Kbps, the five subband signals are encoded at the following number of bits per sample, respectively: 4 bits/sample, 4 bits/sample, 2 bits/sample, 2 bits/sample, and 2 bits/sample.

7. The encoder of claim 1, wherein the encoders encode subband signals covering lower frequencies of the voice band at a higher number of bits per sample relative to subband signals covering higher frequencies of the voice band.

8. The encoder of claim 1, wherein the plurality of encoders encode subbands having a relatively higher perceptual importance at a relatively higher number of bits per sample.

9. The encoder of claim 1, wherein the plurality of encoders encode each subband signal at a number of bits per sample according to the relative perceptual importance of the subband.

10. The encoder of claim 1, wherein the plurality of encoders encode each subband signal at a number of bits per sample according to the relative perceptual importance of the subband.

11. The encoder of claim 1, further comprising a plurality of decimators for decimating each subband signal before encoding by the encoders to minimize the number of samples per subband signal while representing the subband signal without aliasing.

12. The encoder of claim 1, wherein the encoder is an encoder of a transmitter of a plurality of transceivers of a wireless telephone system comprising: a base transceiver having a base transmitter; and a plurality of wireless handsets, each handset comprising a handset transceiver for establishing a wireless link over a shared channel with the base unit via the base transceiver, each handset transceiver having a handset transmitter.

13. The encoder of claim 12, wherein the wireless link is a time-division multiple access (TDMA) link, in which each handset communicates during an exclusive time slot of a TDMA scheme that allocates time slots to handsets.

14. A wireless telephone system, comprising:

(a) a base transceiver having a base receiver and a base transmitter; and (b) a plurality of wireless handsets, each handset comprising a handset transceiver for establishing a wireless link over a shared channel with the base unit via the base transceiver, each handset transceiver having a handset receiver and a handset transmitter, each said transmitter comprising a transmitter encoder for encoding an input signal having an input sample rate, the transmitter encoder comprising:

(1) a filter bank for separating the input signal into a plurality of subband signals, each subband covering a different frequency range; and

(2) a plurality of ADPCM encoders, one for each subband, for encoding each subband signal at a number of bits per sample according to the relative importance of the subband.

15. The system of claim 14, wherein the wireless link is a time-division multiple access (TDMA) link, in which each handset communicates during an exclusive time slot of a TDMA scheme that allocates time slots to handsets.

16. The system of claim 14, wherein each receiver comprises a receiver decoder for decoding an encoded signal comprising a plurality of subband signals transmitted by a transmitter of the system, wherein each receiver decoder comprises:

(1) a plurality of decoders, one for each subband, for decoding each subband signal of the transmitted signal at the number of bits per sample for that subband; and

(2) a synthesis filter bank and summer for combining the decoded subband signals to produce an original signal encoded by the transmitter.

17. The system of claim 14, wherein: the input signal is an audio signal; the filter bank separates a voice band portion of the audio signal into a plurality of adjacent subband signals.

18. The system of claim 17, wherein the encoders encode subband signals covering lower frequencies of the voice band at a higher number of bits per sample relative to subband signals covering higher frequencies of the voice band.

19. The system of claim 17, wherein: the voice band is from 250-3250Hz; the plurality of adjacent subbands comprise five subbands 0-5 covering the following frequency ranges, respectively: 250-500Hz, 500-1000Hz, 1000-2000Hz, 2000-3000Hz, and 3000-3250Hz.

20. The system of claim 19, wherein: the input signal has an input sample rate of 8KHz and is sampled at 14bits/sample; and at a compressed signal total bit rate of 15Kbps, the five subband signals are encoded at the following number of bits per sample, respectively: 4 bits/sample, 4 bits/sample, 2 bits/sample, 2 bits/sample, and 2 bits/sample.

21. The system of claim 14, wherein each encoder further comprises a plurality of decimators for decimating each subband signal before encoding by the encoders to minimize the number of samples per subband signal while representing the subband signal without aliasing.