WO1998004046A2 - Enhanced encoding of dtmf and other signalling tones - Google Patents

Info

Publication number
WO1998004046A2
Authority
WO
WIPO (PCT)
Prior art keywords
spectrum
voice
voice signal
signal
quantization
Prior art date
Application number
PCT/CA1997/000516
Other languages
French (fr)
Other versions
WO1998004046A3 (en
Inventor
Redwan Salami
Claude Laflamme
Jean-Pierre Adoul
Original Assignee
Universite De Sherbrooke
Priority date
Filing date
Publication date
Application filed by Universite De Sherbrooke
Priority to EP97931602A (published as EP0913034A2)
Priority to AU35345/97A (published as AU3534597A)
Publication of WO1998004046A2
Publication of WO1998004046A3

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/06 Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients
    • G10L19/07 Line spectrum pair [LSP] vocoders
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16 Vocoder architecture
    • G10L19/18 Vocoders using multiple modes
    • G10L19/20 Vocoders using multiple modes using sound class specific coding, hybrid encoders or object based coding
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L2019/0001 Codebooks
    • G10L2019/0016 Codebook for LPC parameters
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals

Definitions

  • the present invention relates to the field of digital encoding of voice signals.
  • voice signal is intended to designate speech, audio, music and other signals.
  • DTMF Dual Tone Multi-Frequency
  • the present invention enhances encoding of DTMF signals and other signalling tones so as to prevent their purpose from being hindered by the digital encoding procedure.
  • Low bit rate speech encoding algorithms are usually based on a speech production model and therefore optimized for speech signals. As the bit rate is reduced to 8 kbits/second and below, these encoders meet difficulties in encoding non-speech signals such as DTMF signals and other signalling tones; this results in occasional failures in detecting these signals at the receiver end.
  • Central to the speech production model is the parametric description of the short-term speech spectrum.
  • the most common approach called “linear prediction” consists of transmitting at regular time intervals, typically every 10 or 20 milliseconds, a set of so-called linear prediction (LP) coefficients. Efficient encoding of the LP coefficients involves quantization tables trained by means of a speech data base.
  • An object of the present invention is to provide a quantizing method and device capable of overcoming the above described drawbacks of the prior art for example by "reserving" in the field of entries to the speech-trained quantization table of LP coefficients some entries for representing the short-term spectrum of DTMF signals and other signalling tones.
  • Another object of the present invention is to introduce in a quantization method and device a codebook specific to DTMF signals (or other signalling tones) with minimal change to the conventional quantization procedure.
  • a method and a device are provided for quantizing a spectrum vector, supplied at recurrent time intervals, to produce a spectrum index, using a spectrum-vector quantization codebook including a voice-signal quantization-codebook portion and a non-voice signal quantization codebook portion.
  • a detection to determine whether the spectrum vector represents a voice signal or a non-voice signal is made.
  • when the spectrum vector represents a voice signal, the voice-signal quantization-codebook portion is searched for quantizing the spectrum vector and producing the spectrum index.
  • the non-voice signal quantization codebook portion is searched for quantizing the spectrum vector and producing the spectrum index when the spectrum vector represents a non-voice signal.
  • the non-voice signal quantization codebook portion searched for encoding the non-voice signal representative spectrum indexes greatly improves encoding of non-voice signals such as DTMF signals and other signalling tones.
  • the present invention also relates to a method and a device for quantizing a spectrum vector, supplied at recurrent time intervals, to produce a spectrum index.
  • a spectrum-vector quantization codebook including a voice-signal quantization-codebook portion and a non-voice signal quantization codebook portion.
  • the voice-signal quantization-codebook portion and the non-voice signal quantization codebook portion are searched by measuring a weighted distance between the spectrum vector and the entries of the voice-signal quantization-codebook portion, and the non-voice signal quantization codebook portion.
  • the spectrum vector represents a voice signal when the smallest weighted distance is the weighted distance measured between the spectrum vector and one entry of the voice-signal quantization-codebook portion.
  • the spectrum vector represents a non-voice signal when the smallest weighted distance is the weighted distance measured between the spectrum vector and one entry of the non-voice signal quantization codebook portion.
  • the voice-signal quantization-codebook portion comprises a plurality of quantization codebook subtables each having a plurality of entries, a predetermined set of combinations of partial spectrum indexes are reserved for non-voice signals, and searching the voice-signal quantization-codebook portion comprises searching the quantization codebook subtables and producing corresponding partial spectrum indexes forming combinations not included in the predetermined set of combinations of partial spectrum indexes.
  • the spectrum vector represents a voice signal
  • the spectrum index is produced by combining the partial spectrum indexes corresponding to said one entry of the voice-signal quantization-codebook portion.
  • the spectrum index represents a non-voice signal
  • the spectrum index is produced by selecting, in relation to said one entry of the non-voice signal quantization codebook portion, one combination of the predetermined set.
  • the predetermined set of combinations of partial spectrum indexes reserved for non-voice signals correspond to invalid combinations of entries of respective quantization codebook subtables.
  • the spectrum vector has components related to line-spectral-pairs
  • the voice-signal quantization-codebook portion comprises at least three quantization codebook subtables each having a plurality of entries
  • one combination of the predetermined set is selected to form the spectrum index, this combination being composed of a non-voice-signal label part and a second part related to said one entry of the non-voice signal quantization codebook portion
  • the non-voice-signal label part corresponds to a combination of entries of two subtables amongst the at least three quantization codebook subtables which is logically invalid in regard to adjacent line-spectral-pair component ordering.
  • the quantization codebook subtables are searched in stages including a first stage and at least one subsequent stage, and the predetermined set of combinations of partial spectrum indexes is formed by considering, at least, one predetermined partial spectrum index for the first stage combined with partial spectrum indexes corresponding to entries of the quantization codebook subtables searched in the subsequent stage(s).
  • the present invention is further concerned with a method and a device for quantizing a spectrum vector, supplied at recurrent time intervals, to produce a spectrum index, which method and device use a spectrum-vector quantization codebook including a voice-signal quantization-codebook portion and a non-voice signal quantization codebook portion.
  • a weighted distance between the spectrum vector and the entries of the non-voice signal quantization codebook portion is measured, and the spectrum vector is detected as representing a non-voice signal when the weighted distance measured between the spectrum vector and one entry of the non-voice signal quantization codebook portion is smaller than a predetermined weighted distance threshold.
  • a spectrum index including a predetermined non-voice-signal label part and a second part related to said one entry of the non-voice signal quantization codebook portion is produced.
  • the voice-signal quantization-codebook portion is searched for quantizing the spectrum vector and producing the spectrum index.
  • the voice-signal quantization-codebook portion comprises a plurality of quantization codebook subtables each having a plurality of entries
  • the voice-signal quantization-codebook portion comprises addresses which are related to combinations of entries of the plurality of quantization codebook subtables
  • the voice-signal quantization-codebook portion is searched by splitting the spectrum vector into a plurality of subvectors, searching the quantization codebook subtables for quantizing the subvectors, respectively, and producing respective partial spectrum indexes, and combining the partial spectrum indexes to produce the spectrum index, and an invalid combination of the entries of at least two quantization codebook subtables is reserved as the predetermined non-voice-signal label part;
  • the voice-signal quantization-codebook portion and the non-voice signal quantization codebook portion comprise a plurality of stages including a first stage and at least one subsequent stage, each stage having a given number of entries, at least one entry of the first stage is reserved as the predetermined non-voice-signal label part, and the at least one entry of the first stage is combined with at least one entry of the subsequent stage(s) to represent non-voice signals.
  • the spectrum vector has components related to line-spectral-pairs or immitance-spectral-pairs
  • the measured weighted distance is a weighted Euclidean distance
  • the non-voice signal comprises a signalling tone, for example a DTMF signal.
  • the present invention still further relates to an encoder for encoding a voice or non-voice input signal, comprising an encoding section responsive to the voice or non-voice input signal for producing residual voice or non-voice signal information, a spectrum processing section responsive to the input voice or non-voice signal for producing a spectrum index, and means for transmitting the residual signal information and the spectrum index through a communication channel.
  • the spectrum processing section comprises means responsive to the input voice or non-voice signal for producing a spectrum vector at recurrent time intervals and one of the above described devices for quantizing the spectrum vector to produce the spectrum index.
  • a cellular communication system for servicing a large geographical area divided into a plurality of cells, comprising: mobile transmitter/receiver units; cellular base stations respectively situated in the cells; means for controlling communication between the cellular base stations; and a bidirectional wireless communication sub-system between each mobile unit situated in one cell and the cellular base station of that cell, this bidirectional wireless communication sub-system comprising in both the mobile unit and the cellular base station (a) a transmitter including one of the above described encoders for encoding a voice or non-voice signal and means for transmitting the encoded voice or non-voice signal, and (b) a receiver including means for receiving a transmitted encoded voice or non-voice signal and means for decoding the received encoded voice or non-voice signal.
  • Figure 1 is a simplified block diagram of a LP voice encoder, showing spectrum processing modules including a spectrum vector quantization module;
  • Figure 2 is a block diagram of the spectrum vector quantization module of the LP voice encoder of Figure 1;
  • Figure 3 is a simplified, schematic block diagram of a cellular communication system in which the LP voice encoder of Figure 1 can be used;
  • Figure 4 is a flow chart illustrating a first method for labelling and representing DTMF signals;
  • Figure 5 is a flow chart illustrating a second method for labelling and representing DTMF signals;
  • Figure 6 is a flow chart illustrating a first method for detecting and quantizing DTMF signals;
  • Figure 7 is a flow chart illustrating a second method for detecting and quantizing DTMF signals.
  • a cellular communication system such as 301 (Figure 3) provides a telecommunication service over a large geographic area by dividing that large geographic area into a number C of smaller cells.
  • the C smaller cells are serviced by respective cellular base stations 302_1, 302_2 ... 302_C to provide each cell with radio signalling, audio and data channels.
  • the radio signalling channels are used to page mobile radiotelephones (mobile transmitter/receiver units) such as 303 within the limits of the coverage area (cell) of the cellular base station 302, and to place calls to other radiotelephones 303 located either inside or outside the base station's cell or to another network such as the Public Switched Telephone Network (PSTN) 304.
  • PSTN Public Switched Telephone Network
  • Once a radiotelephone 303 has successfully placed or received a call, an audio or data channel is established between this radiotelephone 303 and the cellular base station 302 corresponding to the cell in which the radiotelephone 303 is situated, and communication between the base station 302 and radiotelephone 303 is conducted over that audio or data channel.
  • the radiotelephone 303 may also receive control or timing information over the signalling channel whilst a call is in progress.
  • If a radiotelephone 303 leaves a cell and enters another adjacent cell while a call is in progress, the radiotelephone 303 hands over the call to an available audio or data channel of the new cell base station. If a radiotelephone 303 leaves a cell and enters another adjacent cell while no call is in progress, the radiotelephone 303 sends a control message over the signalling channel to log into the base station 302 of the new cell. In this manner mobile communication over a wide geographical area is possible.
  • the cellular communication system 301 further comprises a control terminal 305 to control communication between the cellular base stations 302 and the PSTN 304, for example during a communication between a radiotelephone 303 and the PSTN 304, or between a radiotelephone 303 located in a first cell and a radiotelephone 303 situated in a second cell.
  • a bidirectional wireless radio communication subsystem is required to establish an audio or data channel between a base station 302 of one cell and a radiotelephone 303 located in that cell.
  • a bidirectional wireless radio communication subsystem typically comprises in the radiotelephone 303:
  • a transmitter 306 including:
  • an encoder 307 for encoding the voice signal;
  • a receiver 310 including:
  • the radiotelephone further comprises other conventional circuits 313 to which the encoder 307 and decoder 312 are connected, which circuits 313 are well known to those of ordinary skill in the art and, accordingly, will not be further described in the subject patent application.
  • such a bidirectional wireless radio communication subsystem typically comprises in the base station 302:
  • a transmitter 314 including:
  • an encoder 315 for encoding the voice signal
  • a transmission circuit 316 for transmitting the encoded voice signal from the encoder 315 through an antenna such as 317;
  • a receiver 318 including:
  • a decoder 320 for decoding the received encoded voice signal from the receiving circuit 319.
  • the base station 302 further comprises, typically, a base station controller 321, along with its associated data base 322, for controlling communication between the control terminal 305 and the transmitter 314 and receiver 318.
  • voice encoding is required in order to reduce the bandwidth necessary to transmit voice signal, for example speech, across the bidirectional wireless radio communication subsystem, i.e. between a radiotelephone 303 and a base station 302.
  • the aim of the present invention is to provide an efficient technique usable by the encoders 307 and 315 of Figure 3 for encoding non-voice signals such as Dual-Tone Multi-Frequency (DTMF) signals and other signalling tones.
  • LP voice encoders typically operating at 13 kbits/second and below such as Code-Excited Linear Prediction (CELP) encoders use a LP synthesis filter to model the short-term spectral envelope of the voice signal.
  • CELP Code-Excited Linear Prediction
  • the LP information is transmitted, typically, every 10 or 20 ms to the decoder and is extracted at the decoder end.
  • Figure 1 is a simplified block diagram of a LP voice encoder 100 (that can be used as encoders 307 and 315 of Figure 3) showing explicitly the spectrum processing modules 102-104 which are used to extract and quantize the LP information.
  • Module 101 is used to represent the LP voice encoder 100 without the spectrum processing modules 102-104.
  • the structure of a LP voice encoder is believed to be well known to those of ordinary skill in the art and, accordingly, module 101 will not be further described in the present specification.
  • An example of LP voice encoder is illustrated in Figure 1 of US patent N° 5,444,816 granted on August 22, 1995 to Jean-Pierre Adoul and Claude Laflamme. The description of US patent N° 5,444,816 is incorporated herein by reference.
  • the spectrum processing modules 102-104 comprise a spectrum analysis module 102 for extracting a set of LP coefficients 106 from a sampled input voice or non-voice signal 105. To extract the set of LP coefficients 106, the spectrum analysis module 102 follows the well known linear-prediction analysis procedure.
  • the spectrum processing modules 102-104 also comprise a module 103 for transforming the set of LP coefficients 106 from spectrum analysis module 102 into another domain where quantization can be done more efficiently.
  • the most popular LP coefficient transformation is the Line Spectral Pairs (LSP) transformation.
  • LSP Line Spectral Pairs
  • ISP Immitance Spectral Pairs
  • Transformation module 103 therefore produces a spectrum vector 107 having components in line-spectral-pair parametric form or in immitance-spectral-pair parametric form.
  • the spectrum vector 107 can be either the LSP (or ISP) vector itself, or, in other embodiments, a LSP (or ISP) difference vector; this LSP (or ISP) difference vector is the difference between the LSP (or ISP) vector and a prediction vector based on past excitation.
  • the modules 102 and 103 are responsive to the sampled input voice or non-voice signal 105 to produce the spectrum vector 107 at recurrent time intervals.
  • the spectrum processing modules 102-104 comprise a spectrum vector quantization module 104.
  • the function of module 104 is to quantize the spectrum vector 107 delivered by the transformation module 103 in view of producing a spectrum index 108.
  • Module 101 produces residual voice or non-voice signal information 109.
  • the residual information 109 from module 101 and the spectrum index 108 from module 104 are multiplexed through a multiplexor 110 to produce a digital output propagated through a given audio or data channel.
  • VQ vector quantization
  • the spectrum information is quantized by means of “constrained VQ” schemes whereby the impractically large VQ table is emulated by combining a number of small quantization subtables.
  • the two commonly used constrained VQ schemes are the “M-way split-VQ” and the “multistage VQ” scheme.
  • the quantization subtables are jointly trained based on a large database using iterative algorithms such as the LBG or k-means algorithms [Allen Gersho and Robert M. Gray, “Vector Quantization and Signal Compression”, Kluwer Academic Publishers, 1992, 732 pages].
  • the training database consists of transformed LP vectors extracted from long voice sequences consisting mainly of male and female voice and often in several languages.
  • Figure 2 is a block diagram of the spectrum vector quantization module 104 of Figure 1.
  • two quantization schemes are compared for best performance, namely a conventional scheme (Box 1) and a specific scheme (Box 2).
  • Box 1 of Figure 2 represents the conventional scheme depicted herein as an M-way split scheme.
  • Vector splitting module 201 splits the input spectrum vector 107 from transformation module 103 (Figure 1) into M subvectors which are independently vector quantized in the M modules 202, 203 ... 204 using codebooks 205, 206 ... 207 of size N, respectively, where M and N are integers.
  • Codebooks 205, 206 ... 207 are quantization subtables trained using mostly voice/audio databases.
  • the corresponding codebook 205, 206 ... 207 is searched to find the nearest partial spectrum index corresponding to the input spectrum subvector.
  • the partial spectrum indexes from the vector quantization modules 202, 203 ... 204 and resulting from the M distinct VQ operations are multiplexed by multiplexor 208 to provide a spectrum index 213 according to the conventional M-way split scheme.
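The M-way split search of Box 1 can be sketched as follows. This is only an illustration under stated assumptions, not the implementation of the specification: the function names, the toy subtables and the plain squared-error distance are all hypothetical choices made for readability.

```python
from typing import List, Sequence

Vector = Sequence[float]


def nearest_entry(subvector: Vector, codebook: Sequence[Vector]) -> int:
    """Return the index of the codebook entry closest to subvector (squared error)."""
    best_index, best_dist = 0, float("inf")
    for index, entry in enumerate(codebook):
        dist = sum((x - y) ** 2 for x, y in zip(subvector, entry))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def split_vq(vector: Vector, dims: Sequence[int],
             codebooks: Sequence[Sequence[Vector]]) -> List[int]:
    """Split vector into len(dims) subvectors and quantize each independently."""
    partial_indexes = []
    start = 0
    for dim, codebook in zip(dims, codebooks):
        partial_indexes.append(nearest_entry(vector[start:start + dim], codebook))
        start += dim
    return partial_indexes


# Hypothetical toy setup: a 4-dimensional vector split 2 + 2,
# with two entries per subtable (real subtables hold hundreds of entries).
toy_codebooks = [
    [(0.1, 0.2), (0.3, 0.4)],
    [(0.5, 0.6), (0.7, 0.9)],
]
toy_indexes = split_vq((0.28, 0.41, 0.52, 0.61), (2, 2), toy_codebooks)
```

The list of partial indexes plays the role of the values that multiplexor 208 combines into spectrum index 213.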
  • the short-term spectral envelope of DTMF signals exhibits spectral shapes which are very different from those of voice signals.
  • DTMF signals are not included in the training database since they may affect the quantizer performance. This results in a quantization table which has no entries representative of DTMF signals.
  • As the bit rate is reduced to 8 kbits/second and below, the fewer bits allocated for modelling the excitation signal (in the decoders such as 312 and 320 in Figure 3) are not sufficient to properly compensate for the poorly quantized DTMF LP spectrum. This explains the occasional failure to detect DTMF signals at the decoder output.
  • Box 2 of Figure 2 represents the above mentioned DTMF-specific scheme, more specifically a DTMF-specific quantization scheme using unconstrained VQ.
  • the input spectrum vector 107 is vector quantized by searching a full-length DTMF codebook 209 to find the nearest index N corresponding to the input spectrum vector 107.
  • the procedure used to train the full-length DTMF codebook 209 is the following.
  • Spectrum vectors representing the 16 DTMF signals are obtained by applying the same LP analysis as performed by the spectrum analysis module 102 and transformation module 103 of Figure 1 to long sequences of individual DTMF signals. At least one average spectrum vector is retained for each DTMF signal as entries of the codebook 209.
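The averaging step of this training procedure might look like the sketch below. It assumes that the spectrum vectors of each DTMF signal have already been extracted frame by frame; the function name, the dictionary layout and the sample values are hypothetical, and only one average vector per signal is kept here although the text allows more.

```python
from typing import Dict, Sequence, Tuple

Vector = Sequence[float]


def train_dtmf_codebook(frames_by_tone: Dict[str, Sequence[Vector]]) -> Dict[str, Tuple[float, ...]]:
    """Keep one average spectrum vector per DTMF signal."""
    codebook = {}
    for tone, frames in frames_by_tone.items():
        count = len(frames)
        # zip(*frames) groups the k-th component of every frame together.
        codebook[tone] = tuple(sum(component) / count for component in zip(*frames))
    return codebook


# Hypothetical 2-dimensional spectrum vectors for two of the 16 DTMF signals.
toy_frames = {
    "1": [(0.10, 0.30), (0.30, 0.50)],
    "#": [(0.20, 0.60), (0.20, 0.80), (0.20, 0.70)],
}
toy_dtmf_codebook = train_dtmf_codebook(toy_frames)
```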
  • some addresses amongst the address field spanned by the n bits assigned to quantizing the spectrum vector 107 according to some conventional scheme are "reserved" to represent the short-term spectrum of DTMF signals. Reserving a mere 16 entries for representing the spectrum vectors of the 16 DTMF signals out of more than one million entries of the address field can hardly affect the performance. Thus, there is no extra bit needed for using the DTMF-specific quantization scheme disclosed in the present invention.
  • Index mapping module 211 is essentially a look-up table mapping each index from the full-length DTMF codebook 209 into one of the "reserved" addresses of the address field spanned by the n bits assigned to quantizing the spectrum information according to the conventional scheme. Index mapping module 211 produces a corresponding spectrum index 214.
  • This first example uses a 3-way split VQ of LSPs, in which a 10th-order LSP vector is split into three subvectors of dimension 3, 3 and 4, respectively, using 8-, 9- and 9-bit subtables such as 205, 206 and 207 for the respective subvectors.
  • a LP filter is stable only if the LSPs are ordered, that is, when LSP_k is larger than LSP_l if k is larger than l.
  • Since the dynamic ranges of the individual LSPs overlap each other, it is easy to find (step 401 of Figure 4) an invalid combination of the entries of the first two quantization codebook subtables 205 and 206, from the first two subvectors, in which LSP_4 is smaller than LSP_3. Thus, this logically invalid combination of said entries can be "reserved" (step 402 of Figure 4) for labelling DTMF signals. In that case, the 9 bits in the index of the third subvector can be used to represent DTMF signals, that is the entry of the full-length DTMF codebook 209. Note that this procedure is not restricted to split-VQ and can be implemented in any existing quantizer in which certain invalid combinations of partial indexes (i.e. subtable entries) can be found.
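The labelling idea of this first example can be sketched as follows. The 8/9/9 bit widths come from the text; everything else is an assumption made for illustration, including the function names, the toy subtables and the bit layout in which the first partial index occupies the most significant bits.

```python
from typing import Optional, Sequence, Tuple

# Bit widths of the three subtables in the first example (8, 9 and 9 bits).
BITS = (8, 9, 9)


def find_invalid_pair(table1: Sequence[Sequence[float]],
                      table2: Sequence[Sequence[float]]) -> Optional[Tuple[int, int]]:
    """Find (i, j) whose combination breaks LSP ordering (LSP_4 < LSP_3)."""
    for i, first in enumerate(table1):
        for j, second in enumerate(table2):
            # first[-1] is LSP_3 and second[0] is LSP_4 of the combined vector.
            if second[0] < first[-1]:
                return i, j
    return None


def pack(i: int, j: int, k: int) -> int:
    """Combine three partial indexes into one spectrum index (assumed layout)."""
    return (i << (BITS[1] + BITS[2])) | (j << BITS[2]) | k


def unpack(index: int) -> Tuple[int, int, int]:
    """Recover the three partial indexes from a spectrum index."""
    k = index & ((1 << BITS[2]) - 1)
    j = (index >> BITS[2]) & ((1 << BITS[1]) - 1)
    i = index >> (BITS[1] + BITS[2])
    return i, j, k


# Hypothetical toy subtables with overlapping dynamic ranges.
toy_table1 = [(0.05, 0.10, 0.15), (0.10, 0.20, 0.35)]
toy_table2 = [(0.30, 0.40, 0.50), (0.45, 0.55, 0.65)]
label = find_invalid_pair(toy_table1, toy_table2)

# DTMF codebook entry 7 is carried in the 9 bits of the third partial index.
dtmf_index = pack(label[0], label[1], 7)


def is_dtmf(index: int) -> bool:
    """A decoder recognises a DTMF frame by the reserved label part."""
    return unpack(index)[:2] == label
```

Because a voice frame never produces the reserved pair, the decoder can tell the two cases apart without any extra bit.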
  • This second example is concerned with a two-stage VQ of LSPs, in which 9-bit subtables are used in each stage.
  • If the quantizer comprises 511+1 entries in the first stage and 512 entries in the second stage, one entry of the first stage can be reserved (step 501 of Figure 5) for labelling DTMF signals. Combined with that reserved entry of the first stage, some of the 512 partial indexes of the above described second stage can be used (step 502 of Figure 5) to represent the DTMF signals, more specifically the entry of the DTMF codebook 209 (Box 2).
  • The function of selector 212 is to compare the performance of the conventional (Box 1) and DTMF-specific (Box 2) quantization schemes and to select, through a switch 215, as outgoing spectrum index 108 the spectrum index 213 or 214 resulting from the scheme presenting the best performance. To conduct this comparison of performance, the selector 212 uses the same distance measure, for example a weighted Euclidean distance measure, in the two quantization schemes.
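A minimal sketch of the two-stage labelling follows. It assumes that the reserved first-stage entry is the last of the 512 addresses and that the first-stage index occupies the upper 9 bits; both choices, like the function names, are hypothetical and not stated in the text.

```python
from typing import Tuple, Union

STAGE_BITS = 9
# Assumption: the extra "+1" first-stage entry is the last address, 511.
RESERVED_FIRST_STAGE = (1 << STAGE_BITS) - 1


def encode_voice(stage1: int, stage2: int) -> int:
    """Combine the two partial indexes of a voice frame."""
    if not 0 <= stage1 < RESERVED_FIRST_STAGE:
        raise ValueError("first-stage index collides with the reserved label")
    if not 0 <= stage2 < (1 << STAGE_BITS):
        raise ValueError("second-stage index out of range")
    return (stage1 << STAGE_BITS) | stage2


def encode_dtmf(entry: int) -> int:
    """Label a DTMF frame with the reserved first-stage entry."""
    if not 0 <= entry < (1 << STAGE_BITS):
        raise ValueError("DTMF codebook entry out of range")
    return (RESERVED_FIRST_STAGE << STAGE_BITS) | entry


def decode(index: int) -> Union[Tuple[str, int], Tuple[str, int, int]]:
    """Tell DTMF frames from voice frames by inspecting the first-stage part."""
    stage1 = index >> STAGE_BITS
    stage2 = index & ((1 << STAGE_BITS) - 1)
    if stage1 == RESERVED_FIRST_STAGE:
        return ("dtmf", stage2)
    return ("voice", stage1, stage2)
```

For instance, `decode(encode_dtmf(5))` yields `("dtmf", 5)`, while any index produced by `encode_voice` decodes as a voice frame.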
  • Implementation of the VQ scheme according to the present invention requires a minimal change to the conventional procedure. Indeed, the search for the best spectrum index is conducted in accordance with the conventional quantization scheme.
  • the minimum distance measure corresponding to the best spectrum index found (step 601 of Figure 6) using the conventional VQ scheme (Box 1) is compared (step 602 of Figure 6) with the minimum distance obtained with each entry of the full-length DTMF codebook 209 (Box 2).
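The comparison of Figure 6 can be sketched as below, assuming a weighted squared Euclidean distance and assuming that ties are resolved in favour of the conventional scheme. The function names and sample values are hypothetical; `voice_quantized` stands for the vector reconstructed from the best entries found by the conventional search.

```python
from typing import Optional, Sequence, Tuple

Vector = Sequence[float]


def weighted_distance(x: Vector, y: Vector, weights: Vector) -> float:
    """Weighted squared Euclidean distance between two spectrum vectors."""
    return sum(w * (a - b) ** 2 for a, b, w in zip(x, y, weights))


def select_scheme(vector: Vector, voice_quantized: Vector,
                  dtmf_codebook: Sequence[Vector],
                  weights: Vector) -> Tuple[str, Optional[int]]:
    """Pick the scheme whose best entry lies closest to the input vector."""
    voice_dist = weighted_distance(vector, voice_quantized, weights)
    dtmf_entry, dtmf_dist = min(
        ((k, weighted_distance(vector, entry, weights))
         for k, entry in enumerate(dtmf_codebook)),
        key=lambda pair: pair[1],
    )
    if dtmf_dist < voice_dist:
        return "dtmf", dtmf_entry
    return "voice", None


# Hypothetical 2-dimensional example with unit weights.
toy_dtmf_codebook = [(0.50, 0.60), (0.21, 0.41)]
choice = select_scheme((0.20, 0.40), (0.25, 0.45), toy_dtmf_codebook, (1.0, 1.0))
```

Using the same distance measure on both sides keeps the comparison fair, which is the point made about selector 212.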
  • One embodiment for the index mapping module 211, given as a simple alternative to using a look-up table, operates as follows.
  • the encoder does not attempt to classify the signal as voice, DTMF or other signal, whereby no additional information needs to be transmitted to the decoder.
  • the additional DTMF codebook 209 can be seen as superimposed over a small part of the spectral vector codebook subtables 205-207 (Figure 2, Box 1), which small codebook part is specially trained and tailored to DTMF signals. In the rare event where an entry from this special codebook 209 is selected during processing of an actual voice signal, no harm will result as the encoder will continue to find the optimum excitation signal in accordance with the usual procedure.
  • the bit rate is not sufficient to encode the excitation signal (including the DTMF signal) so as to enable proper reconstruction of the DTMF signal at the decoder.
  • the above described DTMF-trained quantization codebook 209 can be used to detect DTMF signals at the encoder and information as to whether the present frame is voice or a DTMF signal is transmitted to the decoder using an extra flag bit or, more efficiently, by means of a set of reserved addresses of the address field as described hereinabove.
  • the DTMF signal is artificially regenerated whenever a received DTMF frame is detected.
  • the detection process can also be performed by the selector 212 as follows prior to LP quantization.
  • A weighted distance, for example a weighted Euclidean distance, is computed (step 701 of Figure 7) between the input spectrum vector 107 and each individual entry of the full-length DTMF codebook 209. Then, each computed weighted distance is compared (step 702 of Figure 7) with a predetermined weighted distance threshold.
  • If one of the computed weighted distances is smaller than the predetermined weighted distance threshold, the frame is declared (step 703 of Figure 7) to be a DTMF frame and the selector 212 positions the switch 215 so as to select (step 704 of Figure 7) for transmission spectrum index 214 from the full-length DTMF codebook 209 of Box 2.
  • a precomputed set of weighting factors is used in the distance measure.
  • the detection thresholds are determined in relation to statistics of DTMF signals within the allowed range of spectral tilt and frequency deviations. The detection process is very efficient since DTMF signals exhibit spectral shapes which are very different from those of voice signals.
  • the transformed LP vectors from module 103 of Figure 1, for example LSP vectors, corresponding to DTMF signals are easily distinguishable from those corresponding to voice signals. If no entry of the DTMF codebook 209 gives a weighted distance smaller than the predetermined weighted distance threshold associated to this entry, the frame is declared to be a voice-signal frame, the quantization codebook subtables such as 205, 206 and 207 are searched to produce the spectrum index 213, and the selector 212 positions the switch 215 so as to select the spectrum index 213 as spectrum index 108 to be transmitted.
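The threshold-based detection of Figure 7 might be sketched as follows. One threshold per codebook entry is assumed, in line with the "threshold associated to this entry" wording; the function names, weights and numerical values are hypothetical, and when several entries pass their threshold the closest one is kept, which is a design choice of this sketch.

```python
from typing import Optional, Sequence

Vector = Sequence[float]


def weighted_distance(x: Vector, y: Vector, weights: Vector) -> float:
    """Weighted squared Euclidean distance between two spectrum vectors."""
    return sum(w * (a - b) ** 2 for a, b, w in zip(x, y, weights))


def detect_dtmf(vector: Vector, dtmf_codebook: Sequence[Vector],
                thresholds: Sequence[float], weights: Vector) -> Optional[int]:
    """Return the matching DTMF entry, or None for a voice-signal frame."""
    best_entry, best_dist = None, float("inf")
    for k, (entry, threshold) in enumerate(zip(dtmf_codebook, thresholds)):
        dist = weighted_distance(vector, entry, weights)
        # An entry qualifies only if it falls below its own threshold.
        if dist < threshold and dist < best_dist:
            best_entry, best_dist = k, dist
    return best_entry


# Hypothetical 2-dimensional codebook, thresholds and precomputed weights.
toy_codebook = [(0.10, 0.30), (0.20, 0.50)]
toy_thresholds = [0.001, 0.001]
toy_weights = (1.0, 2.0)

dtmf_frame = detect_dtmf((0.21, 0.50), toy_codebook, toy_thresholds, toy_weights)
voice_frame = detect_dtmf((0.60, 0.90), toy_codebook, toy_thresholds, toy_weights)
```

A return value of `None` corresponds to the voice-signal branch, in which the subtables such as 205, 206 and 207 are searched as usual.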
  • the present invention results in a significant improvement in the performance of the voice encoder 100 for processing DTMF signals, and ensures that these signals are properly encoded and correctly detected and decoded at the receiver.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

In a method and device for quantizing a spectrum vector, supplied at recurrent time intervals, to produce a spectrum index, a spectrum-vector quantization codebook including a first voice-signal quantization codebook portion and a second non-voice signal quantization codebook portion is used. Whether the spectrum vector represents a voice signal or a non-voice signal is detected. When the spectrum vector represents a voice signal, the first codebook portion is searched for quantizing the spectrum vector and producing the spectrum index. The second codebook portion is searched for quantizing the spectrum vector and producing the spectrum index when the spectrum vector represents a non-voice signal. According to a first alternative, the first and second codebook portions are searched by measuring a weighted distance between the spectrum vector and the entries of these two codebook portions, and the spectrum vector is considered as representing a non-voice signal when the smallest weighted distance is the weighted distance measured between the spectrum vector and one entry of the second codebook portion. According to a second alternative, a weighted distance is measured between the spectrum vector and the entries of the second codebook portion, and the spectrum vector is considered as representing a non-voice signal when the weighted distance measured between the spectrum vector and one entry of the second codebook portion is smaller than a predetermined threshold.

Description

ENHANCED ENCODING OF DTMF
AND OTHER SIGNALLING TONES
BACKGROUND OF THE INVENTION
1. Field of the invention:
The present invention relates to the field of digital encoding of voice signals. In the present specification, including the appended claims, the term "voice signal" is intended to designate speech, audio, music and other signals.
In this context, it is often required to encode non-voice signals such as DTMF (Dual Tone Multi-Frequency) signals and other signalling tones for dialling, transmitting data and/or performing other transmission functions. The present invention enhances encoding of DTMF signals and other signalling tones so as to prevent their purpose from being hindered by the digital encoding procedure.
2. Brief description of the prior art:
Low bit rate speech encoding algorithms are usually based on a speech production model and therefore optimized for speech signals. As the bit rate is reduced to 8 kbits/second and below, these encoders meet difficulties in encoding non-speech signals such as DTMF signals and other signalling tones; this results in occasional failures in detecting these signals at the receiver end.
Central to the speech production model is the parametric description of the short-term speech spectrum. The most common approach called "linear prediction" consists of transmitting at regular time intervals, typically every 10 or 20 milliseconds, a set of so-called linear prediction (LP) coefficients. Efficient encoding of the LP coefficients involves quantization tables trained by means of a speech data base.
One modest improvement in encoding DTMF signals is obtained by including DTMF signals into the speech database used to train the above mentioned quantization tables. However, this improvement is limited due to the high disparity that exists between speech and DTMF spectra and the fact that some form of constrained (or structured) vector quantization is inevitable to reduce complexity.
OBJECTS OF THE INVENTION
An object of the present invention is to provide a quantizing method and device capable of overcoming the above described drawbacks of the prior art, for example by "reserving", in the field of entries to the speech-trained quantization table of LP coefficients, some entries for representing the short-term spectrum of DTMF signals and other signalling tones.
Another object of the present invention is to introduce in a quantization method and device a codebook specific to DTMF signals (or other signalling tones) with minimal change to the conventional quantization procedure.
SUMMARY OF THE INVENTION
More specifically, in accordance with the present invention, there is provided a method and a device for quantizing a spectrum vector, supplied at recurrent time intervals, to produce a spectrum index. In these method and device, a spectrum-vector quantization codebook including a voice-signal quantization-codebook portion and a non-voice signal quantization codebook portion is provided. A detection to determine whether the spectrum vector represents a voice signal or a non-voice signal is made. When the spectrum vector represents a voice signal, the voice-signal quantization-codebook portion is searched for quantizing the spectrum vector and producing the spectrum index. In the same manner, the non-voice signal quantization codebook portion is searched for quantizing the spectrum vector and producing the spectrum index when the spectrum vector represents a non-voice signal. The non-voice signal quantization codebook portion searched for encoding the non-voice signal representative spectrum indexes greatly improves encoding of non-voice signals such as DTMF signals and other signalling tones.
The present invention also relates to a method and a device for quantizing a spectrum vector, supplied at recurrent time intervals, to produce a spectrum index. In these method and device, there is provided a spectrum-vector quantization codebook including a voice-signal quantization-codebook portion and a non-voice signal quantization codebook portion. The voice-signal quantization-codebook portion and the non-voice signal quantization codebook portion are searched by measuring a weighted distance between the spectrum vector and the entries of the voice-signal quantization-codebook portion and the non-voice signal quantization codebook portion. It is detected that the spectrum vector represents a voice signal when the smallest weighted distance is the weighted distance measured between the spectrum vector and one entry of the voice-signal quantization-codebook portion. In the same manner, it is detected that the spectrum vector represents a non-voice signal when the smallest weighted distance is the weighted distance measured between the spectrum vector and one entry of the non-voice signal quantization codebook portion. When the spectrum vector represents a voice signal, a spectrum index is produced in relation to said one entry of the voice-signal quantization-codebook portion. When the spectrum vector represents a non-voice signal, a spectrum index is produced in relation to said one entry of the non-voice signal quantization codebook portion.
In accordance with a preferred embodiment, the voice-signal quantization-codebook portion comprises a plurality of quantization codebook subtables each having a plurality of entries, a predetermined set of combinations of partial spectrum indexes are reserved for non-voice signals, and searching the voice-signal quantization-codebook portion comprises searching the quantization codebook subtables and producing corresponding partial spectrum indexes forming combinations not included in the predetermined set of combinations of partial spectrum indexes. When the spectrum vector represents a voice signal, the spectrum index is produced by combining the partial spectrum indexes corresponding to said one entry of the voice-signal quantization-codebook portion. When the spectrum vector represents a non-voice signal, the spectrum index is produced by selecting, in relation to said one entry of the non-voice signal quantization codebook portion, one combination of the predetermined set.
Preferably, the predetermined set of combinations of partial spectrum indexes reserved for non-voice signals correspond to invalid combinations of entries of respective quantization codebook subtables.
In accordance with another preferred embodiment, the spectrum vector has components related to line-spectral-pairs, the voice-signal quantization-codebook portion comprises at least three quantization codebook subtables each having a plurality of entries, one combination of the predetermined set is selected to form the spectrum index, this combination being composed of a non-voice-signal label part and a second part related to said one entry of the non-voice signal quantization codebook portion, and the non-voice-signal label part corresponds to a combination of entries of two subtables amongst the at least three quantization codebook subtables which is logically invalid in regard to adjacent line-spectral-pair component ordering.
According to a further preferred embodiment, the quantization codebook subtables are searched in stages including a first stage and at least one subsequent stage, and the predetermined set of combinations of partial spectrum indexes is formed by considering, at least, one predetermined partial spectrum index for the first stage combined with partial spectrum indexes corresponding to entries of the quantization codebook subtables searched in the subsequent stage(s).
The present invention is further concerned with a method and a device for quantizing a spectrum vector, supplied at recurrent time intervals, to produce a spectrum index, which method and device use a spectrum-vector quantization codebook including a voice-signal quantization-codebook portion and a non-voice signal quantization codebook portion. A weighted distance between the spectrum vector and the entries of the non-voice signal quantization codebook portion is measured and it is detected that the spectrum vector represents a non-voice signal when the weighted distance measured between the spectrum vector and one entry of the non-voice signal quantization codebook portion is smaller than a predetermined weighted distance threshold. Upon detection that the spectrum vector represents a non-voice signal, a spectrum index including a predetermined non-voice-signal label part and a second part related to said one entry of the non-voice signal quantization codebook portion is produced. Upon failure to detect that the spectrum vector represents a non-voice signal, the voice-signal quantization-codebook portion is searched for quantizing the spectrum vector and producing the spectrum index.
In accordance with preferred embodiments:
- the voice-signal quantization-codebook portion comprises a plurality of quantization codebook subtables each having a plurality of entries, the voice-signal quantization-codebook portion comprises addresses which are related to combinations of entries of the plurality of quantization codebook subtables, the voice-signal quantization-codebook portion is searched by splitting the spectrum vector into a plurality of subvectors, searching the quantization codebook subtables for quantizing the subvectors, respectively, and producing respective partial spectrum indexes, and combining the partial spectrum indexes to produce the spectrum index, and an invalid combination of the entries of at least two quantization codebook subtables is reserved as predetermined non-voice-signal label part; and
- the voice-signal quantization-codebook portion and the non-voice signal quantization codebook portion comprise a plurality of stages including a first stage and at least one subsequent stage, each stage having a given number of entries, at least one entry of the first stage is reserved as the predetermined non-voice-signal label part, and the at least one entry of the first stage is combined with at least one entry of the subsequent stage(s) to represent non-voice signals.
Preferably, the spectrum vector has components related to line-spectral-pairs or immittance-spectral-pairs, the measured weighted distance is a weighted Euclidean distance, and the non-voice signal comprises a signalling tone, for example a DTMF signal.
The present invention still further relates to an encoder for encoding a voice or non-voice input signal, comprising an encoding section responsive to the voice or non-voice input signal for producing residual voice or non-voice signal information, a spectrum processing section responsive to the input voice or non-voice signal for producing a spectrum index, and means for transmitting the residual signal information and the spectrum index through a communication channel. The spectrum processing section comprises means responsive to the input voice or non-voice signal for producing a spectrum vector at recurrent time intervals and one of the above described devices for quantizing the spectrum vector to produce the spectrum index.
In accordance with the present invention, there is finally provided a cellular communication system for servicing a large geographical area divided into a plurality of cells, comprising: mobile transmitter/receiver units; cellular base stations respectively situated in the cells; means for controlling communication between the cellular base stations; and a bidirectional wireless communication sub-system between each mobile unit situated in one cell and the cellular base station of that cell, this bidirectional wireless communication sub-system comprising in both the mobile unit and the cellular base station (a) a transmitter including one of the above described encoders for encoding a voice or non-voice signal and means for transmitting the encoded voice or non-voice signal, and (b) a receiver including means for receiving a transmitted encoded voice or non-voice signal and means for decoding the received encoded voice or non-voice signal.
The objects, advantages and other features of the present invention will become more apparent upon reading of the following non restrictive description of a preferred embodiment thereof, given by way of example only with reference to the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
In the appended drawings:
Figure 1 is a simplified block diagram of a LP voice encoder, showing spectrum processing modules including a spectrum vector quantization module;
Figure 2 is a block diagram of the spectrum vector quantization module of the LP voice encoder of Figure 1;
Figure 3 is a simplified, schematic block diagram of a cellular communication system in which the LP voice encoder of Figure 1 can be used;
Figure 4 is a flow chart illustrating a first method for labelling and representing DTMF signals;
Figure 5 is a flow chart illustrating a second method for labelling and representing DTMF signals;
Figure 6 is a flow chart illustrating a first method for detecting and quantizing DTMF signals; and
Figure 7 is a flow chart illustrating a second method for detecting and quantizing DTMF signals.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
Although application of the method and device for quantizing a spectrum vector according to the invention to a cellular communication system is disclosed as a non limitative example in the present specification, it should be kept in mind that these method and device can be used with the same advantages in many other types of communication systems in which quantization of a spectrum vector, supplied at recurrent time intervals, is required.
As well known to those of ordinary skill in the art, a cellular communication system such as 301 (Figure 3) provides a telecommunication service over a large geographic area by dividing that large geographic area into a number C of smaller cells. The C smaller cells are serviced by respective cellular base stations 302_1, 302_2 ... 302_C to provide each cell with radio signalling, audio and data channels.
The radio signalling channels are used to page mobile radiotelephones (mobile transmitter/receiver units) such as 303 within the limits of the coverage area (cell) of the cellular base station 302, and to place calls to other radiotelephones 303 located either inside or outside the base station's cell or to another network such as the Public Switched Telephone Network (PSTN) 304.
Once a radiotelephone 303 has successfully placed or received a call, an audio or data channel is established between this radiotelephone 303 and the cellular base station 302 corresponding to the cell in which the radiotelephone 303 is situated, and communication between the base station 302 and radiotelephone 303 is conducted over that audio or data channel. The radiotelephone 303 may also receive control or timing information over the signalling channel whilst a call is in progress.
If a radiotelephone 303 leaves a cell and enters another adjacent cell while a call is in progress, the radiotelephone 303 hands over the call to an available audio or data channel of the new cell base station. If a radiotelephone 303 leaves a cell and enters another adjacent cell while no call is in progress, the radiotelephone 303 sends a control message over the signalling channel to log into the base station 302 of the new cell. In this manner mobile communication over a wide geographical area is possible. The cellular communication system 301 further comprises a control terminal 305 to control communication between the cellular base stations 302 and the PSTN 304, for example during a communication between a radiotelephone 303 and the PSTN 304, or between a radiotelephone 303 located in a first cell and a radiotelephone 303 situated in a second cell.
Of course, a bidirectional wireless radio communication subsystem is required to establish an audio or data channel between a base station 302 of one cell and a radiotelephone 303 located in that cell. As illustrated in very simplified form in Figure 3, such a bidirectional wireless radio communication subsystem typically comprises in the radiotelephone 303:
- a transmitter 306 including:
- an encoder 307 for encoding the voice signal; and
- a transmission circuit 308 for transmitting the encoded voice signal from the encoder 307 through an antenna such as 309; and
- a receiver 310 including:
- a receiving circuit 311 for receiving a transmitted encoded voice signal through the same antenna 309; and
- a decoder 312 for decoding the received encoded voice signal from the receiving circuit 311.
The radiotelephone further comprises other conventional circuits 313 to which the encoder 307 and decoder 312 are connected, which circuits 313 are well known to those of ordinary skill in the art and, accordingly, will not be further described in the subject patent application.
Also, such a bidirectional wireless radio communication subsystem typically comprises in the base station 302:
- a transmitter 314 including:
- an encoder 315 for encoding the voice signal; and
- a transmission circuit 316 for transmitting the encoded voice signal from the encoder 315 through an antenna such as 317; and
- a receiver 318 including:
- a receiving circuit 319 for receiving a transmitted encoded voice signal through the same antenna 317; and
- a decoder 320 for decoding the received encoded voice signal from the receiving circuit 319.
The base station 302 further comprises, typically, a base station controller 321, along with its associated data base 322, for controlling communication between the control terminal 305 and the transmitter 314 and receiver 318.
As well known to those of ordinary skill in the art, voice encoding is required in order to reduce the bandwidth necessary to transmit voice signal, for example speech, across the bidirectional wireless radio communication subsystem, i.e. between a radiotelephone 303 and a base station 302.
The aim of the present invention is to provide an efficient technique usable by the encoders 307 and 315 of Figure 3 for encoding non-voice signals such as Dual-Tone Multi-Frequency (DTMF) signals and other signalling tones.
LP voice encoders typically operating at 13 kbits/second and below such as Code-Excited Linear Prediction (CELP) encoders use a LP synthesis filter to model the short-term spectral envelope of the voice signal. The LP information is transmitted, typically, every 10 or 20 ms to the decoder and is extracted at the decoder end.
Figure 1 is a simplified block diagram of a LP voice encoder 100 (that can be used as encoders 307 and 315 of Figure 3) showing explicitly the spectrum processing modules 102-104 which are used to extract and quantize the LP information.
Module 101 is used to represent the LP voice encoder 100 without the spectrum processing modules 102-104. The structure of a LP voice encoder is believed to be well known to those of ordinary skill in the art and, accordingly, module 101 will not be further described in the present specification. An example of LP voice encoder is illustrated in Figure 1 of US patent N° 5,444,816 granted on August 22, 1995 to Jean-Pierre Adoul and Claude Laflamme. The description of US patent N° 5,444,816 is incorporated herein by reference.
The spectrum processing modules 102-104 comprise a spectrum analysis module 102 for extracting a set of LP coefficients 106 from a sampled input voice or non-voice signal 105. To extract the set of LP coefficients 106, the spectrum analysis module 102 follows the well known linear-prediction analysis procedure.
The spectrum processing modules 102-104 also comprise a module 103 for transforming the set of LP coefficients 106 from spectrum analysis module 102 into another domain where quantization can be done more efficiently. The most popular LP coefficient transformation is the Line Spectral Pairs (LSP) transformation. A related transformation having properties similar to the properties of the LSP transformation is the well known Immittance Spectral Pairs (ISP) transformation.
Transformation module 103 therefore produces a spectrum vector 107 having components in line-spectral-pair parametric form or in immittance-spectral-pair parametric form. The spectrum vector 107 can be either the LSP (or ISP) vector itself, or, in other embodiments, a LSP (or ISP) difference vector; this LSP (or ISP) difference vector is the difference between the LSP (or ISP) vector and a prediction vector based on past excitation. More specifically, the modules 102 and 103 are responsive to the sampled input voice or non-voice signal 105 to produce the spectrum vector 107 at recurrent time intervals.
Finally, the spectrum processing modules 102-104 comprise a spectrum vector quantization module 104. The function of module 104 is to quantize the spectrum vector 107 delivered by the transformation module 103 in view of producing a spectrum index 108.
Module 101 produces residual voice or non-voice signal information 109. The residual information 109 from module 101 and the spectrum index 108 from module 104 are multiplexed through a multiplexor 110 to produce a digital output propagated through a given audio or data channel.
Many quantization methods are available. The most efficient approaches are those using some form of vector quantization (VQ). Most voice/audio codecs assign a number n of bits to quantize the spectrum information where, typically, n ≥ 20. The most efficient method is to utilize these n bits in relation to a quantization table having as many addresses, i.e. entries, as contained in the address field spanned by the n bits. This approach is called "unconstrained" vector quantization (VQ). Unfortunately, for n ≥ 20, the address field contains over one million addresses (2^20 addresses or entries), which makes storing and searching impractical. In recent implementations of LP encoders, the spectrum information is quantized by means of "constrained VQ" schemes whereby the impractically large VQ table is emulated by combining a number of small quantization subtables. The two commonly used constrained VQ schemes are the "M-way split-VQ" and the "multistage VQ" scheme. In these two commonly used constrained VQ schemes, the quantization subtables are jointly trained based on a large database using iterative algorithms such as the LBG or k-means algorithms [Allen Gersho and Robert M. Gray, "Vector Quantization and Signal Compression", Kluwer Academic Publishers, 1992, 732 pages]. The training database consists of transformed LP vectors extracted from long voice sequences consisting mainly of male and female voice and often in several languages.
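The storage argument above can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the function names are hypothetical, and the 8-, 9- and 9-bit split is borrowed from Example 1 further below as an assumed subtable layout.

```python
def unconstrained_entries(n_bits):
    """Entries needed by one table addressed directly by n_bits."""
    return 2 ** n_bits


def split_entries(bits_per_subtable):
    """Entries actually stored when the address field is emulated by subtables."""
    return sum(2 ** b for b in bits_per_subtable)


print(unconstrained_entries(20))   # 1048576 entries for a 20-bit unconstrained table
print(split_entries([8, 9, 9]))    # 1280 stored entries spanning a 26-bit address field
```

The subtables store 256 + 512 + 512 vectors yet address 2^26 combinations, which is what makes the constrained schemes practical.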
Figure 2 is a block diagram of the spectrum vector quantization module 104 of Figure 1. In Figure 2, two quantization schemes are compared for best performance, namely a conventional scheme (Box 1) and a specific scheme (Box 2).
More specifically, Box 1 of Figure 2 represents the conventional scheme depicted herein as an M-way split scheme.
Vector splitting module 201 splits the input spectrum vector 107 from transformation module 103 (Figure 1) into M subvectors which are independently vector quantized in the M modules 202, 203 ... 204 using codebooks 205, 206 ... 207 of size N, respectively, where M and N are integers. Codebooks 205, 206 ... 207 are quantization subtables trained using mostly voice/audio databases. In each of the vector quantization modules 202, 203 ... 204, the corresponding codebook 205, 206 ... 207 is searched to find the nearest partial spectrum index corresponding to the input spectrum subvector. The partial spectrum indexes resulting from the M distinct VQ operations in the vector quantization modules 202, 203 ... 204 are multiplexed by multiplexor 208 to provide a spectrum index 213 according to the conventional M-way split scheme.
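The M-way split scheme of Box 1 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the unweighted squared Euclidean distance and the toy two-entry subtables are all assumptions made for the example.

```python
def nearest(codebook, target):
    """Return (index, squared distance) of the codebook entry closest to target."""
    best_index, best_dist = 0, float("inf")
    for i, entry in enumerate(codebook):
        d = sum((t - e) ** 2 for t, e in zip(target, entry))
        if d < best_dist:
            best_index, best_dist = i, d
    return best_index, best_dist


def split_vq(vector, dims, codebooks):
    """Quantize each subvector independently (role of modules 202 ... 204)."""
    partial, total, start = [], 0.0, 0
    for dim, codebook in zip(dims, codebooks):
        index, dist = nearest(codebook, vector[start:start + dim])
        partial.append(index)
        total += dist
        start += dim
    return partial, total


def pack(partial, bits):
    """Concatenate the partial indexes into one spectrum index (role of multiplexor 208)."""
    index = 0
    for p, b in zip(partial, bits):
        index = (index << b) | p
    return index


# Toy case: a 2-dimensional vector split into two 1-dimensional subvectors,
# each quantized with a 1-bit (two-entry) subtable.
codebooks = [[[0.1], [0.3]], [[0.5], [0.7]]]
partial, _ = split_vq([0.28, 0.52], [1, 1], codebooks)
print(partial, pack(partial, [1, 1]))  # [1, 0] 2
```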
The short-term spectral envelope of DTMF signals exhibits spectral shapes which are very different from those of voice signals. In the following description, the preferred embodiment of the invention will be described with reference to DTMF signals; it should however be kept in mind that the present invention can be implemented in relation to other non-voice signals such as other signalling tones. Usually, DTMF signals are not included in the training database since they may affect the quantizer performance. This results in a quantization table which has no entries representative of DTMF signals. As the bit rate is reduced to 8 kbits/second and below, the fewer bits allocated for modelling the excitation signal (in the decoders such as 312 and 320 in Figure 3) are not sufficient to properly compensate for the poorly quantized DTMF LP spectrum. This explains the occasional failure to detect DTMF signals at the decoder output.
The alternative quantization scheme which is aimed at improving the encoding of DTMF signals of interest will now be described. Box 2 of Figure 2 represents the above mentioned DTMF-specific scheme, more specifically a DTMF-specific quantization scheme using unconstrained VQ.
In module 210, the input spectrum vector 107 is vector quantized by searching a full-length DTMF codebook 209 to find the nearest index N corresponding to the input spectrum vector 107.
The procedure used to train the full-length DTMF codebook 209 is the following. Spectrum vectors representing the 16 DTMF signals are obtained by applying the same LP analysis as performed by the spectrum analysis module 102 and transformation module 103 of Figure 1 to long sequences of individual DTMF signals. At least one average spectrum vector is retained for each DTMF signal as entries of the codebook 209.
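The averaging step of this training procedure might look like the sketch below. It assumes the spectrum vectors have already been produced by the LP analysis and transformation chain (not shown), and it uses the standard DTMF keypad layout and frequency pairs; the function names and the dictionary-based data layout are hypothetical.

```python
# Standard DTMF row and column frequencies in Hz and the 4 x 4 keypad layout.
ROWS_HZ = (697, 770, 852, 941)
COLS_HZ = (1209, 1336, 1477, 1633)
KEYPAD = ("123A", "456B", "789C", "*0#D")


def dtmf_pairs():
    """Map each of the 16 DTMF keys to its (low, high) frequency pair."""
    return {key: (ROWS_HZ[r], COLS_HZ[c])
            for r, row in enumerate(KEYPAD)
            for c, key in enumerate(row)}


def average_vector(vectors):
    """Component-wise mean of equal-length spectrum vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def train_dtmf_codebook(frames_by_key):
    """Keep one average spectrum vector per DTMF key.

    frames_by_key maps a key to the list of spectrum vectors measured
    on a long sequence of that key's signal (assumed input format).
    """
    keys = sorted(frames_by_key)
    return keys, [average_vector(frames_by_key[k]) for k in keys]


print(len(dtmf_pairs()), dtmf_pairs()["5"])  # 16 (770, 1336)
```

The source allows more than one average vector per signal; retaining exactly one is a simplification made here.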
In the present invention, some addresses amongst the address field spanned by the n bits assigned to quantizing the spectrum vector 107 according to some conventional scheme are "reserved" to represent the short-term spectrum of DTMF signals. Reserving a mere 16 entries for representing the spectrum vectors of the 16 DTMF signals out of more than one million entries of the address field can hardly affect the performance. Thus, there is no extra bit needed for using the DTMF-specific quantization scheme disclosed in the present invention.
Index mapping module 211 is essentially a look-up table mapping each index from the full-length DTMF codebook 209 into one of the "reserved" addresses of the address field spanned by the n bits assigned to quantizing the spectrum information according to the conventional scheme. Index mapping module 211 produces a corresponding spectrum index 214.
These "special" addresses can be reserved either at the design stage by forbidding these addresses during the training on the voice database or, as will be explained in the following description by way of the two following examples 1 and 2, they can be advantageously superimposed on some combination of subtable entries that cannot logically occur anyway. Thus, whether invalid logically or by fiat, these "special" addresses are reserved in the address field for indexing the non-voice signals.
Example 1:
This first example uses a 3-way split VQ of LSPs, in which a 10th-order LSP vector is split into three subvectors of dimension 3, 3 and 4, respectively, using 8-, 9- and 9-bit subtables such as 205, 206 and 207 for the respective subvectors. According to the ordering property of LSPs, a LP filter is stable only if the LSPs are ordered, that is when LSP_k is larger than LSP_l if k is larger than l. Since the dynamic ranges of the individual LSPs overlap each other, it is easy to find (step 401 of Figure 4) an invalid combination of the entries of the first two quantization codebook subtables 205 and 206, corresponding to the first two subvectors, in which LSP_4 is smaller than LSP_3. Thus, this logically invalid combination of said entries can be "reserved" (step 402 of Figure 4) for labelling DTMF signals. In that case, the 9 bits in the index of the third subvector can be used to represent DTMF signals, that is the entry of the full-length DTMF codebook 209. Note that this procedure is not restricted to split-VQ and can be implemented in any existing quantizer in which certain invalid combinations of partial indexes (i.e. subtable entries) can be found.
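Steps 401 and 402 of this first example can be sketched as below. The search looks for a pair of entries in which the fourth LSP (first component of the second subvector) is smaller than the third LSP (last component of the first subvector). The function names, the toy subtables and the exact bit packing are assumptions for illustration; only the 8/9/9-bit split comes from the text.

```python
def find_invalid_pair(subtable1, subtable2):
    """Step 401 (sketch): first (i, j) whose concatenation violates LSP ordering."""
    for i, first in enumerate(subtable1):
        for j, second in enumerate(subtable2):
            if second[0] < first[-1]:   # fourth LSP below third LSP
                return i, j
    return None


def dtmf_index(label, dtmf_entry, bits=(8, 9, 9)):
    """Step 402 (sketch): label in the first two fields, DTMF entry in the third."""
    i, j = label
    return (((i << bits[1]) | j) << bits[2]) | dtmf_entry


def is_dtmf(index, label, bits=(8, 9, 9)):
    """Decoder-side check that the first two partial indexes equal the label."""
    return (index >> bits[2]) == ((label[0] << bits[1]) | label[1])


# Toy subtables with overlapping dynamic ranges (two entries each).
sub1 = [[0.1, 0.2, 0.3], [0.2, 0.3, 0.5]]
sub2 = [[0.45, 0.5, 0.6], [0.6, 0.7, 0.8]]
label = find_invalid_pair(sub1, sub2)
print(label, dtmf_index(label, 5))  # (1, 0) 262149
```

Because a conventional search over ordered LSP vectors never produces this pair, the label costs no additional bit.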
Example 2:
This second example is concerned with a two-stage VQ of LSPs, in which 9-bit subtables are used in each stage. If the quantizer comprises 511+1 entries in the first stage and 512 entries in the second stage, one entry of the first stage can be reserved (step 501 of Figure 5) for labelling DTMF signals. Combined with that reserved entry of the first stage, some of the 512 partial indexes of the above described second stage can be used (step 502 of Figure 5) to represent the DTMF signals, more specifically the entry of the DTMF codebook 209 (Box 2).
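The labelling of this second example can be sketched as follows. Which first-stage address is reserved is not stated in the text; the choice of address 511 below, like the function names and the tuple return format, is an assumption for illustration.

```python
STAGE_BITS = 9
RESERVED_STAGE1 = 511   # assumed: the "+1" entry set aside as the DTMF label


def encode_dtmf(dtmf_entry):
    """Steps 501-502 (sketch): reserved first-stage index plus DTMF entry."""
    return (RESERVED_STAGE1 << STAGE_BITS) | dtmf_entry


def decode(index):
    """Split an 18-bit spectrum index and recognise the DTMF label."""
    stage1 = index >> STAGE_BITS
    stage2 = index & ((1 << STAGE_BITS) - 1)
    if stage1 == RESERVED_STAGE1:
        return "dtmf", stage2
    return "voice", (stage1, stage2)


print(decode(encode_dtmf(7)))     # ('dtmf', 7)
print(decode((3 << 9) | 100))     # ('voice', (3, 100))
```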
Referring back to Figure 2 of the appended drawings, the function of selector 212 is to compare the performance of the conventional (Box 1) and DTMF-specific (Box 2) quantization schemes and to select, through a switch 215, as outgoing spectrum index 108 the spectrum index 213 or 214 resulting from the scheme presenting the best performance. To conduct this comparison of performance, the selector 212 uses the same distance measure, for example a weighted Euclidean distance measure, in the two quantization schemes.
Implementation of the VQ scheme according to the present invention requires a minimal change to the conventional procedure. Indeed, the search for the best spectrum index is conducted in accordance with the conventional quantization scheme. The minimum distance measure corresponding to the best spectrum index found (step 601 of Figure 6) using the conventional VQ scheme (Box 1) is compared (step 602 of Figure 6) with the minimum distance obtained with each entry of the full-length DTMF codebook 209 (Box 2). One embodiment for the index mapping module 211, given as a simple alternate to using a look-up table, operates as follows. In the 3-way-split VQ example (Example 1), when an entry of the DTMF codebook 209 gives the overall smallest distance, a DTMF signal is detected and labelled by setting the partial indexes of the first two subvectors to the invalid combination (step 603). The partial index of the third subvector represents in this case the entry in the DTMF codebook 209 (step 604). At the receiver, whenever said invalid combination of the first two partial indexes (i.e. the non-voice signal label) is received, the entry of the full-length DTMF codebook 209 represented by the index of the third subvector is chosen.
It should be pointed out that, in the above described procedure, the encoder does not attempt to classify the signal as voice, DTMF or other signal, whereby no additional information needs to be transmitted to the decoder. The additional DTMF codebook 209 can be seen as superimposed over a small part of the spectral vector codebook subtables 205-207 (Figure 2, Box 1), which small codebook part is specially trained and tailored to DTMF signals. In the rare event where an entry from this special codebook 209 is selected during processing of an actual voice signal, no harm will result as the encoder will continue to find the optimum excitation signal in accordance with the usual procedure.
Therefore, when an entry of the DTMF codebook 209 gives the smallest weighted distance, a non-voice signal is detected and spectrum index 214 is selected by selector 212 through switch 215. On the contrary, when entries of the quantization codebook subtables such as 205, 206 and 207 give the smallest weighted distance, a voice signal is detected and represented by these entries. Spectrum index 213 is then selected by selector 212 through switch 215.
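The selection performed by selector 212 in the 3-way-split case can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the codebook values, the vector length of 6, the choice of invalid pair and all function names are hypothetical, and the conventional search is reduced to an independent nearest-neighbour search per subvector.

```python
def weighted_distance(x, y, w):
    """Weighted squared Euclidean distance between two equal-length vectors."""
    return sum(wk * (xk - yk) ** 2 for xk, yk, wk in zip(x, y, w))


def nearest(codebook, x, w):
    """Return (index, distance) of the codebook entry closest to x."""
    distances = [weighted_distance(x, entry, w) for entry in codebook]
    best = min(range(len(codebook)), key=distances.__getitem__)
    return best, distances[best]


# Toy data (hypothetical values): three 2-entry subtables for a 6-component vector.
SUBTABLES = [
    [(0.10, 0.20), (0.30, 0.90)],
    [(0.40, 0.50), (0.60, 0.70)],
    [(0.80, 0.85), (0.90, 0.95)],
]
SPLITS = [(0, 2), (2, 4), (4, 6)]
# Entry 1 of subtable 0 ends at 0.90, above the 0.40 that starts entry 0 of
# subtable 1, so the pair (1, 0) violates LSP ordering and can serve as the label.
INVALID_PAIR = (1, 0)
DTMF_CODEBOOK = [
    (0.15, 0.16, 0.50, 0.51, 0.80, 0.81),
    (0.20, 0.21, 0.60, 0.61, 0.90, 0.91),
]


def encode(x, w):
    """Return three partial indexes, labelled with INVALID_PAIR for a DTMF frame."""
    # Conventional split-VQ search (Box 1). A real search would also enforce
    # LSP ordering so that it never emits INVALID_PAIR; this sketch omits that.
    partial, voice_distance = [], 0.0
    for table, (lo, hi) in zip(SUBTABLES, SPLITS):
        index, distance = nearest(table, x[lo:hi], w[lo:hi])
        partial.append(index)
        voice_distance += distance
    # Full-length DTMF search (Box 2) with the same distance measure.
    dtmf_index, dtmf_distance = nearest(DTMF_CODEBOOK, x, w)
    if dtmf_distance < voice_distance:
        return (*INVALID_PAIR, dtmf_index)
    return tuple(partial)


def decode(indexes):
    """Rebuild the quantized spectrum vector at the receiver."""
    if tuple(indexes[:2]) == INVALID_PAIR:
        return list(DTMF_CODEBOOK[indexes[2]])
    return [c for table, i in zip(SUBTABLES, indexes) for c in table[i]]
```

Because the label is carried by a combination of partial indexes that the voice search does not use, the decoder needs no side information to recognise a DTMF frame.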
For transmission rates below 5 kbits/second, the bit rate is not sufficient to encode the excitation signal (including the DTMF signal) so as to enable proper reconstruction of the DTMF signal at the decoder. In this case the above described DTMF-trained quantization codebook 209 can be used to detect DTMF signals at the encoder and information as to whether the present frame is voice or a DTMF signal is transmitted to the decoder using an extra flag bit or, more efficiently, by means of a set of reserved addresses of the address field as described hereinabove. At the decoder, the DTMF signal is artificially regenerated whenever a received DTMF frame is detected.
In an alternate implementation, the detection process can also be performed by the selector 212 as follows prior to LP quantization. First, a weighted distance, for example the Euclidean distance, is computed (step 701 of Figure 7) between the input spectrum vector 107 and each individual entry of the full-length DTMF codebook 209. Then, each computed weighted distance is compared (step 702 of Figure 7) with a predetermined weighted distance threshold. If, for a given entry of the DTMF codebook 209, the computed weighted distance is smaller than the predetermined threshold associated to this entry, the frame is declared (step 703) to be a DTMF frame and the selector 212 positions the switch 215 so as to select (step 704) for transmission spectrum index 214 from the full-length DTMF codebook 209 of Box 2. For each entry of the DTMF codebook 209, a precomputed set of weighting factors is used in the distance measure. The detection thresholds are determined in relation to statistics of DTMF signals within the allowed range of spectral tilt and frequency deviations. The detection process is very efficient since DTMF signals exhibit spectral shapes which are very different from those of voice signals. Thus, the transformed LP vectors from module 103 of Figure 1, for example LSP vectors, corresponding to DTMF signals are easily distinguishable from those corresponding to voice signals. If no entry of the DTMF codebook 209 gives a weighted distance smaller than the predetermined weighted distance threshold associated to this entry, the frame is declared to be a voice-signal frame, the quantization codebook subtables such as 205, 206 and 207 are searched to produce the spectrum index 213, and the selector 212 positions the switch 215 so as to select the spectrum index 213 as spectrum index 108 to be transmitted.
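The threshold test of this alternate implementation can be sketched as below. The function name and argument layout are hypothetical. The text does not say which entry to keep when several entries pass their thresholds, so the sketch assumes the one with the smallest weighted distance is retained.

```python
def detect_dtmf(x, dtmf_codebook, weights, thresholds):
    """Return the index of the detected DTMF codebook entry, or None for a voice frame.

    weights[i] and thresholds[i] are the precomputed weighting factors and the
    detection threshold associated with entry i of the DTMF codebook.
    """
    best = None  # (index, distance) of the best entry that passed its threshold
    for i, (entry, w, threshold) in enumerate(zip(dtmf_codebook, weights, thresholds)):
        # Weighted squared Euclidean distance using the weights of this entry.
        distance = sum(wk * (xk - ek) ** 2 for xk, ek, wk in zip(x, entry, w))
        # Assumption: among entries below their threshold, keep the closest one.
        if distance < threshold and (best is None or distance < best[1]):
            best = (i, distance)
    return None if best is None else best[0]
```

When the function returns `None`, the frame is treated as voice and the conventional subtable search proceeds as usual.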
The present invention results in a significant improvement in the performance of the voice encoder 100 for processing DTMF signals, and ensures that these signals are properly encoded and correctly detected and decoded at the receiver.
Although the present invention has been described hereinabove by way of a preferred embodiment thereof, this embodiment can be modified at will, within the scope of the appended claims, without departing from the spirit and nature of the subject invention.

Claims

WHAT IS CLAIMED IS:
1. A method for quantizing a spectrum vector, supplied at recurrent time intervals, to produce a spectrum index, said method comprising the steps of: providing a spectrum-vector quantization codebook including a voice-signal quantization-codebook portion and a non-voice signal quantization codebook portion, wherein both the voice-signal quantization-codebook portion and the non-voice signal quantization codebook portion comprise a plurality of entries; searching the voice-signal quantization-codebook portion by measuring a weighted distance between the spectrum vector and the entries of the voice-signal quantization-codebook portion; searching the non-voice signal quantization codebook portion by measuring the weighted distance between the spectrum vector and the entries of the non-voice signal quantization codebook portion; detecting that the spectrum vector represents a voice signal when the smallest weighted distance is the weighted distance measured between the spectrum vector and one entry of the voice-signal quantization-codebook portion; detecting that the spectrum vector represents a non-voice signal when the smallest weighted distance is the weighted distance measured between the spectrum vector and one entry of the non-voice signal quantization codebook portion;
when the spectrum vector represents a voice signal, producing a spectrum index in relation to said one entry of the voice-signal quantization-codebook portion; and when the spectrum vector represents a non-voice signal, producing a spectrum index in relation to said one entry of the non-voice signal quantization codebook portion.
2. A method for quantizing a spectrum vector as defined in claim 1, wherein the voice-signal quantization-codebook portion comprises a plurality of quantization codebook subtables each having a plurality of entries, wherein a predetermined set of combinations of partial spectrum indexes are reserved for non-voice signals, and wherein: the step of searching the voice-signal quantization-codebook portion comprises searching the quantization codebook subtables and producing corresponding partial spectrum indexes forming combinations not included in said predetermined set of combinations of partial spectrum indexes; when the spectrum vector represents a voice signal, the step of producing a spectrum index comprises combining the partial spectrum indexes corresponding to said one entry of the voice-signal quantization-codebook portion to produce the spectrum index; and when the spectrum vector represents a non-voice signal, the step of producing a spectrum index comprises selecting, in relation to said one entry of the non-voice signal quantization codebook portion, one combination of said predetermined set to form the spectrum index.
3. A method for quantizing a spectrum vector as defined in claim 2, wherein the predetermined set of combinations of partial spectrum indexes reserved for non-voice signals correspond to invalid combinations of entries of respective quantization codebook subtables.
4. A method for quantizing a spectrum vector as defined in claim 3, in which: the spectrum vector has components related to line-spectral-pairs; the voice-signal quantization-codebook portion comprises at least three quantization codebook subtables each having a plurality of entries; said selecting step comprises selecting one combination of said predetermined set forming the spectrum index and composed of a non-voice-signal label part and a second part related to said one entry of the non-voice signal quantization codebook portion, said non-voice-signal label part corresponding to a combination of entries of two subtables amongst said at least three quantization codebook subtables which is logically invalid in regard to adjacent line-spectral-pair component ordering.
5. A method for quantizing a spectrum vector as defined in claim 2, wherein searching of the quantization codebook subtables is conducted in stages including a first stage and at least one subsequent stage, and wherein the predetermined set of combinations of partial spectrum indexes is formed by considering, at least, one predetermined partial spectrum index for the first stage combined with partial spectrum indexes corresponding to entries of the quantization codebook subtables searched in said at least one subsequent stage.
6. A method for quantizing a spectrum vector as defined in claim 1, in which the spectrum vector has components related to line-spectral-pairs.
7. A method for quantizing a spectrum vector as defined in claim 1, in which the spectrum vector has components related to immittance-spectral-pairs.
8. A method for quantizing a spectrum vector as defined in claim 1, wherein the non-voice signal comprises a signalling tone.
9. A method for quantizing a spectrum vector as defined in claim 8, wherein the signalling tone is a DTMF signal.
10. A method for quantizing a spectrum vector, supplied at recurrent time intervals, to produce a spectrum index, said method comprising the steps of: providing a spectrum-vector quantization codebook including a voice-signal quantization-codebook portion and a non-voice signal quantization codebook portion; detecting whether the spectrum vector represents a voice signal or a non-voice signal; when the spectrum vector represents a voice signal, searching the voice-signal quantization-codebook portion for quantizing the spectrum vector and producing the spectrum index; and when the spectrum vector represents a non-voice signal, searching the non-voice signal quantization codebook portion for quantizing the spectrum vector and producing the spectrum index.
11. A method for quantizing a spectrum vector as defined in claim 10, wherein the step of detecting whether the spectrum vector represents a voice signal or a non-voice signal comprises: measuring a weighted distance between the spectrum vector and the entries of the non-voice signal quantization codebook portion; detecting that the spectrum vector represents a non-voice signal when the weighted distance measured between the spectrum vector and one entry of the non-voice signal quantization codebook portion is smaller than a predetermined threshold associated to said one entry of the non-voice signal quantization codebook portion; and detecting that the spectrum vector represents a voice signal when the weighted distance measured between the spectrum vector and every entry of the non-voice signal quantization codebook portion is larger than the predetermined threshold associated to the corresponding entry.
12. A method for quantizing a spectrum vector as defined in claim 10, in which the spectrum vector has components related to line-spectral-pairs.
13. A method for quantizing a spectrum vector as defined in claim 10, in which the spectrum vector has components related to immittance-spectral-pairs.
14. A method for quantizing a spectrum vector as defined in claim 10, wherein the non-voice signal comprises a signalling tone.
15. A method for quantizing a spectrum vector as defined in claim 10, wherein the signalling tone is a DTMF signal.
16. A method for quantizing a spectrum vector, supplied at recurrent time intervals, to produce a spectrum index, said method comprising the steps of: providing a spectrum-vector quantization codebook including a voice-signal quantization-codebook portion and a non-voice signal quantization codebook portion, wherein the non-voice signal quantization codebook portion comprises a plurality of entries; measuring a weighted distance between the spectrum vector and the entries of the non-voice signal quantization codebook portion; detecting that the spectrum vector represents a non-voice signal when the weighted distance measured between the spectrum vector and one entry of the non-voice signal quantization codebook portion is smaller than a predetermined weighted distance threshold; upon detection that the spectrum vector represents a non-voice signal, producing a spectrum index including a predetermined non-voice-signal label part and a second part related to said one entry of the non-voice signal quantization codebook portion; and upon failure to detect that the spectrum vector represents a non-voice signal, searching the voice-signal quantization-codebook portion for quantizing the spectrum vector and producing the spectrum index.
17. A method for quantizing a spectrum vector as defined in claim 16, wherein the predetermined weighted distance threshold is associated to said one entry of the non-voice signal quantization codebook portion.
18. A method for quantizing a spectrum vector as defined in claim 16, wherein the voice-signal quantization-codebook portion comprises a plurality of quantization codebook subtables each having a plurality of entries, wherein the voice-signal quantization-codebook portion comprises addresses which are related to combinations of entries of the plurality of quantization codebook subtables, and wherein: the step of searching the voice-signal quantization-codebook portion comprises the steps of: splitting the spectrum vector into a plurality of subvectors; searching the quantization codebook subtables for quantizing the subvectors, respectively, and producing respective partial spectrum indexes; and combining the partial spectrum indexes to produce said spectrum index; and the step of producing a spectrum index including a predetermined non-voice-signal label part and a second part related to said one entry of the non-voice signal quantization codebook portion comprises the step of: reserving an invalid combination of the entries of at least two of said quantization codebook subtables as said predetermined non-voice-signal label part.
19. A method for quantizing a spectrum vector as defined in claim 16, wherein: the voice-signal quantization-codebook portion and the non-voice signal quantization codebook portion comprise a plurality of stages including a first stage and at least one subsequent stage, each stage having a given number of entries; and
the step of producing a spectrum index including a predetermined non-voice-signal label part and a second part related to said one entry of the non-voice signal quantization codebook portion comprises the steps of: reserving at least one entry of the first stage as said predetermined non-voice-signal label part; and combining said at least one entry of the first stage with at least one entry of said at least one subsequent stage to represent non-voice signals.
20. A method for quantizing a spectrum vector as defined in claim 16, in which the spectrum vector has components related to line-spectral-pairs.
21. A method for quantizing a spectrum vector as defined in claim 16, in which the spectrum vector has components related to immittance-spectral-pairs.
22. A method for quantizing a spectrum vector as defined in claim 16, wherein the measured weighted distance is a weighted Euclidean distance.
23. A method for quantizing a spectrum vector as defined in claim 16, wherein the non-voice signal comprises a signalling tone.
24. A method for quantizing a spectrum vector as defined in claim 23, wherein the signalling tone is a DTMF signal.
25. A device for quantizing a spectrum vector, supplied at recurrent time intervals, to produce a spectrum index, comprising: a spectrum-vector quantization codebook including a voice-signal quantization-codebook portion and a non-voice signal quantization codebook portion, wherein both the voice-signal quantization-codebook portion and the non-voice signal quantization codebook portion comprise a plurality of entries; means for searching the voice-signal quantization-codebook portion by measuring a weighted distance between the spectrum vector and the entries of the voice-signal quantization-codebook portion; means for searching the non-voice signal quantization codebook portion by measuring the weighted distance between the spectrum vector and the entries of the non-voice signal quantization codebook portion; means for detecting that the spectrum vector represents a voice signal when the smallest weighted distance is the weighted distance measured between the spectrum vector and one entry of the voice-signal quantization-codebook portion; means for detecting that the spectrum vector represents a non-voice signal when the smallest weighted distance is the weighted distance measured between the spectrum vector and one entry of the non-voice signal quantization codebook portion; means for producing a spectrum index in relation to said one entry of the voice-signal quantization-codebook portion when the spectrum vector represents a voice signal; and means for producing a spectrum index in relation to said one entry of the non-voice signal quantization codebook portion when the spectrum vector represents a non-voice signal.
26. A device for quantizing a spectrum vector as defined in claim 25, wherein the voice-signal quantization-codebook portion comprises a plurality of quantization codebook subtables each having a plurality of entries, wherein a predetermined set of combinations of partial spectrum indexes are reserved for non-voice signals, and wherein: the means for searching the voice-signal quantization-codebook portion comprises means for searching the quantization codebook subtables and means for producing corresponding partial spectrum indexes forming combinations not included in said predetermined set of combinations of partial spectrum indexes; the means for producing a spectrum index when the spectrum vector represents a voice signal comprises means for combining the partial spectrum indexes corresponding to said one entry of the voice-signal quantization-codebook portion to produce the spectrum index; and the means for producing a spectrum index when the spectrum vector represents a non-voice signal comprises means for selecting, in relation to said one entry of the non-voice signal quantization codebook portion, one combination of said predetermined set to form the spectrum index.
27. A device for quantizing a spectrum vector as defined in claim 26, wherein the predetermined set of combinations of partial spectrum indexes reserved for non-voice signals correspond to invalid combinations of entries of respective quantization codebook subtables.
28. A device for quantizing a spectrum vector as defined in claim 27, in which: the spectrum vector has components related to line-spectral-pairs; the voice-signal quantization-codebook portion comprises at least three quantization codebook subtables each having a plurality of entries; said one combination of said predetermined set forming the spectrum index is composed of a non-voice-signal label part and a second part related to said one entry of the non-voice signal quantization codebook portion; and said non-voice-signal label part corresponds to a combination of entries of two subtables amongst said at least three quantization codebook subtables which is logically invalid in regard to adjacent line-spectral-pair component ordering.
29. A device for quantizing a spectrum vector as defined in claim 26, wherein said subtable searching means comprises means for searching the quantization codebook subtables in stages including a first stage and at least one subsequent stage, and wherein the predetermined set of combinations of partial spectrum indexes is formed by considering, at least, one predetermined partial spectrum index for the first stage combined with partial spectrum indexes corresponding to entries of the quantization codebook subtables searched in said at least one subsequent stage.
30. A device for quantizing a spectrum vector as defined in claim 25, in which the spectrum vector has components related to line-spectral-pairs.
31. A device for quantizing a spectrum vector as defined in claim 25, in which the spectrum vector has components related to immittance-spectral-pairs.
32. A device for quantizing a spectrum vector as defined in claim 25, wherein the non-voice signal comprises a signalling tone.
33. A device for quantizing a spectrum vector as defined in claim 32, wherein the signalling tone is a DTMF signal.
34. A device for quantizing a spectrum vector, supplied at recurrent time intervals, to produce a spectrum index, comprising: a spectrum-vector quantization codebook including a voice-signal quantization-codebook portion and a non-voice signal quantization codebook portion; means for detecting whether the spectrum vector represents a voice signal or a non-voice signal; means for searching the voice-signal quantization-codebook portion for quantizing the spectrum vector and producing the spectrum index when the spectrum vector represents a voice signal; and means for searching the non-voice signal quantization codebook portion for quantizing the spectrum vector and producing the spectrum index when the spectrum vector represents a non-voice signal.
35. A device for quantizing a spectrum vector as defined in claim 34, wherein the means for detecting whether the spectrum vector represents a voice signal or a non-voice signal comprises: means for measuring a weighted distance between the spectrum vector and the entries of the non-voice signal quantization codebook portion; means for detecting that the spectrum vector represents a non-voice signal when the weighted distance measured between the spectrum vector and one entry of the non-voice signal quantization codebook portion is smaller than a predetermined threshold associated to said one entry of the non-voice signal quantization codebook portion; means for detecting that the spectrum vector represents a voice signal when the weighted distance measured between the spectrum vector and every entry of the non-voice signal quantization codebook portion is larger than the predetermined threshold associated to the corresponding entry of the non-voice signal quantization codebook portion.
36. A device for quantizing a spectrum vector as defined in claim 34, in which the spectrum vector has components related to line-spectral-pairs.
37. A device for quantizing a spectrum vector as defined in claim 34, in which the spectrum vector has components related to immittance-spectral-pairs.
38. A device for quantizing a spectrum vector as defined in claim 34, wherein the non-voice signal comprises a signalling tone.
39. A device for quantizing a spectrum vector as defined in claim 38, wherein the signalling tone is a DTMF signal.
40. A device for quantizing a spectrum vector, supplied at recurrent time intervals, to produce a spectrum index, comprising: a spectrum-vector quantization codebook including a voice-signal quantization-codebook portion and a non-voice signal quantization codebook portion, wherein the non-voice signal quantization codebook portion comprises a plurality of entries; means for measuring a weighted distance between the spectrum vector and the entries of the non-voice signal quantization codebook portion; means for detecting that the spectrum vector represents a non-voice signal when the weighted distance measured between the spectrum vector and one entry of the non-voice signal quantization codebook portion is smaller than a predetermined weighted distance threshold; means for producing, upon detection that the spectrum vector represents a non-voice signal, a spectrum index including a predetermined non-voice-signal label part and a second part related to said one entry of the non-voice signal quantization codebook portion; and means for searching, upon failure to detect that the spectrum vector represents a non-voice signal, the voice-signal quantization-codebook portion for quantizing the spectrum vector and producing the spectrum index.
41. A device for quantizing a spectrum vector as defined in claim 40, wherein the predetermined weighted distance threshold is associated to said one entry of the non-voice signal quantization codebook portion.
42. A device for quantizing a spectrum vector as defined in claim 40, wherein the voice-signal quantization-codebook portion comprises a plurality of quantization codebook subtables each having a plurality of entries, wherein the voice-signal quantization-codebook portion comprises addresses which are related to combinations of entries of the plurality of quantization codebook subtables, and wherein: said means for searching the voice-signal quantization-codebook portion comprises: means for splitting the spectrum vector into a plurality of subvectors; means for searching the quantization codebook subtables for quantizing the subvectors, respectively, and producing respective partial spectrum indexes; and means for combining the partial spectrum indexes to produce said spectrum index; and said means for producing a spectrum index including a predetermined non-voice-signal label part and a second part related to said one entry of the non-voice signal quantization codebook portion comprises: means for reserving an invalid combination of the entries of at least two of said quantization codebook subtables as said predetermined non-voice-signal label part.
43. A device for quantizing a spectrum vector as defined in claim 40, wherein:
the voice-signal quantization-codebook portion and the non-voice signal quantization codebook portion comprise a plurality of stages including a first stage and at least one subsequent stage, each stage having a given number of entries; and said means for producing a spectrum index including a predetermined non-voice-signal label part and a second part related to said one entry of the non-voice signal quantization codebook portion comprises: means for reserving at least one entry of the first stage as said predetermined non-voice signal label part; and means for combining said at least one entry of the first stage with at least one entry of said at least one subsequent stage to represent non-voice signals.
44. A device for quantizing a spectrum vector as defined in claim 40, wherein the measured weighted distance is a weighted Euclidean distance.
45. A device for quantizing a spectrum vector as defined in claim 40, in which the spectrum vector has components related to line-spectral-pairs.
46. A device for quantizing a spectrum vector as defined in claim 40, in which the spectrum vector has components related to immittance-spectral-pairs.
47. A device for quantizing a spectrum vector as defined in claim 40, wherein the non-voice signal comprises a signalling tone.
48. A device for quantizing a spectrum vector as defined in claim 40, wherein the signalling tone is a DTMF signal.
49. An encoder for encoding a voice or non-voice input signal, comprising: an encoding section responsive to the voice or non-voice input signal for producing residual voice or non-voice signal information; a spectrum processing section responsive to the input voice or non-voice signal for producing a spectrum index; and means for transmitting the residual signal information and the spectrum index through a communication channel; wherein said spectrum processing section comprises: means responsive to the input voice or non-voice signal for producing a spectrum vector at recurrent time intervals; and a device for quantizing the spectrum vector to produce the spectrum index, comprising: a spectrum-vector quantization codebook including a voice-signal quantization-codebook portion and a non-voice signal quantization codebook portion, wherein both the voice-signal quantization-codebook portion and the non-voice signal quantization codebook portion comprise a plurality of entries; means for searching the voice-signal quantization-codebook portion by measuring a weighted distance between the spectrum vector and the entries of the voice-signal quantization-codebook portion; means for searching the non-voice signal quantization codebook portion by measuring the weighted distance between the spectrum vector and the entries of the non-voice signal quantization codebook portion; means for detecting that the spectrum vector represents a voice signal when the smallest weighted distance is the weighted distance measured between the spectrum vector and one entry of the voice-signal quantization-codebook portion; means for detecting that the spectrum vector represents a non-voice signal when the smallest weighted distance is the weighted distance measured between the spectrum vector and one entry of the non-voice signal quantization codebook portion; means for producing a spectrum index in relation to said one entry of the voice-signal quantization-codebook portion when the spectrum vector represents a voice signal; and means for producing a spectrum index in relation to said one entry of the non-voice signal quantization codebook portion when the spectrum vector represents a non-voice signal.
50. An encoder as defined in claim 49, wherein the voice-signal quantization-codebook portion comprises a plurality of quantization codebook subtables each having a plurality of entries, wherein a predetermined set of combinations of partial spectrum indexes are reserved for non-voice signals, and wherein: the means for searching the voice-signal quantization-codebook portion comprises means for searching the quantization codebook subtables and means for producing corresponding partial spectrum indexes forming combinations not included in said predetermined set of combinations of partial spectrum indexes; the means for producing a spectrum index when the spectrum vector represents a voice signal comprises means for combining the partial spectrum indexes corresponding to said one entry of the voice-signal quantization-codebook portion to produce the spectrum index; and the means for producing a spectrum index when the spectrum vector represents a non-voice signal comprises means for selecting, in relation to said one entry of the non-voice signal quantization codebook portion, one combination of said predetermined set to form the spectrum index.
51. An encoder as defined in claim 50, wherein the predetermined set of combinations of partial spectrum indexes reserved for non-voice signals correspond to invalid combinations of entries of respective quantization codebook subtables.
52. An encoder as defined in claim 51, in which: the spectrum vector has components related to line-spectral-pairs; the voice-signal quantization-codebook portion comprises at least three quantization codebook subtables each having a plurality of entries; said one combination of said predetermined set forming the spectrum index is composed of a non-voice-signal label part and a second part related to said one entry of the non-voice signal quantization codebook portion; and said non-voice-signal label part corresponds to a combination of entries of two subtables amongst said at least three quantization codebook subtables which is logically invalid in regard to adjacent line-spectral-pair component ordering.
53. An encoder as defined in claim 50, wherein said subtable searching means comprises means for searching the quantization codebook subtables in stages including a first stage and at least one subsequent stage, and wherein the predetermined set of combinations of partial spectrum indexes is formed by considering, at least, one predetermined partial spectrum index for the first stage combined with partial spectrum indexes corresponding to entries of the quantization codebook subtables searched in said at least one subsequent stage.
54. An encoder for encoding an input voice or non-voice signal, comprising: an encoding section responsive to the input voice or non-voice signal for producing residual voice or non-voice signal information; a spectrum processing section responsive to the input voice or non-voice signal for producing a spectrum index; and means for transmitting the residual signal information and the spectrum index through a communication channel; wherein said spectrum processing section comprises: means responsive to the input signal for producing a spectrum vector at recurrent time intervals; and a device for quantizing the spectrum vector to produce the spectrum index, comprising: a spectrum-vector quantization codebook including a voice-signal quantization-codebook portion and a non-voice signal quantization codebook portion; means for detecting whether the spectrum vector represents a voice signal or a non-voice signal; means for searching the voice-signal quantization-codebook portion for quantizing the spectrum vector and producing the spectrum index when the spectrum vector represents a voice signal; and means for searching the non-voice signal quantization codebook portion for quantizing the spectrum vector and producing the spectrum index when the spectrum vector represents a non-voice signal.
55. An encoder as defined in claim 54, wherein the means for detecting whether the spectrum vector represents a voice signal or a non-voice signal comprises: means for measuring a weighted distance between the spectrum vector and the entries of the non-voice signal quantization codebook portion; means for detecting that the spectrum vector represents a non-voice signal when the weighted distance measured between the spectrum vector and one entry of the non-voice signal quantization codebook portion is smaller than a predetermined threshold associated to said one entry of the non-voice signal quantization codebook portion; and means for detecting that the spectrum vector represents a voice signal when the weighted distance measured between the spectrum vector and every entry of the non-voice signal quantization codebook portion is larger than the predetermined threshold associated to the corresponding entry of the non-voice signal quantization codebook portion.
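The detection rule recited in claim 55 can be illustrated with a minimal Python sketch. It is not taken from the patent: the function names, the toy codebook values, and the choice of a squared weighted Euclidean distance are assumptions made for illustration only, and the sketch returns the first entry that passes its own threshold, a detail the claim does not fix.

```python
from typing import Optional, Sequence


def weighted_distance(x: Sequence[float], y: Sequence[float],
                      w: Sequence[float]) -> float:
    # Assumed distance measure: squared weighted Euclidean distance.
    return sum(wi * (xi - yi) ** 2 for xi, yi, wi in zip(x, y, w))


def detect_non_voice(spectrum: Sequence[float],
                     entries: Sequence[Sequence[float]],
                     thresholds: Sequence[float],
                     weights: Sequence[float]) -> Optional[int]:
    """Return the index of the first non-voice codebook entry whose weighted
    distance to the spectrum vector is below that entry's own threshold.
    Return None when every entry fails its threshold (treated as voice)."""
    for i, (entry, threshold) in enumerate(zip(entries, thresholds)):
        if weighted_distance(spectrum, entry, weights) < threshold:
            return i
    return None


# Hypothetical two-entry non-voice codebook with one threshold per entry.
entries = [[0.10, 0.20], [0.50, 0.60]]
thresholds = [0.01, 0.01]
weights = [1.0, 1.0]

print(detect_non_voice([0.50, 0.61], entries, thresholds, weights))
print(detect_non_voice([0.30, 0.90], entries, thresholds, weights))
```

A result of None corresponds to the case where the voice-signal quantization-codebook portion would be searched instead.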
56. An encoder for encoding an input voice or non-voice signal, comprising: an encoding section responsive to the input voice or non-voice signal for producing residual voice or non-voice signal information; a spectrum processing section responsive to the input voice or non-voice signal for producing a spectrum index; and means for transmitting the residual signal information and the spectrum index through a communication channel; - wherein said spectrum processing section comprises: means responsive to the input voice or non-voice signal for producing a spectrum vector at recurrent time intervals; a device for quantizing the spectrum vector to produce the spectrum index, comprising: a spectrum-vector quantization codebook including a voice-signal quantization-codebook portion and a non-voice signal quantization codebook portion, wherein the non-voice signal quantization codebook portion comprises a plurality of entries; means for measuring a weighted distance between the spectrum vector and the entries of the non-voice signal quantization codebook portion; means for detecting that the spectrum vector represents a non-voice signal when the weighted distance measured between the spectrum vector and one entry of the non-voice signal quantization codebook portion is smaller than a predetermined weighted distance threshold; means for producing, upon detection that the spectrum vector represents a non-voice signal, a spectrum index including a predetermined non-voice-signal label part and a second part related to said one entry of the non-voice signal quantization codebook portion; and means for searching, upon failure to detect that the spectrum vector represents a non-voice signal, the voice-signal quantization-codebook portion for quantizing the spectrum vector and producing the spectrum index.
57. An encoder as defined in claim 56, wherein the predetermined weighted distance threshold is associated to said one entry of the non-voice signal quantization codebook portion.
58. An encoder as defined in claim 56, wherein the voice-signal quantization-codebook portion comprises a plurality of quantization codebook subtables each having a plurality of entries, wherein the voice-signal quantization-codebook portion comprises addresses which are related to combinations of entries of the plurality of quantization codebook subtables, and wherein:
said means for searching the voice-signal quantization-codebook portion comprises: means for splitting the spectrum vector into a plurality of subvectors; means for searching the quantization codebook subtables for quantizing the subvectors, respectively, and producing respective partial spectrum indexes; and means for combining the partial spectrum indexes to produce said spectrum index; and said means for producing a spectrum index including a predetermined non-voice-signal label part and a second part related to said one entry of the non-voice signal quantization codebook portion comprises: means for reserving an invalid combination of the entries of at least two of said quantization codebook subtables as said predetermined non-voice-signal label part.
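The idea of reserving an invalid combination of subtable entries as the non-voice-signal label can be sketched as follows. This is a toy illustration under stated assumptions, not the patent's implementation: the subtables hold made-up line-spectral-pair values, a combination is treated as invalid when joining two adjacent subvectors breaks the ascending ordering of the components, and all function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def find_invalid_pairs(sub_a: Sequence[Sequence[float]],
                       sub_b: Sequence[Sequence[float]]) -> List[Tuple[int, int]]:
    """List index pairs (i, j) for which the last component of sub_a[i] is not
    below the first component of sub_b[j], so the joined vector is not in
    ascending order and can never be produced for a voice signal."""
    return [(i, j)
            for i, a in enumerate(sub_a)
            for j, b in enumerate(sub_b)
            if a[-1] >= b[0]]


def encode_non_voice(label: Tuple[int, int], entry: int) -> Tuple[int, int, int]:
    # Label part = reserved invalid pair; second part = non-voice entry number.
    return (label[0], label[1], entry)


def decode(index: Tuple[int, int, int], label: Tuple[int, int]):
    i, j, k = index
    if (i, j) == label:
        return ("non-voice", k)
    return ("voice", (i, j, k))


# Hypothetical subtables for the first two subvectors.
sub_a = [[0.10, 0.20], [0.30, 0.45]]
sub_b = [[0.40, 0.50], [0.60, 0.70]]

label = find_invalid_pairs(sub_a, sub_b)[0]
print(label, decode(encode_non_voice(label, 3), label))
```

Because the reserved pair cannot result from a valid voice-signal search, the decoder can tell the two cases apart without any extra bits in the spectrum index.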
59. An encoder as defined in claim 56, wherein: the voice-signal quantization-codebook portion and the non-voice signal quantization codebook portion comprise a plurality of stages including a first stage and at least one subsequent stage, each stage having a given number of entries; and
said means for producing a spectrum index including a predetermined non-voice-signal label part and a second part related to said one entry of the non-voice signal quantization codebook portion comprises: means for reserving at least one entry of the first stage as said predetermined non-voice signal label part; and means for combining said at least one entry of the first stage with at least one entry of said at least one subsequent stage to represent non-voice signals.
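For the multi-stage arrangement of claim 59, one way to picture the reserved first-stage entry is a packed index in which the first-stage field carries the label. The bit widths, the reserved value, and the function names below are hypothetical choices for illustration; the claims do not specify them.

```python
from typing import Tuple

STAGE2_BITS = 6          # assumed width of the second-stage field
RESERVED_STAGE1 = 127    # assumed first-stage entry reserved as the label


def pack(stage1: int, stage2: int) -> int:
    # Place the first-stage index in the high bits, second-stage in the low bits.
    return (stage1 << STAGE2_BITS) | stage2


def unpack(index: int) -> Tuple[int, int]:
    return index >> STAGE2_BITS, index & ((1 << STAGE2_BITS) - 1)


def encode_non_voice(entry: int) -> int:
    # Reserved first-stage entry combined with a second-stage entry that
    # identifies which non-voice signal was detected.
    return pack(RESERVED_STAGE1, entry)


def classify(index: int):
    stage1, stage2 = unpack(index)
    if stage1 == RESERVED_STAGE1:
        return ("non-voice", stage2)
    return ("voice", (stage1, stage2))


print(classify(encode_non_voice(5)))
print(classify(pack(3, 9)))
```

In this sketch the voice-signal search would simply never select the reserved first-stage entry, so every index carrying it is read as a non-voice signal.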
60. An encoder as defined in claim 56, wherein the measured weighted distance is a weighted Euclidean distance.
61. A cellular communication system for servicing a large geographical area divided into a plurality of cells, comprising: mobile transmitter/receiver units; cellular base stations respectively situated in said cells; means for controlling communication between the cellular base stations; a bidirectional wireless communication sub-system between each mobile unit situated in one cell and the cellular base station of said one cell, said bidirectional wireless communication sub-system comprising in both the mobile unit and the cellular base station (a) a transmitter including an encoder for encoding a voice or non-voice signal and means for transmitting the encoded voice or non-voice signal, and (b) a receiver including means for receiving a transmitted encoded voice or non-voice signal and means for decoding the received encoded voice or non-voice signal;
- wherein said encoder comprises: an encoding section responsive to the voice or non-voice signal for producing residual voice or non-voice signal information; a spectrum processing section responsive to the voice or non-voice signal for producing a spectrum index; and means for supplying the residual signal information and the spectrum index to the transmitting means; - wherein said spectrum processing section comprises: means responsive to the voice or non-voice signal for producing a spectrum vector at recurrent time intervals; and a device for quantizing the spectrum vector to produce the spectrum index, comprising: a spectrum-vector quantization codebook including a voice-signal quantization-codebook portion and a non-voice signal quantization codebook portion, wherein both the voice-signal quantization-codebook portion and the non-voice signal quantization codebook portion comprise a plurality of entries; means for searching the voice-signal quantization-codebook portion by measuring a weighted distance between the spectrum vector and the entries of the voice-signal quantization-codebook portion;
means for searching the non-voice signal quantization codebook portion by measuring the weighted distance between the spectrum vector and the entries of the non-voice signal quantization codebook portion;
means for detecting that the spectrum vector represents a voice signal when the smallest weighted distance is the weighted distance measured between the spectrum vector and one entry of the voice-signal quantization-codebook portion;
means for detecting that the spectrum vector represents a non-voice signal when the smallest weighted distance is the weighted distance measured between the spectrum vector and one entry of the non-voice signal quantization codebook portion;
means for producing a spectrum index in relation to said one entry of the voice-signal quantization-codebook portion when the spectrum vector represents a voice signal; and
means for producing a spectrum index in relation to said one entry of the non-voice signal quantization codebook portion when the spectrum vector represents a non-voice signal.
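The joint search of claim 61, in which the portion holding the overall nearest entry decides whether the vector is treated as voice or non-voice, can be sketched as below. The codebook values, the squared weighted Euclidean distance, the tie-breaking in favour of voice, and the function names are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def weighted_distance(x: Sequence[float], y: Sequence[float],
                      w: Sequence[float]) -> float:
    # Assumed distance measure: squared weighted Euclidean distance.
    return sum(wi * (xi - yi) ** 2 for xi, yi, wi in zip(x, y, w))


def classify_by_nearest(spectrum: Sequence[float],
                        voice_entries: Sequence[Sequence[float]],
                        non_voice_entries: Sequence[Sequence[float]],
                        weights: Sequence[float]) -> Tuple[str, int]:
    """Search both codebook portions and label the vector according to the
    portion that contains the entry with the smallest weighted distance."""
    voice_d = [weighted_distance(spectrum, e, weights) for e in voice_entries]
    non_voice_d = [weighted_distance(spectrum, e, weights)
                   for e in non_voice_entries]
    best_voice = min(range(len(voice_d)), key=voice_d.__getitem__)
    best_non_voice = min(range(len(non_voice_d)), key=non_voice_d.__getitem__)
    # Ties go to voice here; the claim does not say how ties are resolved.
    if non_voice_d[best_non_voice] < voice_d[best_voice]:
        return ("non-voice", best_non_voice)
    return ("voice", best_voice)


# Hypothetical codebook portions.
voice_entries = [[0.10, 0.30], [0.20, 0.60]]
non_voice_entries = [[0.50, 0.90]]
weights = [1.0, 1.0]

print(classify_by_nearest([0.49, 0.90], voice_entries, non_voice_entries, weights))
print(classify_by_nearest([0.10, 0.31], voice_entries, non_voice_entries, weights))
```

Unlike a threshold test, this variant needs no per-entry thresholds, at the cost of always searching both portions.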
62. A cellular communication system for servicing a large geographical area divided into a plurality of cells, comprising: mobile transmitter/receiver units; cellular base stations respectively situated in said cells; means for controlling communication between the cellular base stations; a bidirectional wireless communication sub-system between each mobile unit situated in one cell and the cellular base station of said one cell, said bidirectional wireless communication sub-system comprising in both the mobile unit and the cellular base station (a) a transmitter including an encoder for encoding a voice or non-voice signal and means for transmitting the encoded voice or non-voice signal, and (b) a receiver including means for receiving a transmitted encoded voice or non-voice signal and means for decoding the received encoded voice or non-voice signal; - wherein said encoder comprises: an encoding section responsive to the voice or non-voice signal for producing residual voice or non-voice signal information; a spectrum processing section responsive to the voice or non-voice signal for producing a spectrum index; and means for supplying the residual signal information and the spectrum index to the transmitting means;
-wherein said spectrum processing section comprises: means responsive to the voice or non-voice signal for producing a spectrum vector at recurrent time intervals; and
a device for quantizing the spectrum vector to produce the spectrum index, comprising: a spectrum-vector quantization codebook including a voice-signal quantization-codebook portion and a non-voice signal quantization codebook portion; means for detecting whether the spectrum vector represents a voice signal or a non-voice signal; means for searching the voice-signal quantization-codebook portion for quantizing the spectrum vector and producing the spectrum index when the spectrum vector represents a voice signal; and means for searching the non-voice signal quantization codebook portion for quantizing the spectrum vector and producing the spectrum index when the spectrum vector represents a non-voice signal.
63. A cellular communication system for servicing a large geographical area divided into a plurality of cells, comprising: mobile transmitter/receiver units; cellular base stations respectively situated in said cells; means for controlling communication between the cellular base stations;
a bidirectional wireless communication sub-system between each mobile unit situated in one cell and the cellular base station of said one cell, said bidirectional wireless communication sub-system comprising in both the mobile unit and the cellular base station (a) a transmitter including an encoder for encoding a voice or non-voice signal and means for transmitting the encoded voice or non-voice signal, and (b) a receiver including means for receiving a transmitted encoded voice or non-voice signal and means for decoding the received encoded voice or non-voice signal;
- wherein said encoder comprises: an encoding section responsive to the voice or non-voice signal for producing residual voice or non-voice signal information; a spectrum processing section responsive to the voice or non-voice signal for producing a spectrum index; and means for supplying the residual signal information and the spectrum index to the transmitting means;
- wherein said spectrum processing section comprises: means responsive to the voice or non-voice signal for producing a spectrum vector at recurrent time intervals; a device for quantizing the spectrum vector comprising: a spectrum-vector quantization codebook including a voice-signal quantization-codebook portion and a non-voice signal quantization codebook portion, wherein the non-voice signal quantization codebook portion comprises a plurality of entries;
means for measuring a weighted distance between the spectrum vector and the entries of the non-voice signal quantization codebook portion;
means for detecting that the spectrum vector represents a non-voice signal when the weighted distance measured between the spectrum vector and one entry of the non-voice signal quantization codebook portion is smaller than a predetermined weighted distance threshold;
means for producing, upon detection that the spectrum vector represents a non-voice signal, a spectrum index including a predetermined non-voice-signal label part and a second part related to said one entry of the non-voice signal quantization codebook portion; and
means for searching, upon failure to detect that the spectrum vector represents a non-voice signal, the voice-signal quantization-codebook portion for quantizing the spectrum vector and producing the spectrum index.
PCT/CA1997/000516 1996-07-17 1997-07-17 Enhanced encoding of dtmf and other signalling tones WO1998004046A2 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP97931602A EP0913034A2 (en) 1996-07-17 1997-07-17 Enhanced encoding of dtmf and other signalling tones
AU35345/97A AU3534597A (en) 1996-07-17 1997-07-17 Enhanced encoding of dtmf and other signalling tones

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US2191196P 1996-07-17 1996-07-17
US60/021,911 1996-07-17

Publications (2)

Publication Number Publication Date
WO1998004046A2 true WO1998004046A2 (en) 1998-01-29
WO1998004046A3 WO1998004046A3 (en) 1998-03-26

Family

ID=21806795

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CA1997/000516 WO1998004046A2 (en) 1996-07-17 1997-07-17 Enhanced encoding of dtmf and other signalling tones

Country Status (4)

Country Link
EP (1) EP0913034A2 (en)
AU (1) AU3534597A (en)
CA (1) CA2258183A1 (en)
WO (1) WO1998004046A2 (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1179820A2 (en) * 2000-08-10 2002-02-13 Mitsubishi Denki Kabushiki Kaisha Method of coding LSP coefficients during speech inactivity
EP1420390A1 (en) * 2002-11-13 2004-05-19 Digital Voice Systems, Inc. Interoperable vocoder
US7634399B2 (en) 2003-01-30 2009-12-15 Digital Voice Systems, Inc. Voice transcoder
US8036886B2 (en) 2006-12-22 2011-10-11 Digital Voice Systems, Inc. Estimation of pulsed speech model parameters
US8359197B2 (en) 2003-04-01 2013-01-22 Digital Voice Systems, Inc. Half-rate vocoder
US11270714B2 (en) 2020-01-08 2022-03-08 Digital Voice Systems, Inc. Speech coding using time-varying interpolation
US11990144B2 (en) 2021-07-28 2024-05-21 Digital Voice Systems, Inc. Reducing perceived effects of non-voice data in digital speech

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0443548A2 (en) * 1990-02-22 1991-08-28 Nec Corporation Speech coder
EP0545386A2 (en) * 1991-12-03 1993-06-09 Nec Corporation Method for speech coding and voice-coder
EP0573398A2 (en) * 1992-06-01 1993-12-08 Hughes Aircraft Company C.E.L.P. Vocoder
EP0607989A2 (en) * 1993-01-22 1994-07-27 Nec Corporation Voice coder system

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7031912B2 (en) 2000-08-10 2006-04-18 Mitsubishi Denki Kabushiki Kaisha Speech coding apparatus capable of implementing acceptable in-channel transmission of non-speech signals
EP1179820A3 (en) * 2000-08-10 2003-11-12 Mitsubishi Denki Kabushiki Kaisha Method of coding LSP coefficients during speech inactivity
EP1179820A2 (en) * 2000-08-10 2002-02-13 Mitsubishi Denki Kabushiki Kaisha Method of coding LSP coefficients during speech inactivity
US7970606B2 (en) 2002-11-13 2011-06-28 Digital Voice Systems, Inc. Interoperable vocoder
EP1420390A1 (en) * 2002-11-13 2004-05-19 Digital Voice Systems, Inc. Interoperable vocoder
US8315860B2 (en) 2002-11-13 2012-11-20 Digital Voice Systems, Inc. Interoperable vocoder
US7634399B2 (en) 2003-01-30 2009-12-15 Digital Voice Systems, Inc. Voice transcoder
US7957963B2 (en) 2003-01-30 2011-06-07 Digital Voice Systems, Inc. Voice transcoder
US8359197B2 (en) 2003-04-01 2013-01-22 Digital Voice Systems, Inc. Half-rate vocoder
US8595002B2 (en) 2003-04-01 2013-11-26 Digital Voice Systems, Inc. Half-rate vocoder
US8036886B2 (en) 2006-12-22 2011-10-11 Digital Voice Systems, Inc. Estimation of pulsed speech model parameters
US8433562B2 (en) 2006-12-22 2013-04-30 Digital Voice Systems, Inc. Speech coder that determines pulsed parameters
US11270714B2 (en) 2020-01-08 2022-03-08 Digital Voice Systems, Inc. Speech coding using time-varying interpolation
US11990144B2 (en) 2021-07-28 2024-05-21 Digital Voice Systems, Inc. Reducing perceived effects of non-voice data in digital speech

Also Published As

Publication number Publication date
CA2258183A1 (en) 1998-01-29
AU3534597A (en) 1998-02-10
WO1998004046A3 (en) 1998-03-26
EP0913034A2 (en) 1999-05-06

Similar Documents

Publication Publication Date Title
US5778335A (en) Method and apparatus for efficient multiband celp wideband speech and music coding and decoding
US5966688A (en) Speech mode based multi-stage vector quantizer
US5495555A (en) High quality low bit rate celp-based speech codec
KR100594670B1 (en) Automatic speech/speaker recognition over digital wireless channels
US4890325A (en) Speech coding transmission equipment
CN1129263C (en) Method and apparatus for group encoding signals
KR101061404B1 (en) How to encode and decode audio at variable rates
EP1089257A2 (en) Header data formatting for a vocoder
US6721712B1 (en) Conversion scheme for use between DTX and non-DTX speech coding systems
JPH0863200A (en) Generation method of linear prediction coefficient signal
WO1997027578A1 (en) Very low bit rate time domain speech analyzer for voice messaging
US6988067B2 (en) LSF quantizer for wideband speech coder
US6073094A (en) Voice compression by phoneme recognition and communication of phoneme indexes and voice features
KR20050046204A (en) An apparatus for coding of variable bit-rate wideband speech and audio signals, and a method thereof
US6104994A (en) Method for speech coding under background noise conditions
KR100257361B1 (en) Very low bit rate voice messaging system using asymmetric voice compression processing
WO1998004046A2 (en) Enhanced encoding of dtmf and other signalling tones
US5666350A (en) Apparatus and method for coding excitation parameters in a very low bit rate voice messaging system
US5987406A (en) Instability eradication for analysis-by-synthesis speech codecs
JPH02231825A (en) Method of encoding voice, method of decoding voice and communication method employing the methods
EP1121686B1 (en) Speech parameter compression
Jayant et al. Coding of speech and wideband audio
EP0850471B1 (en) Very low bit rate voice messaging system using variable rate backward search interpolation processing
CA2407791C (en) Method and apparatus for mitigating the effect of transmission errors in a distributed speech recognition process and system
CN1347548A (en) Speech synthesizer based on variable rate speech coding

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GH HU IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZW AM AZ BY KG KZ MD RU TJ TM

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH KE LS MW SD SZ UG ZW AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
AK Designated states

Kind code of ref document: A3

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GH HU IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZW AM AZ BY KG KZ MD RU TJ TM

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): GH KE LS MW SD SZ UG ZW AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL

121 Ep: the epo has been informed by wipo that ep was designated in this application
ENP Entry into the national phase

Ref document number: 2258183

Country of ref document: CA

Kind code of ref document: A

WWE Wipo information: entry into national phase

Ref document number: 09214963

Country of ref document: US

WWE Wipo information: entry into national phase

Ref document number: 1997931602

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 1997931602

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

NENP Non-entry into the national phase

Ref document number: 1998506404

Country of ref document: JP

WWR Wipo information: refused in national office

Ref document number: 1997931602

Country of ref document: EP

WWW Wipo information: withdrawn in national office

Ref document number: 1997931602

Country of ref document: EP