WO2000042601A1 - A method and device for designing and searching large stochastic codebooks in low bit rate speech encoders - Google Patents


Info

Publication number
WO2000042601A1
WO2000042601A1 PCT/CA2000/000036 CA0000036W WO0042601A1 WO 2000042601 A1 WO2000042601 A1 WO 2000042601A1 CA 0000036 W CA0000036 W CA 0000036W WO 0042601 A1 WO0042601 A1 WO 0042601A1
Authority
WO
WIPO (PCT)
Prior art keywords
random vectors
random
combination
vectors
stochastic
Prior art date
Application number
PCT/CA2000/000036
Other languages
French (fr)
Inventor
Corporation Voiceage
Original Assignee
Laflamme, Claude
Lefebre, Roch
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Laflamme, Claude, Lefebre, Roch filed Critical Laflamme, Claude
Priority to AU30286/00A priority Critical patent/AU3028600A/en
Publication of WO2000042601A1 publication Critical patent/WO2000042601A1/en

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/08 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
    • G10L19/12 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L2019/0001 Codebooks
    • G10L2019/0004 Design or structure of the codebook
    • G10L2019/0005 Multi-stage vector quantisation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L2019/0001 Codebooks
    • G10L2019/0013 Codebook search algorithms

Definitions

  • the present invention relates to a stochastic codebook structure, a method for generating a codeword using this stochastic codebook structure, and method and devices for efficiently searching a stochastic codebook.
  • a speech encoder converts a speech signal into a digital bitstream transmitted over a communication channel or stored in a storage medium.
  • the speech signal is first digitized, i.e. sampled and quantized with usually 16 bits per sample.
  • the speech encoder then represents these digital samples with a smaller number of bits while maintaining a good subjective speech quality.
  • the speech decoder or synthesizer processes the transmitted or stored bitstream and converts it back to a sound signal.
  • CELP Code-Excited Linear Prediction
  • In CELP, a linear prediction (LP) filter is computed and transmitted every frame.
  • An excitation signal is determined in each subframe, which usually consists of two components: one from the past excitation (also called pitch contribution or adaptive codebook) and the other from an innovative codebook (also called fixed codebook). This excitation signal is transmitted and used at the decoder as the input of the LP synthesis filter in order to obtain the synthesized speech.
  • an innovative codebook is an indexed set of N-sample-long sequences which will be referred to as N-dimensional codevectors.
  • An innovative codebook can be stored in physical memory (e.g. look-up table) or can refer to a mechanism for relating the index k to a corresponding codevector (e.g. a formula).
  • each subframe (block of N samples) is synthesized by filtering an appropriate codevector from an innovative codebook through time varying filters modelling the spectral characteristics of the speech signal.
  • a synthetic output is computed for at least a subset of the codevectors of the innovative codebook (codebook search).
  • the retained codevector is the one producing the synthetic output closest to the original speech signal according to a perceptually weighted distortion measure. This perceptual weighting is performed through a so-called perceptual weighting filter, which is usually derived from the LP filter.
  • a first type of innovative codebooks are the so-called “stochastic codebooks”.
  • a drawback of these codebooks is that they involve substantial physical storage. They are stochastic (i.e. random) in the sense that the path from index to codevector involves look-up tables which are the result of randomly generated numbers or statistical techniques applied to large speech training sets. The size of stochastic codebooks tends to be limited by storage and/or search complexity.
  • a second type of innovative codebooks are the algebraic codebooks.
  • algebraic codebooks are not random and require no substantial storage.
  • An algebraic codebook is a set of indexed codevectors in which the amplitudes and positions of the pulses of the k-th codevector can be derived from a corresponding index k through a rule requiring no, or minimal, physical storage. Therefore, the size of algebraic codebooks is not limited by storage considerations.
  • Algebraic codebooks can be designed for efficient search. For these reasons, algebraic codebooks have known a considerable success in speech coding standards, where codebooks ranging from 17 bits (e.g. ITU-T Recommendation G.729) to 35 bits (ETSI Enhanced Full Rate GSM) were efficiently used.
  • An object of the present invention is therefore to provide a stochastic codebook structure with reduced storage requirements, a method for generating a codeword using this stochastic codebook structure, and method and devices for efficiently searching this stochastic codebook structure.
  • a stochastic codebook structure for generating codevectors, comprising a stochastic table and a codevector generator.
  • the stochastic table contains a set of M random vectors.
  • the codevector generator is connected to the stochastic table and comprises means for adding a number P of random vectors from the stochastic table to produce a codevector.
  • the present invention also relates to a stochastic codebook structure for generating codevectors, comprising a stochastic table containing a set of M random vectors.
  • the stochastic codebook structure also comprises a codevector generator connected to the stochastic table and including a combiner of subsets of P random vectors from the stochastic table. This combiner produces codevectors each by combination of a subset of P random vectors from the stochastic table.
  • a method for generating a codevector comprising constructing a stochastic table containing a set of M random vectors and combining a number P of random vectors from the stochastic table to produce a codevector.
  • - combining a number P of random vectors comprises adding the number P of random vectors from the stochastic table to produce the codevector
  • the number P is selected from the group consisting of 2 and 3;
  • - adding the number P of random vectors from the stochastic table to produce the codevector comprises computing the codevector using the following relation: $c = s_1 v_{p_1} + s_2 v_{p_2} + \dots + s_P v_{p_P}$, where:
  • $c$ denotes the codevector;
  • $v_{p_1}, v_{p_2}, \dots, v_{p_P}$ denote the P random vectors;
  • $s_1, s_2, \dots, s_P$ are signs equal to -1 or 1; and
  • $p_1, p_2, \dots, p_P$ are indices of the P random vectors.
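The additive combination described above, in which P signed random vectors from the stochastic table are summed into one codevector, can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `make_codevector`, the table size M = 32, the dimension N = 40 and the Gaussian table contents are all assumptions chosen for the example.

```python
import numpy as np

def make_codevector(table, indices, signs):
    """Return c = sum_i signs[i] * table[indices[i]].

    table   : (M, N) array, one random vector per row (hypothetical layout)
    indices : P row indices p_1..p_P into the table
    signs   : P signs s_1..s_P, each -1 or +1
    """
    table = np.asarray(table, dtype=float)
    if len(indices) != len(signs):
        raise ValueError("need one sign per index")
    if any(s not in (-1, 1) for s in signs):
        raise ValueError("signs must be -1 or +1")
    c = np.zeros(table.shape[1])
    for p, s in zip(indices, signs):
        c += s * table[p]          # add one signed random vector
    return c

# Hypothetical table: M = 32 random vectors of dimension N = 40.
rng = np.random.default_rng(0)
table = rng.standard_normal((32, 40))
c = make_codevector(table, indices=[3, 17], signs=[1, -1])   # P = 2
```

Only the M table rows are stored; every codevector is formed on demand from P of them, which is where the storage saving over a fully tabulated codebook comes from.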
  • This stochastic codebook searching method comprises applying to the M random vectors a preselection criterion related to the signal, preselecting a subset of K random vectors amongst the M random vectors of the above mentioned set in relation to the preselection criterion, applying a search criterion related to the signal to combinations of P random vectors out of the K random vectors of the preselected subset, and selecting, in relation to the search criterion, one of the combinations of P random vectors forming the best codevector for encoding the signal.
  • the invention is concerned with a corresponding device for efficiently searching a stochastic codebook having a stochastic table containing a set of M random vectors of dimension N to find the best codevector for encoding a signal.
  • This stochastic codebook searching device further comprises means for applying a search criterion related to the signal to combinations of P random vectors out of the K random vectors of the preselected subset, and means for selecting, in relation to the search criterion, one of the combinations of P random vectors forming the best codevector for encoding the signal.
  • - applying the preselection criterion comprises: calculating a dot product between: a backward filtered version of a target vector calculated during encoding of the signal and used for searching the stochastic codebook; and each of the M random vectors of the set; and preselecting a subset of K random vectors comprises: preselecting as the subset the K random vectors of the set with the largest absolute values of dot products; (This corresponds to testing only the numerator of the search criterion) - calculating the dot product comprises calculating the backward filtered version d(n) of the search target vector x(n) by correlating the search target vector x(n) with h(n) in accordance with the following relation:
  • h(n) is an impulse response of a weighted synthesis filter calculated during encoding of the signal
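The preselection step can be sketched as below. The correlation formula is not reproduced in this text, so the sketch assumes the common form $d(n) = \sum_{i=n}^{N-1} x(i)\,h(i-n)$ for the backward filtered target; the function names, the row-per-vector table layout and the choice of taking each sign from the sign of the dot product are assumptions for illustration.

```python
import numpy as np

def backward_filter(x, h):
    """Backward filtered target, assuming d(n) = sum_{i=n}^{N-1} x(i) * h(i - n)."""
    x = np.asarray(x, dtype=float)
    h = np.asarray(h, dtype=float)
    N = len(x)
    return np.array([np.dot(x[n:], h[:N - n]) for n in range(N)])

def preselect(table, d, K):
    """Keep the K table rows with the largest |dot product| against d.

    Returns (row indices, signs, dot products) for the K retained vectors.
    """
    dots = np.asarray(table, dtype=float) @ d
    keep = np.argsort(-np.abs(dots), kind="stable")[:K]
    signs = np.where(dots[keep] >= 0, 1, -1)   # preset sign = sign of dot product
    return keep, signs, dots[keep]

# Tiny hypothetical example: M = 3 vectors of dimension N = 2.
idx, sgn, vals = preselect([[1, 0], [0, -3], [2, 0]], np.array([1.0, 1.0]), K=2)
```

Under this assumed form, the dot product of d with a random vector equals the dot product of the target x with the filtered random vector, so ranking by it tests the numerator of the search criterion without filtering all M vectors.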
  • - applying a search criterion comprises calculating, for each combination of P random vectors, a mathematical relation involving the combination, the mathematical relation being advantageously a ratio involving the combination and the target vector;
  • - selecting one of the combinations of P random vectors comprises selecting the combination with the largest ratio;
  • - calculating the ratio for each combination of P random vectors comprises: convolving each random vector of the subset of K random vectors with an impulse response of a weighted synthesis filter calculated during encoding of the signal and thereby producing K filtered random vectors; computing the energy of each filtered random vector; calculating a dot product of each filtered random vector with the target vector; and for each combination of P random vectors, computing the ratio in response to the corresponding P filtered random vectors, P computed energies and P calculated dot products;
  • - computing the ratio for each combination of P random vectors comprises computing the ratios for all possible combinations of P vectors through P nested calculations loops;
  • a gain of the signal representative codevector through a ratio having: a numerator constituted by a sum of the P dot products between the P random vectors of the selected one combination and the target vector; and a denominator involving the P computed energies and P filtered random vectors respectively corresponding to the P random vectors of the selected one combination;
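For P = 2, the search over the preselected vectors and the gain computation can be sketched with two nested loops. The exact ratio is not reproduced in this text; the sketch assumes the usual CELP form, with the squared sum of dot products as numerator and the energy of the summed filtered vectors as denominator. The name `search_pairs`, the restriction to pairs of distinct vectors and the truncation of the convolution to the subframe length are assumptions.

```python
import numpy as np

def search_pairs(x, h, vectors, signs):
    """Search all pairs of K preselected vectors (P = 2).

    Returns (best ratio, (i, j), gain) where i < j index the preselected set.
    """
    x = np.asarray(x, dtype=float)
    N = len(x)
    # Filter each signed vector with h, truncated to the subframe length N.
    w = np.array([s * np.convolve(v, h)[:N] for v, s in zip(vectors, signs)])
    energy = np.einsum("ij,ij->i", w, w)   # energy of each filtered vector
    rho = w @ x                            # dot product with the target
    best_q, best_pair, best_gain = -np.inf, None, 0.0
    K = len(w)
    for i in range(K - 1):                 # first nested loop
        for j in range(i + 1, K):          # second nested loop
            den = energy[i] + energy[j] + 2.0 * np.dot(w[i], w[j])
            if den <= 0.0:
                continue                   # skip degenerate combinations
            num = rho[i] + rho[j]
            q = num * num / den
            if q > best_q:
                best_q, best_pair, best_gain = q, (i, j), num / den
    return best_q, best_pair, best_gain
```

The K filtered vectors, energies and dot products are computed once, so each candidate pair costs only one cross-correlation, a few additions and a division.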
  • this index containing information about: signs of the P random vectors of the selected one combination; and indices of the P random vectors of the selected one combination;
  • the stochastic codebook structure described hereinabove corresponds to a codebook of $P \log_2(M) + 1$ bits.
  • calculating the index of the best codevector comprises: dividing the stochastic table into two halves with M/2 random vectors in each half of the stochastic table; determining the one of these two halves of the stochastic table which contains at least two of the three random vectors of the selected one combination; and constructing the index / of the best codevector using the following relation:
  • the third vector can be replaced by a pulse covering the range 0, ..., N-1.
  • the possibility of making the third vector replaced by a pulse gives the codebook more flexibility to capture special time events in the signal.
  • the stochastic codebook structure according to the present invention can be also used in conjunction with a sparse codebook (such as an algebraic codebook) where one (1) more bit can be used to denote which codebook is selected.
  • the present invention relates to a cellular communication system, a cellular network element, a cellular mobile transmitter/receiver unit, and a bidirectional wireless communication sub-system.
  • Figure 1 is a schematic block diagram of a CELP-type speech encoding device
  • Figure 2 is a schematic block diagram of a CELP-type speech decoding device
  • Figure 6 is a schematic flow chart summarizing the procedure according to the present invention for searching a stochastic codebook
  • Figure 7 is a simplified, schematic block diagram of a cellular communication system in which the present invention can be used.
  • a cellular communication system such as 701 (see Figure 7) provides a telecommunication service over a large geographic area by dividing that large geographic area into a number C of smaller cells.
  • the C smaller cells are serviced by respective cellular base stations $702_1, 702_2, \dots, 702_C$ to provide each cell with radio signalling, audio and data channels.
  • Radio signalling channels are used to page mobile radiotelephones (mobile transmitter/receiver units) such as 703 within the limits of the coverage area (cell) of the cellular base station 702, and to place calls to other radiotelephones 703 located either inside or outside the base station's cell or to another network such as the Public Switched Telephone Network (PSTN) 704.
  • PSTN Public Switched Telephone Network
  • Once a radiotelephone 703 has successfully placed or received a call, an audio or data channel is established between this radiotelephone 703 and the cellular base station 702 corresponding to the cell in which the radiotelephone 703 is situated, and communication between the base station 702 and radiotelephone 703 is conducted over that audio or data channel.
  • the radiotelephone 703 may also receive control or timing information over a signalling channel while a call is in progress. If a radiotelephone 703 leaves a cell and enters another adjacent cell while a call is in progress, the radiotelephone 703 hands over the call to an available audio or data channel of the base station 702 of the new cell.
  • If a radiotelephone 703 leaves a cell and enters another adjacent cell while no call is in progress, the radiotelephone 703 sends a control message over the signalling channel to log into the base station 702 of the new cell. In this manner mobile communication over a wide geographical area is possible.
  • the cellular communication system 701 further comprises a control terminal 705 to control communication between the cellular base stations 702 and the PSTN 704, for example during a communication between a radiotelephone 703 and the PSTN 704, or between a radiotelephone 703 located in a first cell and a radiotelephone 703 situated in a second cell.
  • a bidirectional wireless radio communication subsystem is required to establish an audio or data channel between a base station 702 of one cell and a radiotelephone 703 located in that cell.
  • a bidirectional wireless radio communication subsystem typically comprises in the radiotelephone 703:
  • a transmitter 706 including:
  • a receiver 710 including:
  • a receiving circuit 711 for receiving a transmitted encoded speech signal usually through the same antenna 709; and - a decoder 712 for decoding the received encoded speech signal from the receiving circuit 711.
  • the radiotelephone further comprises other conventional radiotelephone circuits 713 to which the encoder 707 and decoder 712 are connected and for processing signals therefrom, which circuits 713 are well known to those of ordinary skill in the art and, accordingly, will not be further described in the present specification.
  • such a bidirectional wireless radio communication subsystem typically comprises in the base station 702:
  • a transmitter 714 including:
  • a receiver 718 including:
  • a receiving circuit 719 for receiving a transmitted encoded speech signal through the same antenna 717 or through another antenna (not shown); and - a decoder 720 for decoding the received encoded voice signal from the receiving circuit 719.
  • the base station 702 further comprises, typically, a base station controller 721 , along with its associated database 722, for controlling communication between the control terminal 705 and the transmitter 714 and receiver 718.
  • a base station controller 721 for controlling communication between the control terminal 705 and the transmitter 714 and receiver 718.
  • encoding is required in order to reduce the bandwidth necessary to transmit sound signal, for example voice signal such as speech, across the bidirectional wireless radio communication subsystem, i.e., between a radiotelephone 703 and a base station 702.
  • LP speech encoders operating at 13 kbits/second and below, such as Code-Excited Linear Prediction (CELP) encoders, typically use an LP synthesis filter to model the short term spectral envelope of the speech signal.
  • CELP Code-Excited Linear Prediction
  • the LP information is transmitted, typically, every 10 or 20 ms to the decoder (such as 720 and 712) and is extracted at the decoder end.
  • the novel technique disclosed in the present specification can apply to different LP-based encoding systems.
  • a CELP-type encoding system is used in the preferred embodiment of the present invention for illustrating these novel techniques.
  • Although speech is used in this preferred embodiment as the signal to be encoded, this novel technique can also be applied to other types of signals.
  • Figure 1 is a general, schematic block diagram of a CELP-type speech encoding device.
  • the sampled input speech 113 is divided into L-sample blocks called "frames".
  • For each frame, the different parameters representing the speech signal in the frame are computed, encoded, and transmitted. These parameters include linear prediction (LP) parameters representing the LP synthesis filter and excitation parameters.
  • LP linear prediction
  • the LP parameters are usually computed once every frame.
  • Each frame is further divided into smaller blocks of N samples (blocks of length N) in which the excitation parameters (adaptive and innovative parameters) are determined.
  • these blocks of length N are called "subframes"
  • an N-sample sequence in a subframe is referred to as an N-dimensional vector.
  • Various N-dimensional vectors occur in the encoding procedure.
  • a list of vectors which appear in Figures 1 and 2 as well as a list of transmitted parameters are given herein below:
  • T: pitch lag or adaptive codebook index
  • b: pitch gain or adaptive codebook gain
  • W(z): perceptual weighting filter
  • W(z)/A(z): weighted synthesis filter
  • M: number of random vectors in the stochastic table
  • P: number of random vectors from the stochastic table added to form the innovative codevector
  • K: number of preselected random vectors in the stochastic codebook, these preselected random vectors having indices $p_1, p_2, \dots, p_K$ and signs $s_1, s_2, \dots, s_K$
  • X: dot products between d and the random vectors v
  • $S_j$: energies of the filtered preselected random vectors $w_j$
  • dot products between the target vector x and the filtered preselected random vectors $w_j$
  • LTP: Long Term Prediction parameters
  • MSWE: Mean-Squared Weighted Error
  • Figure 2 is a schematic block diagram of a CELP-type speech decoding device and illustrates the various steps carried out between the digital input (input of the demultiplexer/decoder 201) and the output sampled speech (output of the postfilter 209).
  • the demultiplexer/decoder 201 extracts four types of parameters from the binary information (input bitstream 210) received through a digital input channel from the encoding device of Figure 1. From each received binary frame, the extracted parameters are:
  • the current speech signal is synthesized on the basis of these parameters as will be explained hereinbelow.
  • the decoding device of Figure 2 comprises an innovative excitation generator 203 to produce an innovative codevector $c_k$ in response to the received index k.
  • This innovative codevector $c_k$ is scaled by the innovative codebook gain g through a scaler 207.
  • the innovative excitation generator 203 is normally formed by an innovative codebook responsive to the index k to output the innovative codevector $c_k$.
  • the LTP Long Term Prediction parameters
  • the adaptive codebook 202 As illustrated in Figure 2, the adaptive codebook
  • the adaptive codevector $f_T$ is scaled by the pitch gain b through a scaler 206 to obtain the signal $bf_T$.
  • the signal $bf_T$ is then added to the scaled innovative codevector $gc_k$ through an adder 205 to produce the excitation codevector u.
  • the contents of the adaptive codebook 202 are updated through the memory 204 which itself receives and stores the excitation codevector u.
  • the synthesized output speech s is obtained by filtering the excitation codevector u through a synthesis filter 208 of transfer function 1/A(z), and then through a postfilter 209.
  • the synthesis filter 208 and the postfilter 209 are updated by the received STP parameters from the demultiplexer/decoder 201. Both filters 208 and 209 are well known to those of ordinary skill in the art and will not be further described in the present specification.
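The decoder path from the two scaled codevectors to the synthesis filter output can be sketched as follows. This is a simplified illustration: the postfilter 209 is omitted, the sign convention $A(z) = 1 + \sum_i a_i z^{-i}$ is an assumption, and the function names and the memory layout (most recent output last) are hypothetical.

```python
import numpy as np

def build_excitation(f_T, c_k, b, g):
    """Excitation u = b * f_T + g * c_k (adder 205 after scalers 206 and 207)."""
    return b * np.asarray(f_T, dtype=float) + g * np.asarray(c_k, dtype=float)

def synthesis_filter(u, a, memory):
    """All-pole filter 1/A(z), assuming A(z) = 1 + sum_{i>=1} a[i] z^-i.

    a      : [1, a_1, ..., a_p] with p >= 1
    memory : last p output samples of the previous subframe, most recent last
    Returns (output samples, updated memory).
    """
    order = len(a) - 1
    hist = list(memory)
    out = np.empty(len(u))
    for n, u_n in enumerate(u):
        acc = u_n
        for i in range(1, order + 1):
            acc -= a[i] * hist[-i]      # feedback from past outputs
        out[n] = acc
        hist.append(acc)
    return out, np.array(hist[-order:])

u = build_excitation(f_T=[1.0, 2.0], c_k=[1.0, -1.0], b=0.5, g=2.0)
s, mem = synthesis_filter([1.0, 0.0, 0.0], a=[1.0, -0.5], memory=[0.0])
```

Carrying the filter memory from one subframe to the next keeps the synthesized signal continuous across subframe boundaries.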
  • the sampled input speech signal 113 is processed on a frame by frame basis by the encoding device of Figure 1.
  • the encoding device is broken down into 11 modules numbered from 101 to 112.
  • Each input frame is first processed through an optional preprocessing unit 101.
  • This pre-processing unit 101 consists of a high pass filter with a 140 Hz cut-off frequency. This high pass filter removes the unwanted sound components below 140 Hz.
  • the output of the pre-processing unit 101 is denoted s(n).
  • This signal is used for performing linear prediction (LP) analysis in module 102.
  • LP analysis is a technique well known to those of ordinary skill in the art.
  • the autocorrelation approach is used.
  • the signal is first windowed using a Hamming window (usually of the order of 20-30 ms).
  • the parameters a are the coefficients of the LP filter, which is given by the following relation:
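The autocorrelation approach to LP analysis can be sketched with a Hamming window followed by the Levinson-Durbin recursion. The recursion is the standard textbook method and is named here as an assumption, since this text only states that the autocorrelation approach is used; the convention $A(z) = 1 + \sum_{i=1}^{p} a_i z^{-i}$, the function name and the default order of 10 are likewise assumptions.

```python
import numpy as np

def lp_coefficients(frame, order=10):
    """Return ([1, a_1, ..., a_p], prediction error) for one analysis frame."""
    frame = np.asarray(frame, dtype=float)
    w = frame * np.hamming(len(frame))                 # Hamming-windowed signal
    # Autocorrelation lags 0..order of the windowed signal.
    r = np.array([np.dot(w[:len(w) - k], w[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break                                      # degenerate (e.g. silent) frame
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                                 # reflection coefficient
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err

# Hypothetical 240-sample frame of white noise.
rng = np.random.default_rng(1)
frame = rng.standard_normal(240)
a, err = lp_coefficients(frame, order=10)
```

The recursion solves the normal equations in a number of operations proportional to the square of the order, rather than inverting the autocorrelation matrix directly.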
  • Module 102 performs LP analysis, as well as quantization and interpolation of the LP filter coefficients.
  • the LP filter coefficients are first transformed into another equivalent domain more suitable for quantization and interpolation purposes.
  • Line spectral pairs (LSP) and immittance spectral pairs (ISP) are two domains in which quantization and interpolation can be efficiently performed.
  • the 10 LP filter coefficients can be quantized in the order of 18 to 30 bits using split or multi-stage quantization, or a combination thereof.
  • the purpose of the interpolation is to enable updating of the LP filter coefficients every subframe while transmitting them once every frame; this improves the performance of the encoding device without increasing the bit rate. Quantization and interpolation of the LP filter coefficients are believed to be otherwise well known to those of ordinary skill in the art and, accordingly, will not be further described in the present specification.
  • the filter A(z) denotes the unquantized interpolated LP filter of the subframe
  • the filter Â(z) denotes the quantized interpolated LP filter of the subframe.
  • the optimum adaptive and innovative parameters are searched by minimizing the mean squared error between the input speech and the synthesized speech in a perceptually weighted domain. This is equivalent to minimizing the error between the weighted input speech and the weighted synthesis speech.
  • the perceptually weighted signal $s_w(n)$ is computed in a perceptual weighting filter 103.
  • Typical values of $\gamma_1$ and $\gamma_2$ are 0.9 and 0.6, respectively.
  • Other forms of transfer function W(z) also exist in the literature and could be used.
  • an open-loop pitch lag $T_{OL}$ is first estimated in open-loop adaptive search module 104 using the weighted speech signal $s_w(n)$. Then the closed-loop pitch analysis, which is performed on a subframe basis in closed-loop adaptive codebook search module 107, is restricted around the open-loop pitch lag $T_{OL}$, which significantly reduces the search complexity of the LTP parameters T and b (pitch lag T and pitch gain b).
  • open-loop pitch analysis is usually performed in open-loop adaptive search module once every 10 ms (two subframes) using techniques well known to those of ordinary skill in the art.
  • the target vector x' for LTP analysis is first computed by the adder 105. This is usually done by subtracting the zero-input response $s_0$ of the weighted synthesis filter W(z)/A(z) from the weighted speech signal $s_w(n)$. More specifically:
  • $x' = s_w - s_0$, where:
  • $x'$ is the N-dimensional target vector;
  • $s_w$ is the weighted speech signal vector in the subframe; and
  • $s_0$ is the zero-input response of the filter W(z)/A(z), which is the output of the combined filter W(z)/A(z) due to its initial states. Note that alternative, but mathematically equivalent, approaches can be used to compute the target vector x'.
  • the zero-input response calculator 110 is responsive to the quantized interpolated LP filter A(z) from the LP analysis, quantization and interpolation module 102 and to the initial states of the weighted synthesis filter W(z)/A(z) stored in update memory module 111 to calculate the zero-input response $s_0$ (that part of the response due to the initial states as determined by setting the inputs equal to zero) of filter W(z)/A(z).
  • In update memory 111, the states of the weighted synthesis filter W(z)/A(z) are updated by filtering the excitation signal
  • the states of the weighted synthesis filter W(z)/A(z) are stored in update memory 111 and used in the next subframe as initial states for calculating the zero-input response in module 110. Similar to the target vector, other alternative, but mathematically equivalent, approaches can be used to update the filter states. This operation is otherwise well known to those of ordinary skill in the art and, accordingly, will not be further described in the present specification.
  • an N-dimensional impulse response vector h of the weighted synthesis filter W(z)/A(z) is computed in the impulse response generator 106 using the LP filter coefficients A(z) and Â(z) from module 102. Again, this operation is well known to those of ordinary skill in the art and, accordingly, will not be further described in the present specification.
  • the closed-loop pitch or adaptive codebook parameters b and T are computed in the closed-loop adaptive codebook search module 107; this closed-loop adaptive codebook search module 107 is responsive to the target vector x' and the impulse response vector h to compute these closed-loop pitch or adaptive codebook parameters b and T.
  • pitch prediction has been represented by a pitch filter having the following transfer function:
  • $u(n) = b\,u(n - T) + g\,c_k(n)$
  • each vector in the adaptive codebook is a shift-by-one version of the previous vector (discarding one sample and adding a new sample).
  • the adaptive codebook is equivalent to the filter structure $1/(1 - bz^{-T})$, and the adaptive codevector $f_T(n)$ is given by:
  • a codevector $f_T(n)$ is built by repeating the available samples from the past excitation until the codevector is completed (this is not equivalent to the filter structure).
  • the codevector $f_T(n)$ may correspond to an interpolated version of the past excitation, with pitch lag T being a non-integer delay (e.g. 50.25).
  • the adaptive search consists of finding the best pitch lag T and gain b that minimize the mean squared weighted error E between the target vector x' and the scaled filtered past excitation, where error E is expressed as:
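Building the adaptive codevector from the past excitation, including the repetition of available samples when the lag is shorter than the subframe, can be sketched as follows. The sketch covers integer pitch lags only; fractional lags would need an interpolation filter that is not described in this text. The function name and argument layout are assumptions.

```python
import numpy as np

def adaptive_codevector(past_excitation, T, N):
    """Return f_T(n) = u(n - T) for n = 0..N-1, for an integer lag T.

    past_excitation : past excitation samples, most recent last
    When T < N, samples already written to the codevector are reused,
    which repeats the last T samples of the past excitation periodically.
    """
    past = np.asarray(past_excitation, dtype=float)
    if not 1 <= T <= len(past):
        raise ValueError("lag must lie within the stored past excitation")
    buf = list(past)
    start = len(buf)
    for n in range(N):
        buf.append(buf[start + n - T])   # u(n - T), possibly a repeated sample
    return np.array(buf[start:])

f_short = adaptive_codevector([1, 2, 3, 4], T=2, N=5)   # lag shorter than subframe
f_long = adaptive_codevector([1, 2, 3, 4], T=4, N=2)    # lag covers the subframe
```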
  • a 1/3 subsample pitch resolution is used, and the adaptive search is composed of three stages.
  • an open-loop pitch lag $T_{OL}$ is estimated in the open-loop adaptive search module 104 in response to the perceptually weighted speech signal $s_w(n)$.
  • this open-loop pitch analysis is usually performed once every 10 ms (two subframes) using techniques well known to those of ordinary skill in the art.
  • the search criterion C is searched in the closed-loop adaptive search module 107 for integer pitch lags T around the estimated open-loop lag $T_{OL}$ (usually ±5), which significantly simplifies the search procedure.
  • a simple procedure is used for updating the filtered adaptive codevector $y_T$ without the need to compute the convolution for every pitch lag.
  • a third stage of the search (module 107) tests the fractions around that optimum integer pitch lag.
  • subtractor 108 updates the target vector x' by subtracting the LTP contribution from that target vector x'
  • the search procedure in CELP is performed by finding the optimum innovative codevector $c_k$ and gain g which minimize the mean-squared error between the weighted input speech and weighted synthesis speech. This is equivalent to minimizing the mean-squared error between the target vector x and the scaled filtered codevector $gHc_k$, as is well known to those of ordinary skill in the art.
  • the mean-squared weighted error (MSWE) is given by: $E = \|x - gHc_k\|^2 = \sum_{n=0}^{N-1} \left[ x(n) - g\,(Hc_k)(n) \right]^2$
  • $Hc_k$ is the filtered innovative codevector
  • H is a lower triangular convolution matrix derived from the impulse response vector h.
  • the matrix H is given by:
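A lower triangular convolution matrix of this kind can be built from the impulse response as sketched below, assuming the usual Toeplitz layout in which entry (i, j) equals h(i - j) for j ≤ i and zero elsewhere. The function name is hypothetical.

```python
import numpy as np

def convolution_matrix(h):
    """Lower triangular Toeplitz matrix with H[i, j] = h[i - j] for j <= i."""
    h = np.asarray(h, dtype=float)
    N = len(h)
    H = np.zeros((N, N))
    for i in range(N):
        H[i, :i + 1] = h[i::-1]     # row i holds h(i), h(i-1), ..., h(0)
    return H

H = convolution_matrix([1.0, 2.0, 3.0])
c = np.array([1.0, -1.0, 2.0])
filtered = H @ c                    # same as convolving c with h, truncated to N samples
```

With this layout, multiplying a codevector by H gives the first N samples of its convolution with h, so the matrix form and the convolution form of the filtered codevector agree.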
  • the present invention is concerned with constructing and efficiently searching such large stochastic codebooks, in particular but not exclusively innovative codebooks. This is disclosed in the following description.
  • an 11-bit stochastic codebook can be constructed with 5 bits for each vector and 1 bit for the signs.
  • the random vectors in the stochastic table can be generated using several approaches. Some suggested approaches for generating the contents of the random vectors are:
  • FIG. 3 shows a flow chart for calculating the index of the codevector from the vector indices $p_1$ and $p_2$ and the corresponding sign indices.
  • the codevector index is given by (step 308):
  • the smaller index ($p_1$ or $p_2$) is assigned to $i_1$ and the larger index ($p_2$ or $p_1$) is assigned to $i_2$; otherwise the larger index ($p_1$ or $p_2$) is assigned to $i_1$ and the smaller index ($p_2$ or $p_1$) to $i_2$.
  • each vector requires $\log_2(M)$ bits and the sign information needs only one (1) bit; a total of $3 \log_2(M) + 1$ bits.
  • a simple way of performing encoding of the index of the codevector is the following.
  • the stochastic table is divided into two halves with M/2 random vectors in each half. The half which contains at least two chosen vectors is then determined. This information is encoded with one (1) bit.
  • the two vector indices in the same half are encoded according to the algorithm of Figure 3, and require $2 \log_2(M/2) + 1$ bits (which is equal to $2 \log_2(M) - 1$ bits).
  • the third vector is encoded separately with $\log_2(M)$ bits for the index and one (1) bit for the sign.
  • calculating the index of the signal representative codevector comprises: dividing the stochastic table into two halves with M/2 random vectors in each half of the stochastic table; determining the one of the two halves of the stochastic table which contains at least two of the three random vectors of the selected one of the combinations;and constructing the index / of the best codevector using the following relation:
  • - p 1 and p 2 denote respective indices of the two random vectors located in said one half of the stochastic table; and - p 3 denotes an indicia of the third random vector.
  • the innovative codebook search is performed in the above described innovative codebook search module 109.
  • the codevectors are given by:
  • the goal of the search procedure is to find the indices p1, p2, ..., pP of the best P random vectors and their corresponding signs s1, s2, ..., sP, which maximize the search criterion:
  • a preselection process is used to identify K out of the M random vectors in the stochastic table, so that the search process is then confined to those K vectors.
  • the sign information corresponding to each preselected vector is also preset.
  • the sign corresponding to each preselected vector is given by the sign of the dot product computed for that vector (step 603 of Figure 6).
  • the search proceeds for selecting P vectors among those K vectors which maximize the search criterion Q k .
  • the filtered vectors wj, j = 1,...,K, corresponding to the K preselected vectors, are first calculated (step 604 of Figure 6) and stored.
  • the sign information is also included in the filtered vectors; i.e.:
  • the search then proceeds with the selection of P vectors among the K preselected vectors by maximizing the search criterion Q k (step 606 of Figure 6).
  • the filtered innovative codevector z is given by:
  • predetermined signs are included in the filtered preselected vectors wj.
  • the search criterion is given by (the codevector index k is dropped for simplicity)
  • the search procedure is shown in Figure 4.
  • cross product is used in comparing the present Q with the optimum one Qopt, in order to avoid the division inside the loop; more specifically, testing if Q > Qopt is equivalent to testing if R^2 Dopt > Ropt^2 D.
  • the codebook index k and gain g are encoded and transmitted.
  • the stochastic codebook disclosed in the present invention can be used alone or in conjunction with a sparse innovative codebook such as an algebraic codebook.
  • one (1) bit can be used to denote whether the algebraic section or the stochastic section of the innovation codebook is chosen.
  • Both sections are searched and a candidate from each section is retained. The two candidates are compared and the one which maximizes the selection criterion Q is chosen.
  • a modified selection criterion can be used for choosing the winner among the two codebook sections, by taking into consideration the nature of the current speech signal in the subframe. Criteria such as the pitch gain, the synthesis filter tilt, etc.
  • the search criterion such that to favour the algebraic part of the codebook in case of periodic signals (high pitch gain and strong tilt) or to favor the stochastic section otherwise.
  • Other variants of the stochastic codebook are also possible.

Abstract

A stochastic codebook structure with low storage requirements is designed and efficiently searched in view of encoding a sound signal. This codebook consists of a set of codevectors, built from a small set of random vectors. Each codevector is obtained by the addition of several signed vectors from the small set (for example 64) of random (e.g. Gaussian) vectors. For example, a codebook which consists of the addition of two signed vectors from a collection of 64 Gaussian vectors gives rise to a 13-bit (8192-entry) codebook (6 bits for each of the two vectors and 1 bit for the signs). Similarly, adding 3 vectors from a collection of 64 vectors gives rise to a 19-bit codebook. Besides the memory-efficient structure of the codebook, a fast search procedure is used whereby only a small subset of the codebook is searched. In this fast search procedure, a small number of vectors from the collection of random vectors are preselected, and the search is confined to the subset of the codebook consisting of these preselected vectors.

Description

A METHOD AND DEVICE FOR DESIGNING AND
SEARCHING LARGE STOCHASTIC CODEBOOKS
IN LOW BIT RATE SPEECH ENCODERS
BACKGROUND OF THE INVENTION
1. Field of the invention:
The present invention relates to a stochastic codebook structure, a method for generating a codeword using this stochastic codebook structure, and method and devices for efficiently searching a stochastic codebook.
2. Brief description of the prior art:
The demand for efficient digital speech encoding techniques with a good subjective quality/bit rate trade-off is increasing for numerous applications such as voice transmission over land-mobile, satellite, digital radio, or packet networks, as well as voice storage, voice response, and wireless telephony. A speech encoder converts a speech signal into a digital bitstream transmitted over a communication channel or stored in a storage medium. The speech signal is first digitized, i.e. sampled and quantized, usually with 16 bits per sample. The speech encoder then represents these digital samples with a smaller number of bits while maintaining a good subjective speech quality. The speech decoder or synthesizer processes the transmitted or stored bitstream and converts it back to a sound signal.
One of the best prior art techniques capable of achieving a good quality/bit rate trade-off is the so-called Code Excited Linear Prediction (CELP) technique. According to this technique, the sampled speech signal is processed in successive blocks of L samples usually called frames where L is some predetermined number (corresponding to 10-30 ms of speech). In CELP, a linear prediction (LP) filter is computed and transmitted every frame. The L-sample frame is then divided into smaller blocks called subframes of N samples, where L=rN and r is the number of subframes in a frame (N usually corresponds to 4-10 ms of speech). An excitation signal is determined in each subframe, which usually consists of two components: one from the past excitation (also called pitch contribution or adaptive codebook) and the other from an innovative codebook (also called fixed codebook). This excitation signal is transmitted and used at the decoder as the input of the LP synthesis filter in order to obtain the synthesized speech.
In the CELP context, an innovative codebook is an indexed set of N-sample-long sequences which will be referred to as N-dimensional codevectors. Each codebook sequence is indexed by an integer k ranging from 1 to B where B represents the size of the innovative codebook often expressed as a number of bits b, where B=2^b.
An innovative codebook can be stored in physical memory (e.g. look-up table) or can refer to a mechanism for relating the index k to a corresponding codevector (e.g. a formula).
To synthesize speech according to the CELP technique, each subframe (block of N samples) is synthesized by filtering an appropriate codevector from an innovative codebook through time varying filters modelling the spectral characteristics of the speech signal. At the encoder end, a synthetic output is computed for at least a subset of the codevectors of the innovative codebook (codebook search). The retained codevector is the one producing the synthetic output closest to the original speech signal according to a perceptually weighted distortion measure. This perceptual weighting is performed through a so-called perceptual weighting filter, which is usually derived from the LP filter.
A first type of innovative codebooks are the so-called "stochastic codebooks". A drawback of these codebooks is that they involve substantial physical storage. They are stochastic (i.e. random) in the sense that the path from index to codevector involves look-up tables which are the result of randomly generated numbers or statistical techniques applied to large speech training sets. The size of stochastic codebooks tends to be limited by storage and/or search complexity.
A second type of innovative codebooks are the algebraic codebooks. By contrast to the stochastic codebooks, algebraic codebooks are not random and require no substantial storage. An algebraic codebook is a set of indexed codevectors in which the amplitudes and positions of the pulses of the k-th codevector can be derived from a corresponding index k through a rule requiring no, or minimal, physical storage. Therefore, the size of algebraic codebooks is not limited by storage considerations. Algebraic codebooks can be designed for efficient search. For these reasons, algebraic codebooks have had considerable success in speech coding standards, where codebooks ranging from 17 bits (e.g. ITU-T Recommendation G.729) to 35 bits (ETSI Enhanced Full Rate GSM) were efficiently used.
As the bit rate is reduced, the number of pulses in the codevectors of an algebraic codebook is reduced. This results in lower performance for unvoiced frames and in case of background noise, where codevectors with stochastic contents are more suitable. This shows the need for stochastic codebooks with efficient storage and search techniques.
OBJECT OF THE INVENTION
An object of the present invention is therefore to provide a stochastic codebook structure with reduced storage requirements, a method for generating a codeword using this stochastic codebook structure, and a method and devices for efficiently searching this stochastic codebook structure.
SUMMARY OF THE INVENTION
More specifically, in accordance with the present invention, there is provided a stochastic codebook structure for generating codevectors, comprising a stochastic table and a codevector generator. The stochastic table contains a set of M random vectors. The codevector generator is connected to the stochastic table and comprises means for adding a number P of random vectors from the stochastic table to produce a codevector.
The present invention also relates to a stochastic codebook structure for generating codevectors, comprising a stochastic table containing a set of M random vectors. The stochastic codebook structure also comprises a codevector generator connected to the stochastic table and including a combiner of subsets of P random vectors from the stochastic table. This combiner produces codevectors each by combination of a subset of P random vectors from the stochastic table.
Further in accordance with the present invention, there is provided a method for generating a codevector, comprising constructing a stochastic table containing a set of M random vectors and combining a number P of random vectors from the stochastic table to produce a codevector.
In accordance with preferred embodiments of the present invention:
- combining a number P of random vectors comprises adding the number P of random vectors from the stochastic table to produce the codevector;
- the number P is selected from the group consisting of 2 and 3;
- adding the number P of random vectors from the stochastic table to produce the codevector comprises computing the codevector using the following relation:
$$c = s_1 v_{p_1} + s_2 v_{p_2} + \dots + s_P v_{p_P}$$
where c denotes the codevector, v denotes the P random vectors, s1, s2, ..., sP are signs equal to -1 or 1, and p1, p2, ..., pP are indices of the P random vectors.
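The relation above can be illustrated with a minimal Python sketch. It assumes a Gaussian stochastic table and NumPy; the helper names `make_stochastic_table` and `build_codevector`, the seed, and the values M=64 and N=40 are illustrative choices, not part of the specification.

```python
import numpy as np

def make_stochastic_table(M=64, N=40, seed=0):
    """Hypothetical helper: an M x N table of Gaussian random vectors."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((M, N))

def build_codevector(table, indices, signs):
    """Add P signed random vectors: c = s_1 v_{p_1} + ... + s_P v_{p_P}."""
    if len(indices) != len(signs):
        raise ValueError("one sign is needed per random vector")
    c = np.zeros(table.shape[1])
    for p, s in zip(indices, signs):
        if s not in (-1, 1):
            raise ValueError("signs must be -1 or +1")
        c += s * table[p]
    return c

# P = 2: add vector 3 with sign +1 and vector 17 with sign -1.
table = make_stochastic_table()
c = build_codevector(table, indices=[3, 17], signs=[1, -1])
```

Only the M random vectors are stored; every codevector is formed on demand, which is where the storage saving comes from.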
Still further in accordance with the present invention, there is provided a method for efficiently searching a stochastic codebook having a stochastic table containing a set of M random vectors of dimension N to find the best codevector for encoding a signal. This stochastic codebook searching method comprises applying to the M random vectors a preselection criterion related to the signal, preselecting a subset of K random vectors amongst the M random vectors of the above mentioned set in relation to the preselection criterion, applying a search criterion related to the signal to combinations of P random vectors out of the K random vectors of the preselected subset, and selecting, in relation to the search criterion, one of the combinations of P random vectors forming the best codevector for encoding the signal.
The invention is concerned with a corresponding device for efficiently searching a stochastic codebook having a stochastic table containing a set of M random vectors of dimension N to find the best codevector for encoding a signal. This stochastic codebook searching device comprises means for applying to the M random vectors a preselection criterion related to the signal, and means for preselecting a subset of K (typically K=6) random vectors amongst the M random vectors of the set in relation to the preselection criterion. This stochastic codebook searching device further comprises means for applying a search criterion related to the signal to combinations of P random vectors out of the K random vectors of the preselected subset, and means for selecting, in relation to the search criterion, one of the combinations of P random vectors forming the best codevector for encoding the signal.
In accordance with preferred embodiments of the stochastic codebook searching method and device:
- applying the preselection criterion comprises: calculating a dot product between: a backward filtered version of a target vector calculated during encoding of the signal and used for searching the stochastic codebook; and each of the M random vectors of the set; and preselecting a subset of K random vectors comprises: preselecting as the subset the K random vectors of the set with the largest absolute values of dot products; (This corresponds to testing only the numerator of the search criterion)
- calculating the dot product comprises calculating the backward filtered version d(n) of the search target vector x(n) by correlating the search target vector x(n) with h(n) in accordance with the following relation:
$$d(n) = x(n) * h(-n) = \sum_{i=n}^{N-1} x(i)\,h(i-n)$$
where h(n) is an impulse response of a weighted synthesis filter calculated during encoding of the signal;
- presetting a sign of each random vector of the subset, wherein the preset sign can be the sign of the corresponding dot product;
- applying a search criterion comprises calculating, for each combination of P random vectors, a mathematical relation involving the combination, the mathematical relation being advantageously a ratio involving the combination and the target vector;
(The search process then proceeds with testing the search criterion for all the possible combinations of P out of K vectors. For P=2, this corresponds to testing the search criterion K×(K+1)/2 times (36 times for K=8). A full search requires testing the criterion M×(M+1)/2 times (528 times for M=32). This shows the significant decrease in the search complexity using the search method of the invention (the decrease is more significant when P=3 and M=64).)
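The preselection and the reduced search can be sketched in Python for P=2. This is a minimal sketch under stated assumptions, not the patented implementation: the ratio is taken as Q = R^2/D, with R the sum of the dot products between the target and the sign-adjusted filtered vectors and D the energy of their sum, and the gain is taken as R/D. The function names, the toy impulse response and the random target are hypothetical.

```python
import numpy as np

def backward_filter(x, h):
    """d(n) = sum_{i=n}^{N-1} x(i) h(i-n), the backward filtered target."""
    N = len(x)
    return np.array([np.dot(x[n:], h[:N - n]) for n in range(N)])

def preselect(table, x, h, K):
    """Keep the K vectors with the largest |d . v| and preset their signs."""
    dots = table @ backward_filter(x, h)
    keep = np.argsort(-np.abs(dots))[:K]
    signs = np.where(dots[keep] >= 0, 1, -1)
    return keep, signs

def search_pairs(table, x, h, keep, signs):
    """Two nested loops over the K preselected vectors (P = 2)."""
    N = len(x)
    # Filtered vectors with the preset signs included.
    w = np.array([s * np.convolve(table[p], h)[:N] for p, s in zip(keep, signs)])
    energy = np.sum(w * w, axis=1)
    dots = w @ x
    r2_opt, d_opt, best, tested = -1.0, 1.0, None, 0
    for a in range(len(keep)):
        for b in range(a, len(keep)):
            R = dots[a] + dots[b]
            D = energy[a] + energy[b] + 2.0 * np.dot(w[a], w[b])
            tested += 1
            # Cross-multiplied comparison: Q > Q_opt without a division.
            if R * R * d_opt > r2_opt * D:
                r2_opt, d_opt, best = R * R, D, (a, b)
    a, b = best
    gain = (dots[a] + dots[b]) / d_opt
    return (keep[a], keep[b]), (signs[a], signs[b]), gain, tested

rng = np.random.default_rng(1)
M, N, K = 32, 40, 8
table = rng.standard_normal((M, N))
h = 0.8 ** np.arange(N)          # toy impulse response (assumption)
x = rng.standard_normal(N)       # toy target vector (assumption)
keep, signs = preselect(table, x, h, K)
indices, chosen_signs, gain, tested = search_pairs(table, x, h, keep, signs)
print(tested, M * (M + 1) // 2)  # 36 528
```

The dot product of d with a random vector equals the dot product of the target with the filtered vector, so the preselection ranks the vectors without filtering any of them.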
- selecting one of the combinations of P random vectors comprises selecting the combination with the largest ratio;
- calculating the ratio for each combination of P random vectors comprises: convolving each random vector of the subset of K random vectors with an impulse response of a weighted synthesis filter calculated during encoding of the signal and thereby producing K filtered random vectors; computing the energy of each filtered random vector; calculating a dot product of each filtered random vector with the target vector; and for each combination of P random vectors, computing the ratio in response to the corresponding P filtered random vectors, P computed energies and P calculated dot products;
- computing the ratio for each combination of P random vectors comprises computing the ratios for all possible combinations of P vectors through P nested calculations loops;
- calculating a gain of the signal representative codevector through a ratio having: a numerator constituted by a sum of the P dot products between the P random vectors of the selected one combination and the target vector; and a denominator involving the P computed energies and P filtered random vectors respectively corresponding to the P random vectors of the selected one combination;
- calculating an index of the best codevector, this index containing information about: signs of the P random vectors of the selected one combination; and indices of the P random vectors of the selected one combination;
- P=2 and calculating the index of the best codevector comprises constructing the index I of the best codevector from the respective indices p1 and p2 and sign indices σ1 and σ2 (σ = 0 or 1 to identify the sign) of the two random vectors using the following relation:
$$I = s + 2\,(i_1 + i_2 \cdot M)$$
and the following rules:
- if σ1 ≠ σ2, and p1 < p2, then set i1 = p2, i2 = p1, and s = σ2;
- if σ1 ≠ σ2, and p1 > p2, then set i1 = p1, i2 = p2, and s = σ1;
- if σ1 = σ2, and p1 > p2, then set i1 = p2, i2 = p1, and s = σ1; and
- if σ1 = σ2, and p1 ≤ p2, then set i1 = p1, i2 = p2, and s = σ1;
(The number of bits needed to encode the index of each random vector is log2(M) and the sign information can be encoded with only one (1) bit. Accordingly, the stochastic codebook structure described hereinabove corresponds to a codebook of P log2(M)+1 bits. As an example, with M=32 and P=2, an 11-bit codebook can be constructed (5 bits for each vector and one (1) bit for the signs). Similarly, if M=32 and P=3, a 16-bit codebook can be obtained using only 32 random vectors.)
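The P=2 index construction can be sketched as follows. The ordering rule used here (equal signs: smaller index first; different signs: larger index first, with s the sign index of the vector placed first) is a reading of the rules above and should be treated as an assumption. The decoder is not described in the text and is added only to check that the single sign bit is enough.

```python
import math

def order_pair(p1, sigma1, p2, sigma2):
    """Return (i1, i2, s) for two (index, sign index) pairs."""
    if sigma1 == sigma2:
        return min(p1, p2), max(p1, p2), sigma1
    if p1 == p2:
        # Opposite signs on the same vector cancel; the rules do not cover it.
        raise ValueError("degenerate combination")
    return (p1, p2, sigma1) if p1 > p2 else (p2, p1, sigma2)

def encode_index_p2(p1, sigma1, p2, sigma2, M):
    """I = s + 2 * (i1 + i2 * M)."""
    i1, i2, s = order_pair(p1, sigma1, p2, sigma2)
    return s + 2 * (i1 + i2 * M)

def decode_index_p2(index, M):
    """Hypothetical inverse: recover both (index, sign index) pairs."""
    s, rest = index & 1, index >> 1
    i1, i2 = rest % M, rest // M
    # i1 <= i2 signals equal signs; i1 > i2 signals opposite signs.
    return sorted([(i1, s), (i2, s if i1 <= i2 else 1 - s)])

def codebook_bits(M, P):
    """P * log2(M) + 1 bits, for M a power of two."""
    return P * int(math.log2(M)) + 1

index = encode_index_p2(7, 0, 19, 1, M=32)
print(codebook_bits(32, 2), codebook_bits(32, 3))  # 11 16
```

The relative order of i1 and i2 carries the information of whether the two signs agree, which is why one transmitted sign bit covers both vectors.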
- P=3 and calculating the index of the best codevector comprises: dividing the stochastic table into two halves with M/2 random vectors in each half of the stochastic table; determining the one of these two halves of the stochastic table which contains at least two of the three random vectors of the selected one combination; and constructing the index I of the best codevector using the following relation:
$$I = \varphi + 2\,\bigl(s + 2\,(i_1 + i_2 \cdot M/2)\bigr) + M \cdot M \cdot (\sigma_3 + 2\,p_3)$$
and the following rules:
- if σ1 ≠ σ2, and p1 < p2, then set i1 = p2, i2 = p1, and s = σ2;
- if σ1 ≠ σ2, and p1 > p2, then set i1 = p1, i2 = p2, and s = σ1;
- if σ1 = σ2, and p1 > p2, then set i1 = p2, i2 = p1, and s = σ1;
- if σ1 = σ2, and p1 ≤ p2, then set i1 = p1, i2 = p2, and s = σ1; and
- if φ corresponds to the second half, i1 = i1 - M/2 and i2 = i2 - M/2;
where:
- φ = 0 or 1 and denotes said one half of the stochastic table containing at least two of the three random vectors of the selected one combination;
- σ1 and σ2 denote respective sign indices of the two random vectors located in said one half of the stochastic table;
- σ3 denotes a sign index of the third random vector;
- p1 and p2 denote respective indices of the two random vectors located in said one half of the stochastic table; and
- p3 denotes an index of the third random vector; and
- P=3 and the method and device further comprise calculating an index of the signal representative codevector, this index containing information about: signs of two of the three random vectors of the selected one combination; and indices of the two random vectors of the selected one combination; and a bit indicating that a pulse is chosen to replace a third of the three random vectors of the selected one combination.
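The P=3 index relation can be sketched in the same way. The pair ordering rule is the same assumed reading as for P=2 (equal signs: smaller index first; different signs: larger index first, s being the sign index of the vector placed first). The function names and the decoder are hypothetical and serve only to show that the index fits in 3 log2(M)+1 bits and can be inverted.

```python
def order_pair(p1, sigma1, p2, sigma2):
    """Return (i1, i2, s) for the two vectors lying in the same half."""
    if sigma1 == sigma2:
        return min(p1, p2), max(p1, p2), sigma1
    if p1 == p2:
        raise ValueError("degenerate combination")
    return (p1, p2, sigma1) if p1 > p2 else (p2, p1, sigma2)

def encode_index_p3(vectors, M):
    """vectors: three (index, sign index) pairs taken from a table of size M."""
    half = M // 2
    lower = [v for v in vectors if v[0] < half]
    upper = [v for v in vectors if v[0] >= half]
    # At least two of the three vectors lie in one half (phi identifies it).
    if len(lower) >= 2:
        phi, pair, rest = 0, lower[:2], lower[2:] + upper
    else:
        phi, pair, rest = 1, upper[:2], upper[2:] + lower
    (p1, sg1), (p2, sg2) = pair
    p3, sg3 = rest[0]
    i1, i2, s = order_pair(p1, sg1, p2, sg2)
    i1 -= phi * half
    i2 -= phi * half
    return phi + 2 * (s + 2 * (i1 + i2 * half)) + M * M * (sg3 + 2 * p3)

def decode_index_p3(index, M):
    """Hypothetical inverse of encode_index_p3."""
    half = M // 2
    low, high = index % (M * M), index // (M * M)
    sg3, p3 = high & 1, high >> 1
    phi, rest = low & 1, low >> 1
    s, rest = rest & 1, rest >> 1
    i1, i2 = rest % half, rest // half
    second_sign = s if i1 <= i2 else 1 - s
    offset = phi * half
    return sorted([(i1 + offset, s), (i2 + offset, second_sign), (p3, sg3)])

chosen = [(5, 1), (20, 0), (9, 0)]
index = encode_index_p3(chosen, M=32)
```

With M=32 the pair part occupies the low 10 bits (values below M×M) and the third vector the high 6 bits, giving the 16-bit codebook mentioned above.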
Other variations of the codebook structure according to the invention can be obtained. In the case of P=3, the third vector can be replaced by a pulse covering the range 0,...,N-1. Allowing the third vector to be replaced by a pulse gives the codebook more flexibility to capture special time events in the signal.
The stochastic codebook structure according to the present invention can be also used in conjunction with a sparse codebook (such as an algebraic codebook) where one (1) more bit can be used to denote which codebook is selected.
Finally, the present invention relates to a cellular communication system, a cellular network element, a cellular mobile transmitter/receiver unit, and a bidirectional wireless communication sub-system.
The objects, advantages and other features of the present invention will become more apparent upon reading of the following non restrictive description of a preferred embodiment thereof, given by way of example only with reference to the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
In the appended drawings:
Figure 1 is a schematic block diagram of a CELP-type speech encoding device;
Figure 2 is a schematic block diagram of a CELP-type speech decoding device;
Figure 3 is a flow chart describing computation, according to the present invention, of the index of an innovative codevector for the case P=2, P being the number of signed random vectors added to derive the innovative codevector;
Figure 4 is a schematic representation of two nested loops used for computing, according to the present invention, two optimum indices among K preselected random vectors (for the case P=2);
Figure 5 is a schematic representation of three nested loops used for computing, according to the present invention, three optimum indices among the K preselected random vectors (for the case P=3);
Figure 6 is a schematic flow chart summarizing the procedure according to the present invention for searching a stochastic codebook; and
Figure 7 is a simplified, schematic block diagram of a cellular communication system in which the present invention can be used.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
As is well known to those of ordinary skill in the art, a cellular communication system such as 701 (see Figure 7) provides a telecommunication service over a large geographic area by dividing that large geographic area into a number C of smaller cells. The C smaller cells are serviced by respective cellular base stations 702_1, 702_2, ..., 702_C to provide each cell with radio signalling, audio and data channels.
Radio signalling channels are used to page mobile radiotelephones (mobile transmitter/receiver units) such as 703 within the limits of the coverage area (cell) of the cellular base station 702, and to place calls to other radiotelephones 703 located either inside or outside the base station's cell or to another network such as the Public Switched Telephone Network (PSTN) 704.
Once a radiotelephone 703 has successfully placed or received a call, an audio or data channel is established between this radiotelephone 703 and the cellular base station 702 corresponding to the cell in which the radiotelephone 703 is situated, and communication between the base station 702 and radiotelephone 703 is conducted over that audio or data channel. The radiotelephone 703 may also receive control or timing information over a signalling channel while a call is in progress. If a radiotelephone 703 leaves a cell and enters another adjacent cell while a call is in progress, the radiotelephone 703 hands over the call to an available audio or data channel of the base station 702 of the new cell. If a radiotelephone 703 leaves a cell and enters another adjacent cell while no call is in progress, the radiotelephone 703 sends a control message over the signalling channel to log into the base station 702 of the new cell. In this manner mobile communication over a wide geographical area is possible.
The cellular communication system 701 further comprises a control terminal 705 to control communication between the cellular base stations 702 and the PSTN 704, for example during a communication between a radiotelephone 703 and the PSTN 704, or between a radiotelephone 703 located in a first cell and a radiotelephone 703 situated in a second cell.
Of course, a bidirectional wireless radio communication subsystem is required to establish an audio or data channel between a base station 702 of one cell and a radiotelephone 703 located in that cell. As illustrated in very simplified form in Figure 7, such a bidirectional wireless radio communication subsystem typically comprises in the radiotelephone 703:
- a transmitter 706 including:
- an encoder 707 for encoding the speech signal; and
- a transmission circuit 708 for transmitting the encoded speech signal from the encoder 707 through an antenna such as 709; and
- a receiver 710 including:
- a receiving circuit 711 for receiving a transmitted encoded speech signal usually through the same antenna 709; and
- a decoder 712 for decoding the received encoded speech signal from the receiving circuit 711.
The radiotelephone further comprises other conventional radiotelephone circuits 713 to which the encoder 707 and decoder 712 are connected and for processing signals therefrom, which circuits 713 are well known to those of ordinary skill in the art and, accordingly, will not be further described in the present specification.
Also, such a bidirectional wireless radio communication subsystem typically comprises in the base station 702:
- a transmitter 714 including:
- an encoder 715 for encoding the speech signal; and
- a transmission circuit 716 for transmitting the encoded speech signal from the encoder 715 through an antenna such as 717; and
- a receiver 718 including:
- a receiving circuit 719 for receiving a transmitted encoded speech signal through the same antenna 717 or through another antenna (not shown); and
- a decoder 720 for decoding the received encoded voice signal from the receiving circuit 719.
The base station 702 further comprises, typically, a base station controller 721, along with its associated database 722, for controlling communication between the control terminal 705 and the transmitter 714 and receiver 718. As is well known to those of ordinary skill in the art, encoding is required in order to reduce the bandwidth necessary to transmit a sound signal, for example a voice signal such as speech, across the bidirectional wireless radio communication subsystem, i.e., between a radiotelephone 703 and a base station 702.
LP speech encoders (such as 715 and 707) typically operating at 13 kbits/second and below, such as Code-Excited Linear Prediction (CELP) encoders, typically use an LP synthesis filter to model the short term spectral envelope of the speech signal. The LP information is transmitted, typically, every 10 or 20 ms to the decoder (such as 720 and 712) and is extracted at the decoder end.
The novel technique disclosed in the present specification can apply to different LP-based encoding systems. However, a CELP-type encoding system is used in the preferred embodiment of the present invention for illustrating these novel techniques. Although speech is used in this preferred embodiment as the signal to be encoded, this novel technique can also be applied to other types of signals.
Figure 1 is a general, schematic block diagram of a CELP-type speech encoding device.
Referring to Figure 1, the sampled input speech 113 is divided into L-sample blocks called "frames". For each frame, the different parameters representing the speech signal in the frame are computed, encoded, and transmitted. These parameters include linear prediction (LP) parameters representing the LP synthesis filter and excitation parameters. The LP parameters are usually computed once every frame. Each frame is further divided into smaller blocks of N samples (blocks of length N) in which the excitation parameters (adaptive and innovative parameters) are determined. In the CELP literature, these blocks of length N are called "subframes", and an N-sample sequence in a subframe is referred to as an N-dimensional vector.
In the present preferred embodiment, the value of N corresponds to 5 ms and that of L corresponds to 20 ms, which means that a frame (L=160 at the sampling rate of 8 kHz) contains four subframes (N=40 at the sampling rate of 8 kHz). Various N-dimensional vectors occur in the encoding procedure. A list of vectors which appear in Figures 1 and 2 as well as a list of transmitted parameters are given herein below:
List of the main N-dimensional vectors
s  Input speech vector (after pre-processing);
sw  Weighted speech vector;
s0  Zero-input response of the weighted synthesis filter W(z)/A(z);
x'  Target vector for adaptive codebook search;
h  Impulse response of the combination of synthesis and weighted filters;
fT  Adaptive codevector at pitch lag T;
bfT  Adaptive codevector scaled by pitch gain b;
yT  Filtered adaptive codevector (fT convolved with h);
x  Target vector for innovative codebook search;
ck  Innovative codevector at index k (k-th entry of the innovative codebook);
gck  Innovative codevector scaled by the innovative codebook gain g;
zk  Filtered innovative codevector (ck convolved with h);
vi  Random vectors in the stochastic table of size M;
wj  Filtered preselected random vectors;
u  Excitation codevector (scaled innovative and adaptive codevectors);
s'  Synthesis signal before postfiltering; and
d  Correlation between target vector x and impulse response h.

List of transmitted parameters
STP  Short term prediction parameters (defining A(z));
T  Pitch lag (or adaptive codebook index);
b  Pitch gain (or adaptive codebook gain);
k  Codevector index (innovative codebook entry); and
g  Innovative codebook gain.

List of other codec parameters and symbols
A(z)  Short term prediction filter (LP filter) in the subframe;
Â(z)  Quantized LP filter in the subframe;
W(z)  Perceptual weighting filter;
W(z)/A(z)  Weighted synthesis filter;
L  Number of samples in a frame;
N  Number of samples in a subframe;
M  Number of random vectors in the stochastic table;
P  Number of added random vectors from the stochastic table to form the innovative codevector;
K  Number of preselected random vectors in the stochastic codebook, these preselected random vectors having indices p1, p2, ..., pK and signs s1, s2, ..., sK;
X  Dot product between d and the random vectors vi;
Sj  Energies of the filtered preselected random vectors wj;
j  Dot products between the target vector x and the filtered preselected random vectors wj;
LTP  Long Term Prediction parameters;
MSWE  Mean-Squared Weighted Error;
H  Lower triangular convolution matrix derived from the impulse response vector h; and
Q  Innovative codebook search criterion.
DECODING PRINCIPLE
It is believed preferable to first describe the speech decoding device of Figure 2. Figure 2 is a schematic block diagram of a CELP-type speech decoding device and illustrates the various steps carried out between the digital input (input of the demultiplexer/decoder 201) and the output sampled speech (output of the postfilter 209).
The demultiplexer/decoder 201 extracts four types of parameters from the binary information (input bitstream 210) received through a digital input channel from the encoding device of Figure 1. From each received binary frame, the extracted parameters are:
- the short term prediction parameters STP (usually once per frame);
- the long-term prediction parameters (pitch lag T and pitch gain b) (usually once per subframe);
- the innovative codebook index k; and
- the innovative codebook gain g (usually once per subframe).
The current speech signal is synthesized on the basis of these parameters as will be explained hereinbelow.
The decoding device of Figure 2 comprises an innovative excitation generator 203 to produce an innovative codevector ck in response to the received index k. This innovative codevector ck is scaled by the innovative codebook gain g through a scaler 207. The innovative excitation generator 203 is normally formed by an innovative codebook responsive to the index k to output the innovative codevector ck.
The LTP (Long Term Prediction) contribution, which usually consists of the past excitation delayed by the pitch lag T, is generated by an adaptive codebook 202. As illustrated in Figure 2, the adaptive codebook 202 is responsive to the pitch lag T and to the past excitation u stored in memory 204 to produce the adaptive codebook codevector f_T at delay T.
The adaptive codevector f_T is scaled by the pitch gain b through a scaler 206 to obtain the signal b f_T. The signal b f_T is then added to the scaled innovative codevector g ck through an adder 205 to produce the excitation codevector u. The contents of the adaptive codebook 202 are updated through the memory 204, which itself receives and stores the excitation codevector u.
The synthesized output speech s is obtained by filtering the excitation codevector u through a synthesis filter 208 of transfer function 1/Â(z), and then through a postfilter 209. The synthesis filter 208 and the postfilter 209 are updated by the received STP parameters from the demultiplexer/decoder 201. Both filters 208 and 209 are well known to those of ordinary skill in the art and will not be further described in the present specification.
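The decoder steps above can be illustrated with a minimal Python sketch. It assumes the adaptive codevector f_T is already available, represents the quantized LP filter by a hypothetical list `a_hat` holding the coefficients â_1, ..., â_p (without the leading 1), and leaves out the postfilter 209. The function name and argument layout are illustrative only and do not come from the specification.

```python
def synthesize_subframe(f_T, c_k, b, g, a_hat, state):
    """Form u = b*f_T + g*c_k, then filter u through the all-pole filter 1/A_hat(z).

    state holds past output samples: state[0] = s(n-1), state[1] = s(n-2), ...
    (assumed to have the same length as a_hat).
    """
    # Excitation codevector: scaled adaptive part plus scaled innovative part.
    u = [b * f + g * c for f, c in zip(f_T, c_k)]
    mem = list(state)
    s = []
    for u_n in u:
        # s(n) = u(n) - sum_i a_hat_i * s(n - i)
        s_n = u_n - sum(a * m for a, m in zip(a_hat, mem))
        s.append(s_n)
        mem = [s_n] + mem[:-1]
    return u, s, mem
```

The returned `mem` would be carried into the next subframe, in the same way that the excitation u is fed back into memory 204.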
It should be pointed out here that the previously described operations are repeated on a subframe basis, where the subframe size is equal to the vector dimension N. Although the STP parameters are updated on a frame basis (2 to 5 subframes), a quantized LP filter Â(z) is computed on a subframe basis, as is well known to those of ordinary skill in the art.
ENCODING PRINCIPLE
The sampled input speech signal 113 is processed on a frame by frame basis by the encoding device of Figure 1. In Figure 1 , the encoding device is broken down into 11 modules numbered from 101 to 112.
Each input frame is first processed through an optional preprocessing unit 101. This pre-processing unit 101 consists of a high pass filter with a 140 Hz cut-off frequency. This high pass filter removes the unwanted sound components below 140 Hz.
The output of the pre-processing unit 101 is denoted s(n). This signal is used for performing linear prediction (LP) analysis in module 102. LP analysis is a technique well known to those of ordinary skill in the art. In this preferred embodiment, the autocorrelation approach is used. In the autocorrelation approach, the signal is first windowed using a Hamming window (usually of the order of 20-30 ms). The autocorrelations are computed from the windowed signal, and Levinson-Durbin recursion is used to compute the LP parameters a_i, where i = 1, ..., p, and where p is the LP order, which is typically 10. The parameters a_i are the coefficients of the LP filter, which is given by the following relation:
$$A(z) = 1 + \sum_{i=1}^{p} a_i z^{-i}$$
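As an illustration of the autocorrelation approach, the following Python sketch windows a frame with a Hamming window, computes the autocorrelations and runs a textbook Levinson-Durbin recursion. The function names are hypothetical, the sign convention follows A(z) = 1 + Σ a_i z^{-i} as above, and the sketch assumes r[0] > 0; it is not claimed to reproduce the exact procedure of module 102.

```python
import math

def autocorrelation(frame, order):
    """Autocorrelations r[0..order] of a Hamming-windowed frame."""
    L = len(frame)
    window = [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (L - 1)) for n in range(L)]
    x = [s * w for s, w in zip(frame, window)]
    return [sum(x[n] * x[n - k] for n in range(k, L)) for k in range(order + 1)]

def levinson_durbin(r, order):
    """Return ([1, a_1, ..., a_p], prediction error) for A(z) = 1 + sum a_i z^-i."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a, err
```

For autocorrelations of the form r(k) = 0.5^k, the recursion yields a_1 = -0.5 and a_2 = 0, as expected for a first-order process.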
Module 102 performs LP analysis, as well as quantization and interpolation of the LP filter coefficients. The LP filter coefficients are first transformed into another equivalent domain more suitable for quantization and interpolation purposes. Line spectral pairs (LSP) and immittance spectral pairs (ISP) are two domains in which quantization and interpolation can be efficiently performed. The 10 LP filter coefficients can be quantized in the order of 18 to 30 bits using split or multi-stage quantization, or a combination thereof. The purpose of the interpolation is to enable updating of the LP filter coefficients every subframe while transmitting them once every frame; this improves the performance of the encoding device without increasing the bit rate. Quantization and interpolation of the LP filter coefficients are believed to be otherwise well known to those of ordinary skill in the art and, accordingly, will not be further described in the present specification.
The following paragraphs will describe the other encoding operations performed on a subframe basis. In the following description, the filter A(z) denotes the unquantized interpolated LP filter of the subframe, and the filter Â(z) denotes the quantized interpolated LP filter of the subframe.
Perceptual weighting
In analysis-by-synthesis encoders, the optimum adaptive and innovative parameters are searched by minimizing the mean squared error between the input speech and the synthesized speech in a perceptually weighted domain. This is equivalent to minimizing the error between the weighted input speech and the weighted synthesis speech.
The perceptually weighted signal s_w(n) is computed in a perceptual weighting filter 103. Traditionally, the perceptually weighted signal s_w(n) is computed by a perceptual weighting filter having a transfer function W(z) of the form:
$$W(z) = \frac{A(z/\gamma_1)}{A(z/\gamma_2)}, \quad 0 < \gamma_2 < \gamma_1 < 1$$
Typical values of γ1 and γ2 are 0.9 and 0.6, respectively. Other forms of transfer function W(z) also exist in the literature and could be used.
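A short Python sketch may help to show what this weighting does in practice. Replacing z by z/γ scales the i-th LP coefficient by γ^i, so W(z) can be run as an ordinary pole-zero filter. The function names are hypothetical, the coefficient list `a` is assumed to start with a[0] = 1, and the filter states are assumed to be zero at the start of the signal.

```python
def bandwidth_expand(a, gamma):
    """Coefficients of A(z/gamma): a_i becomes a_i * gamma**i (a[0] = 1)."""
    return [a_i * gamma ** i for i, a_i in enumerate(a)]

def perceptual_weighting(s, a, gamma1=0.9, gamma2=0.6):
    """Filter s through W(z) = A(z/gamma1) / A(z/gamma2), zero initial states."""
    num = bandwidth_expand(a, gamma1)
    den = bandwidth_expand(a, gamma2)
    out = []
    for n in range(len(s)):
        # Numerator (all-zero) part acting on the input.
        acc = sum(num[i] * s[n - i] for i in range(len(num)) if n - i >= 0)
        # Denominator (all-pole) part acting on past outputs.
        acc -= sum(den[i] * out[n - i] for i in range(1, len(den)) if n - i >= 0)
        out.append(acc)
    return out
```

When γ1 equals γ2 the numerator and denominator cancel and the filter leaves the signal unchanged, which is a convenient sanity check.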
Open-loop pitch analysis
In order to simplify the pitch analysis, an open-loop pitch lag T_OL is first estimated in open-loop adaptive search module 104 using the weighted speech signal s_w(n). Then the closed-loop pitch analysis, which is performed on a subframe basis in closed-loop adaptive codebook search module 107, is restricted around the open-loop pitch lag T_OL, which significantly reduces the search complexity of the LTP parameters T and b (pitch lag T and pitch gain b).
In the preferred embodiment, open-loop pitch analysis is usually performed in the open-loop adaptive search module 104 once every 10 ms (two subframes) using techniques well known to those of ordinary skill in the art.
Target vector computation
The target vector x' for LTP analysis is first computed by the adder 105. This is usually done by subtracting the zero-input response s_0 of the weighted synthesis filter W(z)/Â(z) from the weighted speech signal s_w(n). More specifically:
$$x' = s_w - s_0$$
where x' is the N-dimensional target vector, s_w is the weighted speech signal vector in the subframe, and s_0 is the zero-input response of the filter W(z)/Â(z), which is the output of the combined filter W(z)/Â(z) due to its initial states. Note that alternative, but mathematically equivalent, approaches can be used to compute the target vector x'.
s_0 is computed in the zero-input response calculating unit 110. More specifically, the zero-input response calculator 110 is responsive to the quantized interpolated LP filter Â(z) from the LP analysis, quantization and interpolation module 102 and to the initial states of the weighted synthesis filter W(z)/Â(z) stored in update memory module 111 to calculate the zero-input response s_0 (that part of the response due to the initial states as determined by setting the inputs equal to zero) of filter W(z)/Â(z). In update memory 111, the states of the weighted synthesis filter W(z)/Â(z) are updated by filtering the excitation signal
$$u = g\, c_k + b\, f_T$$
through the weighted synthesis filter W(z)/Â(z). At the end of this filtering, the states of the weighted synthesis filter W(z)/Â(z) are stored in update memory 111 and used in the next subframe as initial states for calculating the zero-input response in module 110. As with the target vector, alternative but mathematically equivalent approaches can be used to update the filter states. This operation is otherwise well known to those of ordinary skill in the art and, accordingly, will not be further described in the present specification.
Adaptive codebook search
First, an N-dimensional impulse response vector h of the weighted synthesis filter W(z)/Â(z) is computed in the impulse response generator 106 using the LP filter coefficients A(z) and Â(z) from module 102. Again, this operation is well known to those of ordinary skill in the art and, accordingly, will not be further described in the present specification.
The closed-loop pitch or adaptive codebook parameters b and T are computed in the closed-loop adaptive codebook search module 107; this closed-loop adaptive codebook search module 107 is responsive to the target vector x' and the impulse response vector h to compute these closed-loop pitch or adaptive codebook parameters b and T.
Traditionally, pitch prediction has been represented by a pitch filter having the following transfer function:
$$\frac{1}{1 - b\, z^{-T}}$$
where b is the pitch gain and T is the pitch lag. In this case, the pitch contribution to the excitation signal u(n) is given by b u(n-T), where the total excitation is given by
$$u(n) = b\, u(n-T) + g\, c_k(n)$$
with g being the innovative codebook gain and c_k(n) the innovative codevector at index k. This representation has limitations if the delay T is less than the subframe length N. In another representation, the pitch contribution can be seen as an adaptive codebook containing the past excitation signal. Generally, each vector in the adaptive codebook is a shift-by-one version of the previous vector (discarding one sample and adding a new sample). For pitch lags T > N, the adaptive codebook is equivalent to the filter structure 1/(1 - b z^{-T}), and the adaptive codevector f_T(n) is given by:
$$f_T(n) = u(n-T), \quad n = 0, \ldots, N-1$$
For pitch lags T shorter than N, a codevector f_T(n) is built by repeating the available samples from the past excitation until the codevector is completed (this is not equivalent to the filter structure).
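The construction of the adaptive codevector, including the repetition used for short lags, can be sketched in a few lines of Python. The sketch assumes an integer lag and a hypothetical list `past_u` whose last element is the most recent excitation sample u(-1); fractional lags would need the interpolation described next.

```python
def adaptive_codevector(past_u, T, N):
    """Build f_T(n) = u(n - T) for n = 0..N-1 from the past excitation.

    For T < N the T available samples are repeated until N samples exist.
    """
    if not 1 <= T <= len(past_u):
        raise ValueError("pitch lag must lie within the stored past excitation")
    f = []
    for n in range(N):
        if n < T:
            f.append(past_u[n - T])   # negative index: sample u(n - T) of the past
        else:
            f.append(f[n - T])        # periodic repetition for short lags
    return f
```

With a past excitation [1, 2, 3, 4, 5], a lag of 2 and N = 5, the last two samples are repeated to give [4, 5, 4, 5, 4].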
In recent encoders, higher pitch resolution is used, which significantly improves the quality of voiced sound segments. This is achieved by oversampling the past excitation signal using polyphase interpolation filters. In this case, the codevector f_T(n) may correspond to an interpolated version of the past excitation, with pitch lag T being a non-integer delay (e.g. 50.25).
The adaptive search consists of finding the best pitch lag T and gain b that minimize the mean squared weighted error E between the target vector x' and the scaled filtered past excitation, where error E is expressed as:
$$E = \| x' - b\, y_T \|^2$$
where y_T is the filtered adaptive codevector (f_T convolved with h) at delay T:
$$y_T(n) = f_T(n) * h(n) = \sum_{i=0}^{n} f_T(i)\, h(n-i), \quad n = 0, \ldots, N-1$$
It can be shown that the error E is minimized by maximizing the criterion:
[Equation image imgf000031_0001: search criterion C, expressed in terms of x' and y_T]
where t denotes vector transpose.
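The step from the error E to a criterion of this kind can be made explicit with a short derivation. This is a standard least-squares argument written under the definitions above; the exact form of the criterion C printed in the original figure is not recoverable from this text and may differ (for example, by a square root). For a fixed lag T, setting the derivative of E with respect to b to zero gives

$$b = \frac{x'^t y_T}{y_T^t y_T}, \qquad E_{\min} = x'^t x' - \frac{(x'^t y_T)^2}{y_T^t y_T}$$

Since x'^t x' does not depend on T, minimizing E over T amounts to maximizing the second term, or equivalently, when the correlation x'^t y_T is positive, its square root x'^t y_T / \sqrt{y_T^t y_T}.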
In this preferred embodiment, a 1/3 subsample pitch resolution is used, and the adaptive search is composed of three stages.
In the first stage, an open-loop pitch lag T_OL is estimated in the open-loop adaptive search module 104 in response to the perceptually weighted speech signal s_w(n). As indicated in the foregoing description, this open-loop pitch analysis is usually performed once every 10 ms (two subframes) using techniques well known to those of ordinary skill in the art.
In the second stage, the search criterion C is searched in the closed-loop adaptive search module 107 for integer pitch lags T around the estimated open-loop lag T_OL (usually ± 5), which significantly simplifies the search procedure. A simple procedure is used for updating the filtered adaptive codevector y_T without the need to compute the convolution for every pitch lag. Once an optimum integer pitch lag is found, a third stage of the search (module 107) tests the fractions around that optimum integer pitch lag.
Innovative codebook search
Once the pitch or LTP parameters T and b are determined, searching for the optimum innovative excitation is conducted by means of innovative codebook search module 109. First, subtractor 108 updates the target vector x' by subtracting the LTP contribution from that target vector x':
$$x = x' - b\, y_T$$
where b is the pitch gain and y_T is the filtered adaptive codevector (the past excitation at delay T convolved with the impulse response h). The new target vector x is used for the innovative codebook search and is therefore supplied to module 109.
The search procedure in CELP is performed by finding the optimum innovative codevector c_k and gain g which minimize the mean-squared error between the weighted input speech and weighted synthesis speech. This is equivalent to minimizing the mean-squared error between the target vector x and the scaled filtered codevector g z_k, as is well known to those of ordinary skill in the art. The mean-squared weighted error (MSWE) is given by:
$$E = \sum_{n=0}^{N-1} \left( x(n) - g\, z_k(n) \right)^2$$
where zk(n) is the filtered innovative codevector at index k given by:
$$z_k(n) = c_k(n) * h(n) = \sum_{i=0}^{n} c_k(i)\, h(n-i), \quad n = 0, \ldots, N-1$$
It is usually easier to use vector and matrix notations to represent the MSWE E. That is:
$$E = \| x - g\, z_k \|^2 = \| x - g\, H c_k \|^2$$
where z_k = H c_k is the filtered innovative codevector, and H is a lower triangular convolution matrix derived from the impulse response vector h. The matrix H is given by:
$$H = \begin{bmatrix} h(0) & 0 & 0 & \cdots & 0 \\ h(1) & h(0) & 0 & \cdots & 0 \\ h(2) & h(1) & h(0) & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ h(N-1) & h(N-2) & h(N-3) & \cdots & h(0) \end{bmatrix}$$
By differentiating with respect to g, it can be shown that the MSWE E is minimized by maximizing the search criterion:
$$Q_k = \frac{(x^t z_k)^2}{z_k^t z_k} = \frac{(x^t H c_k)^2}{c_k^t H^t H c_k}$$
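To make the criterion concrete, the following Python sketch builds the lower triangular matrix H from an impulse response, evaluates Q_k for every entry of a small codebook and returns the best index together with its gain g = x^t z_k / z_k^t z_k. The function names and the list-of-lists codebook are assumptions made for illustration; this is the plain exhaustive search, not the fast procedure of the invention.

```python
def convolution_matrix(h):
    """Lower triangular Toeplitz matrix with H[r][c] = h(r - c) for r >= c."""
    N = len(h)
    return [[h[r - c] if r >= c else 0.0 for c in range(N)] for r in range(N)]

def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def exhaustive_search(x, h, codebook):
    """Return (best index k, gain g) maximizing Q_k = (x.z_k)^2 / (z_k.z_k)."""
    H = convolution_matrix(h)
    best_k, best_q, best_g = None, -1.0, 0.0
    for k, c in enumerate(codebook):
        z = matvec(H, c)                  # filtered codevector z_k = H c_k
        energy = dot(z, z)
        if energy <= 0.0:
            continue                      # skip all-zero filtered vectors
        corr = dot(x, z)
        q = corr * corr / energy
        if q > best_q:
            best_k, best_q, best_g = k, q, corr / energy
    return best_k, best_g
```

The cost of this loop grows linearly with the codebook size, which is why it becomes impractical beyond roughly 10 bits, as discussed below.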
In an exhaustive innovative codebook search, the search criterion is evaluated for all possible codevectors c_k, k = 0, ..., B-1, where B is the codebook size. For innovative codebooks exceeding 10 bits (1024 entries), an exhaustive search procedure becomes impractical. For sparse innovative codebooks where the codevectors contain few non-zero pulses, it is possible to construct huge codebooks and efficiently search them. For example, algebraic innovative codebooks of sizes as large as 35 bits can be easily constructed and searched using efficient non-exhaustive search procedures. Examples of such codebooks are given in the following US patents:
5,444,816 (Adoul et al.) 1995
5,699,482 (Adoul et al.) 1997
5,754,976 (Adoul et al.) 1998
5,701,392 (Adoul et al.) 1997.
For non sparse stochastic innovative codebooks, it is difficult to construct and search codebooks exceeding 10 bits. The use of sparse innovative codebooks was efficient at bit rates higher than 6 kbits/s. However, as the bit rate decreases, multi-mode coding becomes necessary, where the speech signal is divided into different modes (e.g. voiced, unvoiced, transient, background noise) and a speech frame is encoded according to the selected mode. In voiced speech mode, algebraic codebooks with a small number of pulses are suitable, while in unvoiced or background noise modes, non sparse stochastic codebooks are more suitable. Even without the use of multi-mode encoding, it was found that using an innovative codebook which contains a mixture of algebraic and random codevectors improves the performance of low bit rate codecs. In this case, 1 bit can be used to denote whether the algebraic or stochastic part of the innovative codebook is selected.
The present invention is concerned with constructing and efficiently searching such large stochastic codebooks, in particular but not exclusively innovative codebooks. This is disclosed in the following description.
STRUCTURE OF THE STOCHASTIC CODEBOOK
According to the preferred embodiment, an N-dimensional codevector is derived by the addition of P signed random vectors (typically P=2 or 3) from a stochastic table containing M random vectors of dimension N (typically M=32 or 64). Let v_i denote the i-th N-dimensional random vector in the stochastic table; then a codevector is constructed by:
$$c = s_1 v_{p_1} + s_2 v_{p_2} + \cdots + s_P v_{p_P}$$
where the signs s1, s2, ..., sP are equal to -1 or 1, and p1, p2, ..., pP are the indices of the random vectors from the stochastic table. The number of bits needed to encode the index of each vector v_i is log2(M) and the sign information can be encoded with only 1 bit as will be seen below. So the structure described above corresponds to a codebook of size B = P log2(M) + 1 bits.
As an example, with M=32 and P=2, an 11-bit stochastic codebook can be constructed with 5 bits for each vector and 1 bit for the signs. Similarly, if M=32 and P=3, a 16-bit stochastic codebook can be constructed.
This shows the memory efficiency of this new structure, since non sparse stochastic codebooks of sizes as large as 2^16 and higher can be constructed using only a table of M=32 or 64 vectors.
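The structure and its bit budget can be checked with a small Python sketch. The helper names are hypothetical, signs are passed directly as +1 or -1, and M is assumed to be a power of two so that log2(M) is an integer.

```python
import math

def codebook_bits(M, P):
    """Codebook size in bits, B = P*log2(M) + 1 (M assumed a power of two)."""
    return P * int(math.log2(M)) + 1

def build_codevector(table, indices, signs):
    """c = s_1*v_{p_1} + ... + s_P*v_{p_P} for a table of N-dimensional vectors."""
    N = len(table[0])
    c = [0.0] * N
    for p, s in zip(indices, signs):
        for n in range(N):
            c[n] += s * table[p][n]
    return c
```

Only the M table vectors are stored; every codevector is formed on the fly from P table entries, which is the source of the memory saving noted above.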
The random vectors in the stochastic table can be generated using several approaches. Some suggested approaches for generating the contents of the random vectors are:
- random generators with uniform distribution;
- random generators with Gaussian distribution;
- band-pass filtered vectors randomly generated as above;
- vectors generated using training algorithms;
- overlapping vectors obtained from a random sequence where each vector is a shift-by-k version from the previous vector, with k = 2 or 3 (this saves memory requirements since only the first vector is stored as N samples and other vectors need only k samples each);
- inverse DFT (Discrete Fourier Transform) of complex vectors with unit amplitude and random, uniformly distributed phases; and
- as above but with setting few first and last complex rays to zero (equivalent to band pass filtering).
Other approaches can be used for generating the contents of the random vectors without departing from the spirit of this invention.
In the present preferred embodiment, the last approach is used, where:
(1 ) complex vectors are generated with unit amplitudes and randomly generated phases (uniformly distributed between -π and π);
(2) the amplitudes of the few first and last rays are set to zero (to perform a sort of band-pass filtering); and
(3) inverse DFT is used to obtain the contents of the random vectors.
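Steps (1) to (3) can be sketched in Python using only the standard library. Several details below are assumptions rather than statements of the specification: conjugate symmetry is imposed on the spectrum so that the inverse DFT is real, N is assumed even, the DC and Nyquist lines are always zeroed, and the numbers of zeroed low and high lines (`n_zero_low`, `n_zero_high`) are hypothetical parameters.

```python
import cmath
import math
import random

def random_table(M, N, n_zero_low=2, n_zero_high=2, seed=0):
    """Generate M real N-dimensional vectors by inverse DFT of unit-amplitude,
    random-phase spectra whose first and last few lines are set to zero."""
    rng = random.Random(seed)
    half = N // 2                      # assumes N is even
    table = []
    for _ in range(M):
        X = [0j] * N                   # X[0] (DC) and X[half] (Nyquist) stay zero
        for k in range(1, half):
            if k < n_zero_low or k >= half - n_zero_high:
                continue               # zeroed lines: a crude band-pass filter
            X[k] = cmath.exp(1j * rng.uniform(-math.pi, math.pi))
            X[N - k] = X[k].conjugate()    # symmetry keeps the time signal real
        v = [sum(X[k] * cmath.exp(2j * math.pi * k * n / N) for k in range(N)).real / N
             for n in range(N)]
        table.append(v)
    return table
```

Because every retained line has unit amplitude, all vectors produced this way have the same energy, and zeroing the DC line makes each vector sum to zero.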
ENCODING THE CODEBOOK INDEX
Let's first consider the case where the table has M random vectors, and the innovative codevectors are generated by the addition of 2 random vectors from the stochastic table (P=2). That is, the codevectors are given by: c = s1 v_i + s2 v_j.
In this case, two signs, s1 and s2, and two indices, i and j, have to be encoded. The values of i and j are in the range 0 to M-1, so that encoding thereof requires log2(M) bits for each index, and encoding of the signs requires one (1) bit for each sign. However, one (1) bit can be saved upon encoding the signs since the order of the vectors v_i and v_j is not important. For example, choosing v16 as the first vector and v25 as the second vector is equivalent to choosing v25 as the first vector and v16 as the second vector. Thus, a total of 2 log2(M)+1 bits is required to encode the signs and indices of the two random vectors.
A simple approach for implementing encoding of the codevector is to use only 1 bit for the sign information and 2 log2(M) bits for the two indices while ordering the indices in a way such that the other sign information can be easily deduced. To better explain this, Figure 3 shows a flow chart for calculating the index of the codevector from the vector indices p1 and p2 and corresponding sign indices σ1 and σ2. According to this procedure, the codevector index is given by (step 308):
$$I = s + 2 \times (i_1 + i_2 \times M)$$
If σ1 ≠ σ2 (step 301), and p1 < p2 (step 302), then set in step 308 i1 = p2; i2 = p1; and s = σ2 (step 303).
If σ1 ≠ σ2 (step 301), and p1 > p2 (step 302), then set in step 308 i1 = p1; i2 = p2; and s = σ1 (step 304).
If σ1 = σ2 (step 301), and p1 > p2 (step 305), then set in step 308 i1 = p2; i2 = p1; and s = σ1 (step 306).
If σ1 = σ2 (step 301), and p1 ≤ p2 (step 305), then set in step 308 i1 = p1; i2 = p2; and s = σ1 (step 307).
Thus, when constructing the index of the codevector, if the two signs are equal then the smaller index (p1 or p2) is assigned to i1 and the larger index (p2 or p1) is assigned to i2; otherwise the larger index (p1 or p2) is assigned to i1 and the smaller index (p2 or p1) to i2.
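The ordering rules can be exercised with a short Python sketch. The encoder follows the four cases above; the decoder is not described in this passage and is included only as an assumed inverse, relying on the observation that i1 ≤ i2 signals equal signs and i1 > i2 signals opposite signs. Sign indices are taken to be 0 or 1, and the function names are hypothetical.

```python
def encode_pair(p1, sigma1, p2, sigma2, M):
    """Index I = s + 2*(i1 + i2*M) built from two vector indices and sign indices."""
    if sigma1 != sigma2:
        if p1 == p2:
            # Not covered by the rules: the two vectors would cancel out.
            raise ValueError("same vector with opposite signs")
        if p1 < p2:
            i1, i2, s = p2, p1, sigma2
        else:
            i1, i2, s = p1, p2, sigma1
    elif p1 > p2:
        i1, i2, s = p2, p1, sigma1
    else:
        i1, i2, s = p1, p2, sigma1
    return s + 2 * (i1 + i2 * M)

def decode_pair(index, M):
    """Assumed inverse: recover the two (vector index, sign index) pairs."""
    s = index & 1
    i1 = (index >> 1) % M
    i2 = (index >> 1) // M
    if i1 <= i2:
        return [(i1, s), (i2, s)]        # equal signs
    return [(i1, s), (i2, 1 - s)]        # opposite signs
```

In this sketch s always ends up being the sign index of the vector placed at i1, so a single bit suffices once the ordering of i1 and i2 carries the equal or opposite information.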
When three random vectors are added to construct the excitation codevector, each vector requires log2(M) bits and the sign information needs only one (1) bit; a total of 3 log2(M)+1 bits. A simple way of performing encoding of the index of the codevector is the following. The stochastic table is divided into two halves with M/2 random vectors in each half. The half which contains at least two chosen vectors is then determined. This information, denoted by φ, is encoded with one (1) bit. The two vector indices in the same half are encoded according to the algorithm of Figure 3, and require 2 log2(M/2)+1 bits (which is equal to 2 log2(M)-1 bits). The third vector is encoded separately with log2(M) bits for the index and one (1) bit for the sign. The total number of bits is 1 + (2 log2(M/2)+1) + (log2(M)+1) = 3 log2(M)+1.
More specifically, in the case P=3, calculating the index of the signal representative codevector comprises: dividing the stochastic table into two halves with M/2 random vectors in each half of the stochastic table; determining the one of the two halves of the stochastic table which contains at least two of the three random vectors of the selected one of the combinations; and constructing the index I of the best codevector using the following relation:
$$I = \varphi + 2 \times (s + 2 \times (i_1 + i_2 \times M/2)) + M \times M \times (\sigma_3 + 2 \times p_3)$$
and the following rules:
- if σ1 ≠ σ2, and p1 < p2, then set i1 = p2, i2 = p1, and s = σ2;
- if σ1 ≠ σ2, and p1 > p2, then set i1 = p1, i2 = p2, and s = σ1;
- if σ1 = σ2, and p1 > p2, then set i1 = p2, i2 = p1, and s = σ1;
- if σ1 = σ2, and p1 ≤ p2, then set i1 = p1, i2 = p2, and s = σ1; and
- if φ corresponds to the second half, i1 = i1 - M/2 and i2 = i2 - M/2; where:
- φ = 0 or 1 and denotes said one half of the stochastic table containing at least two of the three random vectors of said selected one combination;
- σ1 and σ2 denote respective sign indices of the two random vectors located in said one half of the stochastic table;
- σ3 denotes a sign index of the third random vector;
- p1 and p2 denote respective indices of the two random vectors located in said one half of the stochastic table; and
- p3 denotes an index of the third random vector.
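A Python sketch of the three-vector case is given below. It assumes sign indices of 0 or 1, an even M, and input given as three (vector index, sign index) tuples; when all three vectors fall in the same half, the sketch arbitrarily pairs the first two, which is a choice not specified in the text. The decoder is an assumed inverse added only to check that the index is unambiguous.

```python
def _order_pair(p1, sg1, p2, sg2):
    """Ordering rules listed above: returns (i1, i2, s)."""
    if sg1 != sg2:
        if p1 == p2:
            raise ValueError("same vector with opposite signs")
        return (p2, p1, sg2) if p1 < p2 else (p1, p2, sg1)
    return (p2, p1, sg1) if p1 > p2 else (p1, p2, sg1)

def encode_triple(vectors, M):
    """vectors: list of three (index, sign index) tuples; returns the index I."""
    half = M // 2
    in_second = [p >= half for p, _ in vectors]
    phi = 1 if sum(in_second) >= 2 else 0          # half holding at least two vectors
    same = [v for v, flag in zip(vectors, in_second) if flag == (phi == 1)]
    rest = list(vectors)
    rest.remove(same[0])
    rest.remove(same[1])
    p3, sg3 = rest[0]                               # the separately coded third vector
    i1, i2, s = _order_pair(same[0][0], same[0][1], same[1][0], same[1][1])
    i1 -= phi * half
    i2 -= phi * half
    return phi + 2 * (s + 2 * (i1 + i2 * half)) + M * M * (sg3 + 2 * p3)

def decode_triple(index, M):
    """Assumed inverse of encode_triple."""
    half = M // 2
    low, high = index % (M * M), index // (M * M)
    sg3, p3 = high & 1, high >> 1
    phi, s, r = low & 1, (low >> 1) & 1, low >> 2
    i1 = r % half + phi * half
    i2 = r // half + phi * half
    sg2 = s if i1 <= i2 else 1 - s
    return [(i1, s), (i2, sg2), (p3, sg3)]
```

With M = 8 the index fits in 3 log2(8) + 1 = 10 bits, in line with the bit count derived above.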
FAST SEARCH PROCEDURE FOR THE STOCHASTIC CODEBOOK
The innovative codebook search is performed in the above described innovative codebook search module 109.
The codevectors are given by:
$$c = s_1 v_{p_1} + s_2 v_{p_2} + \cdots + s_P v_{p_P}$$
The goal of the search procedure is to find the indices p1, p2, ..., pP of the best P random vectors and their corresponding signs s1, s2, ..., sP, which maximize the search criterion:
$$Q_k = \frac{(x^t z_k)^2}{z_k^t z_k} = \frac{(x^t H c_k)^2}{z_k^t z_k} = \frac{(d^t c_k)^2}{z_k^t z_k}$$
where x is the target vector and z_k = H c_k is the filtered innovative codevector at index k. Note that in the numerator of the search criterion, the dot product between x and z_k is equivalent to the dot product between d and c_k, where d = H^t x is the backward filtered version of the target vector x, which is also the correlation between the target vector x and the impulse response h. To find the elements of the vector d, the following relation is used (step 601 of Figure 6):
$$d(n) = x(n) * h(-n) = \sum_{i=n}^{N-1} x(i)\, h(i-n), \quad n = 0, \ldots, N-1$$
Since d is independent of the codevector index k, it is computed only once; this simplifies the computation of the numerator for the different codevectors.
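The equivalence between x^t H c and d^t c is easy to verify numerically. The Python sketch below computes d by the correlation formula above and includes a direct filtering routine for comparison; the function names are illustrative.

```python
def backward_filter(x, h):
    """d(n) = sum_{i=n}^{N-1} x(i) * h(i - n), i.e. d = H^t x."""
    N = len(x)
    return [sum(x[i] * h[i - n] for i in range(n, N)) for n in range(N)]

def filter_codevector(c, h):
    """z(n) = sum_{i=0}^{n} c(i) * h(n - i), i.e. z = H c."""
    N = len(c)
    return [sum(c[i] * h[n - i] for i in range(n + 1)) for n in range(N)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))
```

For any codevector c, `dot(x, filter_codevector(c, h))` and `dot(backward_filter(x, h), c)` agree, so the numerator of the criterion can be obtained without filtering each candidate.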
After computing the vector d, a preselection process is used to identify K out of the M random vectors in the stochastic table, so that the search process is then confined to those K vectors.
This preselection is performed by testing the numerator of the search criterion Q_k for the M random vectors and by selecting the K vectors which have the largest absolute dot products (or squared dot products) between d and v_i, i = 0, ..., M-1. More specifically, the dot products χ_i given by:
$$\chi_i = \sum_{n=0}^{N-1} d(n)\, v_i(n)$$
are calculated for all the random vectors v_i (step 602 of Figure 6) and the indices of the K vectors which result in the K largest values of |χ_i| are retained (step 603 of Figure 6). These indices are stored in the index vector m_j, j = 0, ..., K-1.
To further simplify the search, the sign information corresponding to each preselected vector is also preset. The sign corresponding to each preselected vector is given by the sign of χ_i for that vector (step 603 of Figure 6). These preset signs are stored in the sign vector s_j, j = 0, ..., K-1. The innovative codebook search is now confined to the preselected K vectors with their corresponding signs. For typical values of M=64, P=2, and K=6, the search is reduced to finding the best combination of P = 2 vectors among K = 6 random vectors instead of finding them among 64 random vectors. This reduces the number of tested vector combinations from 64×65/2 = 2080 to 6×7/2 = 21.
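The preselection and sign presetting steps can be sketched in Python as follows. The function name is hypothetical, signs are returned as +1 or -1, and a zero dot product is arbitrarily given a positive sign.

```python
def preselect(d, table, K):
    """Return (m, s): indices of the K vectors with the largest |chi_i| and
    their preset signs, where chi_i is the dot product between d and v_i."""
    chi = [sum(dn * vn for dn, vn in zip(d, v)) for v in table]
    order = sorted(range(len(table)), key=lambda i: abs(chi[i]), reverse=True)
    m = order[:K]
    s = [1 if chi[i] >= 0 else -1 for i in m]
    return m, s
```

The cost of this step is M dot products of length N, after which only the K retained vectors need to be filtered.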
Once the K most promising vectors and their corresponding signs are predetermined, the search proceeds for selecting P vectors among those K vectors which maximize the search criterion Q_k.
The filtered vectors w_j, j = 0, ..., K-1, corresponding to the K preselected vectors, are first calculated (step 604 of Figure 6) and stored.
This can be performed by convolving the preselected vectors with the impulse response h of the weighted synthesis filter. The sign information is also included in the filtered vectors; i.e.:
$$w_j(n) = s_j \sum_{i=0}^{n} v_{m_j}(i)\, h(n-i), \quad n = 0, \ldots, N-1, \quad j = 0, \ldots, K-1$$
The energy of each filtered preselected vector is then computed (step 605 of Figure 6):
$$\varepsilon_j = w_j^t w_j = \sum_{n=0}^{N-1} w_j^2(n), \quad j = 0, \ldots, K-1$$
as well as its dot product with the target vector (step 605 of Figure 6):
$$\rho_j = x^t w_j = \sum_{n=0}^{N-1} w_j(n)\, x(n), \quad j = 0, \ldots, K-1$$
Note that ρ_j and ε_j correspond to the numerator and denominator of the search criterion due to each preselected vector.
The search then proceeds with the selection of P vectors among the K preselected vectors by maximizing the search criterion Qk (step 606 of Figure 6).
Let's first start with the case where two vectors are added from the stochastic table (P=2). The search reduces to finding two vector indices p1 and p2 among the K preselected vectors. In the case of P=2, a codevector is given by:
$$c = s_1 v_{p_1} + s_2 v_{p_2}$$
The filtered innovative codevector z is given by:
$$z = w_{p_1} + w_{p_2}$$
Note that the predetermined signs are included in the filtered preselected vectors w_j.
For two vectors, the search criterion is given by (the codevector index k is dropped for simplicity):
$$Q = \frac{(x^t z)^2}{z^t z} = \frac{(x^t w_{p_1} + x^t w_{p_2})^2}{(w_{p_1} + w_{p_2})^t (w_{p_1} + w_{p_2})} = \frac{(\rho_{p_1} + \rho_{p_2})^2}{\varepsilon_{p_1} + \varepsilon_{p_2} + 2\, w_{p_1}^t w_{p_2}}$$
The vectors w_j and the values of ρ_j and ε_j are computed before starting the codebook search. Then Q is evaluated using two nested loops for all possible positions p1 and p2. Only the dot products between the different vectors w_j need to be computed inside the loop.
The search procedure is shown in Figure 4. The search criterion is computed as Q = R^2/D. However, cross-multiplication is used in comparing the present Q with the optimum one Q_opt, in order to avoid the division inside the loop; more specifically, testing if Q > Q_opt is equivalent to testing if R^2 D_opt > R_opt^2 D.
At the end of the two nested loops, the optimum vector indices p1 and p2 will be known. The two indices and the corresponding signs are then encoded as shown in Figure 3. The gain of the innovative codevector is then found by
$$g = \frac{\rho_{p_1} + \rho_{p_2}}{D_{opt}}$$
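The two nested loops can be sketched in Python as follows. The sketch takes the preselected indices `m` and preset signs `s` (as +1 or -1) as inputs, lets the inner loop start at the outer index so that K(K+1)/2 combinations are tested (an assumption consistent with the 6×7/2 count given earlier, since Figure 4 is not reproduced here), and compares candidates by cross-multiplication. The function name and return layout are illustrative.

```python
def search_two(x, h, table, m, s):
    """Find the best pair among K preselected vectors; returns
    ((index, sign), (index, sign), gain)."""
    N, K = len(x), len(m)
    # Filtered preselected vectors with their preset signs included.
    w = [[s[j] * sum(table[m[j]][i] * h[n - i] for i in range(n + 1))
          for n in range(N)] for j in range(K)]
    eps = [sum(a * a for a in wj) for wj in w]               # energies
    rho = [sum(a * b for a, b in zip(x, wj)) for wj in w]    # dot products with x
    best = None                                              # (R_opt, D_opt, j1, j2)
    for j1 in range(K):
        for j2 in range(j1, K):
            R = rho[j1] + rho[j2]
            D = eps[j1] + eps[j2] + 2.0 * sum(a * b for a, b in zip(w[j1], w[j2]))
            if D <= 0.0:
                continue
            # Q > Q_opt tested as R^2 * D_opt > R_opt^2 * D (no division).
            if best is None or R * R * best[1] > best[0] * best[0] * D:
                best = (R, D, j1, j2)
    if best is None:
        raise ValueError("no combination with non-zero energy")
    R, D, j1, j2 = best
    return (m[j1], s[j1]), (m[j2], s[j2]), R / D
```

Only the cross term between w_j1 and w_j2 is computed inside the loops; the energies and dot products with x are precomputed, as described above.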
The case where the codevector is found by adding three vectors from the stochastic codebook (P=3) will now be briefly considered. The K preselected random vectors and their corresponding signs are found in the same manner as described above. The filtered preselected vectors w_j, their dot products ρ_j with x, and their energies ε_j are also found before starting the codebook search. The search then proceeds with finding the best three vectors among the K preselected vectors, by computing the search criterion Q = R^2/D using three nested loops. Figure 5 shows the structure of the search in three nested loops in case of adding three random vectors to construct the innovative codevector. At the end of the three nested loops, the optimum vector indices p1, p2, and p3 will be known. The gain of the innovative codevector is then found by:
$$g = \frac{\rho_{p_1} + \rho_{p_2} + \rho_{p_3}}{D_{opt}}$$
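For reference, the quantities R and D in the three-vector case follow directly from expanding z = w_{p_1} + w_{p_2} + w_{p_3}, in the same way as for two vectors. The expansion below is derived from the definitions above rather than copied from Figure 5:

$$R = \rho_{p_1} + \rho_{p_2} + \rho_{p_3}$$

$$D = \varepsilon_{p_1} + \varepsilon_{p_2} + \varepsilon_{p_3} + 2\left( w_{p_1}^t w_{p_2} + w_{p_1}^t w_{p_3} + w_{p_2}^t w_{p_3} \right)$$

$$Q = \frac{R^2}{D}, \qquad g = \frac{R}{D}$$

The terms involving only p1 and p2 can be accumulated in the two outer loops, so that the innermost loop adds just ρ_{p_3}, ε_{p_3} and the two cross products involving w_{p_3}.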
Once the optimum codevector and its gain are chosen, the codebook index k and gain g are encoded and transmitted.
As indicated in the foregoing description, the search procedure described above is summarized in the flow chart of Figure 6.
The stochastic codebook disclosed in the present invention can be used alone or in conjunction with a sparse innovative codebook such as an algebraic codebook. In this case, one (1) bit can be used to denote whether the algebraic section or the stochastic section of the innovative codebook is chosen. Both sections are searched and a candidate from each section is retained. The two candidates are compared and the one which maximizes the selection criterion Q is chosen. A modified selection criterion can be used for choosing the winner among the two codebook sections, by taking into consideration the nature of the current speech signal in the subframe. Criteria such as the pitch gain, the synthesis filter tilt, etc. can be used to modify the search criterion so as to favour the algebraic section of the codebook in case of periodic signals (high pitch gain and strong tilt) or to favour the stochastic section otherwise. Other variants of the stochastic codebook are also possible. One such variant is to have the flexibility of replacing the third random vector, in case of P=3, with a single pulse. In this case, one (1) bit is needed for indicating that a pulse is chosen to replace the third random vector. This helps in capturing special time events in the signal.
Although the present invention has been described hereinabove by way of a preferred embodiment thereof, this embodiment can be modified at will, within the scope of the appended claims, without departing from the spirit and nature of the subject invention.

Claims

WHAT IS CLAIMED IS:
1. A method for generating a codevector, comprising: constructing a stochastic table containing a set of M random vectors; and combining a number P of random vectors from the stochastic table to produce a codevector.
2. A codevector generating method as recited in claim 1, in which combining a number P of random vectors comprises adding the number P of random vectors from the stochastic table to produce the codevector.
3. A codevector generating method as recited in claim 1, wherein the number P is selected from the group consisting of 2 and 3.
4. A codevector generating method as recited in claim 1, wherein adding the number P of random vectors from the stochastic table to produce the codevector comprises computing the codevector using the following relation:
$$c = s_1 v_{p_1} + s_2 v_{p_2} + \cdots + s_P v_{p_P}$$
where c denotes the codevector, v denotes the P random vectors, s1, s2, ..., sP are signs equal to -1 or 1, and p1, p2, ..., pP are indices of the P random vectors.
5. A stochastic codebook structure for generating codevectors, comprising: a stochastic table containing a set of M random vectors; and a codevector generator connected to the stochastic table and including a combiner of subsets of P random vectors from the stochastic table, said combiner producing codevectors each by combination of a subset of P random vectors from the stochastic table.
6. A stochastic codebook structure as recited in claim 5, wherein the combiner comprises an adder of subsets of P random vectors from the stochastic table.
7. A stochastic codebook structure as recited in claim 5, wherein the number P is selected from the group consisting of 2 and 3.
8. A stochastic codebook structure as recited in claim 6, wherein the adder comprises means for computing the codevectors using the following relation:
$$c = s_1 v_{p_1} + s_2 v_{p_2} + \cdots + s_P v_{p_P}$$
where c denotes the codevector, v denotes the P random vectors, s1, s2, ..., sP are signs equal to -1 or 1, and p1, p2, ..., pP are indices of the P random vectors.
9. A stochastic codebook structure for generating codevectors, comprising: a stochastic table containing a set of M random vectors; and a codevector generator connected to the stochastic table and comprising means for adding a number P of random vectors from the stochastic table to produce a codevector.
10. A method for efficiently searching a stochastic codebook having a stochastic table containing a set of M random vectors of dimension N to find the best codevector for encoding a signal, said stochastic codebook searching method comprising: applying to the M random vectors a preselection criterion related to the signal; preselecting a subset of K random vectors amongst the M random vectors of said set in relation to the preselection criterion; applying a search criterion related to said signal to combinations of P random vectors out of the K random vectors of said preselected subset; and selecting, in relation to the search criterion, one of said combinations of P random vectors forming said best codevector for encoding said signal.
11. A stochastic codebook searching method as defined in claim 10, wherein: applying the preselection criterion comprises: calculating a dot product between:
a backward filtered version of a target vector calculated during encoding of said signal and used for searching the stochastic codebook; and each of the M random vectors of said set; and preselecting a subset of K random vectors comprises: preselecting as said subset the K random vectors of said set with the largest absolute values of dot products.
12. A stochastic codebook searching method as defined in claim 11, wherein calculating the dot product comprises calculating the backward filtered version d(n) of the search target vector x(n) by correlating the search target vector x(n) with h(n) in accordance with the following relation:
$$d(n) = x(n) * h(-n) = \sum_{i=n}^{N-1} x(i)\, h(i - n)$$
where h(n) is an impulse response of a weighted synthesis filter calculated during encoding of said signal.
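The backward filtering relation of claim 12 can be written directly as a double loop. The sketch below assumes that x and h are both available as lists of at least N samples; the function name `backward_filter` and the sample values are illustrative only.

```python
def backward_filter(x, h):
    """d(n) = sum_{i=n}^{N-1} x(i) * h(i - n), for n = 0 .. N-1."""
    n_len = len(x)
    d = []
    for n in range(n_len):
        acc = 0.0
        for i in range(n, n_len):
            acc += x[i] * h[i - n]
        d.append(acc)
    return d


x = [1.0, 2.0, 3.0]    # toy target vector, N = 3
h = [1.0, 0.5, 0.25]   # toy impulse response, truncated to N samples
print(backward_filter(x, h))  # [2.75, 3.5, 3.0]
```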
13. A stochastic codebook searching method as defined in claim 10, further comprising presetting a sign of each random vector of the subset.
14. A stochastic codebook searching method as defined in claim 11, further comprising presetting a sign of each random vector of said subset, said preset sign being the sign of the corresponding dot product.
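Claims 11 and 14 together describe keeping the K vectors with the largest absolute dot products and fixing each sign to the sign of that dot product. A minimal sketch follows, assuming the backward-filtered target d has already been computed. The name `preselect` is hypothetical, and treating a zero dot product as positive is an assumption the claims do not address.

```python
def preselect(d, table, k):
    """Return the k (index, sign) pairs with the largest |d . v_j|."""
    scored = []
    for j, v in enumerate(table):
        dot = sum(dn * vn for dn, vn in zip(d, v))
        # Assumption: a zero dot product is given the sign +1.
        scored.append((abs(dot), j, 1 if dot >= 0 else -1))
    scored.sort(key=lambda t: t[0], reverse=True)
    return [(j, sign) for _, j, sign in scored[:k]]


d = [1.0, 0.0, -1.0]       # toy backward-filtered target
table = [
    [1.0, 0.0, 0.0],       # dot =  1
    [0.0, 1.0, 0.0],       # dot =  0
    [0.0, 0.0, 3.0],       # dot = -3
    [2.0, 0.0, 0.0],       # dot =  2
]
print(preselect(d, table, k=2))  # [(2, -1), (3, 1)]
```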
15. A stochastic codebook searching method as defined in claim 10, wherein applying a search criterion comprises calculating, for each combination of P random vectors, a mathematical relation involving said combination.
16. A stochastic codebook searching method as defined in claim 15, wherein calculating the mathematical relation comprises calculating, for each combination of P random vectors, a ratio involving said combination and a target vector calculated during encoding of said signal and used for searching the stochastic codebook.
17. A stochastic codebook searching method as defined in claim 16, wherein selecting one of said combinations of P random vectors comprises selecting the combination with the largest ratio.
18. A stochastic codebook searching method as defined in claim 16, wherein calculating said ratio for each combination of P random vectors comprises: convolving each random vector of said subset of K random vectors with an impulse response of a weighted synthesis filter calculated during encoding of said signal and thereby producing K filtered random vectors;
computing the energy of each filtered random vector; calculating a dot product of each filtered random vector with the target vector; and for each combination of P random vectors, computing said ratio in response to the corresponding P filtered random vectors, P computed energies and P calculated dot products.
19. A stochastic codebook searching method as defined in claim 18, wherein computing said ratio for each combination of P random vectors comprises: computing said ratios for all possible combinations of P vectors through P nested calculation loops.
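The steps of claims 18 and 19 can be sketched for P = 2 as follows. The claims do not spell out the ratio, so this example assumes the usual CELP criterion: squared correlation with the target divided by the energy of the filtered sum. All names (`convolve`, `search_pairs`) and values are illustrative.

```python
def convolve(v, h):
    """Filter v with impulse response h, truncated to len(v) samples."""
    return [sum(v[i] * h[n - i] for i in range(n + 1)) for n in range(len(v))]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def search_pairs(x, h, candidates):
    """candidates: (index, preset sign, vector) for each of the K preselected vectors.
    Returns (best ratio, (index_a, index_b))."""
    y = [[s * t for t in convolve(v, h)] for _, s, v in candidates]
    energy = [dot(yi, yi) for yi in y]   # K energies, computed once
    corr = [dot(x, yi) for yi in y]      # K dot products with the target
    best_ratio, best_pair = -1.0, None
    for a in range(len(y) - 1):          # P = 2 nested loops
        for b in range(a + 1, len(y)):
            num = (corr[a] + corr[b]) ** 2
            den = energy[a] + energy[b] + 2.0 * dot(y[a], y[b])
            if den > 0.0 and num / den > best_ratio:
                best_ratio = num / den
                best_pair = (candidates[a][0], candidates[b][0])
    return best_ratio, best_pair


x = [1.0, 1.0, 0.0]
h = [1.0, 0.0, 0.0]  # identity filter keeps the toy example easy to follow
candidates = [
    (0, 1, [1.0, 0.0, 0.0]),
    (1, 1, [0.0, 1.0, 0.0]),
    (2, 1, [0.0, 0.0, 1.0]),
]
print(search_pairs(x, h, candidates))  # (2.0, (0, 1))
```

Because the energies and target correlations are computed once per vector, the inner loop only needs the cross term between the two filtered vectors.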
20. A stochastic codebook searching method as defined in claim 18, further comprising calculating a gain of the signal representative codevector through a ratio having: a numerator constituted by a sum of the P dot products between the P random vectors of said selected one combination and the target vector; and a denominator involving the P computed energies and P filtered random vectors respectively corresponding to the P random vectors of said selected one combination.
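Claim 20 states only what the numerator is and what the denominator involves. One plausible reading, assuming the gain is chosen to minimize the squared error between the target x and the scaled sum of the selected filtered vectors, is shown below. The symbols y (a filtered random vector with its preset sign applied) and E (its energy) are notation introduced here, not taken from the claims.

```latex
g = \frac{\sum_{i=1}^{P} x^{T} y_{p_i}}
         {\sum_{i=1}^{P} E_{p_i}
          + 2 \sum_{i=1}^{P-1} \sum_{j=i+1}^{P} y_{p_i}^{T} y_{p_j}},
\qquad E_{p_i} = y_{p_i}^{T} y_{p_i}
```

Under this assumption the denominator is the energy of the filtered codevector, expanded into the P precomputed energies plus the cross terms between the P filtered vectors.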
21. A stochastic codebook searching method as defined in claim 10, further comprising calculating an index of the best codevector, said index containing information about:
signs of the P random vectors of said selected one combination; and indices of the P random vectors of said selected one combination.
22. A stochastic codebook searching method as defined in claim 21, wherein P=2 and wherein calculating the index of the best codevector comprises constructing the index I of said best codevector from the respective indices p1 and p2 and sign indices σ1 and σ2 of the two random vectors using the following relation:
I = s + 2 × (i1 + i2 × M)
and the following rules:
- if σ1 ≠ σ2, and p1 < p2, then set i1 = p2, i2 = p1, and s = σ2;
- if σ1 ≠ σ2, and p1 > p2, then set i1 = p1, i2 = p2, and s = σ1;
- if σ1 = σ2, and p1 > p2, then set i1 = p2, i2 = p1, and s = σ1; and
- if σ1 = σ2, and p1 ≤ p2, then set i1 = p1, i2 = p2, and s = σ1.
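The ordering rules of claim 22 let a single sign bit s carry both signs: the order of i1 and i2 signals whether the two signs agree. The sketch below assumes sign indices are 0 or 1 and table indices run from 0 to M-1. The decoder is an inference from the rules rather than something the claim recites, and both function names are hypothetical.

```python
def encode_pair(p1, sigma1, p2, sigma2, m):
    """Pack two table indices and two sign bits into I = s + 2 * (i1 + i2 * M)."""
    if sigma1 != sigma2:
        if p1 == p2:
            raise ValueError("same vector with opposite signs is not covered")
        # Differing signs: the larger table index is stored first.
        i1, i2, s = (p2, p1, sigma2) if p1 < p2 else (p1, p2, sigma1)
    else:
        # Equal signs: the smaller table index is stored first.
        i1, i2, s = (p2, p1, sigma1) if p1 > p2 else (p1, p2, sigma1)
    return s + 2 * (i1 + i2 * m)


def decode_pair(index, m):
    """Inverse mapping (assumed): i1 > i2 means the two signs differ."""
    s = index & 1
    rest = index >> 1
    i1, i2 = rest % m, rest // m
    other = s if i1 <= i2 else 1 - s
    return (i1, s), (i2, other)


code = encode_pair(1, 0, 3, 1, m=4)
print(code)                    # 15
print(decode_pair(code, m=4))  # ((3, 1), (1, 0))
```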
23. A stochastic codebook searching method as defined in claim 21, wherein P=3 and wherein calculating the index of the best codevector comprises: dividing the stochastic table into two halves with M/2 random vectors in each half of the stochastic table;
determining the one of said two halves of the stochastic table which contains at least two of the three random vectors of said selected one combination; and constructing the index I of said best codevector using the following relation:
I = φ + 2 × (s + 2 × (i1 + i2 × M/2)) + M × M × (σ3 + 2 × p3)
and the following rules:
- if σ1 ≠ σ2, and p1 < p2, then set i1 = p2, i2 = p1, and s = σ2;
- if σ1 ≠ σ2, and p1 > p2, then set i1 = p1, i2 = p2, and s = σ1;
- if σ1 = σ2, and p1 > p2, then set i1 = p2, i2 = p1, and s = σ1;
- if σ1 = σ2, and p1 ≤ p2, then set i1 = p1, i2 = p2, and s = σ1; and
- if φ corresponds to the second half, i1 = i1 - M/2 and i2 = i2 - M/2; where:
- φ = 0 or 1 and denotes said one half of the stochastic table containing at least two of the three random vectors of said selected one combination;
- σ1 and σ2 denote respective sign indices of the two random vectors located in said one half of the stochastic table;
- σ3 denotes a sign index of the third random vector; and
- p1 and p2 denote respective indices of the two random vectors located in said one half of the stochastic table; and
- p3 denotes an index of the third random vector.
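The P=3 indexing of claim 23 reuses the pair rules inside one half of the table and appends the third vector. A sketch follows, assuming M is even, sign indices are 0 or 1, and the caller has already arranged the vectors so that p1 and p2 share a half (with three vectors and two halves, at least two always do). The decoder is inferred from the relation and is not recited in the claim.

```python
def encode_triple(p1, sigma1, p2, sigma2, p3, sigma3, m):
    """I = phi + 2*(s + 2*(i1 + i2*M/2)) + M*M*(sigma3 + 2*p3)."""
    half = m // 2
    phi = 0 if p1 < half else 1
    if (p2 >= half) != (phi == 1):
        raise ValueError("p1 and p2 must lie in the same half of the table")
    if sigma1 != sigma2:
        if p1 == p2:
            raise ValueError("same vector with opposite signs is not covered")
        i1, i2, s = (p2, p1, sigma2) if p1 < p2 else (p1, p2, sigma1)
    else:
        i1, i2, s = (p2, p1, sigma1) if p1 > p2 else (p1, p2, sigma1)
    if phi == 1:               # make indices relative to the second half
        i1 -= half
        i2 -= half
    return phi + 2 * (s + 2 * (i1 + i2 * half)) + m * m * (sigma3 + 2 * p3)


def decode_triple(index, m):
    """Inverse mapping (assumed): returns three (table index, sign index) pairs."""
    half = m // 2
    low, high = index % (m * m), index // (m * m)
    sigma3, p3 = high & 1, high >> 1
    phi, s, rest = low & 1, (low >> 1) & 1, low >> 2
    i1, i2 = rest % half, rest // half
    other = s if i1 <= i2 else 1 - s
    offset = half * phi
    return (i1 + offset, s), (i2 + offset, other), (p3, sigma3)


code = encode_triple(5, 1, 6, 0, 2, 1, m=8)
print(code)                      # 345
print(decode_triple(code, m=8))  # ((6, 0), (5, 1), (2, 1))
```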
24. A stochastic codebook searching method as defined in claim 10, wherein P=3, and wherein said method further comprises calculating an index of the signal representative codevector, said index containing information about: signs of two of the three random vectors of said selected one combination; and indices of said two random vectors of said selected one combination; and a bit indicating that a pulse is chosen to replace a third of the three random vectors of said one selected combination.
25. A device for efficiently searching a stochastic codebook having a stochastic table containing a set of M random vectors of dimension N to find the best codevector for encoding a signal, said stochastic codebook searching device comprising: means for applying to the M random vectors a preselection criterion related to the signal; means for preselecting a subset of K random vectors amongst the M random vectors of said set in relation to the preselection criterion; means for applying a search criterion related to said signal to combinations of P random vectors out of the K random vectors of said preselected subset; and
means for selecting, in relation to the search criterion, one of said combinations of P random vectors forming said best codevector for encoding said signal.
26. A stochastic codebook searching device as defined in claim 25, wherein: said means for applying the preselection criterion comprises: means for calculating a dot product between: a backward filtered version of a target vector calculated during encoding of said signal and used for searching the stochastic codebook; and each of the M random vectors of said set; and said means for preselecting a subset of K random vectors comprises: means for preselecting as said subset the K random vectors of said set with the largest absolute values of dot products.
27. A stochastic codebook searching device as defined in claim 26, wherein said means for calculating the dot product comprises means for calculating the backward filtered version d(n) of the search target vector x(n) by correlating the search target vector x(n) with h(n) in accordance with the following relation:
$$d(n) = x(n) * h(-n) = \sum_{i=n}^{N-1} x(i)\, h(i - n)$$
where h(n) is an impulse response of a weighted synthesis filter calculated during encoding of said signal.
28. A stochastic codebook searching device as defined in claim 25, further comprising means for presetting a sign of each random vector of the subset.
29. A stochastic codebook searching device as defined in claim 26, further comprising means for presetting a sign of each random vector of said subset, said preset sign being the sign of the corresponding dot product.
30. A stochastic codebook searching device as defined in claim 25, wherein said means for applying a search criterion comprises means for calculating, for each combination of P random vectors, a mathematical relation involving said combination.
31. A stochastic codebook searching device as defined in claim 30, wherein said means for calculating the mathematical relation comprises means for calculating, for each combination of P random vectors, a ratio involving said combination and a target vector calculated during encoding of said signal and used for searching the stochastic codebook.
32. A stochastic codebook searching device as defined in claim 31, wherein said means for selecting one of said combinations of P random vectors comprises means for selecting the combination with the largest ratio.
33. A stochastic codebook searching device as defined in claim 31, wherein said means for calculating said ratio for each combination of P random vectors comprises: means for convolving each random vector of said subset of K random vectors with an impulse response of a weighted synthesis filter calculated during encoding of said signal and thereby producing K filtered random vectors; means for computing the energy of each filtered random vector; means for calculating a dot product of each filtered random vector with the target vector; and means for computing, for each combination of P random vectors, said ratio in response to the corresponding P filtered random vectors, P computed energies and P calculated dot products.
34. A stochastic codebook searching device as defined in claim 33, wherein said means for computing said ratio for each combination of P random vectors comprises: P nested calculation loops for computing said ratios for all possible combinations of P vectors.
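The benefit of running the P nested loops over only the K preselected vectors can be illustrated by counting combinations. The sketch assumes the loops visit each unordered set of P distinct vectors once and that signs are preset, so they add no extra factor. The values M = 64 and K = 8 are hypothetical and do not come from the claims.

```python
from math import comb


def search_cost(m, k, p):
    """Combinations tested without preselection (all M) and with it (K only)."""
    return comb(m, p), comb(k, p)


print(search_cost(m=64, k=8, p=2))  # (2016, 28)
print(search_cost(m=64, k=8, p=3))  # (41664, 56)
```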
35. A stochastic codebook searching device as defined in claim 33, further comprising means for calculating a gain of the best codevector through a ratio having:
a numerator constituted by a sum of the P dot products between the P random vectors of said selected one combination and the target vector; and a denominator involving the P computed energies and P filtered random vectors respectively corresponding to the P random vectors of said selected one combination.
36. A stochastic codebook searching device as defined in claim 25, further comprising means for calculating an index of the best codevector, said index containing information about: signs of the P random vectors of said selected one combination; and indices of the P random vectors of said selected one combination.
37. A stochastic codebook searching device as defined in claim 36, wherein P=2 and wherein said means for calculating the index of the best codevector comprises means for constructing the index I of said best codevector from the respective indices p1 and p2 and sign indices σ1 and σ2 of the two random vectors using the following relation:
I = s + 2 × (i1 + i2 × M)
and the following rules:
- if σ1 ≠ σ2, and p1 < p2, then set i1 = p2, i2 = p1, and s = σ2;
- if σ1 ≠ σ2, and p1 > p2, then set i1 = p1, i2 = p2, and s = σ1;
- if σ1 = σ2, and p1 > p2, then set i1 = p2, i2 = p1, and s = σ1; and
- if σ1 = σ2, and p1 ≤ p2, then set i1 = p1, i2 = p2, and s = σ1.
38. A stochastic codebook searching device as defined in claim 36, wherein P=3 and wherein said means for calculating the index of the best codevector comprises: means for dividing the stochastic table into two halves with M/2 random vectors in each half of the stochastic table; means for determining the one of said two halves of the stochastic table which contains at least two of the three random vectors of said selected one combination; and means for constructing the index I of said best codevector using the following relation:
I = φ + 2 × (s + 2 × (i1 + i2 × M/2)) + M × M × (σ3 + 2 × p3)
and the following rules:
- if σ1 ≠ σ2, and p1 < p2, then set i1 = p2, i2 = p1, and s = σ2;
- if σ1 ≠ σ2, and p1 > p2, then set i1 = p1, i2 = p2, and s = σ1;
- if σ1 = σ2, and p1 > p2, then set i1 = p2, i2 = p1, and s = σ1;
- if σ1 = σ2, and p1 ≤ p2, then set i1 = p1, i2 = p2, and s = σ1; and
- if φ corresponds to the second half, i1 = i1 - M/2 and i2 = i2 - M/2; where:
- φ = 0 or 1 and denotes said one half of the stochastic table containing at least two of the three random vectors of said selected one combination;
- σ1 and σ2 denote respective sign indices of the two random vectors located in said one half of the stochastic table;
- σ3 denotes a sign index of the third random vector; and
- p1 and p2 denote respective indices of the two random vectors located in said one half of the stochastic table; and
- p3 denotes an index of the third random vector.
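The two index relations imply a bit budget that can be checked by evaluating each relation at its largest arguments. The sketch assumes sign indices of 0 or 1, table indices from 0 to M-1, and an even M. The table size M = 64 is a hypothetical value chosen for illustration.

```python
def index_bits(m, p):
    """Bits needed to hold the largest index produced for P = 2 or P = 3."""
    if p == 2:
        # I = s + 2 * (i1 + i2 * M)
        max_index = 1 + 2 * ((m - 1) + (m - 1) * m)
    elif p == 3:
        # I = phi + 2 * (s + 2 * (i1 + i2 * M/2)) + M * M * (sigma3 + 2 * p3)
        half = m // 2
        low_max = 1 + 2 * (1 + 2 * ((half - 1) + (half - 1) * half))
        max_index = low_max + m * m * (1 + 2 * (m - 1))
    else:
        raise ValueError("only P = 2 and P = 3 are covered")
    return max_index.bit_length()


print(index_bits(64, 2))  # 13
print(index_bits(64, 3))  # 19
```

For a power-of-two M these totals equal 1 + P × log2(M), which is P fewer bits than sending P independent sign bits alongside the P indices.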
39. A stochastic codebook searching device as defined in claim 25, wherein P=3, and wherein said device further comprises means for calculating an index of the signal representative codevector, said index containing information about: signs of two of the three random vectors of said selected one combination; and indices of said two random vectors of said selected one combination; and a bit indicating that a pulse is chosen to replace a third of the three random vectors of said one selected combination.
40. A cellular communication system for servicing a large geographical area divided into a plurality of cells, comprising: mobile transmitter/receiver units; cellular base stations respectively situated in said cells;
means for controlling communication between the cellular base stations; a bidirectional wireless communication sub-system between each mobile unit situated in one cell and the cellular base station of said one cell, said bidirectional wireless communication sub-system comprising in both the mobile unit and the cellular base station (a) a transmitter including means for encoding a sound signal and means for transmitting the encoded sound signal, and (b) a receiver including means for receiving a transmitted encoded sound signal and means for decoding the received encoded sound signal;
wherein said sound signal encoding means comprises means responsive to the sound signal for producing sound signal encoding parameters, and wherein said sound signal encoding parameter producing means comprises a device for efficiently searching a stochastic codebook having a stochastic table containing a set of M random vectors of dimension N to find the best codevector for encoding the sound signal, said stochastic codebook searching device comprising: means for applying to the M random vectors a preselection criterion related to the sound signal; means for preselecting a subset of K random vectors amongst the M random vectors of said set in relation to the preselection criterion; means for applying a search criterion related to said sound signal to combinations of P random vectors out of the K random vectors of said preselected subset; and
means for selecting, in relation to the search criterion, one of said combinations of P random vectors forming said best codevector for encoding said sound signal.
41. A cellular communication system as recited in claim 40, wherein: said means for applying the preselection criterion comprises: means for calculating a dot product between: a backward filtered version of a target vector calculated during encoding of said sound signal and used for searching the stochastic codebook; and each of the M random vectors of said set; and said means for preselecting a subset of K random vectors comprises: means for preselecting as said subset the K random vectors of said set with the largest absolute values of dot products.
42. A cellular communication system as recited in claim 41, wherein said means for calculating the dot product comprises means for calculating the backward filtered version d(n) of the search target vector x(n) by correlating the search target vector x(n) with h(n) in accordance with the following relation:
$$d(n) = x(n) * h(-n) = \sum_{i=n}^{N-1} x(i)\, h(i - n)$$
where h(n) is an impulse response of a weighted synthesis filter calculated during encoding of said sound signal.
43. A cellular communication system as recited in claim 40, further comprising means for presetting a sign of each random vector of the subset.
44. A cellular communication system as recited in claim 41, further comprising means for presetting a sign of each random vector of said subset, said preset sign being the sign of the corresponding dot product.
45. A cellular communication system as recited in claim 40, wherein said means for applying a search criterion comprises means for calculating, for each combination of P random vectors, a mathematical relation involving said combination.
46. A cellular communication system as recited in claim 45, wherein said means for calculating the mathematical relation comprises means for calculating, for each combination of P random vectors, a ratio involving said combination and a target vector calculated during encoding of said sound signal and used for searching the stochastic codebook.
47. A cellular communication system as recited in claim 46, wherein said means for selecting one of said combinations of P random vectors comprises means for selecting the combination with the largest ratio.
48. A cellular communication system as recited in claim 46, wherein said means for calculating said ratio for each combination of P random vectors comprises: means for convolving each random vector of said subset of K random vectors with an impulse response of a weighted synthesis filter calculated during encoding of said sound signal and thereby producing K filtered random vectors; means for computing the energy of each filtered random vector; means for calculating a dot product of each filtered random vector with the target vector; and means for computing, for each combination of P random vectors, said ratio in response to the corresponding P filtered random vectors, P computed energies and P calculated dot products.
49. A cellular communication system as recited in claim 48, wherein said means for computing said ratio for each combination of P random vectors comprises:
P nested calculation loops for computing said ratios for all possible combinations of P vectors.
50. A cellular communication system as recited in claim 48, further comprising means for calculating a gain of the best codevector through a ratio having: a numerator constituted by a sum of the P dot products between the P random vectors of said selected one combination and the target vector; and
a denominator involving the P computed energies and P filtered random vectors respectively corresponding to the P random vectors of said selected one combination.
51. A cellular communication system as recited in claim 40, further comprising means for calculating an index of the best codevector, said index containing information about: signs of the P random vectors of said selected one combination; and indices of the P random vectors of said selected one combination.
52. A cellular communication system as recited in claim 51, wherein P=2 and wherein said means for calculating the index of the best codevector comprises means for constructing the index I of said best codevector from the respective indices p1 and p2 and sign indices σ1 and σ2 of the two random vectors using the following relation:
I = s + 2 × (i1 + i2 × M)
and the following rules:
- if σ1 ≠ σ2, and p1 < p2, then set i1 = p2, i2 = p1, and s = σ2;
- if σ1 ≠ σ2, and p1 > p2, then set i1 = p1, i2 = p2, and s = σ1;
- if σ1 = σ2, and p1 > p2, then set i1 = p2, i2 = p1, and s = σ1; and
- if σ1 = σ2, and p1 ≤ p2, then set i1 = p1, i2 = p2, and s = σ1.
53. A cellular communication system as recited in claim 51, wherein P=3 and wherein said means for calculating the index of the best codevector comprises: means for dividing the stochastic table into two halves with M/2 random vectors in each half of the stochastic table; means for determining the one of said two halves of the stochastic table which contains at least two of the three random vectors of said selected one combination; and means for constructing the index I of said best codevector using the following relation:
I = φ + 2 × (s + 2 × (i1 + i2 × M/2)) + M × M × (σ3 + 2 × p3)
and the following rules:
- if σ1 ≠ σ2, and p1 < p2, then set i1 = p2, i2 = p1, and s = σ2;
- if σ1 ≠ σ2, and p1 > p2, then set i1 = p1, i2 = p2, and s = σ1;
- if σ1 = σ2, and p1 > p2, then set i1 = p2, i2 = p1, and s = σ1;
- if σ1 = σ2, and p1 ≤ p2, then set i1 = p1, i2 = p2, and s = σ1; and
- if φ corresponds to the second half, i1 = i1 - M/2 and i2 = i2 - M/2; where:
- φ = 0 or 1 and denotes said one half of the stochastic table containing at least two of the three random vectors of said selected one combination;
- σ1 and σ2 denote respective sign indices of the two random vectors located in said one half of the stochastic table;
- σ3 denotes a sign index of the third random vector; and
- p1 and p2 denote respective indices of the two random vectors located in said one half of the stochastic table; and
- p3 denotes an index of the third random vector.
54. A cellular communication system as recited in claim 40, wherein P=3, and wherein said system further comprises means for calculating an index of the sound signal representative codevector, said index containing information about: signs of two of the three random vectors of said selected one combination; and indices of said two random vectors of said selected one combination; and a bit indicating that a pulse is chosen to replace a third of the three random vectors of said one selected combination.
55. A cellular network element comprising (a) a transmitter including means for encoding a sound signal and means for transmitting the encoded sound signal, and (b) a receiver including means for receiving a transmitted encoded sound signal and means for decoding the received encoded sound signal;
wherein said sound signal encoding means comprises means responsive to the sound signal for producing sound signal encoding parameters, and wherein said sound signal encoding parameter producing means comprises a device for efficiently searching a stochastic codebook having a stochastic table containing a set of M random vectors of dimension N to find the best codevector for encoding the sound signal, said stochastic codebook searching device comprising: means for applying to the M random vectors a preselection criterion related to the sound signal; means for preselecting a subset of K random vectors amongst the M random vectors of said set in relation to the preselection criterion; means for applying a search criterion related to said sound signal to combinations of P random vectors out of the K random vectors of said preselected subset; and means for selecting, in relation to the search criterion, one of said combinations of P random vectors forming said best codevector for encoding said sound signal.
56. A cellular network element as defined in claim 55, wherein: said means for applying the preselection criterion comprises: means for calculating a dot product between: a backward filtered version of a target vector calculated during encoding of said sound signal
and used for searching the stochastic codebook; and each of the M random vectors of said set; and said means for preselecting a subset of K random vectors comprises: means for preselecting as said subset the K random vectors of said set with the largest absolute values of dot products.
57. A cellular network element as defined in claim 56, wherein said means for calculating the dot product comprises means for calculating the backward filtered version d(n) of the search target vector x(n) by correlating the search target vector x(n) with h(n) in accordance with the following relation:
$$d(n) = x(n) * h(-n) = \sum_{i=n}^{N-1} x(i)\, h(i - n)$$
where h(n) is an impulse response of a weighted synthesis filter calculated during encoding of said sound signal.
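The backward-filtered target is useful because the dot product of d(n) with an unfiltered random vector equals the dot product of the target x(n) with the filtered random vector. Preselection can therefore rank all M vectors without convolving any of them. The check below is a small numerical illustration with made-up values; the helper names are hypothetical.

```python
def backward_filter(x, h):
    """d(n) = sum_{i=n}^{N-1} x(i) * h(i - n)."""
    n_len = len(x)
    return [sum(x[i] * h[i - n] for i in range(n, n_len)) for n in range(n_len)]


def convolve(v, h):
    """Filter v with impulse response h, truncated to len(v) samples."""
    return [sum(v[i] * h[n - i] for i in range(n + 1)) for n in range(len(v))]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


x = [1.0, 2.0, 3.0]    # toy target vector
h = [1.0, 0.5, 0.25]   # toy impulse response
v = [2.0, -1.0, 4.0]   # toy random vector

lhs = dot(backward_filter(x, h), v)  # one backward filtering, then M cheap dot products
rhs = dot(x, convolve(v, h))         # would need one convolution per random vector
print(lhs, rhs)  # 14.0 14.0
```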
58. A cellular network element as defined in claim 55, further comprising means for presetting a sign of each random vector of the subset.
59. A cellular network element as defined in claim 56, further comprising means for presetting a sign of each random vector of said subset, said preset sign being the sign of the corresponding dot product.
60. A cellular network element as defined in claim 55, wherein said means for applying a search criterion comprises means for calculating, for each combination of P random vectors, a mathematical relation involving said combination.
61. A cellular network element as defined in claim 60, wherein said means for calculating the mathematical relation comprises means for calculating, for each combination of P random vectors, a ratio involving said combination and a target vector calculated during encoding of said sound signal and used for searching the stochastic codebook.
62. A cellular network element as defined in claim 61, wherein said means for selecting one of said combinations of P random vectors comprises means for selecting the combination with the largest ratio.
63. A cellular network element as defined in claim 61, wherein said means for calculating said ratio for each combination of P random vectors comprises: means for convolving each random vector of said subset of K random vectors with an impulse response of a weighted synthesis filter calculated during encoding of said sound signal and thereby producing K filtered random vectors;
means for computing the energy of each filtered random vector; means for calculating a dot product of each filtered random vector with the target vector; and means for computing, for each combination of P random vectors, said ratio in response to the corresponding P filtered random vectors, P computed energies and P calculated dot products.
64. A cellular network element as defined in claim 63, wherein said means for computing said ratio for each combination of P random vectors comprises: P nested calculation loops for computing said ratios for all possible combinations of P vectors.
65. A cellular network element as defined in claim 63, further comprising means for calculating a gain of the best codevector through a ratio having: a numerator constituted by a sum of the P dot products between the P random vectors of said selected one combination and the target vector; and a denominator involving the P computed energies and P filtered random vectors respectively corresponding to the P random vectors of said selected one combination.
66. A cellular network element as defined in claim 55, further comprising means for calculating an index of the best codevector, said index containing information about:
signs of the P random vectors of said selected one combination; and indices of the P random vectors of said selected one combination.
67. A cellular network element as defined in claim 66, wherein P=2 and wherein said means for calculating the index of the best codevector comprises means for constructing the index I of said best codevector from the respective indices p1 and p2 and sign indices σ1 and σ2 of the two random vectors using the following relation:
I = s + 2 × (i1 + i2 × M)
and the following rules:
- if σ1 ≠ σ2, and p1 < p2, then set i1 = p2, i2 = p1, and s = σ2;
- if σ1 ≠ σ2, and p1 > p2, then set i1 = p1, i2 = p2, and s = σ1;
- if σ1 = σ2, and p1 > p2, then set i1 = p2, i2 = p1, and s = σ1; and
- if σ1 = σ2, and p1 ≤ p2, then set i1 = p1, i2 = p2, and s = σ1.
68. A cellular network element as defined in claim 66, wherein P=3 and wherein said means for calculating the index of the best codevector comprises: means for dividing the stochastic table into two halves with M/2 random vectors in each half of the stochastic table;
means for determining the one of said two halves of the stochastic table which contains at least two of the three random vectors of said selected one combination; and means for constructing the index I of said best codevector using the following relation:
I = φ + 2 × (s + 2 × (i1 + i2 × M/2)) + M × M × (σ3 + 2 × p3)
and the following rules:
- if σ1 ≠ σ2, and p1 < p2, then set i1 = p2, i2 = p1, and s = σ2;
- if σ1 ≠ σ2, and p1 > p2, then set i1 = p1, i2 = p2, and s = σ1;
- if σ1 = σ2, and p1 > p2, then set i1 = p2, i2 = p1, and s = σ1;
- if σ1 = σ2, and p1 ≤ p2, then set i1 = p1, i2 = p2, and s = σ1; and
- if φ corresponds to the second half, i1 = i1 - M/2 and i2 = i2 - M/2; where:
- φ = 0 or 1 and denotes said one half of the stochastic table containing at least two of the three random vectors of said selected one combination;
- σ1 and σ2 denote respective sign indices of the two random vectors located in said one half of the stochastic table;
- σ3 denotes a sign index of the third random vector; and
- p1 and p2 denote respective indices of the two random vectors located in said one half of the stochastic table; and
- p3 denotes an index of the third random vector.
69. A cellular network element as defined in claim 55, wherein P=3, and wherein said element further comprises means for calculating an index of the sound signal representative codevector, said index containing information about: signs of two of the three random vectors of said selected one combination; and indices of said two random vectors of said selected one combination; and a bit indicating that a pulse is chosen to replace a third of the three random vectors of said one selected combination.
70. A cellular mobile transmitter/receiver unit comprising (a) a transmitter including means for encoding a sound signal and means for transmitting the encoded sound signal, and (b) a receiver including means for receiving a transmitted encoded sound signal and means for decoding the received encoded sound signal; wherein said sound signal encoding means comprises means responsive to the sound signal for producing sound signal encoding parameters, and wherein said sound signal encoding parameter producing means comprises a device for efficiently searching a stochastic codebook having a stochastic table containing a set of M
random vectors of dimension N to find the best codevector for encoding the sound signal, said stochastic codebook searching device comprising: means for applying to the M random vectors a preselection criterion related to the sound signal; means for preselecting a subset of K random vectors amongst the M random vectors of said set in relation to the preselection criterion; means for applying a search criterion related to said sound signal to combinations of P random vectors out of the K random vectors of said preselected subset; and means for selecting, in relation to the search criterion, one of said combinations of P random vectors forming said best codevector for encoding said sound signal.
71. A cellular mobile transmitter/receiver unit as defined in claim 70, wherein: said means for applying the preselection criterion comprises: means for calculating a dot product between: a backward filtered version of a target vector calculated during encoding of said sound signal and used for searching the stochastic codebook; and each of the M random vectors of said set; and said means for preselecting a subset of K random vectors comprises:
means for preselecting as said subset the K random vectors of said set with the largest absolute values of dot products.
72. A cellular mobile transmitter/receiver unit as defined in claim 71, wherein said means for calculating the dot product comprises means for calculating the backward filtered version d(n) of the search target vector x(n) by correlating the search target vector x(n) with h(n) in accordance with the following relation:
$$d(n) = x(n) * h(-n) = \sum_{i=n}^{N-1} x(i)\, h(i - n)$$
where h(n) is an impulse response of a weighted synthesis filter calculated during encoding of said sound signal.
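As an illustration of the relation in claim 72, the following minimal Python sketch computes d(n) directly from the sum. The function names `backward_filter` and `filter_vector` are hypothetical, and the sketch assumes that h(n) is supplied with at least N samples. The second helper is only there to show why the backward filtered target is useful: the dot product of d(n) with an unfiltered vector equals the dot product of x(n) with the filtered vector.

```python
def backward_filter(x, h):
    """Return d(n) = sum_{i=n}^{N-1} x(i) * h(i - n) for n = 0..N-1.

    Assumes len(h) >= len(x) (illustrative assumption).
    """
    N = len(x)
    return [sum(x[i] * h[i - n] for i in range(n, N)) for n in range(N)]


def filter_vector(c, h):
    """Convolve c with h, truncated to the N samples of the subframe."""
    N = len(c)
    return [sum(c[j] * h[n - j] for j in range(n + 1)) for n in range(N)]


x = [1.0, 2.0, 3.0]          # search target vector (toy values)
h = [1.0, 0.5, 0.25]         # impulse response (toy values)
d = backward_filter(x, h)

c = [0.5, -1.0, 2.0]         # an arbitrary unfiltered vector
lhs = sum(dn * cn for dn, cn in zip(d, c))
rhs = sum(xn * yn for xn, yn in zip(x, filter_vector(c, h)))
print(d, lhs, rhs)
```

Because of this identity, the M dot products of claim 71 can be obtained without filtering any of the M random vectors.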
73. A cellular mobile transmitter/receiver unit as defined in claim 70, further comprising means for presetting a sign of each random vector of the subset.
74. A cellular mobile transmitter/receiver unit as defined in claim 71, further comprising means for presetting a sign of each random vector of said subset, said preset sign being the sign of the corresponding dot product.
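The preselection of claim 71 and the sign presetting of claim 74 can be sketched together as follows. This is only an illustration: the function name `preselect`, the list-of-lists table layout, and the convention that a zero dot product gets a positive sign are assumptions, not details taken from the claims.

```python
def preselect(table, d, K):
    """Keep the K table entries with the largest |dot product| against d.

    table : list of M random vectors, each of dimension N
    d     : backward filtered target vector of dimension N
    Returns (indices of the K preselected vectors, their preset signs).
    """
    dots = [sum(v_n * d_n for v_n, d_n in zip(v, d)) for v in table]
    # Rank table indices by decreasing absolute dot product.
    ranked = sorted(range(len(table)), key=lambda m: -abs(dots[m]))
    chosen = ranked[:K]
    # Preset sign of each kept vector: sign of its dot product
    # (zero treated as positive, an arbitrary choice for this sketch).
    signs = [1 if dots[m] >= 0 else -1 for m in chosen]
    return chosen, signs


table = [[1.0, 0.0], [0.0, -3.0], [0.5, 0.5], [2.0, 0.0]]   # M = 4, N = 2
chosen, signs = preselect(table, [1.0, 1.0], K=2)
print(chosen, signs)
```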
75. A cellular mobile transmitter/receiver unit as defined in claim 70, wherein said means for applying a search criterion comprises means for calculating, for each combination of P random vectors, a mathematical relation involving said combination.
76. A cellular mobile transmitter/receiver unit as defined in claim 75, wherein said means for calculating the mathematical relation comprises means for calculating, for each combination of P random vectors, a ratio involving said combination and a target vector calculated during encoding of said sound signal and used for searching the stochastic codebook.
77. A cellular mobile transmitter/receiver unit as defined in claim 76, wherein said means for selecting one of said combinations of P random vectors comprises means for selecting the combination with the largest ratio.
78. A cellular mobile transmitter/receiver unit as defined in claim 76, wherein said means for calculating said ratio for each combination of P random vectors comprises: means for convolving each random vector of said subset of K random vectors with an impulse response of a weighted synthesis filter calculated during encoding of said sound signal and thereby producing K filtered random vectors; means for computing the energy of each filtered random vector; means for calculating a dot product of each filtered random vector with the target vector; and
means for computing, for each combination of P random vectors, said ratio in response to the corresponding P filtered random vectors, P computed energies and P calculated dot products.
79. A cellular mobile transmitter/receiver unit as defined in claim 78, wherein said means for computing said ratio for each combination of P random vectors comprises:
P nested calculation loops for computing said ratios for all possible combinations of P vectors.
80. A cellular mobile transmitter/receiver unit as defined in claim 78, further comprising means for calculating a gain of the best codevector through a ratio having: a numerator constituted by a sum of the P dot products between the P random vectors of said selected one combination and the target vector; and a denominator involving the P computed energies and filtered random vectors respectively corresponding to the P random vectors of said selected one combination.
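Claims 76 to 80 leave the exact ratio open. The sketch below assumes the criterion commonly used in CELP searches, namely the squared correlation between the target and the filtered combination divided by the energy of the filtered combination. Under that assumption, the numerator uses the P dot products, and the denominator uses the P energies plus the cross-correlations of the P filtered vectors. The names `search_combinations`, `filter_vector` and `dot` are hypothetical, and `itertools.combinations` stands in for the P nested loops of claim 79.

```python
from itertools import combinations


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def filter_vector(v, h):
    """Convolve v with h, truncated to len(v) samples."""
    N = len(v)
    return [sum(v[j] * h[n - j] for j in range(n + 1) if n - j < len(h))
            for n in range(N)]


def search_combinations(vectors, signs, h, x, P):
    """Return (best combination, its ratio, its gain) over the K candidates."""
    y = [[s * val for val in filter_vector(v, h)]
         for v, s in zip(vectors, signs)]          # K signed filtered vectors
    energy = [dot(yk, yk) for yk in y]             # K energies
    corr = [dot(x, yk) for yk in y]                # K dot products with target
    best_combo, best_ratio, best_gain = None, -1.0, 0.0
    for combo in combinations(range(len(y)), P):
        num = sum(corr[k] for k in combo)
        den = sum(energy[k] for k in combo) + 2.0 * sum(
            dot(y[j], y[k]) for j, k in combinations(combo, 2))
        if den <= 0.0:
            continue
        ratio = num * num / den
        if ratio > best_ratio:
            best_combo, best_ratio, best_gain = combo, ratio, num / den
    return best_combo, best_ratio, best_gain


vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
combo, ratio, gain = search_combinations(
    vectors, [1, 1, 1], h=[1.0], x=[3.0, 4.0, 0.0], P=2)
print(combo, ratio, gain)
```

The gain returned here follows the form described in claim 80: the sum of the P dot products over a denominator built from the P energies and the filtered vectors.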
81. A cellular mobile transmitter/receiver unit as defined in claim 70, further comprising means for calculating an index of the best codevector, said index containing information about: signs of the P random vectors of said selected one combination; and
indices of the P random vectors of said selected one combination.
82. A cellular mobile transmitter/receiver unit as defined in claim 81, wherein P=2 and wherein said means for calculating the index of the best codevector comprises means for constructing the index I of said best codevector from the respective indices p1 and p2 and sign indices σ1 and σ2 of the two random vectors using the following relation:
I = s + 2 × (i1 + i2 × M)
and the following rules:
- if σ1 ≠ σ2, and p1 < p2, then set i1 = p2, i2 = p1, and s = σ2;
- if σ1 ≠ σ2, and p1 > p2, then set i1 = p1, i2 = p2, and s = σ1;
- if σ1 = σ2, and p1 > p2, then set i1 = p2, i2 = p1, and s = σ1; and
- if σ1 = σ2, and p1 < p2, then set i1 = p1, i2 = p2, and s = σ1.
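The P=2 indexing of claim 82 can be sketched as below. The sketch reads the rules as an ordering trick: the pair is stored with i1 > i2 when the two signs differ and with i1 < i2 when they are equal, and s carries the sign index of the vector stored as i1. That reading, the 0/1 values for the sign indices, and the names `encode_pair` and `decode_pair` are assumptions made for illustration. The two vectors are assumed to be distinct (p1 ≠ p2).

```python
def encode_pair(p1, s1, p2, s2, M):
    """Build I = s + 2 * (i1 + i2 * M) from two (index, sign index) pairs."""
    if s1 != s2:
        # Different signs: store the larger index first.
        if p1 < p2:
            i1, i2, s = p2, p1, s2
        else:
            i1, i2, s = p1, p2, s1
    else:
        # Equal signs: store the smaller index first.
        if p1 > p2:
            i1, i2, s = p2, p1, s1
        else:
            i1, i2, s = p1, p2, s1
    return s + 2 * (i1 + i2 * M)


def decode_pair(index, M):
    """Recover the two (index, sign index) pairs from I."""
    s, rest = index % 2, index // 2
    i1, i2 = rest % M, rest // M
    # The order of i1 and i2 tells whether the second sign differs from s.
    s_other = 1 - s if i1 > i2 else s
    return (i1, s), (i2, s_other)


I = encode_pair(3, 0, 5, 1, M=8)
print(I, decode_pair(I, M=8))
```

With this layout a single sign bit is enough for two signed vectors, so the index fits in 1 + 2·log2(M) bits when M is a power of two.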
83. A cellular mobile transmitter/receiver unit as defined in claim 81, wherein P=3 and wherein said means for calculating the index of the best codevector comprises: means for dividing the stochastic table into two halves with M/2 random vectors in each half of the stochastic table;
means for determining the one of said two halves of the stochastic table which contains at least two of the three random vectors of said selected one combination; and means for constructing the index I of said best codevector using the following relation:
I = φ + 2 × (s + 2 × (i1 + i2 × M/2)) + M × M × (σ3 + 2 × p3)
and the following rules:
- if σ1 ≠ σ2, and p1 < p2, then set i1 = p2, i2 = p1, and s = σ2;
- if σ1 ≠ σ2, and p1 > p2, then set i1 = p1, i2 = p2, and s = σ1;
- if σ1 = σ2, and p1 > p2, then set i1 = p2, i2 = p1, and s = σ1;
- if σ1 = σ2, and p1 ≤ p2, then set i1 = p1, i2 = p2, and s = σ1; and
- if φ corresponds to the second half, i1 = i1 - M/2 and i2 = i2 - M/2;
where:
- φ = 0 or 1 and denotes said one half of the stochastic table containing at least two of the three random vectors of said selected one combination;
- σ1 and σ2 denote respective sign indices of the two random vectors located in said one half of the stochastic table;
- σ3 denotes a sign index of the third random vector; and
- p1 and p2 denote respective indices of the two random vectors located in said one half of the stochastic table; and
- p3 denotes an index of the third random vector.
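The P=3 indexing of claim 83 can be sketched in the same spirit. The sketch assumes M is even, that the three vectors are distinct, that sign indices take the values 0 or 1, and that the pair rules order the two same-half vectors so that i1 > i2 signals different signs. The names `encode_triple` and `decode_triple` are hypothetical.

```python
def encode_triple(vecs, M):
    """vecs: three distinct (index, sign index) pairs. Returns the index I."""
    half = M // 2
    in_second = [p >= half for p, _ in vecs]
    phi = 1 if sum(in_second) >= 2 else 0       # half holding at least two
    same = [v for v, f in zip(vecs, in_second) if f == bool(phi)]
    (p1, s1), (p2, s2) = same[0], same[1]
    rest = list(vecs)
    rest.remove(same[0])
    rest.remove(same[1])
    p3, s3 = rest[0]                            # the remaining vector
    if s1 != s2:
        i1, i2, s = (p2, p1, s2) if p1 < p2 else (p1, p2, s1)
    else:
        i1, i2, s = (p2, p1, s1) if p1 > p2 else (p1, p2, s1)
    if phi:                                     # make indices relative to the half
        i1, i2 = i1 - half, i2 - half
    return phi + 2 * (s + 2 * (i1 + i2 * half)) + M * M * (s3 + 2 * p3)


def decode_triple(index, M):
    """Inverse of encode_triple under the same assumptions."""
    half = M // 2
    low, high = index % (M * M), index // (M * M)
    s3, p3 = high % 2, high // 2
    phi, t = low % 2, low // 2
    s, u = t % 2, t // 2
    i1, i2 = u % half, u // half
    s_other = 1 - s if i1 > i2 else s
    off = phi * half
    return [(i1 + off, s), (i2 + off, s_other), (p3, s3)]


I3 = encode_triple([(1, 0), (6, 1), (5, 0)], M=8)
print(I3, decode_triple(I3, M=8))
```

Restricting two of the three vectors to one half of the table is what lets the first term stay below M × M, so the third vector's index and sign can be stacked on top without overlap.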
84. A cellular mobile transmitter/receiver unit as defined in claim 70, wherein P=3, and wherein said method further comprises means for calculating an index of the sound signal representative codevector, said index containing information about: signs of two of the three random vectors of said selected one combination; and indices of said two random vectors of said selected one combination; and a bit indicating that a pulse is chosen to replace a third of the three random vectors of said one selected combination.
85. In a cellular communication system for servicing a large geographical area divided into a plurality of cells, and comprising: mobile transmitter/receiver units; cellular base stations respectively situated in said cells; and means for controlling communication between the cellular base stations; - a bidirectional wireless communication sub-system between each mobile unit situated in one cell and the cellular base station of said one cell, said bidirectional wireless communication sub-system comprising in both the mobile unit and the cellular base station (a) a transmitter including means for encoding a sound signal and means for transmitting the encoded sound signal, and (b) a receiver including means for receiving a transmitted encoded sound signal and means for decoding the received encoded sound signal;
-wherein said sound signal encoding means comprises means responsive to the sound signal for producing sound signal encoding parameters, and wherein said sound signal encoding parameter producing means comprises a device for efficiently searching a stochastic codebook having a stochastic table containing a set of M random vectors of dimension N to find the best codevector for encoding the sound signal, said stochastic codebook searching device comprising: means for applying to the M random vectors a preselection criterion related to the sound signal; means for preselecting a subset of K random vectors amongst the M random vectors of said set in relation to the preselection criterion; means for applying a search criterion related to said sound signal to combinations of P random vectors out of the K random vectors of said preselected subset; and means for selecting, in relation to the search criterion, one of said combinations of P random vectors forming said best codevector for encoding said sound signal.
86. The bidirectional wireless communication sub-system of claim 85, wherein: said means for applying the preselection criterion comprises: means for calculating a dot product between:
a backward filtered version of a target vector calculated during encoding of said sound signal and used for searching the stochastic codebook; and each of the M random vectors of said set; and said means for preselecting a subset of K random vectors comprises: means for preselecting as said subset the K random vectors of said set with the largest absolute values of dot products.
87. The bidirectional wireless communication sub-system of claim 86, wherein said means for calculating the dot product comprises means for calculating the backward filtered version d(n) of the search target vector x(n) by correlating the search target vector x(n) with h(n) in accordance with the following relation:
$$d(n) = x(n) * h(-n) = \sum_{i=n}^{N-1} x(i)\, h(i - n)$$
where h(n) is an impulse response of a weighted synthesis filter calculated during encoding of said sound signal.
88. The bidirectional wireless communication sub-system of claim 85, further comprising means for presetting a sign of each random vector of the subset.
89. The bidirectional wireless communication sub-system of claim 86, further comprising means for presetting a sign of each random vector of said subset, said preset sign being the sign of the corresponding dot product.
90. The bidirectional wireless communication sub-system of claim 85, wherein said means for applying a search criterion comprises means for calculating, for each combination of P random vectors, a mathematical relation involving said combination.
91. The bidirectional wireless communication sub-system of claim 90, wherein said means for calculating the mathematical relation comprises means for calculating, for each combination of P random vectors, a ratio involving said combination and a target vector calculated during encoding of said sound signal and used for searching the stochastic codebook.
92. The bidirectional wireless communication sub-system of claim 91, wherein said means for selecting one of said combinations of P random vectors comprises means for selecting the combination with the largest ratio.
93. The bidirectional wireless communication sub-system of claim 91 , wherein said means for calculating said ratio for each combination of P random vectors comprises: means for convolving each random vector of said subset of K random vectors with an impulse response of a weighted synthesis filter calculated during encoding of said sound signal and thereby producing K filtered random vectors; means for computing the energy of each filtered random vector; means for calculating a dot product of each filtered random vector with the target vector; and means for computing, for each combination of P random vectors, said ratio in response to the corresponding P filtered random vectors, P computed energies and P calculated dot products.
94. The bidirectional wireless communication sub-system of claim 93, wherein said means for computing said ratio for each combination of P random vectors comprises:
P nested calculation loops for computing said ratios for all possible combinations of P vectors.
95. The bidirectional wireless communication sub-system of claim 93, further comprising means for calculating a gain of the best codevector through a ratio having: a numerator constituted by a sum of the P dot products between the P random vectors of said selected one combination and the target vector; and
a denominator involving the P computed energies and P filtered random vectors respectively corresponding to the P random vectors of said selected one combination.
96. The bidirectional wireless communication sub-system of claim 85, further comprising means for calculating an index of the best codevector, said index containing information about: signs of the P random vectors of said selected one combination; and indices of the P random vectors of said selected one combination.
97. The bidirectional wireless communication sub-system of claim 96, wherein P=2 and wherein said means for calculating the index of the best codevector comprises means for constructing the index I of said best codevector from the respective indices p1 and p2 and sign indices σ1 and σ2 of the two random vectors using the following relation:
I = s + 2 × (i1 + i2 × M)
and the following rules:
- if σ1 ≠ σ2, and p1 < p2, then set i1 = p2, i2 = p1, and s = σ2;
- if σ1 ≠ σ2, and p1 > p2, then set i1 = p1, i2 = p2, and s = σ1;
- if σ1 = σ2, and p1 > p2, then set i1 = p2, i2 = p1, and s = σ1; and
- if σ1 = σ2, and p1 < p2, then set i1 = p1, i2 = p2, and s = σ1.
98. The bidirectional wireless communication sub-system of claim 96, wherein P=3 and wherein said means for calculating the index of the best codevector comprises: means for dividing the stochastic table into two halves with M/2 random vectors in each half of the stochastic table; means for determining the one of said two halves of the stochastic table which contains at least two of the three random vectors of said selected one combination; and means for constructing the index I of said best codevector using the following relation:
I = φ + 2 × (s + 2 × (i1 + i2 × M/2)) + M × M × (σ3 + 2 × p3)
and the following rules:
- if σ1 ≠ σ2, and p1 < p2, then set i1 = p2, i2 = p1, and s = σ2;
- if σ1 ≠ σ2, and p1 > p2, then set i1 = p1, i2 = p2, and s = σ1;
- if σ1 = σ2, and p1 > p2, then set i1 = p2, i2 = p1, and s = σ1;
- if σ1 = σ2, and p1 ≤ p2, then set i1 = p1, i2 = p2, and s = σ1; and
- if φ corresponds to the second half, i1 = i1 - M/2 and i2 = i2 - M/2;
where:
- φ = 0 or 1 and denotes said one half of the stochastic table containing at least two of the three random vectors of said selected one combination;
- σ1 and σ2 denote respective sign indices of the two random vectors located in said one half of the stochastic table;
- σ3 denotes a sign index of the third random vector; and
- p1 and p2 denote respective indices of the two random vectors located in said one half of the stochastic table; and
- p3 denotes an index of the third random vector.
99. The bidirectional wireless communication sub-system of claim 85, wherein P=3, and wherein said method further comprises means for calculating an index of the sound signal representative codevector, said index containing information about: signs of two of the three random vectors of said selected one combination; and indices of said two random vectors of said selected one combination; and a bit indicating that a pulse is chosen to replace a third of the three random vectors of said one selected combination.
100. A codevector generating method as defined in claim 4, wherein P=2 and wherein said codevector generating method further comprises calculating an index I of said codevector from the respective indices p1 and p2 of the two random vectors and indices σ1 and σ2 respectively representative of the signs s1 and s2 of the two random vectors using the following relation:
I = s + 2 × (i1 + i2 × M)
and the following rules:
- if σ1 ≠ σ2, and p1 < p2, then set i1 = p2, i2 = p1, and s = σ2;
- if σ1 ≠ σ2, and p1 > p2, then set i1 = p1, i2 = p2, and s = σ1;
- if σ1 = σ2, and p1 > p2, then set i1 = p2, i2 = p1, and s = σ1; and
- if σ1 = σ2, and p1 < p2, then set i1 = p1, i2 = p2, and s = σ1.
101. A codevector generating method as defined in claim 4, wherein P=3 and wherein said codevector generating method further comprises calculating an index I of said codevector by: dividing the stochastic table into two halves with M/2 random vectors in each half of the stochastic table; determining the one of said two halves of the stochastic table which contains at least two of the three random vectors; and constructing the index I of said codevector using the following relation:
I = φ + 2 × (s + 2 × (i1 + i2 × M/2)) + M × M × (σ3 + 2 × p3)
and the following rules:
- if σ1 ≠ σ2, and p1 < p2, then set i1 = p2, i2 = p1, and s = σ2;
- if σ1 ≠ σ2, and p1 > p2, then set i1 = p1, i2 = p2, and s = σ1;
- if σ1 = σ2, and p1 > p2, then set i1 = p2, i2 = p1, and s = σ1;
- if σ1 = σ2, and p1 ≤ p2, then set i1 = p1, i2 = p2, and s = σ1; and
- if φ corresponds to the second half, i1 = i1 - M/2 and i2 = i2 - M/2;
where:
- σ1, σ2 and σ3 are indices respectively representative of the signs s1, s2 and s3 of the three random vectors;
- φ = 0 or 1 and denotes said one half of the stochastic table containing at least two of the three random vectors of said selected one combination;
- σ1 and σ2 denote respective sign indices of the two random vectors located in said one half of the stochastic table;
- σ3 denotes a sign index of the third random vector;
- p1 and p2 denote respective indices of the two random vectors located in said one half of the stochastic table; and
- p3 denotes an index of the third random vector.
PCT/CA2000/000036 1999-01-15 2000-01-14 A method and device for designing and searching large stochastic codebooks in low bit rate speech encoders WO2000042601A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU30286/00A AU3028600A (en) 1999-01-15 2000-01-14 A method and device for designing and searching large stochastic codebooks in low bit rate speech encoders

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CA2,259,094 19990115 1999-01-15
CA002259094A CA2259094A1 (en) 1999-01-15 1999-01-15 A method and device for designing and searching large stochastic codebooks in low bit rate speech encoders

Publications (1)

Publication Number Publication Date
WO2000042601A1 (en) 2000-07-20

Family

ID=4163194

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CA2000/000036 WO2000042601A1 (en) 1999-01-15 2000-01-14 A method and device for designing and searching large stochastic codebooks in low bit rate speech encoders

Country Status (3)

Country Link
AU (1) AU3028600A (en)
CA (1) CA2259094A1 (en)
WO (1) WO2000042601A1 (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0514912A2 (en) * 1991-05-22 1992-11-25 Nippon Telegraph And Telephone Corporation Speech coding and decoding methods
EP0577488A1 (en) * 1992-06-29 1994-01-05 Nippon Telegraph And Telephone Corporation Speech coding method and apparatus for the same

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1677289A2 (en) * 2004-12-31 2006-07-05 Samsung Electronics Co., Ltd. High-band speech coding apparatus and high-band speech decoding apparatus in a wide-band speech coding/decoding system and high-band speech coding and decoding methods performed by the apparatuses
EP1677289A3 (en) * 2004-12-31 2008-12-03 Samsung Electronics Co., Ltd. High-band speech coding apparatus and high-band speech decoding apparatus in a wide-band speech coding/decoding system and high-band speech coding and decoding methods performed by the apparatuses
US7801733B2 (en) 2004-12-31 2010-09-21 Samsung Electronics Co., Ltd. High-band speech coding apparatus and high-band speech decoding apparatus in wide-band speech coding/decoding system and high-band speech coding and decoding method performed by the apparatuses

Also Published As

Publication number Publication date
AU3028600A (en) 2000-08-01
CA2259094A1 (en) 2000-07-15

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CR CU CZ DE DK DM EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
WA Withdrawal of international application
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

WA Withdrawal of international application