EP0203940A4 - RELP vocoder for digital signal processors

RELP vocoder for digital signal processors.

Info

Publication number
EP0203940A4
Authority
EP
European Patent Office
Prior art keywords
subroutine
signal
samples
residual signal
pitch
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP19850905709
Other languages
German (de)
English (en)
Other versions
EP0203940A1 (fr)
Inventor
Philip John Wilson
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hughes Network Systems LLC
MA Com Government Systems Inc
Original Assignee
MA Com Government Systems Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by MA Com Government Systems Inc
Publication of EP0203940A1
Publication of EP0203940A4
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/06 Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients

Definitions

  • the present invention generally pertains to voice coders (vocoders) and is particularly directed to Residual-Excited Linear Prediction (RELP) vocoders.
  • Vocoders convert speech signals into digital form for transmission and synthesize speech signals from these digital signals upon reception.
  • Vocoders typically operate at flexible binary data rates varying from 32 kbps (kilobits per second) down to about 2.4 kbps.
  • Vocoders traditionally are divided into two basic types, waveform coders and pitch-excited source coders.
  • Waveform coders operate at high data rates (above 16 kbps) and produce good quality natural sounding speech which is robust against both acoustic and transmitted noise.
  • Source coders operate at low data rates (less than 4.8 kbps) in an analysis/synthesis mode governed by a mathematical model of the human vocal apparatus.
  • Source vocoders typically sound robotic and do not perform well under poor acoustic conditions.
  • the RELP vocoder was originally proposed by Un and Magill, "The Residual-Excited Linear Prediction Vocoder with Transmission Rate Below 9.6 kbits/s", IEEE Trans. COM-23, 1975, pp. 1466-1473; and an enhanced RELP vocoder was proposed by Dankberg and Wong, "Development of a 4.8-9.6 kbps RELP vocoder", ICASSP-79.
  • the purpose of the RELP vocoder was to provide satisfactory performance in the gap between the operating ranges of waveform coders and source coders, to wit: 4.8 kbps to 16 kbps.
  • the RELP vocoder contains some features of both waveform coders and source coders.
  • digital speech data signal samples are analyzed over relatively short time segments (typically in the range of 10-30 ms) by a linear predictive coding (LPC) vocal tract modeling technique to provide LPC coefficients for each block of samples.
  • LPC coefficients represent the vocal tract, glottal flow and radiation of the speech represented by the digital signal samples.
  • the digital speech data signal samples are inverse filtered by a time-variant, all-pole recursive digital filter over each short time segment to provide residual signal (prediction error signal) samples.
  • the time-variant character of speech is handled by a succession of such filters with different parameters.
  • the residual signal and the LPC coefficients are encoded (quantized) and formatted for transmission.
  • speech is synthesized by processing the residual signal in accordance with the LPC coefficients.
  • the residual signal samples are bandlimited and downsampled prior to quantization in order to provide residual signal samples at a reduced data rate.
  • the upper band harmonics are generated during synthesis of the speech signal when the downsampled residual signal is upsampled and zeros are inserted between data points.
  • the residual signal is quantized prior to transmission by adaptive delta modulation.
  • Dankberg and Wong considered various other quantization techniques and concluded that pitch predictive adaptive differential pulse code modulation (PPADPCM) provided the best signal-to-quantizing noise ratio.
  • PPADPCM pitch predictive adaptive differential pulse code modulation
  • the residual signal samples are processed by pitch analysis to determine the pitch delay, are processed by pitch predictor gain analysis to determine the pitch predictor gain in accordance with the determined pitch delay, are processed by gain analysis to provide a maximum deviation quantizer gain, and are further processed by PPADPCM in accordance with the quantizer gain, pitch predictor gain and delay parameters to thereby provide the quantized residual signal.
  • RELP vocoders of the prior art have required complex hardware and have been so expensive to implement as to be commercially impractical.
  • the present invention provides a commercially practical RELP vocoder that is implemented by two digital signal processors, one for a transmitter system and one for a remotely located receiver system.
  • the transmitter digital signal processor is adapted for processing digital speech data signal samples to provide a formatted transmission signal including (a) a quantized residual signal generated by inverse filtering of the samples in accordance with linear predictive coding (LPC) coefficients generated from the samples, (b) quantized LPC coefficients and (c) pitch and gain parameters generated during quantization of the residual signal from the inverse filtered samples, all of which are generated by the processor from the digital speech data samples.
  • the receiver digital signal processor is adapted for processing the formatted transmission signal to synthesize reconstructed digital speech data signal samples.
  • the transmitter digital signal processor is adapted for performing a routine for generating the LPC coefficients; a routine for generating the residual signal; and a routine for quantizing the residual signal and the LPC coefficients.
  • the routine for generating the LPC coefficients includes a subroutine for pre-emphasizing the samples in order to emphasize the high frequencies of speech; a subroutine for defining an auto-correlation function (ACF) from the pre-emphasized samples in order to generate ACF coefficients; and a subroutine for generating the LPC coefficients from the generated ACF coefficients.
  • ACF auto-correlation function
  • the routine for generating the residual signal includes a subroutine for inverse filtering the pre-emphasized samples in accordance with the generated LPC coefficients; a subroutine for bandlimiting the residual signal by low-pass filtering in a manner which will reduce the effects of quantization; and a subroutine for downsampling the bandlimited residual signal to reduce the number of residual signal samples that are quantized and formatted for transmission.
  • the routine for quantizing the residual signal and LPC coefficients includes a subroutine for quantizing the LPC coefficients; a subroutine for estimating the pitch period of the downsampled residual signal by ACF analysis of the current downsampled residual signal frame in accordance with the ACF coefficients generated for the previous frame to thereby provide a pitch delay parameter for the current frame; a subroutine for providing a pitch predictor gain parameter for each residual signal frame in accordance with the estimated pitch delay parameter for each corresponding frame; a subroutine for providing a quantizer gain parameter for each residual signal frame in accordance with the pitch delay and pitch predictor gain parameters for each corresponding frame; and a subroutine for quantizing each residual signal frame by pitch predictive adaptive differential pulse code modulation (PPADPCM) in accordance with the pitch delay, pitch predictor gain and quantizer gain parameters for each corresponding frame.
  • PPADPCM pitch predictive adaptive differential pulse code modulation
  • the receiver digital signal processor is adapted for processing the formatted transmission signal to synthesize reconstructed digital speech data signal samples by performing a synthesis routine that includes a subroutine for regenerating the LPC coefficients from the quantized LPC coefficients included in the transmission signal; a subroutine for decoding the quantized residual signal included in the transmission signal in accordance with the pitch delay, pitch predictor gain and quantizer gain parameters included in the transmission signal to thereby provide a decoded downsampled residual signal; a subroutine for spectrally regenerating a full-band residual signal from the decoded downsampled residual signal; a subroutine for regenerating pre-emphasized digital speech data signal samples by auto-regressively filtering the regenerated full-band residual signal in accordance with the regenerated LPC coefficients; and a subroutine for de-emphasizing the regenerated pre-emphasized samples in order to de-emphasize the high frequencies of speech, to thereby provide the reconstructed digital speech data signal samples.
  • the decoding subroutine includes a subroutine for scaling quantizer coefficients for each quantized residual signal frame in accordance with the quantizer gain parameter included in the transmission signal; a subroutine for providing data samples from the quantized residual signal included in the transmission signal in accordance with the scaled quantizer coefficients; and a subroutine for providing the decoded downsampled residual signal from the data samples by pitch excitation in accordance with the pitch delay and pitch predictor gain parameters.
  • Figure 1 is a functional block diagram illustrating the process implemented by the transmitter digital signal processor to code an input signal sample for transmission.
  • Figure 2 is a functional block diagram illustrating the process implemented by the receiver signal processor to decode a sample which is coded in accordance with the process illustrated in Figure 1.
  • Figure 3 is a flow chart of the LPC coefficient generation routine performed by the transmitter digital signal processor.
  • Figure 4 is a flow chart of the residual signal generation routine performed by the transmitter digital signal processor.
  • Figure 5 is a flow chart of the quantization routine performed by the transmitter digital signal processor.
  • Figure 6 is a diagram of a quantization filter implemented during the PPADPCM quantization subroutine included in the routine of Figure 5.
  • Figure 7 is a flow chart of the synthesis routine performed by the receiver digital signal processor.
  • the transmitter digital signal processor and receiver digital signal processor respectively are each Texas Instruments Model TMS32010 Digital Signal Processors.
  • the TMS32010 processor is a 16-bit, 200 ns cycle time, stand-alone processor with a 32-bit ALU and Accumulator.
  • the processor has a four level stack for nested subroutines; and arithmetic performance is enhanced by a hardware 16*16-bit parallel multiplier, which performs a pipelined multiply/accumulate operation in 400 ns.
  • the TMS32010 processor has 144 16-bit words available as internal RAM which may be augmented by addressing external RAM, for buffer storage, via TBLR/TBLW (table read/write) commands.
  • Program memory may be redefined as external data memory but its access time is 600 ns. External program memory may be expanded to 8K bytes at full speed.
  • the two processors must perform all operations of the RELP vocoder in real time.
  • the processor choice is constrained by two key factors: operating speed and available internal RAM (especially important because frame storage is required).
  • the TMS32010 processor is chosen based on its fast operating speed (5 MHz), data storage capabilities, and extensive development tools.
  • Digital speech data signal samples 10 are pre-emphasized 11 to improve the representation of high frequencies during the subsequent LPC analysis.
  • Pre-emphasized samples 12 are subjected to LPC analysis 13 to provide LPC reflection coefficients 14.
  • the LPC reflection coefficients 14 are quantized 15 to provide quantized LPC reflection coefficients 16.
  • the LPC reflection coefficients 14 are quantized to minimize distortion during subsequent transmission to the receiver.
  • LPC coefficients 17 are generated 18 from the quantized LPC reflection coefficients 16.
  • the pre-emphasized samples 12 are inverse filtered 19 in accordance with the LPC coefficients 17 to provide a residual signal 20.
  • the residual signal 20 is bandlimited 21 and downsampled 22 to provide a baseband residual signal 23.
  • the baseband residual signal 23 is quantized by PPADPCM quantization 24 in order to minimize the effects of distortion during subsequent transmission of the quantized residual signal 25.
  • Three of the parameters of the PPADPCM quantization 24 are pitch delay, pitch predictor gain and quantizer gain. These three parameters are generated during PPADPCM quantization 24 and are necessary to decode the quantized residual signal received by the receiver system. Accordingly, a pitch delay signal is provided on line 26, a pitch predictor gain signal is provided on line 27 and a quantizer gain signal is provided on line 28 incident to the PPADPCM quantization 24 of the baseband residual signal 23.
  • the quantized residual signal 25, the pitch delay signal on line 26, the pitch predictor gain signal on line 27, the quantizer gain signal on line 28 and the quantized LPC reflection coefficients 16 are combined linearly by formatting 32 to provide a transmission frame 34.
  • the principal functions of the receiver processor are described with reference to Figure 2.
  • the format of each received data transmission frame 36 is decoded 37 to provide the quantized residual signal 39, the pitch delay parameter 40, the pitch predictor gain parameter 41, the quantizer gain parameter 42 and the quantized LPC reflection coefficients 43.
  • the quantized residual signal 39 is decoded by PPADPCM decoding 46 in accordance with the pitch delay 40, pitch predictor gain 41 and quantizer gain 42 to provide a decoded baseband residual signal 47.
  • the decoded baseband residual signal 47 is spectrally regenerated 48 to provide a full-band residual signal 49.
  • the quantized LPC reflection coefficients 43 are processed 50 to generate the LPC coefficients 51.
  • the full-band residual signal 49 is filtered 52 in accordance with the generated LPC coefficients 51 to synthesize decoded speech data signal samples 53.
  • the decoded speech data signal samples 53 are de-emphasized 54 to provide regenerated digital speech data signal samples 55.
  • the processing routine represented by the flow chart of Figure 3 generally pertains to LPC analysis.
  • This routine generates the LPC coefficients from a buffered frame of pre-emphasized speech data signal samples.
  • the routine of Figure 4 is generally directed to generation of the residual signal; and the routine of Figure 5 is generally directed to quantization of the residual signal and the LPC coefficients.
  • the LPC analysis routine includes the subroutines of initialization 58, sample input 59, pre-emphasis 61, ACF generation 63, ACF normalization 65 and LPC analysis 66.
  • the sample input subroutine 59 reads in digital speech data signal samples from an external data memory buffer.
  • the pre-emphasis subroutine 61 applies first-order digital pre-emphasis to the input speech data signal samples.
  • the input to the algorithm is the input speech sample S and the output is the pre-emphasized speech sample S', both located in internal RAM.
  • First-order digital pre-emphasis is applied to the input speech signal to emphasize the high frequencies of speech. This leads to a more accurate estimate of the vocal tract frequency response, which is controlled by the LPC parameters.
  • Pre-emphasis uses a single-delay high-pass filter. Experimentation shows that the choice of the pre-emphasis constant (a) is not critical and it is normally set to 0.9375.
  • the difference equation for the filter is:
  • the pre-emphasis function is complemented at the receiver system by applying a de-emphasis function.
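The filter equation itself did not survive extraction. As a minimal sketch, assuming the conventional first-order form S'(n) = S(n) - a·S(n-1) for a single-delay high-pass filter, the pre-emphasis step and its complementary de-emphasis might look as follows. The function names and the zero initial state are illustrative assumptions, not taken from the patent.

```python
def pre_emphasize(samples, a=0.9375):
    """Assumed form: y[n] = x[n] - a * x[n-1], with x[-1] taken as 0."""
    out = []
    prev = 0.0
    for x in samples:
        out.append(x - a * prev)
        prev = x
    return out


def de_emphasize(samples, a=0.9375):
    """Inverse of pre_emphasize under the same assumption: y[n] = x[n] + a * y[n-1]."""
    out = []
    prev = 0.0
    for x in samples:
        prev = x + a * prev
        out.append(prev)
    return out
```

Since 0.9375 equals 15/16, the multiply can be done with shifts and a subtract on a fixed-point processor, which may be one reason such a constant is convenient.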
  • the ACF generation subroutine 63 iteratively updates a correlation buffer for each input speech data signal sample.
  • This buffer must be zeroed prior to the first call to the subroutine.
  • the output of this subroutine is a 32-bit precision auto-correlation function (ACF) for delays between zero and ten points.
  • an auto-correlation function (ACF) must be defined from a windowed buffer of pre-emphasized speech samples ($s_n$).
  • ACF auto-correlation function
  • the window ($w_n$) is chosen to be rectangular for ease of implementation.
  • a tenth-order LPC analysis requires the ACF coefficients $R_0, \ldots, R_{10}$. These coefficients may be updated iteratively for each input speech data signal sample.
  • $R_k(n+1) = R_k(n) + x_n x_{n-k}$ (Eq. 5)
  • $R_k(n)$ is the $n$th iteration of the $k$th ACF coefficient.
  • This equation is implemented by the ACF generation subroutine 63.
  • the coefficients $R_k$ are maintained with 32-bit accuracy to remove round-off error problems.
  • the algorithm is implemented by creating a delay buffer that is initialized to zero and ripples after each iteration. This implementation also ensures that the 32-bit result will not overflow.
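The per-sample ACF update with a rippling delay buffer can be sketched as below. This is an illustration only: Python integers do not overflow, so the 32-bit accumulator behaviour of the TMS32010 is not modelled, and the function names are hypothetical.

```python
def acf_update(acf, delay, x):
    """Ripple the delay buffer by one sample, then accumulate R_k += x[n] * x[n-k]."""
    delay.insert(0, x)   # new sample enters at the zero-delay position
    delay.pop()          # oldest sample falls off the end
    for k in range(len(acf)):
        acf[k] += delay[0] * delay[k]


def acf_of_frame(frame, order=10):
    """Run the iterative update over a whole frame; both buffers start at zero."""
    acf = [0] * (order + 1)
    delay = [0] * (order + 1)
    for x in frame:
        acf_update(acf, delay, x)
    return acf
```

Because the delay buffer starts at zero, the result is the rectangular-window auto-correlation of the frame, with no contribution from samples before the frame.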
  • the maximum value attained by the accumulator for a data buffer of 180 samples is:
  • upon completion of sample input, the 32-bit ACF result must be converted to 16-bit coefficients.
  • the ACF normalization subroutine 65 performs all operations required to convert the 32-bit ACF to a 16-bit result.
  • the LPC analysis subroutine 66 is transparent to a scaled ACF input. Therefore, to obtain the maximum dynamic range of the 16-bit ACF, the 32-bit results are scaled to the maximum, $R_0$, prior to truncation to 16 bits. The optimal procedure for this would be to divide all coefficients by $R_0$. However, execution efficiency is greatly improved by simply left-shifting the 32-bit numbers to remove leading zeros in the $R_0$ value.
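The shift-based normalization can be sketched as follows, assuming signed 32-bit ACF values with a positive zero-lag term and truncation to the upper 16 bits. The function name and the handling of an all-zero frame are assumptions made for illustration.

```python
def normalize_acf(acf32):
    """Left-shift every ACF value by the number of leading zeros in R0,
    then keep the upper 16 bits of each 32-bit word."""
    r0 = acf32[0]
    if r0 <= 0:
        # Silent frame: nothing to scale (assumed behaviour, not from the source).
        return [0] * len(acf32)
    # Leading zeros of r0 below the sign bit of a signed 32-bit word.
    shift = 31 - r0.bit_length()
    return [(r << shift) >> 16 for r in acf32]
```

Since no other lag can exceed the zero-lag value in magnitude, every shifted value still fits in a signed 16-bit word, and the zero-lag term always lands in the range 16384 to 32767.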
  • A decision 67 that the 32-bit correlation frame is complete enables the processor to proceed to the ACF normalization subroutine 65.
  • the LPC analysis subroutine 66 implements the Durbin algorithm to generate the ten LPC coefficients and ten LPC reflection coefficients 36.
  • the Durbin algorithm's input is the normalized 16-bit ACF.
  • the Durbin algorithm is an extremely efficient algorithm for generating the LPC coefficients. See J. Makhoul, "Linear Prediction: A Tutorial Review", Proc. IEEE, Vol. 63, pp. 561-580, 1975.
  • the algorithm is suitable for fixed-point arithmetic implementation and also generates, as a by-product, the reflection coefficients, which may be used for quantization and coding prior to transmission to the receiver.
  • the LPC coefficients may alternatively be generated by the Le Roux-Gueguen (LG) recursion, which is described in J. Le Roux and C. Gueguen, "A Fixed Point Computation of Partial Correlation Coefficients in Linear Prediction", Proc. ICASSP-77, pp. 742-743.
  • LG Le Roux-Gueguen
  • the LG recursion, although faster than the Durbin algorithm, generates only the LPC reflection coefficients and not the LPC coefficients per se, which must be generated separately.
  • $R_i$ is the $i$th auto-correlation coefficient, $k_i$ is the $i$th reflection coefficient, and $a_i^{(j)}$ is the $i$th LPC coefficient at the $j$th iteration.
  • the order of the LPC analysis, P, is determined experimentally; a 10th-order analysis is sufficient to adequately model the vocal tract frequency response.
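The recursion equations are missing from this extraction. The following floating-point sketch shows the textbook Levinson-Durbin recursion under one common sign convention, in which the predictor is s(n) ≈ Σ a_i·s(n-i). The patent's fixed-point implementation and its sign convention may differ, and the function name is illustrative.

```python
def durbin(r, order):
    """Levinson-Durbin recursion on auto-correlation values r[0..order].

    Returns (a, k): predictor coefficients a_1..a_P and reflection
    coefficients k_1..k_P. Assumes r[0] > 0.
    """
    a = [0.0] * (order + 1)   # a[0] is unused
    k = [0.0] * (order + 1)   # k[0] is unused
    err = float(r[0])         # prediction error energy E_0
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k[i] = acc / err
        new_a = a[:]
        new_a[i] = k[i]
        for j in range(1, i):
            new_a[j] = a[j] - k[i] * a[i - j]
        a = new_a
        err *= 1.0 - k[i] * k[i]
    return a[1:], k[1:]
```

For the auto-correlation of a first-order process, `durbin([1.0, 0.5, 0.25], 2)` yields a single non-zero coefficient, which illustrates how the reflection coefficients drop out as a by-product of the same loop.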
  • the LPC parameters must be quantized and coded prior to transmission and resynthesis of the digital speech data signal at the receiver.
  • the LPC coefficients, $a_i$, are sensitive to quantization noise, and quantizing them directly introduces significant distortion to the signal.
  • a solution is to quantize and code the LPC reflection coefficients, $k_i$, which are much less sensitive to quantization noise.
  • This is done by the LPC coefficient quantization subroutine 68, which is a part of the quantization routine of Figure 5.
  • the initialization subroutine 58 and the sample input subroutine 59 are both contained in the main program for the transmitter processor.
  • the main program controls the calling of the other subroutines in the LPC analysis routine of Figure 3 in accordance with the following hierarchy: pre-emphasis 61, ACF generation 63, ACF normalization 65 and LPC analysis 66.
  • the main program implements the LPC analysis routine of Figure 3 to generate a frame of a predetermined number of pre-emphasized speech data signal samples and the ten LPC coefficients.
  • LPC coefficients refers to either LPC coefficients or LPC reflection coefficients unless the latter is specified.
  • the residual signal generation routine is represented by the flow chart of Figure 4. This routine includes the subroutines of initialization 70, sample input 71, inverse filter 72, bandlimit 73 and downsample 74.
  • the initialization subroutine 70 transfers second-order section filter coefficients from external data memory to the internal RAM of the transmitter processor for use during the bandlimit subroutine 73.
  • the sample input subroutine 71 inputs the pre-emphasized samples from a speech data buffer located in the external data memory to the zero-delay position of a speech delay buffer, which is located in the internal RAM of the transmitter processor.
  • the inverse filter subroutine 72 implements an all-zero inverse filter in accordance with the LPC coefficients to generate the residual signal 20 (Figure 1).
  • the output from this subroutine 72 is provided to a residual signal data buffer which is located in the external data memory.
  • the residual signal 20 is generated by inverse filtering the pre-emphasized speech data signal samples 12 in accordance with the LPC coefficients 17. (See Figure 1).
  • the residual signal 20 is obtained by filtering the speech data signal samples 12 by the all-zero filter $H(z)$. If $x_n$ represents the input speech sample at time $n$ and $y_n$ represents the corresponding output sample, the filter can be represented by the following difference equation:
  • the simplest way to implement this structure is to place the coefficients $a_k$ in a fixed register and to implement the delay buffer using a shift register.
  • the TMS32010 micro-code is optimized to perform this operation using the LTD/MPY commands: the processor has a pipelined Multiply/Accumulate instruction that executes in 400 ns.
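The difference equation is missing from this extraction. A minimal sketch of an all-zero inverse filter with a shifting delay buffer is given below, assuming the convention y(n) = x(n) - Σ a_k·x(n-k). The sign convention and the function name are assumptions.

```python
def inverse_filter(samples, a):
    """All-zero (FIR) inverse filter: e[n] = x[n] - sum_k a[k-1] * x[n-k]."""
    order = len(a)
    delay = [0.0] * order          # holds x[n-1], ..., x[n-P]
    out = []
    for x in samples:
        pred = sum(c * d for c, d in zip(a, delay))
        out.append(x - pred)
        # Shift the delay line by one sample, newest first.
        delay = ([x] + delay)[:order]
    return out
```

Each output sample costs one multiply-accumulate per coefficient plus one data move per delay element, which is the pattern a combined load, accumulate and shift instruction is suited to.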
  • the bandlimit subroutine 73 low-pass filters the residual signal 20 by implementing an eighth-order elliptic half-band filter, which in turn is implemented by using a cascade of four second-order sections.
  • the transfer function of the elliptic filter is:
  • the second-order polynomial $H_i(z)$ is implemented by a second-order filter section.
  • the second-order section is implemented by an internal subroutine that is called four times to provide a cascade of four second-order sections.
  • a cascade of four sections is equivalent to an eighth-order elliptic low-pass filter. Each section uses a set of filter coefficients and requires its own delay buffer, which must be shifted at each iteration.
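The cascade structure can be sketched as below. The elliptic filter coefficients are not given in this extraction, so the sections here take arbitrary caller-supplied coefficients; the dictionary layout, the direct form II arrangement and the function names are illustrative assumptions.

```python
def make_section(b, a):
    """Hypothetical section record: numerator (b0, b1, b2), denominator (a1, a2),
    and the section's own two-word delay buffer."""
    return {"b": b, "a": a, "w": [0.0, 0.0]}


def run_section(sec, x):
    """One sample through one second-order section (direct form II)."""
    b0, b1, b2 = sec["b"]
    a1, a2 = sec["a"]
    w1, w2 = sec["w"]
    w0 = x - a1 * w1 - a2 * w2
    y = b0 * w0 + b1 * w1 + b2 * w2
    sec["w"] = [w0, w1]            # shift this section's delay buffer
    return y


def cascade_filter(samples, sections):
    """Feed each sample through every section in turn."""
    out = []
    for x in samples:
        for sec in sections:
            x = run_section(sec, x)
        out.append(x)
    return out
```

Calling `cascade_filter` with four sections built from the real elliptic coefficients would give the eighth-order response; the same `run_section` routine is reused for each stage, mirroring the single internal subroutine called four times.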
  • a decision 75 that the frame is complete concludes the residual signal generation routine of Figure 4.
  • the LPC coefficient quantization subroutine 68 quantizes the ten LPC reflection coefficients 14. This subroutine obtains its input data from the LPC reflection coefficients 14 and quantizer look-up subroutine 68 during the operation of the LPC analysis subroutine 66. This subroutine 68 is called by the LPC analysis subroutine 66.
  • the reflection coefficients are quantised with a variable number of bits per coefficient compatible with DOD standard LPC-10 coding, which is described in T. E. Tremain, "The
  • a data management algorithm performs buffer transfers between internal RAM and external data memory to enable all routines to execute using internal RAM memory.
  • the pitch delay subroutine 78 estimates the pitch period to determine the pitch delay parameter T of the downsampled residual signal 22 (Figure 1) used for the PPADPCM quantization, using an auto-correlation function (ACF) analysis of the signal 22.
  • the inputs to the algorithm are the partial ACF of the previous frame and the current residual signal frame.
  • the output from the algorithm is the estimated pitch delay T and the updated partial ACF.
  • the pitch delay is updated at the frame rate.
  • Pitch analysis uses a simple auto-correlation detector:
  • the pitch delay, T, is chosen as the delay that maximizes $R(T)$, evaluating $R(T)$ between $T_{min}$ and $T_{max}$. To enable an accurate estimate of the pitch delay, the analysis must cover three pitch periods, i.e., $N > 3T_{max}$.
  • The limits of the pitch detection are chosen experimentally using Fortran simulations of the
  • $R(T)$ is calculated by adding the current frame's partial-ACF, $R_2(T)$, and the previous frame's partial-ACF, $R_1(T)$, that was stored in external data memory.
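The detector equation is missing from this extraction. A minimal sketch, assuming R(T) = Σ x(n)·x(n-T) evaluated over one contiguous analysis buffer, is shown below. The patent splits the sum into partial ACFs from the previous and current frames, which this simplified version does not model; the function name and arguments are illustrative.

```python
def estimate_pitch_delay(x, t_min, t_max):
    """Return the lag T in [t_min, t_max] that maximizes
    R(T) = sum over n of x[n] * x[n - T]."""
    best_t = t_min
    best_r = float("-inf")
    for t in range(t_min, t_max + 1):
        r = sum(x[n] * x[n - t] for n in range(t, len(x)))
        if r > best_r:
            best_t, best_r = t, r
    return best_t
```

For an impulse train with a period of five samples and a buffer longer than three times the largest lag, the detector picks the fundamental period rather than one of its multiples, because each larger multiple accumulates fewer products.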
  • the pitch predictor gain subroutine 80 evaluates the pitch predictor gain parameter B for the PPADPCM quantization and updates such evaluation at the frame rate.
  • the pitch predictor gain B is evaluated as:
  • M is a single downsampled frame and T is the pitch delay. B is constrained between two limits:
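Neither the gain formula nor its limits survived extraction. The sketch below assumes the usual least-squares one-tap predictor gain, B = Σ x(n)·x(n-T) / Σ x(n-T)², and uses placeholder limits of 0 and 1 that are not taken from the patent.

```python
def pitch_predictor_gain(x, t, b_min=0.0, b_max=1.0):
    """Assumed form: B = sum x[n]*x[n-T] / sum x[n-T]^2, clamped to [b_min, b_max].

    b_min and b_max are hypothetical placeholders for the two limits.
    """
    num = sum(x[n] * x[n - t] for n in range(t, len(x)))
    den = sum(x[n - t] ** 2 for n in range(t, len(x)))
    if den == 0:
        return b_min               # no energy at this lag: fall back to the lower limit
    return min(b_max, max(b_min, num / den))
```

Clamping keeps the decoder's first-order pitch recursion from growing without bound when the estimate is unreliable.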
  • the quantizer gain subroutine 81 evaluates the quantizer gain parameter q for the PPADPCM quantization and updates such evaluation at the frame rate. This parameter is used to scale the quantizer to the input signal level; each input and output level of the quantizer is multiplied by q.
  • the parameter is chosen to be the maximum x :
  • the CRC subroutine 82 introduces an n-bit cyclic redundancy code (CRC) on part of the transmission frame to enable detection of bit errors during transmission.
  • CRC cyclic redundancy code
  • the code protects the LPC coefficients and PPADPCM parameters.
  • the input to the subroutine is the relevant quantized coefficients.
  • the output from the subroutine is an n-bit CRC to be transmitted.
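The CRC width and generator polynomial are not stated in this extraction. As an illustration only, the sketch below computes a bitwise, most-significant-bit-first CRC with zero initial value, defaulting to the common 8-bit polynomial 0x07. All parameters and function names here are assumptions.

```python
def crc_bits(data, poly=0x07, width=8):
    """Bitwise MSB-first CRC over a byte string (zero init, no reflection).

    poly and width are placeholders; width must be at least 8.
    """
    mask = (1 << width) - 1
    top = 1 << (width - 1)
    crc = 0
    for byte in data:
        crc ^= byte << (width - 8)
        for _ in range(8):
            if crc & top:
                crc = ((crc << 1) ^ poly) & mask
            else:
                crc = (crc << 1) & mask
    return crc


def frame_ok(protected, received_crc, poly=0x07, width=8):
    """Receiver-side check: recompute the CRC and compare with the received one."""
    return crc_bits(protected, poly, width) == received_crc
```

The transmitter would append `crc_bits(protected)` to the frame, and the receiver would call `frame_ok` on the same protected fields to decide whether to keep the current parameters or fall back to the previous frame's.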
  • the quantizer is embedded in the predictor loop so that the error spectrum introduced by quantization is uniform.
  • the parameters of the quantizer are the pitch delay (T), the quantizer gain (q), the pitch predictor gain (B), and the order of the quantizer (Q).
  • T the pitch delay
  • Q the quantizer gain
  • B the pitch predictor gain
  • Q the order of the quantizer
  • the data format subroutine 84 formats a data frame 34 ( Figure 1) for transmission.
  • the input to the subroutine 84 is a predetermined number of quantized residual signal samples 25, the pitch delay parameter 26, the pitch predictor gain 27, the quantizer gain 28, the quantized LPC coefficients 31 (Figure 1) and the CRC.
  • the output from the subroutine 84 is a transmission data frame 34 which is placed in the output buffer.
  • a decision 85 that the frame is complete concludes the quantization routine of Figure 5.
  • the calling hierarchy of the subroutines in the quantization routine of Figure 5 is under the control of the main program.
  • the following subroutines are integrated together in a subroutine designated PPQNT: pitch predictor gain 80, quantizer gain 81 and PPADPCM quantization 83.
  • the calling hierarchy is as follows: pitch delay 78, PPQNT, CRC 82 and data format 84.
  • the subroutine 68 is called by the LPC analysis subroutine 66 in the LPC analysis routine of Figure 3.
  • the receiver digital signal processor utilizes a synthesis processing routine.
  • the synthesis routine includes the following subroutines: initialization 88, data input 89, CRC check 90, LPC coefficient generation 91, PPADPCM decoding 92, spectral regeneration 93, LPC synthesis filter 94, de-emphasis 95, and speech output 97.
  • the initialization subroutine 88 is included in the main program for the receiver processor. The initialization subroutine 88 initializes all registers and data locations within the processor prior to the execution of each subroutine.
  • the data input subroutine 89 also is included in the main program for the receiver processor. This subroutine inputs the data transmission frame 36 received from the transmitter by inputting the frame from a frame buffer in external data memory.
  • the CRC check subroutine 90 uses the received transmission data frame to generate an n-bit CRC which it compares to the n-bit CRC in the received transmission data frame to check for transmission errors. If any errors are detected, a subset of the LPC and PPADPCM parameters for the current frame are discarded and a subset of the previous frame's parameters are substituted.
  • the input to this subroutine is an n-bit CRC word from the data transmission frame.
  • the output from this subroutine is a flag indicating which set of parameters to use during the rest of the subroutine.
  • the LPC coefficient generation subroutine 91 reads in the transmitted quantized LPC parameters, calls a subroutine, IQRC, to decode the LPC reflection coefficients, and performs a step-up algorithm to transform the LPC reflection coefficients to the LPC coefficients.
  • the input to this subroutine is the transmitted quantized LPC reflection coefficients 43 and the output is the LPC coefficients 51 (Figure 2).
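The step-up transformation is not spelled out in this extraction. A sketch of the textbook recursion is shown below, under the sign convention in which the order-i predictor is updated as a_j ← a_j - k_i·a_(i-j) with a_i = k_i. The patent's own convention may differ, and the function name is illustrative.

```python
def step_up(k):
    """Convert reflection coefficients k_1..k_P into predictor coefficients a_1..a_P."""
    a = []
    for i, ki in enumerate(k):
        # Update the existing i coefficients, then append the new highest-order one.
        a = [a[j] - ki * a[i - 1 - j] for j in range(i)] + [ki]
    return a
```

This is the coefficient-update half of the Durbin recursion run on its own, which is why the receiver can rebuild the LPC coefficients from the transmitted reflection coefficients without access to the auto-correlation values.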
  • the PPADPCM decoding subroutine 92 reads in the bit-packed quantized residual signal 39 and quantizer parameters 40, 41, 42 received from the transmitter and generates a decoded baseband (downsampled) residual signal 47 (Figure 2).
  • This subroutine 92 must perform the inverse operation of the transmitter's PPADPCM coding. It therefore divides into three parts: unpacking, quantizer look-up, and pitch excitation.
  • the PPADPCM decoding subroutine 92 first transfers the PPADPCM quantizer coefficients to internal RAM and scales them using the quantizer gain parameter.
  • the inputs to this operation are the coefficient buffer stored in external data memory and the quantizer gain.
  • the output of this operation is the scaled look-up table located in internal RAM.
  • This subroutine 92 next reads in packed data bytes from a data buffer in external data memory, unpacks each byte, and decodes the data samples using the quantizer look-up table.
  • the input to this operation is the bit-packed data word and the quantizer coefficient table.
  • the output from this operation is the set of decoded data samples.
  • the received data bytes are unpacked into individual data samples by masking off each individual data sample, which may then be decoded using the quantizer look-up table that is identical to the one used at the transmitter to quantize the data samples.
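The table scaling, unpacking and look-up steps described above can be sketched as follows. The number of bits per sample (2), the MSB-first packing order, the table values used in the usage note, and the function names are illustrative assumptions; the text does not state them.

```python
def scale_table(coefficients, gain):
    """Scale the quantizer coefficients by the quantizer gain."""
    return [gain * c for c in coefficients]


def unpack_bytes(packed: bytes, bits: int = 2):
    """Split each byte into 8 // bits codes by shifting and masking (MSB first)."""
    mask = (1 << bits) - 1
    per_byte = 8 // bits
    codes = []
    for byte in packed:
        for slot in range(per_byte):
            shift = 8 - bits * (slot + 1)
            codes.append((byte >> shift) & mask)
    return codes


def decode_samples(packed: bytes, coefficients, gain, bits: int = 2):
    """Unpack the codes and map each one through the scaled look-up table."""
    table = scale_table(coefficients, gain)
    return [table[code] for code in unpack_bytes(packed, bits)]
```

With a hypothetical four-level table `[-1.5, -0.5, 0.5, 1.5]` and a gain of 2.0, the byte `0b00011011` decodes to the four samples -3.0, -1.0, 1.0 and 3.0.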
  • the PPADPCM decoding subroutine 92 then implements a variable delay first-order difference equation to "pitch excite" the input data and recover the downsampled residual signal 47.
  • the input to this operation is the transmitted data sample, the pitch delay parameter and the pitch predictor gain parameter.
  • the output from this operation is the downsampled residual signal 47.
  • the difference equation for this operation is: $S(n) = x(n) + B \, S(n - T)$, where:
  • S is the downsampled residual signal sample
  • x is the transmitted data sample
  • B is the pitch predictor gain
  • T is the current frame's pitch delay (period).
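The pitch excitation step can be sketched as below, assuming the recursion S(n) = x(n) + B·S(n − T). That form is a reading of the symbol definitions above (a first-order recursion with a variable delay T) and is an assumption, as are the function name `pitch_excite` and the `history` argument used to carry past output samples across frames.

```python
def pitch_excite(x, gain, delay, history=None):
    """Recover residual samples with S[n] = x[n] + gain * S[n - delay].

    x       -- decoded transmitted data samples for the frame
    gain    -- pitch predictor gain B
    delay   -- pitch delay T in samples (>= 1)
    history -- at least `delay` past output samples, oldest first
               (zeros are assumed when omitted)
    """
    if delay < 1:
        raise ValueError("delay must be at least 1 sample")
    s = list(history) if history is not None else [0.0] * delay
    if len(s) < delay:
        raise ValueError("history must hold at least `delay` samples")
    start = len(s)
    for sample in x:
        # s[-delay] is the output produced `delay` samples ago
        s.append(sample + gain * s[-delay])
    return s[start:]
```

A single unit impulse with B = 0.5 and T = 2 produces decaying pulses spaced two samples apart, which is the periodic structure the pitch predictor removed at the transmitter.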
  • the spectral regeneration subroutine 93 is included in the main program for the receiver processor.
  • the spectral regeneration subroutine 93 generates a full-band residual signal 49 from downsampled residual signal 47. The effect is to convert a 4 kHz downsampled signal 47 to an 8 kHz full-band signal 49.
  • the LPC synthesis filter subroutine 94 implements an autoregressive LPC synthesis filter governed by the LPC coefficients.
  • the inputs to this subroutine are the LPC coefficients 51 and the regenerated full-band residual signal 49.
  • the output from this subroutine is the regenerated pre-emphasized speech data signal sample 53.
  • This subroutine 94 generates the speech data signal samples 53 by filtering the residual signal 49 with a tenth-order all-pole filter.
  • the filter is governed by the generated LPC coefficients 51.
  • the transfer function of the filter is:
  • the filter operation can be represented by the following difference equation:
  • the simplest way to implement this equation is to place the coefficients $a_i$ in a fixed register and to implement the delay buffer using a shift register.
  • the TMS32010 micro-code is optimized to perform this operation using the LTD/MPY commands; the processor has a pipelined multiply-accumulate instruction that executes in 400 ns.
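The fixed coefficient array and shift-register delay buffer can be sketched in Python as follows. The sign convention is an assumption: the sketch takes H(z) = 1 / (1 + a_1 z^-1 + ... + a_p z^-p), so that y(n) = e(n) − Σ a_i·y(n − i); the original equations may use the opposite sign. The function name `lpc_synthesis` and the `state` argument are hypothetical.

```python
def lpc_synthesis(residual, coeffs, state=None):
    """All-pole synthesis filter: y[n] = e[n] - sum_i coeffs[i-1] * y[n-i].

    residual -- full-band residual samples e[n]
    coeffs   -- predictor coefficients a_1..a_p (p = 10 in the text)
    state    -- previous outputs, most recent first (zeros if omitted)
    Returns (output samples, updated state) so frames can be chained.
    """
    order = len(coeffs)
    delay = list(state) if state is not None else [0.0] * order
    out = []
    for e in residual:
        acc = e
        for a, past in zip(coeffs, delay):
            acc -= a * past              # one multiply-accumulate per tap
        delay = ([acc] + delay)[:order]  # shift the delay line by one sample
        out.append(acc)
    return out, delay
```

The list shift stands in for the shift-register delay buffer; on the TMS32010 the data move is folded into the multiply-accumulate loop rather than performed as a separate pass.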
  • the de-emphasis subroutine 95 implements a first-order digital de-emphasis filter.
  • the inputs to this subroutine are the current regenerated sample 53, the previous regenerated sample, and the pre-emphasis constant.
  • the output from this subroutine is the regenerated speech data signal sample 55.
  • First-order digital de-emphasis is applied to complement the pre-emphasis function in the transmitter processor.
  • De-emphasis uses a single-delay low-pass filter.
  • the de-emphasis constant (A) is also set to 0.9375.
  • the difference equation for the filter is:
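A first-order de-emphasis filter of this kind can be sketched as below, assuming the recursion y(n) = x(n) + A·y(n − 1) with A = 0.9375. That form is an assumption consistent with a single-delay low-pass filter that undoes a pre-emphasis of the form y(n) = x(n) − A·x(n − 1); the transmitter-side `pre_emphasize` function is likewise an assumption, included only to show that the two filters cancel.

```python
DE_EMPHASIS_A = 0.9375  # constant A from the text (15/16)


def de_emphasize(samples, a=DE_EMPHASIS_A, previous=0.0):
    """Single-delay low-pass: y[n] = x[n] + a * y[n-1]."""
    out = []
    for x in samples:
        previous = x + a * previous
        out.append(previous)
    return out


def pre_emphasize(samples, a=DE_EMPHASIS_A, previous=0.0):
    """Assumed transmitter-side counterpart: y[n] = x[n] - a * x[n-1]."""
    out = []
    for x in samples:
        out.append(x - a * previous)
        previous = x
    return out
```

Applying `de_emphasize` to the output of `pre_emphasize` returns the original samples, which is the sense in which the receiver filter complements the transmitter filter.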
  • the speech output subroutine 97 also is included in the main program for the receiver processor. This subroutine outputs the regenerated speech data signal samples to a data buffer in external data memory from which the samples are provided.
  • the main program for the receiver processor calls the following subroutines in the following order: CRC check 90, LPC coefficient generation 91, PPADPCM decoding 92, LPC synthesis filter 94 and de-emphasis 95.
  • Transmitter and receiver systems that are commonly located may be included in a single digital processor.

EP19850905709 1984-11-02 1985-11-01 RELP vocoder for digital signal processors. Withdrawn EP0203940A4 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US667446 1984-11-01
US66744684A 1984-11-02 1984-11-02

Publications (2)

Publication Number Publication Date
EP0203940A1 EP0203940A1 (fr) 1986-12-10
EP0203940A4 true EP0203940A4 (fr) 1987-04-07

Family

ID=24678262

Family Applications (1)

Application Number Title Priority Date Filing Date
EP19850905709 Withdrawn EP0203940A4 (fr) 1984-11-02 1985-11-01 RELP vocoder for digital signal processors.

Country Status (7)

Country Link
EP (1) EP0203940A4 (fr)
JP (1) JPS63500896A (fr)
AU (1) AU577641B2 (fr)
CA (1) CA1240396A (fr)
DK (1) DK311386A (fr)
NO (1) NO862602L (fr)
WO (1) WO1986002726A1 (fr)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4675863A (en) * 1985-03-20 1987-06-23 International Mobile Machines Corp. Subscriber RF telephone system for providing multiple speech and/or data signals simultaneously over either a single or a plurality of RF channels
JP2626223B2 (ja) * 1990-09-26 1997-07-02 日本電気株式会社 音声符号化装置
US6006174A (en) 1990-10-03 1999-12-21 Interdigital Technology Corporation Multiple impulse excitation speech encoder and decoder
US5235670A (en) * 1990-10-03 1993-08-10 Interdigital Patents Corporation Multiple impulse excitation speech encoder and decoder
ES2143396B1 (es) * 1998-02-04 2000-12-16 Univ Malaga Circuito integrado monolitico codec-encriptador de baja tasa para señales de voz.
US7907977B2 (en) 2007-10-02 2011-03-15 Agere Systems Inc. Echo canceller with correlation using pre-whitened data values received by downlink codec

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3431362A (en) * 1966-04-22 1969-03-04 Bell Telephone Labor Inc Voice-excited,bandwidth reduction system employing pitch frequency pulses generated by unencoded baseband signal
US3750024A (en) * 1971-06-16 1973-07-31 Itt Corp Nutley Narrow band digital speech communication system
GB2102254B (en) * 1981-05-11 1985-08-07 Kokusai Denshin Denwa Co Ltd A speech analysis-synthesis system

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ICASSP 84 - IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 19th-21st March 1984, San Diego, US, vol. 2, pages 27.8.1-27.8.4, IEEE, New York, US; M. DANKBERG et al.: "Implementation of the RELP Vocoder using the TMS320" *
ICASSP 85 - IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 26th-29th March 1985, Tampa, US, vol. 3, pages 969-972, IEEE, New York, US; R.L. ZINSER: "An efficient pitch-aligned high-frequency regeneration technique for RELP Vocoders" *
See also references of WO8602726A1 *

Also Published As

Publication number Publication date
AU577641B2 (en) 1988-09-29
NO862602D0 (no) 1986-06-27
DK311386D0 (da) 1986-06-30
CA1240396A (fr) 1988-08-09
AU5019885A (en) 1986-05-15
WO1986002726A1 (fr) 1986-05-09
DK311386A (da) 1986-06-30
EP0203940A1 (fr) 1986-12-10
JPS63500896A (ja) 1988-03-31
NO862602L (no) 1986-09-01

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 19860703

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE CH DE FR GB IT LI LU NL SE

A4 Supplementary search report drawn up and despatched

Effective date: 19870407

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: HUGHES NETWORK SYSTEMS, INC. (A DELAWARE CORPORAT

Owner name: M/A-COM GOVERNMENT SYSTEMS, INC.

17Q First examination report despatched

Effective date: 19881207

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 19890418

RIN1 Information on inventor provided before grant (corrected)

Inventor name: WILSON, PHILIP, JOHN