CA2137880A1 - Speech coding apparatus - Google Patents

Speech coding apparatus

Info

Publication number
CA2137880A1
Authority
CA
Canada
Prior art keywords
speech
correlation
auto
code
calculating
Prior art date
Legal status
Abandoned
Application number
CA002137880A
Other languages
French (fr)
Inventor
Keiichi Funaki
Kazunori Ozawa
Current Assignee
NEC Corp
Original Assignee
Keiichi Funaki
Kazunori Ozawa
Nec Corporation
Priority date
Filing date
Publication date
Application filed by Keiichi Funaki, Kazunori Ozawa, NEC Corporation
Publication of CA2137880A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/90 Pitch determination of speech signals
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Analysis-synthesis using predictive techniques
    • G10L19/08 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
    • G10L19/12 Excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders
    • G10L2019/0001 Codebooks
    • G10L2019/0002 Codebook adaptations
    • G10L2019/0013 Codebook search algorithms


Abstract

A speech coding apparatus, in which a short-term prediction parameter is extracted from an input speech signal and an optimum code vector is searched in a code book by minimizing an evaluation function based on auto-correlation functions of a speech source and of an impulse response of a synthesis filter generated with the short-term parameter, so as to transmit an index of the decided code vector and a gain combined with a spectrum parameter and a pitch parameter, is constituted such that the auto-correlations are calculated with an approximation order set to a value Ismall less than the length of the impulse response of the synthesis filter, to reduce the quantity of product and sum operations.

Description

SPECIFICATION

TITLE OF THE INVENTION
SPEECH CODING APPARATUS
BACKGROUND OF THE INVENTION
Field of the Invention This invention relates to speech coding systems, and more particularly, to a speech coding apparatus for encoding a speech signal with high quality at a low bit rate, especially at 8-4 kb/s.
Related Art Recently it has been of urgent necessity to implement digitalized systems of such devices as automotive mobile phones or cordless phones employing radio communication.
Since the available frequency band is narrow in radio communication, it is important to develop a system in which a speech signal is coded very efficiently, in a high quality condition at a low bit rate, by compressing the speech signal into a smaller number of bits.
CELP (Code Excited LPC Coding), as described in a paper titled "Code-excited linear prediction: High quality speech at low bit rates" by M. Schroeder and B. S. Atal (ICASSP
Proc. '85, pp. 937-940, 1985; hereinafter referred to as the "reference No. 1"), is known as a coding system in which a speech signal is coded at a low bit rate of 8-4 kb/s.
In this method, an encoding process is carried out at the transmission side in the following procedure. First, for every frame (for example, 20 ms), spectrum parameters representing frequency characteristics of the speech signal are extracted (short-term prediction).
Then each frame is subdivided into narrower subframes (for example, 5 ms). In every subframe, a pitch parameter representing a wide-interval correlation (pitch correlation) is extracted from past speech source signals, and the long-term prediction of the speech signal in the subframe is carried out with the pitch parameter.
Next, a code vector and a gain are decided which minimize the error power between a residual signal obtained by the long-term prediction and a synthesized signal generated using a code vector selected from pre-prepared types of quantization codes (noise signals). The index representing the type of the decided code vector, the decided gain, the spectrum parameter, and the pitch parameter are transmitted.
More specifically, in search of a quantization code, the following procedure is employed. First, a signal z[n] is derived by applying to an input speech signal x[n] a weighting for compensation of the auditory sense and subtracting a past influence signal.
Next, a synthesized signal H e_j[n] is calculated by driving, with the code vector e_j[n] of a quantization code j, a synthesis filter H composed of the spectrum parameters obtained by the short-term prediction, quantized, and inversely quantized.
Then, a quantization code j which minimizes Ej representing an error energy between the signal z[n] and the synthesized signal Hej[n], as defined in the following expression, is obtained.

    E_j = Σ_{n=0..Ns-1} (z[n] - H e_j[n])^2    (1)

In the above expression (1), Ns indicates the length of the subframe and H indicates a matrix implementing the synthesis filter. For practical use, the expression (1) is expanded as follows:

    E_j = Σ_{n=0..Ns-1} z[n]^2 - C_j^2 / G_j    (2)

The numerator C_j of the second term in the above expression (2) is a cross-correlation and the denominator G_j is an auto-correlation; they are calculated with the following expressions (3) and (4) respectively.

    C_j = Σ_{n=0..Ns-1} z[n] · H e_j[n]    (3)

    G_j = Σ_{n=0..Ns-1} (H e_j[n])^2    (4)

The above auto-correlation and cross-correlation are calculated after H e_j[n] is obtained by driving the synthesis filter (i.e. filtering). In this case, the number of filtering operations carried out is equal to the size of the code book. Therefore, the quantity of operations, that is, the number of product and sum operations (multiply and add operations) for processing one frame, becomes vast, as seen from the following expression:

    (M·N + N + N) · 2^B    (5)

where M denotes the order of the synthesis filter, N denotes the length of the frame, and B denotes the number of bits of the speech source.
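The procedure of expressions (1)-(4), with one filtering per code vector as counted by expression (5), can be sketched as follows. This is an illustrative reconstruction in plain Python, not the patent's implementation; the function names and the tiny code book are ours.

```python
# Illustrative sketch of the exhaustive search of expressions (1)-(4).
# For every code vector e_j the synthesis filter is driven (one filtering
# per code), then E_j = sum(z^2) - C_j^2 / G_j is minimized over j.

def synthesize(h, e):
    """Drive the synthesis filter: (H e)[n] = sum_k h[k] * e[n-k]."""
    n = len(e)
    return [sum(h[k] * e[i - k] for k in range(min(i + 1, len(h))))
            for i in range(n)]

def search_codebook(z, h, codebook):
    """Return the index j and error E_j minimizing expression (2)."""
    energy_z = sum(v * v for v in z)
    best_j, best_err = -1, float("inf")
    for j, e in enumerate(codebook):
        s = synthesize(h, e)                      # H e_j[n]
        c = sum(zi * si for zi, si in zip(z, s))  # cross-correlation (3)
        g = sum(si * si for si in s)              # auto-correlation (4)
        if g > 0.0 and energy_z - c * c / g < best_err:
            best_j, best_err = j, energy_z - c * c / g
    return best_j, best_err
```

Since `synthesize` costs on the order of M·Ns multiply-adds per code vector, running it for all 2^B entries of the code book is exactly what expression (5) counts.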
A method for calculating a cross-correlation with an inverse filtering and calculating an auto-correlation with an auto-correlation approximation method, as described in a paper titled "EFFICIENT PROCEDURES FOR FINDING THE OPTIMUM
INNOVATION IN STOCHASTIC CODERS" by I. M. Trancoso and B. S.
Atal (ICASSP Proc., p. 2375, 1986; hereinafter referred to as the "reference No. 2"), is well known as a method to obtain a code with a reduced quantity of operations.
In this method, a cross-correlation and an auto-correlation are derived as follows. In the case of calculating the cross-correlation, a value given by the following expression is calculated first. This process is referred to as an inverse filtering.

    H z[n] = Σ_{i=n..Ns-1} h[i-n] · z[i]    (6)

In the above expression (6), h[n] represents the impulse response of the synthesis filter.
The cross-correlation is calculated with the following expression using the value obtained from the above expression (6).

    C_j = Σ_{n=0..Ns-1} H z[n] · e_j[n]    (7)

In this case, the filtering process is carried out only once, in calculating the impulse response of the synthesis filter and in the above expression (6), so that the quantity of product and sum operations in each frame for calculating the cross-correlation is given by the following expression (8):

    M·I·sf + (Ns - I + 1)·I·sf + (1/2)·I·(I+1)·sf + N·2^B    (8)

where sf denotes the number of subframes in a frame.
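The equivalence behind expressions (6) and (7) can be checked directly: correlating the inverse-filtered target with a code vector gives exactly the same value as filtering the code vector and correlating with the target, while the inverse filtering is performed only once per subframe rather than once per code. The sketch below is our own illustration (hypothetical names, plain Python):

```python
# Inverse filtering of expression (6): backward-filter the target z with
# the (truncated) impulse response h, once per subframe.
def inverse_filter(h, z):
    n = len(z)
    return [sum(h[i - m] * z[i] for i in range(m, min(n, m + len(h))))
            for m in range(n)]

def cross_correlation_direct(h, z, e):
    # Expression (3): filter e through the synthesis filter, correlate with z.
    s = [sum(h[k] * e[i - k] for k in range(min(i + 1, len(h))))
         for i in range(len(e))]
    return sum(zi * si for zi, si in zip(z, s))

def cross_correlation_fast(hz, e):
    # Expression (7): correlate the precomputed inverse-filtered target with e.
    return sum(a * b for a, b in zip(hz, e))
```

Per code, expression (7) costs only Ns multiply-adds, versus roughly I·Ns for the direct form; this saving is what expression (8) counts.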
The auto-correlation function is calculated with the following approximation expression (9), as described in the reference No. 2:

    G_j ≈ hh[0] · R_j[0] + 2 Σ_{i=1..I-1} hh[i] · R_j[i]    (9)

where hh[i] indicates the ith-order auto-correlation function of the impulse response of the synthesis filter, R_j[i] indicates the ith-order auto-correlation function of the code vector e_j[n], and I indicates the order of the impulse response of the synthesis filter.
The order I is usually set to a value of 21 or so in consideration of the attenuation of the impulse response of the synthesis filter. The transfer function of the synthesis filter is generally represented as an all-pole type 1/A(z); however, it is approximated with an impulse response of limited order (for example, 21) to reduce the quantity of operations.
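The identity underlying approximation (9) can be checked numerically. For the full (untruncated) convolution of h and e_j, the energy satisfies Σ_n (h*e_j)[n]^2 = hh[0]·R_j[0] + 2 Σ_{i≥1} hh[i]·R_j[i] exactly; expression (9) applies this identity to the subframe-truncated synthesis output, which is where the approximation comes in. The following sketch (our illustration, plain Python) verifies the exact case:

```python
def autocorr(x, order):
    """r[i] = sum_n x[n] * x[n+i], for lags i = 0 .. order-1."""
    return [sum(x[n] * x[n + i] for n in range(len(x) - i))
            for i in range(order)]

def full_convolution(h, e):
    """Untruncated convolution h*e of length len(h)+len(e)-1."""
    out = [0.0] * (len(h) + len(e) - 1)
    for k, hk in enumerate(h):
        for m, em in enumerate(e):
            out[k + m] += hk * em
    return out

def g_approx(hh, r):
    """Expression (9): G_j ~ hh[0]*R_j[0] + 2 * sum_{i>=1} hh[i]*R_j[i]."""
    order = min(len(hh), len(r))
    return hh[0] * r[0] + 2.0 * sum(hh[i] * r[i] for i in range(1, order))
```

For the truncated output of a real subframe the two sides differ slightly at the subframe boundary, which is the approximation error accepted by the reference No. 2 method.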
By calculating beforehand the auto-correlation function of the impulse response and storing, for every speech source, its auto-correlation function in a data ROM, the auto-correlation may be calculated with a smaller quantity of operations, without the filtering process. In this method, the auto-correlation is calculated with a quantity of product and sum operations as given by the following expression.

    (1/2)·I·(I+1)·sf + I·2^B·sf    (10)

Accordingly, with the above expressions (7) and (9), the total quantity of product and sum operations is given by the following expression (11).

    M·I·sf + (Ns - I + 1)·I·sf + (1/2)·I·(I+1)·sf + N·2^B + (1/2)·I·(I+1)·sf + I·2^B·sf    (11)

Under the condition that the length of the frame N = 240, the length of the subframe Ns = 60, the number of subframes sf = 4, the length of the impulse response I = 21, the order of the synthesis filter M = 10, and the size of the code book B = 7 (bits), the quantity of product and sum operations by the conventional method, given by the above expression (5), and that by the approximation method described in the reference No. 2, in which the quantities of product and sum operations of the cross-correlation and the auto-correlation are given by the above expressions (8) and (10) respectively, are as listed in Table 1:

Table 1  Comparison of the quantity of product and sum operations

    Method                                     Expression                  Quantity of product
                                                                           and sum operations
    Filtering for each code vector             expression (5)              12.288 (MOPS)
    Approximation method in reference No. 2    expression (8) + (10)       1.584 (MOPS)

In the above Table 1, MOPS (Million Operations Per Second) indicates the quantity of product and sum operations per second (in units of one million).
As may be seen from Table 1, the approximation method described in the reference No.2 fairly reduces the quantity of product and sum operations as compared with the conventional method (i.e. by nearly one order of magnitude).
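The entries of Table 1 can be reproduced from expressions (5), (8) and (10). The arithmetic sketch below is ours; it assumes an 8 kHz sampling rate, so that 8000/N frames are processed per second:

```python
# Operation counts per frame under the conditions listed above Table 1.
M, N, Ns, sf, I, B = 10, 240, 60, 4, 21, 7
fs = 8000                    # assumed sampling rate; frames per second = fs / N

ops_filtering = (M * N + N + N) * 2 ** B                        # expression (5)
ops_cross = (M * I * sf + (Ns - I + 1) * I * sf
             + I * (I + 1) // 2 * sf + N * 2 ** B)              # expression (8)
ops_auto = I * (I + 1) // 2 * sf + I * 2 ** B * sf              # expression (10)

mops = lambda ops: ops * fs / N / 1e6   # per-frame ops -> millions per second
print(mops(ops_filtering))              # 12.288
print(mops(ops_cross + ops_auto))       # 1.584
```

Both printed figures match Table 1, which supports the stated near-order-of-magnitude reduction.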
However, with the above-described approximation method there still remains a large quantity of operations even after the reduction, so that only limited types of processors, such as those having a large computational power, can carry out such a quantity of operations in a real-time processing environment.
In addition, the above-described approximation method reduces the quantity of operations in the case of searching the speech source code book in the data ROM which contains the auto-correlation functions of the speech sources. However, the additional number of product and sum operations given by the following expression is needed in the case of searching a code book, such as an adaptive code book, in which the speech source to be coded varies for every subframe, so that the auto-correlation function must be calculated for each code.

    (Ns·I - (1/2)·I·(I-1)) · 2^B · sf    (12)

Under the same conditions as those employed in Table 1, the above expression (12) gives a value of 17.92 (MOPS), and a larger quantity of operations is needed than in the conventional method in which the filtering is carried out for each code vector. As a result, the approximation method described in the reference No. 2 cannot be employed, and it is difficult to carry out the operations in real-time processing.
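Expression (12) can be checked numerically under the same conditions (again our own arithmetic sketch, assuming an 8 kHz sampling rate):

```python
# Extra per-frame cost of computing R_j[i] afresh for every adaptive-code-book
# entry: each code needs sum_{i=0..I-1}(Ns - i) = Ns*I - I*(I-1)/2 multiply-adds.
Ns, I, B, sf, N, fs = 60, 21, 7, 4, 240, 8000

ops_adaptive = (Ns * I - I * (I - 1) // 2) * 2 ** B * sf   # expression (12)
print(ops_adaptive * fs / N / 1e6)                         # 17.92
```

At 17.92 MOPS this term alone exceeds the 12.288 MOPS of the conventional per-code filtering, which is why the reference No. 2 method fails for the adaptive code book.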
SUMMARY OF THE INVENTION

In view of the above-mentioned drawbacks of the prior art techniques, it is an object of the present invention to provide a speech coding system with a good tone quality and a smaller quantity of operations even at 4 kb/s.

For accomplishing the above described object, the present invention provides a speech coding apparatus comprising:
a speech analyzing unit for deciding, in every predetermined interval of a speech signal, codes of short-term prediction parameters representing frequency characteristics of the speech signal,
an impulse response calculating unit for calculating an impulse response of a speech synthesis filter generated with the short-term prediction parameters,
an inverse filter unit for filtering the speech signal inversely with the impulse response,
an adaptive code book for storing an input signal fed to the speech synthesis filter generated within a past speech coding interval,
a long-term prediction speech source generating unit for generating from the adaptive code book a long-term prediction speech source representing a pitch correlation of the speech signal,
a cross-correlation calculating unit for calculating a cross-correlation between the speech signal and an output signal of the speech synthesis filter fed with said long-term prediction speech source as an input,
an auto-correlation calculating unit of an impulse response for calculating an auto-correlation of the impulse response of the speech synthesis filter to an order of Ismall being less than a length of said impulse response,
an auto-correlation calculating unit of a long-term prediction speech source for calculating an auto-correlation of the long-term prediction speech source to the order Ismall less than the length of said impulse response,
an auto-correlation calculating unit for calculating an auto-correlation of said output signal to the order Ismall less than the length of said impulse response from the two types of said auto-correlation functions,
an evaluation function calculating unit for calculating an error energy with the results of said auto-correlation and said cross-correlation,
an optimum code deciding unit for deciding an optimum long-term prediction code with said evaluation function,
a speech source code book comprising speech source signals and quantization codes indicating residual signals after the long-term prediction, and
a speech source code book searching unit for deciding an optimum quantization code from said speech source code book.
One of the features of the present invention is that the auto-correlations are calculated with an auto-correlation function approximation order set to a value Ismall less than the length of the impulse response of the synthesis filter.


The present invention in the second aspect provides a speech coding apparatus comprising:
a speech analyzing unit for deciding, in every predetermined interval of a speech signal, codes of short-term prediction parameters representing frequency characteristics of the speech signal,
an adaptive code book for storing an input signal fed to a speech synthesis filter generated in a past speech coding interval,
an adaptive code book searching unit for deciding an optimum code from said adaptive code book,
an impulse response calculating unit for calculating an impulse response of said speech synthesis filter generated from said short-term prediction parameters,
an auto-correlation function calculating unit of an impulse response for calculating an auto-correlation function of the impulse response of said speech synthesis filter to an order of Ismall less than the length of the impulse response,
a speech source code book comprising speech source signals and quantization codes indicating residual signals after the long-term prediction,
a code vector generating unit for generating a code vector from said speech source code book,
an auto-correlation function calculating unit of a code vector for obtaining an auto-correlation function of said code vector to the order of Ismall less than the length of said impulse response,
an inverse filter unit for inversely filtering said speech signal with said impulse response,
a cross-correlation calculating unit for calculating a cross-correlation between said speech signal and an output signal of said speech synthesis filter, with said code vector being fed to said speech synthesis filter as an input,
an auto-correlation calculating unit for calculating an auto-correlation of said output signal to the order Ismall less than the length of said impulse response from the two types of said auto-correlation functions,
an evaluation function calculating unit for calculating an error energy with said auto-correlation and said cross-correlation, and
an optimum code deciding unit for deciding an optimum code vector with said evaluation function.
The present invention in the third aspect provides a speech coding apparatus comprising:
a speech analyzing unit for deciding, in every predetermined interval of a speech signal, codes of short-term prediction parameters representing frequency characteristics of the speech signal,
an impulse response calculating unit for calculating an impulse response of a speech synthesis filter generated with said short-term prediction parameters,
an inverse filter unit for inversely filtering said speech signal with said impulse response,
an adaptive code book for storing an input signal fed to said speech synthesis filter generated in a past speech coding interval,
a long-term prediction speech source generating unit for generating from said adaptive code book a long-term prediction speech source representing a pitch correlation of said speech signal,
a cross-correlation calculating unit for calculating a cross-correlation between said speech signal and an output signal of said speech synthesis filter with said long-term prediction speech source being fed to said speech synthesis filter as an input,
an optimum code deciding unit for deciding an optimum long-term prediction code based on said cross-correlation,
a speech source code book comprising speech source signals and quantization codes indicating residual signals after the long-term prediction, and
a speech source code book searching unit for deciding an optimum quantization code with said speech source code book.

.

The present invention in the fourth aspect provides a speech coding apparatus comprising:
a speech analyzing unit for deciding, in every predetermined interval of a speech signal, codes of short-term prediction parameters representing frequency characteristics of the speech signal,
an adaptive code book for storing an input signal fed to a speech synthesis filter generated in a past speech coding interval,
an adaptive code book searching unit for deciding an optimum code from said adaptive code book,
an impulse response calculating unit for calculating an impulse response of said speech synthesis filter generated with said short-term prediction parameters,
a speech source code book comprising speech source signals and quantization codes indicating residual signals after the long-term prediction,
a code vector generating unit for generating a code vector from said speech source code book,
an inverse filter unit for inversely filtering said speech signal with said impulse response,
a cross-correlation calculating unit for calculating a cross-correlation between said speech signal and an output signal of said speech synthesis filter with said code vectors being fed to said speech synthesis filter as an input, and
an optimum code deciding unit for deciding an optimum code vector with said cross-correlation.
The present invention, in the above first and second aspects, preferably further comprises an approximation order deciding unit for deciding the order Ismall used for calculating an auto-correlation for each interval of the speech signal to be coded.
According to the present invention, using, preferably in the CELP method, an auto-correlation approximation order Ismall smaller than the length I of the impulse response of the synthesis filter accomplishes a significant reduction in the quantity of product and sum operations in the calculation of an auto-correlation, reduces the quantity of product and sum operations for the auto-correlation of each code, and also prevents the tone quality from being degraded.
In addition, the present invention quickly obtains an auto-correlation function of a code vector by table lookup of a speech source auto-correlation code book, in which auto-correlation values of the speech source code book are stored beforehand; reduces the quantity of operations in the calculation of an auto-correlation by using an approximation order Ismall smaller than the length of the impulse response; reduces the number of auto-correlation functions of impulse responses of the synthesis filter; and reduces the memory capacity of the ROM for the speech source auto-correlation code book.
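The table-lookup idea can be sketched as follows. This is an illustration with hypothetical names, not the patent's implementation: the auto-correlation values of every fixed code vector, up to the order Ismall only, are precomputed once and stored, so the search loop reads them instead of recomputing them.

```python
def autocorr(x, order):
    """r[i] = sum_n x[n] * x[n+i], for lags i = 0 .. order-1."""
    return [sum(x[n] * x[n + i] for n in range(len(x) - i))
            for i in range(order)]

def build_autocorr_rom(codebook, i_small):
    """Precompute R_j[0..Ismall-1] for every fixed code vector (the 'ROM')."""
    return [autocorr(e, i_small) for e in codebook]

def g_from_rom(hh, rom, j):
    """Expression (9) evaluated from the stored table: a short dot product."""
    r = rom[j]
    return hh[0] * r[0] + 2.0 * sum(hh[i] * r[i] for i in range(1, len(r)))
```

Storing Ismall values per code instead of I cuts both the per-code work in expression (9) and the ROM size by roughly the factor Ismall/I.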
Further, the present invention significantly reduces the quantity of operations, as shown in the above Table 2, by setting the auto-correlation approximation order Ismall to 1 and representing the evaluation function only by cross-correlations, without degradation of the tone quality.
Furthermore, the present invention provides a speech coding apparatus which reduces the quantity of product and sum operations for an auto-correlation and efficiently prevents tone quality from being degraded by variably controlling, according to the characteristics of coded speech signals, the auto-correlation approximation order Ismall with an approximation order deciding circuit.

In the above-described auto-correlation approximation method of the prior art example, the approximation order is set to I, equal to the length of the impulse response of the synthesis filter.
The present invention has been developed based on the finding by the present inventors that it is not necessary to match the approximation order to the length I of the impulse response of the synthesis filter, and that the auto-correlation may be approximated with good accuracy even with a very small value Ismall.

That is, with the present invention, by setting the approximation order of the auto-correlation to Ismall, less than the length I of the impulse response of the synthesis filter, the quantities of operations required for calculating the auto-correlation functions of the speech source and of the impulse response, and for calculating the auto-correlation of the synthesized signal, are reduced. In addition, the present invention may reduce the memory capacity required for calculating the auto-correlation functions of the speech source and the impulse response.
With the present invention, the evaluation function may be calculated only with the cross-correlation if the approximation order Ismall is set to 1, so that the quantity of product and sum operations for calculating the auto-correlation may be reduced significantly.
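The effect of shrinking the approximation order can be illustrated as follows. This sketch is ours; the exponentially decaying `h` is an arbitrary stand-in for a real synthesis-filter impulse response, not data from the patent. As Ismall decreases, fewer terms of expression (9) are kept, and at Ismall = 1 only the energy term hh[0]·R_j[0] survives:

```python
def autocorr(x, order):
    return [sum(x[n] * x[n + i] for n in range(len(x) - i))
            for i in range(order)]

def g_truncated(h, e, i_small):
    """Expression (9) kept only up to lag Ismall - 1."""
    hh = autocorr(h, i_small)
    r = autocorr(e, i_small)
    return hh[0] * r[0] + 2.0 * sum(hh[i] * r[i] for i in range(1, i_small))

h = [0.8 ** k for k in range(21)]   # decaying impulse response (stand-in)
e = [1.0, -0.4, 0.3, 0.9, -1.1, 0.2, 0.5, -0.7]
for i_small in (21, 8, 4, 1):
    print(i_small, g_truncated(h, e, i_small))
```

With Ismall = 1, G_j reduces to hh[0]·R_j[0]; since hh[0] is common to all codes, the evaluation of expression (2) is then driven by the cross-correlation term, as stated above.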
Furthermore, with the present invention the approximation order Ismall may be variably controlled according to characteristics of the coded speech signal.
BRIEF DESCRIPTION OF THE DRAWINGS
The above and other objects, features and advantages of the present invention will be more apparent from the following description taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a block diagram showing the whole structure of a speech coding and decoding apparatus of the present invention;
Fig. 2 is a flow chart of the operation of a circuit 160 according to a first embodiment of the present invention;
Fig. 3 is a flow chart of the operation of a circuit 180 according to an embodiment of the present invention;
Fig. 4 is a flow chart of the operation of a circuit 160 according to another embodiment of the present invention;
Fig. 5 is a flow chart of the operation of a circuit 180 according to still another embodiment of the present invention;
Fig. 6 is a flow chart of the operation of a circuit 160 according to still another embodiment of the present invention; and
Fig. 7 is a flow chart of the operation of a circuit 160 according to yet another embodiment of the present invention.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
Referring to the drawings, preferred embodiments of the present invention will be described in detail.
Fig. 1 is a schematic diagram showing a speech coding and decoding apparatus according to the present invention.
In Fig. 1, the component (1) shown on the left side and the component (2) shown on the right side represent a coding circuit (encoder) and a decoding circuit (decoder), respectively.
First, each component module is explained below.

An input terminal 100 is the speech input terminal of the encoder. A buffer circuit 110 is a circuit for storing a speech signal. An LPC analyzing circuit 120 is a circuit for extracting an LPC coefficient, that is, a spectrum parameter of the speech signal. A parameter quantization circuit 130 is a circuit for quantizing the LPC coefficient. A weighting circuit 140 is a circuit for weighting the speech signal to compensate for the auditory sense. An adaptive code book 150 is a circuit for storing past speech sources. An adaptive code book searching circuit 160 is a circuit for searching for a long-term prediction parameter.
A speech source code book 170 is a code book in which code vectors representing long-term prediction residuals, the length of which is equal to the length of the subframes, are stored. This book 170 may be either a noise code book or a learning code book in which learning is performed by a vector quantization (VQ) algorithm. The former is disclosed in detail in the reference No. 1, while the latter has been proposed in Japanese Patent Kokai JP-A Nos. Hei 3-243998 (1991) and Hei 3-243999 (1991) by one of the inventors of the present invention.
A speech source code book searching circuit 180 is a circuit for deciding an optimum code vector from the speech source code book 170. A gain code book 190 is a code book in which parameters representing the gain terms of the long-term prediction speech source and of the code vector are stored. A gain code book searching circuit 200 is a circuit for deciding the quantization gains of the long-term prediction speech source and of the code vector from the gain code book 190.
A multiplexer 210 is a circuit for combining code series and outputting them. A demultiplexer 220 is a circuit for decomposing the encoded codes into code series. A synthesis filter 230 is a circuit for reproducing a speech signal from a generated speech source and the speech synthesis filter. An output terminal 240 is the speech output terminal of the decoder.
In operation, a speech signal is inputted through the input terminal 100 and stored in the buffer 110. By executing a short-term prediction analysis on given samples of the speech signal stored in the buffer 110, the LPC analyzing circuit 120 calculates an LPC coefficient representing the spectrum characteristics of the speech signal.

The spectrum parameter (LPC coefficient) obtained by the LPC analyzing circuit 120 is quantized by the parameter quantizing circuit 130. The quantized code of the LPC coefficient is sent to the multiplexer 210, and the quantized code is inversely quantized to be used in the subsequent coding processes.
The speech signal stored in the buffer 110 is weighted, for compensation of the auditory sense, with the quantized/inversely quantized LPC coefficient by the weighting circuit 140, to be used in the subsequent code book searching.
Code book searching is executed with the adaptive code - book 150, the speech source code book 170, and the gain code book 190 respectively.
First, the adaptive code book searching circuit 160 executes a long-term prediction, decides a long-term prediction parameter representing a pitch correlation, transfers the code of the long-term prediction parameter to the multiplexer 210, and generates a long-term prediction speech source. The operation of the circuit 160 according to the present invention will be described in detail later.
Next, after the effect of the speech source representing the obtained long-term correlation is subtracted, the speech source code book searching circuit 180 searches the speech source code book to decide a speech source code, generates a code vector, and transfers the speech source code to the multiplexer 210.
After the long-term prediction signal and the code vector are obtained, the gain code book searching circuit 200 calculates gains of the two speech sources and transfers each gain code to the multiplexer 210.
The multiplexer 210 combines the codes into a transmission code and outputs it.
This code is supplied to the demultiplexer 220, which in turn decomposes the inputted transmission code into the individual codes. It generates a filter from the code representing the LPC coefficient and transfers it to the synthesis filter 230.
A long-term prediction speech source is generated from the code representing the long-term prediction parameters with the adaptive code book 150, a code vector is generated from the speech source code with the speech source code book 170, and the gains of the code vectors of the adaptive code book 150 and the speech source code book 170 are calculated from the gain code. An input signal fed to the synthesis filter is generated by multiplying each speech source with the gain term. Finally the synthesis filter 230 synthesizes a speech signal with the input signal.
Turning to Fig.2, the processing procedure in the adaptive code book searching circuit 160 is shown as a first embodiment of the present invention.
In Fig.2, (a) is a step for calculating an impulse response of a speech synthesis filter from an order of 0 to I- 1.
(o) is a step for calculating an auto-correlation function of the impulse response from an order of 0 to Ismall - 1. Ismall is set such that I > Ismall is met.
(b) is a step for inversely filtering the speech signal with the impulse response (see the above expression (6)).
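The point of the inverse (backward) filtering in step (b) can be illustrated as follows: after the target signal is backward-filtered once with the impulse response, the cross-correlation for every candidate source reduces to a single inner product. The exact form of expression (6) is not quoted here; this is the standard identity the text appears to rely on, with made-up signal values:

```python
# Backward-filter the target x with the impulse response h so that
# C_j = <x, H e_j> equals <d, e_j>: one filtering of x replaces one
# filtering per candidate source. All signal values are illustrative.

def backward_filter(x, h):
    N, I = len(x), len(h)
    return [sum(h[i] * x[n + i] for i in range(I) if n + i < N) for n in range(N)]

def cross_correlation(d, e):
    return sum(dn * en for dn, en in zip(d, e))

x = [1.0, 0.5, -0.25, 0.0]   # target (weighted speech) signal
h = [1.0, 0.4]               # truncated impulse response
d = backward_filter(x, h)    # computed once per subframe

# The direct computation for comparison: C = sum_n x[n] * (h * e)[n]
def forward_filter(e, h):
    return [sum(h[i] * e[n - i] for i in range(len(h)) if n - i >= 0)
            for n in range(len(e))]

e = [0.0, 1.0, 0.0, 0.0]     # one candidate speech source
direct = sum(xn * yn for xn, yn in zip(x, forward_filter(e, h)))
fast = cross_correlation(d, e)
```

Both paths give the same number, which is why step (g) later needs only the output of the inverse filtering.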
(c) is a step for setting the range within which a speech source is searched for.
(dl) is a step for generating a long-term prediction speech source corresponding to each code with the adaptive code book 150.
(el) is a step for calculating an auto-correlation function of the generated long-term prediction speech source from an order of 0 to Ismall- 1.
(f) is a step for calculating an approximate auto-correlation to an order Ismall based on the above approximation expression (9).
(g) is a step for calculating a cross-correlation based on the above expression (7).
(hl) is a step for calculating an evaluation function following the above expression (2).
(i) is a step for deciding an optimum code which minimizes the evaluation function.
Describing in more detail, first, in the step (a), an impulse response of the synthesis filter h[0], ..., h[I - 1] is derived, and in the step (o), an auto-correlation function from an order of 0 to Ismall - 1 is calculated. Then in the step (b), an inverse filtering is performed.
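Steps (a) and (o) can be sketched as below. The all-pole filter coefficients and the concrete values of I and Ismall are illustrative assumptions; only the structure — derive h[0..I-1], then keep just Ismall auto-correlation lags — follows the text:

```python
# Step (a): impulse response of an assumed all-pole synthesis filter.
# Step (o): its auto-correlation, truncated to Ismall lags (Ismall < I).

def impulse_response(lpc, I):
    h = []
    for n in range(I):
        y = 1.0 if n == 0 else 0.0          # unit impulse input
        for k, a_k in enumerate(lpc, start=1):
            if n - k >= 0:
                y -= a_k * h[n - k]
        h.append(y)
    return h

def autocorr(x, Ismall):
    # Only lags 0 .. Ismall-1 are computed and stored.
    return [sum(x[n] * x[n + i] for n in range(len(x) - i)) for i in range(Ismall)]

I, Ismall = 8, 3                  # Ismall < I, as the text requires
h = impulse_response([-0.5], I)   # h[n] = 0.5**n for this toy filter
hh = autocorr(h, Ismall)          # hh[i] as used later in step (f)
```

Keeping only Ismall lags is exactly what later yields the operation and RAM savings the description claims.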
In the step (c), the range within which the code book is searched is set, and the processes of the steps (dl) to (hl) are executed for each search code. Assuming that the number of bits of a speech source is B = 7, a large amount of processing would be required to execute the processes (dl) to (hl) for all 128 codes, so that in the step (c) the code book to be searched is limited to a predetermined range.
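One plausible reading of step (c) — restricting the 2^B = 128 candidates to a window around a coarse pitch estimate — can be sketched as follows. The window width, lag bounds, and the use of an open-loop estimate are assumptions for illustration; the patent only states that the range is limited:

```python
# Step (c) sketch: limit the adaptive code book search to a window around
# an assumed open-loop pitch lag instead of scanning all 128 candidates.

def search_range(open_loop_lag, lag_min=20, lag_max=147, width=8):
    lo = max(lag_min, open_loop_lag - width)
    hi = min(lag_max, open_loop_lag + width)
    return list(range(lo, hi + 1))

candidates = search_range(open_loop_lag=40)  # 17 lags instead of 128
```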
The step (dl) generates a long-term prediction speech source (ej[n]) corresponding to each code (e.g. j) with the adaptive code book.
The step (el) calculates an auto-correlation function of the speech source with the generated speech source (ej[n]) from an order of 0 to Ismall - 1 (Rj[i]; i = 0, ..., Ismall - 1).
The step (f) calculates an auto-correlation Gj with the auto-correlation function of the speech source (Rj[i]) obtained in the step (el) and the auto-correlation function of the synthesis filter (hh[i]) by the auto-correlation approximation method, as expressed in the above expression (9).
In this case, the auto-correlation Gj is calculated to an order of Ismall less than the length of the impulse response I to reduce the quantity of operations (I in the above expression (9) is equal to Ismall). Setting Ismall to a lower order reduces the quantity of operations in the calculation of the auto-correlation Gj, the auto-correlation function of the impulse response obtained in the step (a), and the auto-correlation function of the speech source in the step (el). Further, it reduces the required RAM regions.
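The approximation used in steps (el) and (f) can be sketched as below. The exact form of expression (9) is not reproduced in this excerpt, so the classic correlation-domain energy approximation is assumed; all signal values are made up:

```python
# Step (f) sketch: approximate G_j = ||H e_j||^2 from two truncated
# auto-correlations instead of filtering e_j through the synthesis filter.
# Assumed form: G_j ~= Rj[0]*hh[0] + 2 * sum_{i=1..Ismall-1} Rj[i]*hh[i].

def autocorr(x, Ismall):
    return [sum(x[n] * x[n + i] for n in range(len(x) - i)) for i in range(Ismall)]

def approx_energy(Rj, hh):
    return Rj[0] * hh[0] + 2.0 * sum(r * g for r, g in zip(Rj[1:], hh[1:]))

Ismall = 3
h = [1.0, 0.5, 0.25, 0.125]    # assumed impulse response of length I = 4
e = [1.0, -0.5, 0.25, 0.0]     # assumed long-term prediction speech source
G_approx = approx_energy(autocorr(e, Ismall), autocorr(h, Ismall))

# Exact energy for comparison: || h * e ||^2 over the frame
y = [sum(h[i] * e[n - i] for i in range(len(h)) if 0 <= n - i < len(e))
     for n in range(len(e))]
G_exact = sum(v * v for v in y)
```

Even with Ismall = 3 < I = 4 the approximation lands close to the exact energy on this toy example, which is the trade-off the text describes: a small loss of accuracy for a large saving in operations and RAM.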
Next, the step (g) calculates a cross-correlation Cj with the output of the inverse filtering. The step (hl) calculates an evaluation function as expressed in the above expression (2) with the obtained auto-correlation and cross-correlation. The step (i) decides the code minimizing the evaluation function as an optimum code.
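Steps (g) through (i) can be sketched as below. Expression (2) is not reproduced in this excerpt; the usual analysis-by-synthesis criterion is assumed, in which minimizing the error energy is equivalent to maximizing Cj^2 / Gj over the candidates:

```python
# Steps (g)-(i) sketch: the error energy is, up to a candidate-independent
# constant, E - C_j**2 / G_j, so the optimum code maximizes C_j**2 / G_j.
# Candidate values below are made up.

def best_code(candidates):
    """candidates: list of (code, C_j, G_j) tuples; returns the optimum code."""
    return max(candidates, key=lambda t: t[1] ** 2 / t[2])[0]

j_opt = best_code([("j0", 0.4, 1.0), ("j1", 0.9, 2.0), ("j2", 0.5, 0.2)])
```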
Referring to Fig.3, there is shown a flow chart of the operation of the speech source code book searching circuit 180. (d2) is a step for generating a code vector corresponding to each code with the speech source code book 170. (e2) is a step for obtaining an auto-correlation function of the speech source, calculating the auto-correlation function corresponding to each search code by a table lookup into a speech source auto-correlation code book 175. Other steps are the same as those shown in Fig.2.
In operation, the step (a) calculates an impulse response of the synthesis filter and the step (o) calculates its auto-correlation functions from an order of 0 to Ismall - 1. The step (b) then executes an inverse filtering. The step (c) sets the range within which the code book is searched, and the processes of the steps (d2) to (hl) are executed for each search code.


The step (d2) generates a code vector corresponding to each code from the speech source code book 170.
The step (e2) calculates an auto-correlation function of the code vector from an order of 0 to Ismall - 1. Unlike the adaptive code book, the values contained in the speech source code book are predetermined. As a result, the auto-correlation values of the code vectors are stored beforehand in the speech source auto-correlation code book 175, and the auto-correlation function of the code vector is obtained by referring to that code book.
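The table-lookup idea of step (e2) can be sketched as below. The codebook contents and sizes are illustrative; the point, as in the text, is that the auto-correlations of a fixed code book can be tabulated once (the ROM-resident code book 175) so the search itself does no per-candidate correlation arithmetic:

```python
# Step (e2) sketch: precompute the truncated auto-correlation of every
# fixed code vector offline ("speech source auto-correlation code book"),
# then look it up during the search. All contents are made up.

def autocorr(x, Ismall):
    return [sum(x[n] * x[n + i] for n in range(len(x) - i)) for i in range(Ismall)]

Ismall = 2
codebook = [[1.0, 0.0, 0.0], [0.5, -0.5, 0.5], [0.0, 1.0, -1.0]]

# Built once; only Ismall values per vector need storing (ROM saving).
autocorr_table = [autocorr(cv, Ismall) for cv in codebook]

def lookup(j):
    return autocorr_table[j]     # no computation at search time

Rj = lookup(1)
```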
An auto-correlation Gj is calculated with the auto-correlation function of the speech source and the auto-correlation function of the synthesis filter by the auto-correlation approximation method. In this case, the auto-correlation Gj is calculated to an order Ismall less than the length of the impulse response I to reduce the quantity of operations. Setting Ismall to a lower order reduces the quantity of operations for calculating the auto-correlation Gj and the number of the auto-correlation functions of the impulse response obtained in the step (a). Furthermore, it reduces the memory capacity of the ROM in the speech source auto-correlation code book 175.
Next, the step (g) calculates a cross-correlation Cj with the output of the inverse filter. The step (hl) calculates an evaluation function with the obtained auto-correlation and cross-correlation. The step (i) decides the code which minimizes the evaluation function as the optimum code.
Referring to Fig.4, the adaptive code book searching circuit 160 according to the present embodiment includes a step (h2) for calculating an evaluation function only with cross-correlations. Other modules used in the present embodiment are the same as the ones used in the first embodiment.
The difference between the present embodiment and the first embodiment is that in the present embodiment the evaluation function is expressed only by a cross-correlation, so that the calculation of an auto-correlation function of an impulse response, a code book for an auto-correlation function of the speech source, and the calculation of auto-correlations are not required. As a result, a smaller quantity of operations is needed.
The present embodiment corresponds to the first embodiment in which the order Ismall is set to 1.
Turning to Fig.5, there is shown a flow chart of the speech source code book searching circuit 180. The difference between the present embodiment and that shown in Fig.2 is that in the present embodiment an evaluation function is represented only by a cross-correlation. A calculation of an auto-correlation function of the impulse response and of auto-correlations, and a speech source auto-correlation function code book, are not required, so that a smaller quantity of operations and less memory capacity are needed. Also in the present embodiment, the order Ismall is set to 1. Experiment has shown that even when the order Ismall is set to 1 and no auto-correlation is calculated, there is no noticeable deterioration in the coded speech signals.
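The cross-correlation-only search of this embodiment can be sketched as below. The selection rule — ranking candidates by the squared cross-correlation alone — is a natural reading of the text rather than a quoted formula, and all signal values are made up:

```python
# Sketch of the Fig.4/Fig.5 embodiments: drop the G_j term entirely, so
# each candidate costs one inner product and no auto-correlations (and no
# auto-correlation ROM table) are needed at all.

def cross_only_search(d, sources):
    """d: inverse-filtered target; sources: dict code -> source vector."""
    def C(e):
        return sum(dn * en for dn, en in zip(d, e))
    return max(sources, key=lambda j: C(sources[j]) ** 2)

d = [1.0, -0.5, 0.25]
best = cross_only_search(d, {"a": [0.0, 1.0, 0.0], "b": [1.0, 0.0, 0.5]})
```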
Fig.6 and Fig.7 show the process procedures in the adaptive code book searching circuit 160 according to another embodiment. In the present embodiment, a step (m) is added to the above described first or second embodiment.
The step (m) decides the approximation order Ismall of an auto-correlation and sets a value of Ismall according to characteristics of the encoded speech signals. The value of Ismall is a variable used only to search the code book and need not be transmitted. With the present embodiment, the approximation order Ismall is varied according to characteristics of the coded speech signals, such as voiced or unvoiced ones.
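Step (m) can be sketched as a simple per-frame decision. The voiced/unvoiced test, the threshold, and the two orders below are made-up tuning values; the patent only says Ismall is varied with the signal's characteristics and is never transmitted:

```python
# Step (m) sketch: pick the approximation order Ismall per frame from a
# voiced/unvoiced indicator (here an assumed normalized pitch correlation).

def decide_Ismall(normalized_pitch_corr, voiced_order=4, unvoiced_order=1,
                  threshold=0.5):
    # Voiced frames keep more correlation lags; unvoiced frames need few.
    return voiced_order if normalized_pitch_corr > threshold else unvoiced_order

Ismall_voiced = decide_Ismall(0.8)    # strongly periodic frame
Ismall_unvoiced = decide_Ismall(0.2)  # noise-like frame
```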
In each of the above mentioned embodiments, the invention is described using the LPC analyzing circuit; however, other analyzing methods for extracting a spectrum parameter, such as the Burg method, may accomplish the same effect.
In addition, in each of the above embodiments the invention is described using the LPC coefficient; however, it is obvious that other spectrum parameters such as the PARCOR coefficient or the LSP (Line Spectrum Pair) coefficient may accomplish the same effect. Furthermore, in each of the above embodiments the speech source code book searching circuit is of a single-stage structure; however, a multi-stage structured speech source code book searching circuit may, as a matter of course, accomplish the same effect.
As is stated, according to the invention the above expression (10) may be replaced with the following expression (13) by using, preferably in the CELP method, a value of the auto-correlation approximation order Ismall smaller than the length of an impulse response I.

(I · Ismall - Ismall · (Ismall - 1) / 2) · sf + Ismall · 2 · sf    (13)

The quantity of product and sum operations of an auto-correlation function for each code given by the above expression (12) is replaced with the following expression (14):

(N · Ismall - Ismall · (Ismall - 1) / 2) · 2 · sf / s    (14)

In this case, when the quantity of product and sum operations is calculated on the same conditions as in Table 1 with Ismall = 1 and Ismall = 0, the results are given in Table 2:
Table 2  Comparison of the quantity of product and sum operations

  Method                                       Expression           Quantity of product
                                                                    and sum operations
  Filtering all code vectors                   expression (5)       12.288 MOPS
  Searching the speech source code book by     expression (8) +
  the approximation in the No.2 document       expression (10)       1.584 MOPS
  Searching the adaptive code book by          expression (8) +
  the approximation in the No.2 document       expression (10) +
                                               expression (12)      19.504 MOPS
  Searching the adaptive code book             expression (8) +
  with Ismall = 1                              expression (13) +
                                               expression (14)       2.239 MOPS
  Searching the speech source code book        expression (8) +
  with Ismall = 1                              expression (13)       1.215 MOPS
  Ismall = 0                                   expression (8)        1.195 MOPS

As may be seen from Table 2, the present invention significantly reduces the quantity of the product and sum operations in the adaptive code book searching circuit, for example, with the approximation order Ismall = 1 as compared with the approximation described in the reference No.2.
The preferred embodiments described herein are therefore illustrative and not restrictive, the scope of the invention being indicated by the appended claims and all variations which come within the meaning of the claims are intended to be embraced therein.

Claims (13)

1. A speech coding method comprising the steps of:
inputting a speech signal, searching for an optimum code vector in a code book minimizing an evaluation function based on an auto-correlation function of a speech source, and generating an impulse response of a synthesis filter with a short-term parameter of the speech signal and an auto-correlation of a synthesized signal to transmit at least an index of a decided code vector, wherein the auto-correlations are calculated with an auto-correlation function approximation order set to a value Ismall being less than the length of the impulse response of the synthesis filter.
2. A speech coding apparatus comprising:
a speech analyzing means for deciding, in every predetermined interval of a speech signal, codes of short-term prediction parameters representing frequency characteristics of the speech signal;
an impulse response calculating means for calculating an impulse response of a speech synthesis filter generated with the short-term prediction parameters;
an inverse filter means for filtering the speech signal inversely with the impulse response;
an adaptive code book for storing an input signal fed to the speech synthesis filter means generated within a past speech coding interval;
a long-term prediction speech source means for generating from the adaptive code book a long-term prediction source representing a pitch correlation of the speech signal;
a cross-correlation calculating means for calculating a cross-correlation between the speech signal and an output signal of the speech synthesis filter means fed with said long-term prediction speech source as an input;
an auto-correlation calculating means of an impulse response for calculating an auto-correlation of the impulse response of the speech synthesis filter means to an order of Ismall being less than a length of said impulse response;
an auto-correlation calculation means for calculating an auto-correlation of the long-term prediction speech source means to the order Ismall less than the length of said impulse response;
an auto-correlation calculating means for calculating an auto-correlation of said output signal to the order Ismall less than the length of said impulse response from two types of said auto-correlation function;
an evaluation function calculating means for calculating error energy with results of said auto-correlation and said cross-correlation;
an optimum code deciding means for deciding an optimum long-term prediction code with said evaluation function;
a speech source code book comprising speech source signals and quantization codes indicating residual signals after the long-term prediction; and a speech source code book searching means for deciding an optimum quantization code with said speech source code book.
3. The speech coding apparatus as defined in claim 2, further comprising an approximation order deciding means for deciding an order Ismall for calculating an auto-correlation corresponding to an interval of the speech signal to be coded.
4. The speech coding apparatus as defined in claim 2, further comprising a means for setting a range of said code book to be searched to a predetermined one.
5. A speech coding apparatus comprising:
a speech analyzing means for deciding, in every predetermined interval of a speech signal, codes of short-term prediction parameters representing frequency characteristics of the speech signal;
an adaptive code book for storing an input signal fed to a speech synthesis filter generated in a past speech coding interval;
an adaptive code book searching means for deciding an optimum code from said adaptive code book;

an impulse response calculating means for calculating an impulse response of said speech synthesis filter generated from said short-term prediction parameters;
an auto-correlation function calculating means of an impulse response for calculating an auto-correlation function of the impulse response of said speech synthesis filter to an order of Ismall less than the length of the impulse response;
a speech source code book comprising speech source signals and quantization codes indicating residual signals after the long-term prediction;
a code vector generating means for generating a code vector from said speech source code book;
an auto-correlation function calculating means of a code vector for obtaining an auto-correlation function of said code vector to the order of Ismall less than the length of said impulse response;
an inverse filter means for inversely filtering said speech signal with said impulse responses;
a cross-correlation calculating means for calculating a cross-correlation between said speech signal and an output signal of said speech synthesis filter, with said code vector being fed to said speech synthesis filter as an input;
an auto-correlation calculating means for calculating an auto-correlation of said output signal to the order of Ismall less than the length of said impulse response from two types of said auto-correlation functions;
an evaluation function calculating means for calculating an error energy with said auto-correlation and said cross-correlation; and an optimum code deciding means for deciding an optimum code vector with said evaluation function.
6. The speech coding apparatus as defined in claim 5, further comprising an approximation order deciding means for deciding an order Ismall for calculating an auto-correlation corresponding to an interval of the speech signal to be coded.
7. The speech coding apparatus as defined in claim 5, further comprising a speech source auto-correlation code book storing auto-correlation values of said code vector wherein said auto-correlation function generating means generates an auto-correlation function by table lookup of said speech source auto-correlation function code book with a code vector generated by said code vector generating means.
8. The speech coding apparatus as defined in claim 5, further comprising a means for setting a range of said code book to be searched to a predetermined one.
9. A speech coding apparatus comprising:

a speech analyzing means for deciding in every predetermined interval of a speech signal codes of short-term prediction parameters representing frequency characteristics of the speech signal;
an impulse response calculating means for calculating an impulse response of a speech synthesis filter generated with said short-term prediction parameters;
an inverse filter means for inversely filtering said speech signal with said impulse response;
an adaptive code book for storing an input signal fed to said speech synthesis filter generated in a past speech coding interval;
a long-term prediction speech source generating means for generating from said adaptive code book a long-term prediction speech source representing a pitch correlation of said speech signal;
a cross-correlation calculating means for calculating a cross-correlation between said speech signal and an output signal of said speech synthesis filter with said long-term prediction speech sources being fed to said speech synthesis filter as an input;
an optimum code deciding means for deciding an optimum long-term prediction code based on said cross-correlation;
a speech source code book comprising speech source signals and quantization codes indicating residual signals after the long-term prediction; and a speech source code book searching means for deciding an optimum quantization code with said speech source code book.
10. The speech coding apparatus as defined in claim 9, further comprising a means for setting a range of said code book to be searched to a predetermined one.
11. A speech coding apparatus comprising:
a speech analyzing means for deciding in every predetermined interval of a speech signal codes of short-term prediction parameters representing frequency characteristics of the speech signal;
an adaptive code book for storing an input signal fed to a speech synthesis filter generated in a past speech coding interval;
an adaptive code book searching means for deciding an optimum code from said adaptive code book;
an impulse response calculating means for calculating an impulse response of said speech synthesis filter generated from said short-term prediction parameters;
a speech source code book comprising speech source signals and quantization codes indicating residual signals after the long-term prediction;
a code vector generating means for generating a code vector from said speech source code book;

an inverse filter means for inversely filtering said speech signal with said impulse responses;
a cross-correlation calculating means for calculating a cross-correlation between said speech signal and an output signal of said speech synthesis filter means with said code vectors being fed to said speech synthesis filter means as an input; and an optimum code deciding means for deciding an optimum code vector with said cross-correlation.
12. The speech coding apparatus as defined in claim 11, further comprising a speech source auto-correlation code book storing auto-correlation values of said code vector wherein said auto-correlation function generating means of a code vector generates an auto-correlation function by table lookup of said speech source auto-correlation function code book with a code vector generated by said code vector generating means.
13. The speech coding apparatus as defined in claim 11, further comprising a means for setting a range of said code book to be searched to a predetermined one.
CA002137880A 1993-12-14 1994-12-12 Speech coding apparatus Abandoned CA2137880A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP5342140A JP2979943B2 (en) 1993-12-14 1993-12-14 Audio coding device
JP342140/1993 1993-12-14

Publications (1)

Publication Number Publication Date
CA2137880A1 true CA2137880A1 (en) 1995-06-15

Family

ID=18351440

Family Applications (1)

Application Number Title Priority Date Filing Date
CA002137880A Abandoned CA2137880A1 (en) 1993-12-14 1994-12-12 Speech coding apparatus

Country Status (3)

Country Link
EP (1) EP0658877A2 (en)
JP (1) JP2979943B2 (en)
CA (1) CA2137880A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3616432B2 (en) * 1995-07-27 2005-02-02 日本電気株式会社 Speech encoding device
JP3157116B2 (en) * 1996-03-29 2001-04-16 三菱電機株式会社 Audio coding transmission system
US5745872A (en) * 1996-05-07 1998-04-28 Texas Instruments Incorporated Method and system for compensating speech signals using vector quantization codebook adaptation
TW419645B (en) * 1996-05-24 2001-01-21 Koninkl Philips Electronics Nv A method for coding Human speech and an apparatus for reproducing human speech so coded
US6789059B2 (en) * 2001-06-06 2004-09-07 Qualcomm Incorporated Reducing memory requirements of a codebook vector search

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FI98104C (en) * 1991-05-20 1997-04-10 Nokia Mobile Phones Ltd Procedures for generating an excitation vector and digital speech encoder

Also Published As

Publication number Publication date
EP0658877A2 (en) 1995-06-21
JP2979943B2 (en) 1999-11-22
JPH07168596A (en) 1995-07-04

Similar Documents

Publication Publication Date Title
EP0409239B1 (en) Speech coding/decoding method
US8364473B2 (en) Method and apparatus for receiving an encoded speech signal based on codebooks
CA2202825C (en) Speech coder
US5675702A (en) Multi-segment vector quantizer for a speech coder suitable for use in a radiotelephone
US5208862A (en) Speech coder
US5485581A (en) Speech coding method and system
US5140638A (en) Speech coding system and a method of encoding speech
EP0657874B1 (en) Voice coder and a method for searching codebooks
US5727122A (en) Code excitation linear predictive (CELP) encoder and decoder and code excitation linear predictive coding method
US5682407A (en) Voice coder for coding voice signal with code-excited linear prediction coding
US5926785A (en) Speech encoding method and apparatus including a codebook storing a plurality of code vectors for encoding a speech signal
EP0842509A1 (en) Method and apparatus for generating and encoding line spectral square roots
US5526464A (en) Reducing search complexity for code-excited linear prediction (CELP) coding
EP0778561B1 (en) Speech coding device
EP0401452B1 (en) Low-delay low-bit-rate speech coder
US5873060A (en) Signal coder for wide-band signals
US5797119A (en) Comb filter speech coding with preselected excitation code vectors
CA2137880A1 (en) Speech coding apparatus
CA2130877C (en) Speech pitch coding system
EP0694907A2 (en) Speech coder
KR100556278B1 (en) Vector Search Method
EP1355298A2 (en) Code Excitation linear prediction encoder and decoder
JP3192051B2 (en) Audio coding device
JP3230380B2 (en) Audio coding device
AU702506C (en) Method and apparatus for generating and encoding line spectral square roots

Legal Events

Date Code Title Description
EEER Examination request
FZDE Discontinued