CA2084338C - Method for speech coding and voice-coder - Google Patents
Method for speech coding and voice-coder
- Publication number
- CA2084338C
- Authority
- CA
- Canada
- Prior art keywords
- excitation
- codevector
- codebook
- adaptive
- candidate
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Links
- 238000000034 method Methods 0.000 title claims abstract description 15
- 230000005284 excitation Effects 0.000 claims abstract description 175
- 230000003044 adaptive effect Effects 0.000 claims abstract description 95
- 239000013598 vector Substances 0.000 claims abstract description 20
- 238000013139 quantization Methods 0.000 claims abstract description 15
- 230000003595 spectral effect Effects 0.000 claims description 33
- 238000004458 analytical method Methods 0.000 claims description 8
- 238000010586 diagram Methods 0.000 description 3
- 238000005311 autocorrelation function Methods 0.000 description 2
- 230000015572 biosynthetic process Effects 0.000 description 2
- 238000007796 conventional method Methods 0.000 description 2
- 239000011159 matrix material Substances 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 238000010845 search algorithm Methods 0.000 description 2
- 238000003786 synthesis reaction Methods 0.000 description 2
- 230000001186 cumulative effect Effects 0.000 description 1
- 230000003111 delayed effect Effects 0.000 description 1
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/08—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
- G10L19/12—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L2019/0001—Codebooks
- G10L2019/0003—Backward prediction of gain
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L2019/0001—Codebooks
- G10L2019/0004—Design or structure of the codebook
- G10L2019/0005—Multi-stage vector quantisation
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Transmission Systems Not Characterized By The Medium Used For Transmission (AREA)
Abstract
A method for speech coding and a voice-coder for coding speech signals divided into frames spaced with a constant interval are disclosed. An adaptive codebook storing excitation signals determined in the past and a plurality of excitation codebooks for multi-stage vector quantization are provided. Each frame is divided into subframes. For each subframe, a candidate of a first predetermined number of adaptive codevectors is selected, and then candidates of each predetermined number of excitation codevectors are selected from each excitation codebook, respectively, by using the candidate of the adaptive codevector. Finally, a combination of the adaptive codevector and each of the excitation codevectors is selected from the candidates of the adaptive codevector and each of the excitation codevectors.
Description
METHOD FOR SPEECH CODING AND VOICE-CODER
BACKGROUND OF THE INVENTION:
Field of the Invention:
The present invention relates to a method for speech coding and to a voice-coder, particularly to a method for speech coding and to a voice-coder which can achieve high coding quality with a relatively small operation at bit rates not greater than 8 kbit/s.
Description of the Related Art:
As a speech coding system to be applied to vector quantization of excitation signals at low bit rates by using an excitation codebook comprising random numbers, a CELP system described in a paper (hereinafter referred to as literature 1) titled "CODE-EXCITED LINEAR PREDICTION (CELP): HIGH-QUALITY SPEECH AT VERY LOW BIT RATES" (Proc. ICASSP, pp. 937-940, 1985) by Manfred R. Schroeder and Bishnu S. Atal is known. There is also a CELP system using an adaptive codebook described in a paper (hereinafter referred to as literature 2) titled "IMPROVED SPEECH QUALITY AND EFFICIENT VECTOR QUANTIZATION IN SELP" (Proc. ICASSP, pp. 155-158, 1988) by W. B. Kleijn, D. J. Krasinski and R. H. Ketchum.
The CELP system using the adaptive codebook receives speech signals divided into frames spaced with a constant interval. The CELP utilizes a linear predictive analyzer for obtaining spectral parameters of the input speech signals, the adaptive codebook having excitation signals determined in the past, and the excitation codebook comprising random numbers to be used for vector quantization of the excitation signals of said input speech signals. The CELP selects an adaptive codevector by using the input speech signal and the synthesized signal of the adaptive codevector for every subframe made by equally dividing the frame.
Subsequently, CELP performs selection of excitation codevectors by using the input signals, the synthesized signal of the selected adaptive codevector and said excitation codevector.
However, the above CELP systems have the following disadvantages. A very large amount of computation is required for searching the excitation codebook.
Moreover, since the adaptive codebook is determined independently of the excitation codebook, it is impossible to obtain a high SN (signal-to-noise) ratio.
Further, in the above CELP system, although the adaptive codebook and the excitation codebook are each searched using unquantized gains, a higher SN ratio could be obtained if the adaptive codebook and the excitation codebook were searched over all the quantized values of the gains. Furthermore, it is impossible to obtain sufficiently good speech quality at low bit rates such as 8 kbit/s or less because of the too small size of the excitation codebook.
SUMMARY OF THE INVENTION:
An object of the present invention is to provide a method for speech coding which can solve the above problems of the conventional method and achieve high quality speech with a relatively small amount of computation even at low bit rates such as less than 8 kbit/s.
Another object of the present invention is to provide a voice-coder which can solve the above problems of the conventional method and achieve high quality speech with a relatively small amount of computation even at low bit rates such as less than 8 kbit/s.
The object of the present invention can be achieved by a method for speech coding for coding speech signals divided into frames spaced with a constant interval, wherein an adaptive codebook storing excitation signals determined in the past and a plurality of excitation codebooks for multi-stage vector quantization of an excitation signal of the input speech signal are prepared; a spectral parameter of said input speech signal is obtained; said frame is divided into subframes; a candidate of a first fixed number of adaptive codevectors is selected for every said subframe from said adaptive codebook by using said input speech signal and said spectral parameter; candidates of a second fixed number of excitation codevectors are selected for every said subframe from said excitation codebooks, respectively, by using said input speech signal, said spectral parameter and the candidate of said adaptive codevector; and an optimum combination of the adaptive codevector and each of the excitation codevectors forming an excitation signal of said subframe is selected from the candidates of said adaptive codevector and each of said excitation codevectors by using said input speech signal and said spectral parameter.
Another object of the present invention is achieved by a voice-coder for coding speech signals divided into frames spaced with a constant interval, comprising: linear prediction analysis means for outputting spectral parameters of input speech signals; an adaptive codebook for storing excitation signals determined in the past; a plurality of excitation codebooks provided for multi-stage vector quantization of the excitation signal of said input speech signals; wherein, in case of searching for a combination of the adaptive codevector and each of the excitation codevectors for every subframe prepared by further division of said frame, from said adaptive codebook and each of said excitation codebooks, respectively, said combination of the adaptive codevector and each of the excitation codevectors forming an excitation signal of said subframe: a candidate of a first predetermined number of adaptive codevectors is selected from said adaptive codebook by using said input speech signal and said spectral parameter; candidates of each predetermined number of excitation codevectors are selected from a plurality of said excitation codebooks, respectively, by using said input speech signal, said spectral parameter and the candidate of said adaptive codevector; and an optimum candidate of the adaptive codevector and each of the excitation codevectors forming the excitation signal of said subframe is selected from the candidate of said adaptive codevector and each of said excitation codevectors by using said input speech signal and said spectral parameter.
Another object of the present invention is also achieved by a voice-coder for coding speech signals divided into frames spaced with a constant interval, comprising: linear prediction analysis means for outputting spectral parameters of input speech signals; an adaptive codebook storing excitation signals determined in the past; a plurality of excitation codebooks provided for multi-stage vector quantization of an excitation signal of said input speech signals; subframe division means for generating subframe signals by dividing said frame into subframes; first selection means for selecting a candidate of a first fixed number of adaptive codevectors from said adaptive codebook in accordance with said subframe signal and said spectral parameter; second selection means provided for every said excitation codebook for selecting the candidate of the excitation codevectors of the number predetermined for every excitation codebook, from the corresponding excitation codebook in accordance with said subframe signal, said spectral parameter and the candidate of said adaptive codevector; and means for searching the candidate of the adaptive codevector and the excitation codevectors which form the excitation signal of said subframe, from the candidate of said adaptive codevector and the candidate of each of said excitation codevectors in accordance with said input speech signal and said spectral parameter.
The above and other objects, features and advantages of the present invention will be apparent from the following description referring to the accompanying drawings which illustrate examples of preferred embodiments of the present invention.
BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 is a block diagram showing the structure of a voice-coder of a first embodiment of the present invention.
Fig. 2 is a block diagram showing the structure of a voice-coder of a second embodiment of the present invention.
Fig. 3 is a block diagram showing the structure of a voice-coder of a third embodiment of the present invention.
Fig. 4 is a block diagram showing the structure of a voice-coder of a fourth embodiment of the present invention.
DESCRIPTION OF THE PREFERRED EMBODIMENTS:
A first preferable embodiment of the present invention will be described with reference to Fig. 1.
In the voice-coder shown in Fig. 1, there are provided adaptive codebook 175, gain codebook 220 and two kinds of excitation codebooks 180, 190.
Speech input circuit 100 is provided for receiving speech signals divided into frames spaced with a constant interval. Subframe division circuit 120 and linear prediction analysis circuit 110 are provided on the output side of speech input circuit 100. Subframe division circuit 120 outputs subframes by equally dividing the frame, and linear prediction analysis circuit 110 performs linear prediction analyses of speech signals composing frames for obtaining spectral parameters of the speech signals. On the output side of subframe division circuit 120, weighting filter 130 is provided for performing perceptual weighting by receiving subframes and spectral parameters. On the output side of weighting filter 130, influence signal subtracter 140 is provided for subtracting the weighted influence signal from the preceding subframe and outputting the results thereof.
Adaptive codebook 175 stores excitation signals decided in the past as adaptive codevectors.
Corresponding to adaptive codebook 175, adaptive codebook candidate selection circuit 150 is provided for selecting the previously fixed number of adaptive codevectors and for outputting thereof as candidates of adaptive codevectors. Adaptive codebook candidate selection circuit 150 performs selection of the candidate according to the spectral parameter and the output signal of influence signal subtracter 140.
First and second excitation codebooks 180, 190 operate for multi-stage vector quantization of the excitation signal, and store the first and second excitation codevectors, respectively. Corresponding to first and second excitation codebooks 180, 190, candidate selection circuits 160, 170 for the first and second excitation codebooks are provided respectively.
Candidate selection circuits 160, 170 select the previously fixed number of excitation codevectors from corresponding respective excitation codebooks 180, 190 and output thereof as the candidates of the excitation codevectors. Spectral parameters, output signals of the influence signal subtracter and candidates of adaptive codevectors are inputted into each of candidate selection circuits 160, 170 for the excitation codebooks.
Optimum combination search circuit 200 is provided for candidates selected by candidate selection circuits 150, 160, 170 for the corresponding codebooks in order to search the optimum combination of candidates. Further, gain codebook search circuit 210 and multiplexer 230 are provided. Optimum combination search circuit 200 is structured so as to output to multiplexer 230 the delay (of the adaptive codevector) or index (of each excitation codevector) with reference to each of the respective optimum adaptive codevectors and the first and second excitation codevectors according to the results of the search, and to output the weighted synthetic signals of the above vectors to gain codebook search circuit 210, respectively. Gain codebook search circuit 210 searches for the optimum gain codevector from gain codebook 220 which stores gain codevectors, and outputs the index of the thus searched optimum gain codevector. Multiplexer 230 is structured so as to receive the delay or indices from optimum combination search circuit 200 or gain codebook search circuit 210, and output codes which correspond to input speech signals according to the delay or indices.
Next, description will be made with reference to the selection or search algorithms of each candidate selection circuit 150, 160, 170 and optimum combination search circuit 200 of the present embodiment. Under these algorithms, the excitation signal is processed by two-stage vector quantization by using the two kinds of excitation codebooks 180, 190.
First, in adaptive codebook candidate selection circuit 150, the predetermined number L0 of adaptive codevectors is selected, in order, from the one with the smaller error E0 expressed by equation (1):

E0 = ‖ z − β0 s_ad ‖²    (1)

where z is a signal obtained by subtracting an influence signal from a perceptually weighted input signal, s_ad is a perceptually weighted synthetic signal of adaptive codevector a_d with delay d, β0 is a sequential optimum gain of an adaptive codevector, and ‖·‖ is the Euclidean norm. The sequential optimum gain β0 of the adaptive codevector is given by:

β0 = ⟨z, s_ad⟩ / ⟨s_ad, s_ad⟩    (2)

By substituting equation (2) into equation (1), the next equation is obtained:

E0 = ‖z‖² − ⟨z, s_ad⟩² / ⟨s_ad, s_ad⟩    (3)

where ⟨ , ⟩ represents an inner product.
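Since ‖z‖² is the same for every delay, minimizing E0 in equation (3) amounts to ranking delays by the term ⟨z, s_ad⟩² / ⟨s_ad, s_ad⟩ and keeping the L0 largest. The following is a minimal sketch of that preselection, not the patent's own implementation; NumPy is assumed, and the synthesis filtering that produces each s_ad is abstracted into an input mapping:

```python
import numpy as np

def preselect_adaptive(z, synth_vectors, L0):
    """Rank adaptive codevector candidates by equation (3).

    z             -- weighted target signal (influence signal removed)
    synth_vectors -- dict mapping delay d to the weighted synthetic
                     signal s_ad of the adaptive codevector with delay d
    L0            -- number of candidates to keep

    Minimizing E0 = ||z||^2 - <z, s_ad>^2 / <s_ad, s_ad> is equivalent
    to maximizing the second term, so only that term is scored.
    """
    scores = {}
    for d, s_ad in synth_vectors.items():
        energy = np.dot(s_ad, s_ad)
        if energy > 0.0:
            scores[d] = np.dot(z, s_ad) ** 2 / energy
    # Keep the L0 delays with the largest score (i.e. the smallest E0).
    return sorted(scores, key=scores.get, reverse=True)[:L0]
```

Only inner products are computed per delay, which is what makes the preselection cheap relative to a full joint search.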
In candidate selection circuit 160 for the first excitation codebook, candidates of a predetermined number L1 of the first excitation codevectors are selected, for each of the L0 adaptive codevectors selected by candidate selection circuit 150 for the adaptive codebook, in order, from the one with the smaller error E1 expressed by equation (4):

E1 = ‖ za − γ0 s_ei1 ‖²    (4)

where s_ei1 is a perceptually weighted synthesized signal of first excitation codevector e_i1 with index i, γ0 is a sequential optimum gain of the first excitation codevector, and za = z − β0 s_ad. Therefore:

γ0 = ⟨za, s_ei1⟩ / ⟨s_ei1, s_ei1⟩    (5)

By substituting equation (5) into equation (4), equation (6) below is obtained:

E1 = ‖za‖² − ⟨za, s_ei1⟩² / ⟨s_ei1, s_ei1⟩    (6)

In the same way, in candidate selection circuit 170 for the second excitation codebook, candidates of a predetermined number L2 of the second excitation codevectors are selected, for each of the L0 adaptive codevectors selected by candidate selection circuit 150 for the adaptive codebook, in order, from the one with the smaller error E2 expressed by the next equation:
E2 = ‖ za − δ0 s_ej2 ‖²    (7)

where s_ej2 is a perceptually weighted synthesized signal of second excitation codevector e_j2 with index j, and δ0 is a sequential optimum gain of the second excitation codevector. Therefore:

δ0 = ⟨za, s_ej2⟩ / ⟨s_ej2, s_ej2⟩    (8)

By substituting equation (8) into equation (7), the following equation (9) is obtained:

E2 = ‖za‖² − ⟨za, s_ej2⟩² / ⟨s_ej2, s_ej2⟩    (9)
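Equations (6) and (9) have the same form as equation (3), with the target z replaced by the residual za computed from one adaptive candidate, so each excitation codebook can be preselected with the same ranking idea. A hedged sketch under the same assumptions as before (names illustrative, not from the patent):

```python
import numpy as np

def preselect_excitation(z, s_ad, synth_excitations, L):
    """Rank excitation codevector candidates by equation (6) or (9).

    z                 -- weighted target signal
    s_ad              -- weighted synthetic signal of one adaptive candidate
    synth_excitations -- dict mapping codebook index to the weighted
                         synthesized signal of that excitation codevector
    L                 -- number of candidates to keep
    """
    # Sequential optimum gain of the adaptive codevector, equation (2).
    beta0 = np.dot(z, s_ad) / np.dot(s_ad, s_ad)
    za = z - beta0 * s_ad  # residual target used in equations (4)-(9)
    scores = {}
    for i, s_e in synth_excitations.items():
        energy = np.dot(s_e, s_e)
        if energy > 0.0:
            scores[i] = np.dot(za, s_e) ** 2 / energy
    return sorted(scores, key=scores.get, reverse=True)[:L]
```

In the two-stage scheme this routine would be invoked once per surviving adaptive candidate, for each of the two excitation codebooks.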
In optimum combination search circuit 200, error E is calculated by the following equation for all the combinations of candidates of the selected adaptive codevectors and the first and second excitation codevectors, and then the combination of the candidates with minimum E is searched:

E = ‖ z − β s_ad − γ s_ei1 − δ s_ej2 ‖²    (10)

where β, γ, δ are simultaneous optimum gains of the adaptive codevector and the first and second excitation codevectors, respectively. Therefore:

(β, γ, δ)ᵀ = R⁻¹ (⟨z, s_ad⟩, ⟨z, s_ei1⟩, ⟨z, s_ej2⟩)ᵀ    (11)

where R satisfies the following equation:

      ( ⟨s_ad, s_ad⟩    ⟨s_ad, s_ei1⟩    ⟨s_ad, s_ej2⟩  )
R =   ( ⟨s_ei1, s_ad⟩   ⟨s_ei1, s_ei1⟩   ⟨s_ei1, s_ej2⟩ )    (12)
      ( ⟨s_ej2, s_ad⟩   ⟨s_ej2, s_ei1⟩   ⟨s_ej2, s_ej2⟩ )

By substituting equation (11) into equation (10), the following is obtained:

E = ‖z‖² − (⟨z, s_ad⟩, ⟨z, s_ei1⟩, ⟨z, s_ej2⟩) R⁻¹ (⟨z, s_ad⟩, ⟨z, s_ei1⟩, ⟨z, s_ej2⟩)ᵀ    (13)

When the above error E is calculated, it is acceptable to assign a particular limitation to the simultaneous optimum gains γ, δ of each excitation codevector. For example, with the limitation that γ and δ are equal, error E is given by:

E = ‖z‖² − (⟨z, s_ad⟩, ⟨z, s_ei1 + s_ej2⟩) R⁻¹ (⟨z, s_ad⟩, ⟨z, s_ei1 + s_ej2⟩)ᵀ    (14)

where

R =   ( ⟨s_ad, s_ad⟩           ⟨s_ad, s_ei1 + s_ej2⟩           )    (15)
      ( ⟨s_ei1 + s_ej2, s_ad⟩  ⟨s_ei1 + s_ej2, s_ei1 + s_ej2⟩ )
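For one candidate triple, the simultaneous optimum gains of equation (11) come from a small linear system, and equation (13) then yields the error without forming the reconstruction explicitly. A sketch of that computation (an assumed NumPy formulation; the three synthetic signals are taken to be linearly independent so that R is invertible):

```python
import numpy as np

def combination_error(z, s_ad, s_e1, s_e2):
    """Error E of equation (13) for one candidate combination.

    Builds the Gram matrix R of equation (12) from the three weighted
    synthetic signals, solves for the simultaneous gains of equation (11),
    and returns (E, gains). R must be nonsingular, i.e. the three
    synthetic signals must be linearly independent.
    """
    S = np.stack([s_ad, s_e1, s_e2])   # rows: the three synthetic signals
    R = S @ S.T                        # equation (12): matrix of inner products
    c = S @ z                          # (<z,s_ad>, <z,s_ei1>, <z,s_ej2>)
    gains = np.linalg.solve(R, c)      # equation (11)
    E = np.dot(z, z) - c @ gains       # equation (13): ||z||^2 - c^T R^-1 c
    return E, gains
```

The search circuit would evaluate this for every combination of surviving candidates and keep the one with the smallest E.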
Next, description will be made with reference to operation of the voice-coder of the present embodiment.
Speech input circuit 100 receives speech signals divided into each frame (e.g., 40 ms in width), which signals are outputted to linear prediction analysis circuit 110 and subframe division circuit 120.
In linear prediction analysis circuit 110, linear prediction analysis of the inputted speech signal is performed for calculating the spectral parameter. This spectral parameter is outputted to weighting filter 130, to influence signal subtracter 140, to candidate selection circuit 150 for the adaptive codebook, to candidate selection circuit 160 for the first excitation codebook, to candidate selection circuit 170 for the second excitation codebook, and to multiplexer 230.
Separately, a frame is divided into subframes (e.g., 8 ms in width) by subframe division circuit 120. Speech signals divided into subframes are inputted into weighting filter 130. Weighting filter 130 performs perceptual weighting of inputted speech signals, and outputs the results to influence signal subtracter 140.
Influence signal subtracter 140 subtracts the weighted influence signal from the preceding subframe, and outputs the result to candidate selection circuit 150 for the adaptive codebook, to candidate selection circuit 160 for the first excitation codebook, to candidate selection circuit 170 for the second excitation codebook, and to gain codebook search circuit 210.
Subsequently, candidate selection circuit 150 for the adaptive codebook selects the candidate of L0 pieces of adaptive codevectors from adaptive codebook 175 according to equation (3). Candidate selection circuit 150 for the adaptive codebook outputs the weighted synthetic signal of the candidate of the selected adaptive codevectors and delay d, which constitutes the index of the candidate of adaptive codevectors, to candidate selection circuits 160, 170 for the first and second excitation codebooks and to optimum combination search circuit 200.
Candidate selection circuit 160 for the first excitation codebook selects the candidate of L1 pieces of the first excitation codevector from first excitation codebook 180, according to the output of the influence signal subtracter, the spectral parameter and the candidate of the adaptive codevector, by using equation (6). Candidate selection circuit 160 for the first excitation codebook outputs the weighted synthetic signal and index of the candidate of the selected first excitation codevector to optimum combination search circuit 200. In the same manner, candidate selection circuit 170 for the second excitation codebook selects the candidate of the second excitation codevector from the second excitation codebook according to equation (9), and outputs the weighted synthetic signal and index of the selected second excitation codevector to optimum combination search circuit 200.
Optimum combination search circuit 200 searches for the combination of the optimum candidates according to equation (14), and outputs the delay of the adaptive codevector and the indices of the first and second excitation codevectors to multiplexer 230, and weighted synthetic signals of each codevector to gain codebook search circuit 210. Gain codebook search circuit 210 searches for the optimum gain codevector from gain codebook 220 according to each of the inputted weighted synthetic signals, and outputs the index of thus obtained gain codevector to multiplexer 230.
Finally, multiplexer 230 assembles and outputs the code for the speech signal divided into subframes according to the delay and index outputted from optimum combination search circuit 200 and to the index outputted from gain codebook search circuit 210. By carrying out the above process, speech coding of every subframe is completed.
According to the present embodiment, the candidates are selected first from the adaptive codebook and each of the excitation codebooks, and then the optimum combination is selected from the combinations of the thus selected candidates, so that sufficiently good speech quality can be obtained with a relatively small amount of computation. In addition, since the gain codebook which stores the quantized gain vectors is used for selecting the optimum combination from the combinations of the candidates, the SN ratio is further improved.
The second embodiment of the present invention will be described with reference to Fig. 2. In the voice-coder shown in Fig. 2, each block attached with the same reference numeral as that in Fig. 1 has the same function as that in Fig. 1.
The voice-coder in Fig. 2, when compared with the voice-coder in Fig. 1, differs in that it has no gain codebook search circuit and optimum combination search circuit, but instead has gain-including optimum combination search circuit 300. Gain-including optimum combination search circuit 300 receives candidates of the adaptive codevectors, candidates of the first and second excitation codevectors, and the output of influence signal subtracter 140, and selects the optimum combination from all of the combinations of the candidates and gain codevectors by searching gain codebook 220. Gain-including optimum combination search circuit 300 is structured so as to output the delay or index of each codevector composing the selected combination to multiplexer 230 according to the selected combination.
The search algorithm which controls gain-including optimum combination search circuit 300 will next be described.
Gain-including optimum combination search circuit 300 searches for the combination of candidates which has the minimum value of error E by calculating E for all of the combinations of candidates of the selected adaptive codevectors, the selected first and second excitation codevectors, and all of the gain codevectors, where E is calculated by the following equation:

E = ‖ z − βk s_ad − γk s_ei1 − δk s_ej2 ‖²    (16)

where (βk, γk, δk) is the k-th gain codevector. It is acceptable to use, in place of βk, γk, δk, not the gain codevector itself, but gain codevectors converted by a matrix calculated from the quantized power of the weighted input signal, the weighted synthetic signal of the adaptive codevector and the weighted synthetic signals of the first and second excitation codevectors. Since a large amount of computation is required to search for the minimum value of E by calculating it against all the gain codevectors, it is also possible to perform a preliminary selection of the gain codebook to reduce the computation. The preliminary selection of the gain codebook is performed, for example, by selecting a predetermined fixed number of gain codevectors whose first components are close to the sequential optimum gain of the adaptive codevector.
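The exhaustive search of equation (16), together with the preliminary selection described above, can be sketched as follows. This is an illustrative formulation, not the patent's implementation: gain codevectors are assumed to be stored as (β, γ, δ) triples, and the preliminary selection keeps the rows whose first component is closest to the sequential optimum adaptive gain:

```python
import numpy as np

def search_gain_codebook(z, s_ad, s_e1, s_e2, gain_codebook, n_prelim=None):
    """Pick the gain codevector minimizing E of equation (16).

    gain_codebook -- array of shape (K, 3); each row is one
                     (beta_k, gamma_k, delta_k) gain codevector
    n_prelim      -- if given, restrict the search to the n_prelim rows
                     whose first component is closest to the sequential
                     optimum adaptive gain (preliminary selection)
    """
    indices = list(range(len(gain_codebook)))
    if n_prelim is not None:
        beta0 = np.dot(z, s_ad) / np.dot(s_ad, s_ad)  # equation (2)
        indices = sorted(indices,
                         key=lambda k: abs(gain_codebook[k][0] - beta0))
        indices = indices[:n_prelim]
    best_k, best_e = None, np.inf
    for k in indices:
        b, g, d = gain_codebook[k]
        err = z - b * s_ad - g * s_e1 - d * s_e2      # equation (16)
        e = np.dot(err, err)
        if e < best_e:
            best_k, best_e = k, e
    return best_k, best_e
```

With `n_prelim=None` this is the full search over all K gain codevectors; the preliminary selection trades a small risk of missing the optimum for a K-fold reduction in error evaluations.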
The operation of this voice-coder will be described. It is the same as that of the voice-coder shown in Fig. 1 except that the candidates of vectors are outputted from each of candidate selection circuits 150, 160 and 170. These candidates of codevectors are inputted into gain-including optimum combination search circuit 300, whereby the optimum combination of candidates is searched according to equation (16). Then consulting the searched combination, the delay of the adaptive codevector and indices of the first and second excitation codevectors and gain codevectors are inputted into multiplexer 230, from which speech signal codes are outputted.
Next, the third embodiment of the present invention will be described with reference to Fig. 3.
In the voice-coder shown in Fig. 3, each block attached with the same reference numeral as that in Fig. 1 has the same function as that in Fig. 1.
This voice-coder differs from the one shown in Fig. 1 in that the second excitation codebook is composed of excitation super codebook 390. A super codebook means a codebook which stores codevectors with the number of bits larger than the number of bits to be transmitted. Index i of the candidate of the first excitation codevector is outputted from first excitation codebook selection circuit 160 to second excitation super codebook 390. The selection of the candidate of the second excitation codevectors from second excitation super codebook 390 is carried out by searching codevectors from a portion of second excitation super codebook 390, the portion being expressed by set F2(i) of indices to be determined according to index i of the first excitation codevector.
When the searching of the candidates of the first and second codevectors is finished, the optimum combination of candidates is then searched in optimum combination search circuit 200 according to equation (14), as in the first embodiment. In the present embodiment, it is possible to modify the coder so as to output all the second excitation codevectors which correspond to the set of indices F2(i) without performing selection of candidates of the second excitation codevectors in candidate selection circuit 170 for the second excitation codebook. In this case, optimum combination search circuit 200 can search the optimum combination from the combinations of the candidate of the adaptive codevectors, the candidate of the first excitation codevectors, and all of the second excitation codevectors corresponding to set F2(i).
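The super codebook search restricts the second-stage search to the index subset F2(i) keyed by the first-stage index i. A minimal illustration of that control flow follows; the mapping F2 and the scoring callable are placeholders for illustration, not the patent's actual tables:

```python
def search_super_codebook(i, super_codebook, f2, score):
    """Search only the portion F2(i) of the second excitation super codebook.

    i              -- index of the selected first excitation codevector
    super_codebook -- dict mapping index to a second excitation codevector
    f2             -- mapping i -> admissible index set F2(i)
    score          -- callable returning the (lower-is-better) error of a
                      candidate codevector, e.g. via equation (9)
    """
    candidates = f2[i]  # only this subset is searched (and transmittable)
    return min(candidates, key=lambda j: score(super_codebook[j]))
```

Because only |F2(i)| indices need to be distinguishable given i, the stored codebook can hold more codevectors than the transmitted index bits could address on their own, which is the point of the super codebook.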
As described above, in the third embodiment of the present invention, by applying the super codebook it becomes possible to obtain speech quality substantially as good as in the case of an excitation codebook of increased codebook size, without increasing the bit rate.
The fourth embodiment of the present invention will next be described with reference to Fig. 4. In the voice-coder shown in Fig. 4, each block attached with the same reference numeral as that in Fig. 2 has the same function as that in Fig. 2.
This voice-coder uses second excitation super codebook 390 instead of the second excitation codebook, differently from the voice-coder in Fig. 2. Super codebook 390 is similar to the super codebook in the voice-coder shown in Fig. 3. The candidate of the second excitation codevector to be selected from second excitation super codebook 390 is also selected in the same way as in the third embodiment, and other operations are conducted in the same manner as in the second embodiment. In this case, it is also possible to modify candidate selection circuit 170 for the second excitation codevectors so as to output all of the second excitation codevectors which correspond to the set of indices F2(i) without selecting the candidate of the second excitation codevectors.
Although each embodiment of the present invention has been described above, the operation of each embodiment can be modified in such a way that the auto-correlation ⟨s_ei, s_ei⟩ of weighted synthetic signal s_ei of the excitation codevector is obtained according to the following equation for the purpose of reducing the computation:

⟨s_ei, s_ei⟩ = hh(0) ee_i(0) + 2 Σ_{l=1..im} hh(l) ee_i(l)    (17)

where hh is an auto-correlation function of the impulse response of the weighting synthesis filter, ee_i is an auto-correlation function of the excitation codevector with index i, and im is the length of the impulse response.
As well, cross-correlation between weighted synthetic signal sei of the excitation codevector and arbitrary vector v can be calculated according to the following equation to reduce the operation:
< v, sei> = < HTv, ei> ______----(18) - 23 - 20843~2 where H is an impulse response matrix of the weighting synthesis filter.
Cross-correlation between weighted synthetic signal sad of the adaptive codevector and arbitrary vector v can be obtained according to the following equation in the same way:
~ v, sad> = ~ HTv, ad> -----------(l9) Further, in the case of searching for the optimum combination in the optimum combination search circuit of the first and third embodiments, although a particular limitation (~ = ~ ) is now assigned to gains ~ , ~ of the first and second excitation codevectors as described above, it is possible to provide limitations other than ~ = ~ or to provide no limitation.
Further, it is also possible to apply a delayed decision system in each embodiment in such a way that the combination of candidates is selected so as to have the minimum cumulative error for the whole frames without uniquely determi n ing the adaptive codevector, the first and second excitation codevectors and the gain codevector for each subframe while leaving the candidates undetermined.
It is to be understood that variations and modifications of the method for speech coding and of the voice-coder disclosed herein will be evident to those skilled in the art. It is intended that all such -- 2n84~3~
modifications and variations be included within the scope of the appended claims.
BACKGROUND OF THE INVENTION:
Field of the Invention:
The present invention relates to a method for speech coding and to a voice-coder, particularly to a method for speech coding and to a voice-coder which can achieve high coding quality with a relatively small operation at bit rates not greater than 8 kbit/s.
Description of the Related Art:
As a speech coding system to be applied to vector quantization of excitation signals at low bit rates by using an excitation codebook comprising random numbers, the CELP system described in the paper (hereinafter referred to as literature 1) titled "CODE-EXCITED LINEAR PREDICTION (CELP): HIGH-QUALITY SPEECH AT VERY LOW BIT RATES" (Proc. ICASSP, pp. 937-940, 1985) by Manfred R. Schroeder and Bishnu S. Atal is known. There is also a CELP system using an adaptive codebook described in a paper (hereinafter referred to as literature 2) titled "IMPROVED SPEECH QUALITY AND EFFICIENT VECTOR QUANTIZATION IN SELP" (Proc. ICASSP, pp. 155-158, 1988) by W. B. Kleijn, D. J. Krasinski and R. H. Ketchum.
The CELP system using the adaptive codebook receives speech signals divided into frames spaced with a constant interval. The CELP system utilizes a linear predictive analyzer for obtaining the spectral parameters of the input speech signals, the adaptive codebook storing excitation signals determined in the past, and the excitation codebook comprising random numbers to be used for vector quantization of the excitation signals of said input speech signals. The CELP system selects an adaptive codevector for every subframe made by equally dividing the frame, by using the input speech signal and the synthesized signal of the adaptive codevector.
Subsequently, the CELP system performs selection of excitation codevectors by using the input signals, the synthesized signal of the selected adaptive codevector and the synthesized signal of each excitation codevector.
However, these CELP systems have the following disadvantages. First, a very large amount of computation is required for searching the excitation codebook.
Moreover, since the adaptive codebook is determined independently of the excitation codebook, it is impossible to get a high SN (signal-to-noise) ratio.
Further, in the above CELP system, although the adaptive codebook and the excitation codebook are each searched using unquantized gains, it becomes possible to obtain a higher SN ratio when the adaptive codebook and the excitation codebook are searched over all the quantized gain values. Furthermore, it is impossible to obtain sufficiently good speech quality at low bit rates such as 8 kbit/s or less because the excitation codebook is too small.
SUMMARY OF THE INVENTION:
An object of the present invention is to provide a method for speech coding which can solve the above problems of the conventional method and achieve high speech quality with a relatively small amount of computation even at low bit rates such as 8 kbit/s or less.
Another object of the present invention is to provide a voice-coder which can solve the above problems of the conventional method and achieve high speech quality with a relatively small amount of computation even at low bit rates such as 8 kbit/s or less.
The object of the present invention can be achieved by a method for speech coding for coding speech signals divided into frames spaced with a constant interval, wherein an adaptive codebook storing excitation signals determined in the past and a plurality of excitation codebooks for multi-stage vector quantization of an excitation signal of the input speech signal are prepared; a spectral parameter of said input speech signal is obtained; said frame is divided into subframes; a candidate of a first fixed number of adaptive codevectors is selected for every said subframe from said adaptive codebook by using said input speech signal and said spectral parameter; candidates of a second fixed number of excitation codevectors are selected for every said subframe from said excitation codebooks, respectively, by using said input speech signal, said spectral parameter and the candidate of said adaptive codevector; and an optimum combination of the adaptive codevector and each of the excitation codevectors forming an excitation signal of said subframe is selected from the candidates of said adaptive codevector and each of said excitation codevectors by using said input speech signal and said spectral parameter.
Another object of the present invention is achieved by a voice-coder for coding speech signals divided into frames spaced with a constant interval, comprising: linear prediction analysis means for outputting spectral parameters of input speech signals; an adaptive codebook for storing excitation signals determined in the past; and a plurality of excitation codebooks provided for multi-stage vector quantization of the excitation signal of said input speech signals; wherein, in case of searching, from said adaptive codebook and each of said excitation codebooks, respectively, for a combination of the adaptive codevector and each of the excitation codevectors for every subframe prepared by further division of said frame, said combination of the adaptive codevector and each of the excitation codevectors forming an excitation signal of said subframe: a candidate of a first predetermined number of adaptive codevectors is selected from said adaptive codebook by using said input speech signal and said spectral parameter; candidates of each predetermined number of excitation codevectors are selected from the plurality of said excitation codebooks, respectively, by using said input speech signal, said spectral parameter and the candidate of said adaptive codevector; and an optimum candidate of the adaptive codevector and each of the excitation codevectors forming the excitation signal of said subframe is selected from the candidate of said adaptive codevector and each of said excitation codevectors by using said input speech signal and said spectral parameter.
Another object of the present invention is also achieved by a voice-coder for coding speech signals divided into frames spaced with a constant interval, comprising: linear prediction analysis means for outputting spectral parameters of input speech signals; an adaptive codebook storing excitation signals determined in the past; a plurality of excitation codebooks provided for multi-stage vector quantization of an excitation signal of said input speech signals; subframe division means for generating subframe signals by dividing said frame into subframes; first selection means for selecting a candidate of a first fixed number of adaptive codevectors from said adaptive codebook in accordance with said subframe signal and said spectral parameter; second selection means, provided for every said excitation codebook, for selecting the candidates of the excitation codevectors of the number predetermined for every excitation codebook, from the corresponding excitation codebook in accordance with said subframe signal, said spectral parameter and the candidate of said adaptive codevector; and means for searching for the candidate of the adaptive codevector and the excitation codevectors which form the excitation signal of said subframe, from the candidate of said adaptive codevector and the candidate of each of said excitation codevectors, in accordance with said input speech signal and said spectral parameter.
The above and other objects, features and advantages of the present invention will be apparent from the following description referring to the accompanying drawings which illustrate examples of preferred embodiments of the present invention.
BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 is a block diagram showing the structure of a voice-coder of a first embodiment of the present invention.
Fig. 2 is a block diagram showing the structure of a voice-coder of a second embodiment of the present invention.
Fig. 3 is a block diagram showing the structure of a voice-coder of a third embodiment of the present invention.
Fig. 4 is a block diagram showing the structure of a voice-coder of a fourth embodiment of the present invention.
DESCRIPTION OF THE PREFERRED EMBODIMENTS:
A first preferable embodiment of the present invention will be described with reference to Fig. 1.
In the voice-coder shown in Fig. 1, there are provided adaptive codebook 175, gain codebook 220 and two kinds of excitation codebooks 180, 190.
Speech input circuit 100 is provided for receiving speech signals divided into frames spaced with a constant interval. Subframe division circuit 120 and linear prediction analysis circuit 110 are provided on the output side of speech input circuit 100. Subframe division circuit 120 outputs subframes by equally dividing the frame, and linear prediction analysis circuit 110 performs linear prediction analysis of the speech signals composing each frame to obtain the spectral parameters of the speech signals. On the output side of subframe division circuit 120, weighting filter 130 is provided for performing perceptual weighting of the received subframes using the spectral parameters. On the output side of weighting filter 130, influence signal subtracter 140 is provided for subtracting the weighted influence signal from the preceding subframe and outputting the result.
Adaptive codebook 175 stores excitation signals determined in the past as adaptive codevectors.
Corresponding to adaptive codebook 175, adaptive codebook candidate selection circuit 150 is provided for selecting a previously fixed number of adaptive codevectors and outputting them as candidates of adaptive codevectors. Adaptive codebook candidate selection circuit 150 performs this selection according to the spectral parameter and the output signal of influence signal subtracter 140.
First and second excitation codebooks 180, 190 operate for multi-stage vector quantization of the excitation signal, and store the first and second excitation codevectors, respectively. Corresponding to first and second excitation codebooks 180, 190, candidate selection circuits 160, 170 for the first and second excitation codebooks are provided respectively.
Candidate selection circuits 160, 170 select a previously fixed number of excitation codevectors from the corresponding excitation codebooks 180, 190 and output them as the candidates of the excitation codevectors. The spectral parameters, the output signal of the influence signal subtracter and the candidates of the adaptive codevectors are inputted into each of candidate selection circuits 160, 170 for the excitation codebooks.
Optimum combination search circuit 200 is provided to search for the optimum combination of the candidates selected by candidate selection circuits 150, 160, 170 for the corresponding codebooks. Further, gain codebook search circuit 210 and multiplexer 230 are provided. Optimum combination search circuit 200 is structured so as to output to multiplexer 230 the delay (for the optimum adaptive codevector) and the indices (for the optimum first and second excitation codevectors) according to the results of the search, and to output the weighted synthetic signals of these vectors to gain codebook search circuit 210. Gain codebook search circuit 210 searches for the optimum gain codevector in gain codebook 220, which stores gain codevectors, and outputs the index of the searched optimum gain codevector. Multiplexer 230 is structured so as to receive the delay and indices from optimum combination search circuit 200 and gain codebook search circuit 210, and to output codes which correspond to the input speech signals according to the delay and indices.
Next, the selection and search algorithms of candidate selection circuits 150, 160, 170 and optimum combination search circuit 200 of the present embodiment will be described. Under these algorithms, the excitation signal is processed by two-stage vector quantization using the two kinds of excitation codebooks 180, 190.
First, in adaptive codebook candidate selection circuit 150, a predetermined number L0 of adaptive codevectors is selected, in order from the one with the smallest error E0 expressed by equation (1):

    E_0 = \| z - \beta_0 \, sad \|^2    (1)

where z is the signal obtained by subtracting the influence signal from the perceptually weighted input signal, sad is the perceptually weighted synthetic signal of adaptive codevector ad with delay d, \beta_0 is the sequential optimum gain of the adaptive codevector, and \| \cdot \| is the Euclidean norm. The sequential optimum gain \beta_0 of the adaptive codevector is given by:

    \beta_0 = \frac{\langle z, sad \rangle}{\langle sad, sad \rangle}    (2)

By substituting equation (2) into equation (1), the next equation is obtained:

    E_0 = \| z \|^2 - \frac{\langle z, sad \rangle^2}{\langle sad, sad \rangle}    (3)

where \langle \cdot , \cdot \rangle represents an inner product.
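As a concrete illustration of equation (3): since \| z \|^2 does not depend on the delay, selecting the L0 delays with the smallest E0 is the same as selecting the L0 delays with the largest matched term ⟨z, sad⟩² / ⟨sad, sad⟩. The following pure-Python sketch (hypothetical function and variable names, toy data — not the patent's implementation) shows this preselection:

```python
def dot(x, y):
    """Inner product <x, y>."""
    return sum(a * b for a, b in zip(x, y))

def select_adaptive_candidates(z, synth_by_delay, L0):
    """Keep the L0 delays minimizing E0 = ||z||^2 - <z,sad>^2 / <sad,sad>.

    z              -- weighted input minus influence signal (one subframe)
    synth_by_delay -- dict: delay d -> weighted synthetic signal sad
    """
    def score(d):
        sad = synth_by_delay[d]
        return dot(z, sad) ** 2 / dot(sad, sad)   # term to maximize
    return sorted(synth_by_delay, key=score, reverse=True)[:L0]

# toy example with three candidate delays
z = [1.0, 2.0, 0.0, -1.0]
synth = {
    20: [1.0, 2.0, 0.0, -1.0],   # collinear with z -> best match
    21: [0.0, 1.0, 1.0, 0.0],
    22: [-1.0, 0.0, 1.0, 1.0],
}
best = select_adaptive_candidates(z, synth, L0=2)
```

In a real coder the synthetic signals sad would be produced by filtering each past-excitation segment through the weighting synthesis filter; here they are given directly.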
In candidate selection circuit 160 for the first excitation codebook, candidates of a predetermined number L1 of the first excitation codevectors are selected for each of the L0 adaptive codevector candidates selected by candidate selection circuit 150 for the adaptive codebook, in order from the one with the smallest error E1 expressed by equation (4):

    E_1 = \| za - \gamma_0 \, se_i^1 \|^2    (4)

where se_i^1 is the perceptually weighted synthesized signal of first excitation codevector e_i^1 with index i, \gamma_0 is the sequential optimum gain of the first excitation codevector, and za = z - \beta_0 \, sad. Therefore:

    \gamma_0 = \frac{\langle za, se_i^1 \rangle}{\langle se_i^1, se_i^1 \rangle}    (5)

By substituting equation (5) into equation (4), equation (6) below is obtained:

    E_1 = \| za \|^2 - \frac{\langle za, se_i^1 \rangle^2}{\langle se_i^1, se_i^1 \rangle}    (6)

In the same way, in candidate selection circuit 170 for the second excitation codebook, candidates of a predetermined number L2 of the second excitation codevectors are selected for each of the L0 adaptive codevector candidates, in order from the one with the smallest error E2 expressed by the next equation:

    E_2 = \| za - \delta_0 \, se_j^2 \|^2    (7)

where se_j^2 is the perceptually weighted synthesized signal of second excitation codevector e_j^2 with index j, and \delta_0 is the sequential optimum gain of the second excitation codevector. Therefore:

    \delta_0 = \frac{\langle za, se_j^2 \rangle}{\langle se_j^2, se_j^2 \rangle}    (8)

By substituting equation (8) into equation (7), the following equation (9) is obtained:

    E_2 = \| za \|^2 - \frac{\langle za, se_j^2 \rangle^2}{\langle se_j^2, se_j^2 \rangle}    (9)
In optimum combination search circuit 200, error E is calculated by the following equation for all the combinations of the candidates of the selected adaptive codevectors and the first and second excitation codevectors, and then the combination of candidates with minimum E is searched:

    E = \| z - \beta \, sad - \gamma \, se_i^1 - \delta \, se_j^2 \|^2    (10)

where \beta, \gamma, \delta are the simultaneous optimum gains of the adaptive codevector and the first and second excitation codevectors, respectively. Therefore:

    \begin{pmatrix} \beta \\ \gamma \\ \delta \end{pmatrix}
      = R^{-1} \begin{pmatrix} \langle z, sad \rangle \\ \langle z, se_i^1 \rangle \\ \langle z, se_j^2 \rangle \end{pmatrix}    (11)

where R satisfies the following equation:

    R = \begin{pmatrix}
      \langle sad, sad \rangle & \langle sad, se_i^1 \rangle & \langle sad, se_j^2 \rangle \\
      \langle se_i^1, sad \rangle & \langle se_i^1, se_i^1 \rangle & \langle se_i^1, se_j^2 \rangle \\
      \langle se_j^2, sad \rangle & \langle se_j^2, se_i^1 \rangle & \langle se_j^2, se_j^2 \rangle
    \end{pmatrix}    (12)

By substituting equation (11) into equation (10), the following is obtained:

    E = \| z \|^2 - \left( \langle z, sad \rangle \;\; \langle z, se_i^1 \rangle \;\; \langle z, se_j^2 \rangle \right)
        R^{-1} \begin{pmatrix} \langle z, sad \rangle \\ \langle z, se_i^1 \rangle \\ \langle z, se_j^2 \rangle \end{pmatrix}    (13)

When the above error E is calculated, it is acceptable to assign a particular limitation to the simultaneous optimum gains \gamma, \delta of the excitation codevectors. For example, with the limitation that \gamma and \delta are equal, error E is given by:

    E = \| z \|^2 - \left( \langle z, sad \rangle \;\; \langle z, se_i^1 + se_j^2 \rangle \right)
        R^{-1} \begin{pmatrix} \langle z, sad \rangle \\ \langle z, se_i^1 + se_j^2 \rangle \end{pmatrix}    (14)

where

    R = \begin{pmatrix}
      \langle sad, sad \rangle & \langle sad, se_i^1 + se_j^2 \rangle \\
      \langle se_i^1 + se_j^2, sad \rangle & \langle se_i^1 + se_j^2, se_i^1 + se_j^2 \rangle
    \end{pmatrix}    (15)
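The joint-gain computation of equations (11)-(13) can be sketched as follows: build the Gram matrix R of equation (12) from the three weighted synthetic signals, solve the linear system of equation (11) for the simultaneous gains (written here as beta, gamma, delta), and evaluate the residual error of equation (13). The helper names and toy vectors below are illustrative, not from the patent:

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [mr - f * mc for mr, mc in zip(M[r], M[c])]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def joint_gains_and_error(z, sad, se1, se2):
    """Equations (11)-(13): simultaneous gains and resulting error E."""
    sigs = [sad, se1, se2]
    R = [[dot(a, b) for b in sigs] for a in sigs]               # equation (12)
    c = [dot(z, s) for s in sigs]
    gains = solve(R, c)                                         # equation (11)
    E = dot(z, z) - sum(ci * gi for ci, gi in zip(c, gains))    # equation (13)
    return gains, E

# toy subframe: z is exactly 1.0*sad + 0.5*se1 - 0.25*se2
sad = [1.0, 0.0, 1.0, 0.0]
se1 = [0.0, 2.0, 0.0, 0.0]
se2 = [0.0, 0.0, 0.0, 4.0]
z = [1.0, 1.0, 1.0, -1.0]
gains, E = joint_gains_and_error(z, sad, se1, se2)
```

With the gamma = delta constraint of equation (14), the same routine applies with sigs reduced to the two signals sad and se1 + se2.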
Next, description will be made with reference to operation of the voice-coder of the present embodiment.
Speech input circuit 100 receives speech signals divided into each frame (e.g., 40 ms in width), which signals are outputted to linear prediction analysis circuit 110 and subframe division circuit 120.
In linear prediction analysis circuit 110, linear prediction analysis of the inputted speech signal is performed for calculating the spectral parameter. This spectral parameter is outputted to weighting filter 130, to influence signal subtracter 140, to candidate selection circuit 150 for the adaptive codebook, to candidate selection circuit 160 for the first excitation codebook, to candidate selection circuit 170 for the second excitation codebook, and to multiplexer 230.
Separately, a frame is divided into subframes (e.g., 8 ms in width) by subframe division circuit 120. Speech signals divided into subframes are inputted into weighting filter 130. Weighting filter 130 performs perceptual weighting of inputted speech signals, and outputs the results to influence signal subtracter 140.
Influence signal subtracter 140 subtracts the weighted influence signal from the preceding subframe, and outputs the result to candidate selection circuit 150 for the adaptive codebook, to candidate selection circuit 160 for the first excitation codebook, to candidate selection circuit 170 for the second excitation codebook, and to gain codebook search circuit 210.
Subsequently, candidate selection circuit 150 for the adaptive codebook selects the candidates of L0 adaptive codevectors from adaptive codebook 175 according to equation (3). Candidate selection circuit 150 outputs the weighted synthetic signals of the selected adaptive codevector candidates and delay d, which constitutes the index of each candidate, to candidate selection circuits 160, 170 for the first and second excitation codebooks and to optimum combination search circuit 200.
Candidate selection circuit 160 for the first excitation codebook selects the candidates of L1 first excitation codevectors from first excitation codebook 180, according to the output of the influence signal subtracter, the spectral parameter and the candidates of the adaptive codevector, using equation (6). Candidate selection circuit 160 outputs the weighted synthetic signal and index of each selected first excitation codevector candidate to optimum combination search circuit 200. In the same manner, candidate selection circuit 170 for the second excitation codebook selects the candidates of the second excitation codevectors from the second excitation codebook according to equation (9), and outputs the weighted synthetic signal and index of each selected second excitation codevector to optimum combination search circuit 200.
Optimum combination search circuit 200 searches for the combination of the optimum candidates according to equation (14), and outputs the delay of the adaptive codevector and the indices of the first and second excitation codevectors to multiplexer 230, and weighted synthetic signals of each codevector to gain codebook search circuit 210. Gain codebook search circuit 210 searches for the optimum gain codevector from gain codebook 220 according to each of the inputted weighted synthetic signals, and outputs the index of thus obtained gain codevector to multiplexer 230.
Finally, multiplexer 230 assembles and outputs the code for the speech signal divided into subframes according to the delay and index outputted from optimum combination search circuit 200 and to the index outputted from gain codebook search circuit 210. By carrying out the above process, speech coding of every subframe is completed.
According to the present embodiment, the candidates are selected first from the adaptive codebook and each of the excitation codebooks, and then the optimum combination is selected from the combinations of the selected candidates, so that sufficiently good speech quality can be obtained with a relatively small amount of computation. In addition, since the gain codebook storing the quantized gain vectors is used when selecting the optimum combination of candidates, the SN ratio is further improved.
The second embodiment of the present invention will be described with reference to Fig. 2. In the voice-coder shown in Fig. 2, each block attached with the same reference numeral as that in Fig. 1 has the same function as that in Fig. 1.
The voice-coder in Fig. 2, when compared with the voice-coder in Fig. 1, differs in that it has no gain codebook search circuit and no optimum combination search circuit, but has instead gain-including optimum combination search circuit 300. Gain-including optimum combination search circuit 300 receives the candidates of the adaptive codevectors, the candidates of the first and second excitation codevectors, and the output of influence signal subtracter 140, and selects the optimum combination from all of the combinations of the candidates and gain codevectors by searching gain codebook 220. Gain-including optimum combination search circuit 300 is structured so as to output the delay or index of each codevector composing the selected combination to multiplexer 230.
The search algorithm which controls gain-including optimum combination search circuit 300 will next be described.
Gain-including optimum combination search circuit 300 searches for the combination of candidates which has the minimum value of error E by calculating E
for all of the combinations of candidates of the selected adaptive codevectors, the selected first and second excitation codevectors, and all of the gain codevectors, where E is calculated by the following equation:
    E = \| z - \beta_k \, sad - \gamma_k \, se_i^1 - \delta_k \, se_j^2 \|^2    (16)

where \beta_k, \gamma_k, \delta_k are the components of the k-th gain codevector.
It is acceptable to use, in place of the above \beta_k, \gamma_k, \delta_k, not the gain codevector itself, but gain codevectors converted by a matrix calculated from the quantized power of the weighted input signal, the weighted synthetic signal of the adaptive codevector and the weighted synthetic signals of the first and second excitation codevectors. Since searching for the minimum value of E by calculating it against all the gain codevectors requires a large amount of computation, it is also possible to perform a preliminary selection of the gain codebook to reduce the computation. The preliminary selection of the gain codebook is performed, for example, by selecting a predetermined fixed number of gain codevectors whose first components are close to the sequential optimum gain of the adaptive codevector.
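A sketch of the gain-included search of equation (16), together with the preliminary selection just described (keeping only the gain codevectors whose first component is close to the sequential optimum adaptive gain). The codebook contents and function names are hypothetical; the matrix conversion of the gains mentioned above is omitted for brevity:

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def gain_included_search(z, sad, se1, se2, gain_codebook, n_prelim):
    """Equation (16): pick the gain codevector (beta_k, gamma_k, delta_k)
    minimizing E = ||z - beta_k*sad - gamma_k*se1 - delta_k*se2||^2,
    after a preliminary selection on the adaptive-gain component."""
    beta_opt = dot(z, sad) / dot(sad, sad)   # sequential optimum gain, eq. (2)
    prelim = sorted(gain_codebook, key=lambda g: abs(g[0] - beta_opt))[:n_prelim]

    def err(g):
        b, c, d = g
        r = [zi - b * a - c * e1 - d * e2
             for zi, a, e1, e2 in zip(z, sad, se1, se2)]
        return dot(r, r)

    return min(prelim, key=err)

# toy data: z is an exact combination with gains (0.9, 0.5, -0.3)
sad = [1.0, 0.0, 1.0, 0.0]
se1 = [0.0, 1.0, 0.0, 0.0]
se2 = [0.0, 0.0, 0.0, 1.0]
z = [0.9, 0.5, 0.9, -0.3]
book = [(1.0, 0.5, -0.3), (0.9, 0.5, -0.3), (0.2, 0.1, 0.0), (0.8, 0.4, -0.2)]
best = gain_included_search(z, sad, se1, se2, book, n_prelim=3)
```

The preliminary pass cuts the exhaustive loop over the gain codebook down to n_prelim candidates before the full error of equation (16) is evaluated.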
The operation of this voice-coder will be described. It is the same as that of the voice-coder shown in Fig. 1 except that the candidates of vectors are outputted from each of candidate selection circuits 150, 160 and 170. These candidates of codevectors are inputted into gain-including optimum combination search circuit 300, whereby the optimum combination of candidates is searched according to equation (16). Then consulting the searched combination, the delay of the adaptive codevector and indices of the first and second excitation codevectors and gain codevectors are inputted into multiplexer 230, from which speech signal codes are outputted.
Next, the third embodiment of the present invention will be described with reference to Fig. 3.
In the voice-coder shown in Fig. 3, each block attached with the same reference numeral as that in Fig. 1 has the same function as that in Fig. 1.
This voice-coder differs from the one shown in Fig. 1 in that the second excitation codebook is composed of second excitation super codebook 390. A super codebook means a codebook which stores codevectors with a number of bits larger than the number of bits to be transmitted. Index i of the candidate of the first excitation codevector is outputted from candidate selection circuit 160 for the first excitation codebook to second excitation super codebook 390. The selection of the candidates of the second excitation codevectors from second excitation super codebook 390 is carried out by searching the codevectors of a portion of second excitation super codebook 390, the portion being expressed by set F2(i) of indices determined according to index i of the first excitation codevector.
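A minimal sketch of the super-codebook search: the codebook stores more codevectors than the transmitted index can address, and for each first-stage index i only the portion F2(i) is searched, using the matched term of equation (9). The particular mapping F2 below is an arbitrary illustration — the patent does not fix one — and, for brevity, the stored vectors stand in for their weighted synthetic signals:

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

# a super codebook holding more codevectors than the transmitted
# index can address; only len(F2(i)) entries are searchable per i
SUPER_SIZE, SENT_BITS = 8, 2          # 8 stored entries, 2 bits transmitted

def F2(i):
    """Illustrative index-set map: which super-codebook entries are
    reachable given first-stage index i (one of many possible schemes)."""
    return [(i + 2 * k) % SUPER_SIZE for k in range(1 << SENT_BITS)]

def search_second_stage(za, super_book, i):
    """Within the portion F2(i), pick the codevector maximizing
    <za, se>^2 / <se, se>, i.e., minimizing E2 of equation (9)."""
    def score(j):
        se = super_book[j]
        return dot(za, se) ** 2 / dot(se, se)
    return max(F2(i), key=score)

super_book = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0],
              [-1.0, 0.0], [0.0, -1.0], [2.0, 1.0], [1.0, 2.0]]
j = search_second_stage([1.0, 1.0], super_book, i=0)
```

Only the position of j within F2(i) (here 2 bits) would need to be transmitted, which is what keeps the bit rate unchanged while the stored codebook is larger.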
When the search of the candidates of the first and second codevectors is finished, the optimum combination of candidates is searched in optimum combination search circuit 200 according to equation (14), as in the first embodiment. In the present embodiment, it is possible to modify the coder so as to output all the second excitation codevectors which correspond to the set of indices F2(i) without performing selection of candidates of the second excitation codevectors in candidate selection circuit 170 of the second excitation codebook. In this case, optimum combination search circuit 200 can search for the optimum combination among the combinations of the candidates of the adaptive codevectors, the candidates of the first excitation codevectors, and all of the second excitation codevectors corresponding to set F2(i).
As described above for the third embodiment of the present invention, by applying the super codebook it becomes possible to obtain speech quality substantially as good as with an excitation codebook of an increased codebook size, without increasing the bit rate.
The fourth embodiment of the present invention will next be described with reference to Fig. 4. In the voice-coder shown in Fig. 4, each block attached with the same reference numeral as that in Fig. 2 has the same function as that in Fig. 2.
This voice-coder differs from the voice-coder in Fig. 2 in that it uses second excitation super codebook 390 instead of the second excitation codebook. Super codebook 390 is similar to the super codebook in the voice-coder shown in Fig. 3. The candidate of the second excitation codevector to be selected from second excitation super codebook 390 is also selected in the same way as in the third embodiment, and the other operations are conducted in the same manner as in the second embodiment. In this case, it is also possible to modify candidate selection circuit 170 for the second excitation codevectors so as to output all of the second excitation codevectors which correspond to the set of indices F2(i) without selecting the candidates of the second excitation codevectors.
Although each embodiment of the present invention has been described above, the operation of each embodiment can be modified in such a way that the auto-correlation \langle se_i, se_i \rangle of weighted synthetic signal se_i of the excitation codevector is obtained according to the following equation in order to reduce the amount of computation:

    \langle se_i, se_i \rangle = hh(0) \, ee_i(0) + 2 \sum_{l=1}^{im} hh(l) \, ee_i(l)    (17)

where hh is the auto-correlation function of the impulse response of the weighting synthesis filter, ee_i the auto-correlation function of the excitation codevector with index i, and im the length of the impulse response.
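Equation (17) lets the energy of the filtered codevector be formed from two short auto-correlation sequences instead of filtering and squaring each codevector. A pure-Python check under the assumption that se_i is the full (untruncated) convolution of the impulse response with the codevector, for which the identity is exact; with subframe-truncated synthesis it becomes an approximation. All names are illustrative:

```python
def autocorr(x, lag):
    """Auto-correlation sum_n x[n] * x[n+lag]."""
    return sum(x[n] * x[n + lag] for n in range(len(x) - lag))

def conv(h, e):
    """Weighted synthetic signal: convolution of impulse response h with e."""
    y = [0.0] * (len(h) + len(e) - 1)
    for n, hn in enumerate(h):
        for m, em in enumerate(e):
            y[n + m] += hn * em
    return y

def energy_via_eq17(h, e):
    """Equation (17): <se, se> = hh(0)ee(0) + 2 * sum_{l=1}^{im} hh(l)ee(l)."""
    im = len(h)
    hh = [autocorr(h, l) for l in range(im)]
    ee = [autocorr(e, l) if l < len(e) else 0.0 for l in range(im)]
    return hh[0] * ee[0] + 2 * sum(hh[l] * ee[l] for l in range(1, im))

h = [1.0, 0.5, 0.25]            # short impulse response (toy)
e = [1.0, -1.0, 2.0, 0.5]       # an excitation codevector (toy)
se = conv(h, e)
direct = sum(s * s for s in se)  # direct energy of the filtered vector
fast = energy_via_eq17(h, e)     # same value via equation (17)
```

The gain is that hh is computed once per subframe and ee_i can be precomputed (or computed cheaply) per codevector, replacing a full filtering per codebook entry.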
As well, the cross-correlation between weighted synthetic signal se_i of the excitation codevector and an arbitrary vector v can be calculated according to the following equation to reduce the computation:

    \langle v, se_i \rangle = \langle H^T v, e_i \rangle    (18)

where H is the impulse response matrix of the weighting synthesis filter.
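Equation (18) means the target vector can be backward-filtered once (computing H^T v), after which each codevector is scored by a plain inner product with its unfiltered form, instead of filtering every codevector through H. A toy check with a lower-triangular convolution matrix (illustrative sizes and signals):

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def H_matrix(h, n):
    """Lower-triangular impulse-response (convolution) matrix, n x n."""
    return [[h[r - c] if 0 <= r - c < len(h) else 0.0 for c in range(n)]
            for r in range(n)]

def matvec(M, x):
    return [dot(row, x) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

h = [1.0, 0.5, 0.25]               # impulse response (toy)
n = 4
H = H_matrix(h, n)
v = [1.0, 2.0, -1.0, 0.5]          # arbitrary target vector
e = [0.5, -1.0, 1.0, 2.0]          # an excitation codevector

lhs = dot(v, matvec(H, e))         # <v, se> with se = H e (filter e first)
w = matvec(transpose(H), v)        # backward-filtered target, computed once
rhs = dot(w, e)                    # <H^T v, e>, equation (18)
```

The backward filtering H^T v is performed once per subframe, so each of the many codevectors costs only a single inner product.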
Cross-correlation between weighted synthetic signal sad of the adaptive codevector and arbitrary vector v can be obtained according to the following equation in the same way:
    \langle v, sad \rangle = \langle H^T v, ad \rangle    (19)

Further, in the case of searching for the optimum combination in the optimum combination search circuit of the first and third embodiments, although the particular limitation (\gamma = \delta) is assigned to gains \gamma, \delta of the first and second excitation codevectors as described above, it is possible to provide limitations other than \gamma = \delta or to provide no limitation.
Further, it is also possible to apply a delayed decision system in each embodiment, in which the combination of candidates having the minimum cumulative error over the whole frame is selected, leaving the candidates undetermined instead of uniquely determining the adaptive codevector, the first and second excitation codevectors and the gain codevector for each subframe.
It is to be understood that variations and modifications of the method for speech coding and of the voice-coder disclosed herein will be evident to those skilled in the art. It is intended that all such modifications and variations be included within the scope of the appended claims.
Claims (16)
1. A method for speech coding for coding speech signals divided into frames spaced with a constant interval, wherein, an adaptive codebook storing excitation signals determined in the past and a plurality of excitation codebooks for multi-stage vector quantization of an excitation signal of the input speech signal are prepared;
a spectral parameter of said input speech signal is obtained;
said frame is divided into subframes; a candidate of a first fixed number of adaptive codevectors is selected for every said subframe from said adaptive codebook by using said input speech signal and said spectral parameter;
candidates of a second fixed number of excitation codevectors are selected for every said subframe from said excitation codebooks, respectively, by using said input speech signal, said spectral parameter and the candidate of said adaptive codevector; and an optimum combination of the adaptive codevector and each of the excitation codevectors forming an excitation signal of said subframe is selected from the candidates of said adaptive codevector and each of said excitation codevectors by using said input speech signal and said spectral parameter.
2. A method for speech coding according to Claim 1, wherein the selections of the candidates of the adaptive codevector and of each of the excitation codevectors are performed, respectively, in order from the candidate with the smallest error.
3. A method for speech coding according to Claim 1, wherein a gain codebook is used for quantizing the gains of said adaptive codebook and of each of said excitation codebooks, respectively; and a gain codevector is determined by using said gain codebook when an optimum combination of the adaptive codevector and the excitation codevectors is searched for from the candidates of said adaptive codevector and of said excitation codevectors.
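The gain quantization of Claim 3 amounts to choosing, from a gain codebook, the pair of gains that best scales the adaptive and excitation contributions. A minimal sketch under the same simplified signal model (no synthesis filtering; names hypothetical):

```python
import numpy as np

def quantize_gains(target, adaptive_vec, excitation_vec, gain_cb):
    """Pick the gain codevector (g_a, g_e) from gain_cb minimizing the
    error of g_a*adaptive + g_e*excitation against the target."""
    best_j, best_err = 0, float("inf")
    for j, (ga, ge) in enumerate(gain_cb):
        err = float(np.sum((target - ga * adaptive_vec
                            - ge * excitation_vec) ** 2))
        if err < best_err:
            best_j, best_err = j, err
    return best_j, best_err
```

Only the codebook index `best_j` needs to be transmitted; the decoder looks up the same gain pair.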
4. A method for speech coding according to Claim 1, wherein at least one excitation super codebook is included in said plurality of excitation codebooks, said super codebook having a number of bits larger than the number of bits to be transmitted; and selection of the candidate of the excitation codevector from said excitation super codebook is performed according to the candidate of the excitation codevector already selected.
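The super codebook idea in Claim 4 is that the codebook holds more vectors than the transmitted index can address, and the previously selected candidate determines which slice is searched. The partitioning scheme below (contiguous slices keyed by the previous index) is a hypothetical illustration, not the patent's mapping:

```python
import numpy as np

def select_from_super(residual, super_cb, prev_idx, tx_bits):
    """Search only a 2**tx_bits slice of the larger super codebook;
    which slice depends on the previously selected candidate, so only
    the local index within the slice needs to be transmitted."""
    size = 1 << tx_bits
    start = (prev_idx * size) % len(super_cb)  # slice keyed by prev candidate
    section = super_cb[start:start + size]
    errs = [float(np.sum((residual - cv) ** 2)) for cv in section]
    local = int(np.argmin(errs))               # transmitted index
    return local, start + local                # (transmitted, absolute)
```

The decoder, knowing the same previous candidate, recovers the absolute index from the transmitted local one.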
5. A method for speech coding according to Claim 1, wherein the step of selecting the combination of the adaptive codevector and each of the excitation codevectors forming the excitation signal of said subframe from the candidates of said adaptive codevector and of said excitation codevectors further comprises the steps of:
determining the optimum gain codevector from said gain codebook; and reflecting said gain codevector on said adaptive codevector and each of said excitation codevectors forming said excitation signal.
6. A method for speech coding according to Claim 5, wherein at least one excitation super codebook is included in said plurality of excitation codebooks, said super codebook having a number of bits larger than the number of bits to be transmitted; and selection of the candidate of the excitation codevector from said excitation super codebook is performed according to the candidate of the excitation codevector already selected.
7. A voice-coder for coding speech signals divided into frames spaced at a constant interval, comprising:
linear prediction analysis means for outputting spectral parameters of input speech signals;
an adaptive codebook for storing excitation signals determined in the past;
a plurality of excitation codebooks provided for multi-stage vector quantization of the excitation signal of said input speech signals;
wherein, in searching said adaptive codebook and each of said excitation codebooks, respectively, for a combination of the adaptive codevector and each of the excitation codevectors for every subframe obtained by further division of said frame, said combination of the adaptive codevector and each of the excitation codevectors forming an excitation signal of said subframe:
a candidate of a first predetermined number of adaptive codevectors is selected from said adaptive codebook by using said input speech signal and said spectral parameter;
candidates of a predetermined number of excitation codevectors for each codebook are selected from said plurality of excitation codebooks, respectively, by using said input speech signal, said spectral parameter and the candidate of said adaptive codevector; and an optimum candidate of said adaptive codevector and of each of said excitation codevectors forming the excitation signal of said subframe is selected from the candidates of said adaptive codevector and of each of said excitation codevectors by using said input speech signal and said spectral parameter.
8. A voice-coder according to Claim 7, further comprising:
a gain codebook for quantization of each gain of said adaptive codebook and each of said excitation codebooks;
wherein said input speech signal, said spectral parameter and said gain codebook are used for searching an optimum combination of the adaptive codevector and each of the excitation codevectors which forms the excitation signal of said subframe, from the candidates of said adaptive codevector and said excitation codevectors.
9. A voice-coder according to Claim 7, wherein at least one excitation super codebook is included in said plurality of excitation codebooks, said super codebook having a number of bits larger than the number of bits to be transmitted; and selection of the candidate of the excitation codevector from said excitation super codebook is performed according to the candidate of the excitation codevector already selected.
10. A voice-coder according to Claim 8, wherein at least one excitation super codebook is included in said plurality of excitation codebooks, said super codebook having a number of bits larger than the number of bits to be transmitted; and selection of the candidate of the excitation codevector from said excitation super codebook is performed according to the candidate of the excitation codevector already selected.
11. A voice-coder for coding speech signals divided into frames spaced at a constant interval, comprising:
linear prediction analysis means for outputting spectral parameters of input speech signals;
an adaptive codebook storing excitation signals determined in the past;
a plurality of excitation codebooks provided for multi-stage vector quantization of an excitation signal of said input speech signals;
subframe division means for generating subframe signals by dividing said frame into subframes;
first selection means for selecting a candidate of a first fixed number of adaptive codevectors from said adaptive codebook in accordance with said subframe signal and said spectral parameter;
second selection means, provided for each said excitation codebook, for selecting the number of candidate excitation codevectors predetermined for that excitation codebook from the corresponding excitation codebook in accordance with said subframe signal, said spectral parameter and the candidate of said adaptive codevector; and means for searching for an optimum candidate of said adaptive codevector and of said excitation codevectors which form the excitation signal of said subframe, from the candidate of said adaptive codevector and the candidates of each of said excitation codevectors, in accordance with said input speech signal and said spectral parameter.
12. A voice-coder according to Claim 11, wherein said first and second selection means select each corresponding candidate in order from the candidate with the smallest error; and
said search means selects the candidate of said codevector whose error is lowest.
13. A voice-coder according to Claim 11, further comprising:
a gain codebook for quantization of each gain of said adaptive codebook and each of said excitation codebooks;
wherein said search means selects the candidate of said codevector while said gain is quantized by using said gain codebook.
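The search means of Claims 12 and 13 can be pictured as one loop: every retained candidate pair is evaluated together with its best gain codevector, and the lowest-error combination wins. As before, this sketch uses a plain additive model (no synthesis filter) and hypothetical names:

```python
import numpy as np

def search_with_gain(target, a_cands, e_cands,
                     adaptive_cb, excitation_cb, gain_cb):
    """Evaluate every (adaptive, excitation) candidate pair with each
    gain codevector and keep the combination with the lowest error."""
    best = (None, None, float("inf"))
    for ai in a_cands:
        for ei in e_cands:
            for gi, (ga, ge) in enumerate(gain_cb):
                recon = ga * adaptive_cb[ai] + ge * excitation_cb[ei]
                err = float(np.sum((target - recon) ** 2))
                if err < best[2]:
                    best = ((ai, ei), gi, err)
    return best  # ((adaptive idx, excitation idx), gain idx, error)
```

Because the gain is quantized inside the candidate search rather than afterwards, the selected combination is optimal with respect to the quantized gains, not just the unquantized ones.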
14. A voice-coder according to Claim 12, further comprising:
a gain codebook for quantization of each gain of said adaptive codebook and each of said excitation codebooks;
wherein said search means further determines the optimum gain codevector by consulting said gain codebook, and reflects said gain codevector on the adaptive codevector and each of the excitation codevectors which form said excitation signal.
15. A voice-coder according to Claim 11, wherein at least one excitation super codebook is included in said plurality of excitation codebooks, said super codebook having a number of bits larger than the number of bits to be transmitted; and said second selection means corresponding to said excitation super codebook performs selection of the candidate of the excitation codevector from said excitation super codebook according to the candidate of the excitation codevector already selected.
16. A voice-coder according to Claim 13, wherein at least one excitation super codebook is included in said plurality of excitation codebooks, said super codebook having a number of bits larger than the number of bits to be transmitted; and
said second selection means corresponding to said excitation super codebook performs selection of the candidate of the excitation codevector from said excitation super codebook according to the candidate of the excitation codevector already selected.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP03319314A JP3089769B2 (en) | 1991-12-03 | 1991-12-03 | Audio coding device |
JP319314/1991 | 1991-12-03 |
Publications (2)
Publication Number | Publication Date |
---|---|
CA2084338A1 CA2084338A1 (en) | 1993-06-04 |
CA2084338C true CA2084338C (en) | 1997-03-04 |
Family
ID=18108816
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CA002084338A Expired - Fee Related CA2084338C (en) | 1991-12-03 | 1992-12-02 | Method for speech coding and voice-coder |
Country Status (4)
Country | Link |
---|---|
EP (1) | EP0545386B1 (en) |
JP (1) | JP3089769B2 (en) |
CA (1) | CA2084338C (en) |
DE (1) | DE69228858T2 (en) |
Families Citing this family (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5701392A (en) * | 1990-02-23 | 1997-12-23 | Universite De Sherbrooke | Depth-first algebraic-codebook search for fast coding of speech |
US5754976A (en) * | 1990-02-23 | 1998-05-19 | Universite De Sherbrooke | Algebraic codebook with signal-selected pulse amplitude/position combinations for fast coding of speech |
US5717824A (en) * | 1992-08-07 | 1998-02-10 | Pacific Communication Sciences, Inc. | Adaptive speech coder having code excited linear predictor with multiple codebook searches |
JP2800618B2 (en) * | 1993-02-09 | 1998-09-21 | 日本電気株式会社 | Voice parameter coding method |
DE4315315A1 (en) * | 1993-05-07 | 1994-11-10 | Ant Nachrichtentech | Method for vector quantization, especially of speech signals |
JP2591430B2 (en) * | 1993-06-30 | 1997-03-19 | 日本電気株式会社 | Vector quantizer |
JPH08263099A (en) * | 1995-03-23 | 1996-10-11 | Toshiba Corp | Encoder |
JP3616432B2 (en) * | 1995-07-27 | 2005-02-02 | 日本電気株式会社 | Speech encoding device |
WO1998004046A2 (en) * | 1996-07-17 | 1998-01-29 | Universite De Sherbrooke | Enhanced encoding of dtmf and other signalling tones |
CN1167048C (en) * | 1998-06-09 | 2004-09-15 | 松下电器产业株式会社 | Speech coding apparatus and speech decoding apparatus |
JP3541680B2 (en) | 1998-06-15 | 2004-07-14 | 日本電気株式会社 | Audio music signal encoding device and decoding device |
DE19845888A1 (en) | 1998-10-06 | 2000-05-11 | Bosch Gmbh Robert | Method for coding or decoding speech signal samples as well as encoders or decoders |
CA2252170A1 (en) | 1998-10-27 | 2000-04-27 | Bruno Bessette | A method and device for high quality coding of wideband speech and audio signals |
US8306007B2 (en) * | 2008-01-16 | 2012-11-06 | Panasonic Corporation | Vector quantizer, vector inverse quantizer, and methods therefor |
KR101290997B1 (en) | 2012-03-26 | 2013-07-30 | 세종대학교산학협력단 | A codebook-based speech enhancement method using adaptive codevector and apparatus thereof |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5208862A (en) * | 1990-02-22 | 1993-05-04 | Nec Corporation | Speech coder |
JP2626223B2 (en) * | 1990-09-26 | 1997-07-02 | 日本電気株式会社 | Audio coding device |
1991
- 1991-12-03 JP JP03319314A patent/JP3089769B2/en not_active Expired - Lifetime

1992
- 1992-12-02 DE DE69228858T patent/DE69228858T2/en not_active Expired - Fee Related
- 1992-12-02 EP EP92120573A patent/EP0545386B1/en not_active Expired - Lifetime
- 1992-12-02 CA CA002084338A patent/CA2084338C/en not_active Expired - Fee Related
Also Published As
Publication number | Publication date |
---|---|
JPH05216500A (en) | 1993-08-27 |
EP0545386B1 (en) | 1999-04-07 |
DE69228858T2 (en) | 1999-08-19 |
DE69228858D1 (en) | 1999-05-12 |
EP0545386A2 (en) | 1993-06-09 |
JP3089769B2 (en) | 2000-09-18 |
CA2084338A1 (en) | 1993-06-04 |
EP0545386A3 (en) | 1993-08-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US5208862A (en) | Speech coder | |
CA2084338C (en) | Method for speech coding and voice-coder | |
US8352255B2 (en) | Method for speech coding, method for speech decoding and their apparatuses | |
US5142584A (en) | Speech coding/decoding method having an excitation signal | |
US5737484A (en) | Multistage low bit-rate CELP speech coder with switching code books depending on degree of pitch periodicity | |
JP3114197B2 (en) | Voice parameter coding method | |
US5271089A (en) | Speech parameter encoding method capable of transmitting a spectrum parameter at a reduced number of bits | |
US5485581A (en) | Speech coding method and system | |
US5682407A (en) | Voice coder for coding voice signal with code-excited linear prediction coding | |
JP2800618B2 (en) | Voice parameter coding method | |
JPH056199A (en) | Voice parameter coding system | |
US6094630A (en) | Sequential searching speech coding device | |
US5677985A (en) | Speech decoder capable of reproducing well background noise | |
CA2090205C (en) | Speech coding system | |
JP2624130B2 (en) | Audio coding method | |
US5666464A (en) | Speech pitch coding system | |
JP2538450B2 (en) | Speech excitation signal encoding / decoding method | |
JP2613503B2 (en) | Speech excitation signal encoding / decoding method | |
EP0694907A2 (en) | Speech coder | |
JP2700974B2 (en) | Audio coding method | |
JP3099836B2 (en) | Excitation period encoding method for speech | |
JP3192051B2 (en) | Audio coding device | |
CA2118986C (en) | Speech coding system | |
JP3276357B2 (en) | CELP-type speech coding apparatus and CELP-type speech coding method | |
JP3276355B2 (en) | CELP-type speech decoding apparatus and CELP-type speech decoding method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
EEER | Examination request | ||
MKLA | Lapsed |