EP0545386A2

EP0545386A2 - Method for speech coding and voice-coder

Info

Publication number: EP0545386A2
Application number: EP92120573A
Authority: EP
Inventors: Toshiki Miyano
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1991-12-03
Filing date: 1992-12-02
Publication date: 1993-06-09
Anticipated expiration: 2012-12-02
Also published as: CA2084338C; JP3089769B2; EP0545386A3; EP0545386B1; DE69228858T2; JPH05216500A; CA2084338A1; DE69228858D1

Abstract

A method for speech coding and a voice-coder for coding speech signals divided into frames spaced with a constant interval are disclosed. An adaptive codebook storing excitation signals determined in the past and a plurality of excitation codebooks for multi-stage vector quantization are provided. Each frame is divided into subframes. For each subframe, candidates of a first predetermined number of adaptive codevectors are selected, and then candidates of a predetermined number of excitation codevectors are selected from each excitation codebook by using the candidates of the adaptive codevector. Finally, a combination of the adaptive codevector and each of the excitation codevectors is selected from the candidates of the adaptive codevector and of each of the excitation codevectors.

Description

The present invention relates to a method for speech coding and to a voice-coder, particularly to a method for speech coding and to a voice-coder which can achieve high coding quality with a relatively small operation at bit rates not greater than 8 kbit/s.
A known speech coding system that performs vector quantization of excitation signals at low bit rates by using an excitation codebook comprising random numbers is the CELP system described in a paper (hereinafter referred to as literature 1) titled "CODE-EXCITED LINEAR PREDICTION (CELP): HIGH-QUALITY SPEECH AT VERY LOW BIT RATES" (Proc. ICASSP, pp. 937-940, 1985) by Manfred R. Schroeder and Bishnu S. Atal. There is also a CELP system using an adaptive codebook described in a paper (hereinafter referred to as literature 2) titled "IMPROVED SPEECH QUALITY AND EFFICIENT VECTOR QUANTIZATION IN SELP" (Proc. ICASSP, pp. 155-158, 1988) by W. B. Kleijn, D. J. Krasinski and R. H. Ketchum. The CELP system using the adaptive codebook receives speech signals divided into frames spaced with a constant interval. It uses a linear predictive analyzer for obtaining spectral parameters of the input speech signals, the adaptive codebook holding excitation signals determined in the past, and the excitation codebook comprising random numbers, which is used for vector quantization of the excitation signals of said input speech signals. For every subframe made by equally dividing the frame, the CELP system selects an adaptive codevector by using the input speech signal and the synthesized signal of the adaptive codevector. Subsequently, it selects an excitation codevector by using the input speech signal, the synthesized signal of the selected adaptive codevector and the synthesized signal of said excitation codevector.
However, the above CELP systems have the following disadvantages. A very large amount of computation is required for searching the excitation codebook. Moreover, since the adaptive codebook is searched independently of the excitation codebook, it is impossible to obtain a high SN (signal-to-noise) ratio. Further, in the above CELP systems the adaptive codebook and the excitation codebook are each searched by using unquantized gains, although a higher SN ratio could be obtained if the adaptive codebook and the excitation codebook were searched over all the quantized values of the gains. Furthermore, it is impossible to obtain sufficiently good speech quality at low bit rates such as 8 kbit/s or less because the excitation codebook is too small.
An object of the present invention is to provide a method for speech coding which can solve the above problem of the conventional method and achieve high quality speech by a relatively small operation even at the low bit rates such as less than 8 kbit/s.
Another object of the present invention is to provide a voice-coder which can solve the above problem of the conventional method and achieve high quality speech by a relatively small operation even at low bit rates such as less than 8 kbit/s.
The object of the present invention can be achieved by a method for speech coding for coding speech signals divided into frames spaced with a constant interval, wherein an adaptive codebook storing excitation signals determined in the past and a plurality of excitation codebooks for multi-stage vector quantization of an excitation signal of the input speech signal are prepared; a spectral parameter of said input speech signal is obtained; said frame is divided into subframes; a candidate of a first fixed number of adaptive codevectors is selected for every said subframe from said adaptive codebook by using said input speech signal and said spectral parameter; candidates of a second fixed number of excitation codevectors are selected for every said subframe from said excitation codebooks, respectively, by using said input speech signal, said spectral parameter and the candidate of said adaptive codevector; and a combination of the adaptive vector and each of the excitation codevectors forming an excitation signal of said subframe is selected from the candidates of said adaptive codevector and each of said excitation codevector by using said input speech signal and said spectral parameter.
Another object of the present invention is achieved by a voice-coder for coding speech signals divided into frames spaced with a constant interval, comprising: linear prediction analysis means for outputting spectral parameters of input speech signals; an adaptive codebook for storing excitation signals determined in the past; a plurality of excitation codebooks provided for multi-stage vector quantization of the excitation signal of said input speech signals; wherein, in case of searching for a combination of the adaptive codevector and each of the excitation codevectors for every subframe prepared by further division of said frame, from said adaptive codebook and each of said excitation codebooks, respectively, said combination of the adaptive codevector and each of the excitation codevectors forming an excitation signal of said subframe; a candidate of a first predetermined number of adaptive codevectors is selected from said adaptive codebook by using said input speech signal and said spectral parameter; candidates of each predetermined number of excitation codevectors are selected from a plurality of said excitation codebooks respectively by using said input speech signal, said spectral parameter and the candidate of said adaptive codevector; and a candidate of the adaptive codevector and each of excitation codevectors forming the excitation signal of said subframe is selected from the candidate of said adaptive codevector and each of said excitation codevectors by using said input speech signal and said spectral parameter.
Another object of the present invention is also achieved by a voice-coder for coding speech signals divided into frames spaced with a constant interval, comprising: linear prediction analysis means for outputting spectral parameters of input speech signals; an adaptive codebook storing excitation signals determined in the past; a plurality of excitation codebooks provided for multi-stage vector quantization of an excitation signal of said input speech signals; subframe division means for generating subframe signals by dividing said frame into subframes; first selection means for selecting a candidate of a first fixed number of adaptive codevectors from said adaptive codebook in accordance with said subframe signal and said spectral parameter; second selection means provided for every said excitation codebook for selecting the candidate of the excitation codevectors of the number predetermined for every excitation codebook, from the corresponding excitation codebook in accordance with said subframe signal, said spectral parameter and the candidate of said adaptive codevector; and means for searching the candidate of the adaptive codevector and each of the excitation codevectors which forms the excitation signal of said subframe, from the candidate of said adaptive codevector and the candidate of each of said excitation codevectors in accordance with said input speech signal and said spectral parameter.
The above and other objects, features and advantages of the present invention will be apparent from the following description referring to the accompanying drawings which illustrate examples of preferred embodiments of the present invention.

BRIEF DESCRIPTION OF THE DRAWINGS:

Fig. 1 is a block diagram showing the structure of a voice-coder of a first embodiment of the present invention.
Fig. 2 is a block diagram showing the structure of a voice-coder of a second embodiment of the present invention.
Fig. 3 is a block diagram showing the structure of a voice-coder of a third embodiment of the present invention.
Fig. 4 is a block diagram showing the structure of a voice-coder of a fourth embodiment of the present invention.

DESCRIPTION OF THE PREFERRED EMBODIMENTS:

A first preferred embodiment of the present invention will be described with reference to Fig. 1. In the voice-coder shown in Fig. 1, there are provided adaptive codebook 175, gain codebook 220 and two kinds of excitation codebooks 180, 190.
Speech input circuit 100 is provided for receiving speech signals divided into frames spaced with a constant interval. Subframe division circuit 120 and linear prediction analysis circuit 110 are provided on the output side of speech input circuit 100. Subframe division circuit 120 outputs subframes by equally dividing the frame, and linear prediction analysis circuit 110 performs linear prediction analyses of speech signals composing frames for obtaining spectral parameters of the speech signals. On the output side of subframe division circuit 120, weighting filter 130 is provided for performing perceptual weighting by receiving subframes and spectral parameters. On the output side of weighting filter 130, influence signal subtracter 140 is provided for subtracting weighted influence signal from the preceding subframe and outputting the results thereof.
Adaptive codebook 175 stores excitation signals decided in the past as adaptive codevectors. Corresponding to adaptive codebook 175, adaptive codebook candidate selection circuit 150 is provided for selecting the previously fixed number of adaptive codevectors and for outputting thereof as candidates of adaptive codevectors. Adaptive codebook candidate selection circuit 150 performs selection of the candidate according to the spectral parameter and the output signal of influence signal subtracter 140.
First and second excitation codebooks 180, 190 operate for multi-stage vector quantization of the excitation signal, and store the first and second excitation codevectors, respectively. Corresponding to first and second excitation codebooks 180, 190, candidate selection circuits 160, 170 for the first and second excitation codebooks are provided respectively. Candidate selection circuits 160, 170 select the previously fixed number of excitation codevectors from corresponding respective excitation codebooks 180, 190 and output thereof as the candidates of the excitation codevectors. Spectral parameters, output signals of the influence signal subtracter and candidates of adaptive codevectors are inputted into each of candidate selection circuits 160, 170 for the excitation codebook.
Optimum combination search circuit 200 is provided for searching the optimum combination of the candidates selected by candidate selection circuits 150, 160, 170 for the corresponding codebooks. Further, gain codebook search circuit 210 and multiplexer 230 are provided. According to the results of the search, optimum combination search circuit 200 outputs to multiplexer 230 the delay of the optimum adaptive codevector and the indices of the optimum first and second excitation codevectors, and outputs the weighted synthetic signals of these vectors to gain codebook search circuit 210. Gain codebook search circuit 210 searches for the optimum gain codevector from gain codebook 220, which stores gain codevectors, and outputs the index of the optimum gain codevector thus found. Multiplexer 230 receives the delay and the indices from optimum combination search circuit 200 and gain codebook search circuit 210, and outputs the codes which correspond to the input speech signals according to the delay and the indices.
Next, description will be made with reference to selection or search algorithm of each candidate selection circuit 150, 160, 170 or optimum combination search circuit 200 of the present embodiment. Under these algorithms, the excitation signal is processed by two-stage vector quantization by using two kinds of excitation codebooks 180, 190.
First, in adaptive codebook candidate selection circuit 150, the predetermined number L₀ of adaptive codevectors are selected in ascending order of error E₀ expressed by equation (1):

$E_0 = \| z - \beta_0\, \mathit{sa}_d \|^2 \quad (1)$

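where z is a signal obtained by subtracting an influence signal from a perceptually weighted input signal, sa_d is a perceptually weighted synthetic signal of adaptive codevector a_d with delay d, β₀ is a sequential optimum gain of the adaptive codevector, and ∥ ∥ is the Euclidean norm. The sequential optimum gain β₀ of the adaptive codevector is given by:

$\beta_0 = \frac{\langle z, \mathit{sa}_d \rangle}{\langle \mathit{sa}_d, \mathit{sa}_d \rangle} \quad (2)$

By substituting above equation (2) into equation (1), the next equation is obtained.

$E_0 = \| z \|^2 - \frac{\langle z, \mathit{sa}_d \rangle^2}{\langle \mathit{sa}_d, \mathit{sa}_d \rangle} \quad (3)$

where 〈, 〉 represents an inner product.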
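The preselection of adaptive codevector candidates can be sketched as follows. This is a minimal illustration, not the circuit of the embodiment: it assumes the weighted synthetic signals have already been computed and are passed in as plain lists, and the names `dot` and `preselect` are hypothetical. Each candidate is scored with the single-gain error E₀ = ∥z∥² - 〈z, sa_d〉² / 〈sa_d, sa_d〉, and the L₀ smallest are kept.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def preselect(target, synth_signals, count):
    """Return the indices of the `count` signals with the smallest
    single-gain error ||z||^2 - <z, s>^2 / <s, s>, in ascending order of error."""
    energy = dot(target, target)
    scored = []
    for index, s in enumerate(synth_signals):
        ss = dot(s, s)
        if ss == 0.0:
            continue  # an all-zero signal cannot reduce the error; skip it
        zs = dot(target, s)
        scored.append((energy - zs * zs / ss, index))
    scored.sort()
    return [index for _, index in scored[:count]]


# Toy data (assumed values); for the adaptive codebook the list position
# plays the role of delay d.
z = [1.0, 2.0, 3.0]
weighted_signals = [
    [1.0, 2.0, 3.1],
    [3.0, -1.0, 0.0],
    [0.5, 1.0, 1.5],
    [1.0, 0.0, 0.0],
]
candidates = preselect(z, weighted_signals, 2)
```

With these toy values, the signal at position 2 is parallel to z and is ranked first, followed by the nearly parallel signal at position 0.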
In candidate selection circuit 160 for the first excitation codebook, for each of the L₀ adaptive codevectors selected by candidate selection circuit 150 for the adaptive codebook, the predetermined number L₁ of candidates of the first excitation codevectors are selected in ascending order of error E₁ expressed by equation (4):

$E_1 = \| \mathit{za} - \gamma_0\, \mathit{se}_i^1 \|^2 \quad (4)$

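where se_i¹ is a perceptually weighted synthesized signal of first excitation codevector e_i¹ with index i, γ₀ is a sequential optimum gain of the first excitation codevector, and za = z - β₀ sa_d.
Therefore:

$\gamma_0 = \frac{\langle \mathit{za}, \mathit{se}_i^1 \rangle}{\langle \mathit{se}_i^1, \mathit{se}_i^1 \rangle} \quad (5)$

By substituting above equation (5) into equation (4), equation (6) below is obtained:

$E_1 = \| \mathit{za} \|^2 - \frac{\langle \mathit{za}, \mathit{se}_i^1 \rangle^2}{\langle \mathit{se}_i^1, \mathit{se}_i^1 \rangle} \quad (6)$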

In the same way as described above, in candidate selection circuit 170 for the second excitation codebook, for each of the L₀ adaptive codevectors selected by candidate selection circuit 150 for the adaptive codebook, the predetermined number L₂ of candidates of the second excitation codevectors are selected in ascending order of error E₂ expressed by the next equation:

$E_2 = \| \mathit{za} - \delta_0\, \mathit{se}_j^2 \|^2 \quad (7)$

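where se_j² is a perceptually weighted synthesized signal of second excitation codevector e_j² with index j, and δ₀ is a sequential optimum gain of the second excitation codevector. Therefore:

$\delta_0 = \frac{\langle \mathit{za}, \mathit{se}_j^2 \rangle}{\langle \mathit{se}_j^2, \mathit{se}_j^2 \rangle} \quad (8)$

By substituting equation (8) into equation (7), the following equation (9) is obtained.

$E_2 = \| \mathit{za} \|^2 - \frac{\langle \mathit{za}, \mathit{se}_j^2 \rangle^2}{\langle \mathit{se}_j^2, \mathit{se}_j^2 \rangle} \quad (9)$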
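The per-candidate selection in circuits 160 and 170 can be sketched in the same spirit. The sketch below is only an illustration under assumptions: signals are plain lists, the adaptive candidates are given as a dictionary from delay d to weighted synthetic signal, and all function names are hypothetical. For each adaptive candidate it forms za = z - β₀ sa_d and then ranks each excitation codebook by ∥za∥² - 〈za, se〉² / 〈se, se〉.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def residual_after_adaptive(z, sa):
    """za = z - beta0 * sa, with beta0 = <z, sa> / <sa, sa>."""
    beta0 = dot(z, sa) / dot(sa, sa)
    return [zi - beta0 * si for zi, si in zip(z, sa)]


def best_indices(target, synth, count):
    """Indices of the `count` signals minimizing ||t||^2 - <t, s>^2 / <s, s>.
    Assumes every signal in `synth` has non-zero energy."""
    tt = dot(target, target)
    errors = [(tt - dot(target, s) ** 2 / dot(s, s), k) for k, s in enumerate(synth)]
    return [k for _, k in sorted(errors)[:count]]


def excitation_candidates(z, adaptive_candidates, synth1, synth2, l1, l2):
    """For each adaptive candidate (delay -> weighted signal), preselect
    l1 first-stage and l2 second-stage excitation indices."""
    result = {}
    for delay, sa in adaptive_candidates.items():
        za = residual_after_adaptive(z, sa)
        result[delay] = (best_indices(za, synth1, l1), best_indices(za, synth2, l2))
    return result


# Toy data (assumed values).
z = [1.0, 2.0, 0.0, 1.0]
adaptive = {40: [1.0, 0.0, 0.0, 0.0]}
first = [[0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 2.0, 0.0, 1.0]]
second = [[0.0, 0.0, 0.0, 1.0], [1.0, 0.0, 0.0, 0.0]]
selected = excitation_candidates(z, adaptive, first, second, 2, 1)
```

Note that both excitation codebooks are ranked against the same residual za, which matches equations (4) and (7): the second stage is not conditioned on the first stage during preselection.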

In optimum combination search circuit 200, error E is calculated by the following equation for all the combinations of candidates of the selected adaptive codevectors, and the first and second excitation codevectors, and then the combination of the candidates with minimum E is searched.

$E = \| z - \beta\, \mathit{sa}_d - \gamma\, \mathit{se}_i^1 - \delta\, \mathit{se}_j^2 \|^2 \quad (10)$

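where β, γ, δ are simultaneous optimum gains of the adaptive codevector and the first and second excitation codevectors, respectively. Therefore:

$\begin{pmatrix} \beta \\ \gamma \\ \delta \end{pmatrix} = R^{-1} \begin{pmatrix} \langle z, \mathit{sa}_d \rangle \\ \langle z, \mathit{se}_i^1 \rangle \\ \langle z, \mathit{se}_j^2 \rangle \end{pmatrix} \quad (11)$

where R satisfies the following equation:

$R = \begin{pmatrix} \langle \mathit{sa}_d, \mathit{sa}_d \rangle & \langle \mathit{sa}_d, \mathit{se}_i^1 \rangle & \langle \mathit{sa}_d, \mathit{se}_j^2 \rangle \\ \langle \mathit{se}_i^1, \mathit{sa}_d \rangle & \langle \mathit{se}_i^1, \mathit{se}_i^1 \rangle & \langle \mathit{se}_i^1, \mathit{se}_j^2 \rangle \\ \langle \mathit{se}_j^2, \mathit{sa}_d \rangle & \langle \mathit{se}_j^2, \mathit{se}_i^1 \rangle & \langle \mathit{se}_j^2, \mathit{se}_j^2 \rangle \end{pmatrix} \quad (12)$

By substituting equation (11) into equation (10), the following is obtained:

$E = \| z \|^2 - \begin{pmatrix} \langle z, \mathit{sa}_d \rangle & \langle z, \mathit{se}_i^1 \rangle & \langle z, \mathit{se}_j^2 \rangle \end{pmatrix} R^{-1} \begin{pmatrix} \langle z, \mathit{sa}_d \rangle \\ \langle z, \mathit{se}_i^1 \rangle \\ \langle z, \mathit{se}_j^2 \rangle \end{pmatrix} \quad (13)$

When the above error E is calculated, it is acceptable to assign a particular limitation to simultaneous optimum gains γ, δ of the excitation codevectors. For example, with the limitation that γ and δ are equal, error E is given by:

$E = \| z \|^2 - \begin{pmatrix} \langle z, \mathit{sa}_d \rangle & \langle z, \mathit{se}_i^1 + \mathit{se}_j^2 \rangle \end{pmatrix} R^{-1} \begin{pmatrix} \langle z, \mathit{sa}_d \rangle \\ \langle z, \mathit{se}_i^1 + \mathit{se}_j^2 \rangle \end{pmatrix} \quad (14)$

where,

$R = \begin{pmatrix} \langle \mathit{sa}_d, \mathit{sa}_d \rangle & \langle \mathit{sa}_d, \mathit{se}_i^1 + \mathit{se}_j^2 \rangle \\ \langle \mathit{se}_i^1 + \mathit{se}_j^2, \mathit{sa}_d \rangle & \langle \mathit{se}_i^1 + \mathit{se}_j^2, \mathit{se}_i^1 + \mathit{se}_j^2 \rangle \end{pmatrix} \quad (15)$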
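The joint search over candidate combinations can be illustrated with a short sketch. It is an assumption-laden example rather than the circuit itself: it treats the simultaneous optimum gains as the least-squares solution of the normal equations built from the inner products of the three weighted synthetic signals, solves that small system with a hand-written Gauss-Jordan routine, and assumes the three signals are linearly independent. The names `solve`, `joint_error` and `search_combination` are hypothetical. Passing two signals (sa_d and se_i¹ + se_j²) to `joint_error` gives the γ = δ variant.

```python
from itertools import product


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with partial
    pivoting. Assumes the matrix is non-singular."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def joint_error(z, signals):
    """Return (E, gains), where gains minimize ||z - sum_k g_k * s_k||^2."""
    gram = [[dot(a, b) for b in signals] for a in signals]
    corr = [dot(z, s) for s in signals]
    gains = solve(gram, corr)
    error = dot(z, z) - sum(g * c for g, c in zip(gains, corr))
    return error, gains


def search_combination(z, adaptive, first, second):
    """Evaluate E for every combination of candidates; each candidate list
    holds (label, weighted_signal) pairs. Returns (E, d, i, j, gains)."""
    best = None
    for (d, sa), (i, s1), (j, s2) in product(adaptive, first, second):
        error, gains = joint_error(z, [sa, s1, s2])
        if best is None or error < best[0]:
            best = (error, d, i, j, gains)
    return best


# Toy data (assumed values).
z = [1.0, 2.0, 3.0, 0.0]
adaptive = [(40, [1.0, 0.0, 0.0, 0.0])]
first = [(0, [0.0, 1.0, 0.0, 0.0]), (1, [0.0, 0.0, 0.0, 1.0])]
second = [(0, [0.0, 0.0, 1.0, 0.0]), (1, [0.0, 1.0, 0.0, 1.0])]
best = search_combination(z, adaptive, first, second)
```

Because only the preselected candidates are combined, the loop runs L₀ × L₁ × L₂ times per subframe rather than over the full codebooks.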

Next, description will be made with reference to operation of the voice-coder of the present embodiment.
Speech input circuit 100 receives speech signals divided into each frame (e.g., 40 ms in width), which signals are outputted to linear prediction analysis circuit 110 and subframe division circuit 120. In linear prediction analysis circuit 110, linear prediction analysis of the inputted speech signal is performed for calculating the spectral parameter. This spectral parameter is outputted to weighting filter 130, to influence signal subtracter 140, to candidate selection circuit 150 for the adaptive codebook, to candidate selection circuit 160 for the first excitation codebook, to candidate selection circuit 170 for the second excitation codebook, and to multiplexer 230. Separately, a frame is divided into subframes (e.g., 8 ms in width) by subframe division circuit 120. Speech signals divided into subframes are inputted into weighting filter 130. Weighting filter 130 performs perceptual weighting of inputted speech signals, and outputs the results to influence signal subtracter 140. Influence signal subtracter 140 subtracts the weighted influence signal from the preceding subframe, and outputs the result to candidate selection circuit 150 for the adaptive codebook, to candidate selection circuit 160 for the first excitation codebook, to candidate selection circuit 170 for the second excitation codebook, and to gain codebook search circuit 210.
Subsequently, candidate selection circuit 150 for the adaptive codebook selects the candidate of L₀ pieces of adaptive codevectors from adaptive codebook 175 according to equation (3). Candidate selection circuit 150 for the adaptive codebook outputs the weighted synthetic signal of the candidate of the selected adaptive codevectors and delay d which constitutes the index of the candidate of adaptive codevectors, to candidate selection circuits 160, 170 for the first and second excitation codebooks and to optimum combination search circuit 200.
Candidate selection circuit 160 for the first excitation codebook selects the candidate of L₁ pieces of the first excitation codevector from first excitation codebook 180, according to the output of the influence signal subtracter, the spectral parameter and the candidate of the adaptive codevector by using equation (6). Candidate selection circuit 160 for the first excitation codebook outputs the weighted synthetic signal and index of the candidate of the selected first excitation codevector to optimum combination search circuit 200. In the same manner, candidate selection circuit 170 for the second excitation codebook selects the candidate of the second excitation codevector from the second excitation codebook according to equation (9), and outputs the weighted synthetic signal and index of the selected second excitation codevector to optimum combination search circuit 200.
Optimum combination search circuit 200 searches for the combination of the optimum candidates according to equation (14), and outputs the delay of the adaptive codevector and the indices of the first and second excitation codevectors to multiplexer 230, and weighted synthetic signals of each codevector to gain codebook search circuit 210. Gain codebook search circuit 210 searches for the optimum gain codevector from gain codebook 220 according to each of the inputted weighted synthetic signals, and outputs the index of thus obtained gain codevector to multiplexer 230.
Finally, multiplexer 230 assembles and outputs the code for the speech signal divided into subframes according to the delay and index outputted from optimum combination search circuit 200 and to the index outputted from gain codebook search circuit 210. By carrying out the above process, speech coding of every subframe is completed.
According to the present embodiment, the candidates are selected first from the adaptive codebook and each of excitation codebooks, and then the optimum combination is selected from the combination of each of thus selected candidates, so that a sufficiently good speech quality can be obtained with a relatively small operation. In addition, since the gain codebook which stores the quantized gain vectors is used for selecting the optimum combination from combinations of the candidates, SN ratio is further improved.
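The saving in computation can be made concrete by counting error evaluations. The sketch below is illustrative only; the codebook sizes and candidate counts in the example are assumed values, not figures from the embodiment, and the function names are hypothetical. A full joint search evaluates every combination, whereas the two-step procedure evaluates single-gain errors for preselection and joint errors only for the surviving candidates.

```python
def full_search_count(n_adaptive, n_first, n_second):
    """Joint-error evaluations when every combination is tried."""
    return n_adaptive * n_first * n_second


def preselected_search_count(n_adaptive, n_first, n_second, l0, l1, l2):
    """Return (single_gain_evaluations, joint_evaluations) for the
    candidate-based search: the adaptive codebook is scanned once, each
    excitation codebook is scanned once per adaptive candidate, and only
    l0 * l1 * l2 combinations reach the joint search."""
    single_gain = n_adaptive + l0 * (n_first + n_second)
    joint = l0 * l1 * l2
    return single_gain, joint


# Assumed sizes: 128 delays, two 128-entry excitation codebooks,
# and L0 = 2, L1 = 4, L2 = 4 candidates.
full = full_search_count(128, 128, 128)
single, joint = preselected_search_count(128, 128, 128, 2, 4, 4)
```

Under these assumed sizes the full search needs 2,097,152 joint evaluations, while the candidate-based search needs 640 single-gain evaluations and 32 joint evaluations. The two kinds of evaluation differ in cost, so the counts indicate the order of magnitude rather than an exact ratio.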
The second embodiment of the present invention will be described with reference to Fig. 2. In the voice-coder shown in Fig. 2, each block attached with the same reference numeral as that in Fig. 1 has the same function as that in Fig. 1.
The voice-coder in Fig. 2, when compared with the voice-coder in Fig. 1, differs in that it has no gain codebook search circuit and optimum combination search circuit, but has instead gain-including optimum combination search circuit 300. Gain-including optimum combination search circuit 300 receives candidates of the adaptive codevectors, candidates of the first and second excitation codevectors, and outputs of influence signal subtracter 140, and selects the optimum combination from all of the combinations of the candidates and gain codevectors by searching for gain codebook 220. Gain-including optimum combination search circuit 300 is structured so as to output the delay or index of each codevector composing the selected combination to multiplexer 230 according to the selected combination.
The search algorithm which controls gain-including optimum combination search circuit 300 will next be described.
Gain-including optimum combination search circuit 300 searches for the combination of candidates which has the minimum value of error E by calculating E for all of the combinations of candidates of the selected adaptive codevectors, the selected first and second excitation codevectors, and all of the gain codevectors, where E is calculated by the following equation:

$E = \| z - Q\beta_k\, \mathit{sa}_d - Q\gamma_k\, \mathit{se}_i^1 - Q\delta_k\, \mathit{se}_j^2 \|^2 \quad (16)$

where Qβ_k, Qγ_k, Qδ_k are the components of the gain codevector with index k.
It is acceptable to use, in place of above Qβ_k, Qγ_k, Qδ_k, not the gain codevector itself, but gain codevectors converted by the matrix to be calculated from the quantized power of the weighted input signal, the weighted synthetic signal of the adaptive codevector and the weighted synthetic signals of the first and second excitation codevectors. Since it requires large operation to search for the minimum value of E by calculating it against all the gain codevectors, it is also possible to perform a preliminary selection of the gain codebook to reduce the operation. The preliminary selection of the gain codebook is performed, for example, by selecting the predetermined fixed number of gain codevectors whose first components are close to the sequential optimum gain of the adaptive codevector.
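The gain codebook search with preliminary selection can be sketched as follows for a single combination of codevector candidates. This is a simplified illustration under assumptions: gain codevectors are triples used directly (no conversion by quantized power), the preliminary selection keeps the entries whose first component is closest to the sequential optimum gain β₀, and the function names are hypothetical. In the second embodiment the same error would be evaluated over all combinations of candidates as well.

```python
def quantized_error(z, signals, gains):
    """E = ||z - sum_k gains[k] * signals[k]||^2 for one gain codevector."""
    residual = list(z)
    for g, s in zip(gains, signals):
        residual = [r - g * x for r, x in zip(residual, s)]
    return sum(r * r for r in residual)


def preselect_gains(gain_codebook, beta0, count):
    """Indices of the `count` gain codevectors whose first component is
    closest to the sequential optimum gain beta0."""
    order = sorted(range(len(gain_codebook)),
                   key=lambda k: abs(gain_codebook[k][0] - beta0))
    return order[:count]


def search_gain(z, signals, gain_codebook, indices):
    """Index, among `indices`, of the gain codevector with minimum error."""
    return min(indices, key=lambda k: quantized_error(z, signals, gain_codebook[k]))


# Toy data (assumed values).
z = [1.0, 2.0, 3.0]
signals = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
gain_codebook = [
    (0.0, 0.0, 0.0),
    (1.0, 2.0, 2.5),
    (0.9, 2.0, 3.0),
    (2.0, 2.0, 3.0),
]
beta0 = 1.0  # <z, sa_d> / <sa_d, sa_d> for the toy adaptive signal
kept = preselect_gains(gain_codebook, beta0, 2)
best_k = search_gain(z, signals, gain_codebook, kept)
```

In this toy case the entry closest to β₀ in its first component (index 1) is not the one with the smallest overall error (index 2), which is why more than one gain codevector is kept by the preliminary selection.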
The operation of this voice-coder will be described. It is the same as that of the voice-coder shown in Fig. 1 up to the point where the candidates of codevectors are outputted from each of candidate selection circuits 150, 160 and 170. These candidates of codevectors are inputted into gain-including optimum combination search circuit 300, where the optimum combination of candidates is searched according to equation (16). Then, according to the combination thus found, the delay of the adaptive codevector and the indices of the first and second excitation codevectors and of the gain codevector are inputted into multiplexer 230, from which speech signal codes are outputted.
Next, the third embodiment of the present invention will be described with reference to Fig. 3. In the voice-coder shown in Fig. 3, each block attached with the same reference numeral as that in Fig. 1 has the same function as that in Fig. 1.
This voice-coder differs from the one shown in Fig. 1 in that the second excitation codebook is composed of excitation super codebook 390. A super codebook means a codebook which stores more codevectors than can be addressed with the number of bits to be transmitted. Index i of the candidate of the first excitation codevector is outputted from candidate selection circuit 160 for the first excitation codebook to second excitation super codebook 390. The candidates of the second excitation codevectors are selected from second excitation super codebook 390 by searching only a portion of second excitation super codebook 390, the portion being expressed by set F₂(i) of indices determined according to index i of the first excitation codevector.
When the search for the candidates of the first and second excitation codevectors is finished, the optimum combination of candidates is searched in optimum combination search circuit 200 according to equation (14), as in the first embodiment. In the present embodiment, it is possible to modify candidate selection circuit 170 for the second excitation codebook so as to output all the second excitation codevectors which correspond to set of indices F₂(i) without performing selection of candidates of the second excitation codevectors. In this case, optimum combination search circuit 200 can search for the optimum combination among the combinations of the candidates of the adaptive codevectors, the candidates of the first excitation codevectors, and all of the second excitation codevectors corresponding to set F₂(i).
As described above, by applying the super codebook in the third embodiment of the present invention, it becomes possible to obtain speech quality substantially as good as that obtained with an excitation codebook of an increased codebook size, without increasing the bit rate.
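The idea of addressing a super codebook through set F₂(i) can be illustrated with a small sketch. The text does not specify how F₂(i) is constructed, so the mapping below (a window of consecutive entries whose start depends on i) is purely a hypothetical choice, as are the function names and sizes. What the sketch shows is only the bookkeeping: the encoder transmits the position inside F₂(i), and a decoder that already knows i can recover the super-codebook entry.

```python
def subset_indices(first_index, subset_size, super_size):
    """Hypothetical F2(i): a window of `subset_size` consecutive super-codebook
    entries whose starting point depends on first-stage index i."""
    start = (first_index * subset_size) % super_size
    return [(start + k) % super_size for k in range(subset_size)]


def encode_second_stage(first_index, chosen_super_index, subset_size, super_size):
    """Transmit only the position inside F2(i), not the super-codebook index."""
    return subset_indices(first_index, subset_size, super_size).index(chosen_super_index)


def decode_second_stage(first_index, position, subset_size, super_size):
    """Recover the super-codebook index from i and the transmitted position."""
    return subset_indices(first_index, subset_size, super_size)[position]


# Assumed sizes: a 16-entry super codebook searched through 4-entry subsets,
# so 2 bits are transmitted although the codebook itself needs 4 bits.
position = encode_second_stage(1, 6, 4, 16)
recovered = decode_second_stage(1, position, 4, 16)
```

With this assumed mapping, first-stage index 1 selects entries 4 to 7, so super-codebook entry 6 is transmitted as position 2.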
The fourth embodiment of the present invention will next be described with reference to Fig. 4. In the voice-coder shown in Fig. 4, each block attached with the same reference numeral as that in Fig. 2 has the same function as that in Fig. 2.
This voice-coder uses second excitation super codebook 390 instead of the second excitation codebook, differently from the voice-coder in Fig. 2. Super codebook 390 is similar to the super codebook in the voice-coder shown in Fig. 3. The candidate of the second excitation codevector to be selected from second excitation super codebook 390 is also selected in the same way as in the third embodiment, and other operations are conducted in the same manner as in the second embodiment. In this case, it is also possible to modify candidate selection circuit 170 for the second excitation codevectors so as to output all of the second excitation codevectors which correspond to set of indices F₂(i) without selecting the candidate of the second excitation codevectors.
Although each embodiment of the present invention has been described above, the operation of each embodiment can be modified in such a way that auto-correlation 〈se_i, se_i〉 of weighted synthetic signal se_i of the excitation codevector is obtained according to the following equation for the purpose of reducing the amount of computation:

$\langle \mathit{se}_i, \mathit{se}_i \rangle \approx hh(0)\, ee_i(0) + 2 \sum_{l=1}^{im-1} hh(l)\, ee_i(l) \quad (17)$

where hh is an auto-correlation function of the impulse response of a weighting synthesis filter, ee_i is an auto-correlation function of the excitation codevector with index i, and im is the length of the impulse response.
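The autocorrelation shortcut can be checked numerically with a small sketch. It assumes the usual form of this technique, hh(0)·ee_i(0) + 2 Σ hh(l)·ee_i(l) over positive lags, and uses hypothetical function names. The identity is exact when the filtered signal is the full (untruncated) convolution of the impulse response with the codevector; when the synthetic signal is truncated to the subframe length, the same expression is only an approximation.

```python
def autocorr(x, lag):
    """Autocorrelation of sequence x at a non-negative lag."""
    return sum(x[n] * x[n + lag] for n in range(len(x) - lag))


def energy_by_autocorrelation(h, e):
    """hh(0)*ee(0) + 2 * sum over positive lags of hh(l)*ee(l)."""
    max_lag = min(len(h), len(e)) - 1
    total = autocorr(h, 0) * autocorr(e, 0)
    total += 2.0 * sum(autocorr(h, lag) * autocorr(e, lag)
                       for lag in range(1, max_lag + 1))
    return total


def full_convolution(h, e):
    """Untruncated convolution of impulse response h with codevector e."""
    out = [0.0] * (len(h) + len(e) - 1)
    for n, hn in enumerate(h):
        for m, em in enumerate(e):
            out[n + m] += hn * em
    return out


# Toy data (assumed values).
h = [1.0, 0.5, 0.25]
e = [1.0, -1.0, 0.0, 2.0]
fast = energy_by_autocorrelation(h, e)
direct = sum(x * x for x in full_convolution(h, e))
```

The benefit is that hh is computed once per subframe and ee_i can be stored with the codebook, so no filtering of each codevector is needed to obtain its energy term.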
As well, cross-correlation between weighted synthetic signal se_i of the excitation codevector and arbitrary vector v can be calculated according to the following equation to reduce the operation:

$\langle v, \mathit{se}_i \rangle = \langle H^{T} v, e_i \rangle \quad (18)$

where H is an impulse response matrix of the weighting synthesis filter.
Cross-correlation between weighted synthetic signal sa_d of the adaptive codevector and arbitrary vector v can be obtained according to the following equation in the same way:

$\langle v, \mathit{sa}_d \rangle = \langle H^{T} v, a_d \rangle \quad (19)$
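The cross-correlation identity 〈v, H e〉 = 〈Hᵀ v, e〉 can be verified with a short sketch. It assumes H is the lower-triangular matrix whose entry in row n and column m is the impulse response sample h[n - m], so that H e is a convolution truncated to the subframe length; the function names are hypothetical. Computing Hᵀ v once per target vector replaces one filtering operation per codevector with a single inner product per codevector.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def filter_forward(h, x):
    """y = H x: truncated convolution, y[n] = sum_k h[k] * x[n - k]."""
    return [sum(h[k] * x[n - k] for k in range(min(n + 1, len(h))))
            for n in range(len(x))]


def filter_backward(h, v):
    """w = H^T v: w[m] = sum_k h[k] * v[m + k]."""
    size = len(v)
    return [sum(h[k] * v[m + k] for k in range(min(size - m, len(h))))
            for m in range(size)]


# Toy data (assumed values).
h = [1.0, 0.5]
v = [1.0, 2.0, 3.0]
codebook = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, -1.0, 1.0]]

w = filter_backward(h, v)                       # computed once
fast = [dot(w, e) for e in codebook]            # one inner product each
slow = [dot(v, filter_forward(h, e)) for e in codebook]
```

Both lists hold the same correlations; only the amount of filtering differs.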

Further, in the case of searching for the optimum combination in the optimum combination search circuit of the first and third embodiments, although a particular limitation (γ = δ ) is now assigned to gains γ, δ of the first and second excitation codevectors as described above, it is possible to provide limitations other than γ = δ or to provide no limitation.
Further, it is also possible to apply a delayed decision system in each embodiment: instead of uniquely determining the adaptive codevector, the first and second excitation codevectors and the gain codevector for each subframe, the candidates are left undetermined, and the combination of candidates having the minimum cumulative error over the whole frame is selected.
It is to be understood that variations and modifications of the method for speech coding and of the voice-coder disclosed herein will be evident to those skilled in the art. It is intended that all such modifications and variations be included within the scope of the appended claims.

Claims

A method for speech coding for coding speech signals divided into frames spaced with a constant interval, wherein,
an adaptive codebook storing excitation signals determined in the past and a plurality of excitation codebooks for multi-stage vector quantization of an excitation signal of the input speech signal are prepared;
a spectral parameter of said input speech signal is obtained;
said frame is divided into subframes;
a candidate of a first fixed number of adaptive codevectors is selected for every said subframe from said adaptive codebook by using said input speech signal and said spectral parameter;
candidates of a second fixed number of excitation codevectors are selected for every said subframe from said excitation codebooks, respectively, by using said input speech signal, said spectral parameter and the candidate of said adaptive codevector; and
a combination of the adaptive codevector and each of the excitation codevectors forming an excitation signal of said subframe is selected from the candidates of said adaptive codevector and each of said excitation codevectors by using said input speech signal and said spectral parameter.
A method for speech coding according to Claim 1, wherein selection of the candidates of the adaptive codevector and each of the excitation codevectors are performed, respectively, in order, from the selection of the candidate with a smaller error.
A method for speech coding according to Claim 1 or 2, wherein,
a gain codebook is used for performing quantization of gains of said adaptive codebook and each of said excitation codebooks, respectively; and
a gain codevector is determined by using said gain codebook when selection of a combination of the adaptive codevector and each of the excitation codevectors forming the excitation signal of said subframe from the candidates of said adaptive codevector and said excitation codevector is performed.
A method for speech coding according to any of Claims 1 to 3, wherein,
at least one or more of excitation super codebooks is included in said plurality of excitation codebooks, said super codebook comprising bits with the number of bits larger than the number of bits to be transmitted; and
selection of the candidate of the excitation codevector from said excitation super codebook is performed corresponding to the candidate of the excitation codevector already selected.
A method for speech coding according to Claim 1, wherein the step of selecting the combination of the adaptive codevector and each of the excitation codevectors forming the excitation signal of said subframe from the candidates of said adaptive codevector and said excitation codevector, further comprising the steps of:
determining the optimum gain codevector from said gain codebook; and
reflecting said gain codevector on said adaptive codevector and each of said excitation codevectors forming said excitation signal.
A method for speech coding according to Claim 5, wherein,
at least one or more of excitation super codebooks is included in said plurality of excitation codebooks, said super codebook comprising bits with the number of bits larger than the number of bits to be transmitted; and
selection of the candidate of the excitation codevector from said excitation super codebook is performed corresponding to the candidate of the excitation codevector already selected.
A voice-coder for coding speech signals divided into frames spaced with a constant interval, comprising:
linear prediction analysis means for outputting spectral parameters of input speech signals;
an adaptive codebook for storing excitation signals determined in the past;
a plurality of excitation codebooks provided for multi-stage vector quantization of the excitation signal of said input speech signals;
wherein, when said adaptive codebook and each of said excitation codebooks are searched, respectively, for a combination of the adaptive codevector and each of the excitation codevectors forming an excitation signal of each subframe prepared by further division of said frame:
a candidate of a first predetermined number of adaptive codevectors is selected from said adaptive codebook by using said input speech signal and said spectral parameter;
candidates of each predetermined number of excitation codevectors are selected from a plurality of said excitation codebooks, respectively, by using said input speech signal, said spectral parameter and the candidate of said adaptive codevector; and
a combination of said adaptive codevector and each of said excitation codevectors forming the excitation signal of said subframe is selected from the candidates of said adaptive codevector and of each of said excitation codevectors by using said input speech signal and said spectral parameter.
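The three selection stages recited in this claim can be sketched as a small tree search in Python. This is an illustration under stated assumptions, not the claimed apparatus: codebook entries are treated as already-synthesized signals (the linear prediction synthesis filter and perceptual weighting are omitted), each stage uses its individually optimal gain, and every function name is hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gain_and_residual(target, signal):
    """Optimal scalar gain for one signal and the residual it leaves."""
    energy = dot(signal, signal)
    gain = dot(target, signal) / energy if energy > 0.0 else 0.0
    return gain, [t - gain * s for t, s in zip(target, signal)]

def preselect(target, codebook, n):
    """Keep the n codebook entries with the smallest residual energy."""
    scored = []
    for idx, signal in enumerate(codebook):
        _, residual = gain_and_residual(target, signal)
        scored.append((dot(residual, residual), idx, residual))
    scored.sort(key=lambda item: item[0])
    return scored[:n]

def search_subframe(target, adaptive_cb, excitation_cbs, n_adaptive, n_excitation):
    """Preselect adaptive candidates, then excitation candidates for each
    surviving path, and finally pick the lowest-error combination."""
    paths = [(err, (idx,), res)
             for err, idx, res in preselect(target, adaptive_cb, n_adaptive)]
    for codebook, n in zip(excitation_cbs, n_excitation):
        paths = [(err, indices + (idx,), res)
                 for _, indices, prev_res in paths
                 for err, idx, res in preselect(prev_res, codebook, n)]
    error, indices, _ = min(paths, key=lambda p: p[0])
    return error, indices

# Illustrative data: the adaptive entry that looks worse on its own
# (index 0) gives the best final combination.
target = [3.0, 4.0]
adaptive_cb = [[1.0, 0.0], [0.0, 1.0]]
excitation_cbs = [[[0.0, 1.0], [1.0, 1.0]]]
delayed = search_subframe(target, adaptive_cb, excitation_cbs, 2, [1])
greedy = search_subframe(target, adaptive_cb, excitation_cbs, 1, [1])
```

With two adaptive candidates retained, the search reaches zero error through adaptive index 0; with only one retained, it is locked into adaptive index 1 and ends with a larger error, which is the motivation for carrying several candidates forward.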
A voice-coder according to Claim 7, further comprising:
a gain codebook for quantization of each gain of said adaptive codebook and each of said excitation codebooks; wherein
said input speech signal, said spectral parameter and said gain codebook are used for searching for a combination of the adaptive codevector and each of the excitation codevectors which forms the excitation signal of said subframe, from the candidates of said adaptive codevector and said excitation codevectors.
A voice-coder according to Claim 7 or 8, wherein,
at least one excitation super codebook is included in said plurality of excitation codebooks, said super codebook having a number of bits larger than the number of bits to be transmitted; and
selection of the candidate of the excitation codevector from said excitation super codebook is performed in accordance with the candidate of the excitation codevector already selected.
A voice-coder for coding speech signals divided into frames spaced with a constant interval, comprising:
linear prediction analysis means for outputting spectral parameters of input speech signals;
an adaptive codebook storing excitation signals determined in the past;
a plurality of excitation codebooks provided for multi-stage vector quantization of an excitation signal of said input speech signals;
subframe division means for generating subframe signals by dividing said frame into subframes;
first selection means for selecting a candidate of a first fixed number of adaptive codevectors from said adaptive codebook in accordance with said subframe signal and said spectral parameter;
second selection means, provided for each of said excitation codebooks, for selecting from the corresponding excitation codebook candidates of the excitation codevectors in a number predetermined for each excitation codebook, in accordance with said subframe signal, said spectral parameter and the candidate of said adaptive codevector; and
means for searching for the combination of said adaptive codevector and each of said excitation codevectors which forms the excitation signal of said subframe, from the candidate of said adaptive codevector and the candidate of each of said excitation codevectors in accordance with said input speech signal and said spectral parameter.
A voice-coder according to Claim 10, wherein,
said first and second selection means each select the corresponding candidates in order starting from the candidate with the smallest error; and
said search means searches for the candidate of said codevector whose error is lowest.
A voice-coder according to Claim 10, further comprising:
a gain codebook for quantization of each gain of said adaptive codebook and each of said excitation codebooks; wherein
said search means searches the candidate of said codevector by further consulting said gain codebook.
A voice-coder according to Claim 11, further comprising:
a gain codebook for quantization of each gain of said adaptive codebook and each of said excitation codebooks; wherein
said search means further determines the optimum gain codevector from said gain codebook by consulting said gain codebook, and reflects said gain codevector on the adaptive codevector and each of the excitation codevectors which form said excitation signal.
A voice-coder according to any of Claims 10 to 13, wherein,
at least one excitation super codebook is included in said plurality of excitation codebooks, said super codebook having a number of bits larger than the number of bits to be transmitted; and
said second selection means corresponding to said excitation super codebook performs selection of the candidate of the excitation codevector from said excitation super codebook according to the candidate of the excitation codevector already selected.