WO2013129439A1

WO2013129439A1 - Encoding device, encoding method, program and recording medium

Info

Publication number: WO2013129439A1
Application number: PCT/JP2013/055048
Authority: WO
Inventors: 守谷　健弘; 優鎌本; 登原田; 弘和亀岡; 茂樹嵯峨山; 崇良大嶋; 小野　順貴; 大輔齋藤
Original assignee: 日本電信電話株式会社; 国立大学法人東京大学; 大学共同利用機関法人情報・システム研究機構
Priority date: 2012-02-28
Filing date: 2013-02-27
Publication date: 2013-09-06
Also published as: JP5789816B2; JPWO2013129439A1

Abstract

A gain codebook having a plurality of sets stored therein is stored in a gain quantization unit. The sets contain candidates for quantized pitch gains, candidates for quantized fixed codebook gains, and indexes. A plurality of indexes includes indexes with different numbers of bits. In selecting a code, the waveform distortion itself caused by gain quantization, and the distortion converted from the length of the corresponding code are both considered. In other words, in order to select a code, not only is the distortion of a waveform caused by the use of the code considered, but the code length assigned to the code is considered.

Description

Encoding apparatus, method, program, and recording medium

The present invention relates to a technique for encoding or decoding sound signals such as voice and music. In particular, the present invention relates to a technique for encoding or decoding a gain of a periodic component and a gain of a pulse component that are encoded by an encoding technique such as CELP.

Conventionally, the gain of the periodic component and the gain of the pulse component used in CELP (Code-Excited Linear Prediction) encoding and decoding have been encoded and decoded with fixed-length bit allocations in order to increase tolerance to code errors (see, for example, Non-Patent Document 1). For the gain of the pulse component, the code amount has been reduced by exploiting the temporal continuity of the gain value: the ratio to a value predicted from past subframes, rather than the gain itself, is encoded.

Further, in Patent Document 1, whether the gain value of the periodic component has temporal continuity is determined from the gain of the periodic component, and when temporal continuity is predicted, the code amount is reduced by variable-length coding the difference between gain values.

Patent Document 1: International Publication No. WO2006/075605

In the encoding and decoding methods described in Non-Patent Document 1, the gain of the periodic component and the gain of the pulse component are encoded and decoded at a fixed length.
However, these methods do not take into account the redundancy arising from the occurrence frequency of the gain values of the periodic component and the pulse component, or from the continuity of the gain of the periodic component. Encoding and decoding efficiency is therefore poor.

Patent Document 1 discloses a technique for performing fixed-length or variable-length encoding and decoding in consideration of the continuity and frequency of the gain values of the periodic component.
However, the variable-length coding and decoding described in Patent Document 1 aims only to reduce the average code amount; it does not take both the distortion and the code length of the index into account when performing variable-length coding.

An object of the present invention is to provide an encoding device, a method, a program, and a recording medium that encode a gain obtained by an encoding scheme such as CELP with reference to a codebook more efficiently, by considering both the code length (information amount) and the distortion.

In encoding, the most preferable index is selected from the codebook corresponding to the gain. In this selection, not only the waveform distortion caused by using the code but also the code length assigned to the code is considered.

For vector quantization of the gain, a variable-length code that assigns short codes in advance to frequently occurring indexes is used. When selecting an index, the code length of the index is approximately converted into distortion, and the distortion measure modified in this way is used, so that the index with the most favorable balance between distortion and code length can be selected. As a result, the average bit rate is lower than with the conventional technique while the average waveform distortion remains substantially the same. Further, by allocating the bits saved on average to, for example, pulse component encoding, waveform distortion can be reduced relative to the conventional encoding method at the same average bit rate.

FIG. 1 is a functional block diagram of an example of an encoding apparatus. FIG. 2 is a functional block diagram of an example of a decoding apparatus. FIG. 3 is a functional block diagram of an example of an encoding apparatus.

Hereinafter, an embodiment of the present invention will be described in detail.

[First embodiment]
<Configuration>
As illustrated in FIG. 1, the encoding device 11 of the first embodiment includes a linear prediction analysis unit 111, an adaptive codebook 112, a fixed codebook 113, a pitch analysis unit 114 (corresponding to an "adaptive codebook search unit"), a search unit 115 (corresponding to a "fixed codebook search unit"), a perceptual weighting filter 116, a synthesis filter 117, a gain quantization unit 118, and a parameter encoding unit 119.

As illustrated in FIG. 2, the decoding device 12 of the first embodiment includes an adaptive codebook 122, a fixed codebook 123, a fixed codebook selection unit 125, a synthesis filter 127, and a parameter decoding unit 129.

The encoding device 11 and the decoding device 12 of the present embodiment are, for example, special devices configured by loading a program and data into a known or dedicated computer having a CPU (central processing unit), a RAM (random-access memory), a ROM (read-only memory), and the like. At least some of the processing units of the encoding device 11 and the decoding device 12 may instead be implemented in hardware such as an integrated circuit.

<Encoding>
The input acoustic signal x(n) (n = 0, ..., L−1, where L is an integer greater than or equal to 2), which is a digitized time-series signal divided into frames of a predetermined time interval, is input to the encoding device 11. Each n is called a "sample point". The encoding device 11 encodes the input acoustic signal x(n) (n = 0, ..., L−1) for each frame as follows.

The linear prediction analysis unit 111 performs linear prediction analysis on the input acoustic signal x(n) (n = 0, ..., L−1) at the sample points n = 0, ..., L−1 belonging to the frame being processed (referred to as the "current frame"), and obtains and outputs linear prediction information LPC info (included in the "prediction parameters"), which is a code corresponding to the quantized values of the coefficients that specify the all-pole synthesis filter 117 in the current frame. That is, for each frame, the linear prediction analysis unit 111 obtains and outputs linear prediction information LPC info, a code that specifies the linear prediction coefficients corresponding to the input acoustic signal x(n) (n = 0, ..., L−1) or coefficients convertible to them. For example, the linear prediction analysis unit 111 calculates the linear prediction coefficients a(m) (m = 1, ..., P, where P is the linear prediction order, a positive integer) corresponding to the input acoustic signal x(n) of the current frame, converts the linear prediction coefficients a(m) (m = 1, ..., P) into line spectrum pair coefficients LSP, and outputs a code corresponding to the quantized line spectrum pair coefficients LSP as the linear prediction information LPC info.

The fixed codebook 113 stores information for identifying a plurality of pulse sequences ("sample sequences"), each composed of one or more signals whose value is a combination of a non-zero unit pulse and its polarity and one or more signals whose value is zero. Under the control of the search unit 115, the fixed codebook 113 outputs a pulse sequence corresponding to the input acoustic signal x(n) for each subframe into which one frame is divided. Here, an example is shown in which one frame is equally divided into four subframes. That is, a frame composed of the L sample points 0, ..., L−1 is divided into:
the first subframe, consisting of sample points L_f(0), ..., L_f(1)−1;
the second subframe, consisting of sample points L_f(1), ..., L_f(2)−1;
the third subframe, consisting of sample points L_f(2), ..., L_f(3)−1; and
the fourth subframe, consisting of sample points L_f(3), ..., L_f(4)−1.
L_f(0), L_f(1), L_f(2), L_f(3), L_f(4) satisfy L_f(0) = 0, L_f(4) = L, and L_f(0) < L_f(1) < L_f(2) < L_f(3) < L_f(4). The pulse sequences c_f1, c_f2, c_f3, and c_f4 corresponding to the first to fourth subframes are expressed as follows.
c_f1 = (c_f1(L_f(0)), ..., c_f1(L_f(1)−1))
c_f2 = (c_f2(L_f(1)), ..., c_f2(L_f(2)−1))
c_f3 = (c_f3(L_f(2)), ..., c_f3(L_f(3)−1))
c_f4 = (c_f4(L_f(3)), ..., c_f4(L_f(4)−1))
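The equal four-way split described above can be sketched in Python as follows. This is only an illustration: the function names `subframe_bounds` and `split_frame` and the frame length of 160 samples are assumptions, not values taken from this description.

```python
def subframe_bounds(frame_len, num_subframes=4):
    """Return the boundaries L_f(0), ..., L_f(num_subframes) for an equal split."""
    if frame_len % num_subframes != 0:
        raise ValueError("frame_len must be divisible by num_subframes")
    step = frame_len // num_subframes
    return [j * step for j in range(num_subframes + 1)]


def split_frame(samples, bounds):
    """Slice one frame; subframe j covers sample points bounds[j-1] .. bounds[j]-1."""
    return [samples[bounds[j - 1]:bounds[j]] for j in range(1, len(bounds))]


# Hypothetical frame of L = 160 sample points (the sample values are just 0..159).
bounds = subframe_bounds(160)
subframes = split_frame(list(range(160)), bounds)
```

With these assumed values, `bounds[0]` is 0 and `bounds[4]` equals L, matching the conditions L_f(0) = 0 and L_f(4) = L.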

For each subframe, the pitch analysis unit 114 obtains the pitch periods T_1, T_2, T_3, T_4 corresponding to the input acoustic signal x(n) (n = 0, ..., L−1), and outputs the pitch periods T_1, T_2, T_3, T_4 and the pitch codes (periodic component codes) CT_1, CT_2, CT_3, CT_4 that specify them. The pitch codes CT_1, CT_2, CT_3, and CT_4 of the subframes may be of uniform length or of variable length, and their numbers of bits may be the same or different from each other. Because the pitch period can be obtained by decoding the pitch code, it is not essential for the pitch analysis unit 114 to output the pitch period. The pitch period may be expressed only as an integer multiple of the sample point interval (integer precision), or as an integer multiple of the sample point interval plus a fractional value (fractional precision). In addition, the pitch analysis unit 114 may obtain and output pitch gains g_p1, g_p2, g_p3, and g_p4 for each subframe for use in the search unit 115.

The search for the pitch periods T_1, T_2, T_3, T_4 corresponding to the input acoustic signal x(n) (n = 0, ..., L−1) and for the pitch codes CT_1, CT_2, CT_3, CT_4 that specify them is performed, for example, as follows for each subframe. A signal is obtained by delaying the excitation signals generated at past time points, which are stored in the adaptive codebook 112, by a pitch period candidate. The all-pole synthesis filter 117 specified by the linear prediction information LPC info is applied to this signal to obtain a synthesized signal. The search is performed so as to minimize the value obtained by applying the perceptual weighting filter 116 to the difference between this synthesized signal and the input acoustic signal.

The pitch gains g_p1, g_p2, g_p3, and g_p4 are obtained, for example, for each subframe as the value obtained by dividing the cross-correlation between the synthesized signal corresponding to the searched pitch period T_1, T_2, T_3, or T_4 and the input acoustic signal by the autocorrelation of that synthesized signal.
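The ratio of cross-correlation to autocorrelation described above can be sketched in Python as follows. The function name `pitch_gain` is hypothetical, and returning 0 for a zero-energy synthesized signal is an assumption added to avoid division by zero; it is not stated in this description.

```python
def pitch_gain(input_signal, synthesized):
    """Cross-correlation of input and synthesized signal over the synthesized signal's autocorrelation."""
    cross = sum(x * y for x, y in zip(input_signal, synthesized))
    auto = sum(y * y for y in synthesized)
    if auto == 0.0:
        return 0.0  # assumed fallback for a silent synthesized signal
    return cross / auto


# If the input is exactly twice the synthesized signal, the gain is 2.
gain = pitch_gain([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
```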

The adaptive codebook 112 stores the excitation signals generated at past time points. For each of the first to fourth subframes, the adaptive codebook 112 outputs an adaptive signal component v(n) (n = 0, ..., L−1) obtained by delaying the excitation signal according to the pitch period T_1, T_2, T_3, or T_4 obtained for that subframe. When the adaptive signal component v(n) is expressed using a pitch period with fractional precision, an interpolation filter that takes a weighted average of a plurality of excitation signals delayed according to the pitch period is used.
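A minimal Python sketch of the delay operation for an integer-precision pitch period is given below. The function name `adaptive_component` is hypothetical. Reusing newly generated samples when the pitch period is shorter than the subframe is a common CELP convention assumed here; this description does not specify it, and the fractional-precision interpolation filter is not covered.

```python
def adaptive_component(past_excitation, pitch_period, num_samples):
    """Return num_samples of v(n), each a copy of the excitation pitch_period samples earlier."""
    if not 1 <= pitch_period <= len(past_excitation):
        raise ValueError("pitch_period must lie within the stored excitation")
    buffer = list(past_excitation)
    output = []
    for _ in range(num_samples):
        sample = buffer[-pitch_period]  # excitation delayed by the pitch period
        output.append(sample)
        buffer.append(sample)  # assumed convention: the new sample becomes part of the history
    return output


# Hypothetical stored excitation and a pitch period of 2 samples.
v = adaptive_component([1.0, 2.0, 3.0, 4.0], pitch_period=2, num_samples=5)
```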

For each subframe, the search unit 115 obtains and outputs the pulse sequences c_f1, c_f2, c_f3, c_f4 corresponding to the input acoustic signal x(n) (n = 0, ..., L−1) and the code indexes C_f1, C_f2, C_f3, C_f4 (codes that specify the pulse sequences c_f1, c_f2, c_f3, c_f4 corresponding to the input acoustic signal x(n) (n = 0, ..., L−1)). Because the pulse sequence can be obtained by decoding the code index, it is not essential for the search unit 115 to output the pulse sequence.

The input acoustic signal x(n) (n = 0, ..., L−1) and a synthesized signal x′(n) (n = 0, ..., L−1) are input to the gain quantization unit 118. The sample sequence composed of the synthesized signal x′(n) (n = 0, ..., L−1) is obtained by adding, sample by sample, the following two sample sequences. The first is obtained by passing the pulse sequence corresponding to the code index through the synthesis filter 117 and multiplying each sample of the result by a quantized fixed codebook gain candidate. The second is obtained by passing the past excitation signal, delayed by the number of samples corresponding to the pitch period corresponding to the pitch code, through the synthesis filter 117 and multiplying each sample of the result by a quantized pitch gain candidate. Using these, the gain quantization unit 118 performs vector quantization. That is, for each subframe, it obtains and outputs a code corresponding to the pair of the quantized pitch gain (or its function value) and the quantized fixed codebook gain (or its function value), so as to minimize the distortion between the input acoustic signal and the synthesized signal. Hereinafter, the pitch gain after quantization is referred to as the "quantized pitch gain", and the fixed codebook gain after quantization is referred to as the "quantized fixed codebook gain". The codes corresponding to the pairs of quantized pitch gain and quantized fixed codebook gain ($\hat{g}_{p1}$, $\hat{g}_{c1}$), ($\hat{g}_{p2}$, $\hat{g}_{c2}$), ($\hat{g}_{p3}$, $\hat{g}_{c3}$), and ($\hat{g}_{p4}$, $\hat{g}_{c4}$) are referred to as the "gain codes GA_f1, GA_f2, GA_f3, GA_f4". That is, for each of the first to fourth subframes (the j-th subframe, j = 1, ..., 4), the gain quantization unit 118 outputs the gain code GA_fj that specifies the pair of quantized pitch gain $\hat{g}_{pj}$ and quantized fixed codebook gain $\hat{g}_{cj}$.

For such vector quantization, a gain codebook is used, for example. A gain codebook is a table for specifying the gain code corresponding to a pair of quantized pitch gain and quantized fixed codebook gain. An example of the gain codebook is a table storing a plurality of sets, each consisting of a quantized pitch gain candidate, a quantized fixed codebook gain candidate, and an index. A function value of the quantized pitch gain may be the target of vector quantization instead of the quantized pitch gain itself, and a function value of the quantized fixed codebook gain may be the target instead of the quantized fixed codebook gain itself. The following describes an example in which the quantized pitch gain itself and the quantized fixed codebook gain itself are the targets of vector quantization.

An example of a function value of the quantized fixed codebook gain is a correction coefficient that represents the ratio between the fixed codebook gain in the current subframe (or frame) and an estimate of the fixed codebook gain in the current subframe (or frame) predicted from the energy of the signal component from the fixed codebook 113 in past or current subframes (or frames). An example of the correction coefficient is γ_gc described in "5.8.2 Quantization of codebook gains" of Non-Patent Document 1. For example, the following relationship holds among the quantized fixed codebook gain $\hat{g}_{cj}$ in the j-th subframe (j = 1, ..., 4), the quantized value $\hat{\gamma}_{gc}$ of the correction coefficient γ_gc, and the quantized value $\widehat{pg}_{cj}$ of the estimate of the fixed codebook gain in the j-th subframe (j = 1, ..., 4).
$$\hat{g}_{cj} = \hat{\gamma}_{gc} \times \widehat{pg}_{cj}$$

<Vector Quantization Performed by Gain Quantization Unit 118>
In the vector quantization performed by the gain quantization unit 118, the input acoustic signal x(n) (n = 0, ..., L−1) and the synthesized signal x′(n) (n = 0, ..., L−1) are used as inputs for each subframe, and one of the plurality of indexes stored in the gain codebook is selected as the gain code. Since the present invention is characterized by this selection process, its [Principle] and an [Example of specific procedure] are described in order below.

[Principle]
In the vector quantization performed in the gain quantization of the present invention, variable-length codes are assigned as the indexes of the gain codebook. In the present invention, an index is selected not on the criterion of minimizing the coding distortion (hereinafter simply "distortion"), but on a criterion that balances the distortion against the number of bits of the index (code length, information amount). This criterion is described below.

In general, when N (N ≥ 1) samples are encoded, the following approximation holds between the distortion D_g and the number of code (index) bits per sample g/N.
$$D_g = D_0 \cdot 2^{-2g/N} \tag{1}$$
Here, the distortion is defined as the squared distance between the N samples to be encoded and the N reconstructed samples obtained by decoding the code corresponding to those N samples. D_0 is the distortion when the number of code bits per sample is 0.
This relationship is based on the fact that, if the sample amplitudes are uniformly distributed, the distortion D_g falls to 1/4 each time the number of code bits per sample g/N increases by one bit, and that the same holds regardless of the amplitude distribution provided the number of code bits g is sufficiently large.

Therefore, the distortion change rate D_g/D_0 can be approximated as follows.
$$\frac{D_g}{D_0} = 2^{-2g/N} \tag{2}$$
Taking the base-10 logarithm and expressing this in dB gives the following.
$$10\log_{10}\frac{D_g}{D_0} = 10\log_{10}\left(2^{-2g/N}\right) = -\frac{20g}{N}\log_{10}(2) \approx -6.02\,\frac{g}{N} \tag{3}$$
That is, in general, the logarithm of the distortion change rate D_g/D_0 is proportional to the number of code bits per sample g/N. For example, when N = 64, the distortion change rate D_g/D_0 improves by about 0.1 dB (6.02/64 ≈ 0.094 dB) each time the number of code bits g per 64 samples increases by one bit. Similar results are obtained experimentally.
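The figure of about 0.1 dB per bit for N = 64 can be checked numerically with the short Python sketch below, which evaluates Equation (3) directly. The function name `distortion_change_db` is hypothetical.

```python
import math


def distortion_change_db(bits, num_samples):
    """10*log10(D_g / D_0) under the approximation D_g = D_0 * 2**(-2*bits/num_samples)."""
    return 10.0 * math.log10(2.0 ** (-2.0 * bits / num_samples))


# One extra bit spread over N = 64 samples.
delta_64 = distortion_change_db(1, 64)
# One extra bit for a single sample gives the familiar figure of about 6.02 dB.
delta_1 = distortion_change_db(1, 1)
```

Under this approximation, `delta_64` is roughly −0.094 dB and `delta_1` roughly −6.02 dB; the negative sign indicates reduced distortion.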

As described above, Equation (1) approximates the general relationship between the distortion D_g and the number of code bits per sample g/N. Let D be the distortion caused by using the quantized pitch gain candidate and the quantized fixed codebook gain candidate corresponding to a certain index in the gain codebook, and let b be the number of bits of that index. A small b has one of two advantages. Either the total number of bits output by the encoding device is reduced, or, under a constraint that the average number of bits of the code output from the encoding device (or the number of bits within a certain time interval) is constant, the saved bits can be used in subsequent frames to reduce their distortion. Conventionally, the index selected on the criterion of minimizing D is used as the gain code. In this application, by contrast, the distortion and the number of bits are evaluated with a single index value, and the optimal index is selected as the gain code. To this end, it is assumed, based on the relationship of Equation (1), that the bits saved in gain quantization are allocated, within a fixed number of bits for the frame, to encoding of the input acoustic signal. When b bits are used for the gain code, the number of bits available for encoding the input acoustic signal is one fewer than when (b−1) bits are used, and the distortion increases by a factor of 2^(2/N). Therefore, when b bits are used for the gain code, the distortion increases by a factor of 2^(2b/N). The distortion obtained by converting the consumption of bits into distortion in this way is defined as D_U by Equation (4).
$$D_U = D \times 2^{2b/N} \tag{4}$$
D_U becomes smaller as the distortion D becomes smaller, and also as the number of bits b of the index becomes smaller. By searching for the code that minimizes D_U, a gain code can be selected by evaluating the distortion D and the number of bits b of the index with a single index value. Hereinafter, D_U is referred to as the index value.

Instead of Equation (4), a code that minimizes the following index value D_U may be searched for.
$$D_U = 10\log_{10}\left(D \times 2^{2b/N}\right) = 10\log_{10}(D) + \frac{20b}{N}\log_{10}(2) \tag{5}$$
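Equations (4) and (5) can be sketched in Python as follows, as an illustration only. The function names and the two candidate values (distortion, bits) are hypothetical. Because Equation (5) is a monotonic function of Equation (4), both rank candidates identically.

```python
import math


def index_value(distortion, bits, num_samples):
    """Equation (4): distortion scaled by 2**(2*bits/num_samples)."""
    return distortion * 2.0 ** (2.0 * bits / num_samples)


def index_value_db(distortion, bits, num_samples):
    """Equation (5): the same quantity expressed in dB."""
    return 10.0 * math.log10(distortion) + (20.0 * bits / num_samples) * math.log10(2.0)


# Hypothetical candidates for a subframe of N = 40 samples.
N = 40
low_distortion_long_code = index_value(1.00, 7, N)
higher_distortion_short_code = index_value(1.02, 3, N)
```

In this made-up case the candidate with slightly higher distortion but a 3-bit index has the smaller index value, so it would be selected over the candidate with the 7-bit index.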

Since the value of the exponent 2b/N in Equation (4) is very small, the exponential part of Equation (4) may be expanded as a Taylor series (e^x = 1 + x + x^2/2 + x^3/6 + ...) and approximated by omitting the second-order and higher terms of the resulting polynomial. Therefore, a code that minimizes the following index value D_U may be searched for.
$$D_U = D\left(1 + \frac{(2\ln 2)\,b}{N}\right) \tag{6}$$

Equation (6) is based on various assumptions and approximations, as described above. More generally, Equation (7), in which γ is a positive constant, may be used as the index value. The value of γ is preferably set based on experiments suited to the purpose.
$$D_U = D(1 + \gamma b) \tag{7}$$
In short, the index value D_U is not limited to the above. Any value obtained by adding or multiplying the distortion D and a coefficient that increases with the number of bits b of the index may be used as the index value D_U, and the code that minimizes this D_U is searched for.
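The Python sketch below compares the exact multiplier 2^(2b/N) of Equation (4) with the linear multiplier 1 + γb of Equation (7). It assumes that γ is taken from the first-order Taylor term, γ = (2 ln 2)/N, which follows from 2^(2b/N) = e^((2b/N) ln 2); as noted above, a value tuned by experiment is preferable in practice. All function names are hypothetical.

```python
import math


def exact_factor(bits, num_samples):
    """Multiplier applied to D in Equation (4)."""
    return 2.0 ** (2.0 * bits / num_samples)


def linear_factor(bits, gamma):
    """Multiplier applied to D in Equation (7)."""
    return 1.0 + gamma * bits


def taylor_gamma(num_samples):
    """Assumed choice of gamma: coefficient of the first-order Taylor term."""
    return 2.0 * math.log(2.0) / num_samples


N = 40
gamma = taylor_gamma(N)
gap = exact_factor(5, N) - linear_factor(5, gamma)
```

For N = 40 and b = 5 the two multipliers differ by less than 0.02, which illustrates why the linear form can stand in for the exponential one when 2b/N is small.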

[Example of specific procedure]
A specific procedure of vector quantization performed by the gain quantization unit 118 based on the above principle will be exemplified.
The gain quantization unit 118 applies variable-length coding (for example, Huffman coding) to the pair of a quantized pitch gain candidate and a quantized fixed codebook gain candidate to obtain an index serving as the gain code. For example, the gain quantization unit 118 includes a gain codebook storing a plurality of sets, each consisting of a quantized pitch gain candidate, a quantized fixed codebook gain candidate, and an index that is a variable-length code, and obtains, according to the criterion described above, the index serving as the gain code corresponding to the input acoustic signal x(n) (n = 0, ..., L−1).

The variable-length code is obtained, for example, from the result of quantizing learning data. Specifically, the variable-length codes are assigned in advance according to the frequency with which each pair of quantized pitch gain candidate and quantized fixed codebook gain candidate is selected when the pairs of pitch gain and fixed codebook gain in the learning data are vector-quantized. A frequently selected pair of quantized pitch gain candidate and quantized fixed codebook gain candidate is assigned an index with a small number of bits (a short code), and an infrequently selected pair is assigned an index with a large number of bits (a long code). That is, the plurality of indexes stored in the gain codebook include indexes with different numbers of bits. An example of such an index is a Huffman code, but other variable-length codes may also be used. The frequency with which each pair is selected, which determines the number of bits of each index, can to some extent be predicted without using learning data. Therefore, variable-length codes may also be assigned to the pairs of quantized pitch gain candidates and quantized fixed codebook gain candidates by predicting these frequencies without using learning data.
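One way to derive such indexes from selection counts is ordinary Huffman coding, sketched below in Python. This is a generic textbook construction, not a procedure taken from this description. The function name `huffman_codes` and the selection counts are hypothetical, and the exact bit patterns depend on how ties are broken.

```python
import heapq
from itertools import count


def huffman_codes(frequencies):
    """Map each codebook entry to a bit string; frequent entries get shorter strings."""
    if not frequencies:
        return {}
    if len(frequencies) == 1:
        return {next(iter(frequencies)): "0"}
    tie_breaker = count()  # keeps heap comparisons away from the dicts
    heap = [(freq, next(tie_breaker), {entry: ""}) for entry, freq in frequencies.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        freq0, _, codes0 = heapq.heappop(heap)
        freq1, _, codes1 = heapq.heappop(heap)
        merged = {entry: "0" + code for entry, code in codes0.items()}
        merged.update({entry: "1" + code for entry, code in codes1.items()})
        heapq.heappush(heap, (freq0 + freq1, next(tie_breaker), merged))
    return heap[0][2]


# Hypothetical counts of how often four codebook entries were chosen on learning data.
selection_counts = {0: 50, 1: 25, 2: 15, 3: 10}
codes = huffman_codes(selection_counts)
code_lengths = {entry: len(code) for entry, code in codes.items()}
```

With these counts the most frequent entry receives a 1-bit index and the two rarest entries receive 3-bit indexes.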

A specific example of the gain codebook is shown below.

Table 1 shows an example of a gain codebook in which the indexes are Huffman codes and the number of bits of each index is also stored. Although part of it is omitted in Table 1, the gain codebook in Table 1 stores 32 sets, each consisting of a quantized pitch gain candidate, a quantized fixed codebook gain candidate, an index, and the number of bits of the index.

For each subframe (time interval) composed of N sample points, the gain quantization unit 118 obtains a synthesized signal sample sequence αY + βZ by adding, sample by sample, a sample sequence βZ and a sample sequence αY. The sample sequence βZ is obtained by multiplying each sample of a sample sequence Z by a quantized fixed codebook gain candidate β, where Z is obtained by passing the pulse sequence corresponding to the code index (the sample sequence from the fixed codebook 113) through the synthesis filter 117. The sample sequence αY is obtained by multiplying each sample of a sample sequence Y by a quantized pitch gain candidate α, where Y is obtained by passing the past excitation signal, delayed by the number of samples corresponding to the pitch period corresponding to the pitch code, through the synthesis filter 117. The gain quantization unit 118 then outputs, as the gain code, the index that minimizes the index value D_U, which is obtained by adding or multiplying the distortion D between αY + βZ and the input acoustic signal X and a coefficient that increases as the number of bits b of the index increases. For example, the synthesized signal sample sequence αY + βZ of the j-th subframe is the sample sequence consisting of the synthesized signals x′(n) = $\hat{g}_{pj}$ × v(n) + $\hat{g}_{cj}$ × c_fj(n) at the N sample points n belonging to the j-th subframe, where $\hat{g}_{pj}$ = α and $\hat{g}_{cj}$ = β. The input acoustic signal X of the j-th subframe is, for example, the sample sequence consisting of the input acoustic signals x(n) at the N sample points n belonging to the j-th subframe. For example, for each index, the gain quantization unit 118 sets $\hat{g}_{pj}$ = α and $\hat{g}_{cj}$ = β to the candidates corresponding to that index, calculates the index value D_U of the j-th subframe from the sample sequence of synthesized signals x′(n) and the sample sequence of input acoustic signals x(n) at the N sample points n belonging to the j-th subframe, and outputs the index that minimizes D_U as the gain code of the j-th subframe.
The set of the quantized pitch gain candidate α, the quantized fixed codebook gain candidate β, and the index corresponding to each index value D_U is a set of a quantized pitch gain candidate, a quantized fixed codebook gain candidate, and an index stored in the gain codebook. When function values of quantized pitch gain candidates are stored in the gain codebook, the quantized pitch gain candidate obtained from the function value may be used as α. Similarly, when function values of quantized fixed codebook gain candidates are stored in the gain codebook, the quantized fixed codebook gain candidate obtained from the function value may be used as β. The number of samples in the sample sequence Z, in the sample sequence Y, and in the synthesized signal sample sequence αY + βZ is N in each case. The synthesis filter 117 is a linear FIR (Finite Impulse Response) filter that obtains the sample υ(n) at a sample point n as the sum of the products a(1) × χ(n−1), a(2) × χ(n−2), ..., a(P) × χ(n−P) of the samples χ(n−1), χ(n−2), ..., χ(n−P) at the P sample points n−1, n−2, ..., n−P preceding the sample point n and the linear prediction coefficients a(1), a(2), ..., a(P). The synthesis filter 117 is expressed as follows.
$$\upsilon(n) = a(1)\,\chi(n-1) + a(2)\,\chi(n-2) + \cdots + a(P)\,\chi(n-P)$$
For example, when a sample sequence C is obtained by passing a sample sequence A through the synthesis filter 117, the samples contained in the sample sequence A are at least part of χ(n−1), χ(n−2), ..., χ(n−P), and υ(n) is the sample at sample point n of the sample sequence C. When at least part of χ(n−1), χ(n−2), ..., χ(n−P) corresponds to sample points earlier than the sample sequence A, that part consists, for example, of samples contained in a sample sequence preceding the sample sequence A. Alternatively, when no sample sequence precedes the sample sequence A, that part of χ(n−1), χ(n−2), ..., χ(n−P) is taken to be a constant.
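The filter equation above can be sketched in Python as follows. The sketch implements the sum exactly as written, so each output sample depends only on earlier input samples. The function name `synthesis_filter`, the `history` argument, and the choice of zero as the constant for missing past samples are assumptions made for illustration.

```python
def synthesis_filter(coeffs, samples, history=None):
    """Compute out[n] = a(1)*x(n-1) + ... + a(P)*x(n-P) with coeffs = [a(1), ..., a(P)]."""
    order = len(coeffs)
    if history is None:
        history = [0.0] * order  # assumed constant when no earlier sample sequence exists
    if len(history) != order:
        raise ValueError("history must hold exactly P past samples")
    extended = list(history) + list(samples)  # oldest sample first
    output = []
    for n in range(len(samples)):
        pos = n + order  # position of x(n) inside `extended`
        output.append(sum(coeffs[m - 1] * extended[pos - m] for m in range(1, order + 1)))
    return output


# Hypothetical second-order coefficients and a short input sequence.
filtered = synthesis_filter([0.5, 0.25], [1.0, 2.0, 3.0, 4.0])
```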

A specific example of the index value D_U is given below.
Let a subframe consist of N sample points S, ..., S+N−1 (S is an integer of 0 or more). Let the input acoustic signal X belonging to the subframe be expressed as the vector X = (x(S), ..., x(S+N−1)), the sample sequence Z as the vector Z = (z(S), ..., z(S+N−1)), and the sample sequence Y as the vector Y = (y(S), ..., y(S+N−1)). Defining the squared error between the sample sequence αY + βZ and the input acoustic signal X as the distortion D, the distortion D is expressed as follows.
$$D = (X - \alpha Y - \beta Z)(X - \alpha Y - \beta Z)^{T} \tag{8}$$
Here, σ^T denotes the transpose of σ.

For example, in the example described above in which one frame is equally divided into four subframes, the j-th subframe (j = 1, ..., 4) consists of the N = L_f(j) − L_f(j−1) sample points L_f(j−1), ..., L_f(j)−1. Here, the input acoustic signal X in the j-th subframe is expressed as the vector X_j = (x(L_f(j−1)), ..., x(L_f(j)−1)). The sample sequence Z obtained by passing the pulse sequence c_fj = (c_fj(L_f(j−1)), ..., c_fj(L_f(j)−1)) from the fixed codebook 113 in the j-th subframe through the synthesis filter 117 is expressed as the vector Z_j = (z(L_f(j−1)), ..., z(L_f(j)−1)). Further, the sample sequence Y obtained by passing the adaptive signal components (past excitation signals) v(L_f(j−1)), ..., v(L_f(j)−1) in the j-th subframe through the synthesis filter 117 is expressed as the vector Y_j = (y(L_f(j−1)), ..., y(L_f(j)−1)). The distortion D in the j-th subframe is then expressed as follows.
$$D = (X_j - \alpha Y_j - \beta Z_j)(X_j - \alpha Y_j - \beta Z_j)^{T} \tag{9}$$

The index value D_U may be, for example, that of Equation (4), Equation (5), or Equation (7) above, or may be the following index value D_U approximated as in Equation (6).

The distortion D is the coding distortion, and D_U is an index value that accounts for both the distortion and the number of bits b (code length) of the gain code. That is, the conventional selection method selects the index that minimizes the distortion D as the gain code, whereas the selection method of the present invention selects the index that minimizes the index value D_U as the gain code. In the present invention, the gain code is thus selected in consideration of both the distortion D and the number of bits b (code length) of the gain code.

As shown in Table 1, if each index and its number of bits b are stored in association with each other in the gain codebook, the gain quantization unit 118 can calculate the index value D_U without computing the number of bits b from the index. However, even if the number of bits b is not stored in the gain codebook, the gain quantization unit 118 can calculate the index value D_U by computing the number of bits from the index. Therefore, storing the number of bits b of each index in the gain codebook is not essential.
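The selection procedure can be sketched in Python as follows, using the index value of Equation (7) and computing the number of bits b from the length of each index. The three-entry codebook, the signal values, and the function names are made up for illustration and are not taken from Table 1. Setting γ to (2 ln 2)/N is likewise an assumption.

```python
import math


def squared_error(x, y, z, alpha, beta):
    """Distortion D: squared error between X and alpha*Y + beta*Z."""
    return sum((xi - alpha * yi - beta * zi) ** 2 for xi, yi, zi in zip(x, y, z))


def select_gain_code(x, y, z, codebook, gamma):
    """Return (index, D_U) for the entry minimizing D_U = D * (1 + gamma * b)."""
    best_index, best_value = None, float("inf")
    for alpha, beta, index in codebook:
        bits = len(index)  # b computed from the index itself
        value = squared_error(x, y, z, alpha, beta) * (1.0 + gamma * bits)
        if value < best_value:
            best_index, best_value = index, value
    return best_index, best_value


# Hypothetical codebook entries: (pitch gain candidate, fixed codebook gain candidate, index).
codebook = [(1.0, 0.4, "0"), (0.5, 1.0, "10"), (1.0, 0.5, "110")]
Y = [1.0, 1.0, -1.0, -1.0]
Z = [0.0, -1.0, 1.0, 0.0]
X = [1.1, 0.53, -0.53, -0.9]

conventional, _ = select_gain_code(X, Y, Z, codebook, gamma=0.0)
proposed, _ = select_gain_code(X, Y, Z, codebook, gamma=2.0 * math.log(2.0) / len(X))
```

With γ = 0 the search reduces to conventional minimum-distortion selection and picks the 3-bit index; with the assumed positive γ it picks the 1-bit index, whose distortion is only slightly larger.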

[Modification Example 1 of Index Value]
Alternatively, the gain quantization unit 118 may search the gain codebook for the index corresponding to the quantized pitch gain candidate and the quantized fixed codebook gain candidate that minimize the following index value D_U, and output the obtained index as the gain code.

Here, B is the number of bits of the code needed to encode, with a uniform-length code, all the sets of quantized pitch gain candidates and quantized fixed codebook gain candidates stored in the gain codebook. For example, the gain codebook illustrated in Table 1 contains 32 = 2⁵ indexes, so an example of B in this case is 5. Formula (11) assumes that, when variable-length coding is performed using the gain codebook illustrated in Table 1 and an index with b bits is used as the gain code, the (B − b) bits freed by switching from uniform-length coding to variable-length coding can be used to further encode the input acoustic signal and thereby reduce distortion. In practice, when (B − b) is positive, those bits may simply be saved as a reduction in the amount of information, or may be used in the next and subsequent subframes. Formula (11), however, serves as a criterion for selection from the gain codebook that evaluates, in terms of distortion, how many fewer bits the variable-length code uses than the uniform-length code. The term (2 log 2)B/N in formula (11) is very small compared to 1. Therefore, there is no large difference between the index value D_U of formula (10) and that of formula (11).
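The closeness of the two index values can be checked numerically. The sketch below takes formula (10) as D{1 + (2 log 2)b/N} and formula (11) as D{1 + (2 log 2)(b − B)/N}, as recited in the claims. The candidate values, the subframe length N = 160, the reading of log as the natural logarithm, and the function names are assumptions for illustration only.

```python
import math


def index_value_10(d, b, n):
    """D_U = D{1 + (2 log 2) b / N}; log is read as the natural logarithm (assumption)."""
    return d * (1.0 + 2.0 * math.log(2.0) * b / n)


def index_value_11(d, b, n, big_b):
    """D_U = D{1 + (2 log 2)(b - B) / N}, with B the uniform-length code size."""
    return d * (1.0 + 2.0 * math.log(2.0) * (b - big_b) / n)


# Hypothetical candidates: (distortion D, number of index bits b).
CANDIDATES = [(0.04, 1), (0.52, 2), (0.0256, 3)]
N, B = 160, 5  # assumed subframe length; B = 5 as for a 32-entry codebook

best_10 = min(range(len(CANDIDATES)), key=lambda i: index_value_10(*CANDIDATES[i], N))
best_11 = min(range(len(CANDIDATES)), key=lambda i: index_value_11(*CANDIDATES[i], N, B))

# The two formulas differ only by the term (2 log 2) B / N inside the braces.
offset = 2.0 * math.log(2.0) * B / N
```

With these assumed values the offset is a few percent and both formulas pick the same candidate, which is consistent with the remark that the two index values differ little.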

[Modification Example 2 of Index Value]
Alternatively, the gain quantization unit 118 may search the gain codebook for the index corresponding to the quantized pitch gain candidate and the quantized fixed codebook gain candidate that minimize the following index value D_U, and may output the obtained index as the gain code.
D_U = D{1 + v(2 log 2)b/N} … (12)

Here, v is a positive coefficient.

In other words, formula (12) is obtained by multiplying the second term in the braces of formula (10), that is, the term that increases as the number of bits b of the index increases, by the positive coefficient v. The gain quantization unit 118 may search the gain codebook for the index corresponding to the quantized pitch gain candidate and the quantized fixed codebook gain candidate that minimize the index value D_U obtained by formula (12), and may output the obtained index as the gain code.

If the index value D_U of formula (12) is used, the number of bits of the gain code in each subframe can be adjusted by adjusting the coefficient v. Such adjustment is effective, for example, when the number of bits of the bit stream output from the parameter encoding unit 119 is fixed for each frame. The adjustment can be made as follows. When the number of bits of the gain code obtained in a time interval (for example, a subframe) that precedes the current one by a predetermined time is smaller than the average number of bits of the indexes, v in the current time interval (for example, subframe) is set to a value smaller than 1. When that number of bits is larger than the average number of bits of the indexes, v in the current time interval is set to a value greater than 1.

For example, in the first subframe, the gain quantization unit 118 selects, as the gain code, the index that minimizes the index value D_U obtained by formula (12) with v = 1. If, as a result, an index whose number of bits is smaller than the average number of bits is selected as the gain code in the first subframe, the number of bits that can be allocated to the remaining subframes becomes larger than the average. In that case, in the second subframe, the index that minimizes the index value D_U obtained by formula (12) with v = 0.5 is selected as the gain code. That is, in the second subframe, the index is selected with priority given to reducing the distortion D over reducing the number of bits of the index. Conversely, if an index whose number of bits is larger than the average number of bits is selected as the gain code in the first subframe, the number of bits that can be allocated to the remaining subframes becomes smaller than the average. In that case, in the second subframe, the index that minimizes the index value D_U obtained by formula (12) with v = 2.0 is selected as the gain code. That is, in the second subframe, the index is selected with priority given to reducing the number of bits of the index over reducing the distortion D.
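The subframe-by-subframe adjustment of v described above can be sketched as follows. This is an illustrative sketch only: formula (12) is taken as D{1 + v(2 log 2)b/N} as recited in the claims, the values 0.5, 1, and 2.0 follow the example in the text, and the handling of the case where the previous bit count equals the average (v = 1), the candidate values, and all function names are assumptions.

```python
import math


def choose_v(prev_bits, average_bits, low=0.5, neutral=1.0, high=2.0):
    """Pick v for the current subframe from the previous gain code length."""
    if prev_bits is None:           # first subframe: no history yet
        return neutral
    if prev_bits < average_bits:    # bits were saved: favor lower distortion
        return low
    if prev_bits > average_bits:    # bits were overspent: favor shorter codes
        return high
    return neutral                  # equal case is not specified in the text (assumption)


def index_value_12(d, b, n, v):
    """D_U = D{1 + v (2 log 2) b / N}; log read as natural logarithm (assumption)."""
    return d * (1.0 + v * 2.0 * math.log(2.0) * b / n)


def encode_frame(subframes, n, average_bits):
    """subframes: one candidate list per subframe; each candidate is (index_bits, D)."""
    chosen, prev_bits = [], None
    for candidates in subframes:
        v = choose_v(prev_bits, average_bits)
        best = min(candidates, key=lambda c: index_value_12(c[1], len(c[0]), n, v))
        chosen.append((best[0], v))
        prev_bits = len(best[0])
    return chosen


# Hypothetical data: the same two candidates in each of four subframes.
candidates = [("0", 0.040), ("111", 0.038)]
result = encode_frame([candidates] * 4, n=40, average_bits=2)
```

With these assumed numbers the choice alternates: a short code in one subframe lowers v and lets the next subframe spend more bits on lower distortion, and vice versa.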

If the coefficient v differs, the optimum gain codebook corresponding to the coefficient v also differs. Therefore, the gain quantization unit 118 may include a plurality of gain codebooks, and may obtain and output the gain code using the gain codebook predetermined for each value of the coefficient v. For example, a plurality of gain codebooks, namely a gain codebook corresponding to v = 1, a gain codebook corresponding to v = 2, and a gain codebook corresponding to v = 0.5, may be stored in the gain quantization unit 118 in advance. The gain codebook corresponding to the value of v is then selected, and the index that minimizes the index value D_U is selected from the selected gain codebook and output as the gain code. In this case, the criterion for selecting the gain codebook is the same in the encoding device 11 and the decoding device 12.

Just as the index value D_U obtained by formula (6) can be used in place of the index value D_U obtained by formula (5), the index value D_U obtained by formula (13) can be used in place of the index value D_U obtained by formula (12).
D_U = D(1 + vγb) … (13)
Here, since γ is a positive constant and v is a positive coefficient, the product vγ can be written as a single positive coefficient w, and the index value D_U obtained by formula (14) can be used.
D_U = D(1 + wb) … (14)

If the index value D_U of formula (14) is used, the number of bits of the gain code in each subframe can be adjusted by adjusting the coefficient w. This adjustment can be made as follows. When the number of bits of the gain code obtained in a time interval (for example, a subframe) that precedes the current one by a predetermined time is smaller than the average number of bits of the indexes, w in the current time interval (for example, subframe) is set to a value smaller than a predetermined value w₀. When that number of bits is larger than the average number of bits of the indexes, w in the current time interval is set to a value larger than the predetermined value w₀.
For example, in the first subframe, the gain quantization unit 118 selects, as the gain code, the index that minimizes the index value D_U obtained by formula (14) with w = w₀. If, as a result, an index whose number of bits is smaller than the average number of bits is selected as the gain code in the first subframe, the number of bits that can be allocated to the remaining subframes becomes larger than the average. In that case, in the second subframe, the index that minimizes the index value D_U obtained by formula (14) with w = 0.5w₀ is selected as the gain code. Conversely, if an index whose number of bits is larger than the average number of bits is selected as the gain code in the first subframe, the number of bits that can be allocated to the remaining subframes becomes smaller than the average. In that case, in the second subframe, the index that minimizes the index value D_U obtained by formula (14) with w = 2.0w₀ is selected as the gain code. As in the case of the coefficient v described above, when the coefficient w is used, the gain quantization unit 118 may include a plurality of gain codebooks and may obtain and output the gain code using the gain codebook predetermined for each value of the coefficient w. In this case as well, the criterion for selecting the gain codebook is the same in the encoding device 11 and the decoding device 12 (end of description of <vector quantization performed by the gain quantization unit 118>).

When the gain codes GA_f1, GA_f2, GA_f3, GA_f4 of the respective subframes have been obtained by the gain quantization unit 118, two sample sequences are added for each corresponding sample. The first is obtained by multiplying the pulse sequences c_f1, c_f2, c_f3, c_f4 (sample sequences from the fixed codebook 113) corresponding to the code indexes C_f1, C_f2, C_f3, C_f4 by the quantized fixed codebook gains g_c1^, g_c2^, g_c3^, g_c4^. The second is obtained by multiplying the adaptive signal component v(n) (n = 0, ..., L−1), which is the past excitation signal delayed by the number of samples corresponding to the pitch periods T₁, T₂, T₃, T₄ of the respective subframes, by the quantized pitch gains g_p1^, g_p2^, g_p3^, g_p4^. The resulting excitation signal u′(n) (n = 0, ..., L−1), given below, is added to the adaptive codebook 112.
u′(n) = g_p1^ × v(n) + g_c1^ × c_f1(n) (n = L_f(0), ..., L_f(1) − 1)
u′(n) = g_p2^ × v(n) + g_c2^ × c_f2(n) (n = L_f(1), ..., L_f(2) − 1)
u′(n) = g_p3^ × v(n) + g_c3^ × c_f3(n) (n = L_f(2), ..., L_f(3) − 1)
u′(n) = g_p4^ × v(n) + g_c4^ × c_f4(n) (n = L_f(3), ..., L_f(4) − 1)
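The four per-subframe expressions above share one pattern, which can be sketched as a loop over subframe boundaries. The function name, the list-based signal representation, and the sample values are hypothetical and serve only to illustrate the arithmetic. The boundaries list holds L_f(0), ..., L_f(4), with L_f(0) = 0 assumed.

```python
def build_excitation(v, pulses, pitch_gains, fixed_gains, boundaries):
    """Compute u'(n) = g_p[j] * v(n) + g_c[j] * c_f(n) for n in subframe j.

    v:           adaptive signal component v(n) over the whole frame
    pulses:      pulse sequence samples c_f(n) over the whole frame
    pitch_gains: quantized pitch gain per subframe
    fixed_gains: quantized fixed codebook gain per subframe
    boundaries:  [L_f(0), L_f(1), ..., L_f(J)] for J subframes
    """
    u = [0.0] * boundaries[-1]
    for j in range(len(boundaries) - 1):
        for n in range(boundaries[j], boundaries[j + 1]):
            u[n] = pitch_gains[j] * v[n] + fixed_gains[j] * pulses[n]
    return u


# Hypothetical 8-sample frame split into four 2-sample subframes.
v = [1.0] * 8
pulses = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0]
u = build_excitation(
    v,
    pulses,
    pitch_gains=[0.5, 0.5, 1.0, 1.0],
    fixed_gains=[2.0, 2.0, 2.0, 4.0],
    boundaries=[0, 2, 4, 6, 8],
)
```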

Also, the "excitation parameter", which includes the linear prediction information LPC info, the pitch period codes CT₁, CT₂, CT₃, CT₄, the code indexes C_f1, C_f2, C_f3, C_f4, and the gain codes GA_f1, GA_f2, GA_f3, GA_f4, is input to the parameter encoding unit 119. The parameter encoding unit 119 generates and outputs a bit stream BS, which is the code corresponding to the excitation parameter.

<Decoding>
The bit stream BS output from the parameter encoding unit 119 of the encoding device 11 (FIG. 1) is input as an input code to the parameter decoding unit 129 of the decoding device 12 (FIG. 2). The parameter decoding unit 129 outputs the linear prediction information LPC info obtained by decoding the bit stream BS, the pitch periods T₁′, T₂′, T₃′, T₄′ obtained by decoding the pitch period codes CT₁, CT₂, CT₃, CT₄, the code indexes C_f1, C_f2, C_f3, C_f4, and the decoded pitch gains g_p1^, g_p2^, g_p3^, g_p4^ and decoded fixed codebook gains g_c1^, g_c2^, g_c3^, g_c4^ obtained by decoding the gain codes GA_f1, GA_f2, GA_f3, GA_f4. The parameter decoding unit 129 uses the same codebook (for example, Table 1) as the gain codebook provided in the gain quantization unit 118 of the encoding device 11 to perform variable-length decoding of the gain codes GA_f1, GA_f2, GA_f3, GA_f4, and thereby obtains the decoded pitch gains g_p1^, g_p2^, g_p3^, g_p4^ and the decoded fixed codebook gains g_c1^, g_c2^, g_c3^, g_c4^.
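Variable-length decoding of the gain codes with a codebook shared by the encoder and decoder can be sketched as follows. This is a minimal sketch under stated assumptions, not the behavior of the parameter decoding unit 129: the codebook contents are hypothetical (Table 1 is not reproduced here), the indexes are assumed to form a prefix-free set, and the gain codes are assumed to be concatenated without other parameters between them.

```python
def decode_gain_codes(bits, codebook, count):
    """Decode `count` variable-length gain codes from a string of '0'/'1'.

    codebook: dict mapping prefix-free index bit strings to
              (decoded_pitch_gain, decoded_fixed_codebook_gain).
    Returns the list of decoded gain pairs and the number of bits consumed.
    """
    gains, pos = [], 0
    for _ in range(count):
        end = pos + 1
        # Extend the candidate code one bit at a time until it matches an index.
        while bits[pos:end] not in codebook:
            end += 1
            if end > len(bits):
                raise ValueError("bit stream ended inside a gain code")
        gains.append(codebook[bits[pos:end]])
        pos = end
    return gains, pos


# Hypothetical codebook, identical on the encoder and decoder side.
CODEBOOK = {
    "0": (0.9, 0.9),
    "10": (0.5, 0.9),
    "110": (0.9, 0.5),
    "111": (0.92, 0.92),
}

# Four subframes coded as "0", "111", "10", "0": 7 bits instead of 4 x 2 bits.
decoded, used = decode_gain_codes("0111100", CODEBOOK, count=4)
```

Because the indexes are prefix-free, the decoder can find each code boundary without any length field.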

The fixed codebook 123 decodes the input code indexes C_f1, C_f2, C_f3, C_f4 under the control of the fixed codebook selection unit 125, and obtains and outputs the pulse sequences c_f1, c_f2, c_f3, c_f4. The adaptive codebook 122 outputs the adaptive signal components v′(n) (n = 0, ..., L−1) specified by the input pitch periods T₁′, T₂′, T₃′, T₄′.

A sample sequence obtained by multiplying the pulse sequences c_f1, c_f2, c_f3, c_f4 by the decoded fixed codebook gains g_c1^, g_c2^, g_c3^, g_c4^ and a sample sequence obtained by multiplying the adaptive signal component v′(n) (n = 0, ..., L−1) by the decoded pitch gains g_p1^, g_p2^, g_p3^, g_p4^ are added for each corresponding sample, and the resulting excitation signal u′(n) (n = 0, ..., L−1) is added to the adaptive codebook 122 as follows.
u′(n) = g_p1^ × v′(n) + g_c1^ × c_f1(n) (n = L_f(0), ..., L_f(1) − 1)
u′(n) = g_p2^ × v′(n) + g_c2^ × c_f2(n) (n = L_f(1), ..., L_f(2) − 1)
u′(n) = g_p3^ × v′(n) + g_c3^ × c_f3(n) (n = L_f(2), ..., L_f(3) − 1)
u′(n) = g_p4^ × v′(n) + g_c4^ × c_f4(n) (n = L_f(3), ..., L_f(4) − 1)
Further, an all-pole synthesis filter 127 specified by the linear prediction information LPC info is applied to the excitation signal u′(n) (n = 0, ..., L−1), and the synthesized signal x′(n) (n = 0, ..., L−1) generated thereby is output.
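Applying an all-pole synthesis filter to the excitation signal can be sketched as the recursion below. The sign convention (filter 1/A(z) with A(z) = 1 + a₁z⁻¹ + ... + a_P z⁻ᴾ), the zero initial state, the coefficient values, and the function name are assumptions for illustration. The text does not specify how the synthesis filter 127 is parameterized beyond being defined by the linear prediction information.

```python
def all_pole_synthesis(u, a):
    """Run x'(n) = u'(n) - sum_{k=1..P} a[k-1] * x'(n-k) with zero initial state.

    u: excitation samples u'(n)
    a: coefficients a_1..a_P of A(z) = 1 + sum_k a_k z^-k (assumed convention)
    """
    x = []
    for n, un in enumerate(u):
        acc = un
        for k, ak in enumerate(a, start=1):
            if n - k >= 0:          # samples before the frame are taken as zero
                acc -= ak * x[n - k]
        x.append(acc)
    return x


# Hypothetical first-order filter: a unit pulse decays by one half per sample.
x_synth = all_pole_synthesis([1.0, 0.0, 0.0, 0.0], a=[-0.5])
```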

<Other variations, etc.>
The present invention is not limited to the above-described embodiment. For example, the above-described embodiment shows an example in which the present invention is applied to gain encoding in so-called CELP encoding, in which a linear prediction residual signal is encoded using a fixed codebook, an adaptive codebook, and a gain codebook. However, the present invention can be applied to any encoding that obtains, by some method, a waveform information code, which is a code corresponding to a sample sequence, and that obtains a variable-length gain code of an input signal from the input signal and a waveform sample sequence obtained by multiplying the sample sequence corresponding to the waveform information code by a quantized gain (hereinafter referred to as "quantized gain"). This holds even if the input acoustic signal itself, rather than the linear prediction residual signal, is the encoding target; even if a time-series signal other than an acoustic signal is the encoding target; even if either the fixed codebook or the adaptive codebook is absent, or two or more of either are present; and even if another method of encoding sample sequences is adopted instead of encoding using the fixed codebook or the adaptive codebook. The input signal is, for example, a time-series signal. Examples of input signals are acoustic signals, video signals, biological signals, seismic wave signals, sensor array signals, and the like. The input signal may be a time-domain signal or a frequency-domain signal. That is, it suffices that the encoding device stores a gain codebook in which a plurality of sets of quantized gain candidates or their function values and indexes are stored, that the plurality of indexes include indexes with different numbers of bits, and that the encoding device obtains, for each time or frequency interval, the index with the smallest index value D_U as the gain code.

The waveform information code is a code from which a sample sequence can be specified by decoding it. Examples are the code index and the pitch code of the above-described embodiment, and codes that replace them, for example, a code representing sampled and quantized PCM-format samples. The index value D_U is a value that increases as the distortion D increases, and that increases as the number of bits of the index corresponding to the quantized gain candidate used to obtain the waveform sample sequence increases. The distortion D is the distortion between the input signal and the waveform sample sequence obtained by multiplying the sample sequence corresponding to the waveform information code by the quantized gain candidate. Alternatively, the distortion D is the distortion between the input signal and the total waveform sample sequence obtained by adding, for each corresponding sample, the first to Γ-th waveform sample sequences (Γ waveform sample sequences, where Γ is an integer of 2 or more), the γ-th waveform sample sequence being obtained by multiplying each sample of the sample sequence corresponding to the γ-th (γ is an integer not less than 1 and not more than Γ) waveform information code by the γ-th quantized gain candidate. The sample sequence corresponding to the waveform information code is, for example, the sample sequence itself obtained by decoding the waveform information code, or a sample sequence obtained by passing that decoded sample sequence through a synthesis filter. For example, the index value D_U is a value obtained by adding or multiplying the distortion D and a coefficient that increases as the number of bits of the index increases. An example of such a coefficient is a power value whose exponent increases as the number of bits of the index increases. Specific examples of the index value D_U are formulas (4), (5), (7), (10), and (11) described above.
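The relation between a power-type coefficient and the linear forms containing (2 log 2)b/N can be illustrated with a first-order expansion. The specific power form 2^{2b/N} below is an assumption made for illustration, since the formulas (4) to (7) themselves are not reproduced in this passage, and log is read as the natural logarithm.

```latex
% Assumed power-type index value and its first-order approximation
D_U \;=\; D \cdot 2^{2b/N}
    \;=\; D \exp\!\left(\frac{(2\ln 2)\,b}{N}\right)
    \;\approx\; D\left\{1 + \frac{(2\ln 2)\,b}{N}\right\},
\qquad \frac{(2\ln 2)\,b}{N} \ll 1 .
```

Under this assumption, writing γ for (2 ln 2)/N gives the shorter form D(1 + γb), in which the penalty grows linearly with the number of bits b.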

The encoding device 21 illustrated in FIG. 3 includes Θ waveform information codebooks 211-1, ..., 211-Θ and an encoding unit 212, where Θ is an integer of 1 or more. The encoding unit 212 includes a gain quantization unit 218.

The gain quantization unit 218 stores a plurality of sets of quantized gain candidates α_θ or their function values and indexes, where θ = 1, ..., Θ. That is, the gain quantization unit 218 stores a plurality of sets each consisting of the quantized gain candidate α₁ or its function value, the quantized gain candidate α₂ or its function value, ..., the quantized gain candidate α_Θ or its function value, and an index. Specifically, when Θ = 1, the gain quantization unit 218 stores a plurality of sets of a quantized gain candidate α₁ or its function value and an index. When Θ = Γ ≥ 2, the gain quantization unit 218 stores a plurality of sets of the first to Γ-th quantized gain candidates α_γ or their function values and an index. The plurality of indexes include indexes with different numbers of bits.

The encoding unit 212 obtains the waveform information codes E₁, ..., E_Θ for each time interval, obtains the index with the smallest index value D_U as the gain code, and outputs a bit stream corresponding to the waveform information codes E₁, ..., E_Θ and the gain code. The index is selected by the gain quantization unit 218. As described above, the index value D_U is a value that increases as the distortion D increases, and that increases as the number of bits of the index corresponding to the quantized gain candidates α_θ (θ = 1, ..., Θ) increases. The distortion D in the example of FIG. 3 is the distortion between the input signal X and the sum α₁Y₁ + ... + α_ΘY_Θ of the sample sequences α_θY_θ, each obtained by multiplying each sample of the sample sequence Y_θ corresponding to the waveform information code E_θ by the quantized gain candidate α_θ. Examples of the sample sequence Y_θ corresponding to the waveform information code E_θ are a sample sequence obtained by passing the sample sequence from the waveform information codebook 211-θ for the waveform information code E_θ through a synthesis filter corresponding to the input signal X, and the sample sequence itself from the waveform information codebook 211-θ for the waveform information code E_θ.
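The generalized selection with Θ sample sequences can be sketched as follows. This is an illustration under assumptions rather than the configuration of the gain quantization unit 218: the squared-error distortion, the multiplicative combination D_U = D × penalty(b), the power-type penalty 2^b, the toy codebook, and all names are hypothetical.

```python
def total_distortion(x, sequences, gains):
    """Squared error between x and sum over theta of gains[theta] * sequences[theta]."""
    total = 0.0
    for n, xn in enumerate(x):
        synth = sum(g * seq[n] for g, seq in zip(gains, sequences))
        total += (xn - synth) ** 2
    return total


def select_index(x, sequences, codebook, penalty):
    """Return the index minimizing D_U = D * penalty(b), b = number of index bits.

    codebook: list of (index_bits, gains), with one gain per sample sequence.
    penalty:  function of b that increases with b.
    """
    best = min(
        codebook,
        key=lambda entry: total_distortion(x, sequences, entry[1]) * penalty(len(entry[0])),
    )
    return best[0]


# Hypothetical example with Theta = 2 sample sequences.
X = [1.0, 1.0]
Y = [[1.0, 0.0], [0.0, 1.0]]
CODEBOOK = [("0", (0.9, 0.9)), ("10", (0.5, 0.5)), ("11", (0.92, 0.92))]

ignoring_bits = select_index(X, Y, CODEBOOK, penalty=lambda b: 1.0)
with_bits = select_index(X, Y, CODEBOOK, penalty=lambda b: 2.0 ** b)
```

Passing the penalty as a function keeps the sketch independent of which particular formula for the index value is used.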

The encoding device 21 illustrated in FIG. 3 encompasses the encoding device of the first embodiment, that is, the encoding device 11 (FIG. 1). Specifically, the encoding device 21 of FIG. 3 configured with Θ = 2, with the waveform information codebook 211-1 being the fixed codebook 113, the waveform information codebook 211-2 being the adaptive codebook 112, and the gain quantization unit 218 being the gain quantization unit 118, is the encoding device 11 of the first embodiment itself. In this case, the quantized gain candidate α₁ in the gain quantization unit 218 is the quantized fixed codebook gain candidate β in the gain quantization unit 118, the quantized gain candidate α₂ in the gain quantization unit 218 is the quantized pitch gain candidate α in the gain quantization unit 118, the waveform information code E₁ is the code index of the first embodiment, the waveform information code E₂ is the pitch code of the first embodiment, and the input signal X is the input acoustic signal of the first embodiment.

The decoding device corresponding to the encoding device 21 stores a plurality of sets of quantized gain candidates α_θ (θ = 1, ..., Θ) or their function values and indexes. These sets are the same as the sets stored in the gain quantization unit 218. Using the waveform information codes E₁, ..., E_Θ and the gain code contained in the input bit stream, the decoding device, for each time interval, multiplies each sample of the sample sequence Y_θ corresponding to the waveform information code E_θ by the quantized gain candidate α_θ represented by the gain code to obtain the sample sequences α_θY_θ, and outputs their sum α₁Y₁ + ... + α_ΘY_Θ.

In the above embodiments, the index with the smallest index value D_U is obtained as the gain code for each subframe. However, the index with the smallest index value D_U may instead be obtained as the gain code for each frame, or for each group of multiple subframes or frames.

In addition, the various processes described above are not only executed in time series according to the description, but may be executed in parallel or individually as required by the processing capability of the apparatus that executes the processes. Needless to say, other modifications are possible without departing from the spirit of the present invention.

In addition, when the above configuration is realized by a computer, the processing contents of the functions that each device should have are described by a program. The processing functions are realized on the computer by executing the program on the computer.

The program describing the processing contents can be recorded on a computer-readable recording medium. An example of a computer-readable recording medium is a non-transitory recording medium. Examples of such a recording medium are a magnetic recording device, an optical disk, a magneto-optical recording medium, a semiconductor memory, and the like.

This program is distributed, for example, by selling, transferring, or lending a portable recording medium such as a DVD or CD-ROM in which the program is recorded. Furthermore, the program may be distributed by storing the program in a storage device of the server computer and transferring the program from the server computer to another computer via a network.

For example, a computer that executes such a program first stores the program recorded on a portable recording medium, or the program transferred from a server computer, in its own storage device. When executing the processing, this computer reads the program stored in its own storage device and executes the processing according to the read program. As another execution form of the program, the computer may read the program directly from the portable recording medium and execute processing according to the program, or, each time the program is transferred from the server computer to the computer, the computer may sequentially execute processing according to the received program.

In the above embodiment, the present apparatus is configured by executing a predetermined program on a computer. However, at least a part of these processing contents may be realized by hardware.

11, 21 Coding device 12 Decoding device

Claims

An encoding device comprising a gain quantization unit that stores a gain codebook in which a plurality of sets of quantized gain candidates or their function values and indexes are stored, the plurality of indexes including indexes with different numbers of bits, and that obtains, as a gain code, for each time or frequency interval, the index with the smallest index value D_U, the index value D_U being a value that increases as the distortion D between an input signal and a waveform sample sequence, obtained by multiplying each sample of a sample sequence corresponding to a waveform information code by the quantized gain candidate, increases, and that increases as the number of bits of the index corresponding to the quantized gain candidate used to obtain the waveform sample sequence increases.
An encoding device comprising a gain quantization unit that stores a gain codebook in which a plurality of sets of first to Γ-th (Γ is an integer of 2 or more) quantized gain candidates or their function values and an index are stored, the plurality of indexes including indexes with different numbers of bits, and that obtains, as a gain code, for each time or frequency interval, the index with the smallest index value D_U, the index value D_U being a value that increases as the distortion D between an input signal and a total waveform sample sequence increases, and that increases as the number of bits of the index corresponding to the quantized gain candidates used to obtain the total waveform sample sequence increases, the total waveform sample sequence being obtained by adding, for each corresponding sample, the first to Γ-th waveform sample sequences, where the γ-th (γ is an integer between 1 and Γ) waveform sample sequence is obtained by multiplying each sample of the sample sequence corresponding to the γ-th waveform information code by the γ-th quantized gain candidate.
The encoding device according to claim 1 or 2, wherein the sample sequence corresponding to the waveform information code is obtained by passing a sample sequence obtained by decoding the waveform information code through a synthesis filter.
An encoding device that obtains, for each predetermined time interval, a code index for specifying a sample sequence from a fixed codebook, a pitch code for specifying a pitch period, and a gain code corresponding to a quantized fixed codebook gain and a quantized pitch gain, each corresponding to an input acoustic signal,
the encoding device comprising a gain quantization unit that stores a gain codebook in which a plurality of sets of a quantized fixed codebook gain candidate or its function value, a quantized pitch gain candidate or its function value, and an index are stored, the plurality of indexes including indexes with different numbers of bits, and that obtains, as the gain code, for each time interval, the index with the smallest index value D_U, the index value D_U being obtained by adding or multiplying the distortion D between the input acoustic signal X and a composite signal sample sequence αY + βZ and a coefficient that increases as the number of bits of the index increases, the composite signal sample sequence αY + βZ being obtained by adding, for each corresponding sample, a sample sequence βZ obtained by multiplying each sample of a sample sequence Z, obtained by passing the sample sequence from the fixed codebook through a synthesis filter, by the quantized fixed codebook gain candidate β, and a sample sequence αY obtained by multiplying each sample of a sample sequence Y, obtained by passing a past excitation signal through the synthesis filter, by the quantized pitch gain candidate α.
A device that encodes an input acoustic signal every predetermined time interval,
A linear prediction analysis unit that obtains, for each time interval, linear prediction information that is a code for specifying a linear prediction coefficient corresponding to the input acoustic signal or a coefficient compatible therewith;
A fixed codebook search unit for obtaining a code index for identifying a sample sequence corresponding to the input acoustic signal among a plurality of sample sequences included in the fixed codebook for each time interval;
An adaptive codebook search unit for obtaining a pitch code for specifying a pitch period corresponding to the input acoustic signal for each time interval;
A gain quantization unit for obtaining a gain code corresponding to a quantized fixed codebook gain and a quantized pitch gain for each time interval;
An adaptive codebook that stores an excitation signal obtained by adding, for each corresponding sample, a sample sequence obtained by multiplying each sample of the sample sequence from the fixed codebook corresponding to the code index by the quantized fixed codebook gain, and a sample sequence obtained by multiplying each sample of a past excitation signal delayed by the number of samples corresponding to the pitch period by the quantized pitch gain,
The gain quantization unit
stores a gain codebook in which a plurality of sets of a quantized fixed codebook gain candidate or its function value, a quantized pitch gain candidate or its function value, and an index are stored,
the plurality of indexes including indexes with different numbers of bits,
and obtains, as the gain code, for each time interval, the index with the smallest index value D_U, the index value D_U being obtained by adding or multiplying the distortion D between the input acoustic signal and a composite signal sample sequence αY + βZ and a coefficient that increases as the number of bits of the index increases, the composite signal sample sequence αY + βZ being obtained by adding, for each corresponding sample, a sample sequence βZ obtained by multiplying each sample of a sample sequence Z, obtained by passing the sample sequence from the fixed codebook corresponding to the code index through a synthesis filter using the linear prediction coefficient or a coefficient compatible therewith, by the quantized fixed codebook gain candidate β, and a sample sequence αY obtained by multiplying each sample of a sample sequence Y, obtained by passing the past excitation signal delayed by the number of samples corresponding to the pitch period through the synthesis filter, by the quantized pitch gain candidate α.
The encoding device according to claim 1, wherein the index value D_U is a value obtained by D_U = D{1 + γb}, where b is the number of bits of the index, N is the number of samples of the input acoustic signal in the time interval, and γ is a predetermined positive constant.
The encoding device according to any one of claims 1 to 5, wherein the index value D_U is a value obtained by D_U = D{1 + (2 log 2)b/N}, where b is the number of bits of the index and N is the number of samples of the input acoustic signal in the time interval.
The encoding device according to claim 1, wherein the index value D_U is a value obtained by D_U = D{1 + (2 log 2)(b − B)/N}, where b is the number of bits of the index, B is the number of bits of the code needed to encode, with a uniform-length code, all the sets of quantized pitch gain candidates or their function values and quantized fixed codebook gain candidates or their function values stored in the gain codebook, and N is the number of samples of the input acoustic signal in the time interval.
The encoding device according to claim 1, wherein the index value D_U is a value obtained by D_U = D(1 + wb), where b is the number of bits of the index,
w in the current time interval being set to a value smaller than a predetermined positive value w₀ when the number of bits of the gain code obtained in a time interval preceding by a predetermined time is smaller than the average number of bits of the indexes stored in the gain codebook,
and w in the current time interval being set to a value larger than the predetermined positive value w₀ when the number of bits of the gain code obtained in the time interval preceding by the predetermined time is larger than the average number of bits of the indexes stored in the gain codebook.
The encoding device according to claim 9, wherein the gain quantization unit includes a plurality of gain codebooks and obtains the gain code using a gain codebook predetermined for each value of w.
The encoding device according to claim 1, wherein the index value D_U is a value obtained by D_U = D{1 + v(2 log 2)b/N}, where b is the number of bits of the index and N is the number of samples of the input acoustic signal in the time interval,
v in the current time interval being set to a value smaller than 1 when the number of bits of the gain code obtained in a time interval preceding by a predetermined time is smaller than the average number of bits of the indexes stored in the gain codebook,
and v in the current time interval being set to a value greater than 1 when the number of bits of the gain code obtained in the time interval preceding by the predetermined time is larger than the average number of bits of the indexes stored in the gain codebook.
The encoding device according to claim 11, wherein the gain quantization unit includes a plurality of gain codebooks and obtains the gain code using a gain codebook predetermined for each value of v.
An encoding method comprising a gain quantization step in which a gain codebook storing a plurality of sets of quantized gain candidates or their function values and indexes is used, the plurality of indexes including indexes with different numbers of bits, and in which, for each time or frequency interval, the index with the smallest index value D_U is obtained as a gain code, the index value D_U being a value that increases as the distortion D between an input signal and a waveform sample sequence, obtained by multiplying each sample of a sample sequence corresponding to a waveform information code by the quantized gain candidate, increases, and that increases as the number of bits of the index corresponding to the quantized gain candidate used to obtain the waveform sample sequence increases.
An encoding method in which a gain codebook is stored, the gain codebook storing a plurality of sets each consisting of first to Γ-th (Γ is an integer of 2 or more) quantized gain candidates or function values thereof and an index, the plurality of indexes including indexes having different numbers of bits, the encoding method comprising a gain quantization step of obtaining as a gain code, for each time or frequency interval, the index for which an index value $D_U$ is smallest, wherein the index value $D_U$ becomes larger as a distortion D becomes larger, the distortion D being the distortion between an input signal and a total waveform sample sequence, and becomes larger as the number of bits of the index corresponding to the quantized gain candidates used to obtain the total waveform sample sequence becomes larger, the total waveform sample sequence being obtained by adding, for each corresponding sample, the first to Γ-th waveform sample sequences, and the γ-th (γ is each integer from 1 to Γ) waveform sample sequence being obtained by multiplying each sample of the sample sequence corresponding to the γ-th waveform information code by the γ-th quantized gain candidate.
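By way of a non-limiting illustration, the selection recited in the two preceding method claims can be sketched as follows. The sketch assumes that the distortion D is a squared error, that the index value takes the multiplicative form D(1 + γb), and that indexes are held as bit strings; the names `total_waveform`, `squared_error`, `select_gain_code` and `penalty` are hypothetical and do not appear in the claims.

```python
def total_waveform(gains, sequences):
    """Add, sample by sample, each gamma-th sequence scaled by its gamma-th gain."""
    n = len(sequences[0])
    return [sum(g * seq[i] for g, seq in zip(gains, sequences)) for i in range(n)]


def squared_error(x, y):
    """Distortion D, assumed here to be the sum of squared sample differences."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def select_gain_code(codebook, sequences, target, penalty):
    """Return the index that minimizes D_U = D * (1 + penalty * b).

    codebook: list of (index_bit_string, gains) pairs, where the bit strings
    may have different lengths b.
    """
    best_index, best_cost = None, float("inf")
    for index, gains in codebook:
        d = squared_error(target, total_waveform(gains, sequences))
        d_u = d * (1.0 + penalty * len(index))  # assumed multiplicative form
        if d_u < best_cost:
            best_index, best_cost = index, d_u
    return best_index


# Two waveform sequences (Gamma = 2) and a target that no candidate matches exactly.
sequences = [[1, 0, 1, 0], [0, 1, 0, 1]]
target = [1.0, 0.45, 1.0, 0.45]
codebook = [("0", (1.0, 0.41)), ("110", (1.0, 0.48))]
print(select_gain_code(codebook, sequences, target, penalty=0.0))
print(select_gain_code(codebook, sequences, target, penalty=1.0))
```

With no penalty the candidate with the lower distortion is chosen even though its index is longer; with a sufficiently large penalty the shorter index is chosen instead, which is the trade-off between distortion and code length that the index value expresses.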
The encoding method according to claim 13 or 14, wherein the sample sequence corresponding to the waveform information code is obtained by passing a sample sequence obtained by decoding the waveform information code through a synthesis filter.
An encoding method for obtaining, for each predetermined time interval and in correspondence with an input acoustic signal, a code index specifying a sample sequence from a fixed codebook, a pitch code specifying a pitch period, and a gain code corresponding to a quantized fixed codebook gain and a quantized pitch gain, wherein
a gain codebook is stored, the gain codebook storing a plurality of sets each consisting of a quantized fixed codebook gain candidate or a function value thereof, a quantized pitch gain candidate or a function value thereof, and an index, the plurality of indexes including indexes having different numbers of bits, the encoding method comprising a gain quantization step of obtaining as the gain code, for each time interval, the index for which an index value $D_U$ is smallest, the index value $D_U$ being obtained by adding to, or multiplying by, a distortion D a coefficient that becomes larger as the number of bits of the index becomes larger, wherein the distortion D is the distortion between the input acoustic signal X and a composite signal sample sequence αY + βZ obtained by adding, for each corresponding sample, a sample sequence βZ and a sample sequence αY, the sample sequence βZ being obtained by multiplying each sample of a sample sequence Z, obtained by passing the sample sequence from the fixed codebook through a synthesis filter, by the quantized fixed codebook gain candidate β, and the sample sequence αY being obtained by multiplying each sample of a sample sequence Y, obtained by passing a past excitation signal through the synthesis filter, by the quantized pitch gain candidate α.
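As a minimal sketch of the gain quantization step recited above, and not a definitive implementation, the following assumes an all-pole synthesis filter with zero initial state, a squared-error distortion, and a caller-supplied coefficient function that grows with the index length. The names `synthesis_filter`, `search_gain_code`, `coeff` and `mode` are hypothetical.

```python
def synthesis_filter(excitation, lpc):
    """All-pole filter 1/A(z) with A(z) = 1 + sum_k lpc[k-1] * z^-k (zero initial state)."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc -= a * out[n - k]
        out.append(acc)
    return out


def search_gain_code(codebook, x, y, z, coeff, mode="multiply"):
    """Return the index whose D_U is smallest.

    codebook: list of (index_bit_string, alpha, beta) sets.
    x: input acoustic signal X; y, z: filtered sequences Y and Z.
    coeff: function of the index length b that grows with b (an assumption).
    mode: "multiply" gives D_U = D * coeff(b); "add" gives D_U = D + coeff(b).
    """
    best_index, best_cost = None, float("inf")
    for index, alpha, beta in codebook:
        d = sum((xi - (alpha * yi + beta * zi)) ** 2 for xi, yi, zi in zip(x, y, z))
        c = coeff(len(index))
        d_u = d * c if mode == "multiply" else d + c
        if d_u < best_cost:
            best_index, best_cost = index, d_u
    return best_index


lpc = [-0.5]                                   # hypothetical first-order predictor
y = synthesis_filter([1, 0, 0, 0], lpc)        # past excitation through the filter
z = synthesis_filter([0, 1, 0, 0], lpc)        # fixed codebook sequence through the filter
x = [0.8 * yi + 0.3 * zi for yi, zi in zip(y, z)]
codebook = [("0", 0.7, 0.3), ("10", 0.8, 0.3)]
print(search_gain_code(codebook, x, y, z, lambda b: 1.0 + 0.1 * b))
print(search_gain_code(codebook, x, y, z, lambda b: 0.05 * b, mode="add"))
```

In the multiplicative form a candidate with zero distortion is always selected regardless of its length, whereas the additive form can still prefer a shorter index; which form is appropriate is a design choice left open by the claim.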
A method of encoding an input acoustic signal every predetermined time interval,
A linear prediction step for obtaining, for each time interval, linear prediction information, which is a code specifying a linear prediction coefficient corresponding to the input acoustic signal or a coefficient compatible therewith;
A fixed codebook search step for obtaining a code index for identifying a sample sequence corresponding to the input acoustic signal among a plurality of sample sequences included in the fixed codebook for each time interval;
An adaptive codebook search step for obtaining a pitch code specifying a pitch period corresponding to the input acoustic signal for each time interval;
A gain quantization step for obtaining a gain code corresponding to a quantized fixed codebook gain and a quantized pitch gain for each time interval;
Storing in the adaptive codebook an excitation signal obtained by adding, for each corresponding sample, a sample sequence obtained by multiplying each sample of the sample sequence from the fixed codebook corresponding to the code index by the quantized fixed codebook gain, and a sample sequence obtained by multiplying each sample of a sample sequence of past excitation signals for the number of samples corresponding to the pitch period by the quantized pitch gain;
A gain codebook is stored in which a plurality of sets are stored, each consisting of a quantized fixed codebook gain candidate or a function value thereof, a quantized pitch gain candidate or a function value thereof, and an index;
The plurality of indexes include indexes having different numbers of bits; and
The gain quantization step is a step of:
Obtaining as the gain code, for each time interval, the index for which an index value $D_U$ is smallest, the index value $D_U$ being obtained by adding to, or multiplying by, a distortion D a coefficient that becomes larger as the number of bits of the index becomes larger, wherein the distortion D is the distortion between the input acoustic signal and a composite signal sample sequence αY + βZ obtained by adding, for each corresponding sample, a sample sequence βZ and a sample sequence αY, the sample sequence βZ being obtained by multiplying each sample of a sample sequence Z by the quantized fixed codebook gain candidate β, the sample sequence Z being obtained by passing the sample sequence from the fixed codebook corresponding to the code index through a synthesis filter using the linear prediction coefficient or a coefficient compatible therewith, and the sample sequence αY being obtained by multiplying each sample of a sample sequence Y by the quantized pitch gain candidate α, the sample sequence Y being obtained by passing past excitation signals for the number of samples corresponding to the pitch period through the synthesis filter.
The encoding method according to claim 13, wherein the index value $D_U$ is a value obtained by $D_U = D\{1 + \gamma b\}$, where b is the number of bits of the index, N is the number of samples of the input acoustic signal in the time interval, and γ is a predetermined positive constant.
The encoding method according to claim 13, wherein the index value $D_U$ is a value obtained by $D_U = D\{1 + (2\log 2)\,b/N\}$, where b is the number of bits of the index and N is the number of samples of the input acoustic signal in the time interval.
The encoding method according to claim 13, wherein the index value $D_U$ is a value obtained by $D_U = D\{1 + (2\log 2)(b - B)/N\}$, where b is the number of bits of the index, B is the number of bits of the code necessary for uniform-length coding of all the sets of a quantized pitch gain candidate or a function value thereof and a quantized fixed codebook gain candidate or a function value thereof stored in the gain codebook, and N is the number of samples of the input acoustic signal in the time interval.
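For illustration only, the three index values recited in the preceding three claims can be written as small functions. The sketch assumes that "log" denotes the natural logarithm and that B equals the smallest integer not less than the base-2 logarithm of the number of sets in the gain codebook; the function names are hypothetical.

```python
import math


def index_value_constant(d, b, gamma):
    """D_U = D * (1 + gamma * b), with gamma a predetermined positive constant."""
    return d * (1.0 + gamma * b)


def index_value_log(d, b, n):
    """D_U = D * (1 + (2 log 2) * b / N), log taken as natural log (an assumption)."""
    return d * (1.0 + 2.0 * math.log(2.0) * b / n)


def index_value_relative(d, b, n, uniform_bits):
    """D_U = D * (1 + (2 log 2) * (b - B) / N), with B the uniform-length code size."""
    return d * (1.0 + 2.0 * math.log(2.0) * (b - uniform_bits) / n)


def uniform_code_bits(num_sets):
    """Bits needed to give every set in the codebook a code of the same length."""
    return (num_sets - 1).bit_length()


# A 4-bit index in a 40-sample interval, compared against an 8-set codebook (B = 3).
B = uniform_code_bits(8)
print(index_value_log(1.0, 4, 40))
print(index_value_relative(1.0, 4, 40, B))
print(index_value_relative(1.0, 3, 40, B))
```

In the relative form an index exactly as long as the uniform-length code leaves the distortion unchanged, shorter indexes are rewarded, and longer indexes are penalized.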
The encoding method according to claim 13, wherein the index value $D_U$ is a value obtained by $D_U = D(1 + wb)$, where b is the number of bits of the index, and wherein
w in the current time interval is set to a value smaller than a predetermined positive value $w_0$ when the number of bits of the gain codes obtained in the past time intervals covering a predetermined time is smaller than the average number of bits of the indexes stored in the gain codebook, and
w in the current time interval is set to a value greater than the predetermined positive value $w_0$ when that number of bits is larger than the average number of bits of the indexes stored in the gain codebook.
The encoding method according to claim 21, wherein there are a plurality of gain codebooks, and the gain quantization step is a step of obtaining the gain code using the gain codebook predetermined for each value of w.
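The adaptation of w recited above can be sketched as follows. The sketch assumes that the comparison is made between the mean number of bits per gain code over the past intervals and the mean index length in the gain codebook, and that w moves away from $w_0$ by a fixed fraction; the claim only requires a value below or above $w_0$, so `step` and the function names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def adapt_w(past_code_bits, codebook_indexes, w0, step=0.5):
    """Return w for the current interval.

    past_code_bits: bit counts of the gain codes from the past intervals.
    codebook_indexes: the index bit strings stored in the gain codebook.
    step: hypothetical fraction by which w departs from w0.
    """
    past = mean(past_code_bits)
    average = mean([len(index) for index in codebook_indexes])
    if past < average:
        return w0 * (1.0 - step)   # few bits spent so far: weaken the length penalty
    if past > average:
        return w0 * (1.0 + step)   # many bits spent so far: strengthen the length penalty
    return w0


def index_value(d, b, w):
    """D_U = D * (1 + w * b)."""
    return d * (1.0 + w * b)


indexes = ["0", "10", "110", "111"]            # mean index length 2.25 bits
print(adapt_w([1, 1, 2], indexes, w0=0.1))
print(adapt_w([3, 3, 3], indexes, w0=0.1))
```

Lowering w after a run of short codes lets longer, lower-distortion indexes win the next search, and raising it after a run of long codes pulls the bit consumption back toward the average.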
The encoding method according to claim 13, wherein the index value $D_U$ is a value obtained by $D_U = D\{1 + v(2\log 2)\,b/N\}$, where b is the number of bits of the index and N is the number of samples of the input acoustic signal in the time interval, and wherein
v in the current time interval is set to a value smaller than 1 when the number of bits of the gain codes obtained in the past time intervals covering a predetermined time is smaller than the average number of bits of the indexes stored in the gain codebook, and
v in the current time interval is set to a value greater than 1 when that number of bits is larger than the average number of bits of the indexes stored in the gain codebook.
The encoding method according to claim 23, wherein there are a plurality of gain codebooks, and the gain quantization step is a step of obtaining the gain code using the gain codebook predetermined for each value of v.
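One possible reading of the use of a gain codebook predetermined for each value of v is sketched below. It assumes that v takes one of three hypothetical values (0.5, 1.0, 2.0), that each value is mapped to its own codebook, that a single gain scales a single sample sequence, and that "log" is the natural logarithm; none of these specifics are stated in the claims.

```python
import math


def choose_v(mean_past_bits, mean_index_bits, low=0.5, high=2.0):
    """Pick v below 1 after short past codes and above 1 after long past codes."""
    if mean_past_bits < mean_index_bits:
        return low
    if mean_past_bits > mean_index_bits:
        return high
    return 1.0


def quantize_gain(codebooks, v, x, s):
    """Search the codebook predetermined for v; minimize D{1 + v(2 log 2) b / N}.

    codebooks: dict mapping each value of v to a list of (index_bit_string, gain).
    x: input signal; s: sample sequence to be scaled by the gain candidate.
    """
    n = len(x)
    best_index, best_cost = None, float("inf")
    for index, gain in codebooks[v]:
        d = sum((xi - gain * si) ** 2 for xi, si in zip(x, s))
        d_u = d * (1.0 + v * 2.0 * math.log(2.0) * len(index) / n)
        if d_u < best_cost:
            best_index, best_cost = index, d_u
    return best_index


# Hypothetical codebooks: uniform-length indexes for small v, variable-length otherwise.
codebooks = {
    0.5: [("00", 0.5), ("01", 1.0), ("10", 1.5), ("11", 2.0)],
    1.0: [("0", 1.0), ("10", 1.5), ("110", 0.5), ("111", 2.0)],
    2.0: [("0", 1.0), ("10", 0.5), ("110", 1.5), ("111", 2.0)],
}
s = [1.0, 1.0, 1.0, 1.0]
x = [1.4, 1.4, 1.4, 1.4]
v = choose_v(mean_past_bits=3.0, mean_index_bits=2.25)
print(v, quantize_gain(codebooks, v, x, s))
```

Because the encoder and decoder can both derive v from the gain codes already transmitted, selecting the codebook by v in this way would need no extra side information, provided both sides apply the same rule.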
A program for causing a computer to function as the encoding device according to claim 1, 4 or 5.
A computer-readable recording medium storing a program for causing a computer to function as the encoding device according to claim 1, 4 or 5.