WO2002043052A1 - Method, device and program for coding and decoding acoustic parameter, and method, device and program for coding and decoding sound - Google Patents
- Publication number
- WO2002043052A1 (PCT/JP2001/010332)
- Authority
- WO
- WIPO (PCT)
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/06—Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients
- G10L19/07—Line spectrum pair [LSP] vocoders
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/012—Comfort noise or silence coding
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/08—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
- G10L19/12—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/06—Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L2019/0001—Codebooks
- G10L2019/0004—Design or structure of the codebook
- G10L2019/0005—Multi-stage vector quantisation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L2019/0001—Codebooks
- G10L2019/0007—Codebook element generation
Definitions
- The present invention relates to a method and apparatus for low-bit-rate encoding and decoding of audio signals, such as speech or music signals, that are encoded and transmitted in mobile communication systems, the Internet, and the like; to a method and apparatus for encoding and decoding the acoustic parameters applied to them; and to a program for executing these methods on a computer.
- A speech coding apparatus uses a model suited to representing speech signals so that high-quality speech can be represented even at a low bit rate.
- One such method is CELP (Code Excited Linear Prediction), described in M. R. Schroeder and B. S. Atal, "Code-Excited Linear Prediction (CELP): High-quality Speech at Very Low Bit Rates", Proc. ICASSP-85, 25.1.1, pp. 937-940, 1985.
- The CELP-type speech coding method is based on a speech synthesis model corresponding to the human vocal mechanism: a filter defined by linear prediction coefficients representing the vocal tract characteristics is driven by an excitation signal to synthesize a speech signal.
- The digitized audio signal is divided into frames of a fixed length (approximately 5 ms to 50 ms), linear prediction is performed on the audio signal for each frame, and the prediction residual (excitation signal) is encoded using an adaptive code vector and a fixed code vector consisting of known waveforms.
- The adaptive code vector is stored in the adaptive codebook as a vector representing the excitation signal generated in the past, and is used to represent the periodic component of the audio signal.
- The fixed code vector is one of a predetermined number of waveform vectors prepared in advance in the fixed codebook, and is used mainly to represent non-periodic components that the adaptive codebook cannot express.
- the vectors stored in the fixed codebook include vectors composed of random noise sequences and vectors represented by a combination of several pulses.
- An algebraic fixed codebook is one of the typical fixed codebooks that expresses the fixed code vector by a combination of several pulses.
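The roles of the two codebooks can be illustrated with a minimal Python sketch. It assumes a pitch lag `lag`, pulse positions and signs, and two gains `gain_a` and `gain_f`; all of these names, and the gain scaling itself, are illustrative assumptions and are not taken from this description or from G.729.

```python
def adaptive_code_vector(past_excitation, lag, length):
    """Fill one subframe by repeating the last `lag` samples of past excitation."""
    if not 0 < lag <= len(past_excitation):
        raise ValueError("lag must lie within the stored excitation history")
    start = len(past_excitation) - lag
    return [past_excitation[start + (i % lag)] for i in range(length)]


def pulse_code_vector(length, pulses):
    """Build a fixed code vector from (position, sign) pulse pairs."""
    vec = [0.0] * length
    for pos, sign in pulses:
        vec[pos] += float(sign)
    return vec


def excitation(adaptive, fixed, gain_a, gain_f):
    """Combine the periodic and non-periodic parts (gains are assumed here)."""
    return [gain_a * a + gain_f * f for a, f in zip(adaptive, fixed)]


past = [1.0, 2.0, 3.0, 4.0]
adaptive = adaptive_code_vector(past, lag=2, length=5)
fixed = pulse_code_vector(5, [(0, +1), (3, -1)])
exc = excitation(adaptive, fixed, gain_a=0.5, gain_f=2.0)
```

The adaptive part repeats recent excitation at the pitch period, while the sparse pulse vector supplies what that repetition cannot represent.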
- The specific contents of the algebraic fixed codebook are described in "ITU-T Recommendation G.729".
- The speech linear prediction coefficients are converted into parameters such as partial autocorrelation (PARCOR) coefficients or line spectrum pairs (LSPs, also called line spectrum frequencies), then quantized, converted into a digital code, and stored or transmitted. Details of these methods are described in, for example, "Digital Speech Processing" by Sadahiro Furui (Tokai University Press).
- A conventional LSP parameter encoding method is as follows. The quantized parameter of the current frame is expressed by a weighted vector, obtained by multiplying the code vector of the current frame and the code vectors output from the vector codebook in one or more past frames by weighting factors selected from the weighting factor codebook and summing them, or by the vector obtained by adding to this weighted vector the average vector of the LSP parameters computed in advance over the entire audio signal. The code vector to be output by the vector codebook and the weighting factor set to be output by the weighting factor codebook are selected so that the distortion of the quantized parameter with respect to the LSP parameter computed from the input is minimal or sufficiently small, and they are output as the code of the LSP parameters.
- This is generally referred to as weighted vector quantization or, if the weighting factors are regarded as prediction coefficients from the past, as moving average (MA) prediction vector quantization.
- On the decoding side, based on the received vector code and weighting factor code, the code vector of the current frame and the past code vectors are multiplied by the weighting factors and summed; the resulting weighted vector, or the vector obtained by adding to it the average vector of the LSP parameters computed in advance over the entire audio signal, is output as the quantization vector of the current frame.
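The reconstruction on the decoding side can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `ma_reconstruct`, the list-based vectors, and the convention that `code_vectors[0]` is the current frame are all hypothetical.

```python
def ma_reconstruct(code_vectors, weights, mean=None):
    """Weighted sum of current and past code vectors, plus an optional mean.

    code_vectors[0] is the current frame's code vector, code_vectors[j] the
    one selected j frames earlier; weights[j] multiplies code_vectors[j].
    """
    if len(code_vectors) != len(weights):
        raise ValueError("one weight is needed per code vector")
    dim = len(code_vectors[0])
    y = list(mean) if mean is not None else [0.0] * dim
    for w, x in zip(weights, code_vectors):
        for i in range(dim):
            y[i] += w * x[i]
    return y


# Current frame plus one past frame (prediction order 1), with a mean vector.
y = ma_reconstruct([[1.0, 2.0], [3.0, 4.0]], [0.5, 0.5], mean=[1.0, 1.0])
```

Because only past code vectors enter the sum, not past outputs, a transmission error affects a bounded number of frames.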
- The vector codebook that outputs the code vector of each frame may be configured as a basic one-stage vector quantizer, a divided vector quantizer that divides the dimensions of the vector, a multi-stage vector quantizer with two or more stages, or a multi-stage divided vector quantizer that combines the multi-stage and divided configurations.
- In such conventional LSP parameter encoders and decoders, silent sections and stationary noise sections span many frames, and because the encoding and decoding processes have a multi-stage configuration, it was not always possible to output vectors such that the parameters synthesized for those sections change smoothly. The reason is that the vector codebook used for encoding is usually obtained by learning. If the training speech does not contain a sufficient amount of silent or stationary noise sections, the vectors corresponding to those sections are not always adequately reflected in the learning; and if the number of bits given to the quantizer is small, it was not possible to design a codebook that contains enough quantization vectors corresponding to non-speech sections.
- Such an LSP parameter encoder / decoder could not fully demonstrate the quantization performance of non-speech sections in encoding during actual communication, and could not prevent the quality of reproduced sound from deteriorating.
- Such problems have arisen not only in the encoding of acoustic parameters equivalent to linear prediction coefficients representing the spectral envelope of a speech signal, but also in the case of similar encoding of a music signal.
- An object of the present invention is to provide a method and an apparatus that solve these problems, and a program for implementing these methods on a computer.
Disclosure of the invention
- The main feature of the present invention is that, in the encoding and decoding of acoustic parameters equivalent to linear prediction coefficients representing the spectrum envelope of an audio signal, that is, parameters such as LSP parameters, α parameters, and PARCOR parameters (hereinafter simply called acoustic parameters), an acoustic parameter vector representing a substantially flat spectrum envelope, which corresponds to a silent section or a stationary noise section and cannot be obtained by codebook learning, is added to the codebook so that it can be selected.
- The difference from the conventional technique is that a vector including a component of an acoustic parameter vector representing a substantially flat spectrum envelope is obtained by calculation in advance and stored as one vector of the vector codebook, and that, in the multi-stage vector quantization configuration and the divided vector quantization configuration, the codebooks are arranged so that this code vector can be output.
- An acoustic parameter encoding method comprises:
- determining the code vector of the vector codebook and the set of weighting coefficients of the coefficient codebook, determining the indices representing the determined code vector and set of weighting coefficients as the quantization code of the acoustic parameter, and outputting the determined code.
- the vector codebook includes, as one of the stored code vectors, a vector including a component of the acoustic parameter vector representing the substantially flat spectrum envelope.
- the acoustic parameter decoding method according to the present invention comprises:
- the above-mentioned vector codebook includes, as one of the stored code vectors, a vector including a component of an acoustic parameter vector representing a substantially flat spectrum envelope.
- An acoustic parameter encoding device includes:
- Parameter calculating means for analyzing an input audio signal for each frame and calculating an audio parameter equivalent to a linear prediction coefficient representing a spectrum envelope characteristic of the audio signal
- a vector codebook that stores a plurality of code vectors in correspondence with indices representing them
- a coefficient codebook in which one or more sets of weighting factors are stored in correspondence with indexes representative of those sets,
- each of the weight coefficients of the set selected from the coefficient codebook is used.
- a distortion calculator for calculating a distortion of the quantized acoustic parameter with respect to the acoustic parameter calculated by the parameter calculating unit;
- Codebook search control unit for outputting indexes representing the code vector and the set of weighting factors, respectively, as a code for the acoustic parameter
- the vector codebook is configured to include, as one code vector, a vector including a component of an acoustic parameter vector representing a substantially flat spectrum envelope.
- the acoustic parameter decoding device comprises:
- a vector codebook in which a plurality of code vectors of acoustic parameters equivalent to linear prediction coefficients representing a spectrum envelope characteristic of an audio signal are stored in correspondence with indexes representative of them,
- a coefficient codebook in which one or more sets of weighting factors are stored in correspondence with their representative indexes
- One code vector is output from the vector codebook and a set of weighting coefficients is output from the coefficient codebook according to the indices represented by the code input for each frame; the code vector output in the current frame and the code vectors output in at least one of the most recent past frames are each multiplied by the weighting coefficients of the set output in the current frame and added to produce a weighted vector.
- Quantization parameter generation means for outputting a vector including the weighted vector component as a decoded quantized acoustic parameter of the current frame.
- a vector containing the component of the acoustic parameter vector that represents an almost flat spectrum envelope is stored as one of the code vectors.
- An audio signal encoding device for encoding an input audio signal includes:
- An adaptive codebook holding an adaptive code vector representing a periodic component of the input audio signal; and
- a fixed codebook storing a plurality of fixed vectors,
- An excitation vector generated based on the adaptive code vector from the adaptive codebook and the fixed vector from the fixed codebook is input as an excitation signal
- An adaptive code vector and a fixed vector to be selected from the adaptive codebook and the fixed codebook are determined so that distortion of the synthesized audio signal with respect to the input audio signal is reduced, and an adaptive code and a fixed code corresponding to the determined adaptive code vector and fixed vector, respectively, are output.
- An audio signal decoding apparatus for decoding an input code and outputting an audio signal includes: means for decoding, from the input code, acoustic parameters equivalent to linear prediction coefficients representing a spectrum envelope characteristic, using the above-described acoustic parameter decoding method;
- a fixed codebook storing a plurality of fixed vectors
- the corresponding fixed vector is extracted from the fixed codebook, and the corresponding adaptive code vector is extracted from the adaptive codebook.
- Filter means for setting a filter coefficient based on the acoustic parameter, and reproducing an acoustic signal by the excitation vector
- An audio signal encoding method for encoding an input audio signal according to the present invention includes:
- the adaptive code vector and the fixed vector to be selected from the adaptive codebook and the fixed codebook are determined so that the distortion of the synthesized audio signal with respect to the input audio signal is reduced, and an adaptive code and a fixed code corresponding to the determined adaptive code vector and fixed vector, respectively, are output.
- An audio signal decoding method for decoding an input code and outputting an audio signal according to the present invention includes:
- (B) Extracting, based on the input adaptive code and fixed code, the adaptive code vector from the adaptive codebook that holds adaptive code vectors representing the periodic component of the input audio signal, extracting the corresponding fixed vector from the fixed codebook that stores a plurality of fixed vectors, and combining the adaptive code vector and the fixed vector to generate an excitation vector;
- the present invention described above can be provided in the form of a computer-executable program.
- In the present invention, a vector including a component of an acoustic parameter vector representing a substantially flat spectrum envelope is obtained in advance and stored as one of the code vectors of the vector codebook, so a quantization vector matching the acoustic parameters of a silent section or a stationary noise section can be output.
- When the vector codebook has a multi-stage configuration, the vector containing the component of the acoustic parameter vector representing a substantially flat spectrum envelope is stored in the codebook of one stage and a zero vector is stored in the codebooks of the other stages, so that a quantization vector matching the acoustic parameters of a silent section or a stationary noise section can be output.
- Zero vectors need not always be stored. When the vector including the component of the acoustic parameter vector representing the substantially flat spectrum envelope is selected from the codebook of one stage, that vector may be output as the code vector candidate of the current frame.
- When the vector codebook is composed of divided vector codebooks, the divided vectors obtained by dividing the dimensions of the vector including the components of the acoustic parameter vector representing a substantially flat spectrum envelope are stored, one in each divided vector codebook. Each divided vector can then be selected in the search of its divided vector codebook, and the vector obtained by integrating the divided vectors can be output as a quantization vector matching the acoustic parameters of a silent section or a stationary noise section.
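The divided configuration can be illustrated with a short Python sketch. It assumes a plain squared-error search in each divided codebook; the helper names (`split`, `nearest`, `split_vq`) and the toy 4-dimensional flat-spectrum vector `c0` are hypothetical.

```python
def split(vec, sizes):
    """Cut a vector into consecutive sub-vectors of the given sizes."""
    parts, start = [], 0
    for size in sizes:
        parts.append(vec[start:start + size])
        start += size
    if start != len(vec):
        raise ValueError("sizes must add up to the vector dimension")
    return parts


def nearest(codebook, target):
    """Index of the code vector with the smallest squared error."""
    return min(range(len(codebook)),
               key=lambda k: sum((c - t) ** 2
                                 for c, t in zip(codebook[k], target)))


def split_vq(target, codebooks, sizes):
    """Search each divided codebook separately, then integrate the pieces."""
    indices = [nearest(cb, part)
               for cb, part in zip(codebooks, split(target, sizes))]
    quantized = [v for cb, k in zip(codebooks, indices) for v in cb[k]]
    return indices, quantized


c0 = [0.1, 0.2, 0.3, 0.4]             # toy stand-in for the flat-spectrum vector
low_cb = [[1.0, 1.0], [0.1, 0.2]]     # holds the lower half of c0 at index 1
high_cb = [[0.3, 0.4], [0.9, 0.9]]    # holds the upper half of c0 at index 0
indices, quantized = split_vq(c0, [low_cb, high_cb], [2, 2])
```

Since each half of `c0` is present in its own codebook, the independent searches recover the whole vector when the input matches it.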
- When the vector quantizer has a multi-stage divided vector quantization configuration, the multi-stage and divided techniques described above are combined, so that a quantization vector matching the acoustic parameters of a silent section or a stationary noise section can likewise be output.
- When the codebook has a multi-stage configuration, scaling coefficients for each of the codebooks in the second and subsequent stages are provided as scaling coefficient codebooks whose entries correspond to the code vectors of the first-stage codebook.
- The scaling coefficients corresponding to the code vector selected in the first-stage codebook are read out from the respective scaling coefficient codebooks and multiplied by the code vectors selected from the second-stage codebooks, whereby encoding with smaller quantization distortion can be realized.
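A minimal Python sketch of this scaled two-stage arrangement follows. It assumes one scalar scaling coefficient per first-stage code vector and an exhaustive squared-error search; the function names and toy codebooks are illustrative only.

```python
def scaled_two_stage(stage1, stage2, scales, i1, i2):
    """x = stage1[i1] + scales[i1] * stage2[i2]; the scale follows stage 1."""
    s = scales[i1]
    return [a + s * b for a, b in zip(stage1[i1], stage2[i2])]


def search_scaled_two_stage(target, stage1, stage2, scales):
    """Try every index pair and keep the one with the smallest squared error."""
    best = None
    for i1 in range(len(stage1)):
        for i2 in range(len(stage2)):
            cand = scaled_two_stage(stage1, stage2, scales, i1, i2)
            err = sum((t - c) ** 2 for t, c in zip(target, cand))
            if best is None or err < best[0]:
                best = (err, i1, i2)
    return best[1], best[2]


stage1 = [[1.0, 2.0], [3.0, 4.0]]
stage2 = [[0.5, -0.5], [1.0, 1.0]]
scales = [2.0, 0.5]   # one assumed scaling coefficient per first-stage entry
x = scaled_two_stage(stage1, stage2, scales, 1, 0)
pair = search_scaled_two_stage([3.25, 3.75], stage1, stage2, scales)
```

Tying the scale to the first-stage choice lets the second stage adapt its range to the region of the parameter space chosen by the first stage without spending extra bits.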
- In the audio signal encoding device according to the present invention, any one of the above parameter encoding devices is used in the domain of acoustic parameters equivalent to linear prediction coefficients. This configuration provides the same operation and effect as any of the above.
- In the audio signal decoding device according to the present invention, any one of the above parameter decoding devices is used in the domain of acoustic parameters equivalent to linear prediction coefficients. This configuration also provides the same operation and effect as any of the above.
- FIG. 1 is a block diagram showing a functional configuration of an acoustic parameter encoding device to which a codebook according to the present invention is applied.
- FIG. 2 is a block diagram showing a functional configuration of an acoustic parameter decoding device to which the codebook according to the present invention is applied.
- FIG. 3 is a diagram showing a configuration example of a vector codebook according to the present invention for LSP parameter encoding and decoding.
- FIG. 4 is a diagram showing a configuration example of a vector codebook according to the present invention in the case of a multi-stage configuration.
- FIG. 5 is a diagram showing a configuration example of a vector codebook according to the present invention when it is composed of divided vector codebooks.
- FIG. 6 is a diagram showing a configuration example of a vector codebook according to the present invention when a scaling coefficient is applied to a multistage vector codebook.
- FIG. 7 is a diagram showing a configuration example of a vector codebook according to the present invention when the second-stage codebook is configured by a divided vector codebook.
- FIG. 8 is a diagram showing a configuration example of a vector codebook when scaling coefficients are applied to two divided vector codebooks in the codebook of FIG.
- FIG. 9 is a diagram showing a configuration example of a vector codebook in a case where each stage of the multi-stage vector codebook in FIG. 4 is a divided vector codebook.
- FIG. 10A is a block diagram showing a configuration example of an audio signal transmission device to which the encoding method according to the present invention is applied.
- FIG. 10B is a block diagram showing a configuration example of an audio signal receiving apparatus to which the decoding method according to the present invention is applied.
- FIG. 11 is a diagram showing a functional configuration of a speech signal encoding device to which the encoding method according to the present invention is applied.
- FIG. 12 is a diagram showing a functional configuration of an audio signal decoding device to which the decoding method according to the present invention is applied.
- FIG. 13 is a diagram showing a configuration example when the encoding device and the decoding device according to the present invention are implemented by a computer.
- FIG. 14 is a graph for explaining the effect of the present invention.
Best mode for carrying out the invention
- FIG. 1 is a block diagram showing a configuration example of an acoustic parameter encoding apparatus according to an embodiment to which a linear prediction parameter encoding method according to the present invention is applied.
- This encoding device includes a linear prediction analysis unit 12, an LSP parameter calculation unit 13, and a parameter encoding unit 10 made up of a codebook 14, a quantization parameter generation unit 15, a distortion calculation unit 16, and a codebook search control unit 17.
- a series of digitized audio signal samples is input from an input terminal T1.
- the linear prediction analysis unit 12 performs linear prediction analysis on the audio signal samples for each frame stored in the internal buffer, and calculates a set of linear prediction coefficients.
- The LSP parameter calculation unit 13 calculates the equivalent p-th order LSP (line spectrum pair) parameters from the p-th order linear prediction coefficients. Details of these processing methods are described in the aforementioned book by Furui. These p LSP parameters are written as the vector $f(n) = (f_1(n), f_2(n), \ldots, f_p(n))$.
- The integer n is the frame number, and the frame at that time is called frame n.
- The codebook 14 contains a vector codebook 14A that stores N code vectors representing LSP parameter vectors obtained by learning, and a coefficient codebook 14B that stores K sets of weighting coefficients. Given an index Ix(n) designating a code vector and an index Iw(n) designating a weighting coefficient set, it outputs the corresponding code vector x(n) and weighting coefficient set $w_0, w_1, \ldots, w_m$.
- The quantization parameter generation unit 15 consists of m buffer units 15B_1, ..., 15B_m connected in series, m + 1 multipliers 15A_0, 15A_1, ..., 15A_m, a register 15C, and a vector adder 15D.
- The code vector selected from the vector codebook 14A as a candidate for the current frame n is x(n); the code vectors determined one, two, ..., m frames before are x(n-1), x(n-2), ..., x(n-m).
- These vectors are multiplied by the weighting coefficients of the selected set in the multipliers and summed in the vector adder 15D together with the average vector y_ave, giving the quantization vector candidate $y(n) = w_0\,x(n) + \sum_{j=1}^{m} w_j\,x(n-j) + y_{\mathrm{ave}}$.
- The value of m is selected as needed; 6 or less is sufficient, and a value of 1 to 3 may be used. This m is also called the moving average prediction order.
- The quantization vector candidate y(n) thus obtained is sent to the distortion calculation unit 16, which calculates the quantization distortion with respect to the LSP parameter f(n) calculated by the LSP parameter calculation unit 13.
- the distortion d is defined by the following weighted Euclidean distance, for example.
- Here the weighting factors, one for each dimension i = 1, ..., p, are obtained from the LSP parameter f(n); performance is good when the weighting emphasizes the vicinity of the formant frequencies of the spectrum.
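For reference, a weighted Euclidean distance between the LSP parameter f(n) and the quantization vector candidate y(n) has the general form below. The weight symbol $r_i$ and the component notation $f_i(n)$, $y_i(n)$ are assumed here for illustration and are not taken from the text.

```latex
d = \sum_{i=1}^{p} r_i \,\bigl( f_i(n) - y_i(n) \bigr)^2
```

With all $r_i = 1$ this reduces to the ordinary squared Euclidean distance; larger $r_i$ make errors in dimension $i$ more costly.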
- The codebook search control unit 17 sequentially changes the pair of indices Ix(n) and Iw(n) given to the codebook 14, and the distortion d of equation (5) is calculated as described above for each pair.
- In this way it searches the code vectors of the vector codebook 14A and the weighting coefficient sets of the coefficient codebook 14B for the combination for which the distortion d output by the distortion calculation unit 16 is minimal or sufficiently small, and sends the corresponding indices Ix(n) and Iw(n) from the terminal T2 as the code of the input LSP parameter.
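The search loop described above can be sketched in Python as an exhaustive search over all index pairs. This is an illustration under assumptions, not the device's actual procedure: the function names, the per-dimension weights `r`, and the toy codebooks are hypothetical, and a practical encoder might stop early once the distortion is "sufficiently small".

```python
def weighted_distortion(f, y, r):
    """Weighted squared error between target f and candidate y."""
    return sum(ri * (fi - yi) ** 2 for ri, fi, yi in zip(r, f, y))


def search_codebooks(f, r, vector_codebook, coeff_codebook, past, y_ave):
    """Return (Ix, Iw, d) minimizing the distortion over all index pairs.

    past[j-1] is the code vector chosen j frames ago; each weight set holds
    w_0 for the current candidate followed by w_1..w_m for the past frames.
    """
    best = None
    for ix, x in enumerate(vector_codebook):
        for iw, w in enumerate(coeff_codebook):
            y = list(y_ave)
            for wj, xj in zip(w, [x] + past):
                for i in range(len(y)):
                    y[i] += wj * xj[i]
            d = weighted_distortion(f, y, r)
            if best is None or d < best[2]:
                best = (ix, iw, d)
    return best


vector_codebook = [[0.0, 0.0], [1.0, 1.0]]
coeff_codebook = [[1.0, 0.0], [0.5, 0.5]]
result = search_codebooks(
    f=[1.5, 1.5], r=[1.0, 1.0],
    vector_codebook=vector_codebook, coeff_codebook=coeff_codebook,
    past=[[2.0, 2.0]], y_ave=[0.0, 0.0],
)
```

The cost grows with the product of the two codebook sizes, which is one reason multi-stage and divided codebooks are attractive.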
- the codes Ix (n) and Iw (n) sent from the terminal T2 are sent to a decoder via a transmission line or stored in a storage device.
- A feature of the present invention is that, as one of the code vectors stored in the vector codebook 14A used in the above weighted vector quantization or moving average prediction vector quantization of LSP parameters, the LSP parameter vector F corresponding to a silent section or a stationary noise section is stored if the average vector y_ave is zero, or the vector C_0 obtained by subtracting y_ave from F is stored if y_ave is not zero. That is, if y_ave is not zero, the stored vector is C_0 = F - y_ave, where F is the LSP parameter vector corresponding to a silent section or a stationary noise section.
- When C_0 is selected, the quantization vector y(n) can take the value C_0 + y_ave, that is, F obtained from the LSP parameters of the silent section, or a vector close to it, can be output as the quantization vector.
- As a result, the coding performance in the silent section or the stationary noise section can be improved.
- In other words, when the quantization parameter generation unit 15 generates a quantization vector y(n) that includes the component of the average vector y_ave, the vector F minus y_ave is used as the code vector including the component of F; when it generates a quantization vector y(n) that does not include the component of y_ave, the vector F itself is used.
- FIG. 2 is a configuration example of a decoding device to which the embodiment of the present invention is applied, and is configured by a codebook 24 and a quantization parameter generation unit 25.
- the codebook 24 and the quantization parameter generation unit 25 are configured similarly to the codebook 14 and the quantization parameter generation unit 15 in the encoding device of FIG.
- The indices Ix(n) and Iw(n) sent from the encoding apparatus of FIG. 1 as the parameter code are input; the code vector x(n) corresponding to the index Ix(n) is output from the vector codebook 24A, and the weighting coefficient set $w_0, w_1, \ldots, w_m$ corresponding to the index Iw(n) is output from the coefficient codebook 24B.
- The code vector x(n) output from the vector codebook 24A for each frame is input sequentially to the buffer units 25B_1, ..., 25B_m connected in series. The code vector of the current frame and the code vectors of the past frames held in the buffers are multiplied by the weighting coefficients and summed, as in the encoder, to produce the decoded quantization vector y(n).
- In this decoding device too, as in the encoding device described above, storing the vector C_0 in the vector codebook 24A as one code vector makes it possible to output the LSP parameter vector F obtained in a silent section or a stationary noise section of the acoustic signal.
- When the average vector y_ave is zero, the LSP parameter vector F corresponding to the silent section or the stationary noise section is stored as a single code vector instead of the vector C_0.
- In the following, the LSP parameter vector F or the vector C_0 stored in each of the vector codebooks 14A and 24A is referred to simply as the vector C_0.
- FIG. 3 shows a configuration example of the vector codebook 14A in FIG. 1 or the vector codebook 24A in FIG. 2.
- A one-stage vector codebook 41 is used, and the vector codebook 41 holds N code vectors.
- One of the N code vectors is selected and output.
- The code vector C0 is used as one of the code vectors x.
- The N code vectors of the vector codebook 41 are created by learning, for example, as in the past; in the present invention, however, the one vector most similar to the vector C0 (the one with the smallest distortion) is replaced with C0, or C0 is simply added.
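The two ways of placing C0 in a trained codebook can be sketched as below. The squared Euclidean distance is an assumption standing in for the distortion measure, and `embed_silence_vector` is a hypothetical name.

```python
def embed_silence_vector(codebook, c0, replace=True):
    """Return a copy of `codebook` that contains c0.

    replace=True  : overwrite the trained vector closest to c0.
    replace=False : simply append c0 as an extra code vector.
    """
    def dist(a, b):
        # Assumed distortion measure: squared Euclidean distance.
        return sum((u - v) ** 2 for u, v in zip(a, b))

    result = [list(v) for v in codebook]
    if replace:
        nearest = min(range(len(result)), key=lambda i: dist(result[i], c0))
        result[nearest] = list(c0)
    else:
        result.append(list(c0))
    return result
```

Replacing keeps the codebook size, and hence the index bit width, unchanged; appending keeps every trained vector but needs one more index value.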
- There are several ways to obtain the vector C0.
- For a p-th order LSP parameter vector F, the range from 0 to π may be divided into p + 1 equal parts, and the p values at almost equal intervals, π/(1+p), 2π/(1+p), ..., pπ/(1+p), may be used as the LSP parameter vector.
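The equally spaced LSP vector described above is easy to generate. A minimal sketch, with `flat_lsp_vector` as a hypothetical name:

```python
import math


def flat_lsp_vector(p):
    """p LSP values that split the range 0..pi into p + 1 equal parts."""
    return [k * math.pi / (p + 1) for k in range(1, p + 1)]
```

For p = 10 the elements are π/11, 2π/11, ..., 10π/11, so adjacent elements always differ by π/11.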
- Alternatively, C0 is calculated as C0 = F - y_ave from the actual LSP parameter vector F in the silent section or the stationary noise section.
- The average vector y_ave of the LSP parameters of the entire audio signal is generally obtained as the average of all the learning vectors when the code vectors x of the vector codebook 41 are learned.
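A small sketch of these two calculations, assuming the learning vectors are available as a list of equal-length lists. Both function names are hypothetical.

```python
def mean_vector(training_vectors):
    """y_ave: element-wise average of all learning vectors."""
    n = len(training_vectors)
    return [sum(column) / n for column in zip(*training_vectors)]


def silence_code_vector(f_silence, y_ave):
    """C0 = F - y_ave for the silent / stationary-noise LSP vector F."""
    return [f - m for f, m in zip(f_silence, y_ave)]
```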
- With p = 10th-order LSP parameters used as the acoustic parameters, and with the LSP parameters in the silent section or the stationary noise section normalized to values between 0 and π, Table 1 below shows examples of the 10-dimensional vectors such as y_ave.
- a vector 11 is an example of a code vector of an LSP parameter representing a silent section and a stationary noise section written in a codebook according to the present invention.
- The values of the vector elements increase at almost constant intervals, which means that the frequency spectrum is almost flat.
- FIG. 4 shows another example of the configuration of the vector codebook 14A of the LSP parameter encoding device of FIG. 1 or the vector codebook 24A of the LSP parameter decoding device of FIG. 2, configured as a codebook 4A.
- The first-stage codebook 41 stores N p-dimensional code vectors x_11, ..., x_1N.
- The second-stage codebook 42 stores N' p-dimensional code vectors x_21, ..., x_2N'.
- The code analysis unit 43 analyzes the index Ix(n) to obtain an index Ix(n)1 that specifies the first-stage code vector and an index Ix(n)2 that specifies the second-stage code vector. The i-th and i'-th code vectors x_1i and x_2i' corresponding to the indexes Ix(n)1 and Ix(n)2 are then read from the first-stage codebook 41 and the second-stage codebook 42, respectively, and the adder 44 adds the two code vectors and outputs the sum as the code vector x(n).
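The index analysis and addition can be sketched as follows. How Ix(n) packs the two stage indexes is not stated in the text, so the layout Ix(n) = i1 * N' + i2 used here is purely an assumption, as are the function names.

```python
def split_index(ix, n2):
    """Assumed layout: ix = i1 * n2 + i2, where n2 is the second-stage size."""
    return divmod(ix, n2)


def two_stage_vector(ix, stage1, stage2):
    """x(n) = x_1i + x_2i', the role of the adder 44."""
    i1, i2 = split_index(ix, len(stage2))
    return [a + b for a, b in zip(stage1[i1], stage2[i2])]
```

With N first-stage and N' second-stage vectors, the pair of codebooks stores N + N' vectors but can output N * N' distinct sums.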
- The code vector search first uses only the first-stage codebook 41 to find a predetermined number of candidate code vectors in ascending order of quantization distortion. This search is performed in combination with the weighting coefficient sets of the coefficient codebook 14B shown in FIG. 1. Next, for each combination of a first-stage candidate code vector and a code vector of the second-stage codebook, the combination that minimizes quantization distortion is searched for.
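The candidate-based search can be sketched as below. This is a simplification under stated assumptions: distortion is plain squared error against a target vector, and the joint search over the weighting coefficient sets of the coefficient codebook 14B is left out. All names are hypothetical.

```python
def sq_dist(a, b):
    """Assumed distortion measure: squared Euclidean distance."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def search_two_stage(target, stage1, stage2, num_candidates):
    """Return (i1, i2) minimising distortion among the pre-selected candidates."""
    # Step 1: keep the best first-stage vectors, in ascending order of distortion.
    ranked = sorted(range(len(stage1)), key=lambda i: sq_dist(target, stage1[i]))
    best = None
    # Step 2: try every second-stage vector with each surviving candidate.
    for i1 in ranked[:num_candidates]:
        for i2, v2 in enumerate(stage2):
            candidate = [a + b for a, b in zip(stage1[i1], v2)]
            d = sq_dist(target, candidate)
            if best is None or d < best[0]:
                best = (d, i1, i2)
    return best[1], best[2]
```

Keeping only a few first-stage candidates cuts the work from N * N' distortion evaluations to roughly N + M * N', where M is `num_candidates`.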
- When the code vector search is performed with priority given to the first-stage codebook 41 in this way, the vector C0 (or F) is stored in advance as one code vector in the first-stage codebook 41 of the multi-stage vector codebook 4A, and the zero vector z is stored in advance as one code vector in the second-stage codebook 42.
- As a result, when the code vector C0 is selected from the codebook 41, the zero vector z is selected from the codebook 42, so that the codebook 4A can output from the adder 44 the code vector C0 for a silent section or a stationary noise section. Alternatively, instead of storing the zero vector z, the configuration may skip the selection from the codebook 42 and the addition when the code vector C0 is selected from the codebook 41.
- When all combinations of the code vectors of the two codebooks are searched, the code vector C0 and the zero vector z may be stored in either codebook, as long as they are in different codebooks. The code vector C0 and the zero vector z are likely to be selected at the same time in a silent section or a stationary noise section, but they may not always be selected at the same time owing to calculation errors and other factors. In the codebook at each stage, the code vector C0 and the zero vector z are candidates for selection as one code vector just like the other code vectors.
- The zero vector need not be stored in the second-stage codebook 42. In that case, when the vector C0 is selected from the first-stage codebook 41, no code vector is selected from the second-stage codebook 42, and the code vector of the codebook 41 is output as it is from the adder 44.
- Configuring the codebook 4A as a multi-stage codebook as shown in FIG. 4 is effectively the same as providing as many code vectors as there are selectable code vector combinations, so the size of the codebook (here, the total number of code vectors) can be reduced compared with a one-stage codebook.
- FIG. 4 shows the case of the two-stage vector codebooks 41 and 42. When the number of stages is three or more, codebooks are simply added for the additional stages; since it is only necessary to select a code vector from the codebook of each stage by its index and to combine the selected vectors by vector addition, the configuration is easy to extend.
- FIG. 5 shows a case in which, in the vector codebook 4A of the embodiment of FIG. 4, the code vector selected from the second-stage codebook 42 is multiplied by a scaling coefficient predetermined for each code vector of the first-stage codebook 41, added to the code vector from the first-stage codebook 41, and output.
- A scaling coefficient codebook 45 is provided. It stores scaling coefficients s_1, ..., s_N, determined in advance by learning in a range of, for example, about 0.5 to 2, in correspondence with the code vectors x_11, ..., x_1N of the first-stage codebook 41, and is accessed by the same index as the first-stage codebook 41.
- The code analysis unit 43 analyzes the index Ix(n) to obtain an index Ix(n)1 that specifies the first-stage code vector and an index Ix(n)2 that specifies the second-stage code vector.
- The code vector x_1i corresponding to the index Ix(n)1 is read from the first-stage codebook 41.
- The scaling coefficient s_i corresponding to the index Ix(n)1 is read from the scaling coefficient codebook 45.
- The code vector x_2i' corresponding to the index Ix(n)2 is read from the second-stage codebook 42, and the multiplier 46 multiplies it by the scaling coefficient s_i.
- The vector obtained by the multiplication and the code vector from the first-stage codebook 41 are added by the adder 44, and the sum is output as the code vector x(n) from the codebook 4A.
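The scaled combination amounts to x(n) = x_1i + s_i * x_2i'. A minimal sketch, taking the two stage indexes as already separated; the function name is hypothetical.

```python
def scaled_two_stage_vector(i1, i2, stage1, stage2, scales):
    """x(n) = x_1i + s_i * x_2i'; the scale is looked up with the first-stage index."""
    s = scales[i1]
    return [a + s * b for a, b in zip(stage1[i1], stage2[i2])]
```

Because the scale is tied to the first-stage index, it costs no extra bits in the transmitted code.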
- The code vector search first uses only the first-stage codebook 41 to find a predetermined number of candidate code vectors in ascending order of quantization distortion.
- Then, for each candidate, the combination with a code vector of the second-stage codebook 42 that minimizes the quantization distortion is searched for.
- The vector C0 is stored in advance as one code vector in the first-stage codebook 41, and the zero vector z is stored in advance as one code vector in the second-stage codebook 42.
- When the search is performed over all combinations of the code vectors of the two codebooks 41 and 42, the code vector C0 and the zero vector z may be stored in either of them, as long as they are stored in separate codebooks. Alternatively, the zero vector z need not be stored, as in the above-described embodiment. In that case, when the code vector C0 is selected, the selection from the codebook 42 and the addition are not performed.
- In this way, a code vector corresponding to a silent section or a stationary noise section can be output.
- The code vector C0 and the zero vector z are likely to be selected at the same time during a silent section or a stationary noise section. However, they may not always be selected at the same time owing to calculation errors and other factors.
- The code vector C0 and the zero vector z are candidates for selection as one code vector, just like the other code vectors.
- FIG. 6 shows a case where the invention is applied to a configuration in which the vector codebook 14A of the parameter encoding device in FIG. 1 or the vector codebook 24A of the parameter decoding device in FIG. 2 is configured as a split vector codebook 4A.
- Although FIG. 6 shows a two-way split vector codebook, it can be extended in the same way when the number of divisions is three or more; the two-division case is therefore described here.
- The codebook 4A comprises a low-order vector codebook 41L storing N low-order code vectors x_L1, ..., x_LN and a high-order vector codebook 41H storing N' high-order code vectors x_H1, ..., x_HN'.
- With an output code vector of p dimensions, the low-order and high-order vector codebooks 41L and 41H hold vectors covering the 1st to k-th (low-order) and the (k+1)-th to p-th (high-order) dimensions, respectively. That is, the i-th vector of the low-order vector codebook 41L consists of the elements of dimensions 1 to k, and the i'-th vector of the high-order vector codebook 41H consists of the elements of dimensions k+1 to p.
- The input index Ix(n) is divided in the analysis unit 43 into Ix(n)L and Ix(n)H; the low-order and high-order split vectors x_Li and x_Hi' corresponding to Ix(n)L and Ix(n)H are selected from the codebooks 41L and 41H, respectively, and the integration unit 47 integrates the split vectors x_Li and x_Hi' into the output code vector.
- The low-order vector C0L of the vector C0 is stored as one vector of the low-order vector codebook 41L, and the high-order vector C0H of the vector C0 is stored as one vector of the high-order vector codebook 41H.
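The split codebook and the placement of C0 in it can be sketched as follows. Appending the two halves to the end of each codebook is an assumption made for illustration (they could equally replace trained entries), and all names are hypothetical.

```python
def split_vector(v, k):
    """Split a p-dimensional vector into dimensions 1..k and k+1..p."""
    return list(v[:k]), list(v[k:])


def integrate(low, high):
    """Role of the integration unit 47: concatenate the two split vectors."""
    return list(low) + list(high)


def add_silence_vector(low_codebook, high_codebook, c0, k):
    """Store the halves C0L and C0H; return the indexes at which they were stored."""
    c0_low, c0_high = split_vector(c0, k)
    low_codebook.append(c0_low)
    high_codebook.append(c0_high)
    return len(low_codebook) - 1, len(high_codebook) - 1
```

Selecting both returned indexes reproduces C0 exactly; selecting only one of them yields a vector that matches C0 in just half of its dimensions.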
- FIG. 7 shows still another example of the configuration of the vector codebook 14A of the acoustic parameter encoding device of FIG. 1 or the vector codebook 24A of the acoustic parameter decoding device of FIG. 2, configured as a multi-stage split vector codebook.
- This codebook 4A is the codebook 4A of FIG. 4 with the second-stage codebook 42 configured as a two-way split vector codebook similar to that of FIG. 6.
- The first-stage codebook 41 stores N code vectors x_11, ..., x_1N.
- The second-stage low-order codebook 42L stores N' split vectors x_2L1, ..., x_2LN', and the second-stage high-order codebook 42H stores N'' split vectors x_2H1, ..., x_2HN''.
- The input index Ix(n) is analyzed in the code analysis unit 43_1 into an index Ix(n)1 that specifies the first-stage code vector and an index Ix(n)2 that specifies the second-stage code vector.
- The i-th code vector x_1i corresponding to the first-stage index Ix(n)1 is read from the first-stage codebook 41.
- The second-stage index Ix(n)2 is parsed by the analysis unit 43_2 into Ix(n)2L and Ix(n)2H; by these, the i'-th and i''-th split vectors x_2Li' and x_2Hi'' are selected from the second-stage low-order split vector codebook 42L and the second-stage high-order split vector codebook 42H, respectively. The selected split vectors are integrated by the integration unit 47 to generate the second-stage code vector x_2i'i''. The adder 44 adds the first-stage code vector x_1i and the second-stage integrated vector x_2i'i'' and outputs the code vector x(n).
- The vector C0 is stored as one code vector of the first-stage codebook 41, as in the embodiments described above.
- The split zero vectors z_L and z_H are stored in the low-order split vector codebook 42L and the high-order split vector codebook 42H of the second-stage split vector codebook 42, respectively.
- the number of codebook stages may be three or more.
- the divided vector codebook may be used for any stage, and the number of divided vector codebooks per stage is not limited to two. The number of stages to be divided may be one or more.
- The vector C0 and the split zero vectors z_L and z_H may be stored in the codebooks of any mutually different stages. Alternatively, as in the second and third embodiments, the split zero vectors need not be stored; in that case, when the vector C0 is selected, selection from the codebooks 42L and 42H and the addition are not performed.
- The low-order scaling coefficient codebook 45L and the high-order scaling coefficient codebook 45H each store N coefficients with values of, for example, about 0.5 to 2.
- The input index Ix(n) is parsed by the analysis unit 43_1 into an index Ix(n)1 specifying the first-stage code vector and an index Ix(n)2 specifying the second-stage code vector. First, the code vector x_1i corresponding to the index Ix(n)1 is obtained from the first-stage codebook 41. In addition, the low-order scaling coefficient s_Li and the high-order scaling coefficient s_Hi corresponding to the index Ix(n)1 are read from the low-order scaling coefficient codebook 45L and the high-order scaling coefficient codebook 45H, respectively.
- Next, the index Ix(n)2 is parsed by the analysis unit 43_2 into the indexes Ix(n)2L and Ix(n)2H, by which the split vectors x_2Li' and x_2Hi'' are selected from the second-stage low-order split vector codebook 42L and the second-stage high-order split vector codebook 42H. The selected split vectors are multiplied by the low-order and high-order scaling coefficients s_Li and s_Hi in the multipliers 46L and 46H, and the resulting vectors are integrated by the integration unit 47 to generate the second-stage code vector x_2i'i''.
- The adder 44 adds the first-stage code vector x_1i and the second-stage integrated vector x_2i'i'', and outputs the sum as the code vector x(n).
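The scaled multi-stage split structure can be sketched as below, with the three indexes taken as already parsed. Setting every scale to 1.0 gives the unscaled multi-stage split codebook of FIG. 7. The function and argument names are hypothetical.

```python
def multistage_split_vector(i1, i2_low, i2_high, stage1,
                            stage2_low, stage2_high, scale_low, scale_high):
    """x(n) = x_1i + [s_Li * x_2Li', s_Hi * x_2Hi''] (scaled halves concatenated)."""
    low = [scale_low[i1] * v for v in stage2_low[i2_low]]
    high = [scale_high[i1] * v for v in stage2_high[i2_high]]
    second_stage = low + high  # role of the integration unit 47
    return [a + b for a, b in zip(stage1[i1], second_stage)]
```

When the first-stage entry is C0 and both second-stage selections are the split zero vectors, the output is C0 itself regardless of the scale values.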
- The vector C0 is stored as one code vector in the first-stage codebook 41, and the split zero vectors z_L and z_H are stored as split vectors in the low-order split vector codebook 42L and the high-order split vector codebook 42H of the second-stage split vector codebook, respectively. This realizes a configuration that outputs the code vector for a silent section or a stationary noise section.
- The number of codebook stages may be three or more. In that case, two or more of the second and subsequent stages may each be configured as a split vector codebook. In any case, the number of split vector codebooks per stage is not limited.
- FIG. 9 shows still another configuration example of the vector codebook 14A of the acoustic parameter encoding device of FIG. 1 or the vector codebook 24A of the acoustic parameter decoding device of FIG. 2.
- In this embodiment, the first-stage codebook 41 is also configured as a split vector codebook, in the same way as in the split-codebook embodiment described above.
- The first-stage low-order codebook 41L stores N low-order split vectors x_1L1, ..., x_1LN.
- The first-stage high-order codebook 41H stores N' high-order split vectors x_1H1, ..., x_1HN'.
- The second-stage low-order codebook 42L stores N'' low-order split vectors x_2L1, ..., x_2LN'', and the second-stage high-order codebook 42H stores N''' high-order split vectors x_2H1, ..., x_2HN'''.
- The input index Ix(n) is parsed by the code analysis unit 43_1 into an index Ix(n)1 that designates the first-stage vector and an index Ix(n)2 that designates the second-stage vector.
- In response to the first-stage index Ix(n)1, the i-th and i'-th split vectors x_1Li and x_1Hi' are selected from the first-stage low-order split vector codebook 41L and the first-stage high-order split vector codebook 41H, respectively, and are integrated by the integration unit 47_1 to generate the first-stage integrated vector x_1ii'.
- As with the first-stage index, the second-stage index Ix(n)2 selects split vectors from the second-stage low-order split vector codebook 42L and the second-stage high-order split vector codebook 42H, respectively: the i''-th and
- The low-order split vector C0L of the vector C0 is stored as one split vector of the first-stage low-order vector codebook 41L, and the high-order split vector C0H of the vector C0 is stored as one split vector of the first-stage high-order vector codebook 41H. In addition, the split zero vectors z_L and z_H are stored as one vector each in the low-order split vector codebook 42L and the high-order split vector codebook 42H of the second-stage split vector codebook 42.
- This realizes a configuration that can output the code vector for a silent section or a stationary noise section. In this case too, the number of stages is not limited to two, and the number of split vector codebooks per stage is not limited to two.
- FIG. 10 is a block diagram showing a configuration of an audio signal transmitting apparatus and a receiving apparatus to which the present invention is applied.
- the audio signal 101 is converted into an electric signal by the input device 102 and output to the A / D converter 103.
- the A / D converter 103 converts the (analog) signal output from the input device 102 into a digital signal, and outputs the digital signal to the speech encoder 104.
- the audio encoding device 104 encodes the digital audio signal output from the A / D conversion device 103 by using an audio encoding method described later, and outputs encoded information to the RF modulation device 105.
- the RF modulator 105 converts the speech coded information output from the speech coder 104 into a signal to be transmitted on a propagation medium such as a radio wave and outputs the signal to the transmission antenna 106.
- the transmission antenna 106 transmits the output signal output from the RF modulator 105 as a radio wave (RF signal) 107.
- the transmitted radio wave (RF signal) 108 is received by the receiving antenna 109 and output to the RF demodulator 110.
- The radio wave (RF signal) 108 in the figure is the radio wave (RF signal) 107 as seen from the receiving side, and is identical to the radio wave (RF signal) 107 unless there is signal attenuation or superposed noise in the propagation path.
- RF demodulation device 110 demodulates audio encoded information from the RF signal output from reception antenna 109 and outputs the demodulated information to audio decoding device 111.
- the audio decoding device 111 decodes an audio signal from the audio coding information output from the RF demodulation device 110 using an audio decoding method described later, and outputs the audio signal to the D / A conversion device 112.
- the D / A converter 112 converts the digital audio signal output from the audio decoder 111 into an analog electrical signal and outputs it to the output device 113.
- The output device 113 converts the electrical signal into air vibration and outputs it as the sound signal 114 so as to be audible to the human ear.
- With these devices, a base station device and a mobile terminal device in a mobile communication system can be configured.
- The speech signal transmitting apparatus is characterized by the speech encoding apparatus 104.
- FIG. 11 is a block diagram showing the configuration of the speech encoding device 104.
- the input audio signal is a signal output from the A / D converter 103 in FIG. 10 and is input to the preprocessing unit 200.
- The pre-processing unit 200 performs high-pass filtering to remove the DC component, as well as waveform shaping and pre-emphasis processing to improve the performance of the subsequent encoding, and outputs the processed signal Xin to the LPC analysis unit 201 and the adder 204.
- The LPC analysis section 201 performs linear prediction analysis on Xin and outputs the analysis result (linear prediction coefficients) to the LPC quantization section 202.
- LPC quantizing section 202 includes LSP parameter calculating section 13, parameter encoding section 10, decoding section 18, and parameter conversion section 19.
- The parameter encoding unit 10 has the same configuration as the parameter encoding unit 10 in FIG. 1, to which the vector codebook of the present invention according to any of the embodiments in FIGS. 3 to 9 is applied.
- The decoding section 18 has the same configuration as the decoding device in FIG. 2, to which any of the codebooks in FIGS. 3 to 9 is applied.
- The linear prediction coefficients (LPC) output from the LPC analysis unit 201 are converted into LSP parameters in the LSP parameter calculation unit 13, and the obtained LSP parameters are encoded in the parameter encoding unit 10 as described with reference to FIG. 1.
- The codes Ix(n) and Iw(n) obtained by the encoding, that is, the code L representing the quantized LPC, are output to the multiplexing unit 213. The codes Ix(n) and Iw(n) are also decoded by the decoding unit 18 to obtain quantized LSP parameters, which are converted back to LPC parameters by the parameter conversion unit 19, and the resulting quantized LPC parameters are sent to the synthesis filter 203.
- The synthesis filter 203 uses the quantized LPC as filter coefficients, synthesizes an acoustic signal by filtering the driving sound source signal output from the adder 210, and outputs the synthesized signal to the adder 204.
- The adder 204 calculates an error signal ε between Xin and the synthesized signal, and outputs the error signal ε to the auditory weighting unit 211.
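The synthesis filter and the error calculation can be sketched as below. The sign convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p and the zero initial filter state are assumptions; a real coder carries the filter memory from frame to frame. The function names are hypothetical.

```python
def synthesize(excitation, lpc):
    """All-pole filter 1/A(z): s[n] = e[n] - sum_k a_k * s[n - k] (assumed convention)."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:  # zero initial state assumed
                acc -= a * out[n - k]
        out.append(acc)
    return out


def error_signal(x_in, synthesized):
    """Role of the adder 204: difference between Xin and the synthesized signal."""
    return [x - s for x, s in zip(x_in, synthesized)]
```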
- The auditory weighting unit 211 performs auditory weighting on the error signal ε output from the adder 204, calculates the distortion of the synthesized signal, and outputs it to the parameter determination unit 212.
- The parameter determination unit 212 determines the signals to be generated from the adaptive codebook 205, the fixed codebook 207, and the quantization gain generation unit 206 so that the coding distortion output from the auditory weighting unit 211 is minimized. The coding performance can be improved further by determining the signals to be generated from these three means not only by minimizing the coding distortion output from the auditory weighting unit 211 but also by additionally using another coding distortion measure that uses Xin.
- The adaptive codebook 205 buffers the excitation signal of the immediately preceding frame n-1, which the adder 210 output when the distortion was minimized. The excitation vector is cut out from the position specified by the adaptive vector code A output from the parameter determination unit 212 and is repeatedly concatenated until it reaches one frame length, generating an adaptive vector that contains the desired periodic component; this vector is output to the multiplier 208.
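The cut-and-repeat construction of the adaptive vector can be sketched as follows. It assumes that the adaptive vector code A maps to a lag counted back from the end of the buffered excitation; that mapping and the function name are hypothetical.

```python
def adaptive_vector(past_excitation, lag, frame_len):
    """Cut the last `lag` samples and repeat them until one frame is filled."""
    if not 1 <= lag <= len(past_excitation):
        raise ValueError("lag must lie within the buffered excitation")
    segment = past_excitation[-lag:]
    out = []
    while len(out) < frame_len:
        out.extend(segment)
    return out[:frame_len]
```

Repeating a segment shorter than the frame is what gives the adaptive vector its periodic component, with period equal to the lag.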
- In the fixed codebook 207, a plurality of fixed vectors of one frame length are stored in correspondence with fixed vector codes, and the fixed vector specified by the fixed vector code F output from the parameter determination unit 212 is output to the multiplier 209.
- The quantization gain generating section 206 supplies the quantized adaptive vector gain gA and the quantized fixed vector gain gF, for the adaptive vector and the fixed vector respectively, as specified by the gain code G output from the parameter determination unit 212, to the multipliers 208 and 209.
- Multiplier 208 multiplies the quantized adaptive vector gain g A output from quantization gain generating section 206 by the adaptive vector output from adaptive codebook 205, and outputs the result to adder 210.
- Multiplier 209 multiplies the fixed vector output from fixed vector codebook 207 by the quantized fixed vector gain g F output from quantization gain generating section 206, and outputs the result to adder 210.
- Adder 210 performs vector addition on the adaptive vector after the gain multiplication and the fixed vector, and outputs the result to synthesis filter 203 and adaptive codebook 205.
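The roles of the multipliers 208 and 209 and the adder 210 reduce to a gain-weighted vector sum. A minimal sketch with a hypothetical function name:

```python
def driving_excitation(adaptive, fixed, g_a, g_f):
    """Excitation = gA * adaptive vector + gF * fixed vector, element by element."""
    return [g_a * a + g_f * f for a, f in zip(adaptive, fixed)]
```

The result is both the input of the synthesis filter 203 and the signal written back into the adaptive codebook 205 for the next frame.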
- The multiplexing unit 213 receives the code L representing the quantized LPC from the LPC quantization unit 202 and, from the parameter determination unit 212, the adaptive vector code A representing the adaptive vector, the fixed vector code F representing the fixed vector, and the gain code G representing the quantization gain; it multiplexes these codes and outputs them to the transmission line as coded information.
- FIG. 12 is a block diagram showing the configuration of the speech decoding apparatus 111 in FIG. 10.
- The coded information output from the RF demodulation unit 110 is separated by the demultiplexing unit 1301 into the individual codes L, A, F, and G. The separated LPC code L is given to the LPC decoding section 1302, the separated adaptive vector code A to the adaptive codebook 1305, the separated gain code G to the quantization gain generating section 1306, and the separated fixed vector code F to the fixed codebook 1307.
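Multiplexing on the encoder side and separation on the decoder side can be sketched as packing the four codes into one integer per frame. The bit widths in `FIELDS` are hypothetical placeholders, since the text does not state them, and the function names are likewise illustrative.

```python
# Hypothetical field order and bit widths for one frame; not taken from the text.
FIELDS = (("L", 18), ("A", 8), ("F", 17), ("G", 7))


def multiplex(codes):
    """Pack the codes L, A, F, G into a single integer, first field most significant."""
    word = 0
    for name, width in FIELDS:
        value = codes[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"code {name} does not fit in {width} bits")
        word = (word << width) | value
    return word


def demultiplex(word):
    """Inverse of multiplex: recover the individual codes from the packed word."""
    codes = {}
    for name, width in reversed(FIELDS):
        codes[name] = word & ((1 << width) - 1)
        word >>= width
    return codes
```

Both sides must agree on the field order and widths, which is why they are defined once and shared.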
- the LPC decoding section 1302 includes a decoding section 1302A configured in the same manner as in Fig. 2 and a parameter conversion section 1302B.
- Adaptive codebook 1305 extracts an adaptive vector from the position specified by adaptive vector code A output from demultiplexing section 1301, and outputs the extracted adaptive vector to multiplier 1308.
- Fixed code book 1307 generates a fixed vector specified by fixed vector code F output from demultiplexing section 1301, and outputs the generated fixed vector to multiplier 1309.
- The quantization gain generating unit 1306 decodes the adaptive vector gain gA and the fixed vector gain gF specified by the gain code G output from the demultiplexing unit 1301, and outputs them to the multipliers 1308 and 1309, respectively.
- The multiplier 1308 multiplies the adaptive vector by the adaptive vector gain gA, and the multiplier 1309 multiplies the fixed vector by the fixed vector gain gF; each outputs its result to the adder 1310.
- The adder 1310 adds the gain-multiplied adaptive vector and fixed vector output from the multipliers 1308 and 1309, and outputs the result to the synthesis filter 1303.
- The synthesis filter 1303 performs filter synthesis using the vector output from the adder 1310 as the driving sound source signal and the filter coefficients decoded by the LPC decoding unit 1302, and outputs the synthesized signal to the post-processing unit 1304.
- The post-processing unit 1304 performs processing to improve the subjective quality of speech, such as formant emphasis and pitch emphasis, and processing to improve the subjective quality of stationary noise, and outputs the processed signal.
- the LSP parameter is used as a parameter equivalent to the linear prediction coefficient representing the spectral envelope of the audio signal.
- Other parameters, for example the α parameter or the PARCOR coefficient, may also be used. Even when these are used, the spectrum envelope becomes flat in silent sections or stationary noise sections, so it is easy to calculate the parameters for these sections.
- For example, for the p-th order α parameter, the 0th order should be 1.0 and the 1st to p-th orders should be 0.0. Even when other acoustic parameters are used, any acoustic parameter vector determined so as to represent a substantially flat spectrum envelope may be used.
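That this α-parameter vector gives a flat envelope can be checked numerically. The sketch assumes the envelope is 1/|A(e^{jω})| with A(z) equal to the sum of α_k z^-k; the function names are hypothetical.

```python
import cmath
import math


def flat_alpha_vector(p):
    """0th order 1.0, 1st to p-th order 0.0."""
    return [1.0] + [0.0] * p


def envelope_magnitude(alpha, omega):
    """|1 / A(e^{j*omega})| for A(z) = sum_k alpha_k * z^-k (assumed definition)."""
    a = sum(c * cmath.exp(-1j * omega * k) for k, c in enumerate(alpha))
    return 1.0 / abs(a)
```

With all higher-order coefficients at zero, A(z) reduces to 1, so the magnitude is 1 at every frequency.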
- the LSP parameter is practical because of its good quantization efficiency.
- vector C in the case of a multistage configuration as a vector codebook, vector C.
- the present invention can be applied not only to encoding and decoding of audio signals but also to encoding and decoding of general acoustic signals such as music signals.
- the apparatus of the present invention can execute encoding and decoding of an audio signal by causing a computer to execute a program.
- FIG. 13 shows an embodiment in which a computer implements the acoustic parameter encoding and decoding devices of FIGS. 1 and 2, using the codebook according to one of FIGS. 3 to 9, and the audio signal encoding and decoding devices of FIGS. 11 and 12 to which those encoding and decoding methods are applied.
- A computer embodying the present invention comprises a modem 410 connected to a communication network, an input/output interface 420 for inputting and outputting audio signals, a buffer memory 430 for temporarily storing digital audio signals or audio signal codes, a random access memory (RAM) 440 in which the encoding and decoding processes are executed, a central processing unit (CPU) 450 that controls data input/output and program execution, a hard disk 460 that stores the encoding and decoding programs, and a driving device 470 for driving a recording medium 470M, all connected to one another by a common bus 480.
- As the recording medium 470M, a compact disc (CD), a digital video disc (DVD), a magneto-optical disk (MO), a memory card, or any other type of recording medium may be used.
- The hard disk 460 stores a program that expresses, as a processing procedure for a computer, the encoding method and the decoding method implemented in the audio signal encoding device and decoding device shown in FIGS. 11 and 12.
- The program includes, as subroutines, programs that execute the acoustic parameter encoding and decoding shown in FIGS. 1 and 2.
- When encoding an input audio signal, the CPU 450 reads the audio signal encoding program from the hard disk 460 into the RAM 440 and encodes the audio signal fetched into the buffer memory 430 via the input/output interface 420, frame by frame, by processing it in the RAM 440 according to the encoding program. The obtained code is transmitted as encoded audio signal data to the communication network via the modem 410, for example. Alternatively, it is temporarily stored on the hard disk 460 or written to the recording medium 470M by the recording medium driving device 470.
- When decoding input encoded audio signal data, the CPU 450 reads the decoding program from the hard disk 460 into the RAM 440.
- the acoustic code data is downloaded from the communication network to the buffer memory 430 via the modem 410, or is read from the recording medium 470M into the buffer memory 430 by the driving device 470.
- the obtained audio signal data is output from the input / output interface 420.
- As a representative example of the effect, the quantization performance of the acoustic parameter coding device is shown for the case where the vector C0 and the zero vector z for the silent section are embedded in the codebook according to the present invention, and for the case where, as in the conventional codebook, the vector C0 is not embedded.
- the vertical axis represents cepstrum distortion, which corresponds to logarithmic spectrum distortion, and is expressed in decibels (dB). The smaller the cepstrum distortion, the better the quantization performance.
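The text does not state how the cepstrum distortion is computed. One commonly used definition, given here only as an assumption, is CD = (10 / ln 10) * sqrt(2 * sum_k (c_k - c'_k)^2) in dB, taken over the cepstral coefficients from the first order upward. The function name is hypothetical.

```python
import math


def cepstral_distortion_db(c_ref, c_test):
    """Cepstral distortion in dB between two cepstrum vectors (0th coefficient excluded)."""
    squared = sum((a - b) ** 2 for a, b in zip(c_ref, c_test))
    return (10.0 / math.log(10.0)) * math.sqrt(2.0 * squared)
```

Averaging this value over the frames of each section type would give per-section figures of the kind compared here.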
- The average distortion is calculated over all sections (Total), over sections other than the silent sections and the stationary sections of speech (Mode 0), and over the stationary sections of speech (Mode 1).
- According to the present invention, in coding in which the quantized parameter is the weighted sum of the code vector of the current frame and code vectors output in the past, or that sum plus an average vector obtained in advance, a parameter vector corresponding to a silent section or a stationary noise section, or the vector obtained by subtracting the average vector from that parameter vector, can be selected as a code vector and its code can be output. An encoding and decoding method, and a device therefor, with less quality deterioration in these sections can therefore be provided.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Priority Applications (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU2002224116A AU2002224116A1 (en) | 2000-11-27 | 2001-11-27 | Method, device and program for coding and decoding acoustic parameter, and method, device and program for coding and decoding sound |
CA002430111A CA2430111C (en) | 2000-11-27 | 2001-11-27 | Speech parameter coding and decoding methods, coder and decoder, and programs, and speech coding and decoding methods, coder and decoder, and programs |
DE60126149T DE60126149T8 (de) | 2000-11-27 | 2001-11-27 | Verfahren, einrichtung und programm zum codieren und decodieren eines akustischen parameters und verfahren, einrichtung und programm zum codieren und decodieren von klängen |
EP01997802A EP1353323B1 (en) | 2000-11-27 | 2001-11-27 | Method, device and program for coding and decoding acoustic parameter, and method, device and program for coding and decoding sound |
KR1020037006956A KR100566713B1 (ko) | 2000-11-27 | 2001-11-27 | 음향 파라미터 부호화, 복호화 방법, 장치 및 프로그램, 음성 부호화, 복호화 방법, 장치 및 프로그램 |
US10/432,722 US7065338B2 (en) | 2000-11-27 | 2001-11-27 | Method, device and program for coding and decoding acoustic parameter, and method, device and program for coding and decoding sound |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2000-359311 | 2000-11-27 | ||
JP2000359311 | 2000-11-27 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2002043052A1 (en) | 2002-05-30 |
Family
ID=18831092
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2001/010332 WO2002043052A1 (en) | 2000-11-27 | 2001-11-27 | Method, device and program for coding and decoding acoustic parameter, and method, device and program for coding and decoding sound |
Country Status (9)
Country | Link |
---|---|
US (1) | US7065338B2 (ja) |
EP (1) | EP1353323B1 (ja) |
KR (1) | KR100566713B1 (ja) |
CN (1) | CN1202514C (ja) |
AU (1) | AU2002224116A1 (ja) |
CA (1) | CA2430111C (ja) |
CZ (1) | CZ304212B6 (ja) |
DE (1) | DE60126149T8 (ja) |
WO (1) | WO2002043052A1 (ja) |
Families Citing this family (28)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7315815B1 (en) | 1999-09-22 | 2008-01-01 | Microsoft Corporation | LPC-harmonic vocoder with superframe structure |
KR100527002B1 (ko) * | 2003-02-26 | 2005-11-08 | 한국전자통신연구원 | 음성 신호의 에너지 분포 특성을 고려한 쉐이핑 장치 및 방법 |
US7463172B2 (en) * | 2004-03-03 | 2008-12-09 | Japan Science And Technology Agency | Signal processing device and method, signal processing program, and recording medium where the program is recorded |
US7668712B2 (en) * | 2004-03-31 | 2010-02-23 | Microsoft Corporation | Audio encoding and decoding with intra frames and adaptive forward error correction |
US7707034B2 (en) * | 2005-05-31 | 2010-04-27 | Microsoft Corporation | Audio codec post-filter |
US7831421B2 (en) * | 2005-05-31 | 2010-11-09 | Microsoft Corporation | Robust decoder |
US7177804B2 (en) * | 2005-05-31 | 2007-02-13 | Microsoft Corporation | Sub-band voice codec with multi-stage codebooks and redundant coding |
WO2007129726A1 (ja) * | 2006-05-10 | 2007-11-15 | Panasonic Corporation | 音声符号化装置及び音声符号化方法 |
JPWO2007132750A1 (ja) * | 2006-05-12 | 2009-09-24 | パナソニック株式会社 | Lspベクトル量子化装置、lspベクトル逆量子化装置、およびこれらの方法 |
US8396158B2 (en) * | 2006-07-14 | 2013-03-12 | Nokia Corporation | Data processing method, data transmission method, data reception method, apparatus, codebook, computer program product, computer program distribution medium |
US8036767B2 (en) | 2006-09-20 | 2011-10-11 | Harman International Industries, Incorporated | System for extracting and changing the reverberant content of an audio input signal |
US8055192B2 (en) * | 2007-06-25 | 2011-11-08 | Samsung Electronics Co., Ltd. | Method of feeding back channel information and receiver for feeding back channel information |
CN101335004B (zh) * | 2007-11-02 | 2010-04-21 | 华为技术有限公司 | 一种多级量化的方法及装置 |
CN100578619C (zh) * | 2007-11-05 | 2010-01-06 | 华为技术有限公司 | 编码方法和编码器 |
US20090123523A1 (en) * | 2007-11-13 | 2009-05-14 | G. Coopersmith Llc | Pharmaceutical delivery system |
US20090129605A1 (en) * | 2007-11-15 | 2009-05-21 | Sony Ericsson Mobile Communications Ab | Apparatus and methods for augmenting a musical instrument using a mobile terminal |
EP2246845A1 (en) * | 2009-04-21 | 2010-11-03 | Siemens Medical Instruments Pte. Ltd. | Method and acoustic signal processing device for estimating linear predictive coding coefficients |
WO2011044064A1 (en) * | 2009-10-05 | 2011-04-14 | Harman International Industries, Incorporated | System for spatial extraction of audio signals |
CN102623012B (zh) * | 2011-01-26 | 2014-08-20 | 华为技术有限公司 | 矢量联合编解码方法及编解码器 |
SG11201510162WA (en) | 2013-06-10 | 2016-01-28 | Fraunhofer Ges Forschung | Apparatus and method for audio signal envelope encoding, processing and decoding by modelling a cumulative sum representation employing distribution quantization and coding |
CN103474075B (zh) * | 2013-08-19 | 2016-12-28 | 科大讯飞股份有限公司 | 语音信号发送方法及系统、接收方法及系统 |
US9432360B1 (en) * | 2013-12-31 | 2016-08-30 | Emc Corporation | Security-aware split-server passcode verification for one-time authentication tokens |
US9454654B1 (en) * | 2013-12-31 | 2016-09-27 | Emc Corporation | Multi-server one-time passcode verification on respective high order and low order passcode portions |
US9407631B1 (en) * | 2013-12-31 | 2016-08-02 | Emc Corporation | Multi-server passcode verification for one-time authentication tokens with auxiliary channel compatibility |
PL3098812T3 (pl) * | 2014-01-24 | 2019-02-28 | Nippon Telegraph And Telephone Corporation | Urządzenie, sposób i program do analizy liniowo-predykcyjnej oraz nośnik zapisu |
EP3252758B1 (en) * | 2015-01-30 | 2020-03-18 | Nippon Telegraph and Telephone Corporation | Encoding apparatus, decoding apparatus, and methods, programs and recording media for encoding apparatus and decoding apparatus |
US9602127B1 (en) * | 2016-02-11 | 2017-03-21 | Intel Corporation | Devices and methods for pyramid stream encoding |
CN113593527B (zh) * | 2021-08-02 | 2024-02-20 | 北京有竹居网络技术有限公司 | 一种生成声学特征、语音模型训练、语音识别方法及装置 |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0573097A (ja) * | 1991-09-17 | 1993-03-26 | Nippon Telegr & Teleph Corp <Ntt> | 低遅延符号駆動形予測符号化方法 |
JPH05113800A (ja) * | 1991-10-22 | 1993-05-07 | Nippon Telegr & Teleph Corp <Ntt> | 音声符号化法 |
JPH06118999A (ja) * | 1992-10-02 | 1994-04-28 | Nippon Telegr & Teleph Corp <Ntt> | 音声のパラメータ情報符号化法 |
JPH06175695A (ja) * | 1992-12-01 | 1994-06-24 | Nippon Telegr & Teleph Corp <Ntt> | 音声パラメータの符号化方法および復号方法 |
JPH06282298A (ja) * | 1993-03-29 | 1994-10-07 | Nippon Telegr & Teleph Corp <Ntt> | 音声の符号化方法 |
JPH0844400A (ja) * | 1994-05-27 | 1996-02-16 | Toshiba Corp | ベクトル量子化装置 |
JPH11136133A (ja) * | 1997-10-28 | 1999-05-21 | Matsushita Electric Ind Co Ltd | ベクトル量子化法 |
Family Cites Families (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4896361A (en) * | 1988-01-07 | 1990-01-23 | Motorola, Inc. | Digital speech coder having improved vector excitation source |
JPH0451199A (ja) * | 1990-06-18 | 1992-02-19 | Fujitsu Ltd | 音声符号化・復号化方式 |
EP0500961B1 (en) * | 1990-09-14 | 1998-04-29 | Fujitsu Limited | Voice coding system |
US5271089A (en) * | 1990-11-02 | 1993-12-14 | Nec Corporation | Speech parameter encoding method capable of transmitting a spectrum parameter at a reduced number of bits |
JP3151874B2 (ja) * | 1991-02-26 | 2001-04-03 | 日本電気株式会社 | 音声パラメータ符号化方式および装置 |
US5396576A (en) * | 1991-05-22 | 1995-03-07 | Nippon Telegraph And Telephone Corporation | Speech coding and decoding methods using adaptive and random code books |
US5717824A (en) * | 1992-08-07 | 1998-02-10 | Pacific Communication Sciences, Inc. | Adaptive speech coder having code excited linear predictor with multiple codebook searches |
US5457783A (en) * | 1992-08-07 | 1995-10-10 | Pacific Communication Sciences, Inc. | Adaptive speech coder having code excited linear prediction |
US5727122A (en) * | 1993-06-10 | 1998-03-10 | Oki Electric Industry Co., Ltd. | Code excitation linear predictive (CELP) encoder and decoder and code excitation linear predictive coding method |
US5819213A (en) * | 1996-01-31 | 1998-10-06 | Kabushiki Kaisha Toshiba | Speech encoding and decoding with pitch filter range unrestricted by codebook range and preselecting, then increasing, search candidates from linear overlap codebooks |
KR100527217B1 (ko) | 1997-10-22 | 2005-11-08 | 마츠시타 덴끼 산교 가부시키가이샤 | 확산 벡터 생성 방법, 확산 벡터 생성 장치, celp형 음성 복호화 방법 및 celp형 음성 복호화 장치 |
US6240386B1 (en) | 1998-08-24 | 2001-05-29 | Conexant Systems, Inc. | Speech codec employing noise classification for noise compensation |
DE69941999D1 (de) * | 1998-10-09 | 2010-03-25 | Sony Corp | Erkennungsvorrichtung, Erkennungsverfahren und Aufzeichnungsmedium |
2001
- 2001-11-27 CN CNB018218296A patent/CN1202514C/zh not_active Expired - Fee Related
- 2001-11-27 US US10/432,722 patent/US7065338B2/en not_active Expired - Fee Related
- 2001-11-27 KR KR1020037006956A patent/KR100566713B1/ko not_active IP Right Cessation
- 2001-11-27 AU AU2002224116A patent/AU2002224116A1/en not_active Abandoned
- 2001-11-27 EP EP01997802A patent/EP1353323B1/en not_active Expired - Lifetime
- 2001-11-27 WO PCT/JP2001/010332 patent/WO2002043052A1/ja active IP Right Grant
- 2001-11-27 CA CA002430111A patent/CA2430111C/en not_active Expired - Fee Related
- 2001-11-27 CZ CZ2003-1465A patent/CZ304212B6/cs not_active IP Right Cessation
- 2001-11-27 DE DE60126149T patent/DE60126149T8/de active Active
Non-Patent Citations (1)
Title |
---|
See also references of EP1353323A4 * |
Also Published As
Publication number | Publication date |
---|---|
KR20030062354A (ko) | 2003-07-23 |
DE60126149D1 (de) | 2007-03-08 |
CZ20031465A3 (cs) | 2003-08-13 |
DE60126149T2 (de) | 2007-10-18 |
CA2430111C (en) | 2009-02-24 |
US7065338B2 (en) | 2006-06-20 |
CN1486486A (zh) | 2004-03-31 |
EP1353323A4 (en) | 2005-06-08 |
KR100566713B1 (ko) | 2006-04-03 |
CA2430111A1 (en) | 2002-05-30 |
AU2002224116A1 (en) | 2002-06-03 |
EP1353323B1 (en) | 2007-01-17 |
EP1353323A1 (en) | 2003-10-15 |
DE60126149T8 (de) | 2008-01-31 |
CZ304212B6 (cs) | 2014-01-08 |
US20040023677A1 (en) | 2004-02-05 |
CN1202514C (zh) | 2005-05-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2002043052A1 (en) | Method, device and program for coding and decoding acoustic parameter, and method, device and program for coding and decoding sound | |
US5884253A (en) | Prototype waveform speech coding with interpolation of pitch, pitch-period waveforms, and synthesis filter | |
EP1339040B1 (en) | Vector quantizing device for lpc parameters | |
JP3557662B2 (ja) | 音声符号化方法及び音声復号化方法、並びに音声符号化装置及び音声復号化装置 | |
JPH09127991A (ja) | 音声符号化方法及び装置、音声復号化方法及び装置 | |
JPH1091194A (ja) | 音声復号化方法及び装置 | |
JPH0353300A (ja) | 音声符号化装置 | |
JP3582589B2 (ja) | 音声符号化装置及び音声復号化装置 | |
JP3353852B2 (ja) | 音声の符号化方法 | |
JP3916934B2 (ja) | 音響パラメータ符号化、復号化方法、装置及びプログラム、音響信号符号化、復号化方法、装置及びプログラム、音響信号送信装置、音響信号受信装置 | |
JP3268750B2 (ja) | 音声合成方法及びシステム | |
JP3299099B2 (ja) | 音声符号化装置 | |
JP3153075B2 (ja) | 音声符号化装置 | |
JP3144284B2 (ja) | 音声符号化装置 | |
JP2001318698A (ja) | 音声符号化装置及び音声復号化装置 | |
JP3063087B2 (ja) | 音声符号化復号化装置及び音声符号化装置ならびに音声復号化装置 | |
JP3006790B2 (ja) | 音声符号化復号化方法及びその装置 | |
JPH0519796A (ja) | 音声の励振信号符号化・復号化方法 | |
JPH11259098A (ja) | 音声符号化/復号化方法 | |
JP2002073097A (ja) | Celp型音声符号化装置とcelp型音声復号化装置及び音声符号化方法と音声復号化方法 | |
JP3192051B2 (ja) | 音声符号化装置 | |
JP3024467B2 (ja) | 音声符号化装置 | |
JP3092436B2 (ja) | 音声符号化装置 | |
JP2947788B1 (ja) | 音声および音響信号の高速な符号化方法および装置および記録媒体 | |
JP2000029499A (ja) | 音声符号化装置ならびに音声符号化復号化装置 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | AK | Designated states | Kind code of ref document: A1 Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZM ZW |
| | AL | Designated countries for regional patents | Kind code of ref document: A1 Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
| | DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | |
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
| | WWE | Wipo information: entry into national phase | Ref document number: 1020037006956 Country of ref document: KR; Ref document number: 2001997802 Country of ref document: EP |
| | WWE | Wipo information: entry into national phase | Ref document number: PV2003-1465 Country of ref document: CZ; Ref document number: 664/KOLNP/2003 Country of ref document: IN |
| | WWE | Wipo information: entry into national phase | Ref document number: 10432722 Country of ref document: US; Ref document number: 2430111 Country of ref document: CA |
| | WWE | Wipo information: entry into national phase | Ref document number: 018218296 Country of ref document: CN |
| | WWP | Wipo information: published in national office | Ref document number: 1020037006956 Country of ref document: KR |
| | WWP | Wipo information: published in national office | Ref document number: PV2003-1465 Country of ref document: CZ |
| | REG | Reference to national code | Ref country code: DE Ref legal event code: 8642 |
| | WWP | Wipo information: published in national office | Ref document number: 2001997802 Country of ref document: EP |
| | NENP | Non-entry into the national phase | Ref country code: JP |
| | WWG | Wipo information: grant in national office | Ref document number: 1020037006956 Country of ref document: KR |
| | WWG | Wipo information: grant in national office | Ref document number: 2001997802 Country of ref document: EP |