WO2006035705A1 - Scalable encoding device and scalable encoding method - Google Patents
Scalable encoding device and scalable encoding method
- Publication number
- WO2006035705A1 (PCT/JP2005/017618)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- channel
- encoding
- signal
- parameter
- monaural
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/08—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
- G10L19/12—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/16—Vocoder architecture
- G10L19/18—Vocoders using multiple modes
- G10L19/24—Variable rate codecs, e.g. for generating different qualities using a scalable representation such as hierarchical encoding or layered encoding
Definitions
- The present invention relates to a scalable encoding device and a scalable encoding method that realize scalable coding of a stereo speech signal by means of CELP-based coding (hereinafter sometimes simply called CELP coding).
- Non-Patent Document 1 discloses an example of a scalable coding device having this function.
- Non-Patent Document 1: ISO/IEC 14496-3:1999 (B.14 Scalable AAC with core coder)
- However, Non-Patent Document 1 does not show a specific configuration for applying CELP coding, particularly in the enhancement layer. Even if CELP coding that was optimized for a different, unanticipated speech signal is applied as is, it is difficult to obtain the desired coding efficiency.
- An object of the present invention is therefore to provide a scalable encoding device and a scalable encoding method that realize scalable coding of a stereo speech signal by CELP coding and improve coding efficiency.
- The scalable encoding device includes: a generation unit that generates a monaural speech signal from a stereo speech signal; a first encoding unit that encodes the monaural speech signal by the CELP method to obtain encoding parameters of the monaural speech signal; and a second encoding unit that takes the R channel or the L channel of the stereo speech signal as the channel to be encoded, obtains the difference between the parameters found by performing linear prediction analysis and adaptive excitation codebook search on that channel and the encoding parameters of the monaural speech signal, and obtains the encoding parameters of the channel to be encoded from that difference.
- FIG. 1 is a block diagram showing the main configuration of the scalable coding apparatus according to Embodiment 1.
- FIG. 2 is a diagram showing the relationship between a monaural signal, a first channel signal, and a second channel signal.
- FIG. 3 is a block diagram showing the main configuration inside the CELP coding section according to Embodiment 1.
- FIG. 4 is a block diagram showing the main configuration inside the first channel difference information coding section according to Embodiment 1.
- FIG. 5 is a block diagram showing the main configuration of the scalable coding apparatus according to Embodiment 2.
- FIG. 6 is a block diagram showing the main configuration inside the second channel difference information coding section according to Embodiment 2.
- FIG. 1 is a block diagram showing the main configuration of scalable coding apparatus 100 according to Embodiment 1 of the present invention.
- Scalable coding apparatus 100 includes adder 101, multiplier 102, CELP coding section 103, and first channel difference information coding section 104.
- Each unit of the scalable coding apparatus 100 performs the following operations.
- Adder 101 adds first channel signal CH1 and second channel signal CH2 input to scalable coding apparatus 100, and generates a sum signal.
- Multiplier 102 multiplies the sum signal by 1/2 to halve its scale and generates monaural signal M. That is, adder 101 and multiplier 102 obtain the average of first channel signal CH1 and second channel signal CH2 and use it as monaural signal M.
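The averaging performed by adder 101 and multiplier 102 can be sketched as follows. This is a minimal illustration, assuming each channel is a list of floating-point samples of equal length; the function name `downmix_to_mono` is hypothetical and not taken from the source.

```python
def downmix_to_mono(ch1, ch2):
    """Average two equal-length channel sample sequences into a monaural signal."""
    if len(ch1) != len(ch2):
        raise ValueError("channels must have the same number of samples")
    # Adder 101: sample-wise sum. Multiplier 102: scale the sum by 1/2.
    return [0.5 * (a + b) for a, b in zip(ch1, ch2)]


# Illustrative sample values (assumed, not from the source).
ch1 = [0.5, -0.25, 1.0]
ch2 = [0.25, 0.25, -1.0]
mono = downmix_to_mono(ch1, ch2)
```
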
- CELP coding section 103 performs CELP coding on monaural signal M and outputs the resulting CELP coding parameters of the monaural signal to the outside of scalable coding apparatus 100 and to first channel difference information coding section 104.
- The CELP coding parameters are the LSP parameters, the adaptive excitation codebook index, the adaptive excitation gain, the fixed excitation codebook index, and the fixed excitation gain.
- First channel difference information coding section 104 encodes first channel signal CH1 input to scalable coding apparatus 100 in a manner conforming to CELP coding, that is, by linear prediction analysis, adaptive excitation codebook search, and fixed excitation codebook search, and obtains the difference between the coding parameters found in this process and the CELP coding parameters output from CELP coding section 103. If this coding is also simply called CELP coding, the above processing is equivalent to taking the difference between monaural signal M and first channel signal CH1 at the level (stage) of the CELP coding parameters. First channel difference information coding section 104 then also encodes this difference information for the first channel (first channel difference information) and outputs the resulting coding parameters of the first channel difference information to the outside of scalable coding apparatus 100.
- One feature of scalable coding apparatus 100 is that adder 101, multiplier 102, and CELP coding section 103 constitute the first layer and first channel difference information coding section 104 constitutes the second layer. The first layer outputs the coding parameters of the monaural signal, and the second layer outputs coding parameters from which a stereo signal can be obtained when they are decoded together with the first layer (monaural signal) coding parameters. That is, the scalable coding apparatus according to the present embodiment realizes scalable coding made up of a monaural signal and a stereo signal.
- The decoding apparatus that acquires the coding parameters made up of the first layer and second layer described above may be a scalable decoding apparatus that supports both stereo communication and monaural communication, or a decoding apparatus that supports only monaural communication.
- Even a scalable decoding apparatus that supports both stereo communication and monaural communication may be unable to acquire the second layer coding parameters because of deterioration of the transmission path environment, and may acquire only the first layer coding parameters. Even in that case, the scalable decoding apparatus can decode a monaural signal, although at lower quality.
- When the scalable decoding apparatus can acquire both the first layer and second layer coding parameters, it can decode a high-quality stereo signal using both.
- FIG. 2 is a diagram showing the relationship between the monaural signal, the first channel signal, and the second channel signal, comparing the states before and after coding.
- Monaural signal M is obtained by multiplying the sum of first channel signal CH1 and second channel signal CH2 by 1/2, that is, by (Equation 1): $M = \frac{1}{2}(\mathrm{CH1} + \mathrm{CH2})$.
- Second channel signal CH2 satisfies (Equation 3), $\mathrm{CH2} = M + \Delta\mathrm{CH2}$, where ΔCH2 is the difference between CH2 and monaural signal M (second channel signal difference).
- (Equation 4) means that the first channel difference information and the second channel difference information after coding are approximated as equal in magnitude; in other words, the coding distortion of the two channels is approximated as equal when the first channel and the second channel are coded. In practice, these coding distortions do not differ greatly even in an actual device. Coding while ignoring the difference in coding distortion between the first channel and the second channel can therefore be considered not to cause a large deterioration in the sound quality of the decoded signal.
- Using the above principle, scalable coding apparatus 100 outputs two sets of coding parameters, those of M and those of ΔCH1.
- A decoding apparatus that obtains them can decode not only CH1 but also CH2 by decoding M and ΔCH1.
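The algebra behind recovering both channels from M and ΔCH1 can be sketched as follows. This is only a waveform-level illustration of the relationship CH2 = M − ΔCH1, which follows from M being the average of the two channels; the source takes the difference at the CELP parameter level, not on the waveform, and the function names here are hypothetical.

```python
def split_difference(ch1, ch2):
    """Return the monaural average M and the first channel difference CH1 - M."""
    mono = [0.5 * (a + b) for a, b in zip(ch1, ch2)]
    delta_ch1 = [a - m for a, m in zip(ch1, mono)]
    return mono, delta_ch1


def reconstruct_stereo(mono, delta_ch1):
    """Rebuild both channels from M and the CH1 difference only."""
    ch1 = [m + d for m, d in zip(mono, delta_ch1)]
    # Because M is the average, the CH2 difference is the negative of the CH1 difference.
    ch2 = [m - d for m, d in zip(mono, delta_ch1)]
    return ch1, ch2


# Illustrative sample values (assumed, not from the source).
mono, delta_ch1 = split_difference([1.0, 0.5, -0.25], [0.5, -0.5, 0.75])
rec_ch1, rec_ch2 = reconstruct_stereo(mono, delta_ch1)
```

Without quantization the reconstruction is exact; once M and ΔCH1 are coded, the second channel is recovered only approximately, which is the approximation the text describes.
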
- FIG. 3 is a block diagram showing the main configuration inside CELP coding section 103.
- CELP coding section 103 includes LPC analysis section 111, LPC quantization section 112, LPC synthesis filter 113, adder 114, perceptual weighting section 115, distortion minimizing section 116, adaptive excitation codebook 117, multiplier 118, fixed excitation codebook 119, multiplier 120, gain codebook 121, and adder 122.
- LPC analysis section 111 performs linear prediction analysis on monaural signal M output from multiplier 102 and outputs the LPC parameters obtained as the analysis result to LPC quantization section 112 and perceptual weighting section 115.
- LPC quantization section 112 converts the LPC parameters output from LPC analysis section 111 into LSP parameters, which are suitable for quantization, quantizes them, and outputs the resulting quantized LSP parameters (C) to the outside of CELP coding section 103.
- These quantized LSP parameters are one of the CELP coding parameters obtained by CELP coding section 103. LPC quantization section 112 also converts the quantized LSP parameters back into quantized LPC parameters and outputs them to LPC synthesis filter 113.
- LPC synthesis filter 113 uses the quantized LPC parameters output from LPC quantization section 112 to perform synthesis as an LPC synthesis filter, taking as its driving excitation the excitation vector generated by adaptive excitation codebook 117 and fixed excitation codebook 119, which are described later.
- The resulting synthesized signal is output to adder 114.
- Adder 114 calculates an error signal by inverting the polarity of the synthesized signal output from LPC synthesis filter 113 and adding it to monaural signal M, and outputs this error signal to perceptual weighting section 115.
- This error signal corresponds to coding distortion.
- Perceptual weighting section 115 applies perceptual weighting to the coding distortion output from adder 114, using a perceptual weighting filter configured from the LPC parameters output from LPC analysis section 111, and outputs the weighted signal to distortion minimizing section 116.
- Distortion minimizing section 116 specifies the parameters for adaptive excitation codebook 117, fixed excitation codebook 119, and gain codebook 121 so that the coding distortion output from perceptual weighting section 115 is minimized. Specifically, distortion minimizing section 116 indicates the indices to use (C 1, C 2, C 3) to adaptive excitation codebook 117, fixed excitation codebook 119, and gain codebook 121.
- Adaptive excitation codebook 117 stores in an internal buffer the excitation vectors of the driving excitation for LPC synthesis filter 113 that were generated in the past. Based on the adaptive excitation lag corresponding to the index indicated by distortion minimizing section 116, it generates one subframe from the stored excitation vectors and outputs it to multiplier 118 as the adaptive excitation vector.
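How an adaptive excitation vector might be produced from the stored past excitation can be sketched as below. This is a simplified illustration assuming integer lags and a plain list as the internal buffer; `adaptive_excitation_vector` is a hypothetical name, and real CELP coders typically also support fractional lags, which are not shown.

```python
def adaptive_excitation_vector(past_excitation, lag, subframe_len):
    """Build one subframe by copying the excitation from `lag` samples back.

    When the lag is shorter than the subframe, the copied segment repeats
    periodically, because newly produced samples are fed back into the history.
    """
    if lag < 1 or lag > len(past_excitation):
        raise ValueError("lag must lie within the stored excitation history")
    history = list(past_excitation)
    subframe = []
    for _ in range(subframe_len):
        sample = history[len(history) - lag]
        subframe.append(sample)
        history.append(sample)
    return subframe


# Illustrative buffer contents (assumed, not from the source).
buffer = [1.0, 2.0, 3.0, 4.0, 5.0]
short_lag = adaptive_excitation_vector(buffer, lag=2, subframe_len=5)
long_lag = adaptive_excitation_vector(buffer, lag=4, subframe_len=3)
```
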
- Fixed excitation codebook 119 outputs the excitation vector corresponding to the index instructed from distortion minimizing section 116 to multiplier 120 as a fixed excitation vector.
- Gain codebook 121 generates the gains corresponding to the index indicated by distortion minimizing section 116, specifically the gain for the adaptive excitation vector from adaptive excitation codebook 117 and the gain for the fixed excitation vector from fixed excitation codebook 119, and outputs them to multipliers 118 and 120, respectively.
- Multiplier 118 multiplies the adaptive excitation gain output from gain codebook 121 by the adaptive excitation vector output from adaptive excitation codebook 117 and outputs the result to adder 122.
- Multiplier 120 multiplies the fixed excitation vector output from fixed excitation codebook 119 by the fixed excitation gain output from gain codebook 121 and outputs the result to adder 122.
- Adder 122 adds the adaptive excitation vector output from multiplier 118 and the fixed excitation vector output from multiplier 120, and outputs the summed excitation vector to LPC synthesis filter 113 as the driving excitation. Adder 122 also feeds the resulting driving excitation vector back to adaptive excitation codebook 117.
- LPC synthesis filter 113 then performs synthesis as an LPC synthesis filter, using as its driving excitation the excitation vector output from adder 122, that is, the excitation vector generated by adaptive excitation codebook 117 and fixed excitation codebook 119.
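The path from the two codebook vectors to the synthesized signal can be sketched as follows. This is a minimal illustration under stated assumptions: the synthesis filter is taken to be the all-pole filter 1/A(z) with A(z) = 1 + a_1 z^-1 + ... + a_p z^-p, the filter memory starts at zero, and the function names are hypothetical. The sign convention for the coefficients is an assumption, since the source does not give one.

```python
def driving_excitation(adaptive_vec, fixed_vec, adaptive_gain, fixed_gain):
    """Multipliers 118/120 and adder 122: gain-scale both vectors and sum them."""
    return [adaptive_gain * a + fixed_gain * f
            for a, f in zip(adaptive_vec, fixed_vec)]


def lpc_synthesis(excitation, lpc_coeffs):
    """All-pole synthesis: s[n] = e[n] - sum_k a_k * s[n - k], zero initial state."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc_coeffs, start=1):
            if n - k >= 0:
                acc -= a * out[n - k]
        out.append(acc)
    return out


# Illustrative values (assumed, not from the source).
excitation = driving_excitation([1.0, 0.0], [0.0, 2.0], 0.5, 0.25)
impulse_response = lpc_synthesis([1.0, 0.0, 0.0, 0.0], [-0.5])
```
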
- FIG. 4 is a block diagram showing the main configuration inside first channel difference information coding section 104.
- First channel difference information coding section 104 encodes the excitation component parameters and the spectral envelope component parameter of first channel signal CH1 as differences from monaural signal M.
- The excitation component parameters are the adaptive excitation codebook index, the adaptive excitation gain, the fixed excitation codebook index, and the fixed excitation gain.
- The spectral envelope component parameter is the LPC parameter obtained by LPC analysis.
- In first channel difference information coding section 104, LPC analysis section 131, LPC synthesis filter 133, adder 134, perceptual weighting section 135, distortion minimizing section 136, multiplier 138, multiplier 140, and adder 142 have the same configuration as LPC analysis section 111, LPC synthesis filter 113, adder 114, perceptual weighting section 115, distortion minimizing section 116, multiplier 118, multiplier 120, and adder 122 in CELP coding section 103, so their description is omitted. The components that differ from CELP coding section 103 are described in detail below.
- Differential quantization section 132 obtains the difference between LPC parameter $\omega_1(i)$ of first channel signal CH1 found by LPC analysis section 131 and the LPC parameter (C) of monaural signal M already found by CELP coding section 103, and quantizes this difference to obtain the difference parameter for the first channel.
- Differential quantization section 132 also outputs the quantized LPC parameter of the first channel signal to LPC synthesis filter 133.
- Based on the gain codebook index for the monaural signal output from CELP coding section 103, gain codebook 143 generates the corresponding adaptive excitation gain and fixed excitation gain and outputs them to multipliers 138 and 140, respectively.
- Adaptive excitation codebook 137 stores in an internal buffer the driving excitation generated in past subframes. In voiced speech, the driving excitation of the current frame correlates strongly with the driving excitation one pitch period earlier, so adaptive excitation codebook 137 cuts out the past driving excitation and uses the signal obtained by repeating it periodically as a first approximation of the driving excitation. Adaptive excitation codebook 137 encodes this pitch period, that is, the adaptive excitation lag. In particular, it encodes the pitch period of CH1 as a difference from the pitch period of monaural signal M already encoded by CELP coding section 103.
- The reason is as follows. Monaural signal M is generated from first channel signal CH1 and second channel signal CH2 and is therefore considered highly similar to CH1. Coding efficiency is thus considered higher when the pitch period of CH1 is expressed as a difference from the pitch period found for monaural signal M than when a fresh adaptive excitation codebook search for CH1 is coded on its own.
- Specifically, using pitch period $T_M$ already found for the monaural signal and difference parameter $\Delta T_1$ from that value, pitch period $T_1$ of CH1 is expressed by (Equation 6), $T_1 = T_M + \Delta T_1$, and the difference parameter $\Delta T_1$ obtained when the optimum $T_1$ is found by adaptive excitation codebook search for CH1 is encoded.
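A differential lag search of this kind can be sketched as below. This is a simplified illustration under several assumptions that are not in the source: lags are integers, the search window is ±`max_delta` around the monaural lag, and the match criterion is the squared correlation divided by the candidate energy on the raw excitation (a real coder would compare perceptually weighted synthesized signals). All names are hypothetical.

```python
def adaptive_vector(past_excitation, lag, length):
    """Repeat the excitation from `lag` samples back to fill `length` samples."""
    history = list(past_excitation)
    out = []
    for _ in range(length):
        sample = history[len(history) - lag]
        out.append(sample)
        history.append(sample)
    return out


def search_lag_difference(target, past_excitation, mono_lag, max_delta):
    """Return the lag difference around mono_lag that best matches the target."""
    best_delta, best_score = None, -1.0
    for delta in range(-max_delta, max_delta + 1):
        lag = mono_lag + delta
        if lag < 1 or lag > len(past_excitation):
            continue
        candidate = adaptive_vector(past_excitation, lag, len(target))
        corr = sum(t * c for t, c in zip(target, candidate))
        energy = sum(c * c for c in candidate)
        if energy == 0.0:
            continue  # an all-zero candidate cannot be scored
        score = corr * corr / energy
        if score > best_score:
            best_delta, best_score = delta, score
    return best_delta


# Illustrative data (assumed): an excitation with a true period of 5 samples,
# while the monaural lag is taken to be 6.
past = [1.0, 0.0, 0.0, 0.0, 0.0] * 4
target = [1.0, 0.0, 0.0, 0.0, 0.0]
delta_t1 = search_lag_difference(target, past, mono_lag=6, max_delta=2)
ch1_lag = 6 + delta_t1
```

Only `delta_t1` would need to be transmitted in this sketch, since the decoder already knows the monaural lag.
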
- Fixed excitation codebook 139 generates an excitation signal representing the residual component of the current frame's excitation, that is, the component that cannot be approximated by the excitation signal that adaptive excitation codebook 137 generates from past excitation. This residual component contributes relatively less to the synthesized signal than the component generated by adaptive excitation codebook 137. As already mentioned, monaural signal M and first channel signal CH1 are highly similar. Fixed excitation codebook 139 therefore uses the fixed excitation codebook index for monaural signal M used in fixed excitation codebook 119 as the fixed excitation codebook index of CH1. This corresponds to making the fixed excitation vector of CH1 the same signal as the fixed excitation vector of the monaural signal.
- Gain codebook 141 specifies the gain of the adaptive excitation vector for CH1 by two parameters: the adaptive excitation gain for the monaural signal and a coefficient by which this adaptive excitation gain is multiplied. The same applies to the gain of the fixed excitation vector for CH1, which gain codebook 141 specifies by the fixed excitation gain for the monaural signal and a coefficient by which this fixed excitation gain is multiplied. These two coefficients are set to a common gain multiplier value $\gamma_1$, which is output to multiplier 144. $\gamma_1$ is determined by selecting the optimum gain index from a CH1 gain codebook prepared in advance so that the error between the CH1 synthesized signal and the CH1 original signal is minimized.
- Multiplier 144 multiplies driving excitation ex′ output from adder 142 by $\gamma_1$ to obtain ex, and outputs it to LPC synthesis filter 133.
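Selecting the common gain multiplier from a prepared codebook can be sketched as follows. This is a minimal illustration, assuming a plain squared error without perceptual weighting, an all-pole synthesis filter with zero initial state, and a small example codebook whose values are invented for illustration; the function names are hypothetical.

```python
def lpc_synthesis(excitation, lpc_coeffs):
    """All-pole synthesis: s[n] = e[n] - sum_k a_k * s[n - k], zero initial state."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc_coeffs, start=1):
            if n - k >= 0:
                acc -= a * out[n - k]
        out.append(acc)
    return out


def choose_gain_multiplier(target, excitation, lpc_coeffs, gain_codebook):
    """Return (index, multiplier) minimizing the squared synthesis error."""
    best_index, best_error = None, float("inf")
    for index, gamma in enumerate(gain_codebook):
        # Multiplier 144: scale the driving excitation by the candidate multiplier.
        scaled = [gamma * e for e in excitation]
        synthesized = lpc_synthesis(scaled, lpc_coeffs)
        error = sum((t - s) ** 2 for t, s in zip(target, synthesized))
        if error < best_error:
            best_index, best_error = index, error
    return best_index, gain_codebook[best_index]


# Illustrative values (assumed, not from the source).
codebook = [0.5, 1.0, 1.5, 2.0]
index, gamma_1 = choose_gain_multiplier(
    target=[1.5, 0.75, 0.375],
    excitation=[1.0, 0.0, 0.0],
    lpc_coeffs=[-0.5],
    gain_codebook=codebook,
)
```
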
- Thus, in the present embodiment, a monaural signal is generated from first channel signal CH1 and second channel signal CH2 constituting a stereo signal, the monaural signal is CELP coded, and CH1 is coded as a difference from the CELP parameters of the monaural signal. Stereo signal coding with a low bit rate and good quality can therefore be realized.
- In the coding of ΔCH1, the difference parameters of the CELP coding are determined so that the error between the CH1 synthesized signal, generated from the CELP coding parameters of the monaural signal and the difference parameters relative to the monaural signal, and the CH1 original signal is minimized.
- What the second layer encodes is the difference taken at the stage of the CELP coding parameters, not the waveform difference between the monaural signal and the first channel signal.
- This is because CELP coding is fundamentally a technique that codes by modeling the human vocal cords and vocal tract, so difference information obtained by taking the difference on the waveform would have no physical correspondence with the CELP coding model. Since CELP coding of a waveform difference is therefore not expected to be efficient, the present invention takes the difference at the stage of the CELP coding parameters.
- A decoding apparatus that receives the coding parameters generated by the scalable coding apparatus according to the present embodiment obtains a decoded signal by calculation from the received coding parameters of ΔCH1 using (Equation 5).
- The present embodiment has described, as an example, the case where fixed excitation codebook 139 uses the same index as fixed excitation codebook 119, that is, where fixed excitation codebook 139 generates the same fixed excitation vector as the fixed excitation vector for the monaural signal.
- However, the present invention is not limited to this.
- For example, a fixed excitation codebook search may be performed on fixed excitation codebook 139 to obtain an additional fixed excitation codebook index for CH1. In this case the coding bit rate increases, but higher-quality CH1 coding can be realized.
- The present embodiment has also described the case where the coefficient multiplied by the adaptive excitation gain and the coefficient multiplied by the fixed excitation gain are common, as with γ output from gain codebook 141.
- However, these two coefficients need not be common.
- That is, the coefficient multiplied by the adaptive excitation gain may be set separately from the coefficient multiplied by the fixed excitation gain.
- In that case, as with the common gain, the coefficient is determined by selecting the optimum gain index from the CH1 gain codebook prepared in advance so that the error between the CH1 synthesized signal and the CH1 original signal is minimized.
- The other coefficient is determined by the same method: the optimum gain index is selected from the CH2 gain codebook prepared in advance so that the error between the CH2 synthesized signal and the CH2 original signal is minimized.
- Embodiment 1 showed the configuration of a scalable coding system in which the first channel coding distortion is approximated as equal to the second channel coding distortion and coding is performed in the first layer and the second layer.
- In this embodiment, a third layer is newly provided in order to encode CH2 with higher accuracy.
- In the third layer, the difference in coding distortion between the first channel and the second channel is encoded. More specifically, a configuration is shown in which the difference between the coding distortion included in the first channel difference information and the coding distortion included in the second channel difference information is further encoded and output as new coding information.
- In the coding of ΔCH2′, the CELP coding parameters of CH2 are estimated using both the CELP coding parameters of the monaural signal and the differential CELP parameters coded in the second layer, and correction parameters relative to these estimates are encoded.
- The correction parameters are determined so that the error between the CH2 synthesized signal generated from these parameters and the CH2 original signal is minimized.
- As in Embodiment 1, and for the same reason, CELP coding of the waveform difference itself is not performed.
- FIG. 5 is a block diagram showing the main configuration of scalable coding apparatus 200 according to Embodiment 2 of the present invention.
- Scalable coding apparatus 200 has the same basic configuration as scalable coding apparatus 100 shown in Embodiment 1; identical components are denoted by the same reference numerals, and their description is omitted.
- The new component is second channel difference information coding section 201, which constitutes the third layer.
- FIG. 6 is a block diagram showing the main configuration inside second channel difference information coding section 201.
- In second channel difference information coding section 201, LPC analysis section 211, differential quantization section 212, LPC synthesis filter 213, adder 214, perceptual weighting section 215, distortion minimizing section 216, adaptive excitation codebook 217, multiplier 218, fixed excitation codebook 219, multiplier 220, gain codebook 221, adder 222, gain codebook 223, and multiplier 224 have the same configuration as LPC analysis section 131, differential quantization section 132, LPC synthesis filter 133, adder 134, perceptual weighting section 135, distortion minimizing section 136, adaptive excitation codebook 137, multiplier 138, fixed excitation codebook 139, multiplier 140, gain codebook 141, adder 142, gain codebook 143, and multiplier 144 in first channel difference information coding section 104, so their description is omitted.
- Second channel lag parameter estimation section 225 predicts the pitch period (adaptive excitation lag) of CH2 using pitch period $T_M$ of the monaural signal and $\Delta T_1$, a CELP coding parameter of CH1, and outputs predicted value $T_2'$ to adaptive excitation codebook 217.
- Here, coding parameter $\Delta T_1$ is the difference of pitch period $T_1$ of CH1 from pitch period $T_M$ of the monaural signal.
- Second channel LPC parameter estimation section 226 predicts the LPC parameter of CH2 using LPC parameter $\omega_M(i)$ of the monaural signal and LPC parameter $\omega_1(i)$ of CH1, and outputs predicted value $\omega_2'(i)$ to differential quantization section 212.
- Second channel excitation gain estimation section 227 uses the fact that the driving excitation of the monaural signal is obtained from the driving excitations of CH1 and CH2 by (Equation 1) above, predicts the gain multiplier value of CH2 by back calculation from the gain multiplier value of CH1, and outputs predicted value $\gamma_2'$ to multiplier 228.
- This predicted value $\gamma_2'$ is multiplied by the second channel excitation gain multiplier output from gain codebook 221.
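One way the back calculation for the gain and lag predictions could work is sketched below. Both formulas are assumptions derived here from the averaging relationship of (Equation 1), not equations quoted from the source: if the monaural excitation is the average of the two channel excitations and CH1's excitation is γ1 times the monaural one, CH2's must be (2 − γ1) times it, and by the same symmetry the CH2 lag is predicted as the monaural lag minus the CH1 lag difference. Function names are hypothetical.

```python
def predict_ch2_gain_multiplier(gamma_ch1):
    """Assumed back calculation: ex_M = (ex_1 + ex_2) / 2 with ex_1 = gamma_1 * ex_M."""
    return 2.0 - gamma_ch1


def predict_ch2_lag(mono_lag, delta_lag_ch1):
    """Assumed symmetric prediction: CH2 deviates from mono opposite to CH1."""
    return mono_lag - delta_lag_ch1


# Illustrative values (assumed, not from the source).
gamma_2_pred = predict_ch2_gain_multiplier(1.25)
lag_2_pred = predict_ch2_lag(40, 3)

# Consistency check of the gain assumption on a sample monaural excitation.
mono_ex = [0.5, -1.0]
averaged = [0.5 * (1.25 * e + gamma_2_pred * e) for e in mono_ex]
```

In this sketch the third layer would then only need to code a correction to each predicted value, as the surrounding text describes.
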
- In the closed-loop coding controlled by distortion minimizing section 216, pitch period (adaptive excitation lag) $T_2$ of second channel signal CH2 is encoded using the parameters that have already been encoded.
- Pitch period $T_2$ of CH2 is expressed by its predicted value $T_2'$ and a correction value $\Delta T_2$.
- The scalable coding apparatus performs an adaptive excitation codebook search for CH2 and encodes the correction parameter $\Delta T_2$ obtained when the optimum $T_2$ is found.
- Fixed excitation codebook 219, like fixed excitation codebook 139 of first channel difference information coding section 104, generates an excitation signal for the residual component of the current frame's excitation that cannot be approximated by the excitation signal generated by adaptive excitation codebook 217. Also like fixed excitation codebook 139, fixed excitation codebook 219 uses the fixed excitation codebook index of monaural signal M as the fixed excitation codebook index of CH2. That is, the fixed excitation vector of CH2 is the same signal as the fixed excitation vector of the monaural signal.
- Alternatively, a fixed excitation codebook search may be performed on fixed excitation codebook 219 to obtain an additional fixed excitation codebook index for CH2. In this case the coding bit rate increases, but CH2 coding with higher sound quality can be realized.
- the gain codebook 221 specifies the gain of the excitation vector for CH2 as a gain multiplier value y that multiplies both the adaptive excitation gain and the fixed excitation gain of the monaural signal.
- the gain codebook 221 includes a gain for monaural signals in the CELP coding unit 103, and a gain multiplier value ⁇ for CH1 in the first channel difference information encoding section 104.
- the correction value ⁇ ⁇ is the pattern prepared in the gain codebook.
- the gain codebook 221 first determines the gain multiplier value ⁇ for CH2 as CH1
- a gain codebook search is performed for the correction coefficient ⁇ that yields the optimal ⁇ for CH2.
- ⁇ ⁇ is the monaural gain and monaural at CH1.
- the spectral envelope component is obtained as follows. LPC analysis of the CH2 signal yields its LPC parameter. The LPC parameter of CH2 is then estimated using the already obtained LPC parameter of the monaural signal and the difference component of the CH1 LPC parameter with respect to the monaural LPC parameter. Finally, the correction component (error component) relative to the estimated parameter is quantized to obtain the spectral envelope component parameter of CH2.
- Equation 17 ⁇ ⁇ ( ⁇ ) ( ⁇ ( ⁇ ) + ⁇ 2 ( ⁇ )) (Equation 23)
- The LSP parameter ⁇ (i) of CH1 is expressed by the following (Equation 24).
- the CH2 LSP ⁇ (i) is expressed by its predicted value ⁇ ′ (i) and its correction ⁇ ⁇ (i)
- the scalable coding apparatus according to this embodiment is represented by ⁇ (i)
- ⁇ ⁇ (i) that minimizes the quantization error is encoded.
- ⁇ ⁇ (i) is the mono LSP parameter
- the difference from the value estimated using the difference parameter ⁇ ⁇ (i) relative to the monaural signal is smaller than ⁇ ⁇ (i), so more efficient encoding can be performed.
- the CELP coding parameter of the monaural signal and the difference CELP parameter coded in the second layer are both parameters.
- the correction parameters described above are determined so that the error between the CH2 synthesized signal generated from them and the original CH2 signal is minimized. Therefore, CH2 can be encoded and decoded with higher accuracy.
- although the monaural signal M is taken to be the average signal of CH1 and CH2, it is not necessarily limited to this.
- the adaptive excitation codebook may be referred to as an adaptive codebook.
- the fixed excitation codebook is sometimes called a fixed codebook, a noise codebook, a stochastic codebook, or a random codebook.
- the scalable coding apparatus according to the present invention is not limited to the above embodiments, and can be implemented with various modifications.
- the scalable coding apparatus according to the present invention can also be mounted on a communication terminal apparatus and a base station apparatus in a mobile communication system, thereby providing a communication terminal apparatus and a base station apparatus having the same effects as described above.
- Each functional block used in the description of each of the above embodiments is typically realized as an LSI, which is an integrated circuit. These may be individually made into single chips, or a single chip may include some or all of them.
- the method of circuit integration is not limited to LSIs, and implementation using dedicated circuitry or general purpose processors is also possible. It is also possible to use a field programmable gate array (FPGA) that can be programmed after LSI manufacturing, or a reconfigurable processor that can reconfigure the connection or setting of circuit cells inside the LSI.
- the scalable coding apparatus and the scalable coding method according to the present invention are applicable to communication terminal apparatuses, base station apparatuses, and the like in a mobile communication system that perform scalable coding on a stereo signal.
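
The description above encodes each CH2 parameter (pitch period, LSP) as a predicted value plus a correction value. The following is a minimal sketch of that idea, not the patent's own equations, which are not legible in this text. It assumes that, because the monaural signal M is the average of CH1 and CH2, a CH2 parameter can be predicted by mirroring the CH1 offset around the monaural value (CH2' = 2M - CH1). It also picks the correction by minimizing the error in the parameter domain, a simplification of the closed-loop search controlled by the distortion minimizing unit. All function names and the candidate tables are hypothetical.

```python
from typing import List, Sequence, Tuple


def predict_from_mono(mono: float, ch1: float) -> float:
    """Predict a CH2 parameter from the mono value and the CH1 value.

    ASSUMPTION (not taken from the source): CH2 deviates from mono by the
    same amount as CH1 but in the opposite direction, i.e. CH2' = 2*M - CH1.
    """
    return 2.0 * mono - ch1


def pick_correction(actual: float, predicted: float,
                    candidates: Sequence[float]) -> Tuple[int, float]:
    """Return (index, value) of the candidate correction for which
    predicted + correction is closest to the actual CH2 value."""
    best = min(range(len(candidates)),
               key=lambda k: abs(actual - (predicted + candidates[k])))
    return best, candidates[best]


def encode_ch2(mono: Sequence[float], ch1: Sequence[float],
               ch2: Sequence[float], candidates: Sequence[float]) -> List[int]:
    """Encode each CH2 parameter as the index of its correction value."""
    indices = []
    for m, c1, c2 in zip(mono, ch1, ch2):
        idx, _ = pick_correction(c2, predict_from_mono(m, c1), candidates)
        indices.append(idx)
    return indices


def decode_ch2(mono: Sequence[float], ch1: Sequence[float],
               indices: Sequence[int], candidates: Sequence[float]) -> List[float]:
    """Rebuild CH2 parameters as predicted value plus decoded correction."""
    return [predict_from_mono(m, c1) + candidates[k]
            for m, c1, k in zip(mono, ch1, indices)]


if __name__ == "__main__":
    # Pitch lag example: mono lag 50, CH1 lag 52, actual CH2 lag 47.
    lag_steps = [-2.0, -1.0, 0.0, 1.0, 2.0]  # hypothetical correction table
    idx = encode_ch2([50.0], [52.0], [47.0], lag_steps)
    print(idx, decode_ch2([50.0], [52.0], idx, lag_steps))
```

Only the correction indices need to be transmitted for CH2 in this sketch, since the decoder can recompute the prediction from the monaural and CH1 parameters it already has. That is the reason the description gives for the scheme being more efficient than coding the CH2 difference from the monaural parameter directly.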
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Mathematical Physics (AREA)
- Quality & Reliability (AREA)
- Theoretical Computer Science (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2006537715A JP4555299B2 (ja) | 2004-09-28 | 2005-09-26 | Scalable encoding apparatus and scalable encoding method |
US11/576,004 US20080255832A1 (en) | 2004-09-28 | 2005-09-26 | Scalable Encoding Apparatus and Scalable Encoding Method |
EP05786017A EP1801782A4 (en) | 2004-09-28 | 2005-09-26 | DEVICE AND METHOD FOR SCALABLE CODING |
BRPI0516201-7A BRPI0516201A (pt) | 2004-09-28 | 2005-09-26 | Scalable encoding apparatus and scalable encoding method |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2004-282525 | 2004-09-28 | ||
JP2004282525 | 2004-09-28 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2006035705A1 (ja) | 2006-04-06 |
Family
ID=36118851
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2005/017618 WO2006035705A1 (ja) | 2004-09-28 | 2005-09-26 | Scalable encoding apparatus and scalable encoding method |
Country Status (7)
Country | Link |
---|---|
US (1) | US20080255832A1 (ja) |
EP (1) | EP1801782A4 (ja) |
JP (1) | JP4555299B2 (ja) |
KR (1) | KR20070061843A (ja) |
CN (1) | CN101027718A (ja) |
BR (1) | BRPI0516201A (ja) |
WO (1) | WO2006035705A1 (ja) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2008016098A1 (fr) * | 2006-08-04 | 2008-02-07 | Panasonic Corporation | Stereo audio encoding device, stereo audio decoding device, and method thereof |
JP2018533057A (ja) * | 2015-09-25 | 2018-11-08 | VoiceAge Corporation | Method and system for encoding a stereo sound signal using coding parameters of a primary channel to encode a secondary channel |
JP2022539571A (ja) * | 2019-06-29 | 2022-09-12 | Huawei Technologies Co., Ltd. | Stereo encoding method and apparatus, and stereo decoding method and apparatus |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101566025B1 (ko) * | 2007-10-22 | 2015-11-05 | Electronics and Telecommunications Research Institute | Multi-object audio encoding and decoding method and apparatus therefor |
BR122019023924B1 (pt) * | 2009-03-17 | 2021-06-01 | Dolby International Ab | Encoder system, decoder system, method for encoding a stereo signal to a bitstream signal, and method for decoding a bitstream signal to a stereo signal |
JP5269195B2 (ja) * | 2009-05-29 | 2013-08-21 | Nippon Telegraph and Telephone Corporation | Encoding device, decoding device, encoding method, decoding method, and program therefor |
WO2012066727A1 (ja) * | 2010-11-17 | 2012-05-24 | Panasonic Corporation | Stereo signal encoding device, stereo signal decoding device, stereo signal encoding method, and stereo signal decoding method |
EP2661746B1 (en) * | 2011-01-05 | 2018-08-01 | Nokia Technologies Oy | Multi-channel encoding and/or decoding |
US9460729B2 (en) | 2012-09-21 | 2016-10-04 | Dolby Laboratories Licensing Corporation | Layered approach to spatial audio coding |
ES2911515T3 (es) * | 2017-04-10 | 2022-05-19 | Nokia Technologies Oy | Audio coding |
CN112151045B (zh) * | 2019-06-29 | 2024-06-04 | Huawei Technologies Co., Ltd. | Stereo encoding method, stereo decoding method, and apparatus |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH1132399A (ja) * | 1997-05-13 | 1999-02-02 | Sony Corp | Encoding method and apparatus, and recording medium |
JP2003058195A (ja) * | 2001-08-21 | 2003-02-28 | Canon Inc | Playback apparatus, playback system, playback method, storage medium, and program |
JP2004509367A (ja) * | 2000-09-15 | 2004-03-25 | Telefonaktiebolaget LM Ericsson | Encoding and decoding of multi-channel signals |
Family Cites Families (28)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH04150522A (ja) * | 1990-10-15 | 1992-05-25 | Sony Corp | Digital signal processing apparatus |
US5956674A (en) * | 1995-12-01 | 1999-09-21 | Digital Theater Systems, Inc. | Multi-channel predictive subband audio coder using psychoacoustic adaptive bit allocation in frequency, time and over the multiple channels |
DE19628293C1 (de) * | 1996-07-12 | 1997-12-11 | Fraunhofer Ges Forschung | Encoding and decoding of audio signals using intensity stereo and prediction |
US6345246B1 (en) * | 1997-02-05 | 2002-02-05 | Nippon Telegraph And Telephone Corporation | Apparatus and method for efficiently coding plural channels of an acoustic signal at low bit rates |
US6356211B1 (en) * | 1997-05-13 | 2002-03-12 | Sony Corporation | Encoding method and apparatus and recording medium |
DE19742655C2 (de) * | 1997-09-26 | 1999-08-05 | Fraunhofer Ges Forschung | Method and device for encoding a time-discrete stereo signal |
SE519552C2 (sv) * | 1998-09-30 | 2003-03-11 | Ericsson Telefon Ab L M | Multi-channel signal encoding and decoding |
DE19959156C2 (de) * | 1999-12-08 | 2002-01-31 | Fraunhofer Ges Forschung | Method and device for processing a stereo audio signal to be encoded |
US6973184B1 (en) * | 2000-07-11 | 2005-12-06 | Cisco Technology, Inc. | System and method for stereo conferencing over low-bandwidth links |
SE519981C2 (sv) * | 2000-09-15 | 2003-05-06 | Ericsson Telefon Ab L M | Encoding and decoding of signals from multiple channels |
SE519976C2 (sv) * | 2000-09-15 | 2003-05-06 | Ericsson Telefon Ab L M | Encoding and decoding of signals from multiple channels |
US7606703B2 (en) * | 2000-11-15 | 2009-10-20 | Texas Instruments Incorporated | Layered celp system and method with varying perceptual filter or short-term postfilter strengths |
US6996522B2 (en) * | 2001-03-13 | 2006-02-07 | Industrial Technology Research Institute | Celp-Based speech coding for fine grain scalability by altering sub-frame pitch-pulse |
US7062429B2 (en) * | 2001-09-07 | 2006-06-13 | Agere Systems Inc. | Distortion-based method and apparatus for buffer control in a communication system |
EP1440433B1 (en) * | 2001-11-02 | 2005-05-04 | Matsushita Electric Industrial Co., Ltd. | Audio encoding and decoding device |
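CN1266673C (zh) * | 2002-03-12 | 2006-07-26 | Nokia Corporation | Efficient improvements in scalable audio coding |
JP3881946B2 (ja) * | 2002-09-12 | 2007-02-14 | Matsushita Electric Industrial Co., Ltd. | Acoustic encoding apparatus and acoustic encoding method |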
US20030231799A1 (en) * | 2002-06-14 | 2003-12-18 | Craig Schmidt | Lossless data compression using constraint propagation |
US7191136B2 (en) * | 2002-10-01 | 2007-03-13 | Ibiquity Digital Corporation | Efficient coding of high frequency signal information in a signal using a linear/non-linear prediction model based on a low pass baseband |
WO2004097796A1 (ja) * | 2003-04-30 | 2004-11-11 | Matsushita Electric Industrial Co., Ltd. | Speech encoding apparatus, speech decoding apparatus, and methods thereof |
US7349842B2 (en) * | 2003-09-29 | 2008-03-25 | Sony Corporation | Rate-distortion control scheme in audio encoding |
US7447317B2 (en) * | 2003-10-02 | 2008-11-04 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V | Compatible multi-channel coding/decoding by weighting the downmix channel |
US7394903B2 (en) * | 2004-01-20 | 2008-07-01 | Fraunhofer-Gesellschaft Zur Forderung Der Angewandten Forschung E.V. | Apparatus and method for constructing a multi-channel output signal or for generating a downmix signal |
DE602005025887D1 (de) * | 2004-08-19 | 2011-02-24 | Nippon Telegraph & Telephone | Multi-channel signal decoding method, associated device, program, and recording medium therefor |
CN101031960A (zh) * | 2004-09-30 | 2007-09-05 | Matsushita Electric Industrial Co., Ltd. | Scalable encoding apparatus, scalable decoding apparatus, and methods thereof |
EP1847022B1 (en) * | 2005-01-11 | 2010-09-01 | Agency for Science, Technology and Research | Encoder, decoder, method for encoding/decoding, computer readable media and computer program elements |
US8036390B2 (en) * | 2005-02-01 | 2011-10-11 | Panasonic Corporation | Scalable encoding device and scalable encoding method |
CN101151660B (zh) * | 2005-03-30 | 2011-10-19 | Koninklijke Philips Electronics N.V. | Multi-channel audio encoder, decoder, and corresponding methods |
-
2005
- 2005-09-26 WO PCT/JP2005/017618 patent/WO2006035705A1/ja active Application Filing
- 2005-09-26 BR BRPI0516201-7A patent/BRPI0516201A/pt not_active Application Discontinuation
- 2005-09-26 US US11/576,004 patent/US20080255832A1/en not_active Abandoned
- 2005-09-26 CN CNA2005800326240A patent/CN101027718A/zh active Pending
- 2005-09-26 EP EP05786017A patent/EP1801782A4/en not_active Withdrawn
- 2005-09-26 KR KR1020077007083A patent/KR20070061843A/ko not_active Application Discontinuation
- 2005-09-26 JP JP2006537715A patent/JP4555299B2/ja not_active Expired - Fee Related
Non-Patent Citations (1)
Title |
---|
See also references of EP1801782A4 * |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2008016098A1 (fr) * | 2006-08-04 | 2008-02-07 | Panasonic Corporation | Stereo audio encoding device, stereo audio decoding device, and method thereof |
JP2018533057A (ja) * | 2015-09-25 | 2018-11-08 | VoiceAge Corporation | Method and system for encoding a stereo sound signal using coding parameters of a primary channel to encode a secondary channel |
US10984806B2 (en) | 2015-09-25 | 2021-04-20 | Voiceage Corporation | Method and system for encoding a stereo sound signal using coding parameters of a primary channel to encode a secondary channel |
US11056121B2 (en) | 2015-09-25 | 2021-07-06 | Voiceage Corporation | Method and system for encoding left and right channels of a stereo sound signal selecting between two and four sub-frames models depending on the bit budget |
JP2021131569A (ja) * | 2015-09-25 | 2021-09-09 | VoiceAge Corporation | Method and system for encoding a stereo sound signal using coding parameters of a primary channel to encode a secondary channel |
JP7124170B2 | 2015-09-25 | 2022-08-23 | VoiceAge Corporation | Method and system for encoding a stereo sound signal using coding parameters of a primary channel to encode a secondary channel |
JP2022539571A (ja) * | 2019-06-29 | 2022-09-12 | Huawei Technologies Co., Ltd. | Stereo encoding method and apparatus, and stereo decoding method and apparatus |
JP7337966B2 | 2019-06-29 | 2023-09-04 | Huawei Technologies Co., Ltd. | Stereo encoding method and apparatus, and stereo decoding method and apparatus |
Also Published As
Publication number | Publication date |
---|---|
BRPI0516201A (pt) | 2008-08-26 |
CN101027718A (zh) | 2007-08-29 |
EP1801782A1 (en) | 2007-06-27 |
JP4555299B2 (ja) | 2010-09-29 |
EP1801782A4 (en) | 2008-09-24 |
JPWO2006035705A1 (ja) | 2008-05-15 |
KR20070061843A (ko) | 2007-06-14 |
US20080255832A1 (en) | 2008-10-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
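WO2006035705A1 (ja) | Scalable encoding apparatus and scalable encoding method | |
JP5046652B2 (ja) | Speech encoding apparatus and speech encoding method | |
JP5413839B2 (ja) | Encoding apparatus and decoding apparatus | |
JP4963965B2 (ja) | Scalable encoding apparatus, scalable decoding apparatus, and methods thereof | |
JP5046653B2 (ja) | Speech encoding apparatus and speech encoding method | |
JP4907522B2 (ja) | Speech encoding apparatus and speech encoding method | |
JP4850827B2 (ja) | Speech encoding apparatus and speech encoding method | |
JP4887279B2 (ja) | Scalable encoding apparatus and scalable encoding method | |
WO2006059567A1 (ja) | Stereo encoding apparatus, stereo decoding apparatus, and methods thereof | |
JP4842147B2 (ja) | Scalable encoding apparatus and scalable encoding method | |
JPWO2008132850A1 (ja) | Stereo speech encoding apparatus, stereo speech decoding apparatus, and methods thereof | |
JP4948401B2 (ja) | Scalable encoding apparatus and scalable encoding method | |
JPWO2008090970A1 (ja) | Stereo encoding apparatus, stereo decoding apparatus, and methods thereof | |
JP2006072269A (ja) | Speech encoding apparatus, communication terminal apparatus, base station apparatus, and speech encoding method |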
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A1 Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KM KP KR KZ LC LK LR LS LT LU LV LY MA MD MG MK MN MW MX MZ NA NG NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW |
|
AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU LV MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
WWE | Wipo information: entry into national phase |
Ref document number: 2006537715 Country of ref document: JP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 200580032624.0 Country of ref document: CN Ref document number: 443/MUMNP/2007 Country of ref document: IN |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2005786017 Country of ref document: EP Ref document number: 1020077007083 Country of ref document: KR |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
WWE | Wipo information: entry into national phase |
Ref document number: 11576004 Country of ref document: US |
|
WWP | Wipo information: published in national office |
Ref document number: 2005786017 Country of ref document: EP |
|
ENP | Entry into the national phase |
Ref document number: PI0516201 Country of ref document: BR |