WO2011058758A1 - Encoding device, decoding device, and methods thereof - Google Patents
- Publication number
- WO2011058758A1 (application PCT/JP2010/006665)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- encoding
- layer
- gain
- information
- band
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/16—Vocoder architecture
- G10L19/18—Vocoders using multiple modes
- G10L19/24—Variable rate codecs, e.g. for generating different qualities using a scalable representation such as hierarchical encoding or layered encoding
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/0204—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using subband decomposition
Definitions
- the present invention relates to an encoding device, a decoding device, and these methods used in a communication system that encodes and transmits a signal.
- Non-Patent Document 1 discloses a method for hierarchically encoding the spectrum (MDCT (Modified Discrete Cosine Transform) coefficients) of a desired frequency band using TwinVQ (Transform Domain Weighted Interleave Vector Quantization), in which the basic structural unit is modularized. By using the module in common and using it a plurality of times, simple and highly flexible scalable encoding can be realized.
- In this method, the subbands to be encoded in each layer are basically configured in advance, but a configuration is also disclosed in which the position of the subband to be encoded in each layer is varied within a predetermined band according to the nature of the input signal.
- However, in Non-Patent Document 1, for example in the configuration in which the position of the subband to be encoded in each layer is varied within a predetermined band, the subbands selected as the encoding target differ for each frame or each layer. Therefore, when predictive encoding in the time axis direction or in the layer axis direction is applied as a method for encoding the frequency parameters of the band to be encoded (encoding target band), encoding efficiency is insufficient. As a result, the quality of the generated decoded speech becomes insufficient.
- the adding unit 204 calculates the difference spectrum between the input spectrum and the first layer decoded spectrum by inverting the polarity of the first layer decoded spectrum and adding the inverted spectrum to the input spectrum.
- the adding unit 204 outputs the obtained difference spectrum as the first layer difference spectrum to the second layer encoding unit 205.
- the second layer decoding unit 206 decodes the second layer encoded information input from the second layer encoding unit 205 to calculate a second layer decoded spectrum. Next, second layer decoding section 206 outputs the generated second layer decoded spectrum to addition section 207. Details of second layer decoding section 206 will be described later.
- the adding unit 207 calculates a difference spectrum between the first layer difference spectrum and the second layer decoded spectrum by inverting the polarity of the second layer decoded spectrum and adding the inverted spectrum to the first layer difference spectrum.
- the adding unit 207 outputs the obtained difference spectrum to the third layer encoding unit 208 as the second layer difference spectrum.
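The two adding units described above amount to successive element-wise subtractions. A minimal sketch, assuming spectra are plain Python lists of equal length; the function names `subtract` and `layered_residuals` are hypothetical and not taken from the patent:

```python
def subtract(a, b):
    """Invert the polarity of b and add it to a (element-wise a - b)."""
    return [x + (-y) for x, y in zip(a, b)]


def layered_residuals(input_spectrum, layer1_decoded, layer2_decoded):
    """Return the first and second layer difference spectra."""
    # Adding unit 204: input spectrum minus first layer decoded spectrum.
    diff1 = subtract(input_spectrum, layer1_decoded)
    # Adding unit 207: first layer difference minus second layer decoded spectrum.
    diff2 = subtract(diff1, layer2_decoded)
    return diff1, diff2
```

Each higher layer therefore encodes only what the layers below it failed to represent.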
- Third layer encoding section 208 generates third layer encoded information using the second layer difference spectrum input from adding section 207, and outputs the generated third layer encoded information to encoded information integrating section 209. Also, the third layer encoding unit 208 outputs the third layer gain encoding information and the third layer band information included in the third layer encoding information to the first layer encoding unit 202 and the first layer decoding unit 203. As a result, first layer encoding section 202 and first layer decoding section 203 perform encoding using the third layer gain encoding information and the third layer band information in the next processing frame. Details of third layer encoding section 208 will be described later.
- FIG. 3 is a block diagram showing a main configuration of first layer encoding section 202.
- the first layer encoding unit 202 includes a band selection unit 301, a shape encoding unit 302, an adaptive prediction determination unit 303, a gain encoding unit 304, and a multiplexing unit 305.
- the band selection unit 301 divides the input spectrum input from the orthogonal transform processing unit 201 into a plurality of subbands, and selects a band to be quantized (quantization target band) from the plurality of subbands.
- Band selection section 301 outputs band information (first layer band information) indicating the selected quantization target band to shape coding section 302, adaptive prediction determination section 303, and multiplexing section 305.
- Band selection section 301 also outputs the input spectrum to shape coding section 302. Note that the input spectrum input to the shape encoding unit 302 may be directly input from the orthogonal transform processing unit 201 separately from the input from the orthogonal transform processing unit 201 to the band selection unit 301. Details of the processing of the band selection unit 301 will be described later.
- the shape encoding unit 302 encodes the shape information using the spectrum (MDCT coefficients) corresponding to the band indicated by the first layer band information out of the input spectrum input from the band selection unit 301, and generates first layer shape coding information. Next, shape coding section 302 outputs the generated first layer shape coding information to multiplexing section 305. In addition, shape coding section 302 outputs an ideal gain (gain information) calculated at the time of shape coding to gain coding section 304. Details of the processing of the shape encoding unit 302 will be described later.
- the adaptive prediction determination unit 303 outputs the determination result to the gain encoding unit 304 and the multiplexing unit 305 as prediction information (Flag_PRE).
- the adaptive prediction determination unit 303 sets the value of Flag_PRE to 1 when determining to perform prediction, and sets the value of Flag_PRE to 0 when determining that prediction is not performed. Details of the process of the adaptive prediction determination unit 303 will be described later.
- the gain encoding unit 304 performs predictive encoding on the ideal gain input from the shape encoding unit 302 to obtain the first layer gain encoding information.
- At this time, gain encoding section 304 performs predictive coding on the ideal gain using the quantization gain of the past frame stored in the internal buffer, the internal gain codebook, the second layer gain encoding information, and the third layer gain encoding information.
- the band selection unit 301 calculates the average energy E1 (m) of each of the M types of regions according to the following equation (5).
- j represents the index of each of the J subbands
- m represents the index of each of the M types of regions.
- S (m) indicates the minimum value among the indices of the L subbands constituting the region m
- B (j) is the minimum value among the indices of the plurality of MDCT coefficients constituting the subband j.
- W (j) indicates the bandwidth of subband j, and in the following description, the case where all the J subbands have the same bandwidth, that is, the case where W (j) is a constant will be described as an example.
- the band selection unit 301 selects, as the band to be quantized (quantization target band), the region having the maximum average energy E1(m), for example, the band composed of subbands j′′ to (j′′ + L − 1).
- Band selection section 301 outputs index m_max indicating the selected region as first layer band information to shape coding section 302, adaptive prediction determination section 303, and multiplexing section 305. Further, the band selection unit 301 outputs the input spectrum X1 (k) of the quantization target band to the shape coding unit 302.
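Equation (5) is not reproduced in this text. The following is a minimal sketch of the band selection step, assuming that the average energy of a region is the sum of squared MDCT coefficients over its L subbands divided by L; that definition, and the function name `select_band`, are assumptions made for illustration. `B`, `W`, `S`, and `L` follow the symbol definitions given above.

```python
def select_band(spectrum, B, W, S, L):
    """Pick the region with the largest average energy.

    spectrum: list of MDCT coefficients X1(k)
    B[j]:     first coefficient index of subband j
    W[j]:     bandwidth (number of coefficients) of subband j
    S[m]:     first subband index of region m
    L:        number of subbands per region
    Returns (m_max, energies).
    """
    energies = []
    for start in S:
        total = 0.0
        for j in range(start, start + L):
            # Energy of subband j: sum of squared coefficients (assumed form).
            total += sum(x * x for x in spectrum[B[j]:B[j] + W[j]])
        energies.append(total / L)
    m_max = max(range(len(S)), key=lambda m: energies[m])
    return m_max, energies
```

With four subbands of width 2 and three overlapping regions of L = 2 subbands, the call `select_band([1, 0, 0, 2, 3, 0, 0, 1], [0, 2, 4, 6], [2, 2, 2, 2], [0, 1, 2], 2)` selects region 1, which covers subbands 1 and 2.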
- the band index indicating the quantization target band selected by the band selection unit 301 is j′′ to (j′′ + L − 1).
- the shape encoding unit 302 performs shape quantization for each subband on the input spectrum X1(k) corresponding to the band indicated by the first layer band information. Specifically, for each of the L subbands, the shape encoding unit 302 searches the built-in shape codebook composed of SQ shape code vectors and finds the index of the shape code vector that maximizes the evaluation measure Shape_q(i) of Equation (6) below.
- SC^i_k indicates a shape code vector constituting the shape codebook
- i indicates an index of the shape code vector
- k indicates an index of an element of the shape code vector
- the shape encoding unit 302 outputs the index S_max of the shape code vector that maximizes the evaluation measure Shape_q (i) of the above equation (6) to the multiplexing unit 305 as first layer shape encoding information.
- the shape encoding unit 302 calculates an ideal gain Gain_i (j) according to the following equation (7), and outputs the calculated ideal gain Gain_i (j) to the gain encoding unit 304.
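Equations (6) and (7) are not reproduced in this text. As a hedged sketch, the search below uses a normalized-correlation measure (squared correlation divided by code vector energy) and the least-squares gain that goes with it. These are common choices for shape-gain vector quantization and are assumptions here, not necessarily the patent's exact formulas; `search_shape` and `ideal_gain` are hypothetical names.

```python
def search_shape(x, codebook):
    """Return the index of the code vector maximizing corr^2 / energy.

    x:        MDCT coefficients of one subband
    codebook: list of shape code vectors with the same length as x
    Returns -1 if every code vector has zero energy.
    """
    best_index, best_score = -1, float("-inf")
    for i, sc in enumerate(codebook):
        energy = sum(c * c for c in sc)
        if energy == 0.0:
            continue  # a zero vector cannot represent any shape
        corr = sum(a * c for a, c in zip(x, sc))
        score = corr * corr / energy
        if score > best_score:
            best_index, best_score = i, score
    return best_index


def ideal_gain(x, sc):
    """Least-squares gain g minimizing the squared error between x and g * sc."""
    return sum(a * c for a, c in zip(x, sc)) / sum(c * c for c in sc)
```

For `x = [2.0, -2.0]` and a codebook `[[1.0, 1.0], [1.0, -1.0]]`, the second vector is selected and the corresponding gain is 2.0.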
- the adaptive prediction determination unit 303 has a built-in buffer and stores the first layer band information in the past frame.
- In the following, the case where the adaptive prediction determination unit 303 includes a buffer that stores band information for one past frame is described as an example.
- the adaptive prediction determination unit 303 first uses the first layer band information, the second layer band information, and the third layer band information of the past frame, together with the first layer band information of the current frame, to obtain the number of subbands common to the quantization target band of the past frame and the quantization target band of the current frame.
- the set M123_{t−1} can be expressed by the following equation (8) using the set M1_{t−1}, the set M2_{t−1}, and the set M3_{t−1}.
- adaptive prediction determination section 303 outputs prediction information (Flag_PRE) to gain encoding section 304 and multiplexing section 305 as information indicating the determination result.
- adaptive prediction determination section 303 updates the built-in buffer using the first layer band information, the second layer band information, and the third layer band information in the current frame.
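The determination described above can be sketched as a set operation. Equation (8) is not reproduced in this text; the sketch assumes that M123_{t−1} is the union of the three past-frame sets and that prediction is enabled when the number of common subbands reaches a threshold (the threshold rule is the one stated for the second embodiment). The function name and the `threshold` parameter are hypothetical.

```python
def decide_prediction(current_l1, past_l1, past_l2, past_l3, threshold):
    """Return Flag_PRE: 1 to use predictive encoding, 0 otherwise.

    Each band argument is an iterable of subband indices.
    """
    # Assumed form of equation (8): union of the past-frame bands of all layers.
    past_union = set(past_l1) | set(past_l2) | set(past_l3)
    # Number of subbands shared by the current first layer band and the past bands.
    common = len(set(current_l1) & past_union)
    return 1 if common >= threshold else 0
```

Pooling the bands of all three layers increases the chance that a quantized gain from the previous frame exists for each subband selected in the current frame.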
- the gain encoding unit 304 adaptively switches the quantization method to either the prediction encoding method or the non-prediction encoding method according to the prediction information (Flag_PRE).
- GC1^i_j indicates a gain code vector constituting the gain codebook in first layer encoding section 202
- i indicates an index of the gain code vector
- j indicates an index of an element of the gain code vector.
- the subband index j ′′ is an index indicating the first subband in the band selected by the band selection unit 301.
- C1^t_j indicates the gain quantized in the first layer encoding unit 202 t frames before in time.
- For example, C1^1_j indicates the gain quantized in the first layer encoding unit 202 one frame before
- C2 t j and C3 t j represent gains quantized in the second layer encoding unit 205 and the third layer encoding unit 208, respectively, temporally t frames before, and ⁇ 0 to ⁇ 3 are gain encoding units.
- the gain encoding unit 304 treats L subbands in one region as an L-dimensional vector and performs vector quantization.
- In Equation (9), the gain encoding unit 304 substitutes the gain of the subband that is closest in frequency to the quantization target band of the current frame, taken from among the gains stored in the built-in buffer.
- the gain encoding unit 304 performs non-predictive encoding. Specifically, the gain encoding unit 304 directly quantizes the ideal gain Gain_i (j) input from the shape encoding unit 302 according to the following equation (10). Here again, gain encoding section 304 treats the ideal gain as an L-dimensional vector and performs vector quantization.
- the gain encoding unit 304 outputs the index G_min of the gain code vector that minimizes the square error Gain_q(i) in the above equation (9) or (10) to the multiplexing unit 305 as first layer gain encoding information.
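Equations (9) and (10) are not reproduced in this text. The sketch below illustrates the switch between the two modes under a simplifying assumption: in predictive mode the codebook quantizes the residual between the ideal gain and a prediction formed from past quantized gains, and in non-predictive mode it quantizes the ideal gain directly. How the prediction is formed (the weighting of C1, C2, and C3) is left to the caller; `quantize_gain` and `predicted` are hypothetical names.

```python
def quantize_gain(ideal_gain, codebook, flag_pre, predicted=None):
    """Return G_min, the index of the code vector with minimum squared error.

    ideal_gain: L-dimensional list of ideal gains for the selected region
    codebook:   list of L-dimensional gain code vectors
    flag_pre:   1 for predictive encoding, 0 for non-predictive encoding
    predicted:  L-dimensional prediction from past quantized gains
                (required when flag_pre is 1)
    """
    if flag_pre:
        if predicted is None:
            raise ValueError("predictive mode needs a prediction vector")
        # Quantize only what the prediction does not explain (assumed form).
        target = [g - p for g, p in zip(ideal_gain, predicted)]
    else:
        target = list(ideal_gain)

    def squared_error(code_vector):
        return sum((t - c) ** 2 for t, c in zip(target, code_vector))

    return min(range(len(codebook)), key=lambda i: squared_error(codebook[i]))
```

Treating the L gains as one vector, as the text describes, lets a single index G_min represent the gains of the whole region.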
- the gain encoding unit 304 updates the built-in buffer according to Equation (11), using the first layer gain encoding information G_min, the first layer band information, and the quantization gains C1^t_j, C2^t_j, and C3^t_j obtained in the current frame.
- the multiplexing unit 305 multiplexes the first layer band information, the first layer shape coding information, the first layer gain coding information, and the prediction information, and generates the first layer coding information. Next, multiplexing section 305 outputs the generated first layer encoded information to first layer decoding section 203 and encoded information integration section 209.
- FIG. 5 is a block diagram showing a main configuration of the first layer decoding unit 203.
- the first layer decoding unit 203 includes a separating unit 501, a shape decoding unit 502, and a gain decoding unit 503.
- Separating section 501 separates the first layer encoded information output from first layer encoding section 202 into first layer band information, first layer shape encoded information, first layer gain encoded information, and prediction information. Separating section 501 outputs the obtained first layer band information and first layer shape coding information to shape decoding section 502, and outputs the first layer gain coding information and prediction information to gain decoding section 503.
- the shape decoding unit 502 decodes the first layer shape encoded information input from the separation unit 501, and thereby obtains the shape value of the MDCT coefficients corresponding to the quantization target band indicated by the first layer band information input from the separation unit 501. Shape decoding section 502 outputs the obtained MDCT coefficient shape value to gain decoding section 503. Details of the processing of the shape decoding unit 502 will be described later.
- the gain decoding unit 503 receives the second layer gain encoding information in the processing frame immediately before from the second layer encoding unit 205. Also, gain decoding section 503 receives third layer gain coding information in the processing frame immediately before from third layer coding section 208. Also, gain decoding section 503 receives first layer gain coding information and prediction information from demultiplexing section 501. The gain decoding unit 503 receives the shape value of the MDCT coefficient from the shape decoding unit 502.
- When performing predictive decoding, gain decoding section 503 performs predictive decoding on the first layer gain encoded information using the second layer gain coding information, the third layer gain coding information, the gain of the past frame stored in the built-in buffer, and the built-in gain codebook.
- When not performing predictive decoding, gain decoding section 503 uses the built-in gain codebook to directly inverse-quantize the first layer gain encoded information and obtain the gain (that is, without predictive decoding).
- Gain decoding section 503 obtains the MDCT coefficients of the quantization target band using the obtained gain and the shape value input from shape decoding section 502, and outputs the obtained MDCT coefficients to adding section 204 as the first layer decoded spectrum. Details of the processing of the gain decoding unit 503 will be described later.
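A minimal sketch of the non-predictive decoding path, assuming that the decoded MDCT coefficients of each subband are the product of that subband's gain and its shape code vector; the multiplication and the name `decode_band` are assumptions for illustration.

```python
def decode_band(shape_vectors, gain_codebook, g_min):
    """Rebuild the MDCT coefficients of the quantization target band.

    shape_vectors: list of L decoded shape code vectors, one per subband
    gain_codebook: list of L-dimensional gain code vectors
    g_min:         first layer gain encoding information (codebook index)
    """
    # Non-predictive decoding: the code vector itself is used as the gain.
    gains = gain_codebook[g_min]
    # Scale each subband's shape by its gain (assumed reconstruction rule).
    return [[g * s for s in shape] for g, shape in zip(gains, shape_vectors)]
```

In the predictive path, the prediction formed from past gains would be added to the decoded code vector before this scaling.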
- the first layer decoding unit 203 having the above configuration performs the following operation.
- the separation unit 501 separates the first layer encoded information into first layer band information, first layer shape encoded information, first layer gain encoded information, and prediction information.
- demultiplexing section 501 outputs the obtained first layer band information and first layer shape coding information to shape decoding section 502, and outputs the first layer gain coding information and prediction information to gain decoding section 503.
- the shape decoding unit 502 incorporates a shape codebook similar to the shape codebook included in the shape coding unit 302 of the first layer coding unit 202, and searches for the shape code vector whose index is the first layer shape coding information S_max input from the separation unit 501. Shape decoding section 502 outputs the searched shape code vector to gain decoding section 503 as the value of the shape of the MDCT coefficients in the quantization target band indicated by the first layer band information input from separation section 501.
- the gain decoding unit 503 has a built-in buffer and stores the gain obtained in the past frame.
- the gain decoding unit 503 adaptively switches the inverse quantization method to either the predictive decoding method or the non-predictive decoding method according to the prediction information (Flag_PRE).
- In the above equation (12), the gain decoding unit 503 substitutes the gain of the subband that is closest in frequency to the decoding target band of the current frame, taken from among the gains stored in the internal buffer.
- gain decoding section 503 updates the built-in buffer according to the following equation (15).
- FIG. 6 is a block diagram showing the main configuration of second layer encoding section 205.
- the second layer encoding unit 205 includes a band selection unit 601, a shape encoding unit 602, a gain encoding unit 603, and a multiplexing unit 604.
- the shape encoding unit 602 encodes the shape information using the spectrum (MDCT coefficients) corresponding to the band indicated by the second layer band information out of the first layer difference spectrum, and generates second layer shape encoded information. Next, shape coding section 602 outputs the generated second layer shape coding information to multiplexing section 604. In addition, shape coding section 602 outputs an ideal gain (gain information) calculated at the time of shape coding to gain coding section 603. Details of the processing of the shape encoding unit 602 are the same as those of the shape encoding unit 302 described above, and thus the description thereof is omitted.
- Shape decoding section 702 decodes the second layer shape encoded information input from demultiplexing section 701, and thereby obtains the shape value of the decoded MDCT coefficients corresponding to the quantization target band indicated by the second layer band information input from demultiplexing section 701.
- the shape decoding unit 702 outputs the obtained value of the shape of the decoded MDCT coefficient to the gain decoding unit 703. Details of the processing of the shape decoding unit 702 are the same as those of the shape decoding unit 502 described above, and thus the description thereof is omitted here.
- GC2^i_j is a gain code vector constituting a gain codebook used by the gain decoding unit 703.
- FIG. 8 is a block diagram showing a main configuration inside the decoding apparatus 103 shown in FIG.
- the decoding apparatus 103 is a hierarchical decoding apparatus including three decoding hierarchies (layers).
- the first layer, the second layer, and the third layer are referred to in order from the lowest bit rate.
- the encoded information separation unit 801 receives the encoded information sent from the encoding apparatus 101 via the transmission path 102, separates it into the encoded information of each layer, and outputs each part to the decoding unit responsible for decoding it. Specifically, the encoded information separation unit 801 outputs the first layer encoded information included in the encoded information to the first layer decoding unit 802. Also, the encoded information separation unit 801 outputs the second layer encoded information included in the encoded information to the second layer decoding unit 803. The encoded information separation unit 801 outputs the third layer encoded information included in the encoded information to the third layer decoding unit 804.
- the first layer decoding unit 802 generates the first layer decoded spectrum X1′′(k) by decoding the first layer encoded information input from the encoded information separation unit 801, and outputs the generated first layer decoded spectrum X1′′(k) to the adder 806. Since the processing of the first layer decoding unit 802 is the same as the processing of the first layer decoding unit 203 described above, description thereof is omitted here.
- the second layer decoding unit 803 decodes the second layer encoded information input from the encoded information separation unit 801 to generate a second layer decoded spectrum X2′′(k), and outputs the generated second layer decoded spectrum X2′′(k) to the adder 805. Also, second layer decoding section 803 outputs second layer gain coding information and second layer band information included in the second layer coding information to first layer decoding section 802. Since the process of the second layer decoding unit 803 is the same as the process of the second layer decoding unit 206 described above, the description thereof is omitted here.
- the third layer decoding unit 804 decodes the third layer encoded information input from the encoded information separation unit 801 to generate a third layer decoded spectrum X3′′(k), and outputs the generated third layer decoded spectrum X3′′(k) to the adder 805. Also, third layer decoding section 804 outputs third layer gain coding information and third layer band information included in the third layer coding information to first layer decoding section 802. Since the process of the third layer decoding unit 804 is the same as the process of the second layer decoding unit 206 described above, the description thereof is omitted here. However, the third layer decoding unit 804 performs processing by replacing GC2^i_j used in the processing of the second layer decoding unit 206 with GC3^i_j. Here, GC3^i_j is a gain code vector constituting a gain codebook used in the third layer decoding section 804.
- the adder 805 receives the second layer decoded spectrum X2′′(k) from the second layer decoder 803. Also, the adder 805 receives the third layer decoded spectrum X3′′(k) from the third layer decoder 804.
- the adding unit 805 adds the input second layer decoded spectrum X2′′(k) and third layer decoded spectrum X3′′(k), and outputs the result to the adding unit 806 as the first added spectrum X4′′(k).
- the addition unit 806 receives the first addition spectrum X4′′(k) from the addition unit 805. Also, the addition unit 806 receives the first layer decoded spectrum X1′′(k) from the first layer decoding unit 802. The addition unit 806 adds the input first addition spectrum X4′′(k) and the first layer decoded spectrum X1′′(k), and outputs the result to the orthogonal transform processing unit 807 as the second addition spectrum X5′′(k).
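The two addition units of the decoder can be sketched as follows, assuming each decoded spectrum is a list of equal length with zeros outside the band that the layer encoded. The function names are hypothetical.

```python
def add(a, b):
    """Element-wise sum of two spectra of equal length."""
    return [x + y for x, y in zip(a, b)]


def combine_layers(x1, x2, x3):
    """Return the second addition spectrum X5 from the three decoded spectra."""
    x4 = add(x2, x3)   # addition unit 805: first addition spectrum X4
    x5 = add(x4, x1)   # addition unit 806: second addition spectrum X5
    return x5
```

This mirrors the encoder side, where each higher layer encoded the residual left by the layers below it.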
- the orthogonal transform processing unit 807 first initializes the built-in buffer buf ′ (k) to a “0” value according to the following equation (16).
- X6 (k) is a vector obtained by combining the second addition spectrum X5 ′′ (k) and the buffer buf ′ (k), and is obtained using the following equation (18).
- the orthogonal transform processing unit 807 updates the buffer buf ′ (k) according to the following equation (19).
- first layer encoding section 202 switches the encoding method of the current layer based on the encoding result of each layer in the temporally previous processing frame. When the encoding apparatus 101 uses a hierarchical encoding method in which the band to be encoded is selected for each layer, this improves the encoding efficiency of the frequency parameters of the current frame and can improve the quality of the decoded signal.
- In this embodiment, the configuration has been described in which first layer encoding section 202, which is the lowest layer, includes adaptive prediction determination section 303 and switches whether or not predictive encoding / decoding is applied to encoding / decoding of first layer gain information.
- the present invention is not limited to this. That is, the present invention can be similarly applied to a configuration in which the second layer encoding section 205 and the third layer encoding section 208 of the upper layer include the adaptive prediction determination section 303.
- frequency parameters can be encoded with higher accuracy by adaptively performing predictive encoding / decoding processing.
- the configuration of performing adaptive predictive encoding / decoding processing only in some layers (for example, the lowest layer), as described in the present embodiment, is effective.
- In this embodiment, the configuration in which the first layer encoding unit 202 calculates prediction information and transmits it has been described.
- adaptive prediction determination section 303 sets prediction information using band information quantized in the temporally previous processing frame and band information selected in the current frame.
- the band information and the prediction information can be calculated by performing the same processing in the decoding apparatus 103 as well. Therefore, the prediction information does not have to be transmitted from the encoding device 101 to the decoding device 103 for the configuration employing the above determination method. In this case, it is necessary to separately input the second layer band information and the third layer band information to the first layer decoding unit 802.
- It is also necessary to provide the first layer decoding unit 802 with the adaptive prediction determination unit 303, as in the first layer encoding unit 202, so that it can set the prediction information.
- the configuration for transmitting the prediction information is effective as described in the present embodiment.
- Embodiment 2 of the present invention describes a configuration in which an encoding / decoding unit of all layers (layers) applies an adaptive prediction encoding / decoding scheme of ideal gain (gain information). Note that the adaptive predictive coding method described in the present embodiment is partially different from the adaptive predictive coding method described in the first embodiment in the past frame information used for prediction.
- the communication system (not shown) according to the second embodiment is basically the same as the communication system shown in FIG. 1, and differs from it only in part of the configuration and operation of the encoding apparatus and decoding apparatus, which differ from the encoding apparatus 101 and the decoding device 103.
- the encoding device and the decoding device in the communication system according to the present embodiment will be denoted by reference numerals “111” and “113”, respectively.
- FIG. 9 is a block diagram showing a main configuration inside encoding apparatus 111 shown in FIG.
- the encoding device 111 is a hierarchical encoding device including three encoding layers.
- the first layer, the second layer, and the third layer are referred to in order from the lowest bit rate.
- components other than the first layer encoding unit 212, the first layer decoding unit 213, the second layer encoding unit 215, the second layer decoding unit 216, and the third layer encoding unit 218 are the same as the constituent elements of the encoding apparatus 101 of the first embodiment, and thus the same reference numerals are given and the description thereof is omitted here.
- the input spectrum X1 (k) is input from the orthogonal transform processing unit 201 to the first layer encoding unit 212.
- First layer encoding section 212 encodes input spectrum X1 (k) and generates first layer encoded information.
- first layer encoding section 212 outputs the generated first layer encoded information to first layer decoding section 213 and encoded information integration section 209. Details of first layer encoding section 212 will be described later.
- the first layer decoding unit 213 decodes the first layer encoded information input from the first layer encoding unit 212, and calculates a first layer decoded spectrum. Next, first layer decoding section 213 outputs the generated first layer decoded spectrum to adding section 204. Also, first layer decoding section 213 outputs ideal gain (gain information) obtained when decoding first layer encoded information to second layer encoding section 215 and third layer encoding section 218. Details of first layer decoding section 213 will be described later.
- Second layer encoding section 215 generates second layer encoded information using the first layer difference spectrum input from adding section 204, and outputs the generated second layer encoded information to second layer decoding section 216 and encoded information integration unit 209. Details of second layer encoding section 215 will be described later.
- FIG. 10 is a block diagram showing the main configuration of first layer encoding section 212.
- the first layer encoding unit 212 includes a band selection unit 301, a shape encoding unit 302, an adaptive prediction determination unit 313, a gain encoding unit 314, and a multiplexing unit 305.
- Components identical to those of the first embodiment are given the same reference numerals, and their description is omitted.
- the adaptive prediction determination unit 313 obtains the number of subbands common between the quantization target band of the current frame and the quantization target band of the past frame using the input first layer band information. When the number of common subbands is equal to or greater than a predetermined value, the adaptive prediction determination unit 313 determines that predictive coding is performed on the spectrum (MDCT coefficients) of the quantization target band indicated by the first layer band information. On the other hand, when the number of common subbands is smaller than the predetermined value, the adaptive prediction determination unit 313 determines that predictive coding is not performed on the spectrum (MDCT coefficients) of the quantization target band indicated by the first layer band information (that is, encoding is performed without applying prediction).
- The ideal gain is input to gain encoding section 314 from shape encoding section 302, and the first layer prediction information is input to gain encoding section 314 from adaptive prediction determination section 313.
- When the first layer prediction information indicates that predictive encoding is to be performed, gain encoding section 314 performs predictive encoding on the ideal gain input from shape encoding section 302 to obtain first layer gain encoded information. In doing so, gain encoding section 314 uses the quantization gains of past frames stored in its built-in buffer and its built-in gain codebook.
- When the first layer prediction information indicates that predictive encoding is not to be performed, gain encoding section 314 quantizes the ideal gain input from shape encoding section 302 as is (that is, without applying prediction) to obtain first layer gain encoded information.
- Gain encoding section 314 outputs the obtained first layer gain encoded information to multiplexing section 305. Details of the processing of gain encoding section 314 will be described later.
- the first layer encoding unit 212 having the above configuration performs the following operation. However, processes other than the adaptive prediction determination unit 313 and the gain encoding unit 314 are the same as those in the first embodiment, and thus the description thereof is omitted.
- the gain encoding unit 314 has a built-in buffer and stores the quantization gain obtained in the past frame.
- FIG. 11 is a block diagram showing the main configuration of first layer decoding section 213.
- Gain decoding section 513 performs predictive decoding using the first layer gain encoded information, the gains of past frames stored in its built-in buffer, and its built-in gain codebook.
- In this case, gain decoding section 513 performs non-predictive decoding. That is, gain decoding section 513 inversely quantizes the gain value according to equation (13) using the above gain codebook. Here too, the gain is treated as an L-dimensional vector and vector inverse quantization is performed. In other words, when predictive decoding is not performed, gain decoding section 513 uses gain code vector GC1_j^{G_min}, which corresponds to first layer gain encoded information G_min, directly as the gain.
- Gain decoding section 513 then calculates the first layer decoded spectrum (decoded MDCT coefficients) X1″(k) according to equation (14), using the gain obtained by inverse quantization for the current frame and the shape value input from shape decoding section 502.
- Here, the gain takes the value Gain_q′(j″).
- the gain decoding unit 513 updates the built-in buffer according to the equation (21).
- Gain decoding section 513 outputs first layer decoded spectrum X1 ′′ (k) calculated according to equation (14) to adding section 204.
- FIG. 12 is a block diagram showing the main configuration of second layer encoding section 215.
- the second layer encoding unit 215 includes a band selection unit 601, a shape encoding unit 602, an adaptive prediction determination unit 613, a gain encoding unit 614, and a multiplexing unit 604.
- Components other than adaptive prediction determination section 613 and gain encoding section 614 are the same as those of second layer encoding section 205 in Embodiment 1, so they are assigned the same reference numerals and their description is omitted.
- Adaptive prediction determination section 613 has a built-in buffer and stores the band information (first layer band information and second layer band information) input in the past from band selection section 601 and first layer decoding section 213.
- the first layer band information is input from the first layer decoding unit 213 to the adaptive prediction determination unit 613.
- the second layer band information is input from the band selection unit 601 to the adaptive prediction determination unit 613.
- Using the input band information (first layer band information and second layer band information), adaptive prediction determination section 613 finds the number of subbands common to the quantization target band of the current frame and the quantization target band of the past frame.
- When the number of common subbands is equal to or greater than a predetermined value, adaptive prediction determination section 613 determines that predictive encoding is to be performed on the spectrum (MDCT coefficients) of the quantization target band indicated by the second layer band information. When the number of common subbands is smaller than the predetermined value, adaptive prediction determination section 613 determines that predictive encoding is not to be performed on that spectrum (that is, that encoding is to be performed without applying prediction).
- the adaptive prediction determination unit 613 outputs the determination result to the gain encoding unit 614 and the multiplexing unit 604 as second layer prediction information (Flag_PRE2).
- the adaptive prediction determination unit 613 sets the value of Flag_PRE2 to 1 when determining to perform prediction, and sets the value of Flag_PRE2 to 0 when determining that prediction is not performed. Details of the process of the adaptive prediction determination unit 613 will be described later.
- the gain encoding unit 614 has an internal buffer and stores the quantization gain obtained in the past frame.
- the ideal gain is input from the shape encoding unit 602 to the gain encoding unit 614. Further, first layer gain encoding information is input to gain encoding section 614 from first layer decoding section 213. Also, the second layer prediction information is input from the adaptive prediction determination unit 613 to the gain encoding unit 614.
- When the second layer prediction information indicates that predictive encoding is to be performed, gain encoding section 614 performs predictive encoding on the ideal gain input from shape encoding section 602 to obtain second layer gain encoded information. In doing so, gain encoding section 614 uses the quantization gains of past frames stored in its built-in buffer, its built-in gain codebook, and the first layer gain encoded information.
- the gain encoding unit 614 outputs the obtained second layer gain encoding information to the multiplexing unit 604. Details of the processing of the gain encoding unit 614 will be described later.
- the adaptive prediction determination unit 613 has a built-in buffer, and stores the second layer band information and the first layer band information in the past frame.
- Here, a case in which adaptive prediction determination section 613 includes a buffer that stores band information for one past frame is described as an example.
- the first layer band information in the current frame is input from the first layer decoding unit 213 to the adaptive prediction determination unit 613.
- The set M12_{t-1} can be expressed by equation (23) below using the sets M1_{t-1} and M2_{t-1}. Likewise, the set M12_t can be expressed by equation (24) below using the sets M1_t and M2_t.
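Equations (23) and (24) are not reproduced in this text. Based on the claims, which define the compared bands as the union of the first and second quantization target bands, they presumably take the following form; this reconstruction is an assumption, with M1 and M2 read as the sets of subbands selected by the first and second layers:

```latex
\begin{align}
M12_{t-1} &= M1_{t-1} \cup M2_{t-1} \tag{23} \\
M12_{t}   &= M1_{t} \cup M2_{t} \tag{24}
\end{align}
```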
- Adaptive prediction determination section 613 sets the value of second layer prediction information Flag_PRE2 based on the number of subbands common to M12_{t-1} and M12_t, as described above.
- the quantization method is adaptively switched to either the predictive coding method or the non-predictive coding method.
- adaptive prediction determination section 613 outputs second layer prediction information (Flag_PRE2) as information indicating the determination result to gain encoding section 614 and multiplexing section 604.
- adaptive prediction determination section 613 updates the built-in buffer using the first layer band information and the second layer band information in the current frame.
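The second layer determination, including the buffer update, can be sketched as below. The class name, the use of Python sets for band information, and the empty initial buffer are assumptions made for illustration; the specification only states that one past frame of band information is buffered and that the decision depends on the number of common subbands.

```python
class SecondLayerPredictionDeterminer:
    """Hypothetical sketch of adaptive prediction determination section 613."""

    def __init__(self, threshold):
        self.threshold = threshold   # "predetermined value" (assumed)
        self.prev_union = set()      # M12_{t-1}; starts empty by assumption

    def decide(self, layer1_band, layer2_band):
        # M12_t: union of the first and second layer quantization target bands.
        cur_union = set(layer1_band) | set(layer2_band)
        common = len(self.prev_union & cur_union)
        flag_pre2 = 1 if common >= self.threshold else 0
        # Update the built-in buffer with the current frame's band information.
        self.prev_union = cur_union
        return flag_pre2


determiner = SecondLayerPredictionDeterminer(threshold=3)
flags = [
    determiner.decide({0, 1}, {2, 3}),    # no history yet
    determiner.decide({1, 2}, {3, 4}),    # shares subbands 1, 2, 3 with the past frame
    determiner.decide({8, 9}, {10, 11}),  # band moved away
]
```

Taking the union means a subband quantized by either layer in the past frame counts as history available for prediction.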
- the gain encoding unit 614 has an internal buffer and stores the quantization gain obtained in the past frame. Further, first layer gain encoding information is input to gain encoding section 614 from first layer decoding section 213. Further, the second layer prediction information (Flag_PRE2) is input from the adaptive prediction determination unit 613 to the gain encoding unit 614.
- Flag_PRE2: the second layer prediction information
- the gain encoding unit 614 adaptively switches the quantization method to either the predictive encoding method or the non-predictive encoding method according to the second layer prediction information (Flag_PRE2).
- C1^t_j denotes the gain quantized by first layer encoding section 212 t frames earlier in time.
- For example, C1^1_j denotes the gain quantized by first layer encoding section 212 one frame earlier in time.
- C2^t_j denotes the gain quantized by second layer encoding section 215 t frames earlier in time.
- α0 to α3 are fourth-order linear prediction coefficients stored in gain encoding section 614. Note that gain encoding section 614 treats the L subbands in one region as an L-dimensional vector and performs vector quantization.
- Gain encoding section 614 updates its built-in buffer according to equation (27) below, using second layer gain encoded information G_min and quantization gains C1^t_j and C2^t_j obtained in the current frame.
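Equation (27) is not reproduced in this text. A buffer update of this kind is commonly a shift of the stored history by one frame, which the sketch below illustrates. The class name `GainHistory`, the three-frame depth (taken from the "up to three preceding frames" wording elsewhere in the description), and the zero initial state are assumptions.

```python
from collections import deque


class GainHistory:
    """Hypothetical three-frame history of quantized gains for one layer."""

    def __init__(self, num_subbands, depth=3):
        # Start from zeros (assumption); frames[0] is one frame back.
        self.frames = deque(
            ([0.0] * num_subbands for _ in range(depth)), maxlen=depth
        )

    def update(self, quantized_gain):
        # The newest frame becomes "one frame earlier"; the oldest is dropped.
        self.frames.appendleft(list(quantized_gain))

    def past(self, t):
        # Gain vector quantized t frames earlier (t = 1, 2, 3).
        return self.frames[t - 1]


history = GainHistory(num_subbands=2)
history.update([1.0, 2.0])
history.update([3.0, 4.0])
```

After the two updates, `history.past(1)` holds the most recent frame and `history.past(3)` still holds the initial zeros.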
- FIG. 13 is a block diagram showing the main configuration of second layer decoding section 216.
- the second layer decoding unit 216 includes a separation unit 701, a shape decoding unit 702, and a gain decoding unit 713.
- constituent elements other than gain decoding section 713 are the same as the constituent elements of second layer decoding section 206 described in Embodiment 1, and therefore the same reference numerals are assigned and description thereof is omitted.
- Separation section 701 in the present embodiment differs from separation section 701 in Embodiment 1 only in that it outputs the separated second layer band information and second layer gain encoded information to third layer encoding section 218.
- the gain decoding unit 713 receives the second layer prediction information (Flag_PRE2) and the second layer gain coding information from the separation unit 701.
- the gain decoding unit 713 receives the MDCT coefficient shape value from the shape decoding unit 702.
- When the second layer prediction information indicates that predictive encoding was performed, gain decoding section 713 performs predictive decoding using the second layer gain encoded information, the gains of past frames stored in its built-in buffer, and its built-in gain codebook.
- When the second layer prediction information indicates that predictive encoding was not performed, gain decoding section 713 obtains the gain by inversely quantizing the second layer gain encoded information as is (that is, without performing predictive decoding) using its built-in gain codebook.
- Gain decoding section 713 obtains the MDCT coefficients of the quantization target band using the obtained gain and the shape value input from shape decoding section 702, and outputs the obtained MDCT coefficients to adding section 207 as the second layer decoded spectrum.
- the second layer decoding unit 216 having the above configuration performs the following operation. Only the processing of the gain decoding unit 713 will be described here.
- the gain decoding unit 713 has a built-in buffer and stores the gain obtained in the past frame.
- the gain decoding unit 713 adaptively switches the inverse quantization method to either the predictive decoding method or the non-predictive decoding method according to the second layer prediction information (Flag_PRE2).
- In this case, gain decoding section 713 performs predictive decoding. That is, gain decoding section 713 performs inverse quantization by predicting the gain of the current frame using the gains of past frames stored in its built-in buffer.
- Specifically, gain decoding section 713 has a built-in gain codebook similar to that of gain encoding section 614 of second layer encoding section 215, and inversely quantizes the gain according to equation (28) below to obtain gain Gain_q′.
- C1″^t_j denotes the gain value inversely quantized by first layer decoding section 213 t frames earlier in time.
- For example, C1″^1_j denotes the gain obtained by inverse quantization in first layer decoding section 213 one frame earlier.
- C2″^t_j denotes the gain value inversely quantized by second layer decoding section 216 t frames earlier in time.
- α0 to α3 are fourth-order linear prediction coefficients stored in gain decoding section 713.
- the gain decoding unit 713 treats L subbands in one region as an L-dimensional vector, and performs vector inverse quantization.
- In equation (28) above, gain decoding section 713 substitutes, from among the gains stored in its built-in buffer, the gain of the subband closest in frequency to the decoding target band of the current frame.
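The substitution can be sketched as a nearest-neighbour lookup. The sketch assumes that the substitution applies when the buffer holds no gain for a target subband, that "closest in frequency" can be measured as distance between subband indices, and that ties go to the lower index; none of these details is stated in the text, and the function name is hypothetical.

```python
def lookup_past_gain(buffered_gains, subband):
    """Return the past-frame gain for `subband`, substituting the nearest one.

    buffered_gains: dict mapping subband index -> gain stored for a past frame.
    Raises ValueError if the buffer is empty.
    """
    if subband in buffered_gains:
        return buffered_gains[subband]
    # Nearest stored subband by index distance; ties resolved to the lower index.
    nearest = min(buffered_gains, key=lambda j: (abs(j - subband), j))
    return buffered_gains[nearest]


past_frame = {2: 0.5, 3: 0.7, 8: 1.1}
substituted = lookup_past_gain(past_frame, 5)  # subband 5 was not quantized before
```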
- In this case, gain decoding section 713 performs non-predictive decoding. That is, gain decoding section 713 inversely quantizes the gain value according to equation (29) below using the above gain codebook. Here too, the gain is treated as an L-dimensional vector and vector inverse quantization is performed. In other words, when predictive decoding is not performed, gain decoding section 713 uses gain code vector GC2_j^{G_min}, which corresponds to second layer gain encoded information G_min, directly as the gain.
- Gain decoding section 713 then calculates the second layer decoded spectrum (decoded MDCT coefficients) X2″(k) according to equation (30) below, using the gain obtained by inverse quantization for the current frame and the shape value input from shape decoding section 702.
- Here, the gain takes the value Gain_q′(j″).
- the gain decoding unit 713 updates the built-in buffer according to the equation (27).
- Gain decoding section 713 outputs second layer decoded spectrum X2 ′′ (k) calculated according to equation (30) to addition section 207.
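Equation (30) is not reproduced in this text. In gain-shape coding the decoded coefficient is normally the product of the subband gain and the decoded shape value, and the sketch below assumes that form. The function name, the `band_edges` layout (subband j covers coefficients `band_edges[j]` up to but not including `band_edges[j + 1]`), and the full-length shape array are illustrative assumptions.

```python
def reconstruct_spectrum(gains, shape, band_edges, first_subband):
    """Rebuild decoded MDCT coefficients as gain times shape per subband.

    gains: decoded gains for the L consecutive subbands of the target band.
    shape: decoded shape values indexed by coefficient k (zero outside the band).
    band_edges: start index of every subband, plus one final end index.
    first_subband: index of the first subband of the quantization target band.
    """
    spectrum = [0.0] * len(shape)
    for offset, gain in enumerate(gains):
        j = first_subband + offset
        for k in range(band_edges[j], band_edges[j + 1]):
            spectrum[k] = gain * shape[k]
    return spectrum


decoded = reconstruct_spectrum(
    gains=[2.0, 0.5],
    shape=[0.0, 0.0, 1.0, -1.0, 2.0, 4.0, 0.0, 0.0],
    band_edges=[0, 2, 4, 6, 8],
    first_subband=1,
)
```

Coefficients outside the quantization target band stay at zero, which matches a layered design where each layer contributes only within its selected band.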
- Adaptive prediction determination section 1403 outputs the determination result to gain encoding section 1404 and multiplexing section 1405 as third layer prediction information (Flag_PRE3).
- the adaptive prediction determination unit 1403 sets the value of Flag_PRE3 to 1 when determining to perform prediction, and sets the value of Flag_PRE3 to 0 when not performing prediction. Details of the process of the adaptive prediction determination unit 1403 will be described later.
- the ideal gain is input to the gain encoding unit 1404 from the shape encoding unit 1402. Further, third layer prediction information is input to gain encoding section 1404 from adaptive prediction determination section 1403. Also, gain encoding section 1404 receives first layer gain encoding information from first layer decoding section 213. Further, second layer gain encoding information is input to gain encoding section 1404 from second layer decoding section 216.
- When the third layer prediction information indicates that predictive encoding is to be performed, gain encoding section 1404 performs predictive encoding on the ideal gain input from shape encoding section 1402 to obtain third layer gain encoded information. In doing so, gain encoding section 1404 uses the quantization gains of past frames stored in its built-in buffer, its built-in gain codebook, the first layer gain encoded information, and the second layer gain encoded information.
- When the third layer prediction information indicates that predictive encoding is not to be performed, gain encoding section 1404 quantizes the ideal gain input from shape encoding section 1402 as is (that is, without applying prediction).
- Gain coding section 1404 outputs the obtained third layer gain coding information to multiplexing section 1405. Details of the processing of the gain encoding unit 1404 will be described later.
- the adaptive prediction determination unit 1403 has a built-in buffer and stores the third layer band information, the first layer band information, and the second layer band information in the past frame.
- adaptive prediction determination section 1403 has a built-in buffer for storing band information for one past frame.
- Adaptive prediction determination section 1403 finds the number of subbands common to the quantization target band of the past frame and the quantization target band of the current frame, using the third layer band information, first layer band information, and second layer band information of the past frame (which are stored in its built-in buffer) and the third layer band information, first layer band information, and second layer band information of the current frame.
- Adaptive prediction determination section 1403 sets the value of third layer prediction information Flag_PRE3 based on the number of subbands common to M123_{t-1} and M123_t, as described above.
- the quantization method is adaptively switched to either the predictive coding method or the non-predictive coding method.
- the gain encoding unit 1404 has an internal buffer and stores the quantization gain obtained in the past frame.
- the gain encoding unit 1404 adaptively switches the quantization method to either the predictive encoding method or the non-predictive encoding method according to the third layer prediction information (Flag_PRE3).
- Gain encoding section 1404 updates its built-in buffer according to equation (35) below, using the third layer gain encoded information obtained in the current frame and quantization gains C1^t_j, C2^t_j, and C3^t_j.
- Second layer decoding section 813 decodes the second layer encoded information input from encoded information separation section 801 to generate second layer decoded spectrum X2″(k), and outputs the generated second layer decoded spectrum X2″(k) to adding section 805. Since the processing of second layer decoding section 813 is the same as that of second layer decoding section 216 in encoding apparatus 111, its description is omitted.
- Separation section 1601 separates the third layer encoded information output from encoded information separation section 801 into third layer band information, third layer shape encoded information, third layer gain encoded information, and third layer prediction information. Separation section 1601 outputs the obtained third layer band information and third layer shape encoded information to shape decoding section 1602, and outputs the third layer gain encoded information and third layer prediction information to gain decoding section 1603.
- When the third layer prediction information indicates that predictive encoding was not performed, gain decoding section 1603 obtains the gain by inversely quantizing the third layer gain encoded information as is (that is, without performing predictive decoding) using its built-in gain codebook.
- Separation section 1601 separates the third layer encoded information into third layer band information, third layer shape encoded information, third layer gain encoded information, and third layer prediction information.
- Separation section 1601 outputs the obtained third layer band information and third layer shape encoded information to shape decoding section 1602, and outputs the third layer gain encoded information and third layer prediction information to gain decoding section 1603.
- In equation (36) above, gain decoding section 1603 substitutes, from among the gains stored in its built-in buffer, the gain of the subband closest in frequency to the decoding target band of the current frame.
- In this case, gain decoding section 1603 performs non-predictive decoding. That is, gain decoding section 1603 inversely quantizes the gain value according to equation (37) below using the above gain codebook. Here too, the gain is treated as an L-dimensional vector and vector inverse quantization is performed. In other words, when predictive decoding is not performed, gain decoding section 1603 uses gain code vector GC3_j^{G_min}, which corresponds to gain encoded information G_min, directly as the gain.
- Gain decoding section 1603 then calculates the third layer decoded spectrum (decoded MDCT coefficients) X3″(k) according to equation (38) below, using the gain obtained by inverse quantization for the current frame and the shape value input from shape decoding section 1602.
- Here, the gain takes the value Gain_q′(j″).
- gain decoding section 1603 updates the built-in buffer according to equation (35).
- the gain decoding unit 1603 outputs the third layer decoded spectrum X3 ′′ (k) calculated according to the above equation (38) to the adding unit 805.
- First layer encoding section 212, second layer encoding section 215, and third layer encoding section 218 each determine the band to be encoded in their own layer.
- The method of encoding the frequency parameters of the current layer is switched based on the encoding results of each layer in the temporally preceding processing frame.
- This improves the encoding efficiency of the frequency parameters of the current frame, so that quality can be improved.
- the gain encoding section of each layer performs adaptive prediction quantization using only the quantization gain of the layers below each layer.
- The encoding apparatus and the decoding apparatus can thus perform encoding and decoding under the same conditions, so that encoding performance can be guaranteed.
- Adaptive prediction determination sections 313, 613, and 1403 set the prediction information using the band information quantized in the temporally preceding processing frame and the band information selected in the current frame.
- the band information and the prediction information can be calculated in the decoding device 113 by the same process. Therefore, it is not necessary to transmit the prediction information from the encoding device 111 to the decoding device 113 for the configuration employing the above determination method.
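Why the prediction information need not be transmitted under this determination method can be illustrated with a small sketch: if the decision depends only on band information that the decoder also receives, running the same rule on both sides yields the same flags. The function name, set-based band representation, threshold, and empty initial history are assumptions for illustration.

```python
def derive_prediction_flags(band_sequence, threshold):
    """Derive one prediction flag per frame from band information alone."""
    flags, prev_band = [], set()  # empty history before the first frame (assumed)
    for band in band_sequence:
        cur_band = set(band)
        flags.append(1 if len(prev_band & cur_band) >= threshold else 0)
        prev_band = cur_band
    return flags


# Band information selected by the encoder and carried in the bitstream.
transmitted_bands = [{0, 1, 2}, {1, 2, 3}, {7, 8, 9}]

encoder_flags = derive_prediction_flags(transmitted_bands, threshold=2)
decoder_flags = derive_prediction_flags(transmitted_bands, threshold=2)
```

Both sides arrive at the same flag sequence without any flag bits being sent, provided that the band information itself is received intact.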
- a configuration for transmitting prediction information is effective as described in the present embodiment.
- The present invention is not limited to this, and can be applied in the same way to configurations with a different number of layers.
- When information such as encoded information is multiplexed in two consecutive stages, the multiplexing may be performed collectively in the later stage (for example, the two stages of multiplexing section 305 and encoded information integration section 209).
- Likewise, when multiplexed information such as encoded information is separated in two consecutive stages, the separation may be performed collectively in the earlier stage (for example, the two stages of encoded information separation section 801 and separation section 1601).
- When three or more signals are added in two consecutive stages, they may be added together at once (for example, the two stages of adding section 805 and adding section 806).
- The present invention can also be applied to a case in which a signal processing program is recorded or written on a machine-readable recording medium such as a memory, disk, tape, CD, or DVD and then executed, and operations and effects similar to those of the above embodiments can be obtained.
- each functional block used in the description of the above embodiment is typically realized as an LSI which is an integrated circuit. These may be individually made into one chip, or may be made into one chip so as to include a part or all of them. Although referred to as LSI here, it may be referred to as IC, system LSI, super LSI, or ultra LSI depending on the degree of integration.
- the method of circuit integration is not limited to LSI, and implementation with a dedicated circuit or a general-purpose processor is also possible.
- An FPGA (Field Programmable Gate Array), or a reconfigurable processor in which the connections and settings of circuit cells inside the LSI can be reconfigured, may be used.
Abstract
Description
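FIG. 1 is a block diagram showing the configuration of a communication system having an encoding apparatus and a decoding apparatus according to Embodiment 1 of the present invention. In FIG. 1, the communication system includes encoding apparatus 101 and decoding apparatus 103, which can communicate with each other via transmission path 102. Both encoding apparatus 101 and decoding apparatus 103 are normally mounted in and used by a base station apparatus, a communication terminal apparatus, or the like.
In this case, gain encoding section 304 performs predictive encoding. That is, gain encoding section 304 generates the quantization gain of the current frame by predicting the gain of the current frame using the quantization gains quantized in up to the three temporally preceding processing frames, which are stored in its built-in buffer, together with the second layer gain encoded information and the third layer gain encoded information. Specifically, for each of the L subbands, gain encoding section 304 searches its built-in gain codebook consisting of GQ gain code vectors and finds the index of the gain code vector that minimizes squared error Gain_q(i) of equation (9) below.
In this case, gain encoding section 304 performs non-predictive encoding. Specifically, gain encoding section 304 directly quantizes ideal gain Gain_i(j) input from shape encoding section 302 according to equation (10) below. Here too, gain encoding section 304 treats the ideal gain as an L-dimensional vector and performs vector quantization.
In this case, gain decoding section 503 performs predictive decoding. That is, gain decoding section 503 performs inverse quantization by predicting the gain of the current frame using the gains of past frames stored in its built-in buffer. Specifically, gain decoding section 503 has a built-in gain codebook similar to that of gain encoding section 304 of first layer encoding section 202, and inversely quantizes the gain according to equation (12) below to obtain gain Gain_q′.
In this case, gain decoding section 503 performs non-predictive decoding. That is, gain decoding section 503 inversely quantizes the gain according to equation (13) below using the above gain codebook. Here too, the gain is treated as an L-dimensional vector and vector inverse quantization is performed. In other words, when predictive decoding is not performed, gain decoding section 503 uses gain code vector GC1_j^{G_min}, which corresponds to first layer gain encoded information G_min, directly as the gain.
Embodiment 2 of the present invention describes a configuration in which the encoding sections and decoding sections of all layers apply an adaptive predictive encoding/decoding scheme to the ideal gain (gain information). The adaptive predictive encoding scheme described in this embodiment differs from the one described in Embodiment 1 in part of the past-frame information used for prediction.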
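In this case, gain encoding section 314 performs predictive encoding. That is, gain encoding section 314 generates the quantization gain of the current frame by predicting the gain of the current frame using the quantization gains quantized in up to the three temporally preceding processing frames, which are stored in its built-in buffer, and the first layer gain encoded information. Specifically, for each of the L subbands, gain encoding section 314 searches its built-in gain codebook consisting of GQ gain code vectors and finds the index of the gain code vector that minimizes squared error Gain_q(i) of equation (20) below.
In this case, gain encoding section 314 performs non-predictive encoding. Specifically, gain encoding section 314 directly quantizes ideal gain Gain_i(j) input from shape encoding section 302 according to equation (10) above. Here too, gain encoding section 314 treats the ideal gain as an L-dimensional vector and performs vector quantization.
In this case, gain decoding section 513 performs predictive decoding. That is, gain decoding section 513 performs inverse quantization by predicting the gain of the current frame using the gains of past frames stored in its built-in buffer. Specifically, gain decoding section 513 has a built-in gain codebook similar to that of gain encoding section 314 of first layer encoding section 212, and inversely quantizes the gain according to equation (22) below to obtain gain Gain_q′.
In this case, gain decoding section 513 performs non-predictive decoding. That is, gain decoding section 513 inversely quantizes the gain value according to equation (13) using the above gain codebook. Here too, the gain is treated as an L-dimensional vector and vector inverse quantization is performed. In other words, when predictive decoding is not performed, gain decoding section 513 uses gain code vector GC1_j^{G_min}, which corresponds to first layer gain encoded information G_min, directly as the gain.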
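In this case, gain encoding section 614 performs predictive encoding. That is, gain encoding section 614 generates the quantization gain of the current frame by predicting the gain of the current frame using the quantization gains quantized in up to the three temporally preceding processing frames, which are stored in its built-in buffer, and the first layer gain encoded information of up to the three temporally preceding processing frames. Specifically, for each of the L subbands, gain encoding section 614 searches its built-in gain codebook consisting of GQ gain code vectors and finds the index of the gain code vector that minimizes squared error Gain_q(i) of equation (25) below.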
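Equation (25) is not reproduced in this text, so the following is only a sketch of a predictive codebook search of this general kind. It assumes that the prediction is a weighted sum of three past gain vectors with coefficients α1 to α3, that α0 scales the code vector, and that a single three-frame history is used; the actual equation also draws on first layer gains, which this sketch does not model. All names are hypothetical.

```python
def predictive_gain_search(ideal_gain, past_gains, codebook, alpha):
    """Find the code vector index minimizing the squared prediction error.

    ideal_gain: L ideal gains for the quantization target band.
    past_gains: [C^1, C^2, C^3], gain vectors from 1, 2, and 3 frames earlier.
    codebook:   GQ candidate gain code vectors, each of length L.
    alpha:      [a0, a1, a2, a3] linear prediction coefficients (assumed form).
    Returns (best_index, quantized_gain).
    """
    num = len(ideal_gain)
    # Contribution predicted from the three past frames.
    predicted = [
        sum(alpha[t] * past_gains[t - 1][j] for t in (1, 2, 3)) for j in range(num)
    ]
    best_index, best_error = -1, float("inf")
    for i, code in enumerate(codebook):
        error = sum(
            (ideal_gain[j] - predicted[j] - alpha[0] * code[j]) ** 2
            for j in range(num)
        )
        if error < best_error:
            best_index, best_error = i, error
    best = codebook[best_index]
    quantized = [predicted[j] + alpha[0] * best[j] for j in range(num)]
    return best_index, quantized


index, quantized = predictive_gain_search(
    ideal_gain=[1.0, 2.0],
    past_gains=[[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]],
    codebook=[[0.0, 0.0], [0.5, 1.5], [2.0, 2.0]],
    alpha=[1.0, 0.2, 0.1, 0.1],
)
```

Because only the residual after prediction has to be matched by the codebook, a stable gain trajectory across frames leaves a small residual for the codebook to cover.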
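In this case, gain encoding section 614 performs non-predictive encoding. Specifically, gain encoding section 614 directly quantizes ideal gain Gain_i(j) input from shape encoding section 602 according to equation (26) below. Here too, gain encoding section 614 treats the ideal gain as an L-dimensional vector and performs vector quantization.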
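Direct (non-predictive) vector quantization can be sketched as a nearest-neighbour codebook search, with the decoder side reduced to a table lookup. Equation (26) is not reproduced in this text; the squared-error criterion and the function names below are assumptions.

```python
def nonpredictive_gain_search(ideal_gain, codebook):
    """Return the index of the code vector closest to the ideal gain vector."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((g - c) ** 2 for g, c in zip(ideal_gain, codebook[i])),
    )


def nonpredictive_gain_decode(index, codebook):
    """Use the indexed code vector directly as the decoded gain."""
    return list(codebook[index])


codebook = [[0.0, 0.0], [1.0, 2.0], [3.0, 3.0]]
g_min = nonpredictive_gain_search([1.2, 1.7], codebook)
decoded_gain = nonpredictive_gain_decode(g_min, codebook)
```

No past-frame state enters either function, so this path remains usable when the selected band has little overlap with the previous frame.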
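In this case, gain decoding section 713 performs predictive decoding. That is, gain decoding section 713 performs inverse quantization by predicting the gain of the current frame using the gains of past frames stored in its built-in buffer. Specifically, gain decoding section 713 has a built-in gain codebook similar to that of gain encoding section 614 of second layer encoding section 215, and inversely quantizes the gain according to equation (28) below to obtain gain Gain_q′.
In this case, gain decoding section 713 performs non-predictive decoding. That is, gain decoding section 713 inversely quantizes the gain value according to equation (29) below using the above gain codebook. Here too, the gain is treated as an L-dimensional vector and vector inverse quantization is performed. In other words, when predictive decoding is not performed, gain decoding section 713 uses gain code vector GC2_j^{G_min}, which corresponds to second layer gain encoded information G_min, directly as the gain.
In this case, gain encoding section 1404 performs predictive encoding. That is, gain encoding section 1404 generates the quantization gain of the current frame by predicting the gain of the current frame using the quantization gains quantized by third layer encoding section 218 in up to the three temporally preceding processing frames, which are stored in its built-in buffer, the first layer gain encoded information of up to the three temporally preceding processing frames, and the second layer gain encoded information of up to the three temporally preceding processing frames. Specifically, for each of the L subbands, gain encoding section 1404 searches its built-in gain codebook consisting of GQ gain code vectors and finds the index of the gain code vector that minimizes squared error Gain_q(i) of equation (33) below.
In this case, gain encoding section 1404 performs non-predictive encoding. Specifically, gain encoding section 1404 directly quantizes ideal gain Gain_i(j) input from shape encoding section 1402 according to equation (35) below. Here too, gain encoding section 1404 treats the ideal gain as an L-dimensional vector and performs vector quantization.
In this case, gain decoding section 1603 performs predictive decoding. That is, gain decoding section 1603 performs inverse quantization by predicting the gain of the current frame using the gains of past frames stored in its built-in buffer. Specifically, gain decoding section 1603 has a built-in gain codebook similar to that of gain encoding section 1404 of third layer encoding section 218, and inversely quantizes the gain according to equation (36) below to obtain gain Gain_q′.
In this case, gain decoding section 1603 performs non-predictive decoding. That is, gain decoding section 1603 inversely quantizes the gain value according to equation (37) below using the above gain codebook. Here too, the gain is treated as an L-dimensional vector and vector inverse quantization is performed. In other words, when predictive decoding is not performed, gain decoding section 1603 uses gain code vector GC3_j^{G_min}, which corresponds to gain encoded information G_min, directly as the gain.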
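102 Transmission path
103, 113 Decoding apparatus
201, 807 Orthogonal transform processing section
202, 212 First layer encoding section
203, 213, 802, 812 First layer decoding section
204, 207, 805, 806 Adding section
205, 215 Second layer encoding section
206, 216, 803, 813 Second layer decoding section
208, 218 Third layer encoding section
209 Encoded information integration section
301, 601, 1401 Band selection section
302, 602, 1402 Shape encoding section
303, 313, 613, 1403 Adaptive prediction determination section
304, 314, 603, 614, 1404 Gain encoding section
305, 604, 1405 Multiplexing section
501, 701, 1601 Separation section
502, 702, 1602 Shape decoding section
503, 513, 703, 713, 1603 Gain decoding section
801 Encoded information separation section
804, 814 Third layer decoding section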
Claims (26)
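- An encoding apparatus having at least two encoding layers, comprising:
a first layer encoding section that receives a frequency-domain input signal, selects a first quantization target band of the input signal from among a plurality of subbands obtained by dividing the frequency domain to obtain first band information, obtains a first gain of the input signal in the first quantization target band, generates first encoded information including the first band information and first gain encoded information obtained by encoding the first gain, and generates a difference signal between the input signal and a decoded signal obtained by decoding using the first encoded information; and
a second layer encoding section that receives the difference signal, selects a second quantization target band of the difference signal from among the plurality of subbands to obtain second band information, obtains a second gain of the difference signal in the second quantization target band, and generates second encoded information including the second band information and second gain encoded information obtained by encoding the second gain,
wherein the first layer encoding section comprises
a determination section that determines a method of encoding the first gain from among a plurality of candidates based on the first band information.
- The encoding apparatus according to claim 1, wherein the determination section
further determines the encoding method based on the second band information.
- The encoding apparatus according to claim 1, wherein the determination section
determines the encoding method to be either a predictive encoding method or a non-predictive encoding method based on the first band information and the second band information.
- The encoding apparatus according to claim 1, wherein the determination section
determines the encoding method to be either a predictive encoding method or a non-predictive encoding method based on the first band information and the second band information of a past frame and the first band information and the second band information of a current frame.
- The encoding apparatus according to claim 1, wherein the determination section
determines the encoding method to be either a predictive encoding method or a non-predictive encoding method based on a result of comparing a third quantization target band with a fourth quantization target band, the third quantization target band being the union of the first quantization target band and the second quantization target band of a past frame, obtained using the first band information and the second band information of the past frame, and the fourth quantization target band being the union of the first quantization target band and the second quantization target band of a current frame, obtained using the first band information and the second band information of the current frame.
- The encoding apparatus according to claim 5, wherein the determination section
determines the encoding method to be the predictive encoding method when the result indicates that the number of subbands common to the third quantization target band and the fourth quantization target band is equal to or greater than a preset threshold, and determines the encoding method to be the non-predictive encoding method when the number of common subbands is less than the threshold.
- The encoding apparatus according to claim 1, wherein the first layer encoding section comprises:
a band selection section that selects the first quantization target band of the input signal from among the plurality of subbands to generate the first band information, and outputs the input signal of the first quantization target band; and
a shape/gain encoding section that encodes the shape and the first gain of the input signal in the first quantization target band to generate shape encoded information and the first gain encoded information.
- The encoding apparatus according to claim 7, wherein the shape/gain encoding section
encodes the first gain using the determined encoding method.
- An encoding apparatus having at least two encoding layers, comprising: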
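a first layer encoding section that receives a frequency-domain input signal, selects a first quantization target band of the input signal from among a plurality of subbands obtained by dividing the frequency domain to obtain first band information, obtains a first gain of the input signal in the first quantization target band, generates first encoded information including the first band information and first gain encoded information obtained by encoding the first gain, and generates a difference signal between the input signal and a decoded signal obtained by decoding using the first encoded information; and
a second layer encoding section that receives the difference signal, selects a second quantization target band of the difference signal from among the plurality of subbands to obtain second band information, obtains a second gain of the difference signal in the second quantization target band, and generates second encoded information including the second band information and second gain encoded information obtained by encoding the second gain,
wherein at least one of the first layer encoding section and the second layer encoding section comprises
a determination section that determines, from among a plurality of candidates, a method of encoding the gain of the signal input to the encoding section of each layer in the quantization target band of that layer, based on band information of the layers at or below its own layer.
- The encoding apparatus according to claim 9, wherein the determination section
determines the encoding method to be either a predictive encoding method or a non-predictive encoding method based on the band information of the layers at or below its own layer.
- The encoding apparatus according to claim 9, wherein the determination section
determines the encoding method to be either a predictive encoding method or a non-predictive encoding method based on the band information of the layers at or below its own layer from among the first band information and the second band information of a past frame and the first band information and the second band information of a current frame.
- The encoding apparatus according to claim 9, wherein the determination section
determines the encoding method to be either a predictive encoding method or a non-predictive encoding method based on a result of comparing a third quantization target band with a fourth quantization target band, the third quantization target band being the union, over the layers at or below its own layer, of the first quantization target band and the second quantization target band of a past frame, obtained using the band information of those layers from among the first band information and the second band information of the past frame, and the fourth quantization target band being the union, over the layers at or below its own layer, of the first quantization target band and the second quantization target band of a current frame, obtained using the band information of those layers from among the first band information and the second band information of the current frame.
- The encoding apparatus according to claim 9, wherein the determination section
determines the encoding method to be the predictive encoding method when the result indicates that the number of subbands common to the third quantization target band and the fourth quantization target band is equal to or greater than a preset threshold, and determines the encoding method to be the non-predictive encoding method when the number of common subbands is less than the threshold.
- A communication terminal apparatus comprising the encoding apparatus according to claim 1.
- A base station apparatus comprising the encoding apparatus according to claim 1.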
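- A decoding apparatus that receives and decodes information generated by an encoding apparatus having at least two encoding layers, the decoding apparatus comprising:
a receiving means that receives the information, which includes first encoded information obtained by encoding in a first layer of the encoding apparatus and containing first band information generated by selecting a first quantization target band of the first layer from among a plurality of subbands into which the frequency domain is divided, and second encoded information obtained by encoding in a second layer of the encoding apparatus using the first encoded information and containing second band information generated by selecting a second quantization target band of the second layer from among the plurality of subbands;
a first layer decoding means that receives the first encoded information obtained from the information and generates a first decoded signal for the first quantization target band set based on the first band information; and
a second layer decoding means that receives the second encoded information obtained from the information and generates a second decoded signal for the second quantization target band set based on the second band information,
wherein the first layer decoding means comprises
a determination means that determines, from among a plurality of candidates, the method for decoding the gain of the first decoded signal based on the first band information.
- The decoding apparatus according to claim 16, wherein the determination means
determines the decoding method further based on the second band information.
- The decoding apparatus according to claim 16, wherein the determination means
determines the decoding method to be either a predictive decoding method or a non-predictive decoding method based on the first band information and the second band information.
- The decoding apparatus according to claim 16, wherein the determination means
determines the decoding method to be either a predictive decoding method or a non-predictive decoding method based on the first band information and the second band information in a past frame and the first band information and the second band information in the current frame.
- The decoding apparatus according to claim 16, wherein the determination means
determines the decoding method to be either a predictive decoding method or a non-predictive decoding method based on the result of comparing a third quantization target band with a fourth quantization target band, the third quantization target band being the union of the first quantization target band and the second quantization target band in a past frame, obtained using the first band information and the second band information in the past frame, and the fourth quantization target band being the union of the first quantization target band and the second quantization target band in the current frame, obtained using the first band information and the second band information in the current frame.
- The decoding apparatus according to claim 20, wherein the determination means
determines the decoding method to be the predictive decoding method when the result indicates that the number of subbands common to the third quantization target band and the fourth quantization target band is equal to or greater than a preset threshold, and determines the decoding method to be the non-predictive decoding method when the number of common subbands is less than the threshold.
- The decoding apparatus according to claim 16, wherein the receiving means
receives the first encoded information further containing determination information, obtained by encoding in the first layer of the encoding apparatus, that indicates whether predictive encoding was used as the method for encoding the gain in the first quantization target band, and
the determination means
determines the decoding method to be either a predictive decoding method or a non-predictive decoding method further based on the determination information.
- A communication terminal apparatus comprising the decoding apparatus according to claim 16.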
- A base station apparatus comprising the decoding apparatus according to claim 16.
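- An encoding method having at least two encoding layers, the method comprising:
a first layer encoding step of receiving an input signal in the frequency domain, selecting a first quantization target band of the input signal from among a plurality of subbands into which the frequency domain is divided to obtain first band information, obtaining a first gain of the input signal in the first quantization target band, generating first encoded information including the first band information and first gain encoded information obtained by encoding the first gain, and generating a difference signal between the input signal and a decoded signal obtained by decoding using the first encoded information; and
a second layer encoding step of receiving the difference signal, selecting a second quantization target band of the difference signal from among the plurality of subbands to obtain second band information, obtaining a second gain of the difference signal in the second quantization target band, and generating second encoded information including the second band information and second gain encoded information obtained by encoding the second gain,
wherein the first layer encoding step comprises
a determination step of determining, from among a plurality of candidates, the method for encoding the first gain based on the first band information.
- A decoding method for receiving and decoding information generated by an encoding apparatus having at least two encoding layers, the method comprising:
a receiving step of receiving the information, which includes first encoded information obtained by encoding in a first layer of the encoding apparatus and containing first band information generated by selecting a first quantization target band of the first layer from among a plurality of subbands into which the frequency domain is divided, and second encoded information obtained by encoding in a second layer of the encoding apparatus using the first encoded information and containing second band information generated by selecting a second quantization target band of the second layer from among the plurality of subbands;
a first layer decoding step of receiving the first encoded information obtained from the information and generating a first decoded signal for the first quantization target band set based on the first band information; and
a second layer decoding step of receiving the second encoded information obtained from the information and generating a second decoded signal for the second quantization target band set based on the second band information,
wherein the first layer decoding step comprises
a determination step of determining, from among a plurality of candidates, the method for decoding the gain of the first decoded signal based on the first band information.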
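The frame-to-frame decision recited in the claims above (take the union of the quantization target bands of the own layer and lower layers for the past frame and for the current frame, count the subbands the two unions share, and compare that count with a preset threshold) can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the function names `union_band` and `choose_gain_coding`, the representation of a band as a set of integer subband indices, and the threshold values in the usage lines are all hypothetical.

```python
from typing import Iterable, Set


def union_band(layer_bands: Iterable[Set[int]]) -> Set[int]:
    """Union of the quantization target bands of the layers considered.

    Each element of layer_bands is the set of subband indices selected
    by one layer (an assumed representation of the band information).
    """
    result: Set[int] = set()
    for bands in layer_bands:
        result |= bands
    return result


def choose_gain_coding(
    past_layer_bands: Iterable[Set[int]],
    current_layer_bands: Iterable[Set[int]],
    threshold: int,
) -> str:
    """Pick predictive or non-predictive gain coding for the current frame."""
    # "Third quantization target band": union over layers, past frame.
    past_union = union_band(past_layer_bands)
    # "Fourth quantization target band": union over layers, current frame.
    current_union = union_band(current_layer_bands)
    # Number of subbands common to both unions.
    common = len(past_union & current_union)
    # At or above the threshold: predictive; below it: non-predictive.
    return "predictive" if common >= threshold else "non_predictive"


# Hypothetical two-layer example: layer 1 and layer 2 bands per frame.
past = [{2, 3, 4}, {5, 6}]
current = [{3, 4, 5}, {8}]
decision = choose_gain_coding(past, current, threshold=3)
```

Because a decoder has the same band information for both frames, it could run the same rule and reach the same decision without extra signalling, which is consistent with the decoder-side claims that derive the decoding method from the band information.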
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201080051050.2A CN102598125B (zh) | 2009-11-13 | 2010-11-12 | Encoding device, decoding device, and methods thereof |
JP2011540418A JP5746974B2 (ja) | 2009-11-13 | 2010-11-12 | Encoding device, decoding device, and methods thereof |
US13/505,634 US9153242B2 (en) | 2009-11-13 | 2010-11-12 | Encoder apparatus, decoder apparatus, and related methods that use plural coding layers |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2009-259949 | 2009-11-13 | | |
JP2009259949 | 2009-11-13 | | |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2011058758A1 (ja) | 2011-05-19 |
Family
ID=43991424
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2010/006665 WO2011058758A1 (ja) | 2009-11-13 | 2010-11-12 | Encoding device, decoding device, and methods thereof |
Country Status (4)
Country | Link |
---|---|
US (1) | US9153242B2 (ja) |
JP (1) | JP5746974B2 (ja) |
CN (1) | CN102598125B (ja) |
WO (1) | WO2011058758A1 (ja) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2013162450A1 (en) * | 2012-04-24 | 2013-10-31 | Telefonaktiebolaget L M Ericsson (Publ) | Encoding and deriving parameters for coded multi-layer video sequences |
US9502044B2 (en) | 2013-05-29 | 2016-11-22 | Qualcomm Incorporated | Compression of decomposed representations of a sound field |
US9922656B2 (en) | 2014-01-30 | 2018-03-20 | Qualcomm Incorporated | Transitioning of ambient higher-order ambisonic coefficients |
US10770087B2 (en) | 2014-05-16 | 2020-09-08 | Qualcomm Incorporated | Selecting codebooks for coding vectors decomposed from higher-order ambisonic audio signals |
ES2726193T3 (es) * | 2014-08-28 | 2019-10-02 | Nokia Technologies Oy | Quantization of audio parameters |
US9747910B2 (en) * | 2014-09-26 | 2017-08-29 | Qualcomm Incorporated | Switching between predictive and non-predictive quantization techniques in a higher order ambisonics (HOA) framework |
JP6907859B2 (ja) * | 2017-09-25 | 2021-07-21 | 富士通株式会社 | Speech processing program, speech processing method, and speech processing device |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2008072670A1 (ja) * | 2006-12-13 | 2008-06-19 | Panasonic Corporation | Encoding device, decoding device, and methods thereof |
WO2009055493A1 (en) * | 2007-10-22 | 2009-04-30 | Qualcomm Incorporated | Scalable speech and audio encoding using combinatorial encoding of mdct spectrum |
Family Cites Families (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE68911287T2 (de) * | 1988-06-08 | 1994-05-05 | Fujitsu Ltd | Encoder/decoder. |
KR100935961B1 (ko) * | 2001-11-14 | 2010-01-08 | 파나소닉 주식회사 | Encoding device and decoding device |
US7752052B2 (en) * | 2002-04-26 | 2010-07-06 | Panasonic Corporation | Scalable coder and decoder performing amplitude flattening for error spectrum estimation |
US20050010396A1 (en) * | 2003-07-08 | 2005-01-13 | Industrial Technology Research Institute | Scale factor based bit shifting in fine granularity scalability audio coding |
US7460990B2 (en) * | 2004-01-23 | 2008-12-02 | Microsoft Corporation | Efficient coding of digital media spectral data using wide-sense perceptual similarity |
WO2005112001A1 (ja) * | 2004-05-19 | 2005-11-24 | Matsushita Electric Industrial Co., Ltd. | Encoding device, decoding device, and methods thereof |
JP4771674B2 (ja) * | 2004-09-02 | 2011-09-14 | パナソニック株式会社 | Speech encoding device, speech decoding device, and methods thereof |
JP4781272B2 (ja) * | 2004-09-17 | 2011-09-28 | パナソニック株式会社 | Speech encoding device, speech decoding device, communication device, and speech encoding method |
JP4871501B2 (ja) * | 2004-11-04 | 2012-02-08 | パナソニック株式会社 | Vector conversion device and vector conversion method |
EP1798724B1 (en) * | 2004-11-05 | 2014-06-18 | Panasonic Corporation | Encoder, decoder, encoding method, and decoding method |
US7539612B2 (en) * | 2005-07-15 | 2009-05-26 | Microsoft Corporation | Coding and decoding scale factor information |
WO2007035148A2 (en) * | 2005-09-23 | 2007-03-29 | Telefonaktiebolaget Lm Ericsson (Publ) | Successively refinable lattice vector quantization |
CN101283398B (zh) * | 2005-10-05 | 2012-06-27 | Lg电子株式会社 | Signal processing method and apparatus, and encoding and decoding methods and apparatuses therefor |
US7966175B2 (en) * | 2006-10-18 | 2011-06-21 | Polycom, Inc. | Fast lattice vector quantization |
US9153241B2 (en) * | 2006-11-30 | 2015-10-06 | Panasonic Intellectual Property Management Co., Ltd. | Signal processing apparatus |
JP4871894B2 (ja) | 2007-03-02 | 2012-02-08 | パナソニック株式会社 | Encoding device, decoding device, encoding method, and decoding method |
JP5403949B2 (ja) * | 2007-03-02 | 2014-01-29 | パナソニック株式会社 | Encoding device and encoding method |
US8423371B2 (en) | 2007-12-21 | 2013-04-16 | Panasonic Corporation | Audio encoder, decoder, and encoding method thereof |
2010
- 2010-11-12 CN: CN201080051050.2A (CN102598125B, zh), not active, Expired - Fee Related
- 2010-11-12 WO: PCT/JP2010/006665 (WO2011058758A1, ja), active, Application Filing
- 2010-11-12 US: US13/505,634 (US9153242B2, en), active, Active
- 2010-11-12 JP: JP2011540418A (JP5746974B2, ja), not active, Expired - Fee Related
Non-Patent Citations (2)
Title |
---|
HIROYUKI EHARA ET AL.: "Development of 32kbit/s scalable wide-band speech and audio coding algorithm using high-efficiency code-excited linear prediction and band-selective modified discrete cosine transform coding algorithms", JOURNAL OF THE ACOUSTICAL SOCIETY OF JAPAN, vol. 64, no. 4, 1 April 2008 (2008-04-01), pages 196 - 207, XP008162599 * |
TOMOFUMI YAMANASHI ET AL.: "ITU-T G.718- development of speech/audio codec for next-generation mobile communication systems", PANASONIC TECHNICAL JOURNAL, vol. 55, no. 1, 15 April 2009 (2009-04-15), pages 21 - 26 * |
Also Published As
Publication number | Publication date |
---|---|
CN102598125A (zh) | 2012-07-18 |
JP5746974B2 (ja) | 2015-07-08 |
JPWO2011058758A1 (ja) | 2013-03-28 |
US9153242B2 (en) | 2015-10-06 |
CN102598125B (zh) | 2014-07-02 |
US20120221344A1 (en) | 2012-08-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
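JP5746974B2 (ja) | | Encoding device, decoding device, and methods thereof |
JP5339919B2 (ja) | | Encoding device, decoding device, and methods thereof |
JP5404418B2 (ja) | | Encoding device, decoding device, and encoding method |
US8306827B2 (en) | | Coding device and coding method with high layer coding based on lower layer coding results |
JP5328368B2 (ja) | | Encoding device, decoding device, and methods thereof |
JP4859670B2 (ja) | | Speech encoding device and speech encoding method |
WO2009144953A1 (ja) | | Encoding device, decoding device, and methods thereof |
US20100250244A1 (en) | | Encoder and decoder |
WO2007132750A1 (ja) | | LSP vector quantization device, LSP vector dequantization device, and methods thereof |
JP5714002B2 (ja) | | Encoding device, decoding device, encoding method, and decoding method |
US20100017197A1 (en) | | Voice coding device, voice decoding device and their methods |
JPWO2007114290A1 (ja) | | Vector quantization device, vector dequantization device, vector quantization method, and vector dequantization method |
WO2011045926A1 (ja) | | Encoding device, decoding device, and methods thereof |
JP5544371B2 (ja) | | Encoding device, decoding device, and methods thereof |
JP7407110B2 (ja) | | Encoding device and encoding method |
JP5774490B2 (ja) | | Encoding device, decoding device, and methods thereof |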
Legal Events
Code | Title | Description |
---|---|---|
WWE | WIPO information: entry into national phase | Ref document number: 201080051050.2; Country of ref document: CN |
121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 10829718; Country of ref document: EP; Kind code of ref document: A1 |
WWE | WIPO information: entry into national phase | Ref document number: 2011540418; Country of ref document: JP |
WWE | WIPO information: entry into national phase | Ref document number: 13505634; Country of ref document: US |
NENP | Non-entry into the national phase | Ref country code: DE |
122 | EP: PCT application non-entry in European phase | Ref document number: 10829718; Country of ref document: EP; Kind code of ref document: A1 |