WO2009113316A1 - Encoding device, decoding device, and methods thereof - Google Patents
Encoding device, decoding device, and methods thereof
- Publication number
- WO2009113316A1 (PCT/JP2009/001129)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- subband
- pitch coefficient
- unit
- subbands
- encoding
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/038—Speech enhancement, e.g. noise reduction or echo cancellation using band spreading techniques
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/16—Vocoder architecture
- G10L19/18—Vocoders using multiple modes
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/16—Vocoder architecture
- G10L19/18—Vocoders using multiple modes
- G10L19/24—Variable rate codecs, e.g. for generating different qualities using a scalable representation such as hierarchical encoding or layered encoding
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
Definitions
- The present invention relates to an encoding device, a decoding device, and methods thereof, used in a communication system that encodes and transmits a signal.
- In Patent Document 1, from the spectrum data obtained by converting an input acoustic signal over a certain period of time, the characteristics of the high frequency part are generated as auxiliary information, which is output together with the encoded information of the low frequency part. Specifically, the spectrum data of the high frequency part is divided into a plurality of groups, and for each group, information specifying the low frequency spectrum that is closest to the spectrum of that group is used as the auxiliary information.
- The high frequency signal is divided into a plurality of subbands, the similarity between the signal in each subband and the low frequency signal is determined for each subband, and auxiliary information is determined according to the determination result.
- In Patent Document 1 and Patent Document 2, in order to generate a high frequency signal (spectral data of the high frequency part), the low frequency signal similar to the high frequency part is determined separately for each subband (group) of the high frequency signal, so the coding efficiency is not sufficient.
- When the auxiliary information is encoded at a low bit rate, the quality of the decoded speech generated using the calculated auxiliary information is insufficient, and abnormal noise may occur in some cases.
- An object of the present invention is to provide an encoding device, a decoding device, and methods thereof that efficiently encode the high-frequency spectrum data of a wideband signal based on its low-frequency spectrum data and improve the quality of the decoded signal.
- The encoding apparatus of the present invention includes: a first encoding unit that encodes a low frequency portion of an input signal at or below a predetermined frequency to generate first encoded information; a decoding unit that decodes the first encoded information to generate a decoded signal; and a second encoding unit that divides the high frequency portion of the input signal above the predetermined frequency into a plurality of subbands and generates second encoded information by estimating each of the plurality of subbands from the input signal or the decoded signal using the estimation result of an adjacent subband.
- The decoding device of the present invention receives first encoded information, obtained by encoding a low frequency portion of an input signal at or below a predetermined frequency, and second encoded information, obtained by dividing the high frequency portion of the input signal above the predetermined frequency into a plurality of subbands and estimating each of the plurality of subbands from the input signal or from a first decoded signal obtained by decoding the first encoded information, using the estimation result of an adjacent subband.
- The encoding method of the present invention includes: a step of encoding a low frequency portion of an input signal at or below a predetermined frequency to generate first encoded information; a step of decoding the first encoded information to generate a decoded signal; and a step of dividing the high frequency portion of the input signal above the predetermined frequency into a plurality of subbands and estimating each of the plurality of subbands from the input signal or the decoded signal using the estimation result of an adjacent subband.
- The decoding method of the present invention includes: a step of receiving first encoded information, obtained by encoding a low frequency portion of an input signal at or below a predetermined frequency, and second encoded information, obtained by dividing the high frequency portion of the input signal above the predetermined frequency into a plurality of subbands and estimating each of the plurality of subbands from the input signal or from a first decoded signal obtained by decoding the first encoded information, using the estimation result of an adjacent subband; a step of decoding the first encoded information to generate a second decoded signal; and a step of generating a third decoded signal by estimating the high frequency part of the input signal from the second decoded signal, using the second encoded information and the decoding result of an adjacent subband.
- By using the correlation between the high-frequency subbands and performing the encoding based on the encoding result of an adjacent subband, the spectral data of the high frequency part of a wideband signal can be encoded efficiently, and the quality of the decoded signal can be improved.
- Diagram outlining the search process included in the encoding according to the present invention.
- Block diagram showing the configuration of a communication system having an encoding device and a decoding device according to Embodiment 1 of the present invention.
- Block diagram showing the main internal configuration of the encoding apparatus shown in FIG.
- Block diagram showing the main internal configuration of the second layer encoding unit shown in FIG.
- Diagram explaining the details of the filtering process in the filtering unit shown in FIG.
- Flow diagram showing the steps of the process of searching for the optimum pitch coefficient T_p′ for the subband SB_p in the search unit shown in FIG. 4.
- Block diagram showing the main internal configuration of the second layer decoding unit shown in FIG.
- Block diagram showing the main internal configuration of the encoding apparatus according to Embodiment 2 of the present invention.
- Block diagram showing the main internal configuration of the second layer encoding unit shown in FIG.
- Block diagram showing the main internal configuration of the decoding apparatus according to Embodiment 3 of the present invention.
- Block diagram showing the main internal configuration of the second layer decoding unit shown in FIG.
- Block diagram showing the main internal configuration of the encoding apparatus according to Embodiment 4 of the present invention.
- FIG. 18 is a block diagram showing the main internal configuration of the first layer decoding unit shown in FIG.
- Block diagram showing the main internal configuration of the second layer decoding unit shown in FIG.
- Block diagram showing the main internal configuration of the second layer encoding unit according to Embodiment 5 of the present invention.
- FIG. 1A shows the spectrum of the input signal.
- FIG. 1B shows the spectrum (first layer decoded spectrum) obtained by decoding the encoded data of the low frequency part of the input signal.
- The bandwidth of a telephone band (0 to 3.4 kHz) signal is expanded to that of a wideband (0 to 7 kHz) signal. That is, the sampling frequency of the input signal is 16 kHz, and the sampling frequency of the decoded signal output from the low frequency encoding unit is 8 kHz.
- The high frequency part of the spectrum of the input signal is divided into a plurality of subbands (FIG. 1 uses five subbands, 1st to 5th). For each subband, the first layer decoded spectrum is searched for the portion that most closely approximates the high-band spectrum.
- The first search range and the second search range are the ranges in which a portion (band) of the decoded low-frequency spectrum (the first layer decoded spectrum described later) similar to the first subband (1st) and to the second subband (2nd), respectively, is searched for.
- The first search range is, for example, a range from Tmin (0 kHz) to Tmax.
- The frequency A indicates the start position of the partial band 1st′ of the decoded low-band spectrum similar to the first subband found by the search, and the frequency B indicates the end of the band 1st′.
- For the second subband, the search result of the first subband (1st), which has already been searched, is used.
- As a result of the search corresponding to the second subband, for example, the start position of the partial band 2nd′ of the decoded low-band spectrum similar to the second subband is C, and the end position is D.
- The search corresponding to each of the third, fourth, and fifth subbands is likewise performed using the search result corresponding to the immediately preceding subband.
- Although the sampling frequency of the input signal is 16 kHz here, the present invention is not limited to this and is equally applicable when the sampling frequency of the input signal is 8 kHz, 32 kHz, or the like. That is, the present invention is not limited by the sampling frequency of the input signal.
- FIG. 2 is a block diagram showing a configuration of a communication system having the encoding device and the decoding device according to Embodiment 1 of the present invention.
- The communication system includes an encoding device and a decoding device, which can communicate with each other via a transmission path. Both the encoding device and the decoding device are usually mounted and used in a base station device or a communication terminal device.
- The encoding apparatus 101 divides the input signal into blocks of N samples (N is a natural number) and encodes frame by frame, with N samples forming one frame.
- Here, n denotes the (n + 1)-th signal element of the input signal divided into blocks of N samples.
- The encoded input information is transmitted to the decoding apparatus 103 via the transmission path 102.
- The decoding device 103 receives the encoded information transmitted from the encoding device 101 via the transmission path 102, decodes it, and obtains an output signal.
- FIG. 3 is a block diagram showing the main internal configuration of the encoding apparatus 101 shown in FIG.
- The downsampling processing unit 201 downsamples the input signal from the sampling frequency SR_input to SR_base (SR_base < SR_input) and outputs the downsampled input signal to the first layer encoding section 202.
- The first layer encoding section 202 encodes the downsampled input signal received from the downsampling processing unit 201 using, for example, a CELP (Code Excited Linear Prediction) speech coding method to generate first layer encoded information, and outputs the generated first layer encoded information to the first layer decoding section 203 and the encoded information integration section 207.
- The first layer decoding section 203 decodes the first layer encoded information received from the first layer encoding section 202 using, for example, a CELP speech decoding method to generate a first layer decoded signal, and outputs the generated first layer decoded signal to the upsampling processing unit 204.
- The upsampling processing unit 204 upsamples the first layer decoded signal received from the first layer decoding section 203 from the sampling frequency SR_base to SR_input, and outputs the result to the orthogonal transform processing unit 205 as the upsampled first layer decoded signal.
- The orthogonal transform processing unit 205 applies a modified discrete cosine transform (MDCT) to the input signal x_n and the upsampled first layer decoded signal y_n.
- The orthogonal transform processing unit 205 first initializes the buffers buf1_n and buf2_n to the initial value "0" according to the following equations (1) and (2).
- The orthogonal transform processing unit 205 then applies the MDCT to the input signal x_n and the upsampled first layer decoded signal y_n according to the following equations (3) and (4), and obtains the MDCT coefficients S2(k) of the input signal (hereinafter, the input spectrum) and the MDCT coefficients S1(k) of the upsampled first layer decoded signal y_n (hereinafter, the first layer decoded spectrum).
- Here, k represents the index of each sample in one frame.
- The orthogonal transform processing unit 205 obtains x_n′, a vector formed by combining the input signal x_n and the buffer buf1_n, by the following equation (5). Likewise, it obtains y_n′, a vector formed by combining the upsampled first layer decoded signal y_n and the buffer buf2_n, by the following equation (6).
- The orthogonal transform processing unit 205 then updates the buffers buf1_n and buf2_n according to equations (7) and (8).
- The orthogonal transform processing unit 205 outputs the input spectrum S2(k) and the first layer decoded spectrum S1(k) to the second layer encoding section 206.
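The buffered transform described above can be sketched as follows. This is only an illustration under assumptions: the text of equations (1) to (8) is not reproduced here, so the cosine kernel is the standard MDCT form, the 2/N scaling and the absence of a window are assumed, and the names `mdct_frame`, `frame`, and `buf` are hypothetical.

```python
import math


def mdct_frame(frame, buf):
    """Transform one N-sample frame and return (coefficients, updated buffer).

    frame: current N samples (x_n, or the upsampled y_n).
    buf:   previous N samples (buf1_n or buf2_n), all zeros for the first frame.
    """
    n = len(frame)
    combined = list(buf) + list(frame)  # 2N-sample vector (x_n' or y_n')
    coeffs = []
    for k in range(n):
        acc = 0.0
        for i, value in enumerate(combined):
            # Standard MDCT kernel; the 2/N scaling below is an assumption.
            angle = (2 * i + 1 + n) * (2 * k + 1) * math.pi / (4 * n)
            acc += value * math.cos(angle)
        coeffs.append(2.0 / n * acc)
    # Buffer update: the current frame becomes the history of the next call.
    return coeffs, list(frame)
```

Calling the function once per frame, and feeding the returned buffer into the next call, reproduces the initialize, combine, transform, update cycle of the description.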
- The second layer encoding section 206 generates second layer encoded information using the input spectrum S2(k) and the first layer decoded spectrum S1(k) received from the orthogonal transform processing unit 205, and outputs the generated second layer encoded information to the encoded information integration section 207. Details of the second layer encoding section 206 will be described later.
- The encoded information integration section 207 integrates the first layer encoded information received from the first layer encoding section 202 and the second layer encoded information received from the second layer encoding section 206, adds a transmission error code or the like to the integrated information source code if necessary, and outputs the result to the transmission path 102 as encoded information.
- The second layer encoding section 206 includes a band dividing section 260, a filter state setting section 261, a filtering section 262, a search section 263, a pitch coefficient setting section 264, a gain encoding section 265, and a multiplexing section 266, each of which operates as follows.
- The portion of the input spectrum S2(k) corresponding to the subband SB_p is referred to as the subband spectrum S2_p(k) (BS_p ≤ k < BS_p + BW_p).
- The filter state setting section 261 sets the first layer decoded spectrum S1(k) (0 ≤ k < FL) received from the orthogonal transform processing unit 205 as the filter state used in the filtering section 262.
- The first layer decoded spectrum S1(k) is stored as the internal state (filter state) of the filter in the band 0 ≤ k < FL of the spectrum S(k), which covers the entire frequency band 0 ≤ k < FH in the filtering section 262.
- The filtering section 262 outputs the estimated spectrum S2_p′(k) of the subband SB_p to the search section 263. Details of the filtering process in the filtering section 262 will be described later. The number of taps of the multi-tap filter may be any integer of 1 or more.
- Based on the band division information received from the band dividing section 260, the search section 263 calculates the similarity between the estimated spectrum S2_p′(k) of the subband SB_p received from the filtering section 262 and each subband spectrum S2_p(k) in the high frequency part (FL ≤ k < FH) of the input spectrum S2(k) received from the orthogonal transform processing unit 205.
- The similarity is calculated by, for example, a correlation calculation.
- The processes of the filtering section 262, the search section 263, and the pitch coefficient setting section 264 form a closed-loop search process for each subband. In each closed loop, the search section 263 calculates the similarity corresponding to each pitch coefficient as the pitch coefficient T input from the pitch coefficient setting section 264 to the filtering section 262 is varied.
- The search section 263 obtains the optimum pitch coefficient T_p′ (within the range Tmin to Tmax) that maximizes the similarity in the closed loop corresponding to the subband SB_p, and outputs the P optimum pitch coefficients to the multiplexing section 266.
- The search section 263 calculates the partial band of the first layer decoded spectrum that is similar to each subband SB_p using each optimum pitch coefficient T_p′.
- Under the control of the search section 263, the pitch coefficient setting section 264 works together with the filtering section 262 and the search section 263. When the closed-loop search process corresponding to the first subband SB_0 is performed, it outputs the pitch coefficient T to the filtering section 262 sequentially while changing it little by little within the predetermined search range Tmin to Tmax.
- When the closed-loop search process corresponding to a later subband SB_p is performed, the pitch coefficient T is changed little by little based on the optimum pitch coefficient T_{p−1}′ obtained in the closed-loop search process corresponding to the subband SB_{p−1}.
- The pitch coefficient setting section 264 outputs the pitch coefficient T shown in the following equation (9) to the filtering section 262.
- Here, SEARCH represents the search range (number of search entries) for the pitch coefficient T corresponding to the subband SB_p.
- This is based on the tendency for the part similar to the subband SB_p, which is adjacent to the subband SB_{p−1}, to be adjacent to the partial band of the first layer decoded spectrum that is similar to the subband SB_{p−1}.
- This search method is called the adaptive similarity search (ASS) method. The name is given for convenience, and the search method of the present invention is not limited by it.
- The harmonic structure of a spectrum tends to weaken gradually toward higher frequencies. That is, the subband SB_p tends to have a weaker harmonic structure than the subband SB_{p−1}. Therefore, for the subband SB_p, searching for a similar portion on the high frequency side, where the harmonic structure is weaker than in the portion of the first layer decoded spectrum similar to the subband SB_{p−1}, improves the search efficiency. The efficiency of this search method can also be explained from this point of view.
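A minimal sketch of how the candidate pitch coefficients might be generated per subband. Equation (9) is not reproduced in this text, so the window used here (SEARCH entries starting at T_{p−1}′ + BW_{p−1} − SEARCH/2, mirroring the decoder-side description) is an assumption, as are the integer halving, the clipping to Tmin..Tmax, and the function name `pitch_candidates`.

```python
def pitch_candidates(p, t_prev_opt, bw_prev, search, t_min, t_max):
    """Return the pitch coefficients T to try for subband SB_p.

    p:          subband index (0 for the first subband).
    t_prev_opt: optimum pitch coefficient of SB_{p-1} (ignored when p == 0).
    bw_prev:    bandwidth BW_{p-1} of SB_{p-1} (ignored when p == 0).
    search:     number of search entries SEARCH.
    """
    if p == 0:
        # First subband: scan the whole predetermined range Tmin..Tmax.
        return list(range(t_min, t_max + 1))
    # Later subbands: a narrow window placed relative to the previous optimum.
    # The exact placement is an assumption, not taken from equation (9).
    start = t_prev_opt + bw_prev - search // 2
    return [t for t in range(start, start + search) if t_min <= t <= t_max]
```

Because only SEARCH candidates are tried for every subband after the first, both the search cost and the number of bits needed to signal the result shrink compared with scanning Tmin..Tmax each time.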
- The gain encoding section 265 calculates gain information for the high frequency part (FL ≤ k < FH) of the input spectrum S2(k) received from the orthogonal transform processing unit 205. Specifically, the gain encoding section 265 divides the frequency band FL ≤ k < FH into J subbands and obtains the spectrum power of each subband of the input spectrum S2(k). The spectrum power B_j of the (j + 1)-th subband is expressed by the following equation (12).
- BL_j represents the minimum frequency of the (j + 1)-th subband.
- BH_j represents the maximum frequency of the (j + 1)-th subband.
- The estimated spectrum S2′(k) of the high frequency part is then constructed.
- As in the calculation of the spectrum power for the input spectrum S2(k), the gain encoding section 265 calculates the spectrum power B′_j of each subband of the estimated spectrum S2′(k) according to the following equation (13).
- The gain encoding section 265 calculates the variation V_j of the spectrum power of each subband of the estimated spectrum S2′(k) relative to the input spectrum S2(k) according to equation (14).
- The gain encoding section 265 encodes the variation V_j and outputs an index corresponding to the encoded variation VQ_j to the multiplexing section 266.
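The per-subband power comparison can be sketched as below. Equations (12) to (14) are not reproduced in this text, so the sketch assumes that the spectrum power is the sum of squared coefficients from BL_j to BH_j inclusive and that the variation is the square root of the power ratio; the function names and the zero-power fallback are hypothetical.

```python
import math


def band_power(spectrum, bl, bh):
    """Sum of squared coefficients for bl <= k <= bh (assumed form of B_j)."""
    return sum(spectrum[k] ** 2 for k in range(bl, bh + 1))


def power_variations(s2, s2_est, bands):
    """Return an assumed V_j for each (BL_j, BH_j) pair in bands.

    s2:     input spectrum S2(k).
    s2_est: estimated spectrum S2'(k), indexed like s2.
    """
    variations = []
    for bl, bh in bands:
        b = band_power(s2, bl, bh)
        b_est = band_power(s2_est, bl, bh)
        # Assumed definition: amplitude ratio between input and estimate.
        variations.append(math.sqrt(b / b_est) if b_est > 0 else 0.0)
    return variations
```

Quantizing these values and transmitting their indices is what the text calls encoding the variation; the quantizer itself is not sketched here.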
- Using the filter state received from the filter state setting section 261, the pitch coefficient T received from the pitch coefficient setting section 264, and the band division information received from the band dividing section 260, the filtering section 262 calculates the estimated spectrum S2_p′(k) of the subband SB_p (BS_p ≤ k < BS_p + BW_p).
- The transfer function F(z) of the filter used in the filtering section 262 is expressed by the following equation (15).
- T represents the pitch coefficient given by the pitch coefficient setting section 264.
- β_i represents a filter coefficient stored in advance.
- Values such as (β_{−1}, β_0, β_1) = (0.2, 0.6, 0.2) or (0.3, 0.4, 0.3) are also appropriate.
- Here, M = 1.
- M is an index related to the number of taps.
- The first layer decoded spectrum S1(k) is stored as the internal state (filter state) of the filter in the band 0 ≤ k < FL of the spectrum S(k) of the entire frequency band in the filtering section 262.
- The estimated spectrum S2_p′(k) of the subband SB_p is stored by the following filtering procedure. Basically, the spectrum S(k − T), whose frequency is lower than k by T, is substituted into S2_p′(k). To increase the smoothness of the spectrum, however, each nearby spectrum S(k − T + i), located i away from S(k − T), is multiplied by a predetermined filter coefficient β_i, and the sum of β_i · S(k − T + i) over all i is substituted into S2_p′(k). This process is expressed by the following equation (16).
- The above filtering process is performed after clearing S(k) to zero in the range BS_p ≤ k < BS_p + BW_p every time the pitch coefficient T is given from the pitch coefficient setting section 264. That is, every time the pitch coefficient T changes, S(k) is calculated and output to the search section 263.
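The filtering step described above can be sketched as follows, assuming the working spectrum S(k) is held in a list whose low band already contains the first layer decoded spectrum. The function name `estimate_subband`, the default coefficients, and the choice to skip taps that fall outside the list are assumptions made for illustration.

```python
def estimate_subband(state, bs_p, bw_p, t, beta=(0.2, 0.6, 0.2)):
    """Estimate S2_p'(k) for BS_p <= k < BS_p + BW_p with a multi-tap pitch filter.

    state: working spectrum S(k); it is copied, not modified.
    t:     pitch coefficient T (lag in frequency bins).
    beta:  filter coefficients beta_{-M}..beta_{M}; 2M + 1 taps.
    """
    m = len(beta) // 2
    s = list(state)
    # Clear the target band to zero before filtering, as the text describes.
    for k in range(bs_p, bs_p + bw_p):
        s[k] = 0.0
    for k in range(bs_p, bs_p + bw_p):
        acc = 0.0
        for i in range(-m, m + 1):
            idx = k - t + i
            if 0 <= idx < len(s):  # assumption: out-of-range taps contribute nothing
                acc += beta[i + m] * s[idx]
        s[k] = acc  # written in place so later bins can reuse it when T < BW_p
    return s[bs_p:bs_p + bw_p]
```

With a single effective tap, `beta=(0.0, 1.0, 0.0)`, the filter reduces to copying S(k − T) into the subband, which is the basic substitution the text starts from.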
- The search section 263 initializes the minimum similarity D_min, a variable for storing the minimum value of the similarity, to "+∞" (ST2010).
- According to the following equation (17), the search section 263 calculates the similarity D between the high frequency part (FL ≤ k < FH) of the input spectrum S2(k) and the estimated spectrum S2_p′(k) at a given pitch coefficient (ST2020).
- M′ represents the number of samples used when calculating the similarity D, and may be any value equal to or smaller than the bandwidth of each subband. Note that S2_p′(k) does not appear in equation (17) because it is expressed there using BS_p and S2′(k).
- The search section 263 determines whether or not the calculated similarity D is smaller than the minimum similarity D_min (ST2030).
- If it is, the search section 263 substitutes the similarity D into the minimum similarity D_min (ST2040).
- The search section 263 determines whether or not the process over the search range has ended, that is, whether the similarity has been calculated according to equation (17) in ST2020 for every pitch coefficient within the search range (ST2050).
- If the process over the search range has not been completed (ST2050: "NO"), the search section 263 returns to ST2020 and calculates the similarity according to equation (17) for a pitch coefficient different from the one used in the previous execution of ST2020. When the process over the search range is completed (ST2050: "YES"), the search section 263 outputs the pitch coefficient T corresponding to the minimum similarity D_min to the multiplexing section 266 as the optimum pitch coefficient T_p′ (ST2060).
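The flow ST2010 to ST2060 amounts to a minimum search over the candidate pitch coefficients, which might look like the sketch below. Equation (17) is not reproduced in this text, so the measure in `distance` (target energy minus the squared cross-correlation normalized by the estimate energy) is an assumed stand-in, and the names `search_pitch`, `distance`, and `estimate` are hypothetical.

```python
import math


def distance(target, est):
    """Assumed similarity measure D; smaller means more similar."""
    target_energy = sum(t * t for t in target)
    est_energy = sum(e * e for e in est)
    if est_energy == 0.0:
        return target_energy
    cross = sum(t * e for t, e in zip(target, est))
    return target_energy - cross * cross / est_energy


def search_pitch(target, candidates, estimate):
    """Return (optimum pitch coefficient, D_min) over the candidates.

    estimate: callable mapping a pitch coefficient T to an estimated subband.
    """
    d_min = math.inf                          # ST2010
    best_t = None
    for t in candidates:                      # loop closed by ST2050
        d = distance(target, estimate(t))     # ST2020
        if d < d_min:                         # ST2030
            d_min = d                         # ST2040
            best_t = t
    return best_t, d_min                      # ST2060
```

The measure is insensitive to the overall gain of the estimate, which fits a design where the spectral shape is matched here and the level is handled separately by the gain encoding.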
- FIG. 7 is a block diagram showing the main internal configuration of the decoding apparatus 103.
- The encoded information separation unit 131 separates the first layer encoded information and the second layer encoded information from the input encoded information, outputs the first layer encoded information to the first layer decoding section 132, and outputs the second layer encoded information to the second layer decoding section 135.
- The first layer decoding section 132 decodes the first layer encoded information received from the encoded information separation unit 131 and outputs the generated first layer decoded signal to the upsampling processing unit 133.
- The operation of the first layer decoding section 132 is the same as that of the first layer decoding section 203 shown in FIG.
- The upsampling processing unit 133 upsamples the first layer decoded signal received from the first layer decoding section 132 from the sampling frequency SR_base to SR_input, and outputs the resulting upsampled first layer decoded signal to the orthogonal transform processing unit 134.
- The orthogonal transform processing unit 134 performs orthogonal transform processing (MDCT) on the upsampled first layer decoded signal received from the upsampling processing unit 133, and outputs the resulting MDCT coefficients S1(k) of the upsampled first layer decoded signal (hereinafter referred to as the first layer decoded spectrum) to the second layer decoding section 135.
- The operation of the orthogonal transform processing unit 134 is the same as the processing that the orthogonal transform processing unit 205 shown in FIG. applies to the upsampled first layer decoded signal.
- The second layer decoding section 135 uses the first layer decoded spectrum S1(k) received from the orthogonal transform processing unit 134 and the second layer encoded information received from the encoded information separation unit 131 to generate a second layer decoded signal including a high frequency component, and outputs it as the output signal.
- FIG. 8 is a block diagram showing the main internal configuration of the second layer decoding section 135 shown in FIG.
- The filter state setting unit 352 sets the first layer decoded spectrum S1(k) (0 ≤ k < FL) received from the orthogonal transform processing unit 134 as the filter state used by the filtering unit 353.
- Letting S(k) denote the spectrum of the entire frequency band 0 ≤ k < FH in the filtering unit 353, the first layer decoded spectrum S1(k) is stored in the band 0 ≤ k < FL of S(k) as the internal state (filter state) of the filter.
- The configuration and operation of the filter state setting unit 352 are the same as those of the filter state setting section 261 shown in FIG.
- The filtering unit 353 includes a multi-tap pitch filter (the number of taps is greater than 1).
- The filter function shown in equation (15) above is used. In this case, however, the filtering process and the filter function are those obtained by replacing T in equations (15) and (16) with T_p′.
- the pitch coefficient T p ′′ used for filtering is calculated according to the following equation (18) using the pitch coefficient T p ⁇ 1 ′ of the subband SB p ⁇ 1 and the subband width BW p ⁇ 1.
- the filtering process is performed according to an equation in which T is replaced with T p ′′ in equation (16).
- subband SB p 1, 2,..., P ⁇ 1
- subband SB p ⁇ 1 is added to pitch coefficient T p ⁇ 1 ′ of subband SB p ⁇ 1.
- the added bandwidth BW p-1 by adding T p 'to the index obtained by subtracting half the value of the search range sEARCH, and pitch coefficient T p ".
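- As a minimal sketch of the pitch coefficient reconstruction described above, the following reads Equation (18) as T_p'' = T_{p-1}' + BW_{p-1} - SEARCH/2 + T_p'. The function name, the list-based inputs, and the use of the previous subband's reconstructed value as T_{p-1}' are assumptions made for illustration, not details confirmed by the text.

```python
def decode_pitch_coefficients(indices, subband_widths, search):
    """Rebuild the filtering pitch coefficient of every subband.

    indices[0] is used directly (first subband). For p >= 1 the received
    value T_p' is treated as an offset inside the search window:
        T_p'' = T_{p-1}' + BW_{p-1} - SEARCH/2 + T_p'
    """
    if len(indices) != len(subband_widths):
        raise ValueError("need one index and one width per subband")
    pitch = [indices[0]]
    for p in range(1, len(indices)):
        # Assumption: the previous subband's reconstructed value acts as T_{p-1}'.
        pitch.append(pitch[p - 1] + subband_widths[p - 1] - search // 2 + indices[p])
    return pitch
```

- With three subbands of width 16 and SEARCH = 8, the received values 40, 3, and 5 map to the filtering pitch coefficients 40, 55, and 72.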
- the gain decoding unit 354 decodes the index of the encoded variation amount VQ j input from the separation unit 351, and obtains a variation amount VQ j that is a quantized value of the variation amount V j .
- The spectrum adjustment unit 355 adjusts the spectral shape of the estimated spectrum S2'(k) in the frequency band FL ≤ k < FH, generates a decoded spectrum S3(k), and outputs it to the orthogonal transform processing unit 356.
- The low frequency part (0 ≤ k < FL) of the decoded spectrum S3(k) consists of the first layer decoded spectrum S1(k), and the high frequency part (FL ≤ k < FH) of the decoded spectrum S3(k) consists of the estimated spectrum S2'(k) after spectral shape adjustment.
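- The composition of the decoded spectrum S3(k) can be sketched as below. This is only an illustration: the function name and the convention that the adjusted high band is passed as a separate list of FH - FL coefficients are assumptions.

```python
def assemble_decoded_spectrum(s1, s2_adjusted, fl, fh):
    """Return S3(k) for 0 <= k < FH.

    s1:          first layer decoded spectrum, at least FL coefficients
    s2_adjusted: shape-adjusted estimated spectrum for FL <= k < FH,
                 stored from index 0, so it holds FH - FL coefficients
    """
    if len(s1) < fl or len(s2_adjusted) != fh - fl:
        raise ValueError("spectrum lengths do not match FL/FH")
    # Low band comes from the first layer, high band from the estimate.
    return list(s1[:fl]) + list(s2_adjusted)
```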
- Orthogonal transform processing section 356 orthogonally transforms decoded spectrum S3 (k) input from spectrum adjusting section 355 into a time domain signal, and outputs the obtained second layer decoded signal as an output signal.
- processing such as appropriate windowing and overlay addition is performed as necessary to avoid discontinuities between frames.
- the orthogonal transform processing unit 356 has a buffer buf ′ (k) therein, and initializes the buffer buf ′ (k) as shown in the following equation (20).
- Orthogonal transform processing section 356 calculates and outputs the second layer decoded signal y_n'' according to the following equation (21), using the second layer decoded spectrum S3(k) input from spectrum adjusting section 355.
- Z4 (k) is a vector obtained by combining the decoded spectrum S3 (k) and the buffer buf ′ (k) as shown in Expression (22) below.
- the orthogonal transform processing unit 356 updates the buffer buf ′ (k) according to the following equation (23).
- the orthogonal transform processing unit 356 outputs the decoded signal y n ′′ as an output signal.
- As described above, according to the present embodiment, the high frequency band is divided into a plurality of subbands, and each subband is encoded using the encoding result of the adjacent subband. That is, an efficient search is performed using the correlation between the high frequency subbands (Adaptive Similarity Search Method, ASS), so the high frequency spectrum is encoded/decoded more efficiently. This suppresses unnatural noise in the decoded signal and improves its quality.
- Because the present invention searches the high frequency spectrum as described above, it can reduce the amount of calculation required for the similar-part search while achieving decoded signal quality comparable to that of a method that encodes/decodes the high frequency spectrum without using the correlation between subbands.
- In the present embodiment, the case has been described where the number J of subbands into which the gain encoding unit 265 divides the high frequency part of the input spectrum S2(k) differs from the number P of subbands into which the search unit 263 divides the high frequency part of the input spectrum S2(k). However, the present invention is not limited to this, and the number of subbands into which the gain encoding unit 265 divides the high frequency part of the input spectrum S2(k) may be P.
- the gain encoding unit 265 performs optimal processing in the search unit 263 in place of the square root of the spectral power ratio for each subband as shown in Expression (14).
- In the present embodiment, the case where the pitch coefficient setting unit 264 sets the search range of the pitch coefficient T as in Expression (9) has been described as an example. However, the present invention is not limited to this, and the search range of the pitch coefficient T may be set as in the following Expression (25).
- In this setting method, the pitch coefficient T is set to a value in the vicinity of the optimum pitch coefficient T_{p-1}' corresponding to the subband SB_{p-1}. The rationale is that the partial band of the first layer decoded spectrum most similar to the subband SB_{p-1} is also likely to be similar to the subband SB_p. In particular, when the correlation between the subband SB_{p-1} and the subband SB_p is very high, this pitch coefficient setting method makes the search more efficient.
- When the pitch coefficient setting unit 264 sets the search range of the pitch coefficient T as shown in Expression (25), the filtering unit 353 calculates the pitch coefficient T_p'' used for filtering according to Expression (26) instead of Expression (18).
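- The two search-range settings can be contrasted with the following sketch. The exact forms of Expressions (9), (25), and (26) are not reproduced in this text, so the window shapes below are assumptions: a window of width SEARCH centred on T_{p-1}' + BW_{p-1} for Expression (9), and centred on T_{p-1}' itself for Expression (25). The function and parameter names are hypothetical.

```python
def pitch_search_range(prev_pitch, prev_width, search, add_subband_width=True):
    """Return (low, high) bounds of the candidate pitch coefficients T.

    add_subband_width=True : window centred on prev_pitch + prev_width
                             (assumed reading of Expression (9))
    add_subband_width=False: window centred on prev_pitch itself
                             (assumed reading of Expression (25))
    """
    centre = prev_pitch + (prev_width if add_subband_width else 0)
    half = search // 2
    return centre - half, centre + half


def filtering_pitch(prev_pitch, prev_width, search, index, add_subband_width=True):
    """Decoder side: map the received offset T_p' back to T_p''."""
    low, _ = pitch_search_range(prev_pitch, prev_width, search, add_subband_width)
    return low + index
```

- Under these assumptions the decoder must use the same window as the encoder, which is why the text pairs Expression (25) with Expression (26) rather than Expression (18).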
- For a subband after the first, the pitch coefficient search range may also be fixed to the range Tmin to Tmax, as for the first subband.
- This prevents the search result corresponding to the first subband SB_0 from affecting all of the searches from the second subband SB_1 to the P-th subband SB_{P-1}. That is, it avoids the target range in which a similar part is searched for a given subband becoming excessively biased toward high frequencies.
- Embodiment 2 of the present invention describes a case where transform coding such as MDCT is used in the first layer coding section instead of the CELP coding method shown in Embodiment 1.
- The communication system (not shown) according to the second embodiment is basically the same as the communication system shown in FIG. 2, and differs from the encoding device 101 and decoding device 103 of that communication system only in part of the configuration and operation of the encoding device and decoding device.
- the encoding device and the decoding device of the communication system according to the present embodiment are denoted by reference numerals “111” and “113”, respectively.
- FIG. 9 is a block diagram showing the main components inside coding apparatus 111 according to the present embodiment.
- Coding apparatus 111 according to the present embodiment mainly includes downsampling processing section 201, first layer coding section 212, orthogonal transform processing section 215, second layer coding section 216, and coded information integration section 207.
- Since the downsampling processing unit 201 and the encoded information integration unit 207 perform the same processing as in the first embodiment, description thereof is omitted.
- The first layer encoding unit 212 encodes the downsampled input signal input from the downsampling processing unit 201 using a transform encoding method. Specifically, first layer encoding section 212 converts the downsampled input signal from a time domain signal into frequency domain components using a technique such as MDCT, and quantizes the obtained frequency components. First layer encoding section 212 outputs the quantized frequency components directly to second layer encoding section 216 as the first layer decoded spectrum.
- the MDCT process in first layer encoding section 212 is the same as the MDCT process shown in Embodiment 1, and thus detailed description thereof is omitted.
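- The transform-then-quantize structure of the first layer can be illustrated as follows. This is a textbook MDCT without windowing or frame buffering and a plain uniform quantizer, chosen only for illustration; it is not the MDCT formulation or the quantizer of this document, and all names and the step size are assumptions.

```python
import math


def mdct(block):
    """Textbook MDCT of a 2N-sample block, returning N coefficients."""
    two_n = len(block)
    if two_n == 0 or two_n % 2:
        raise ValueError("block length must be a positive even number")
    n = two_n // 2
    return [
        sum(block[i] * math.cos(math.pi / n * (i + 0.5 + n / 2) * (k + 0.5))
            for i in range(two_n))
        for k in range(n)
    ]


def quantize(coeffs, step):
    """Uniform scalar quantization: indices and the reconstructed spectrum."""
    indices = [round(c / step) for c in coeffs]
    return indices, [q * step for q in indices]
```

- In this picture the list of indices plays the role of the first layer encoded information, and the reconstructed values play the role of the first layer decoded spectrum that is passed on to the second layer.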
- the orthogonal transform processing unit 215 performs orthogonal transform such as MDCT on the input signal, and outputs the obtained frequency component to the second layer encoding unit 216 as a high frequency spectrum. Since the MDCT processing in the orthogonal transform processing unit 215 is the same as the MDCT processing shown in Embodiment 1, detailed description thereof is omitted.
- The second layer encoding unit 216 differs from the second layer encoding unit 206 shown in FIG. 3 only in that the first layer decoded spectrum is input from the first layer encoding unit 212. Since the other processes are the same as those of the second layer encoding unit 206, detailed description thereof is omitted.
- FIG. 10 is a block diagram showing a main configuration inside decoding apparatus 113 according to the present embodiment.
- decoding apparatus 113 according to the present embodiment mainly includes encoded information separation section 131, first layer decoding section 142, and second layer decoding section 145. Further, the encoded information separation unit 131 performs the same processing as in the first embodiment, and thus detailed description thereof is omitted.
- the first layer decoding unit 142 decodes the first layer encoded information input from the encoded information separation unit 131, and outputs the obtained first layer decoded spectrum to the second layer decoding unit 145.
- The decoding process in the first layer decoding unit 142 adopts a general inverse quantization method corresponding to the encoding method in the first layer encoding unit 212 shown in FIG. 9, and detailed description thereof is omitted.
- The second layer decoding unit 145 differs from the second layer decoding unit 135 shown in FIG. 7 only in that the first layer decoded spectrum is input from the first layer decoding unit 142. Since the other processes are the same as those of the second layer decoding unit 135, detailed description thereof is omitted.
- As described above, according to the present embodiment, the high frequency band is divided into a plurality of subbands, and each subband is encoded using the encoding result of the adjacent subband. That is, since an efficient search is performed using the correlation between the high frequency subbands, the high frequency spectrum can be encoded/decoded more efficiently, unnatural noise included in the decoded signal can be suppressed, and the quality of the decoded signal can be improved.
- The present invention can thus be applied even when, for example, a transform coding/decoding method is adopted for the first layer instead of the CELP coding/decoding method.
- In the present embodiment, the case where the downsampling processing unit 201 downsamples the input signal and inputs it to the first layer encoding unit 212 has been described as an example. However, the present invention is not limited to this: the downsampling processing unit 201 may be omitted, and the input spectrum output from the orthogonal transform processing unit 215 may be input to the first layer encoding unit 212.
- In this case, the orthogonal transform process can be omitted in the first layer encoding unit 212, and the amount of calculation can be reduced accordingly.
- Embodiment 3 of the present invention describes a configuration that analyzes the degree of correlation between high frequency subbands and, based on the analysis result, switches whether to perform the search using the optimum pitch coefficient of the adjacent subband.
- A communication system (not shown) according to the third embodiment of the present invention is basically the same as the communication system shown in FIG. 2, and differs from the encoding device 101 and the decoding device 103 of the communication system of FIG. 2 only in part of the configuration and operation of the encoding device and decoding device.
- the encoding device and the decoding device of the communication system according to the present embodiment will be denoted by reference numerals “121” and “123”, respectively.
- FIG. 11 is a block diagram showing a main configuration inside encoding apparatus 121 according to the present embodiment.
- The encoding apparatus 121 according to the present embodiment mainly includes a downsampling processing unit 201, a first layer encoding unit 202, a first layer decoding unit 203, an upsampling processing unit 204, an orthogonal transform processing unit 205, a correlation determination unit 221, a second layer encoding unit 226, and an encoded information integration unit 227.
- Since the components other than the correlation determination unit 221, the second layer encoding unit 226, and the encoded information integration unit 227 are the same as those in the first embodiment, description thereof is omitted.
- Here, SFM denotes the spectral flatness measure.
- The second layer encoding unit 226 generates second layer encoded information using the input spectrum S2(k) and the first layer decoded spectrum S1(k) input from the orthogonal transform processing unit 205 and the determination information input from the correlation determination unit 221, and outputs the generated second layer encoded information to the encoded information integration unit 227. Second layer encoding section 226 also outputs the internally calculated band division information to correlation determination section 221. Details of the band division information in second layer encoding section 226 will be described later.
- FIG. 12 is a block diagram showing the main configuration inside second layer encoding section 226 shown in FIG.
- the components other than the pitch coefficient setting unit 274 and the band dividing unit 275 are the same as those in the first embodiment, and thus the description thereof is omitted.
- When the determination information input from the correlation determination unit 221 is "0", the pitch coefficient setting unit 274, under the control of the search unit 263, sequentially outputs the pitch coefficient T to the filtering unit 262 while changing it little by little within the predetermined search range Tmin to Tmax. That is, in this case the pitch coefficient setting unit 274 sets the pitch coefficient T without considering search results corresponding to adjacent subbands.
- Otherwise, pitch coefficient setting unit 274 performs the same processing as pitch coefficient setting unit 264 according to Embodiment 1. That is, when performing the closed-loop search processing corresponding to the first subband SB_0 together with the filtering unit 262 and the search unit 263 under the control of the search unit 263, the pitch coefficient setting unit 274 sequentially outputs the pitch coefficient T to the filtering unit 262 while changing it little by little within the predetermined search range Tmin to Tmax.
- When performing the closed-loop search processing corresponding to a subband SB_p from the second subband onward, the pitch coefficient setting unit 274 uses the pitch coefficient T_{p-1}' obtained in the closed-loop search processing corresponding to the subband SB_{p-1}, and sequentially outputs the pitch coefficient T to the filtering unit 262 while changing it little by little according to Equation (9) above.
- the pitch coefficient setting unit 274 adaptively switches whether to set a pitch coefficient using a search result corresponding to an adjacent subband, according to the value of the input determination information. Therefore, search results corresponding to adjacent subbands can be used only when the correlation between subbands in a frame is equal to or higher than a predetermined level. Therefore, it is possible to suppress a decrease in encoding accuracy due to the use of the subband search result.
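- The adaptive switching can be sketched as below. The value "0" for low correlation follows the text; treating every other value as "use the adjacent subband", the assumed window form for Equation (9), and all names are illustrative assumptions.

```python
def candidate_pitch_range(p, determination, t_min, t_max,
                          prev_pitch=None, prev_width=None, search=None):
    """Search bounds for subband p given the frame's determination information."""
    if determination == 0 or p == 0:
        # Low inter-subband correlation, or the first subband: fixed range.
        return t_min, t_max
    # Otherwise centre the window on the adjacent subband's result
    # (assumed form of Equation (9)).
    centre = prev_pitch + prev_width
    return centre - search // 2, centre + search // 2
```

- Because the determination information is transmitted with the encoded information, a decoder can apply the same branch to decide whether a received pitch value is absolute or relative to the adjacent subband.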
- The band dividing unit 275 outputs the band division information, including the index BS_p (FL ≤ BS_p < FH) of each subband, to filtering section 262, search section 263, multiplexing section 266, and correlation determination section 221.
- The encoded information integration unit 227 integrates the first layer encoded information input from the first layer encoding unit 202, the determination information input from the correlation determination unit 221, and the second layer encoded information input from the second layer encoding unit 226, adds a transmission error code or the like to the integrated information source code if necessary, and outputs the result to the transmission path 102 as encoded information.
- FIG. 13 is a block diagram showing a main configuration inside decoding apparatus 123 according to the present embodiment.
- Decoding apparatus 123 according to the present embodiment mainly includes encoded information separation section 151, first layer decoding section 132, upsampling processing section 133, orthogonal transform processing section 134, and second layer decoding section 155.
- Since the constituent elements other than the encoded information separation unit 151 and the second layer decoding unit 155 are the same as those in the first embodiment, description thereof is omitted.
- The encoded information separation unit 151 separates the first layer encoded information, the second layer encoded information, and the determination information from the input encoded information, outputs the first layer encoded information to the first layer decoding unit 132, and outputs the second layer encoded information and the determination information to the second layer decoding unit 155.
- The second layer decoding unit 155 uses the first layer decoded spectrum S1(k) input from the orthogonal transform processing unit 134 and the second layer encoded information and determination information input from the encoded information separation unit 151 to generate a second layer decoded signal including high frequency components, and outputs it as an output signal.
- FIG. 14 is a block diagram showing the main configuration inside second layer decoding section 155 shown in FIG.
- the components other than the filtering unit 363 are the same as those in the first embodiment, and thus the description thereof is omitted.
- When the search did not use the results of adjacent subbands, the filtering unit 363 filters all P subbands from subband SB_0 to subband SB_{P-1} using the pitch coefficient T_p' input from the separation unit 351, without considering the pitch coefficients of adjacent subbands. In this case, the filtering process and the filter function are obtained by replacing T in Equation (15) and Equation (16) with T_p'.
- Otherwise, for the subbands from the second subband onward, the filtering unit 363 takes the pitch coefficient obtained from the separation unit 351 and, using the pitch coefficient T_{p-1}' of the subband SB_{p-1} and the subband width BW_{p-1}, calculates the pitch coefficient T_p'' used for filtering according to Equation (18) above. In this case, the filtering process and the filter function are obtained by replacing T in Equation (15) and Equation (16) with T_p''.
- As described above, according to the present embodiment, the high frequency band is divided into a plurality of subbands, and whether each subband is encoded using the encoding result of the adjacent subband is switched adaptively, based on the result of analyzing the degree of correlation between subbands for each frame. That is, an efficient search using the correlation between subbands is performed only when the correlation between subbands in a frame is equal to or higher than a predetermined level, so the high frequency spectrum can be encoded/decoded more efficiently and unnatural noise included in the decoded signal can be suppressed.
- When the correlation between subbands in a frame is lower than the predetermined level, the search result of the adjacent subband is not used. This suppresses the loss of encoding accuracy that would result from using the search result of a weakly correlated adjacent subband, and improves the quality of the decoded signal.
- In the present embodiment, the case has been described where the SFM value is analyzed for each subband, the SFM values of all subbands included in one frame are considered together, and correlation determination is performed for each frame to obtain the value of the determination information. However, the present invention is not limited to this, and the value of the determination information may be set by performing correlation determination individually for each subband.
- Instead of the SFM, the energy of each subband may be calculated, and the correlation determination may be made from the energy difference or ratio between the subbands to set the value of the determination information.
- Alternatively, the value of the determination information may be set by computing the correlation between the frequency components (MDCT coefficients, etc.) of the subbands and comparing the correlation value with a predetermined threshold value.
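- The measures named above can be sketched as follows. The SFM is computed with its usual definition (geometric mean over arithmetic mean of the power spectrum); the frame-level decision rule, its threshold, and all function names are hypothetical, since the text does not state how the per-subband values are combined.

```python
import math


def spectral_flatness(coeffs, eps=1e-12):
    """SFM: geometric mean / arithmetic mean of the power spectrum (0 to 1)."""
    power = [c * c + eps for c in coeffs]
    geometric = math.exp(sum(math.log(v) for v in power) / len(power))
    return geometric / (sum(power) / len(power))


def normalized_correlation(a, b, eps=1e-12):
    """Normalized cross-correlation of two equally long subbands."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b)) + eps
    return num / den


def determination_from_sfm(subbands, max_spread=0.2):
    """Hypothetical frame-level rule: 1 if all subbands are similarly flat."""
    sfm = [spectral_flatness(sb) for sb in subbands]
    return 1 if max(sfm) - min(sfm) <= max_spread else 0
```

- A flat subband yields an SFM close to 1 and a strongly peaked one yields an SFM close to 0, so a large spread of SFM values within a frame is taken here as a sign of low correlation between subbands.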
- In the present embodiment, the case where the pitch coefficient setting unit 274 sets the search range of the pitch coefficient T as in Equation (9) above has been described as an example. However, the present invention is not limited to this, and the search range of the pitch coefficient T may be set as in Equation (25) above.
- Embodiment 4 of the present invention describes a configuration in which the sampling frequency of the input signal is 32 kHz and the G.729.1 scheme standardized by ITU-T is applied as the encoding scheme of the first layer encoding unit.
- A communication system (not shown) according to the fourth embodiment of the present invention is basically the same as the communication system shown in FIG. 2, and differs from the encoding device 101 and the decoding device 103 of the communication system of FIG. 2 only in part of the configuration and operation of the encoding device and decoding device.
- the encoding device and the decoding device of the communication system according to the present embodiment will be described with reference numerals “161” and “163”, respectively.
- FIG. 15 is a block diagram showing a main configuration inside encoding apparatus 161 according to the present embodiment.
- Encoding apparatus 161 according to the present embodiment mainly includes downsampling processing unit 201, first layer encoding unit 233, orthogonal transform processing unit 215, second layer encoding unit 236, and encoded information integration unit 207.
- components other than the first layer encoding unit 233 and the second layer encoding unit 236 are the same as those in the first embodiment, and thus the description thereof is omitted.
- The first layer encoding unit 233 generates first layer encoded information by encoding the downsampled input signal input from the downsampling processing unit 201 using the G.729.1 speech encoding method. First layer encoding section 233 then outputs the generated first layer encoded information to encoded information integration section 207. The first layer encoding unit 233 also outputs information obtained in the process of generating the first layer encoded information to the second layer encoding unit 236 as the first layer decoded spectrum.
- Second layer encoding section 236 generates second layer encoded information using the input spectrum input from orthogonal transform processing section 215 and the first layer decoded spectrum input from first layer encoding section 233. The generated second layer encoded information is output to encoded information integration section 207. Details of second layer encoding section 236 will be described later.
- FIG. 16 is a block diagram showing the main configuration inside first layer encoding section 233 shown in FIG.
- Here, a case where the G.729.1 encoding scheme is applied to the first layer encoding unit 233 will be described as an example.
- The first layer encoding unit 233 illustrated in FIG. 16 includes a band division processing unit 281, a high-pass filter 282, a CELP (Code Excited Linear Prediction) encoding unit 283, an FEC (Forward Error Correction) encoding unit 284, an addition unit 285, a low-pass filter 286, a TDAC (Time-Domain Aliasing Cancellation) encoding unit 287, a TDBWE (Time-Domain Bandwidth Extension) encoding unit 288, and a multiplexing unit 289. Each unit performs the following operations.
- The band division processing unit 281 applies band division processing by a QMF (Quadrature Mirror Filter) or the like to the downsampled input signal with a sampling frequency of 16 kHz input from the downsampling processing unit 201, and generates a first low-frequency signal in the 0 to 4 kHz band and a second low-frequency signal in the 4 to 8 kHz band.
- the band division processing unit 281 outputs the generated first low-frequency signal to the high-pass filter 282 and outputs the second low-frequency signal to the low-pass filter 286.
- The high-pass filter 282 suppresses frequency components of 0.05 kHz or less in the first low-frequency signal input from the band division processing unit 281 to obtain a signal consisting mainly of frequency components higher than 0.05 kHz, and outputs it to the CELP encoding unit 283 and the addition unit 285 as the filtered first low-frequency signal.
- The CELP encoding unit 283 performs CELP encoding on the filtered first low-frequency signal input from the high-pass filter 282, and outputs the obtained CELP parameters to the FEC encoding unit 284, the TDAC encoding unit 287, and the multiplexing unit 289.
- the CELP encoding unit 283 may output part of the CELP parameter or information obtained in the process of generating the CELP parameter to the FEC encoding unit 284 and the TDAC encoding unit 287.
- CELP encoding section 283 performs CELP decoding using the generated CELP parameter, and outputs the resulting CELP decoded signal to adding section 285.
- The FEC encoding unit 284 uses the CELP parameters input from the CELP encoding unit 283 to calculate the FEC parameters used for the erasure frame compensation process of the decoding device 163, and outputs the calculated FEC parameters to the multiplexing unit 289.
- The addition unit 285 outputs to the TDAC encoding unit 287 a difference signal obtained by subtracting the CELP decoded signal input from the CELP encoding unit 283 from the filtered first low-frequency signal input from the high-pass filter 282.
- The low-pass filter 286 suppresses frequency components above 7 kHz in the second low-frequency signal input from the band division processing unit 281 to obtain a signal consisting mainly of frequency components of 7 kHz or less, and outputs it to the TDAC encoding unit 287 and the TDBWE encoding unit 288 as the filtered second low-frequency signal.
- The TDAC encoding unit 287 applies an orthogonal transform such as MDCT to the difference signal input from the addition unit 285 and to the filtered second low-frequency signal input from the low-pass filter 286, and quantizes the obtained frequency domain signals (MDCT coefficients). The TDAC encoding unit 287 then outputs the TDAC parameters obtained by the quantization to the multiplexing unit 289. The TDAC encoding unit 287 also performs decoding using the TDAC parameters, and outputs the obtained decoded spectrum to the second layer encoding unit 236 (FIG. 15) as the first layer decoded spectrum.
- the TDBWE encoding unit 288 performs band expansion encoding in the time domain on the filtered second low-frequency signal input from the low-pass filter 286, and outputs the obtained TDBWE parameter to the multiplexing unit 289.
- the multiplexing unit 289 multiplexes the FEC parameter, the CELP parameter, the TDAC parameter, and the TDBWE parameter, and outputs them to the encoded information integration unit 237 (FIG. 15) as the first layer encoded information. Note that these parameters may be multiplexed by the encoded information integration unit 237 without providing the multiplexing unit 289 in the first layer encoding unit 233.
- The encoding in first layer encoding section 233 according to the present embodiment shown in FIG. 16 differs from G.729.1 encoding in that TDAC encoding section 287 outputs the decoded spectrum obtained by decoding the TDAC parameters to second layer encoding section 236 as the first layer decoded spectrum.
- FIG. 17 is a block diagram showing the main configuration inside second layer encoding section 236 shown in FIG.
- the constituent elements other than the pitch coefficient setting unit 294 are the same as those in the first embodiment, and thus the description thereof is omitted.
- Pitch coefficient setting section 294 presets a pitch coefficient search range for some of the plurality of subbands, and sets the pitch coefficient search range for the other subbands based on the search result corresponding to the adjacent subband.
- For example, when performing the closed-loop search processing corresponding to the first subband SB_0, the third subband SB_2, or the fifth subband SB_4 together with the filtering unit 262 and the search unit 263 under the control of the search unit 263, the pitch coefficient setting unit 294 sequentially outputs the pitch coefficient T to the filtering unit 262 while changing it little by little within the search range preset for that subband:
- for the first subband SB_0, the search range Tmin1 to Tmax1;
- for the third subband SB_2, the search range Tmin3 to Tmax3;
- for the fifth subband SB_4, the search range Tmin5 to Tmax5.
- The pitch coefficient setting unit 294 may set the pitch coefficient T so that the higher the subband among the plurality of subbands, the higher the band (high frequency band) of the first decoded spectrum used as its search range. That is, pitch coefficient setting section 294 presets the search range of each subband so that the higher the subband, the higher the band of the first decoded spectrum that is searched.
- By setting the search ranges in this way, the search unit 263 can search a range suited to each subband, so an improvement in encoding efficiency can be expected.
- Conversely, pitch coefficient setting section 294 may set the pitch coefficient T so that the higher the subband, the lower the band (low frequency band) of the first decoded spectrum used as its search range. That is, pitch coefficient setting section 294 presets the search range of each subband so that the higher the subband, the lower the band of the first decoded spectrum that is searched. For example, when the harmonic structure of the 0 to 4 kHz spectrum of the first decoded spectrum is weaker than that of the 4 to 7 kHz spectrum, the higher the subband, the more likely it is that a portion similar to it exists in the low frequency part of the first decoded spectrum.
- In that case, the search unit 263 searches for the portion similar to a higher subband in the lower band, whose harmonic structure is weaker than that of the high frequency part of the first decoded spectrum, so the search efficiency improves.
- In the present embodiment, the decoded spectrum obtained from the TDAC encoding unit 287 in the first layer encoding unit 233 is used as the first decoded spectrum.
- In this case, the 0 to 4 kHz portion of the first decoded spectrum is a component obtained by subtracting the CELP decoded signal calculated by the CELP encoding unit 283 from the input signal, so its harmonic structure is relatively weak. For this reason, setting the search range so that it is biased toward lower frequencies for higher subbands is effective.
- Pitch coefficient setting section 294 sets the pitch coefficient T based on the optimum pitch coefficient T p-1 ′ found for the immediately preceding subband (the adjacent subband on the low-band side) only for the second subband and the fourth subband. That is, pitch coefficient setting section 294 sets the pitch coefficient T based on the optimum pitch coefficient T p-1 ′ of the immediately preceding subband only for every other subband.
- This reduces the influence that the search result for a low subband has on the searches for all subbands above it, so the value of the pitch coefficient T set for a high subband can be kept from becoming too large.
- In other words, the search range used to find a portion similar to a high subband is kept from being confined to the high band. Accordingly, it is possible to avoid searching for the optimum pitch coefficient in a band that is unlikely to be similar, and to avoid degradation of coding efficiency and of decoded signal quality.
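The search range setting described above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the preset values in `PRESET`, the helper name `search_range`, and the assumed form of the relative range (centred on T p-1 ′ + BW p-1 with half-width SEARCH/2) are all assumptions made for the example.

```python
from typing import Dict, Tuple

Range = Tuple[int, int]

# Hypothetical preset ranges (Tmin, Tmax) for SB0, SB2 and SB4. Higher subbands
# get lower ranges, as the text suggests. The numbers are illustrative only.
PRESET: Dict[int, Range] = {0: (40, 71), 2: (20, 51), 4: (0, 31)}
DEPENDENT = {1, 3}  # second and fourth subbands (0-based index p)


def search_range(p: int, prev_opt: int, prev_bw: int, search: int) -> Range:
    """Return the inclusive pitch-coefficient search range for subband p."""
    if p in DEPENDENT:
        # Assumed form: centre the range on T'_{p-1} + BW_{p-1}.
        centre = prev_opt + prev_bw
        return (centre - search // 2, centre + search // 2)
    # Independent subbands ignore the previous result and use their preset.
    return PRESET[p]
```

Because only every other subband depends on its predecessor, an unusually large pitch coefficient found for one subband can affect at most the next subband in this sketch.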
- FIG. 18 is a block diagram showing a main configuration inside decoding apparatus 163 according to the present embodiment.
- Decoding apparatus 163 according to the present embodiment mainly includes encoded information separation section 171, first layer decoding section 172, second layer decoding section 173, orthogonal transform processing section 174, and addition section 175.
- Encoded information separation section 171 separates the input encoded information into first layer encoded information and second layer encoded information, outputs the first layer encoded information to first layer decoding section 172, and outputs the second layer encoded information to second layer decoding section 173.
- First layer decoding section 172 decodes the first layer encoded information input from encoded information separation section 171 using the G.729.1 speech coding scheme, and outputs the generated first layer decoded signal to addition section 175.
- First layer decoding section 172 also outputs the first layer decoded spectrum obtained in the process of generating the first layer decoded signal to second layer decoding section 173. The operation of first layer decoding section 172 is described in detail later.
- Second layer decoding section 173 decodes the spectrum of the high band using the first layer decoded spectrum input from first layer decoding section 172 and the second layer encoded information input from encoded information separation section 171, and outputs the resulting second layer decoded spectrum to orthogonal transform processing section 174.
- The processing of second layer decoding section 173 is the same as that of second layer decoding section 135 of FIG. 7 except that the input signals and their sources differ. The operation of second layer decoding section 173 is described later.
- Orthogonal transformation processing section 174 performs orthogonal transformation processing (IMDCT) on the second layer decoded spectrum input from second layer decoding section 173 and outputs the obtained second layer decoded signal to addition section 175.
- The IMDCT processing performed by orthogonal transform processing section 174 is the same as the processing of orthogonal transform processing section 356 shown in FIG. 8 except that the input signal and its source differ, so its description is omitted.
- Adder 175 adds the first layer decoded signal input from first layer decoding section 172 and the second layer decoded signal input from orthogonal transform processing section 174, and outputs the resulting signal as an output signal.
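The data flow through decoding apparatus 163 can be summarised in a short sketch. The four stage functions are passed in as parameters because their internals are described elsewhere. The name `decode_frame` and the list-based signal representation are assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def decode_frame(
    encoded: object,
    separate: Callable[[object], Tuple[object, object]],
    decode_layer1: Callable[[object], Tuple[Sequence[float], Sequence[float]]],
    decode_layer2: Callable[[Sequence[float], object], Sequence[float]],
    imdct: Callable[[Sequence[float]], Sequence[float]],
) -> List[float]:
    """Mirror sections 171 to 175: separate, decode both layers, transform, add."""
    l1_info, l2_info = separate(encoded)                # section 171
    l1_signal, l1_spectrum = decode_layer1(l1_info)     # section 172
    l2_spectrum = decode_layer2(l1_spectrum, l2_info)   # section 173
    l2_signal = imdct(l2_spectrum)                      # section 174
    # Section 175: sample-wise sum of the two decoded signals.
    return [a + b for a, b in zip(l1_signal, l2_signal)]
```

The second layer never sees the first layer time signal in this sketch. It only receives the first layer decoded spectrum, matching the description of sections 172 and 173.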
- FIG. 19 is a block diagram showing the main components inside first layer decoding section 172 shown in FIG.
- Here, a configuration in which first layer decoding section 172 performs decoding according to G.729.1, standardized by ITU-T, is described as an example.
- the configuration of first layer decoding section 172 shown in FIG. 19 is a configuration in the case where no frame error occurs during transmission, and the components for frame error compensation processing are not shown and description thereof is omitted.
- the present invention can also be applied when a frame error occurs.
- First layer decoding section 172 includes separation section 371, CELP decoding section 372, TDBWE decoding section 373, TDAC decoding section 374, pre/post-echo reduction section 375, addition section 376, adaptive post-processing section 377, low-pass filter 378, pre/post-echo reduction section 379, high-pass filter 380, and band synthesis processing section 381. Each section operates as follows.
- Separation section 371 separates the first layer encoded information input from encoded information separation section 171 (FIG. 18) into CELP parameters, TDAC parameters, and TDBWE parameters, and outputs the CELP parameters to CELP decoding section 372, the TDAC parameters to TDAC decoding section 374, and the TDBWE parameters to TDBWE decoding section 373. Note that encoded information separation section 171 may separate these parameters all at once, without separation section 371 being provided.
- CELP decoding section 372 performs CELP decoding using the CELP parameters input from separation section 371, and outputs the obtained decoded signal as a decoded CELP signal to TDAC decoding section 374, addition section 376, and pre/post-echo reduction section 375.
- the CELP decoding unit 372 may output other information obtained in the process of generating the decoded CELP signal from the CELP parameter to the TDAC decoding unit 374 in addition to the decoded CELP signal.
- the TDBWE decoding unit 373 decodes the TDBWE parameter input from the separation unit 371, and outputs the obtained decoded signal to the TDAC decoding unit 374 and the pre / post-echo reduction unit 379 as a decoded TDBWE signal.
- TDAC decoding section 374 calculates the first layer decoded spectrum using the TDAC parameters input from separation section 371, the decoded CELP signal input from CELP decoding section 372, and the decoded TDBWE signal input from TDBWE decoding section 373. TDAC decoding section 374 then outputs the calculated first layer decoded spectrum to second layer decoding section 173 (FIG. 18). Note that the first layer decoded spectrum obtained here is the same as the first layer decoded spectrum calculated by first layer encoding section 233 (FIG. 15) in encoding apparatus 161.
- TDAC decoding section 374 also performs orthogonal transform processing such as MDCT on the 0 to 4 kHz band and the 4 to 8 kHz band of the calculated first layer decoded spectrum, and thereby calculates a decoded first TDAC signal (0 to 4 kHz band) and a decoded second TDAC signal (4 to 8 kHz band).
- the TDAC decoding unit 374 outputs the calculated decoded first TDAC signal to the pre / post-echo reduction unit 375, and outputs the decoded second TDAC signal to the pre / post-echo reduction unit 379.
- Pre/post-echo reduction section 375 performs pre/post-echo reduction processing on the decoded CELP signal input from CELP decoding section 372 and the decoded first TDAC signal input from TDAC decoding section 374, and outputs the echo-reduced signal to addition section 376.
- Addition section 376 adds the decoded CELP signal input from CELP decoding section 372 and the echo-reduced signal input from pre/post-echo reduction section 375, and outputs the resulting sum signal to adaptive post-processing section 377.
- the adaptive post-processing unit 377 adaptively performs post-processing on the addition signal input from the addition unit 376, and outputs the obtained decoded first low-frequency signal (0 to 4 kHz band) to the low-pass filter 378.
- Low-pass filter 378 suppresses frequency components above 4 kHz in the decoded first low-band signal input from adaptive post-processing section 377 to obtain a signal consisting mainly of frequency components at or below 4 kHz, and outputs it to band synthesis processing section 381 as a filtered decoded first low-band signal.
- Pre/post-echo reduction section 379 performs pre/post-echo reduction processing on the decoded second TDAC signal input from TDAC decoding section 374 and the decoded TDBWE signal input from TDBWE decoding section 373, and outputs the echo-reduced signal to high-pass filter 380 as a decoded second low-band signal (4 to 8 kHz band).
- High-pass filter 380 suppresses frequency components at or below 4 kHz in the decoded second low-band signal input from pre/post-echo reduction section 379 to obtain a signal consisting mainly of frequency components above 4 kHz, and outputs it to band synthesis processing section 381 as a filtered decoded second low-band signal.
- the band synthesis processing unit 381 receives the filtered decoded first low-frequency signal from the low-pass filter 378 and receives the filtered decoded second low-frequency signal from the high-pass filter 380.
- Band synthesis processing section 381 performs band synthesis processing on the filtered decoded first low-band signal (0 to 4 kHz band) and the filtered decoded second low-band signal (4 to 8 kHz band), both of which have a sampling frequency of 8 kHz, and generates a first layer decoded signal with a sampling frequency of 16 kHz (0 to 8 kHz band).
- Band synthesis processing section 381 then outputs the generated first layer decoded signal to addition section 175.
- band synthesis processing may be performed collectively by the addition unit 175 without providing the band synthesis processing unit 381.
- Decoding in first layer decoding section 172 according to the present embodiment shown in FIG. 19 differs from decoding in G.729.1 only in that TDAC decoding section 374 outputs the first layer decoded spectrum to second layer decoding section 173 at the point when that spectrum has been calculated from the TDAC parameters.
- FIG. 20 is a block diagram showing the main configuration inside second layer decoding section 173 shown in FIG.
- the internal configuration of second layer decoding section 173 shown in FIG. 20 is a configuration in which orthogonal transform processing section 356 is omitted from second layer decoding section 135 shown in FIG. In second layer decoding section 173, the components other than filtering section 390 and spectrum adjustment section 391 are the same as those in second layer decoding section 135, so their description is omitted.
- the filtering unit 390 includes a multi-tap pitch filter (the number of taps is greater than 1).
- the filtering unit 390 also uses the filter function shown in Expression (15). However, in this case, the filtering process and the filter function are obtained by replacing T in Equation (15) and Equation (16) with T p ′.
- the filtering process is performed according to an equation in which T is replaced with T p ′′ in equation (16).
- For subband SB p (p = 1, 2, ..., P−1), the bandwidth BW p-1 of subband SB p-1 is added to the pitch coefficient T p-1 ′ of subband SB p-1, half the value of the search range SEARCH is subtracted from the sum, and T p ′ is added to the resulting index to give the pitch coefficient T p ′′. That is, T p ′′ = T p-1 ′ + BW p-1 − SEARCH/2 + T p ′.
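As a small worked example of this computation, assume the relation reads T p ′′ = T p-1 ′ + BW p-1 − SEARCH/2 + T p ′, where T p ′ is the index transmitted relative to the start of the search range. The function names below are hypothetical, and integer division is assumed for SEARCH/2.

```python
def filtering_pitch(t_rel: int, prev_opt: int, prev_bw: int, search: int) -> int:
    """Decoder side: turn the relative index T_p' into the filtering pitch T_p''."""
    return prev_opt + prev_bw - search // 2 + t_rel


def relative_index(t_abs: int, prev_opt: int, prev_bw: int, search: int) -> int:
    """Encoder side (assumed inverse): express an absolute pitch as a relative index."""
    return t_abs - (prev_opt + prev_bw - search // 2)
```

With T p-1 ′ = 50, BW p-1 = 16, SEARCH = 8 and T p ′ = 3, the range starts at 50 + 16 − 4 = 62, so the filtering pitch is 65. Transmitting only the relative index keeps the codeword as short as the search range allows.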
- Spectrum adjustment section 391 adjusts the spectral shape of the estimated spectrum S2′(k) in the frequency band FL ≤ k < FH and generates the decoded spectrum S3(k).
- Spectrum adjustment section 391 sets the values of the low-band part (0 ≤ k < FL) of the decoded spectrum S3(k) to 0.
- Spectrum adjustment section 391 outputs the decoded spectrum in which the values of the low-band part (0 ≤ k < FL) are 0 to orthogonal transform processing section 174.
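The zeroing of the low-band part can be written in a couple of lines. This sketch assumes the decoded spectrum is held as a list indexed by k and that `fl` is the index FL. The function name is hypothetical.

```python
from typing import List, Sequence


def zero_low_band(s3: Sequence[float], fl: int) -> List[float]:
    """Return a copy of S3(k) with bins 0 <= k < FL set to 0."""
    n_low = min(fl, len(s3))
    return [0.0] * n_low + list(s3[n_low:])
```

Since the low band is carried by the first layer decoded signal, clearing it here keeps the second layer output from counting that band twice when the two signals are added.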
- the high frequency band is divided into a plurality of subbands. For some subbands (in this embodiment, the first subband, the third subband, and the fifth subband), a search is performed in a search range set for each subband. For other subbands (second subband and fourth subband in the present embodiment), a search is performed using the encoding result of the immediately preceding subband.
- As in the fourth embodiment, a configuration is described in which the sampling frequency of the input signal is 32 kHz and G.729.1, standardized by ITU-T, is applied as the encoding scheme of the first layer encoding section.
- A communication system (not shown) according to the fifth embodiment of the present invention is basically the same as the communication system shown in FIG. 2, and differs from it only in part of the configuration and operation of the encoding apparatus and decoding apparatus, which replace encoding apparatus 101 and decoding apparatus 103 of the communication system in FIG. 2.
- the encoding device and the decoding device of the communication system according to the present embodiment are denoted by reference numerals “181” and “184”, respectively.
- Encoding apparatus 181 (not shown) according to the present embodiment is basically the same as encoding apparatus 161 shown in FIG. 15, and mainly comprises downsampling processing section 201, first layer encoding section 233, orthogonal transform processing section 215, second layer encoding section 246, and encoded information integration section 207.
- Since the components other than second layer encoding section 246 are the same as those in the fourth embodiment, their description is omitted.
- Second layer encoding section 246 generates second layer encoded information using the input spectrum input from orthogonal transform processing section 215 and the first layer decoded spectrum input from first layer encoding section 233. The generated second layer encoded information is output to encoded information integration section 207. Details of second layer encoding section 246 will be described later.
- FIG. 21 is a block diagram showing a main configuration inside second layer encoding section 246 according to the present embodiment.
- the constituent elements other than the pitch coefficient setting unit 404 are the same as those in the fourth embodiment, and thus the description thereof is omitted.
- The high-band part (FL ≤ k < FH) of the input spectrum S2(k) is divided into five subbands SB p.
- Pitch coefficient setting section 404 presets a pitch coefficient search range for some of the plurality of subbands and, for the other subbands, sets the pitch coefficient search range based on the search result for the immediately preceding adjacent subband.
- Under the control of search section 263, and together with filtering section 262 and search section 263, pitch coefficient setting section 404 performs the closed-loop search process corresponding to the first subband SB 0, the third subband SB 2, or the fifth subband SB 4 while changing the pitch coefficient T little by little within the search range preset for that subband.
- Specifically, when performing the closed-loop search process corresponding to the first subband SB 0, pitch coefficient setting section 404 sets the pitch coefficient T while changing it little by little within the search range Tmin1 to Tmax1 preset for the first subband.
- When performing the closed-loop search process corresponding to the third subband SB 2, pitch coefficient setting section 404 sets the pitch coefficient T while changing it little by little within the search range Tmin3 to Tmax3 preset for the third subband.
- Similarly, when performing the closed-loop search process corresponding to the fifth subband SB 4, pitch coefficient setting section 404 sets the pitch coefficient T while changing it little by little within the search range Tmin5 to Tmax5 preset for the fifth subband.
- When performing the closed-loop search process corresponding to the second subband SB 1 or the fourth subband SB 3, pitch coefficient setting section 404 sequentially outputs the pitch coefficient T to filtering section 262 while changing it little by little, based on the optimum pitch coefficient T p-1 ′ obtained in the closed-loop search process corresponding to the immediately preceding adjacent subband SB p-1.
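The closed-loop search itself can be pictured as a loop over candidate pitch coefficients. The sketch below is a deliberate simplification and not the filter of Equations (15) and (16): it assumes a single-tap filter (the estimate for bin k is simply the low-band bin at k − T), skips candidates that would read outside the low band, and uses squared correlation divided by estimate energy as the similarity measure. All names are hypothetical.

```python
from typing import Sequence


def closed_loop_search(low: Sequence[float], target: Sequence[float],
                       sb_start: int, t_min: int, t_max: int) -> int:
    """Return the lag T in [t_min, t_max] whose low-band segment best matches target.

    Returns t_min if no candidate lies fully inside the low band.
    """
    n = len(target)
    best_t, best_score = t_min, float("-inf")
    for t in range(t_min, t_max + 1):
        first = sb_start - t  # index of the first low-band bin copied for this lag
        if first < 0 or first + n > len(low):
            continue  # simplification: ignore lags that leave the low band
        est = low[first:first + n]
        energy = sum(e * e for e in est)
        if energy == 0.0:
            continue  # an all-zero estimate carries no shape information
        corr = sum(x * e for x, e in zip(target, est))
        score = corr * corr / energy
        if score > best_score:
            best_t, best_score = t, score
    return best_t
```

The loop bounds `t_min` and `t_max` are exactly what the pitch coefficient setting section controls: a preset range for the first, third and fifth subbands, and a range derived from the previous optimum for the second and fourth.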
- When performing the closed-loop search process corresponding to the second subband SB 1, pitch coefficient setting section 404 sets the pitch coefficient T while changing it little by little within a search range calculated from the optimum pitch coefficient T 0 ′ of the first subband SB 0, the immediately preceding adjacent subband, according to equation (27) or equation (28).
- Here, P = 1 in equations (27) and (28).
- SEARCH1 and SEARCH2 in equations (27) and (28) denote predetermined setting ranges for the pitch coefficient to be searched. The following describes the case where SEARCH1 > SEARCH2.
- The range of the pitch coefficient T is corrected as shown in equations (31) and (32). Equation (31) corresponds to equations (27) and (30), and equation (32) corresponds to equations (28) and (29).
- The range of the pitch coefficient T is likewise corrected as shown in equations (33) and (34). Equation (33) corresponds to equations (27) and (30), and equation (34) corresponds to equations (28) and (29).
- Pitch coefficient setting section 404 adaptively changes the number of entries used when searching for the optimum pitch of the second subband and the fourth subband. That is, when the optimum pitch coefficient T 0 ′ of the first subband is smaller than a preset threshold (pattern 1), pitch coefficient setting section 404 increases the number of entries used when searching for the optimum pitch of the second subband. When the optimum pitch coefficient T 0 ′ of the first subband is equal to or greater than the threshold (pattern 2), it reduces the number of entries used when searching for the optimum pitch of the second subband. Pitch coefficient setting section 404 also increases or decreases the number of entries used when searching for the optimum pitch of the fourth subband according to the pattern (pattern 1 or pattern 2) that applied when searching for the optimum pitch of the second subband.
- Specifically, pitch coefficient setting section 404 reduces the number of entries used when searching for the optimum pitch of the fourth subband in the case of pattern 1, and increases it in the case of pattern 2.
- By keeping the sum of the number of entries for the second subband and the number of entries for the fourth subband equal between pattern 1 and pattern 2, the optimum pitch coefficient can be searched for more efficiently while the bit rate is kept fixed.
- When the input signal is an audio signal or the like, the first layer decoded spectrum generally has stronger periodicity on the lower-frequency side. Therefore, the lower the band in which the optimum pitch coefficient is searched for, the greater the benefit of increasing the number of search entries. Accordingly, when the value of the optimum pitch coefficient found for the first subband is small, increasing the number of entries used when searching for the optimum pitch of the second subband makes the search for the second subband more effective. In this case, the number of entries used when searching for the optimum pitch coefficient of the fourth subband is reduced.
- Conversely, when the value of the optimum pitch coefficient found for the first subband is large, increasing the number of entries for the second subband has little effect, so the number of entries used when searching for the optimum pitch coefficient of the second subband is reduced and the number of entries used for the fourth subband is increased.
- By adjusting the number of entries (bit allocation) for the optimum pitch coefficient search between the second subband and the fourth subband according to the value of the optimum pitch coefficient found for the first subband in this way, the optimum pitch coefficient can be searched for more efficiently and a high-quality decoded signal can be generated.
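The pattern-based entry allocation can be illustrated with a short sketch. The entry counts (16 and 8), the threshold value, and the function names are hypothetical. The source only states that the second subband gets more entries in pattern 1, fewer in pattern 2, and that the total is the same for both patterns.

```python
from typing import Tuple


def allocate_entries(t0_opt: int, threshold: int,
                     many: int = 16, few: int = 8) -> Tuple[int, int]:
    """Return (entries for the second subband, entries for the fourth subband)."""
    if t0_opt < threshold:
        return many, few   # pattern 1: favour the second subband
    return few, many       # pattern 2: favour the fourth subband


def index_bits(entries: int) -> int:
    """Bits needed to index `entries` candidates (entries >= 1)."""
    return (entries - 1).bit_length()
```

With these example values, either pattern spends 4 + 3 = 7 bits on the two pitch indices, so the bit rate does not depend on which pattern is selected.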
- the main internal configuration of the decoding device 184 (not shown) according to the present embodiment is basically the same as that of the decoding device 163 shown in FIG.
- the high frequency band is divided into a plurality of subbands. For some subbands (in this embodiment, the first subband, the third subband, and the fifth subband), a search is performed in a search range set for each subband. For other subbands (second subband and fourth subband in the present embodiment), a search is performed using the encoding result of the immediately preceding subband.
- the number of entries for search is adaptively switched based on the optimum pitch searched for the first subband.
- the present invention is not limited to this, and can be similarly applied to a configuration in which the total number of entries when searching for the optimum pitch coefficient for the second subband and the fourth subband is different for each pattern.
- In the present embodiment, the case where the number of entries used when searching for the optimum pitch coefficient of the second subband and the fourth subband is increased or decreased has been described as an example. The number of search entries may also be increased until the search range covers the entire low band.
- In the present embodiment, a configuration has been described in which, when the value of the optimum pitch coefficient T 0 ′ of the first subband is less than a predetermined threshold TH p (pattern 1), the number of search entries for the optimum pitch coefficient of the second subband is increased (the search range is widened) and the number of search entries for the optimum pitch coefficient of the fourth subband is reduced (the search range is narrowed).
- However, the present invention is not limited to the above-described configuration, and can be similarly applied to a configuration in which the search range setting methods for pattern 1 and pattern 2 are reversed.
- That is, the present invention can be similarly applied to a configuration in which, when the value of the optimum pitch coefficient T 0 ′ of the first subband is less than the predetermined threshold TH p (pattern 1), the number of search entries for the optimum pitch coefficient of the second subband is reduced (the search range is narrowed) and the number of search entries for the optimum pitch coefficient of the fourth subband is increased (the search range is widened), and in which pattern 2 likewise adopts the search range setting method opposite to the one described above.
- With such a configuration, it is possible to efficiently encode an input signal whose spectral characteristics differ greatly between the low-frequency side and the high-frequency side of the low band. Specifically, it has been experimentally confirmed that an input signal whose spectrum consists of a plurality of peak components, with the density of those peak components varying greatly from band to band, can be quantized efficiently.
- In the present embodiment as well, a configuration is described in which the sampling frequency of the input signal is 32 kHz and G.729.1, standardized by ITU-T, is applied as the encoding scheme of the first layer encoding section.
- A communication system (not shown) according to the sixth embodiment of the present invention is basically the same as the communication system shown in FIG. 2, and differs from it only in part of the configuration and operation of the encoding apparatus and decoding apparatus, which replace encoding apparatus 101 and decoding apparatus 103 of the communication system in FIG. 2.
- the encoding device and the decoding device of the communication system according to the present embodiment will be described with reference numerals “191” and “193”, respectively.
- Encoding apparatus 191 (not shown) according to the present embodiment is basically the same as encoding apparatus 161 shown in FIG. 15, and mainly comprises downsampling processing section 201, first layer encoding section 233, orthogonal transform processing section 215, second layer encoding section 256, and encoded information integration section 207.
- the constituent elements other than second layer encoding section 256 are the same as those in the fourth embodiment, and thus the description thereof is omitted.
- Second layer encoding section 256 generates second layer encoded information using the input spectrum input from orthogonal transform processing section 215 and the first layer decoded spectrum input from first layer encoding section 233. The generated second layer encoded information is output to encoded information integration section 207. Details of second layer encoding section 256 will be described later.
- FIG. 22 is a block diagram showing the main components inside second layer encoding section 256 according to the present embodiment.
- the constituent elements other than the pitch coefficient setting unit 414 are the same as those in the fourth embodiment, and thus description thereof is omitted.
- The high-band part (FL ≤ k < FH) of the input spectrum S2(k) is divided into five subbands SB p.
- Pitch coefficient setting section 414 presets a pitch coefficient search range for some of the plurality of subbands and, for the other subbands, sets the pitch coefficient search range based on the search result for the immediately preceding adjacent subband.
- Under the control of search section 263, and together with filtering section 262 and search section 263, pitch coefficient setting section 414 performs the closed-loop search process corresponding to the first subband SB 0, the third subband SB 2, or the fifth subband SB 4 while changing the pitch coefficient T little by little within the search range preset for that subband.
- Specifically, when performing the closed-loop search process corresponding to the first subband SB 0, pitch coefficient setting section 414 sets the pitch coefficient T while changing it little by little within the search range Tmin1 to Tmax1 preset for the first subband.
- When performing the closed-loop search process corresponding to the third subband SB 2, pitch coefficient setting section 414 sets the pitch coefficient T while changing it little by little within the search range Tmin3 to Tmax3 preset for the third subband. Similarly, when performing the closed-loop search process corresponding to the fifth subband SB 4, pitch coefficient setting section 414 sets the pitch coefficient T while changing it little by little within the search range Tmin5 to Tmax5 preset for the fifth subband.
- When performing the closed-loop search process corresponding to the second subband SB 1 or the fourth subband SB 3, pitch coefficient setting section 414 sequentially outputs the pitch coefficient T to filtering section 262 while changing it little by little, based on the optimum pitch coefficient T p-1 ′ obtained in the closed-loop search process corresponding to the immediately preceding adjacent subband SB p-1.
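- When performing the closed-loop search process corresponding to the second subband SB 1, if the value of the optimum pitch coefficient T 0 ′ of the first subband SB 0, the immediately preceding adjacent subband, is less than a predetermined threshold TH p, pitch coefficient setting section 414 sets the pitch coefficient T while changing it little by little within the search range calculated from T 0 ′ according to equation (9). Here, P = 1 in equation (9).
- Otherwise, pitch coefficient setting section 414 sets the pitch coefficient T while changing it little by little within the preset search range Tmin2 to Tmax2.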
- When performing the closed-loop search process corresponding to the fourth subband SB 3, if the value of the optimum pitch coefficient T 0 ′ of the first subband SB 0 is less than the predetermined threshold TH p, pitch coefficient setting section 414 sets the pitch coefficient T while changing it little by little within the search range calculated according to equation (9) from the optimum pitch coefficient T 2 ′ of the third subband SB 2, the immediately preceding adjacent subband. Here, P = 3 in equation (9).
- Otherwise, pitch coefficient setting section 414 sets the pitch coefficient T while changing it little by little within the preset search range Tmin4 to Tmax4.
- Pitch coefficient setting section 414 adaptively changes the search range used when searching for the optimum pitch of the second subband and the fourth subband according to the optimum pitch coefficient T p-1 ′ obtained in the closed-loop search process corresponding to the immediately preceding adjacent subband SB p-1. That is, only when the optimum pitch coefficient T p-1 ′ found for the immediately preceding adjacent subband SB p-1 is less than the threshold does pitch coefficient setting section 414 search for the optimum pitch coefficient within a range based on T p-1 ′.
- When the optimum pitch coefficient T p-1 ′ found for the immediately preceding adjacent subband SB p-1 is equal to or greater than the threshold, pitch coefficient setting section 414 searches for the optimum pitch coefficient within the preset search range. With such a configuration, it is possible to suppress the abnormal noise that occurs when the search range for the optimum pitch is biased toward the high band, and as a result the quality of the decoded signal can be improved.
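The threshold-based switch between a relative and a preset search range might look like the sketch below. The form of the relative range (centred on T p-1 ′ + BW p-1 with half-width SEARCH/2) is an assumption standing in for equation (9), which is not reproduced in this passage. The function and parameter names are hypothetical.

```python
from typing import Tuple

Range = Tuple[int, int]


def select_search_range(prev_opt: int, prev_bw: int, search: int,
                        threshold: int, preset: Range) -> Range:
    """Pick the inclusive search range for a dependent (second or fourth) subband."""
    if prev_opt < threshold:
        # Previous optimum is low enough: search around it (assumed form of eq. (9)).
        centre = prev_opt + prev_bw
        return (centre - search // 2, centre + search // 2)
    # Previous optimum is already high: fall back to the preset range so the
    # search does not drift further toward the high band.
    return preset
```

For example, with a threshold of 60, a previous optimum of 50 yields a range around 50 + BW p-1, while a previous optimum of 70 yields the preset range unchanged.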
- Decoding apparatus 193 (not shown) according to the present embodiment is basically the same as decoding apparatus 163 shown in FIG. 18, and mainly comprises encoded information separation section 171, first layer decoding section 172, second layer decoding section 183, orthogonal transform processing section 174, and addition section 175.
- the components other than the second layer decoding unit 183 are the same as those in the fourth embodiment, and thus description thereof is omitted.
- FIG. 23 is a block diagram showing a main configuration inside second layer decoding section 183 according to the present embodiment.
- the components other than the filtering unit 490 are the same as those in the fourth embodiment, and thus the description thereof is omitted.
- the filtering unit 490 includes a multi-tap pitch filter (the number of taps is greater than 1).
- the filtering unit 490 also uses the filter function shown in Expression (15). However, in this case, the filtering process and the filter function are obtained by replacing T in Equation (15) and Equation (16) with T p ′.
- Filtering section 490 calculates the pitch coefficient T p ′′ used for filtering according to equation (18). In this case, the filtering process is performed according to the equation obtained by replacing T with T p ′′ in equation (16).
- On the other hand, when the value of the pitch coefficient obtained from separation section 351 is equal to or greater than the predetermined threshold TH p, the filtering process and the filter function are those obtained by replacing T in Equation (15) and Equation (16) with T p ′.
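On the decoder side, the same threshold decision has to be reproduced so that the filter uses the pitch the encoder intended. The sketch below assumes that the value compared with TH p is the previous subband's decoded pitch coefficient, and that equation (18) has the form T p ′′ = T p-1 ′ + BW p-1 − SEARCH/2 + T p ′. Both are assumptions, and the function name is hypothetical.

```python
def decoder_filter_pitch(t_p: int, prev_opt: int, prev_bw: int,
                         search: int, threshold: int) -> int:
    """Return the pitch coefficient the filter should use for a dependent subband."""
    if prev_opt < threshold:
        # Relative coding was used: rebuild the absolute pitch T_p''.
        return prev_opt + prev_bw - search // 2 + t_p
    # Preset range was used: the decoded value T_p' is applied directly.
    return t_p
```

Because the decision depends only on values the decoder already has, no extra side information is needed to signal which case applies.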
- the high frequency band is divided into a plurality of subbands. For some subbands (in this embodiment, the first subband, the third subband, and the fifth subband), a search is performed in a search range set for each subband. For other subbands (second subband and fourth subband in the present embodiment), a search is performed using the encoding result of the immediately preceding subband.
- the number of entries for search is adaptively switched based on the optimum pitch searched for the first subband.
- In Embodiments 4 to 6, the case where the first layer encoding section and the first layer decoding section use the G.729.1 encoding/decoding scheme has been described as an example.
- However, the encoding/decoding scheme used in the first layer encoding section and the first layer decoding section in the present invention is not limited to G.729.1.
- For example, the present invention can be similarly applied to configurations that employ another encoding/decoding scheme, such as G.718, in the first layer encoding section and the first layer decoding section.
- In Embodiments 4 to 6, the case where information obtained in the first layer encoding section (the decoded spectrum of the TDAC parameters obtained by TDAC encoding section 287) is used as the first layer decoded spectrum has been described.
- the present invention is not limited to this, and can be similarly applied to a case where other information calculated inside the first layer encoding unit is used as the first layer decoded spectrum.
- The present invention can also be applied to the case where the first layer decoded signal obtained by decoding the first layer encoded information is subjected to processing such as orthogonal transformation and the resulting spectrum is used as the first layer decoded spectrum.
- That is, the present invention does not depend on the characteristics of the first layer decoded spectrum, and the same effect is obtained when any spectrum calculated from parameters computed inside the first layer encoding section, or from the decoded signal obtained by decoding the first layer encoded information, is used as the first layer decoded spectrum.
- the case where the search ranges preset for some subbands (in this embodiment, the first subband, the third subband, and the fifth subband) differ for each subband has been described as an example.
- the present invention is not limited to this, and a common search range may be set for all subbands or some subband groups.
- the case where the amount of variation in spectral power relative to the input spectrum is encoded for each subband has been described as an example.
- the present invention is not limited to this, and the gain encoding unit 265 may encode the ideal gain corresponding to the optimum pitch coefficient T_p′ calculated by the search unit 263.
- the subband configuration of the gain encoded by the gain encoding unit 265 is the same as the subband configuration at the time of filtering. With this configuration, it is possible to generate an estimated spectrum that approximates the high frequency part of the input spectrum, and to reduce the noise that can be included in the decoded signal.
- the case where the decoded signal of the second layer is always used as the output signal on the decoding side has been described.
- the present invention is not limited to this, and the output signal may be switched between the decoded signal of the first layer and the decoded signal of the second layer. For example, when part of the encoded information is lost in the transmission path or a transmission error occurs in the encoded information, only the decoded signal from first layer decoding may be obtainable. In such a case, the decoded signal of the first layer is output as the output signal.
- the apparatus may be a scalable encoding apparatus / decoding apparatus having three or more layers.
- a configuration has been described in which a common range called SEARCH is set for each subband as the range of pitch coefficients set by the pitch coefficient setting units 264 and 274 in order to search for the optimum pitch coefficient corresponding to each subband.
- the pitch coefficient range set by the pitch coefficient setting units 264, 274, 294, 404, and 414 for searching for the optimum pitch coefficient corresponding to each subband may also differ for each subband.
- a configuration has been described in which the search covers a symmetric range (±SEARCH), using the common range called SEARCH, around the position obtained by adding the previous subband width to the optimum pitch coefficient of the previous subband.
- the present invention is not limited to this, and can be similarly applied to a configuration in which an asymmetric range is used as a search range for the optimal pitch coefficient with respect to a position obtained by adding the previous subband width to the optimal pitch coefficient of the previous subband.
- for example, the search range may be set wider on the low frequency side of the position obtained by adding the previous subband width to the optimum pitch coefficient of the previous subband, and narrower on the high frequency side.
- a configuration has been described in which for several subbands, a range for searching for an optimal pitch coefficient is set based on the optimal pitch coefficient for an adjacent previous subband.
- the above method uses a correlation on the frequency axis for the optimum pitch coefficient.
- the present invention is not limited to this, and can be similarly applied to the case where the correlation on the time axis is used for the optimum pitch coefficient.
- the periphery is set as the optimum pitch coefficient search range. In this case, the vicinity of the position obtained by the fourth-order linear prediction is searched.
- the correlation on the time axis as described above and the correlation on the frequency axis described in the above embodiments can be used in combination.
- the search range of the optimum pitch coefficient is set for a certain subband based on the optimum pitch coefficient searched for in the past frame and the optimum pitch coefficient searched for the adjacent previous subband.
- when the search range for the optimal pitch coefficient is set using the correlation on the time axis, there is a problem that transmission errors propagate.
- this can be dealt with by providing frames in which the search range for the optimal pitch coefficient is set without relying on the correlation on the time axis (for example, every time four frames are processed, a frame that does not use the correlation on the time axis is set).
- the encoding device, the decoding device, and these methods according to the present invention are not limited to the above embodiments, and can be implemented with various modifications.
- each embodiment can be implemented in combination as appropriate.
- the case where the decoding device in each of the above embodiments performs processing using the encoded information transmitted from the encoding device in each of the above embodiments has been described.
- the present invention is not limited to this; as long as the encoded information includes the necessary parameters and data, processing is possible even if the encoded information does not come from the encoding device in each of the above embodiments.
- the present invention can also be applied to a case where a signal processing program is recorded or written on a machine-readable recording medium such as a memory, a disk, a tape, a CD, or a DVD and then executed, in which case actions and effects similar to those of the present embodiments can be obtained.
- each functional block used in the description of each of the above embodiments is typically realized as an LSI which is an integrated circuit. These may be individually made into one chip, or may be made into one chip so as to include a part or all of them.
- the name used here is LSI, but it may also be called IC, system LSI, super LSI, or ultra LSI depending on the degree of integration.
- the method of circuit integration is not limited to LSI, and implementation with a dedicated circuit or a general-purpose processor is also possible.
- an FPGA (Field Programmable Gate Array), or a reconfigurable processor in which the connections and settings of circuit cells inside the LSI can be reconfigured, may be used.
- the encoding device, the decoding device, and these methods according to the present invention can improve the quality of the decoded signal when performing band extension using the low-band spectrum to estimate the high-band spectrum; for example, they can be applied to a packet communication system, a mobile communication system, and the like.
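The search-range rules described in the list above (a preset range for some subbands and, for the others, a range centred on the previous subband's optimum pitch coefficient plus the previous subband width, optionally asymmetric) can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the function names, the `similarity` callback, the clipping to `t_min`/`t_max`, and all numeric values are assumptions introduced for the example.

```python
from typing import Callable, Dict, List, Tuple


def candidate_pitch_range(prev_opt_pitch: int, prev_bandwidth: int,
                          search_low: int, search_high: int,
                          t_min: int, t_max: int) -> range:
    """Pitch coefficients to try for a subband that reuses its predecessor's result.

    The range is centred on (previous optimum + previous subband width).
    search_low == search_high gives the symmetric +/-SEARCH case; unequal
    values give the asymmetric variant. Clipping to [t_min, t_max] is an
    assumption of this sketch.
    """
    centre = prev_opt_pitch + prev_bandwidth
    return range(max(t_min, centre - search_low),
                 min(t_max, centre + search_high) + 1)


def search_subbands(preset: Dict[int, Tuple[int, int]], bandwidths: List[int],
                    similarity: Callable[[int, int], float],
                    search_low: int, search_high: int,
                    t_min: int, t_max: int) -> List[int]:
    """Return one optimum pitch coefficient per subband (0-based index)."""
    if 0 not in preset:
        raise ValueError("the first subband needs a preset search range")
    optimum: List[int] = []
    for p in range(len(bandwidths)):
        if p in preset:
            # Independently searched subband: use its own preset range.
            lo, hi = preset[p]
            candidates = range(lo, hi + 1)
        else:
            # Dependent subband: search near the previous subband's result.
            candidates = candidate_pitch_range(optimum[p - 1], bandwidths[p - 1],
                                               search_low, search_high, t_min, t_max)
        optimum.append(max(candidates, key=lambda t: similarity(p, t)))
    return optimum


# Toy similarity: closer to a made-up target pitch scores higher.
targets = [20, 30, 15, 50, 25]
best = search_subbands(preset={0: (10, 40), 2: (10, 40), 4: (10, 40)},
                       bandwidths=[8, 8, 8, 8, 8],
                       similarity=lambda p, t: -abs(t - targets[p]),
                       search_low=4, search_high=4, t_min=0, t_max=100)
print(best)
```

Indices 0, 2, and 4 play the role of the first, third, and fifth subbands, while indices 1 and 3 are searched only in a narrow window, which is where the reduction in search entries comes from. Passing a larger `search_low` than `search_high` reproduces the variant that is wider on the low frequency side.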
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Quality & Reliability (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
Description
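FIG. 2 is a block diagram showing the configuration of a communication system having an encoding device and a decoding device according to Embodiment 1 of the present invention. In FIG. 2, the communication system includes an encoding device and a decoding device, which can communicate with each other via a transmission path. Both the encoding device and the decoding device are usually mounted on and used in a base station apparatus, a communication terminal apparatus, or the like.
Embodiment 2 of the present invention describes a case where the first layer encoding unit uses transform coding such as MDCT instead of the CELP encoding method shown in Embodiment 1.
Embodiment 3 of the present invention describes a configuration that analyzes the degree of correlation between subbands in the high-band part and, based on the analysis result, switches whether or not to perform a search that uses the optimum pitch period of the adjacent subband.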
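The switching decision of Embodiment 3 can be illustrated with the SFM-based correlation measure recited in the claims (the reciprocal of the absolute difference between the SFMs of adjacent subbands). The sketch below is an assumption-laden illustration: the function names, the `eps` guard against division by zero and log of zero, and the threshold value are not taken from the patent.

```python
import math
from typing import List, Sequence


def spectral_flatness(spectrum: Sequence[float], eps: float = 1e-12) -> float:
    """SFM: geometric mean of the power spectrum divided by its arithmetic mean."""
    power = [x * x + eps for x in spectrum]  # eps avoids log(0)
    geometric = math.exp(sum(math.log(p) for p in power) / len(power))
    arithmetic = sum(power) / len(power)
    return geometric / arithmetic


def use_adjacent_search(subbands: List[Sequence[float]], threshold: float,
                        eps: float = 1e-12) -> List[bool]:
    """For each subband m >= 2, decide whether to search near subband m-1's result.

    The m-th correlation is 1 / |SFM_m - SFM_(m-1)|. True means the
    correlation is at or above the threshold, so the range derived from the
    previous optimum pitch coefficient would be used.
    """
    sfm = [spectral_flatness(s) for s in subbands]
    flags = []
    for m in range(1, len(sfm)):
        correlation = 1.0 / (abs(sfm[m] - sfm[m - 1]) + eps)
        flags.append(correlation >= threshold)
    return flags


# Two flat (noise-like) subbands followed by a peaky (tonal) one.
flags = use_adjacent_search([[1, 1, 1, 1], [2, 2, 2, 2], [1, 0, 0, 0]], threshold=10.0)
print(flags)
```

Subbands with similar flatness yield a large correlation value and reuse the neighbouring result; a jump from a noise-like to a tonal subband yields a small value and falls back to the preset range.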
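Embodiment 4 of the present invention describes a configuration in which the sampling frequency of the input signal is 32 kHz and the G.729.1 scheme standardized by ITU-T is applied as the encoding scheme of the first layer encoding unit.
Embodiment 5 of the present invention describes, as in Embodiment 4, a configuration in which the sampling frequency of the input signal is 32 kHz and the G.729.1 scheme standardized by ITU-T is applied as the encoding scheme of the first layer encoding unit.
Embodiment 6 of the present invention describes, as in Embodiment 4, a configuration in which the sampling frequency of the input signal is 32 kHz and the G.729.1 scheme standardized by ITU-T is applied as the encoding scheme of the first layer encoding unit.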
Claims (22)
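- An encoding device comprising:
first encoding means for encoding a low-band portion of an input signal at or below a predetermined frequency to generate first encoded information;
decoding means for decoding the first encoded information to generate a decoded signal; and
second encoding means for dividing a high-band portion of the input signal above the predetermined frequency into a plurality of subbands, and generating second encoded information by estimating each of the plurality of subbands from the input signal or the decoded signal using an estimation result of an adjacent subband.
- The encoding device according to claim 1, wherein the second encoding means comprises:
dividing means for dividing the high-band portion of the input signal into N subbands (N being an integer greater than 1) and obtaining the start position and bandwidth of each of the N subbands as band division information;
filtering means for filtering the decoded signal to generate N n-th estimated signals (n = 1, 2, ..., N), from a first estimated signal to an N-th estimated signal;
setting means for setting a pitch coefficient used in the filtering means while varying it;
searching means for searching, among the pitch coefficients, for the one that maximizes the degree of similarity between the n-th estimated signal and the n-th subband, as an n-th optimum pitch coefficient; and
multiplexing means for multiplexing the N optimum pitch coefficients, from a first optimum pitch coefficient to an N-th optimum pitch coefficient, with the band division information to obtain the second encoded information,
and wherein the setting means
sets the pitch coefficient used in the filtering means to estimate the first subband while varying it within a predetermined range, and sets the pitch coefficient used in the filtering means to estimate an m-th subband (m = 2, 3, ..., N), from the second subband onward, while varying it within a range corresponding to the (m-1)-th optimum pitch coefficient or within the predetermined range.
- The encoding device according to claim 2, wherein the setting means
sets the pitch coefficient using, as the range corresponding to the (m-1)-th optimum pitch coefficient, a range of predetermined width that includes the (m-1)-th optimum pitch coefficient.
- The encoding device according to claim 2, wherein the setting means
sets the pitch coefficient using, as the range corresponding to the (m-1)-th optimum pitch coefficient, a range of predetermined width that includes the pitch coefficient obtained by adding the bandwidth of the (m-1)-th subband to the (m-1)-th optimum pitch coefficient.
- The encoding device according to claim 2, wherein the setting means
sets the pitch coefficient used in the filtering means to estimate each of all the m-th subbands from the second subband onward while varying it within the range corresponding to the (m-1)-th optimum pitch coefficient.
- The encoding device according to claim 2, wherein the setting means,
among the m-th subbands from the second subband onward, sets the pitch coefficient used in the filtering means to estimate the m-th subbands at intervals of a predetermined number while varying it within the predetermined range, and sets the pitch coefficient used in the filtering means to estimate the other m-th subbands while varying it within the range corresponding to the (m-1)-th optimum pitch coefficient.
- The encoding device according to claim 2, wherein the setting means
sets the pitch coefficient using, as the predetermined range, a lower band of the decoded signal for higher subbands among the plurality of subbands.
- The encoding device according to claim 2, wherein the setting means
sets the pitch coefficient using, as the predetermined range, a higher band of the decoded signal for higher subbands among the plurality of subbands.
- The encoding device according to claim 2, further comprising
determining means for calculating the correlation between the m-th subband and the (m-1)-th subband as an m-th correlation and determining whether each of the N-1 m-th correlations is at or above a predetermined level,
wherein the setting means
sets the pitch coefficient used in the filtering means to estimate an m-th subband whose m-th correlation is determined by the determining means to be at or above the predetermined level while varying it within the range corresponding to the (m-1)-th optimum pitch coefficient, and
sets the pitch coefficient used in the filtering means to estimate an m-th subband whose m-th correlation is determined by the determining means to be below the predetermined level while varying it within the predetermined range.
- The encoding device according to claim 2, further comprising
determining means for calculating the correlation between the m-th subband and the (m-1)-th subband as an m-th correlation and determining whether the number of m-th correlations, among the N-1 m-th correlations, that are at or above a predetermined level is at or above a predetermined number,
wherein the setting means,
when the determining means determines that the number of m-th correlations at or above the predetermined level is at or above the predetermined number, sets the pitch coefficient used in the filtering means to estimate each of all the m-th subbands from the second subband onward while varying it within the range corresponding to the (m-1)-th optimum pitch coefficient, and,
when the determining means determines that the number of m-th correlations at or above the predetermined level is smaller than the predetermined number, sets the pitch coefficient used in the filtering means to estimate each of all the m-th subbands from the second subband onward while varying it within the predetermined range.
- The encoding device according to claim 9, wherein the determining means
calculates an SFM (Spectral Flatness Measure) for each of the N subbands, and calculates, as the m-th correlation, the reciprocal of the absolute value of the difference or ratio between the SFMs of the m-th subband and the (m-1)-th subband.
- The encoding device according to claim 9, wherein the determining means
calculates the energy of each of the N subbands, and calculates, as the m-th correlation, the reciprocal of the absolute value of the difference or ratio between the energies of the m-th subband and the (m-1)-th subband.
- The encoding device according to claim 2, wherein the setting means
compares the value of the (m-1)-th optimum pitch coefficient with a preset threshold and, according to the comparison result, increases or decreases the number of entries used when searching for the pitch coefficient used in the filtering means to estimate the m-th subband.
- The encoding device according to claim 2, wherein the setting means
compares the value of the (m-1)-th optimum pitch coefficient with a preset threshold and, according to the comparison result, switches the method of setting the pitch coefficient used in the filtering means to estimate the m-th subband.
- The encoding device according to claim 14, wherein the setting means
switches between a method of setting the pitch coefficient while varying it within the predetermined range and a method of setting the pitch coefficient while varying it within the range corresponding to the (m-1)-th optimum pitch coefficient.
- A communication terminal apparatus comprising the encoding device according to claim 1.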
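- A base station apparatus comprising the encoding device according to claim 1.
- A decoding device comprising:
receiving means for receiving first encoded information and second encoded information generated in an encoding device, the first encoded information being obtained by encoding a low-band portion of an input signal at or below a predetermined frequency, and the second encoded information being obtained by dividing a high-band portion of the input signal above the predetermined frequency into a plurality of subbands and estimating each of the plurality of subbands, from the input signal or from a first decoded signal obtained by decoding the first encoded information, using an estimation result of an adjacent subband;
first decoding means for decoding the first encoded information to generate a second decoded signal; and
second decoding means for generating a third decoded signal by estimating the high-band portion of the input signal from the second decoded signal using a decoding result of an adjacent subband obtained using the second encoded information.
- A communication terminal apparatus comprising the decoding device according to claim 18.
- A base station apparatus comprising the decoding device according to claim 18.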
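- An encoding method comprising the steps of:
encoding a low-band portion of an input signal at or below a predetermined frequency to generate first encoded information;
decoding the first encoded information to generate a decoded signal; and
dividing a high-band portion of the input signal above the predetermined frequency into a plurality of subbands, and generating second encoded information by estimating each of the plurality of subbands from the input signal or the decoded signal using an estimation result of an adjacent subband.
- A decoding method comprising the steps of:
receiving first encoded information and second encoded information generated in an encoding device, the first encoded information being obtained by encoding a low-band portion of an input signal at or below a predetermined frequency, and the second encoded information being obtained by dividing a high-band portion of the input signal above the predetermined frequency into a plurality of subbands and estimating each of the plurality of subbands, from the input signal or from a first decoded signal obtained by decoding the first encoded information, using an estimation result of an adjacent subband;
decoding the first encoded information to generate a second decoded signal; and
generating a third decoded signal by estimating the high-band portion of the input signal from the second decoded signal using a decoding result of an adjacent subband obtained using the second encoded information.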
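The claim that compares the (m-1)-th optimum pitch coefficient with a preset threshold and then increases or decreases the number of search entries can be sketched as below. The claims do not state which side of the threshold receives more entries, nor any concrete values, so the direction of the adjustment, the entry counts, and the function names here are all hypothetical choices made for illustration.

```python
from typing import List


def entries_for_next_subband(prev_opt_pitch: int, threshold: int,
                             few: int = 8, many: int = 16) -> int:
    """Choose how many pitch-coefficient candidates to test for subband m.

    Hypothetical rule: use more entries when the previous optimum is at or
    above the threshold, fewer otherwise.
    """
    return many if prev_opt_pitch >= threshold else few


def candidate_entries(centre: int, n_entries: int) -> List[int]:
    """List n_entries consecutive candidates placed around the given centre."""
    start = centre - n_entries // 2
    return list(range(start, start + n_entries))


# Example: previous optimum 30, previous subband width 8, threshold 25.
n = entries_for_next_subband(prev_opt_pitch=30, threshold=25)
cands = candidate_entries(centre=30 + 8, n_entries=n)
print(n, cands[0], cands[-1])
```

Because the decoder obtains the previous optimum pitch coefficient from the bitstream, it could apply the same rule to learn how many entries (and hence how many index bits) the next subband uses; that consequence is an inference of this sketch rather than a statement from the claims.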
Priority Applications (8)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP17195359.9A EP3288034B1 (en) | 2008-03-14 | 2009-03-13 | Decoding device, and method thereof |
US12/918,575 US8452588B2 (en) | 2008-03-14 | 2009-03-13 | Encoding device, decoding device, and method thereof |
EP09718708.2A EP2251861B1 (en) | 2008-03-14 | 2009-03-13 | Encoding device and method thereof |
CN2009801084302A CN101971253B (zh) | 2008-03-14 | 2009-03-13 | Encoding device, decoding device, and method thereof |
MX2010009307A MX2010009307A (es) | 2008-03-14 | 2009-03-13 | Encoding device, decoding device, and method thereof |
BRPI0908929A BRPI0908929A2 (pt) | 2008-03-14 | 2009-03-13 | Encoding device, decoding device, and method thereof |
RU2010137838/08A RU2483367C2 (ru) | 2008-03-14 | 2009-03-13 | Encoding device, decoding device, and method for their operation |
JP2010502731A JP5449133B2 (ja) | 2008-03-14 | 2009-03-13 | Encoding device, decoding device, and methods thereof |
Applications Claiming Priority (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2008066202 | 2008-03-14 | ||
JP2008-066202 | 2008-03-14 | ||
JP2008-143963 | 2008-05-30 | ||
JP2008143963 | 2008-05-30 | ||
JP2008-298091 | 2008-11-21 | ||
JP2008298091 | 2008-11-21 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2009113316A1 true WO2009113316A1 (ja) | 2009-09-17 |
Family
ID=41064989
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2009/001129 WO2009113316A1 (ja) | Encoding device, decoding device, and methods thereof | 2008-03-14 | 2009-03-13 |
Country Status (9)
Country | Link |
---|---|
US (1) | US8452588B2 (ja) |
EP (2) | EP3288034B1 (ja) |
JP (1) | JP5449133B2 (ja) |
KR (1) | KR101570550B1 (ja) |
CN (1) | CN101971253B (ja) |
BR (1) | BRPI0908929A2 (ja) |
MX (1) | MX2010009307A (ja) |
RU (1) | RU2483367C2 (ja) |
WO (1) | WO2009113316A1 (ja) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120089389A1 (en) * | 2010-04-14 | 2012-04-12 | Bruno Bessette | Flexible and Scalable Combined Innovation Codebook for Use in CELP Coder and Decoder |
WO2012052802A1 (en) * | 2010-10-18 | 2012-04-26 | Nokia Corporation | An audio encoder/decoder apparatus |
JP2012530946A (ja) * | 2009-06-23 | 2012-12-06 | ヴォイスエイジ・コーポレーション | 重み付けされた信号領域またはオリジナルの信号領域で適用される順方向時間領域エイリアシング取り消し |
US9093066B2 (en) | 2010-01-13 | 2015-07-28 | Voiceage Corporation | Forward time-domain aliasing cancellation using linear-predictive filtering to cancel time reversed and zero input responses of adjacent frames |
Families Citing this family (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8660851B2 (en) | 2009-05-26 | 2014-02-25 | Panasonic Corporation | Stereo signal decoding device and stereo signal decoding method |
PL2491553T3 (pl) | 2009-10-20 | 2017-05-31 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Koder audio, dekoder audio, sposób kodowania informacji audio, sposób dekodowania informacji audio i program komputerowy wykorzystujący iteracyjne zmniejszania rozmiaru przedziału |
EP2500901B1 (en) | 2009-11-12 | 2018-09-19 | III Holdings 12, LLC | Audio encoder apparatus and audio encoding method |
EP2581904B1 (en) * | 2010-06-11 | 2015-10-07 | Panasonic Intellectual Property Corporation of America | Audio (de)coding apparatus and method |
WO2011161886A1 (ja) * | 2010-06-21 | 2011-12-29 | パナソニック株式会社 | 復号装置、符号化装置およびこれらの方法 |
EP3518234B1 (en) * | 2010-11-22 | 2023-11-29 | NTT DoCoMo, Inc. | Audio encoding device and method |
CN102610231B (zh) * | 2011-01-24 | 2013-10-09 | 华为技术有限公司 | 一种带宽扩展方法及装置 |
US9418671B2 (en) * | 2013-08-15 | 2016-08-16 | Huawei Technologies Co., Ltd. | Adaptive high-pass post-filter |
US8879858B1 (en) * | 2013-10-01 | 2014-11-04 | Gopro, Inc. | Multi-channel bit packing engine |
US9786291B2 (en) * | 2014-06-18 | 2017-10-10 | Google Technology Holdings LLC | Communicating information between devices using ultra high frequency audio |
US10306632B2 (en) * | 2014-09-30 | 2019-05-28 | Qualcomm Incorporated | Techniques for transmitting channel usage beacon signals over an unlicensed radio frequency spectrum band |
EP3182411A1 (en) * | 2015-12-14 | 2017-06-21 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Apparatus and method for processing an encoded audio signal |
US10475471B2 (en) * | 2016-10-11 | 2019-11-12 | Cirrus Logic, Inc. | Detection of acoustic impulse events in voice applications using a neural network |
US10242696B2 (en) | 2016-10-11 | 2019-03-26 | Cirrus Logic, Inc. | Detection of acoustic impulse events in voice applications |
US20180336469A1 (en) * | 2017-05-18 | 2018-11-22 | Qualcomm Incorporated | Sigma-delta position derivative networks |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2003140692A (ja) | 2001-11-02 | 2003-05-16 | Matsushita Electric Ind Co Ltd | Encoding device and decoding device |
JP2004004530A (ja) | 2002-01-30 | 2004-01-08 | Matsushita Electric Ind Co Ltd | Encoding device, decoding device, and method thereof |
WO2005111568A1 (ja) * | 2004-05-14 | 2005-11-24 | Matsushita Electric Industrial Co., Ltd. | Encoding device, decoding device, and methods thereof |
WO2006049204A1 (ja) * | 2004-11-05 | 2006-05-11 | Matsushita Electric Industrial Co., Ltd. | Encoding device, decoding device, encoding method, and decoding method |
WO2008084688A1 (ja) * | 2006-12-27 | 2008-07-17 | Panasonic Corporation | Encoding device, decoding device, and methods thereof |
Family Cites Families (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE69232202T2 (de) * | 1991-06-11 | 2002-07-25 | Qualcomm, Inc. | Vocoder mit veraendlicher bitrate |
SE501340C2 (sv) * | 1993-06-11 | 1995-01-23 | Ericsson Telefon Ab L M | Döljande av transmissionsfel i en talavkodare |
JP3747492B2 (ja) * | 1995-06-20 | 2006-02-22 | ソニー株式会社 | 音声信号の再生方法及び再生装置 |
SE0001926D0 (sv) * | 2000-05-23 | 2000-05-23 | Lars Liljeryd | Improved spectral translation/folding in the subband domain |
DE60208426T2 (de) | 2001-11-02 | 2006-08-24 | Matsushita Electric Industrial Co., Ltd., Kadoma | Vorrichtung zur signalkodierung, signaldekodierung und system zum verteilen von audiodaten |
DE60323331D1 (de) | 2002-01-30 | 2008-10-16 | Matsushita Electric Ind Co Ltd | Verfahren und vorrichtung zur audio-kodierung und -dekodierung |
US7844451B2 (en) * | 2003-09-16 | 2010-11-30 | Panasonic Corporation | Spectrum coding/decoding apparatus and method for reducing distortion of two band spectrums |
CN100507485C (zh) | 2003-10-23 | 2009-07-01 | 松下电器产业株式会社 | 频谱编码装置和频谱解码装置 |
JPWO2006025313A1 (ja) | 2004-08-31 | 2008-05-08 | 松下電器産業株式会社 | 音声符号化装置、音声復号化装置、通信装置及び音声符号化方法 |
JP4977472B2 (ja) * | 2004-11-05 | 2012-07-18 | パナソニック株式会社 | スケーラブル復号化装置 |
JP4899359B2 (ja) * | 2005-07-11 | 2012-03-21 | ソニー株式会社 | 信号符号化装置及び方法、信号復号装置及び方法、並びにプログラム及び記録媒体 |
EP2012305B1 (en) * | 2006-04-27 | 2011-03-09 | Panasonic Corporation | Audio encoding device, audio decoding device, and their method |
KR101379263B1 (ko) * | 2007-01-12 | 2014-03-28 | 삼성전자주식회사 | 대역폭 확장 복호화 방법 및 장치 |
CN101896967A (zh) * | 2007-11-06 | 2010-11-24 | 诺基亚公司 | 编码器 |
-
2009
- 2009-03-13 EP EP17195359.9A patent/EP3288034B1/en active Active
- 2009-03-13 EP EP09718708.2A patent/EP2251861B1/en active Active
- 2009-03-13 US US12/918,575 patent/US8452588B2/en active Active
- 2009-03-13 BR BRPI0908929A patent/BRPI0908929A2/pt not_active Application Discontinuation
- 2009-03-13 CN CN2009801084302A patent/CN101971253B/zh active Active
- 2009-03-13 JP JP2010502731A patent/JP5449133B2/ja active Active
- 2009-03-13 MX MX2010009307A patent/MX2010009307A/es active IP Right Grant
- 2009-03-13 RU RU2010137838/08A patent/RU2483367C2/ru active
- 2009-03-13 KR KR1020107019870A patent/KR101570550B1/ko active IP Right Grant
- 2009-03-13 WO PCT/JP2009/001129 patent/WO2009113316A1/ja active Application Filing
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2012530946A (ja) * | 2009-06-23 | 2012-12-06 | ヴォイスエイジ・コーポレーション | 重み付けされた信号領域またはオリジナルの信号領域で適用される順方向時間領域エイリアシング取り消し |
US9093066B2 (en) | 2010-01-13 | 2015-07-28 | Voiceage Corporation | Forward time-domain aliasing cancellation using linear-predictive filtering to cancel time reversed and zero input responses of adjacent frames |
US20120089389A1 (en) * | 2010-04-14 | 2012-04-12 | Bruno Bessette | Flexible and Scalable Combined Innovation Codebook for Use in CELP Coder and Decoder |
CN102844810A (zh) * | 2010-04-14 | 2012-12-26 | 沃伊斯亚吉公司 | 用于在码激励线性预测编码器和解码器中使用的灵活和可缩放的组合式创新代码本 |
US9053705B2 (en) * | 2010-04-14 | 2015-06-09 | Voiceage Corporation | Flexible and scalable combined innovation codebook for use in CELP coder and decoder |
AU2011241424B2 (en) * | 2010-04-14 | 2016-05-05 | Voiceage Evs Llc | Flexible and scalable combined innovation codebook for use in CELP coder and decoder |
WO2012052802A1 (en) * | 2010-10-18 | 2012-04-26 | Nokia Corporation | An audio encoder/decoder apparatus |
US9230551B2 (en) | 2010-10-18 | 2016-01-05 | Nokia Technologies Oy | Audio encoder or decoder apparatus |
Also Published As
Publication number | Publication date |
---|---|
EP2251861A1 (en) | 2010-11-17 |
EP3288034B1 (en) | 2019-02-20 |
EP3288034A1 (en) | 2018-02-28 |
MX2010009307A (es) | 2010-09-24 |
JP5449133B2 (ja) | 2014-03-19 |
US8452588B2 (en) | 2013-05-28 |
KR20100134580A (ko) | 2010-12-23 |
JPWO2009113316A1 (ja) | 2011-07-21 |
US20100332221A1 (en) | 2010-12-30 |
CN101971253B (zh) | 2012-07-18 |
CN101971253A (zh) | 2011-02-09 |
EP2251861B1 (en) | 2017-11-22 |
RU2483367C2 (ru) | 2013-05-27 |
RU2010137838A (ru) | 2012-03-20 |
KR101570550B1 (ko) | 2015-11-19 |
BRPI0908929A2 (pt) | 2016-09-13 |
EP2251861A4 (en) | 2014-01-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
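JP5449133B2 (ja) | Encoding device, decoding device, and methods thereof | |
JP5448850B2 (ja) | Encoding device, decoding device, and methods thereof | |
JP5404418B2 (ja) | Encoding device, decoding device, and encoding method | |
JP5511785B2 (ja) | Encoding device, decoding device, and methods thereof | |
JP5339919B2 (ja) | Encoding device, decoding device, and methods thereof | |
WO2009084221A1 (ja) | Encoding device, decoding device, and methods thereof | |
JP5419876B2 (ja) | Spectrum smoothing device, encoding device, decoding device, communication terminal device, base station device, and spectrum smoothing method | |
JP5730303B2 (ja) | Decoding device, encoding device, and methods thereof | |
JP5565914B2 (ja) | Encoding device, decoding device, and methods thereof | |
WO2013057895A1 (ja) | Encoding device and encoding method | |
JP5774490B2 (ja) | Encoding device, decoding device, and methods thereof |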
Legal Events
Date | Code | Title | Description |
---|---|---|---|
WWE | Wipo information: entry into national phase |
Ref document number: 200980108430.2 Country of ref document: CN |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 09718708 Country of ref document: EP Kind code of ref document: A1 |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2010502731 Country of ref document: JP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 12918575 Country of ref document: US |
|
WWE | Wipo information: entry into national phase |
Ref document number: MX/A/2010/009307 Country of ref document: MX |
|
WWE | Wipo information: entry into national phase |
Ref document number: 1794/MUMNP/2010 Country of ref document: IN |
|
ENP | Entry into the national phase |
Ref document number: 20107019870 Country of ref document: KR Kind code of ref document: A |
|
REEP | Request for entry into the european phase |
Ref document number: 2009718708 Country of ref document: EP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2009718708 Country of ref document: EP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2010137838 Country of ref document: RU |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
REG | Reference to national code |
Ref country code: BR Ref legal event code: B01E Ref document number: PI0908929 Country of ref document: BR Free format text: APRESENTAR, EM ATE 60 (SESSENTA) DIAS, PROCURACAO REGULAR, UMA VEZ QUE A PROCURACAO APRESENTADA NA PETICAO NO 020100085367 DE 13/09/2010 NAO POSSUI DATA DE ASSINATURA DA MESMA. |
|
ENP | Entry into the national phase |
Ref document number: PI0908929 Country of ref document: BR Kind code of ref document: A2 Effective date: 20100913 |