WO2013002623A2 - Apparatus and method for generating bandwidth extension signal - Google Patents
Apparatus and method for generating bandwidth extension signal
- Publication number
- WO2013002623A2 (application PCT/KR2012/005258)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- frequency band
- signal
- encoding
- high frequency
- mode
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/038—Speech enhancement, e.g. noise reduction or echo cancellation using band spreading techniques
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/032—Quantisation or dequantisation of spectral components
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/08—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
- G10L19/12—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders
Definitions
- The present invention relates to audio encoding and decoding, and more particularly, to an apparatus and method for generating a bandwidth extension signal capable of reducing metallic noise present in a bandwidth extension signal for a high band, and to an audio encoding apparatus and method, an audio decoding apparatus and method, and a terminal employing the same.
- The signal corresponding to the high frequency region is less sensitive to the fine structure of the frequency than the signal corresponding to the low frequency region. Therefore, when encoding efficiency needs to be improved to overcome the limitation of available bits in encoding an audio signal, a large number of bits is allocated to the signal corresponding to the low frequency region, while a small number of bits is allocated to the signal corresponding to the high frequency region.
- SBR (Spectral Band Replication)
- SBR encodes the lower band of the spectrum, such as the low band or the core band, and uses the correlation between the lower band and the upper band, such as the high band, to extract features of the lower band and predict the upper band.
- An object of the present invention is to provide an apparatus and method for generating a bandwidth extension signal capable of reducing metallic noise present in a bandwidth extension signal for a high band, an audio encoding apparatus and method using the same, an audio decoding apparatus and method using the same, and a terminal.
- A method for generating a bandwidth extension signal comprises: performing anti-sparseness processing on a spectrum of a low frequency band; and performing extension encoding of a high frequency band in the frequency domain by using the spectrum of the low frequency band on which the anti-sparseness processing has been performed.
- An apparatus for generating a bandwidth extension signal comprises: an anti-sparseness processor configured to perform anti-sparseness processing on a spectrum of a low frequency band; and an FD high frequency extension decoder configured to perform extension decoding of a high frequency band in the frequency domain by using the spectrum of the low frequency band on which the anti-sparseness processing has been performed.
- FIG. 1 is a block diagram showing the configuration of an audio encoding apparatus according to an embodiment of the present invention.
- FIG. 2 is a block diagram illustrating a configuration of an embodiment of the FD encoder illustrated in FIG. 1.
- FIG. 3 is a block diagram illustrating a configuration of another embodiment of the FD encoder illustrated in FIG. 1.
- FIG. 4 is a block diagram showing the configuration of an anti-sparseness processor according to an embodiment of the present invention.
- FIG. 5 is a block diagram illustrating a configuration of an FD high frequency extension encoder according to an embodiment of the present invention.
- FIGS. 6A and 6B illustrate regions in which extension encoding is performed in the FD encoding module illustrated in FIG. 1.
- FIG. 7 is a block diagram showing the configuration of an audio encoding apparatus according to another embodiment of the present invention.
- FIG. 8 is a block diagram showing a configuration of an audio encoding apparatus according to another embodiment of the present invention.
- FIG. 9 is a block diagram showing the configuration of an audio decoding apparatus according to an embodiment of the present invention.
- FIG. 10 is a block diagram illustrating a configuration of an embodiment of the FD decoder illustrated in FIG. 9.
- FIG. 11 is a block diagram illustrating a configuration of an embodiment of an FD high frequency extension decoder illustrated in FIG. 10.
- FIG. 12 is a block diagram showing the configuration of an audio decoding apparatus according to another embodiment of the present invention.
- FIG. 13 is a block diagram showing the configuration of an audio decoding apparatus according to another embodiment of the present invention.
- FIG. 14 is a diagram illustrating a codebook sharing method according to an embodiment of the present invention.
- FIG. 15 is a diagram illustrating a coding mode signaling method according to an embodiment of the present invention.
- Terms such as first and second may be used to describe various components, but the components are not limited by the terms. The terms are only used to distinguish one component from another.
- FIG. 1 is a block diagram showing the configuration of an audio encoding apparatus according to an embodiment of the present invention.
- The audio encoding apparatus shown in FIG. 1 may constitute a multimedia device, which may be a voice communication terminal such as a telephone or a mobile phone, a broadcast or music dedicated terminal such as a TV or an MP3 player, or a fusion terminal combining a voice communication terminal with a broadcast or music dedicated terminal, but is not limited thereto.
- the audio encoding apparatus can be used as a client, a server or a transducer disposed between the client and the server.
- The audio encoding apparatus 100 illustrated in FIG. 1 may include an encoding mode determiner 110, a switching unit 130, a CELP (Code Excited Linear Prediction) encoding module 150, and an FD (Frequency Domain) encoding module 170.
- The CELP encoding module 150 may include a CELP encoder 151 and a TD (Time Domain) extension encoder 153.
- The FD encoding module 170 may include a transform unit 171 and an FD encoder 173. The components may be integrated into at least one module and implemented by at least one processor (not shown).
- the encoding mode determiner 110 may determine an encoding mode of an input signal by referring to characteristics of a signal.
- The encoding mode determiner 110 may determine whether the current frame is in a voice mode or a music mode according to the characteristics of the signal, and may determine whether the efficient encoding mode for the current frame is a time domain mode or a frequency domain mode.
- The characteristics of the signal may be identified using the short-term characteristics of a frame or the long-term characteristics of a plurality of frames, but are not limited thereto.
- the encoding mode determiner 110 may determine the CELP mode when the characteristic of the signal corresponds to the voice mode or the time domain mode, and the FD mode when the characteristic of the signal corresponds to the music mode or the frequency domain mode.
- the input signal of the encoding mode determiner 110 may be a signal down-sampled by a down sampling unit (not shown).
- the input signal may be a signal having a sampling rate of 12.8 kHz or 16 kHz obtained by resampling or down sampling a signal having a sampling rate of 32 kHz or 48 kHz.
- A signal having a sampling rate of 32 kHz may be referred to as a super wide band (SWB) signal, and a signal having a sampling rate of 48 kHz may be referred to as a full-band (FB) signal.
- A signal having a sampling rate of 16 kHz may be referred to as a wide-band (WB) signal.
- the resampling or downsampling operation may be performed in the encoding mode determiner 110.
- the encoding mode determiner 110 may determine the encoding mode with respect to the resampled or downsampled signal.
- the encoding mode determined by the encoding mode determiner 110 may be provided to the switching unit 130, and may be stored or transmitted in a bitstream in units of frames.
- the switching unit 130 may provide an input signal to one of the CELP encoding module 150 and the FD encoding module 170 according to the encoding mode provided from the encoding mode determiner 110.
- the input signal is a resampled or downsampled signal and may be a low frequency band signal having a sampling rate of 12.8 kHz or 16 kHz.
- the switching unit 130 provides an input signal to the CELP encoding module 150 when the encoding mode is the CELP mode, and provides an input signal to the FD encoding module 170 when the encoding mode is the FD mode.
- The CELP encoding module 150 operates when the encoding mode is the CELP mode, and the CELP encoder 151 may perform CELP encoding on the input signal.
- The CELP encoder 151 extracts an excitation signal from the resampled or downsampled signal, and quantizes the extracted excitation signal by considering each of the filtered adaptive codevector (i.e., the adaptive codebook contribution) corresponding to the pitch information and the filtered fixed codevector (i.e., the fixed or innovation codebook contribution).
- The CELP encoder 151 may also extract linear prediction coefficients (LPC), quantize the extracted linear prediction coefficients, extract an excitation signal using the quantized linear prediction coefficients, and quantize the extracted excitation signal by considering each of the filtered adaptive codevector (i.e., the adaptive codebook contribution) corresponding to the pitch information and the filtered fixed codevector (i.e., the fixed or innovation codebook contribution).
- the CELP encoder 151 may apply different encoding modes according to the characteristics of the signal.
- The coding modes to be applied include a voiced coding mode, an unvoiced coding mode, a transient coding mode, and a generic coding mode, but are not limited thereto.
- The excitation signal of the low frequency band, i.e., the CELP information, obtained as a result of the encoding by the CELP encoder 151 is provided to the TD extension encoder 153, and may be stored or transmitted in a bitstream.
- the TD extension encoder 153 may perform extended encoding of the high frequency band by folding or copying an excitation signal of the low frequency band provided by the CELP encoder 151.
- the extended information of the high frequency band obtained as a result of extended encoding by the TD extension encoder 153 may be included in a bitstream and stored or transmitted.
- The TD extension encoder 153 quantizes the linear prediction coefficients corresponding to the high frequency band of the input signal. In this case, the TD extension encoder 153 may extract the linear prediction coefficients of the high frequency signal of the input signal and quantize the extracted linear prediction coefficients.
- the TD extension encoder 153 may generate a linear predictive coefficient of the high frequency band of the input signal using the excitation signal of the low frequency band of the input signal.
- the linear predictive coefficient of the high frequency band may be used to represent envelope information of the high frequency band.
- the FD encoding module 170 operates when the encoding mode is the FD mode, and the transform unit 171 may convert the resampled or downsampled signal from the time domain to the frequency domain.
- For the transform, the Modified Discrete Cosine Transform (MDCT) may be used, but the transform is not limited thereto.
- The FD encoder 173 may perform FD encoding on the resampled or downsampled spectrum provided from the transform unit 171.
- An example of FD encoding is the algorithm applied in AAC (Advanced Audio Coding), but FD encoding is not limited thereto.
- FD information obtained as a result of FD encoding in the FD encoder 173 may be included in a bitstream and stored or transmitted.
- prediction data may be further included in a bitstream obtained as a result of FD encoding by the FD encoder 173.
- When the encoding mode is switched so that the (N+1)-th frame is encoded in the FD mode, the (N+1)-th frame cannot be decoded using only the encoding result according to the FD mode, so prediction data must be further included for reference in decoding.
- bitstreams may be generated according to the encoding mode determined by the encoding mode determiner 110.
- the bitstream may include a header and a payload.
- When the encoding mode is the CELP mode, the bitstream may include information about the encoding mode in the header, and may include the CELP information and the TD extension information in the payload. When the encoding mode is the FD mode, the bitstream may include information about the encoding mode in the header, and may include the FD information and the prediction data in the payload.
- the FD information may further include FD high frequency extension information.
- To prepare for a frame error, the header of each bitstream may further include information about the encoding mode of the previous frame.
- the audio encoding apparatus 100 shown in FIG. 1 may be switched to operate in either the CELP mode or the FD mode according to the characteristics of the signal, thereby performing efficient encoding adaptively to the characteristics of the signal. Meanwhile, the switching structure of FIG. 1 may be preferably applied to a high bit rate environment.
- FIG. 2 is a block diagram illustrating a configuration of an embodiment of the FD encoder illustrated in FIG. 1.
- The FD encoder 200 may include a Norm encoder 210, an FPC (Factorial Pulse Coding) encoder 230, an FD low frequency extension encoder 240, a noise side information generator 250, an anti-sparseness processor 270, and an FD high frequency extension encoder 290.
- The Norm encoder 210 estimates or calculates a Norm value for each frequency band, for example, each subband, of the frequency spectrum provided from the transform unit 171 of FIG. 1, and quantizes the estimated or calculated Norm value.
- the Norm value means an average spectral energy obtained in units of subbands, and may be replaced by power.
- Norm values can be used to normalize the frequency spectrum in subband units.
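The Norm computation and the per-subband normalization described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the `band_edges` layout, and the choice to normalize by the square root of the average energy are assumptions made for the example.

```python
import math

def subband_norms(spectrum, band_edges):
    """Average spectral energy of each subband (the 'Norm value' in this sketch)."""
    norms = []
    for start, end in zip(band_edges[:-1], band_edges[1:]):
        band = spectrum[start:end]
        norms.append(sum(c * c for c in band) / len(band))
    return norms

def normalize_spectrum(spectrum, band_edges, norms, eps=1e-12):
    """Scale every subband so that its average energy becomes (about) one."""
    out = list(spectrum)
    bands = zip(band_edges[:-1], band_edges[1:])
    for (start, end), norm in zip(bands, norms):
        scale = 1.0 / math.sqrt(max(norm, eps))  # assumed normalization rule
        for k in range(start, end):
            out[k] = spectrum[k] * scale
    return out

spectrum = [3.0, -3.0, 3.0, -3.0, 1.0, 1.0, 0.0, 0.0]
edges = [0, 4, 8]  # two hypothetical subbands of four coefficients each
norms = subband_norms(spectrum, edges)
normalized = normalize_spectrum(spectrum, edges, norms)
```

After normalization each subband carries roughly unit average energy, so the later quantizer only has to represent the spectral shape while the Norm values carry the level.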
- A masking threshold may be calculated using the Norm value of each subband, taking into account the total number of bits according to the target bit rate, and the number of bits allocated for perceptual encoding of each subband may be determined in integer or fractional units using the masking threshold.
- Norm values quantized by the Norm encoder 210 may be provided to the FPC encoder 230 and may be stored or transmitted in a bitstream.
- the FPC encoder 230 may perform quantization using the allocated bits of each subband with respect to the normalized spectrum, and may perform FPC encoding on the quantized result. According to the FPC encoding, information such as the position of the pulse, the magnitude of the pulse, and the sign of the pulse within the allocated number of bits may be expressed in a factorial form. The FPC information obtained by the FPC encoder 230 may be included in the bitstream and stored or transmitted.
- The noise side information generator 250 may generate noise side information, that is, a noise level in units of subbands, according to the FPC encoding result.
- the frequency spectrum encoded by the FPC encoder 230 may have a portion that is not encoded in subband units, that is, a hole, due to the lack of the number of bits.
- the noise level may be generated using an average of levels of uncoded spectral coefficients.
- the noise level generated by the noise additional information generator 250 may be included in the bitstream and stored or transmitted. In addition, the noise level may be generated in units of frames.
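A minimal sketch of the noise level computation described above, assuming that "uncoded" means a coefficient whose quantized value is zero and that the level is the mean absolute value of the original coefficients at those positions. The function name and the `band_edges` layout are hypothetical.

```python
def noise_levels(original, quantized, band_edges):
    """Per-subband noise level: mean magnitude of coefficients left uncoded (holes)."""
    levels = []
    for start, end in zip(band_edges[:-1], band_edges[1:]):
        holes = [abs(original[k]) for k in range(start, end) if quantized[k] == 0]
        # A subband without holes needs no noise filling, so its level is zero.
        levels.append(sum(holes) / len(holes) if holes else 0.0)
    return levels

original = [4.0, -2.0, 0.5, -1.5, 3.0, 1.0]
quantized = [4, -2, 0, 0, 3, 1]   # hypothetical FPC result with two holes
levels = noise_levels(original, quantized, [0, 4, 6])
```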
- The anti-sparseness processor 270 determines the noise addition position and the noise magnitude from the reconstructed spectrum of the low frequency band, performs anti-sparseness processing, according to the determined noise addition position and noise magnitude, on the frequency spectrum on which noise filling has been performed using the noise level, and provides the result to the FD high frequency extension encoder 290.
- The reconstructed spectrum of the low frequency band may refer to the result of extending the low frequency band with respect to the FPC decoding result, performing noise filling, and then performing anti-sparseness processing.
- The FD high frequency extension encoder 290 may perform extension encoding of the high frequency band by using the spectrum of the low frequency band provided from the anti-sparseness processor 270. In this case, the spectrum of the original high frequency band may also be provided to the FD high frequency extension encoder 290. According to an embodiment, the FD high frequency extension encoder 290 may obtain an extended high frequency band spectrum by folding or replicating the low frequency band spectrum, extract energy in subband units from the original high frequency band spectrum, adjust the extracted energy, and quantize the adjusted energy.
- The energy may be adjusted according to the ratio between a first tonality and a second tonality, where the first tonality is calculated in subband units for the spectrum of the original high frequency band, and the second tonality is calculated in subband units for the excitation signal of the high frequency band extended using the spectrum of the low frequency band.
- Alternatively, the energy may be adjusted according to the ratio between a first noise factor and a second noise factor, each representing the degree to which a noise component is included in a signal. The first noise factor is obtained from the first tonality calculated in subband units for the spectrum of the original high frequency band, and the second noise factor is obtained from the second tonality calculated in subband units for the excitation signal of the high frequency band extended using the spectrum of the low frequency band.
- Accordingly, when the second tonality is larger than the first tonality, or when the first noise factor is larger than the second noise factor, the energy of the corresponding subband is reduced to prevent noise from increasing. In the opposite case, the energy of the corresponding subband may be increased.
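The direction of the energy adjustment can be illustrated with a small sketch. The text only states that the adjustment corresponds to the ratio between the two tonalities; using that ratio directly as a gain and clamping it to a fixed range are assumptions of this example, as are the function name and the limit values.

```python
def control_energy(energy, first_tonality, second_tonality,
                   min_gain=0.5, max_gain=2.0, eps=1e-9):
    """Scale a subband energy by the ratio first_tonality / second_tonality.

    first_tonality:  tonality of the original high band subband
    second_tonality: tonality of the excitation extended from the low band
    The clamp range is a hypothetical safeguard, not taken from the source.
    """
    gain = first_tonality / max(second_tonality, eps)
    gain = min(max(gain, min_gain), max_gain)
    return energy * gain

reduced = control_energy(10.0, first_tonality=0.4, second_tonality=0.8)
increased = control_energy(10.0, first_tonality=0.8, second_tonality=0.4)
unchanged = control_energy(10.0, first_tonality=0.6, second_tonality=0.6)
```

When the excitation is more tonal than the original (second tonality larger), the energy is lowered; in the opposite case it is raised, matching the behaviour described in the text.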
- Alternatively, the method of generating an excitation signal in a predetermined subband may be simulated, and the energy may be adjusted when the characteristics of the excitation signal obtained from the simulation differ from the characteristics of the original signal of the subband. In this case, the characteristics of the excitation signal and of the original signal may be at least one of the tonality and the noise factor, but are not limited thereto. Accordingly, noise is prevented from increasing when the decoder decodes the actual energy.
- a multi stage vector quantization (MSVQ) scheme may be applied to quantization of energy, but is not limited thereto.
- The FD high frequency extension encoder 290 may perform vector quantization in the current stage by collecting the energies of the odd-numbered subbands among a predetermined number of subbands, obtain prediction errors of the even-numbered subbands using the vector quantization result of the odd-numbered subbands, and perform vector quantization of the obtained prediction errors in the next stage.
- The reverse is also possible. That is, the FD high frequency extension encoder 290 obtains the prediction error of the (n+1)-th subband using the vector quantization result of the n-th subband and the vector quantization result of the (n+2)-th subband.
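The two-stage structure can be sketched as follows. The source states only that the n-th and (n+2)-th results are used to obtain the prediction error of the (n+1)-th subband; predicting with the average of the two neighbours, and using uniform scalar rounding as a stand-in for a real vector quantizer, are assumptions of this example.

```python
def quantize(values, step=1.0):
    """Stand-in for vector quantization: uniform rounding to a grid of size `step`."""
    return [round(v / step) * step for v in values]

def stage_one_and_prediction_errors(energies, step=1.0):
    """Quantize every other subband, then form prediction errors for the rest."""
    anchors = energies[0::2]             # subbands n, n+2, ... (1st, 3rd, 5th, ...)
    q_anchors = quantize(anchors, step)  # first-stage result
    errors = []
    for i, actual in enumerate(energies[1::2]):
        left = q_anchors[i]
        # The last in-between subband may have no right neighbour; reuse the left one.
        right = q_anchors[i + 1] if i + 1 < len(q_anchors) else left
        predicted = 0.5 * (left + right)  # assumed interpolation rule
        errors.append(actual - predicted)
    return q_anchors, errors

q_anchors, errors = stage_one_and_prediction_errors([10.2, 11.0, 12.1, 12.5, 13.9])
```

The prediction errors are typically much smaller than the energies themselves, which is why they can be quantized in the next stage with fewer bits.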
- A weight reflecting importance may be calculated for each energy vector, or for each signal obtained by subtracting the mean value from the energy vector.
- the weight for the importance may be calculated in a direction maximizing the sound quality of the synthesized sound.
- an optimized quantization index for an energy vector can be obtained by using weighted mean square error (WMSE).
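A minimal sketch of a codebook search under the weighted mean square error (WMSE) criterion. The codebook contents and the weights are made-up values for illustration; how the weights are derived from perceptual importance is not specified here.

```python
def wmse(vector, codeword, weights):
    """Weighted mean square error between an energy vector and one codeword."""
    total = sum(w * (v - c) ** 2 for v, c, w in zip(vector, codeword, weights))
    return total / len(vector)

def best_index(vector, codebook, weights):
    """Index of the codeword with the smallest WMSE."""
    return min(range(len(codebook)),
               key=lambda i: wmse(vector, codebook[i], weights))

codebook = [[1.0, 4.0], [2.0, 5.0]]   # hypothetical two-entry codebook
vector = [1.0, 5.0]
index_a = best_index(vector, codebook, weights=[10.0, 1.0])
index_b = best_index(vector, codebook, weights=[1.0, 10.0])
```

The same input vector selects a different codeword depending on which component is weighted as more important, which is the point of replacing plain MSE with WMSE.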
- the FD high frequency extension encoder 290 may apply a multi-mode bandwidth extension method using various excitation signal generation methods according to characteristics of the high frequency signal.
- The multi-mode bandwidth extension method may operate in a transient mode, a normal mode, a harmonic mode, a noise mode, and the like according to characteristics of the high frequency signal. Since the FD high frequency extension encoder 290 is applied to stationary frames, it may generate an excitation signal for each frame using one of the normal mode, the harmonic mode, and the noise mode according to the characteristics of the high frequency signal.
- The FD high frequency extension encoder 290 may generate signals for different high frequency bands according to bit rates. That is, the high frequency band in which extension encoding is performed by the FD high frequency extension encoder 290 may be set differently according to the bit rate. For example, the FD high frequency extension encoder 290 may perform extension encoding on a frequency band of about 6.4 to 14.4 kHz at a bit rate of 16 kbps, and on a frequency band of about 8 to 16 kHz at a bit rate of 16 kbps or more.
- the FD high frequency extension encoder 290 may perform energy quantization by sharing the same codebook for different bit rates.
- When a stationary frame is input to the FD encoder 200, the Norm encoder 210, the FPC (Factorial Pulse Coding) encoder 230, the noise side information generator 250, the anti-sparseness processor 270, and the FD high frequency extension encoder 290 may operate.
- The anti-sparseness processor 270 preferably operates in the normal mode of a stationary frame.
- When a frame other than a stationary frame, such as a transient frame, is input, the noise side information generator 250, the anti-sparseness processor 270, and the FD high frequency extension encoder 290 do not operate.
- In this case, the FPC encoder 230 may set the upper frequency band (Fcore) allocated for performing FPC to a higher band, for example, Fend, than when a stationary frame is input.
- FIG. 3 is a block diagram illustrating a configuration of another embodiment of the FD encoder illustrated in FIG. 1.
- The FD encoder 300 may include a Norm encoder 310, an FPC encoder 330, an FD low frequency extension encoder 340, an anti-sparseness processor 370, and an FD high frequency extension encoder 390.
- The operations of the Norm encoder 310, the FPC encoder 330, and the FD high frequency extension encoder 390 are the same as those of the Norm encoder 210, the FPC encoder 230, and the FD high frequency extension encoder 290 of FIG. 2, so a detailed description thereof is omitted.
- The anti-sparseness processor 370 does not use a separate noise level, but uses the Norm value obtained by the Norm encoder 310 in subband units. That is, the anti-sparseness processor 370 determines the noise addition position and the noise magnitude from the reconstructed spectrum of the low frequency band, performs anti-sparseness processing, according to the determined noise addition position and noise magnitude, on the frequency spectrum on which noise filling has been performed using the Norm value, and provides the result to the FD high frequency extension encoder 390. Specifically, for a subband including a portion dequantized to zero, a noise component may be generated, and the energy of the noise component may be adjusted by using the ratio between the energy of the noise component and the dequantized Norm value, that is, the spectral energy. According to another embodiment, for a subband including a portion dequantized to zero, a noise component may be generated and adjusted so that the average energy of the noise component is one.
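The second variant, in which the generated noise component is adjusted so that its average energy is one, can be sketched as follows. The uniform random generator, the seed handling, and the function name are assumptions of this example.

```python
import math
import random

def unit_energy_noise(count, seed=0):
    """Generate `count` random noise values rescaled to an average energy of one."""
    rng = random.Random(seed)
    noise = [rng.uniform(-1.0, 1.0) for _ in range(count)]
    energy = sum(n * n for n in noise) / count
    # Guard against the (very unlikely) all-zero draw.
    scale = 1.0 / math.sqrt(energy) if energy > 0.0 else 0.0
    return [n * scale for n in noise]

noise = unit_energy_noise(8, seed=1)
average_energy = sum(n * n for n in noise) / len(noise)
```

With unit average energy, the noise can later be scaled by the square root of the dequantized spectral energy of the subband, so the Norm value alone controls the level of the filled-in components.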
- FIG. 4 is a block diagram showing the configuration of an anti-sparseness processor according to an embodiment of the present invention.
- The anti-sparseness processor 400 may include a reconstructed spectrum generator 410, a noise position determiner 430, a noise magnitude determiner 440, and a noise adder 450.
- The reconstructed spectrum generator 410 generates a reconstructed spectrum of the low frequency band by using the FPC information provided from the FPC encoder (230 of FIG. 2 or 330 of FIG. 3) and noise filling information such as the noise level or the Norm value.
- FD low frequency extended encoding may be additionally performed to generate a restored spectrum of the low frequency band.
- The noise position determiner 430 may determine, as a noise position, a spectral coefficient reconstructed as zero in the reconstructed spectrum of the low frequency band.
- the noise location may be determined in consideration of the magnitude of the surrounding spectrum among the spectra restored to zero. For example, when the magnitude of the surrounding spectrum adjacent to the spectrum restored to zero is greater than or equal to a predetermined value, the spectrum restored to the corresponding zero may be determined as the noise position.
- the predetermined value may be set to an optimal value in advance so that information loss of the surrounding spectrum adjacent to the spectrum restored to zero through simulation or experimentally may be minimized.
- The noise magnitude determiner 440 may determine the magnitude of the noise to be added at the determined noise position.
- the magnitude of the noise may be determined based on the noise level.
- the amount of noise may be determined by varying the noise level by a predetermined ratio. In more detail, it may be determined in the same manner as (0.5 * noise level), but is not limited thereto.
- the size of the noise may be determined by adaptively varying the size of the surrounding spectrum of the determined noise location. If the ambient spectrum is smaller than the amount of noise to be added, the magnitude of the noise may be changed to be smaller than the ambient spectrum.
- The noise adder 450 may add random noise based on the determined noise position and noise magnitude.
- a random sign may be applied.
- The magnitude of the noise may be a fixed value, and the sign may be varied depending on whether a random signal generated from a random seed is odd or even. For example, when the random signal is even, the + sign may be applied, and when the random signal is odd, the − sign may be applied.
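The position, magnitude, and sign rules described for the anti-sparseness processor can be combined into one sketch. The 0.5 ratio and the even/odd sign rule come from the text; the use of the two immediate neighbours, the threshold parameter, the halving rule for small neighbours, and all names are assumptions of this example.

```python
import random

def add_anti_sparseness_noise(spectrum, noise_level, neighbor_threshold,
                              ratio=0.5, seed=0):
    """Add small random-sign noise at zero coefficients next to significant ones."""
    rng = random.Random(seed)
    out = list(spectrum)
    for k, coeff in enumerate(spectrum):
        if coeff != 0.0:
            continue  # only coefficients reconstructed as zero are candidates
        neighbors = [abs(spectrum[j]) for j in (k - 1, k + 1)
                     if 0 <= j < len(spectrum)]
        if not neighbors or max(neighbors) < neighbor_threshold:
            continue  # surrounding spectrum too small: not a noise position
        magnitude = ratio * noise_level
        smallest = min(n for n in neighbors if n > 0.0)
        if smallest < magnitude:
            magnitude = 0.5 * smallest  # assumed rule: stay below the neighbour
        sign = 1.0 if rng.getrandbits(16) % 2 == 0 else -1.0  # even: +, odd: -
        out[k] = sign * magnitude
    return out

filled = add_anti_sparseness_noise([4.0, 0.0, 3.0, 0.0, 0.0, 0.0],
                                   noise_level=2.0, neighbor_threshold=1.0)
small = add_anti_sparseness_noise([0.5, 0.0], noise_level=2.0,
                                  neighbor_threshold=0.1)
```

Zeros that sit far from any significant coefficient are left untouched, so the processing only reduces sparseness where spectral content already exists nearby.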
- The spectrum of the low frequency band to which the noise has been added by the noise adder 450 is provided to the FD high frequency extension encoder (290 of FIG. 2).
- The spectrum of the low frequency band provided to the FD high frequency extension encoder (290 of FIG. 2) may represent a core decoded signal obtained by performing noise filling and low frequency band extension coding on the spectrum of the low frequency band obtained by FPC decoding, and then performing anti-sparseness processing.
- FIG. 5 is a block diagram illustrating a configuration of an FD high frequency extension encoder according to an embodiment of the present invention.
- the FD high frequency extension encoder 500 includes a spectrum copy unit 510, a first tonality calculator 520, a second tonality calculator 530, an excitation signal generation method determiner 540, an energy controller 550, and an energy quantizer 560. Meanwhile, the encoder may further include a high frequency spectrum generation module 570 when the encoding apparatus requires a reconstructed spectrum of the high frequency band.
- the high frequency recovery spectrum generation module 570 may include a high frequency excitation signal generator 571 and a high frequency spectrum generator 573.
- when the FD encoder (173 of FIG. 1) uses a transform such as the MDCT, which is reconstructed through overlap-add with the previous frame, and switching between the CELP mode and the FD mode can occur between frames, the high frequency recovery spectrum generation module 570 needs to be added.
- the spectrum copy unit 510 may extend the low frequency band spectrum provided from the anti-sparseness processor (270 of FIG. 2 or 370 of FIG. 3) to a high frequency band by folding or replicating it.
- a low frequency band spectrum of 0 to 8 kHz can be used to extend to a high frequency band of 8 to 16 kHz.
- the original low frequency spectrum may be folded or duplicated to extend the high frequency band.
- the first tonality calculator 520 calculates the first tonality for the spectrum of the original high frequency band in predetermined subband units.
- the second tonality calculator 530 calculates the second tonality, in subband units, for the high frequency band spectrum that the spectrum copy unit 510 extended from the low frequency band spectrum.
- the first and second tonalities may be calculated as a spectral flatness measure based on the ratio between the average magnitude and the maximum magnitude of the subband spectrum. More specifically, spectral flatness can be measured through the relationship between the geometric mean and the arithmetic mean of the frequency spectrum. That is, the first and second tonalities are measures indicating whether the spectrum has a peaky characteristic or a flat characteristic.
- the first tonality calculator 520 and the second tonality calculator 530 preferably operate in the same manner and in the same subband units.
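- as a rough illustration of a flatness-based tonality measure, the following Python sketch computes the ratio of the geometric mean to the arithmetic mean for each subband. Defining tonality as (1 - flatness), the fixed subband size, and the function names are assumptions; the description only states that tonality is derived from spectral flatness.

```python
import math


def spectral_flatness(subband, eps=1e-12):
    """Geometric mean / arithmetic mean of the magnitudes (1.0 = perfectly flat)."""
    mags = [abs(x) + eps for x in subband]  # eps avoids log(0)
    geometric = math.exp(sum(math.log(m) for m in mags) / len(mags))
    arithmetic = sum(mags) / len(mags)
    return geometric / arithmetic


def tonality_per_subband(spectrum, band_size):
    """Assumed definition: tonality = 1 - flatness, so peaky bands approach 1."""
    return [1.0 - spectral_flatness(spectrum[start:start + band_size])
            for start in range(0, len(spectrum), band_size)]


# First subband is flat, second subband has a single peak.
tonalities = tonality_per_subband([1.0, 1.0, 1.0, 1.0, 8.0, 0.0, 0.0, 0.0], 4)
```

- applying the same function with the same subband size to the original and the extended high frequency spectra mirrors the requirement that both calculators operate in the same manner.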
- the excitation signal generation method determiner 540 may determine the high frequency excitation signal generation method by comparing the first tonality and the second tonality.
- the method of generating the high frequency excitation signal may be determined by an adaptive weight between the high frequency band spectrum generated by modifying the low frequency band spectrum and random noise.
- the value corresponding to the adaptive weight is the type information of the excitation signal, and the type information of the excitation signal may be included in the bitstream and stored or transmitted.
- the type information of the excitation signal may consist of 2 bits. Here, the 2 bits may represent four levels of the weight applied to the random noise.
- the type information of the excitation signal may be transmitted once per frame.
- a plurality of subbands may be bundled to form a group, and type information of an excitation signal may be defined for each group and transmitted for each group.
- the excitation signal generation method determiner 540 may determine a method of generating a high frequency excitation signal by considering only the signal characteristics of the original high frequency band.
- in this case, the range of the first tonality may be divided into as many regions as there are types of excitation signal type information, and the excitation signal generation method may be determined according to the region into which the first tonality value falls. With this method, when the tonality value is high, that is, when the spectrum is strongly peaky, the weight applied to the random noise can be set small.
- the excitation signal generation method determiner 540 may also determine the method of generating the high frequency excitation signal by considering both the signal characteristics of the original high frequency band and the characteristics of the high frequency signal to be generated through band extension. For example, if the two are similar, the weight of the random noise may be set small, and if they differ, the weight of the random noise may be set large. The weight may be set based on the average of the per-subband differences between the first tonality and the second tonality.
- if the average of the per-subband differences between the first tonality and the second tonality is large, the weight of the random noise may be set large; if it is small, the weight of the random noise may be set small.
- the average of the per-subband differences between the first tonality and the second tonality is computed over the subbands belonging to one group.
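- the decision based on the tonality difference can be sketched as follows. The threshold values and the function name are hypothetical, and the sketch returns a noise weight level from 0 (smallest weight) to 3 (largest weight); how such a level maps onto the transmitted 2-bit type index is not fixed here.

```python
def noise_weight_level(first_tonality, second_tonality,
                       thresholds=(0.1, 0.25, 0.5)):
    """Map the mean per-subband tonality difference of one group to 4 levels.

    The thresholds are illustrative assumptions, not values from the source.
    """
    diffs = [abs(a - b) for a, b in zip(first_tonality, second_tonality)]
    mean_diff = sum(diffs) / len(diffs)
    # Count how many thresholds are exceeded: 0 (similar) .. 3 (very different).
    return sum(mean_diff > t for t in thresholds)


similar = noise_weight_level([0.9, 0.8], [0.85, 0.82])
moderate = noise_weight_level([0.5, 0.5], [0.3, 0.3])
different = noise_weight_level([0.9, 0.9], [0.1, 0.2])
```

- a larger mean difference yields a larger level, matching the rule that the random noise weight grows as the original and extended spectra become less alike.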
- the energy controller 550 obtains energy in subband units for the spectrum of the original high frequency band, and performs energy control using the first tonality and the second tonality. For example, when the first tonality is large and the second tonality is small, that is, when the spectrum of the original high frequency band is peaky and the output spectrum of the anti-sparseness processor 270 or 370 is flat, the energy is adjusted based on the ratio of the two tonalities.
- the energy quantization unit 560 may vector quantize the adjusted energy and store or transmit the quantization index generated as a result of the vector quantization in the bitstream.
- the operations of the high frequency excitation signal generator 571 and the high frequency spectrum generator 573 are substantially the same as those of the high frequency excitation signal generator 1130 and the high frequency spectrum generator 1170 of FIG. 11, so a detailed description thereof is omitted.
- FIGS. 6A and 6B illustrate regions in which extended encoding is performed in the FD encoding module 170 illustrated in FIG. 1.
- FIG. 6A shows a case in which the upper frequency band Ffpc on which FPC is actually performed is the same as the low frequency band allocated to FPC, that is, the core frequency band Fcore. In this case, FPC and noise filling are performed on the low frequency band up to Fcore, and extended encoding is performed on the high frequency band corresponding to Fend-Fcore using the signal of the low frequency band.
- Fend may be the maximum frequency obtained by high frequency extension.
- FIG. 6B illustrates a case in which the upper frequency band Ffpc on which FPC is actually performed is lower than the core frequency band Fcore. FPC and noise filling are performed on the low frequency band up to Ffpc.
- for the low frequency band corresponding to Fcore-Ffpc, extended coding is performed using the low frequency band signal on which FPC and noise filling have been performed, and for the high frequency band corresponding to Fend-Fcore, extended coding is performed using the entire low frequency band signal.
- Fend can be the maximum frequency that can be obtained by high frequency extension.
- Fcore and Fend can be set variably according to the bit rate.
- Fcore may be limited to 6.4 kHz, 8 kHz, or 9.6 kHz depending on the bit rate, but is not limited thereto.
- Fend may be extended to 14 kHz, 14.4 kHz, or 16 kHz, but is not limited thereto.
- the band up to the upper frequency Ffpc, on which FPC is actually performed, corresponds to the frequency band in which noise filling is performed.
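- the two cases of FIGS. 6A and 6B can be summarised with a small Python sketch that lists which processing applies to which frequency range. The function name and the region labels are illustrative assumptions; frequencies are given in kHz.

```python
def coding_regions(f_fpc, f_core, f_end):
    """Return (band, processing) pairs for given Ffpc, Fcore and Fend in kHz."""
    if not (0.0 < f_fpc <= f_core < f_end):
        raise ValueError("expected 0 < Ffpc <= Fcore < Fend")
    regions = [((0.0, f_fpc), "FPC + noise filling")]
    if f_fpc < f_core:
        # FIG. 6B case: the remainder of the core band is extension coded.
        regions.append(((f_fpc, f_core), "low frequency extension coding"))
    regions.append(((f_core, f_end), "high frequency extension coding"))
    return regions


case_6a = coding_regions(8.0, 8.0, 16.0)   # Ffpc equal to Fcore
case_6b = coding_regions(6.4, 8.0, 16.0)   # Ffpc lower than Fcore
```

- with Ffpc equal to Fcore only two regions result, while a lower Ffpc adds the intermediate region between Ffpc and Fcore.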
- FIG. 7 is a block diagram showing the configuration of an audio encoding apparatus according to another embodiment of the present invention.
- the audio encoding apparatus 700 illustrated in FIG. 7 may include an encoding mode determiner 710, an LPC encoder 705, a switching unit 730, a CELP encoding module 750, and an audio encoding module 770.
- the CELP encoding module 750 may include a CELP encoding unit 751 and a TD extension encoding unit 753, and the audio encoding module 770 may include an audio encoding unit 771 and an FD extension encoding unit 773.
- Each component may be integrated into at least one or more modules and implemented as at least one or more processors (not shown).
- the LPC encoder 705 may extract a linear prediction coefficient (LPC) from an input signal and quantize the extracted linear prediction coefficient.
- the LPC encoder 705 may quantize the linear prediction coefficient by using a trellis coded quantization (TCQ) method, a multi-stage vector quantization (MSVQ) method, a lattice vector quantization (LVQ) method, or the like, but is not limited thereto.
- the linear predictive coefficient quantized by the LPC encoder 705 may be included in the bitstream and stored or transmitted.
- the LPC encoder 705 may resample or downsample an input signal having a sampling rate of 32 kHz or 48 kHz to extract a linear predictive coefficient from a signal having a sampling rate of 12.8 kHz or 16 kHz.
- the encoding mode determiner 710 may determine an encoding mode of an input signal by referring to characteristics of the signal. The encoding mode determiner 710 may determine whether the current frame is in a voice mode or a music mode according to the characteristics of the signal, and may determine whether the efficient encoding mode for the current frame is a time domain mode or a frequency domain mode.
- the input signal of the encoding mode determiner 710 may be a down sampled signal by a down sampling unit (not shown).
- the input signal may be a signal having a sampling rate of 12.8 kHz or 16 kHz obtained by resampling or down sampling a signal having a sampling rate of 32 kHz or 48 kHz.
- a signal having a sampling rate of 32 kHz may be referred to as a super wide band (SWB) signal, and a signal having a sampling rate of 48 kHz may be referred to as a full-band (FB) signal.
- a signal having a sampling rate of 16 kHz may be referred to as a wide-band signal.
- a resampling or downsampling operation may be performed in the encoding mode determiner 710.
- the encoding mode determiner 710 may determine the encoding mode with respect to the resampled or downsampled signal.
- the encoding mode determined by the encoding mode determiner 710 may be provided to the switching unit 730, and may be included in a bitstream on a frame basis to be transmitted or stored.
- the switching unit 730 may provide the linear prediction coefficient of the low frequency band supplied by the LPC encoder 705 to either the CELP encoding module 750 or the audio encoding module 770, according to the encoding mode supplied by the encoding mode determiner 710.
- the switching unit 730 provides the linear prediction coefficient of the low frequency band to the CELP encoding module 750 when the encoding mode is the CELP mode, and to the audio encoding module 770 when the encoding mode is the audio mode.
- the CELP encoding module 750 operates when the encoding mode is the CELP mode, and the CELP encoding unit 751 may perform CELP encoding on the excitation signal obtained from the linear prediction coefficient of the low frequency band.
- the CELP encoder 751 may quantize the LPC excitation signal by considering, respectively, a filtered adaptive codevector (i.e., adaptive codebook contribution) corresponding to pitch information and a filtered fixed codevector (i.e., fixed or innovation codebook contribution).
- the excitation signal may be generated by the LPC encoder 705 and provided to the CELP encoder 751 or generated by the CELP encoder 751.
- the CELP encoder 751 may apply different encoding modes according to the characteristics of the signal.
- the coding modes to be applied include a voiced coding mode, an unvoiced coding mode, a transient coding mode, and a generic coding mode, but are not limited thereto.
- the excitation signal of the low frequency band obtained as a result of the encoding by the CELP encoder 751, that is, the CELP information, may be provided to the TD extension encoder 753 and included in the bitstream.
- the TD extension encoder 753 may perform extension encoding of the high frequency band by folding or copying an excitation signal of the low frequency band provided by the CELP encoder 751.
- the TD extension encoder 753 may include extension information of a high frequency band obtained as a result of extension encoding in the bitstream.
- the audio encoding module 770 operates when the encoding mode is the audio mode.
- the audio encoding unit 771 may perform audio encoding by converting an excitation signal obtained from a linear prediction coefficient of a low frequency band into a frequency domain.
- the audio encoder 771 may use a transformation method in which there is no overlapping area between frames, such as a discrete cosine transform (DCT).
- the audio encoder 771 may perform Lattice VQ (LVQ) and FPC encoding on the excitation signal converted into the frequency domain.
- the audio encoder 771 may further consider TD information, such as the filtered adaptive codevector (adaptive codebook contribution) and the filtered fixed codevector (fixed codebook contribution), when quantizing the excitation signal.
- the FD extension encoder 773 may perform extended encoding of the high frequency band by using an excitation signal of a low frequency band provided from the audio encoder 771. Since the operation of the FD extension encoder 773 is similar to that of the FD high frequency extension encoder 290 or 390, except that the input signal is different, a detailed description thereof will be omitted.
- bitstreams may be generated according to the encoding mode determined by the encoding mode determiner 710.
- the bitstream may include a header and a payload.
- when the encoding mode is the CELP mode, the bitstream may include information on the encoding mode in the header, and may include CELP information and TD high frequency extension information in the payload. Meanwhile, when the encoding mode is the audio mode, the bitstream may include information about the encoding mode in the header, and the payload may include information about the audio encoding, that is, the audio information, and the FD high frequency extension information.
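- the header and payload arrangement can be pictured with the following Python sketch. The byte layout (one byte for the mode code, one byte for the length of the core information) and all names are purely hypothetical; the description only states that the mode goes in the header and that the core information and the extension information go in the payload.

```python
MODE_CODES = {"CELP": 0, "audio": 1}  # hypothetical numeric codes


def pack_frame(mode, core_info, extension_info):
    """Header: mode code and core length (assumed < 256 bytes); payload follows."""
    header = bytes([MODE_CODES[mode], len(core_info)])
    return header + core_info + extension_info


def unpack_frame(frame):
    """Recover the mode, the core information and the extension information."""
    code, core_len = frame[0], frame[1]
    mode = next(name for name, value in MODE_CODES.items() if value == code)
    return mode, frame[2:2 + core_len], frame[2 + core_len:]


# CELP mode: CELP information plus TD high frequency extension information.
frame = pack_frame("CELP", b"\x01\x02", b"\x03")
decoded = unpack_frame(frame)
```

- a decoder reading such a frame can inspect the header first and then hand the payload to the module matching the signalled mode.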
- the audio encoding apparatus 700 illustrated in FIG. 7 is switched to operate in either the CELP mode or the audio mode according to the characteristics of the signal, thereby performing efficient encoding adaptively to the characteristics of the signal. Meanwhile, the switching structure of FIG. 7 may preferably be applied to a low bit rate environment.
- FIG. 8 is a block diagram showing a configuration of an audio encoding apparatus according to another embodiment of the present invention.
- the audio encoding apparatus 800 illustrated in FIG. 8 may include an encoding mode determiner 810, a switching unit 830, a CELP encoding module 850, an FD encoding module 870, and an audio encoding module 890.
- the CELP encoding module 850 may include a CELP encoding unit 851 and a TD extension encoding unit 853, and the FD encoding module 870 may include a conversion unit 871 and an FD encoding unit 873.
- the audio encoding module 890 may include an audio encoder 891 and an FD extension encoder 893. Each component may be integrated into at least one or more modules and implemented as at least one or more processors (not shown).
- the encoding mode determiner 810 may determine an encoding mode of an input signal with reference to a characteristic and a bit rate of the signal.
- the encoding mode determiner 810 may determine whether the current frame is in the voice mode or the music mode according to the characteristics of the signal, and whether the efficient encoding mode is the time domain mode or the frequency domain mode. If the signal is in the voice mode, the CELP mode is selected; if the signal is in the music mode and the bit rate is high, the FD mode is selected; and if the signal is in the music mode and the bit rate is low, the audio mode is selected.
- the switching unit 830 may provide an input signal to one of the CELP encoding module 850, the FD encoding module 870, and the audio encoding module 890 according to the encoding mode provided from the encoding mode determiner 810.
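- the three-way mode decision can be sketched in Python as follows. The speech/music flag is assumed to come from a separate classifier, and the bit rate threshold separating high from low bit rates is a hypothetical value, since the description does not specify one.

```python
from enum import Enum


class Mode(Enum):
    CELP = "CELP"
    FD = "FD"
    AUDIO = "audio"


def select_mode(is_speech, bit_rate, high_rate_threshold=32000):
    """Voice -> CELP; music at a high bit rate -> FD; music at a low bit rate -> audio.

    high_rate_threshold (bits per second) is an illustrative assumption.
    """
    if is_speech:
        return Mode.CELP
    return Mode.FD if bit_rate >= high_rate_threshold else Mode.AUDIO


speech_mode = select_mode(True, 64000)
music_high = select_mode(False, 64000)
music_low = select_mode(False, 16000)
```

- the returned value plays the role of the encoding mode that is passed to the switching unit and written into the bitstream on a frame basis.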
- the audio encoding apparatus 800 of FIG. 8 is similar to a combination of the audio encoding apparatus 100 of FIG. 1 and the audio encoding apparatus 700 of FIG. 7, except that the CELP encoder 851 and the audio encoder 891 each extract the linear prediction coefficient from the input signal.
- the audio encoding apparatus 800 illustrated in FIG. 8 may be switched to operate in any one of the CELP mode, the FD mode, and the audio mode according to the characteristics of the signal, thereby performing efficient encoding adaptively to the characteristics of the signal. Meanwhile, the switching structure of FIG. 8 may be applied regardless of the bit rate.
- FIG. 9 is a block diagram showing the configuration of an audio decoding apparatus according to an embodiment of the present invention.
- the audio decoding apparatus shown in FIG. 9, alone or in combination with the audio encoding apparatus shown in FIG. 1, may constitute a multimedia device. The multimedia device may be a voice communication terminal such as a telephone or a mobile phone, a broadcast-only or music-only terminal such as a TV or an MP3 player, or a fusion terminal combining a voice communication terminal with a broadcast or music terminal, but is not limited thereto.
- the audio decoding apparatus may be used as a client, a server, or a transducer disposed between the client and the server.
- the audio decoding apparatus 900 illustrated in FIG. 9 may include a switching unit 910, a CELP decoding module 930, and an FD decoding module 950.
- the CELP decoding module 930 may include a CELP decoding unit 931 and a TD extended decoding unit 933, and the FD decoding module 950 may include an FD decoding unit 951 and an inverse transform unit 953.
- Each component may be integrated into at least one or more modules and implemented as at least one or more processors (not shown).
- the switching unit 910 may provide the bitstream to one of the CELP decoding module 930 and the FD decoding module 950 by referring to information about an encoding mode included in the bitstream.
- the bitstream is provided to the CELP decoding module 930 when the encoding mode is the CELP mode, and to the FD decoding module 950 when the encoding mode is the FD mode.
- the CELP decoding unit 931 decodes the linear prediction coefficients included in the bitstream, performs decoding on the filtered adaptive code vector and the filtered fixed code vector, and synthesizes the decoding results to generate a reconstruction signal for the low frequency band.
- the TD extension decoder 933 generates a reconstruction signal of the high frequency band by performing extended decoding on the high frequency band using at least one of a CELP decoding result and an excitation signal of a low frequency band. At this time, the excitation signal of the low frequency band may be included in the bitstream. In addition, the TD extension decoder 933 may utilize linear predictive coefficient information for the low frequency band included in the bitstream to generate a reconstruction signal for the high frequency band.
- the TD extension decoder 933 may generate a restored SWB signal by combining the reconstruction signal for the generated high frequency band with the reconstruction signal of the low frequency band generated by the CELP decoder 931.
- the TD extension decoder 933 may further perform the operation of converting the sampling rate of the reconstruction signal of the low frequency band and the reconstruction signal of the high frequency band to be identical to generate the reconstructed SWB signal.
- the FD decoding unit 951 performs FD decoding on an FD coded frame.
- the FD decoder 951 may generate a frequency spectrum by decoding the bitstream.
- the FD decoder 951 may perform decoding by referring to the mode information of the previous frame included in the bitstream. That is, for an FD encoded frame, the FD decoder 951 may perform FD decoding with reference to the previous frame mode information included in the bitstream.
- the inverse transform unit 953 inverse-transforms the FD decoding result into the time domain.
- the inverse transform unit 953 generates a reconstruction signal by performing inverse transform on the FD decoded frequency spectrum.
- the inverse transform unit 953 may perform inverse MDCT, but is not limited thereto.
- the audio signal decoding apparatus 900 may decode the bitstream by referring to the encoding mode on a frame basis.
- FIG. 10 is a block diagram illustrating a configuration of an embodiment of the FD decoder illustrated in FIG. 9.
- the FD decoder 1000 illustrated in FIG. 10 may include a Norm decoder 1010, an FPC decoder 1020, a noise filling unit 1030, an FD low frequency extension decoder 1040, an anti-sparseness processor 1050, an FD high frequency extension decoder 1060, and a combiner 1070.
- the Norm decoder 1010 may obtain a restored Norm value by decoding a Norm value included in the bitstream.
- the FPC decoder 1020 may determine the number of allocated bits using the reconstructed Norm value, and may perform FPC decoding on the FPC encoded spectrum using the number of allocated bits.
- the number of allocated bits may be determined in the same manner as in the FPC encoder 230 or 330.
- the noise filling unit 1030 may, by referring to the FPC decoding result of the FPC decoder 1020, perform noise filling using a noise level separately generated and provided by the audio encoding apparatus, or perform noise filling using the reconstructed Norm value. That is, the noise filling unit 1030 performs the noise filling process up to the last subband in which FPC decoding is performed.
- the FD low frequency extension decoder 1040 operates when the upper frequency band Ffpc on which FPC decoding is actually performed is lower than the core frequency band Fcore. FPC decoding and noise filling are performed on the low frequency band up to Ffpc.
- for the low frequency band corresponding to Fcore-Ffpc, extended decoding may be performed using the low frequency band signal on which FPC decoding and noise filling have been performed.
- even when noise filling has been performed on the FPC decoded signal, the anti-sparseness processor 1050 may add noise to the spectrum restored to zero, thereby suppressing the metallic noise that would otherwise arise after FD high frequency extension decoding.
- the anti-sparseness processor 1050 determines the noise addition position and the noise magnitude from the low frequency band spectrum provided by the FD low frequency extension decoder 1040, performs anti-sparseness processing on the low frequency band spectrum according to the determined position and magnitude, and provides the result to the FD high frequency extension decoder 1060.
- the anti-sparseness processor 1050 may include the noise positioning unit 430, the noise sizing unit 440, and the noise adding unit 450 illustrated in FIG. 4, excluding the reconstructed spectrum generator 410.
- when noise filling is performed only if all the spectral coefficients in a subband are quantized to zero in FPC decoding, anti-sparseness processing may be performed by adding noise wherever a coefficient restored to zero exists in a subband on which noise filling was not performed.
- the FD high frequency extension decoder 1060 performs extended decoding on the high frequency band by using the spectrum of the low frequency band to which noise has been added by the anti-sparseness processor 1050. According to an embodiment, the FD high frequency extension decoder 1060 may perform energy dequantization by sharing the same codebook for different bit rates.
- the combiner 1070 combines the spectrum of the low frequency band provided from the FD low frequency extension decoder 1040 with the spectrum of the high frequency band provided from the FD high frequency extension decoder 1060 to generate a reconstructed spectrum of the SWB.
- FIG. 11 is a block diagram illustrating a configuration of an embodiment of an FD high frequency extension decoder illustrated in FIG. 10.
- the FD high frequency extension decoder 1100 illustrated in FIG. 11 may include a spectrum copy unit 1110, a high frequency excitation signal generator 1130, an energy dequantization unit 1150, and a high frequency spectrum generator 1170.
- the spectrum copy unit 1110 may extend the low frequency band spectrum provided from the anti-sparseness processor 1050 of FIG. 10 to a high frequency band by folding or replicating it.
- the high frequency excitation signal generator 1130 generates a high frequency excitation signal using the extended high frequency band spectrum provided from the spectrum copying unit 1110 and the excitation signal type information extracted from the bitstream.
- the high frequency excitation signal generator 1130 may generate a high frequency excitation signal through a weight between the spectrum G(n), which is modified from the extended high frequency band spectrum provided from the spectrum copy unit 1110, and the random noise R(n).
- the modified spectrum may be obtained by obtaining an average size in units of subbands by using a newly defined subband instead of an existing subband, and normalizing the spectrum to the average size.
- the modified spectrum generated in this way undergoes level matching in units of a predetermined subband in order to match its level to that of the random noise. Level matching makes the average magnitude per subband equal between the random noise and the modified spectrum.
- the magnitude of the modified spectrum may be set slightly larger than that of the random noise.
- here, w(n) is a value determined by the type information of the excitation signal, and n is a spectrum bin index.
- w(n) may be a constant value or, when transmitted for each subband, may be defined as the same value within each subband. It may also be set in consideration of smoothing between adjacent subbands.
- w(n) may be allocated such that, when the type information of the excitation signal is defined with 2 bits as 0, 1, 2, and 3, w(n) takes its maximum value for type 0 and its minimum value for type 3.
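- the weighting between G(n) and R(n) can be sketched as below for a single subband. The combining formula, in which w(n) multiplies the random noise and (1 - w(n)) multiplies the level-matched modified spectrum, is an assumption, as are the weight values in the table; only the ordering (largest weight for type 0, smallest for type 3) follows the description.

```python
# Hypothetical weights: maximum for type 0, minimum for type 3.
WEIGHT_BY_TYPE = {0: 0.75, 1: 0.5, 2: 0.25, 3: 0.0}


def mean_magnitude(values):
    return sum(abs(v) for v in values) / len(values)


def level_match(modified, noise):
    """Scale the modified spectrum so its mean magnitude equals that of the noise."""
    level = mean_magnitude(modified)
    if level == 0.0:
        return list(modified)
    scale = mean_magnitude(noise) / level
    return [scale * g for g in modified]


def mix_excitation(modified, noise, excitation_type):
    """Assumed form: E(n) = (1 - w) * G(n) + w * R(n), with w constant per subband."""
    w = WEIGHT_BY_TYPE[excitation_type]
    matched = level_match(modified, noise)
    return [(1.0 - w) * g + w * r for g, r in zip(matched, noise)]


g_band = [2.0, -2.0, 2.0, -2.0]   # modified spectrum G(n)
r_band = [1.0, -1.0, -1.0, 1.0]   # random noise R(n)
excitation_type3 = mix_excitation(g_band, r_band, 3)
excitation_type1 = mix_excitation(g_band, r_band, 1)
```

- with the assumed weight of 0 for type 3 the excitation equals the level-matched spectrum, while intermediate types blend in progressively more noise.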
- the energy dequantization unit 1150 dequantizes the quantization index included in the bitstream to restore energy.
- the high frequency spectrum generator 1170 may restore the high frequency band spectrum from the high frequency excitation signal based on a ratio between the energy of the high frequency excitation signal and the restored energy so that the energy of the high frequency excitation signal may be matched with the restored energy.
- the high frequency spectrum generator 1170 may also generate the high frequency spectrum using, as the input of the spectrum copy unit 1110, another input signal in place of the low frequency band spectrum provided from the anti-sparseness processor 1050 of FIG. 10.
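- the energy matching performed by the high frequency spectrum generator can be illustrated with a short Python sketch. Treating energy as the sum of squared coefficients of one subband and applying a single gain per subband are assumptions; the description only states that the ratio between the excitation energy and the restored energy is used.

```python
import math


def match_energy(excitation, restored_energy):
    """Scale the excitation so that its energy equals the dequantized energy."""
    energy = sum(x * x for x in excitation)  # assumed definition of energy
    if energy == 0.0:
        return list(excitation)  # nothing to scale
    gain = math.sqrt(restored_energy / energy)
    return [gain * x for x in excitation]


high_band = match_energy([1.0, -1.0, 1.0, -1.0], restored_energy=16.0)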
- FIG. 12 is a block diagram showing the configuration of an audio decoding apparatus according to another embodiment of the present invention.
- the audio decoding apparatus 1200 illustrated in FIG. 12 may include an LPC decoding unit 1205, a switching unit 1210, a CELP decoding module 1230, and an audio decoding module 1250.
- the CELP decoding module 1230 may include a CELP decoding unit 1231 and a TD extension decoding unit 1233, and the audio decoding module 1250 may include an audio decoding unit 1251 and an FD extension decoding unit 1253.
- Each component may be integrated into at least one or more modules and implemented as at least one or more processors (not shown).
- the LPC decoding unit 1205 performs LPC decoding on a bitstream in a frame unit.
- the switching unit 1210 may provide the output of the LPC decoding unit 1205 to one of the CELP decoding module 1230 and the audio decoding module 1250 with reference to the information on the encoding mode included in the bitstream.
- the output of the LPC decoder 1205 is provided to the CELP decoding module 1230 when the encoding mode is the CELP mode, and to the audio decoding module 1250 when the encoding mode is the audio mode.
- the CELP decoding unit 1231 performs CELP decoding on a CELP encoded frame. For example, the CELP decoder 1231 performs decoding on the filtered adaptive code vector and the filtered fixed code vector, and synthesizes the decoding results to generate a reconstruction signal for the low frequency band.
- the TD extension decoder 1233 generates a reconstruction signal of the high frequency band by performing extended decoding on the high frequency band using at least one of a CELP decoding result and an excitation signal of a low frequency band. At this time, the excitation signal of the low frequency band may be included in the bitstream. In addition, the TD extension decoder 1233 may utilize linear predictive coefficient information for the low frequency band included in the bitstream to generate a reconstruction signal for the high frequency band.
- the TD extension decoder 1233 may generate a reconstructed SWB signal by combining a reconstruction signal of the generated high frequency band with a reconstruction signal of the low frequency band generated by the CELP decoder 1231. In this case, the TD extension decoder 1233 may further perform the operation of converting the sampling rate of the reconstruction signal of the low frequency band and the reconstruction signal of the high frequency band to be identical to generate the reconstructed SWB signal.
- the audio decoder 1251 performs audio decoding on the audio encoded frame. For example, when a time domain contribution exists, the audio decoder 1251 performs decoding in consideration of both the time domain contribution and the frequency domain contribution, and when no time domain contribution exists, it performs decoding in consideration of the frequency domain contribution only.
- the audio decoder 1251 performs inverse frequency transformation, using IDCT or the like, on the signal quantized by FPC or LVQ to generate the decoded low frequency excitation signal, and synthesizes the generated excitation signal with the inverse quantized LPC coefficients to generate a reconstruction signal of the low frequency band.
- the FD extension decoder 1253 performs extended decoding by using the result of the audio decoding. For example, the FD extension decoder 1253 converts the decoded low frequency band signal to a sampling rate suitable for high frequency extension decoding, and performs a frequency transform such as the MDCT on the converted signal. The FD extension decoder 1253 dequantizes the quantized energy of the high frequency band, generates an excitation signal of the high frequency band from the transformed low frequency spectrum according to the applicable mode of the high frequency bandwidth extension, and applies a gain so that the energy of the generated excitation signal matches the dequantized energy, thereby generating a reconstruction signal of the high frequency band.
- The mode of high frequency bandwidth extension may be any one of a normal mode, a transition mode, a harmonic mode, and a noise mode.
- The FD extension decoder 1253 performs an inverse frequency transform, such as an inverse MDCT, on the generated high-frequency-band reconstruction signal and the low-frequency-band reconstruction signal, converts the inverse-transformed signal so that its sampling rate matches that of the generated low-frequency signal, and then synthesizes the low-frequency signal with the converted signal to generate a final reconstruction signal.
- After the inverse frequency transform, such as an inverse MDCT, is performed, the FD extension decoder 1253 may also apply a gain obtained in the time domain so that the decoded signal matches the decoded temporal envelope, and may synthesize the gain-applied signal.
- The audio signal decoding apparatus may decode the bitstream frame by frame by referring to the encoding mode of each frame.
- FIG. 13 is a block diagram showing the configuration of an audio decoding apparatus according to another embodiment of the present invention.
- The audio decoding apparatus 1300 illustrated in FIG. 13 may include a switching unit 1310, a CELP decoding module 1330, an FD decoding module 1350, and an audio decoding module 1370.
- The CELP decoding module 1330 may include a CELP decoding unit 1331 and a TD extension decoding unit 1333.
- The FD decoding module 1350 may include an FD decoding unit 1351 and an inverse transform unit 1353.
- The audio decoding module 1370 may include an audio decoder 1372 and an FD extension decoder 1373. The components may be integrated into at least one module and implemented as at least one processor (not shown).
- The switching unit 1310 refers to information about the encoding mode included in the bitstream and provides the bitstream to one of the CELP decoding module 1330, the FD decoding module 1350, and the audio decoding module 1370.
- Specifically, the bitstream is provided to the CELP decoding module 1330 when the encoding mode is the CELP mode, to the FD decoding module 1350 when it is the FD mode, and to the audio decoding module 1370 when it is the audio mode.
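The routing performed by the switching unit can be sketched as a simple dispatch on the per-frame encoding mode. The frame layout (a dict with `mode` and `payload` keys), the `CodingMode` enum, and the placeholder decoder functions are hypothetical stand-ins; the patent does not specify how the mode information is carried in the bitstream.

```python
from enum import Enum


class CodingMode(Enum):
    CELP = "celp"
    FD = "fd"
    AUDIO = "audio"


# Placeholder decoders standing in for modules 1330, 1350 and 1370.
def celp_decode(payload):
    return ("CELP decoding module 1330", payload)


def fd_decode(payload):
    return ("FD decoding module 1350", payload)


def audio_decode(payload):
    return ("audio decoding module 1370", payload)


DECODERS = {
    CodingMode.CELP: celp_decode,
    CodingMode.FD: fd_decode,
    CodingMode.AUDIO: audio_decode,
}


def switch(frame):
    """Route one frame to the decoder selected by its encoding mode."""
    mode = CodingMode(frame["mode"])  # raises ValueError for an unknown mode
    return DECODERS[mode](frame["payload"])


routed = switch({"mode": "fd", "payload": b"\x01"})
```

Because the mode is read per frame, consecutive frames of the same bitstream may be routed to different decoding modules.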
- The CELP decoding module 1330, the FD decoding module 1350, and the audio decoding module 1370 perform the inverse operations of the CELP encoding module 850, the FD encoding module 870, and the audio encoding module 890 of FIG. 8, respectively, so a detailed description thereof is omitted here.
- FIG. 14 is a diagram illustrating a codebook sharing method according to an embodiment of the present invention.
- The FD extension encoder 773 shown in FIG. 7 or the FD extension encoder 893 shown in FIG. 8 may perform energy quantization by sharing the same codebook for different bit rates. To this end, when dividing the frequency spectrum corresponding to the input signal into a predetermined number of subbands, the FD extension encoder 773 or the FD extension encoder 893 uses the same subband bandwidths for the different bit rates.
- The bandwidth 1430 of the first subband may be 0.4 kHz at both a bit rate of 16 kbps and a bit rate of 16 kbps or more.
- The bandwidth 1440 of the second subband may be 0.6 kHz at both a bit rate of 16 kbps and a bit rate of 16 kbps or more.
- Accordingly, the FD extension encoder 773 or the FD extension encoder 893 can share the same codebook across different bit rates when performing energy quantization.
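A minimal sketch of the idea follows: because the subband layout does not change with bit rate, one energy codebook can serve every rate. Only the 0.4 kHz and 0.6 kHz bandwidths come from the text; the codebook values, the dB scale, the nearest-neighbour scalar quantizer, and all function names are assumptions for illustration.

```python
# Bandwidths of the first two subbands as stated in the text, in kHz.
SUBBAND_BANDWIDTHS_KHZ = (0.4, 0.6)

# Hypothetical energy codebook in dB; the actual table is not given in the text.
SHARED_ENERGY_CODEBOOK_DB = (0.0, 6.0, 12.0, 18.0, 24.0, 30.0, 36.0, 42.0)


def subband_bandwidths_khz(bit_rate_kbps):
    """Return the subband layout, which is identical for every supported rate."""
    if bit_rate_kbps < 16:
        raise ValueError("this sketch only covers bit rates of 16 kbps and above")
    return SUBBAND_BANDWIDTHS_KHZ


def quantize_energy(energy_db):
    """Return the index of the nearest entry in the shared codebook."""
    return min(
        range(len(SHARED_ENERGY_CODEBOOK_DB)),
        key=lambda i: abs(SHARED_ENERGY_CODEBOOK_DB[i] - energy_db),
    )


def quantize_subband_energies(energies_db, bit_rate_kbps):
    """Quantize one energy per subband using the rate-independent codebook."""
    layout = subband_bandwidths_khz(bit_rate_kbps)
    if len(energies_db) != len(layout):
        raise ValueError("expected one energy value per subband")
    return [quantize_energy(e) for e in energies_db]


indices_16 = quantize_subband_energies([13.0, 29.0], 16)
indices_24 = quantize_subband_energies([13.0, 29.0], 24)
```

Since a given subband covers the same frequency range at every rate, its energy statistics are comparable across rates, which is what makes a single stored table sufficient.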
- When the multimode bandwidth extension technique is applied in a configuration in which the CELP mode and the FD mode are switched, a configuration in which the CELP mode and the audio mode are switched, or a configuration in which the CELP mode, the FD mode, and the audio mode are switched, codebook sharing that supports various bit rates can reduce the size of a memory (e.g., a ROM) and the complexity of implementation.
- FIG. 15 is a diagram illustrating a coding mode signaling method according to an embodiment of the present invention.
- In operation 1510, it is determined whether the input signal corresponds to a transient component. The transient component can be detected using various known methods.
- If the input signal corresponds to a transient component, it is encoded in the transient mode, and the fact that it is encoded in the transient mode is signaled using a 1-bit transient indicator.
- In operation 1540, when it is determined in operation 1510 that the input signal does not correspond to a transient component, it is determined whether the input signal corresponds to a harmonic component. The harmonic component can be detected using various known methods.
- If the input signal corresponds to a harmonic component, it is encoded in the harmonic mode, and the fact that it is encoded in the harmonic mode is signaled using a 1-bit harmonic indicator together with the 1-bit transient indicator.
- In operation 1560, when it is determined in operation 1540 that the input signal does not correspond to a harmonic component, bit allocation in fractional-bit units is performed.
- In operation 1570, the input signal is encoded in the normal mode, and the fact that it is encoded in the normal mode is signaled using the 1-bit harmonic indicator together with the 1-bit transient indicator.
- In this way, the transient mode, the harmonic mode, and the normal mode can be signaled using a 2-bit indicator.
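The signaling flow of FIG. 15 can be sketched as below. The text states which indicators are sent for each mode but not their bit values, so the polarity (1 meaning "set") and the bit order are assumptions, as are the function names.

```python
def signal_mode(is_transient, is_harmonic):
    """Return (mode, indicator bits) following the decision order of FIG. 15."""
    if is_transient:
        # Transient mode: only the 1-bit transient indicator is sent.
        return "transient", [1]
    if is_harmonic:
        # Harmonic mode: transient indicator cleared, harmonic indicator set.
        return "harmonic", [0, 1]
    # Normal mode: both indicators are sent and both are cleared.
    return "normal", [0, 0]


def parse_mode(bits):
    """Recover the mode from the indicator bits written by signal_mode."""
    it = iter(bits)
    if next(it) == 1:
        return "transient"
    return "harmonic" if next(it) == 1 else "normal"


mode, bits = signal_mode(is_transient=False, is_harmonic=True)
```

Under these assumptions at most two indicator bits are needed per frame, and the transient check takes priority over the harmonic check, mirroring the order of operations 1510 and 1540.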
- The methods derived from the apparatuses according to the embodiments can be written as a program executable on a computer and can be implemented on a general-purpose digital computer that runs the program using a computer-readable recording medium.
- Data structures, program instructions, or data files usable in the above-described embodiments of the present invention can be recorded on a computer-readable recording medium through various means.
- The computer-readable recording medium may include all kinds of storage devices that store data readable by a computer system. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specifically configured to store and execute program instructions, such as ROM, RAM, and flash memory.
- The computer-readable recording medium may also be a transmission medium that transmits a signal specifying program instructions, data structures, or the like.
- Examples of program instructions include not only machine code such as that produced by a compiler but also high-level language code that can be executed by a computer using an interpreter.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Signal Processing (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Computational Linguistics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Quality & Reliability (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Mobile Radio Communication Systems (AREA)
- Error Detection And Correction (AREA)
- Radar Systems Or Details Thereof (AREA)
Claims (2)
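- A method of generating a bandwidth extension signal, the method comprising: performing anti-sparseness processing on a spectrum of a low frequency band; and performing extension encoding of a high frequency band in the frequency domain by using the spectrum of the low frequency band on which the anti-sparseness processing has been performed.
- An apparatus for generating a bandwidth extension signal, the apparatus comprising: an anti-sparseness processing unit that performs anti-sparseness processing on a spectrum of a low frequency band; and an FD high-frequency extension decoding unit that performs extension decoding of a high frequency band in the frequency domain by using the spectrum of the low frequency band on which the anti-sparseness processing has been performed.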
Priority Applications (15)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
MX2017011044A MX370012B (es) | 2011-06-30 | 2012-07-02 | Apparatus and method for generating bandwidth extension signal |
MX2016008879A MX350162B (es) | 2011-06-30 | 2012-07-02 | Apparatus and method for generating bandwidth extension signal |
CN201280042439.XA CN103843062B (zh) | 2011-06-30 | 2012-07-02 | Apparatus and method for generating bandwidth extension signal |
AU2012276367A AU2012276367B2 (en) | 2011-06-30 | 2012-07-02 | Apparatus and method for generating bandwidth extension signal |
EP12804615.8A EP2728577A4 (en) | 2011-06-30 | 2012-07-02 | APPARATUS AND METHOD FOR GENERATING A BANDWIDTH EXTENSION SIGNAL |
US14/130,021 US9349380B2 (en) | 2011-06-30 | 2012-07-02 | Apparatus and method for generating bandwidth extension signal |
BR112013033900-4A BR112013033900B1 (pt) | 2011-06-30 | 2012-07-02 | Method for generating a bandwidth extension signal for audio decoding |
MX2014000161A MX340386B (es) | 2011-06-30 | 2012-07-02 | Apparatus and method for generating bandwidth extension signal |
BR122021019883-7A BR122021019883B1 (pt) | 2011-06-30 | 2012-07-02 | Method of generating a bandwidth extension signal, and non-transitory computer-readable medium |
BR122021019877-2A BR122021019877B1 (pt) | 2011-06-30 | 2012-07-02 | Apparatus for generating a bandwidth extension signal |
JP2014518822A JP6001657B2 (ja) | 2011-06-30 | 2012-07-02 | Apparatus and method for generating bandwidth extension signal |
CA2840732A CA2840732C (en) | 2011-06-30 | 2012-07-02 | Apparatus and method for generating bandwidth extension signal |
ZA2014/00704A ZA201400704B (en) | 2011-06-30 | 2014-01-29 | Apparatus and method for generating bandwidth extension signal |
US15/142,949 US9734843B2 (en) | 2011-06-30 | 2016-04-29 | Apparatus and method for generating bandwidth extension signal |
US15/676,209 US10037766B2 (en) | 2011-06-30 | 2017-08-14 | Apparatus and method for generating bandwith extension signal |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201161503241P | 2011-06-30 | 2011-06-30 | |
US61/503,241 | 2011-06-30 |
Related Child Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/130,021 A-371-Of-International US9349380B2 (en) | 2011-06-30 | 2012-07-02 | Apparatus and method for generating bandwidth extension signal |
US15/142,949 Continuation US9734843B2 (en) | 2011-06-30 | 2016-04-29 | Apparatus and method for generating bandwidth extension signal |
Publications (3)
Publication Number | Publication Date |
---|---|
WO2013002623A2 (ko) | 2013-01-03 |
WO2013002623A3 (ko) | 2013-04-11 |
WO2013002623A4 (ko) | 2013-06-06 |
Family
ID=47424723
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/KR2012/005258 WO2013002623A2 (ko) | 2011-06-30 | 2012-07-02 | Apparatus and method for generating bandwidth extension signal |
Country Status (12)
Country | Link |
---|---|
US (3) | US9349380B2 (ko) |
EP (1) | EP2728577A4 (ko) |
JP (3) | JP6001657B2 (ko) |
KR (3) | KR102078865B1 (ko) |
CN (3) | CN106157968B (ko) |
AU (3) | AU2012276367B2 (ko) |
BR (3) | BR122021019883B1 (ko) |
CA (2) | CA2840732C (ko) |
MX (3) | MX340386B (ko) |
TW (3) | TWI576832B (ko) |
WO (1) | WO2013002623A2 (ko) |
ZA (1) | ZA201400704B (ko) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2015162500A3 (ko) * | 2014-03-24 | 2016-01-28 | 삼성전자 주식회사 | High-band encoding method and device, and high-band decoding method and device |
KR20170024048A (ko) * | 2014-07-28 | 2017-03-06 | 프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝 에. 베. | Apparatus and method for generating an enhanced signal using independent noise-filling |
US11676614B2 (en) | 2014-03-03 | 2023-06-13 | Samsung Electronics Co., Ltd. | Method and apparatus for high frequency decoding for bandwidth extension |
Families Citing this family (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
MX340386B (es) * | 2011-06-30 | 2016-07-07 | Samsung Electronics Co Ltd | Apparatus and method for generating bandwidth extension signal |
CN105976824B (zh) | 2012-12-06 | 2021-06-08 | 华为技术有限公司 | Signal decoding method and device |
ES2714289T3 (es) | 2013-01-29 | 2019-05-28 | Fraunhofer Ges Forschung | Noise filling in perceptual transform audio coding |
EP2830061A1 (en) | 2013-07-22 | 2015-01-28 | Fraunhofer Gesellschaft zur Förderung der angewandten Forschung e.V. | Apparatus and method for encoding and decoding an encoded audio signal using temporal noise/patch shaping |
KR102625143B1 (ko) * | 2014-02-17 | 2024-01-15 | 삼성전자주식회사 | Signal encoding method and apparatus, and signal decoding method and apparatus |
WO2015133795A1 (ko) * | 2014-03-03 | 2015-09-11 | 삼성전자 주식회사 | High-frequency decoding method and apparatus for bandwidth extension |
PL3128513T3 (pl) * | 2014-03-31 | 2019-11-29 | Fraunhofer Ges Forschung | Encoder, decoder, encoding method, decoding method, and program |
CN106409304B (zh) * | 2014-06-12 | 2020-08-25 | 华为技术有限公司 | Method and apparatus for time-domain envelope processing of an audio signal, and encoder |
FR3024581A1 (fr) * | 2014-07-29 | 2016-02-05 | Orange | Determination of a coding budget for an LPD/FD transition frame |
JP2016038435A (ja) * | 2014-08-06 | 2016-03-22 | ソニー株式会社 | Encoding device and method, decoding device and method, and program |
WO2016142002A1 (en) | 2015-03-09 | 2016-09-15 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Audio encoder, audio decoder, method for encoding an audio signal and method for decoding an encoded audio signal |
EP3435376B1 (en) * | 2017-07-28 | 2020-01-22 | Fujitsu Limited | Audio encoding apparatus and audio encoding method |
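KR102457573B1 (ko) * | 2021-03-02 | 2022-10-21 | 국방과학연구소 | Apparatus and method for generating a noise signal, computer-readable recording medium, and computer program |
KR102473886B1 (ko) | 2021-11-25 | 2022-12-06 | 한국프리팩 주식회사 | Eco-friendly foamed multilayer sheet, ice pack using the same, and method of manufacturing the same |
KR102574372B1 (ko) | 2023-01-26 | 2023-09-05 | 한국프리팩 주식회사 | Co-extruded eco-friendly foamed multilayer film and ice pack using the same |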
Family Cites Families (32)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS5857759B2 (ja) * | 1979-10-01 | 1983-12-21 | 日本電信電話株式会社 | Driving sound source signal generating device |
JPS57125999A (en) * | 1981-01-29 | 1982-08-05 | Seiko Instr & Electronics | Voice synthesizer |
US6029125A (en) | 1997-09-02 | 2000-02-22 | Telefonaktiebolaget L M Ericsson, (Publ) | Reducing sparseness in coded speech signals |
US6058359A (en) * | 1998-03-04 | 2000-05-02 | Telefonaktiebolaget L M Ericsson | Speech coding including soft adaptability feature |
DE60110086T2 (de) * | 2000-07-27 | 2006-04-06 | Activated Content Corp., Inc., Burlingame | Stegotext encoder and decoder |
KR100510434B1 (ko) * | 2001-04-09 | 2005-08-26 | 니폰덴신뎅와 가부시키가이샤 | OFDM signal transmission system, OFDM signal transmitting apparatus, and OFDM signal receiving apparatus |
AU2002348961A1 (en) * | 2001-11-23 | 2003-06-10 | Koninklijke Philips Electronics N.V. | Audio signal bandwidth extension |
US20040002856A1 (en) * | 2002-03-08 | 2004-01-01 | Udaya Bhaskar | Multi-rate frequency domain interpolative speech CODEC system |
US7668711B2 (en) * | 2004-04-23 | 2010-02-23 | Panasonic Corporation | Coding equipment |
JP5129117B2 (ja) | 2005-04-01 | 2013-01-23 | クゥアルコム・インコーポレイテッド | Method and apparatus for encoding and decoding a highband portion of a speech signal |
US7813931B2 (en) * | 2005-04-20 | 2010-10-12 | QNX Software Systems, Co. | System for improving speech quality and intelligibility with bandwidth compression/expansion |
US7831434B2 (en) * | 2006-01-20 | 2010-11-09 | Microsoft Corporation | Complex-transform channel coding with extended-band frequency coding |
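KR20070115637A (ko) * | 2006-06-03 | 2007-12-06 | 삼성전자주식회사 | Bandwidth extension encoding and decoding method and apparatus |
CN101089951B (zh) * | 2006-06-16 | 2011-08-31 | 北京天籁传音数字技术有限公司 | Band extension encoding method and apparatus, and decoding method and apparatus |
KR101390188B1 (ko) * | 2006-06-21 | 2014-04-30 | 삼성전자주식회사 | Adaptive high-frequency-domain encoding and decoding method and apparatus |
KR101375582B1 (ko) * | 2006-11-17 | 2014-03-20 | 삼성전자주식회사 | Bandwidth extension encoding and decoding method and apparatus |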
US8639500B2 (en) * | 2006-11-17 | 2014-01-28 | Samsung Electronics Co., Ltd. | Method, medium, and apparatus with bandwidth extension encoding and/or decoding |
KR101379263B1 (ko) * | 2007-01-12 | 2014-03-28 | 삼성전자주식회사 | Bandwidth extension decoding method and apparatus |
DK2571024T3 (en) * | 2007-08-27 | 2015-01-05 | Ericsson Telefon Ab L M | Adaptive transition frequency between the noise filling and bandwidth extension |
ES2704286T3 (es) * | 2007-08-27 | 2019-03-15 | Ericsson Telefon Ab L M | Method and device for perceptual spectral decoding of an audio signal, including filling of spectral holes |
KR101452722B1 (ko) * | 2008-02-19 | 2014-10-23 | 삼성전자주식회사 | Signal encoding and decoding method and apparatus |
EP2144230A1 (en) | 2008-07-11 | 2010-01-13 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Low bitrate audio encoding/decoding scheme having cascaded switches |
US8880410B2 (en) * | 2008-07-11 | 2014-11-04 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Apparatus and method for generating a bandwidth extended signal |
CA2836871C (en) * | 2008-07-11 | 2017-07-18 | Stefan Bayer | Time warp activation signal provider, audio signal encoder, method for providing a time warp activation signal, method for encoding an audio signal and computer programs |
EP4407610A1 (en) * | 2008-07-11 | 2024-07-31 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Audio encoder, audio decoder, methods for encoding and decoding an audio signal, audio stream and computer program |
CN102177426B (zh) * | 2008-10-08 | 2014-11-05 | 弗兰霍菲尔运输应用研究公司 | Multi-resolution switched audio encoding/decoding scheme |
RU2493618C2 (ru) * | 2009-01-28 | 2013-09-20 | Долби Интернешнл Аб | Improved harmonic transposition |
EP2239732A1 (en) * | 2009-04-09 | 2010-10-13 | Fraunhofer-Gesellschaft zur Förderung der Angewandten Forschung e.V. | Apparatus and method for generating a synthesis audio signal and for encoding an audio signal |
KR101826331B1 (ko) * | 2010-09-15 | 2018-03-22 | 삼성전자주식회사 | Encoding/decoding apparatus and method for high-frequency bandwidth extension |
ES2664090T3 (es) * | 2011-03-10 | 2018-04-18 | Telefonaktiebolaget Lm Ericsson (Publ) | Filling of non-coded sub-vectors in transform-coded audio signals |
TWI606441B (zh) | 2011-05-13 | 2017-11-21 | 三星電子股份有限公司 | Decoding apparatus |
MX340386B (es) * | 2011-06-30 | 2016-07-07 | Samsung Electronics Co Ltd | Apparatus and method for generating bandwidth extension signal |
2012
- 2012-07-02 MX MX2014000161A patent/MX340386B/es active IP Right Grant
- 2012-07-02 EP EP12804615.8A patent/EP2728577A4/en not_active Ceased
- 2012-07-02 MX MX2016008879A patent/MX350162B/es unknown
- 2012-07-02 CN CN201610801708.6A patent/CN106157968B/zh active Active
- 2012-07-02 CN CN201610801479.8A patent/CN106128473B/zh active Active
- 2012-07-02 US US14/130,021 patent/US9349380B2/en active Active
- 2012-07-02 MX MX2017011044A patent/MX370012B/es unknown
- 2012-07-02 WO PCT/KR2012/005258 patent/WO2013002623A2/ko active Application Filing
- 2012-07-02 BR BR122021019883-7A patent/BR122021019883B1/pt active IP Right Grant
- 2012-07-02 TW TW101123831A patent/TWI576832B/zh active
- 2012-07-02 JP JP2014518822A patent/JP6001657B2/ja active Active
- 2012-07-02 TW TW106133069A patent/TWI619116B/zh active
- 2012-07-02 CA CA2840732A patent/CA2840732C/en active Active
- 2012-07-02 AU AU2012276367A patent/AU2012276367B2/en active Active
- 2012-07-02 BR BR112013033900-4A patent/BR112013033900B1/pt active IP Right Grant
- 2012-07-02 KR KR1020120071987A patent/KR102078865B1/ko active IP Right Grant
- 2012-07-02 CN CN201280042439.XA patent/CN103843062B/zh active Active
- 2012-07-02 CA CA2966987A patent/CA2966987C/en active Active
- 2012-07-02 BR BR122021019877-2A patent/BR122021019877B1/pt active IP Right Grant
- 2012-07-02 TW TW106103594A patent/TWI605448B/zh active
2014
- 2014-01-29 ZA ZA2014/00704A patent/ZA201400704B/en unknown
2016
- 2016-04-05 AU AU2016202120A patent/AU2016202120B2/en active Active
- 2016-04-29 US US15/142,949 patent/US9734843B2/en active Active
- 2016-09-01 JP JP2016170949A patent/JP6247358B2/ja active Active
2017
- 2017-04-04 AU AU2017202211A patent/AU2017202211C1/en active Active
- 2017-08-14 US US15/676,209 patent/US10037766B2/en active Active
- 2017-11-16 JP JP2017221260A patent/JP6599419B2/ja active Active
2020
- 2020-02-12 KR KR1020200017008A patent/KR102240271B1/ko active IP Right Grant
- 2020-12-17 KR KR1020200177792A patent/KR102343332B1/ko active IP Right Grant
Non-Patent Citations (1)
Title |
---|
None |
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11676614B2 (en) | 2014-03-03 | 2023-06-13 | Samsung Electronics Co., Ltd. | Method and apparatus for high frequency decoding for bandwidth extension |
US10468035B2 (en) | 2014-03-24 | 2019-11-05 | Samsung Electronics Co., Ltd. | High-band encoding method and device, and high-band decoding method and device |
US11688406B2 (en) | 2014-03-24 | 2023-06-27 | Samsung Electronics Co., Ltd. | High-band encoding method and device, and high-band decoding method and device |
US10909993B2 (en) | 2014-03-24 | 2021-02-02 | Samsung Electronics Co., Ltd. | High-band encoding method and device, and high-band decoding method and device |
WO2015162500A3 (ko) * | 2014-03-24 | 2016-01-28 | 삼성전자 주식회사 | High-band encoding method and device, and high-band decoding method and device |
US10529348B2 (en) | 2014-07-28 | 2020-01-07 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Apparatus and method for generating an enhanced signal using independent noise-filling identified by an identification vector |
US10354663B2 (en) | 2014-07-28 | 2019-07-16 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Apparatus and method for generating an enhanced signal using independent noise-filling |
JP2019194704A (ja) * | 2014-07-28 | 2019-11-07 | フラウンホッファー−ゲゼルシャフト ツァ フェルダールング デァ アンゲヴァンテン フォアシュンク エー.ファオ | Apparatus and method for generating an enhanced signal using independent noise-filling |
KR101958359B1 (ko) | 2014-07-28 | 2019-03-15 | 프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝 에. 베. | Apparatus and method for generating an enhanced signal using independent noise-filling |
US10885924B2 (en) | 2014-07-28 | 2021-01-05 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Apparatus and method for generating an enhanced signal using independent noise-filling |
KR101958360B1 (ko) | 2014-07-28 | 2019-03-15 | 프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝 에. 베. | Apparatus and method for generating an enhanced signal using independent noise-filling |
JP6992024B2 (ja) | 2014-07-28 | 2022-01-13 | フラウンホッファー-ゲゼルシャフト ツァ フェルダールング デァ アンゲヴァンテン フォアシュンク エー.ファオ | Apparatus and method for generating an enhanced signal using independent noise-filling |
US11264042B2 (en) | 2014-07-28 | 2022-03-01 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Apparatus and method for generating an enhanced signal using independent noise-filling information which comprises energy information and is included in an input signal |
KR20170063534A (ko) * | 2014-07-28 | 2017-06-08 | 프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝 에. 베. | Apparatus and method for generating an enhanced signal using independent noise-filling |
KR20170024048A (ko) * | 2014-07-28 | 2017-03-06 | 프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝 에. 베. | Apparatus and method for generating an enhanced signal using independent noise-filling |
US11705145B2 (en) | 2014-07-28 | 2023-07-18 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Apparatus and method for generating an enhanced signal using independent noise-filling |
US11908484B2 (en) | 2014-07-28 | 2024-02-20 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Apparatus and method for generating an enhanced signal using independent noise-filling at random values and scaling thereupon |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
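WO2013002623A2 (ko) | Apparatus and method for generating bandwidth extension signal | |
WO2013141638A1 (ko) | High-frequency encoding/decoding method and apparatus for bandwidth extension | |
RU2483364C2 (ru) | Audio encoding/decoding scheme with switchable bypass | |
JP5328368B2 (ja) | Encoding device, decoding device, and methods thereof | |
JP5869537B2 (ja) | Bandwidth extension decoding method | |
JP5140730B2 (ja) | Low-complexity spectral analysis/synthesis using switchable time resolution | |
KR101244310B1 (ko) | Wideband encoding and decoding method and apparatus | |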
WO2012036487A2 (en) | Apparatus and method for encoding and decoding signal for high frequency bandwidth extension | |
EP2630641A2 (en) | Apparatus and method for determining weighting function having low complexity for linear predictive coding (lpc) coefficients quantization | |
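WO2013058635A2 (ko) | Frame error concealment method and apparatus, and audio decoding method and apparatus | |
SE521129C2 (sv) | Method and device for audio coding | |
KR20140050002A (ko) | Method and apparatus for decoding a high-frequency signal | |
WO2016024853A1 (ko) | Method and apparatus for improving sound quality, speech decoding method and apparatus, and multimedia device employing the same | |
WO2015108358A1 (ko) | Apparatus and method for determining a weighting function for quantizing linear predictive coding coefficients | |
KR20120074314A (ko) | Signal processing method and apparatus therefor | |
WO2015170899A1 (ko) | Method and apparatus for quantizing linear prediction coefficients, and method and apparatus for inverse quantization | |
KR20080053739A (ko) | Encoding apparatus and method that adaptively applies a window size | |
JP5863765B2 (ja) | Encoding method and apparatus, and decoding method and apparatus | |
WO2015065137A1 (ko) | Method and apparatus for generating a wideband signal, and device employing the same | |
WO2010134757A2 (ko) | Method and apparatus for encoding and decoding an audio signal using hierarchical sinusoidal pulse coding | |
WO2015037969A1 (ko) | Signal encoding method and apparatus, and signal decoding method and apparatus | |
WO2015122752A1 (ko) | Signal encoding method and apparatus, and signal decoding method and apparatus | |
WO2015133795A1 (ko) | High-frequency decoding method and apparatus for bandwidth extension | |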
Lei et al. | Digital synthesis of Mandarin speech using its special characteristics | |
WO2011114192A1 (en) | Method and apparatus for audio coding |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 12804615; Country of ref document: EP; Kind code of ref document: A2 |
 | ENP | Entry into the national phase | Ref document number: 2840732; Country of ref document: CA |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | ENP | Entry into the national phase | Ref document number: 2014518822; Country of ref document: JP; Kind code of ref document: A |
 | WWE | Wipo information: entry into national phase | Ref document number: MX/A/2014/000161; Country of ref document: MX |
 | WWE | Wipo information: entry into national phase | Ref document number: 2012804615; Country of ref document: EP |
 | ENP | Entry into the national phase | Ref document number: 2012276367; Country of ref document: AU; Date of ref document: 20120702; Kind code of ref document: A |
 | WWE | Wipo information: entry into national phase | Ref document number: 14130021; Country of ref document: US |
 | REG | Reference to national code | Ref country code: BR; Ref legal event code: B01A; Ref document number: 112013033900; Country of ref document: BR |
 | REG | Reference to national code | Ref country code: BR; Ref legal event code: B01E; Ref document number: 112013033900; Country of ref document: BR |
 | ENP | Entry into the national phase | Ref document number: 112013033900; Country of ref document: BR; Kind code of ref document: A2; Effective date: 20131230 |