WO2014091694A1 - Speech/acoustic encoding device, speech/acoustic decoding device, speech/acoustic encoding method, and speech/acoustic decoding method - Google Patents
Speech/acoustic encoding device, speech/acoustic decoding device, speech/acoustic encoding method, and speech/acoustic decoding method
- Publication number
- WO2014091694A1 (PCT/JP2013/006948)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- group
- energy
- bits
- envelope
- subband
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/032—Quantisation or dequantisation of spectral components
- G10L19/035—Scalar quantisation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/0204—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using subband decomposition
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
Definitions
- the present invention relates to a speech / acoustic encoding apparatus, a speech / acoustic decoding apparatus, a speech / acoustic encoding method, and a speech / acoustic decoding method using a transform encoding method.
- ITU-T International Telecommunication Union Telecommunication Standardization Sector
- transform coding (conversion encoding)
- Transform coding is an encoding method that converts an input signal from the time domain to the frequency domain using a time-frequency transform such as the discrete cosine transform (DCT) or the modified discrete cosine transform (MDCT), so that the signal can be represented in a form that maps accurately onto auditory characteristics.
- DCT discrete cosine transform
- MDCT Modified Discrete Cosine Transform
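- A minimal sketch of such a transform front end is shown below. It uses a plain frame-wise DCT (via SciPy) for brevity; the frame length and the absence of overlapping MDCT windows are simplifying assumptions, not the actual transform used in the codec described here.

```python
import numpy as np
from scipy.fft import dct

def to_spectral_frames(signal, frame_len=960):
    """Split a time-domain signal into frames and transform each frame to
    spectral coefficients (48 kHz input, 20 ms frames -> 960 samples)."""
    n_frames = len(signal) // frame_len
    frames = np.reshape(signal[:n_frames * frame_len], (n_frames, frame_len))
    # Type-II DCT per frame as a stand-in for the MDCT used in practice.
    return dct(frames, type=2, norm="ortho", axis=1)

# Example: 1 second of a 1 kHz tone sampled at 48 kHz.
t = np.arange(48000) / 48000.0
spectra = to_spectral_frames(np.sin(2 * np.pi * 1000 * t))
print(spectra.shape)  # (50, 960)
```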
- In transform coding, the spectral coefficients are divided into a plurality of frequency subbands.
- The overall sound quality can be improved by assigning more quantization bits to bands that are perceptually important to the human ear.
- As such a bit allocation method, the technique disclosed in Non-Patent Document 1 is known.
- The bit allocation method disclosed in Non-Patent Document 1 will be described with reference to FIGS. 1 and 2.
- FIG. 1 is a block diagram showing the configuration of the speech/acoustic encoding apparatus disclosed in Non-Patent Document 1.
- An input signal sampled at 48 kHz is input to the transient detector 11 and the conversion unit 12 of the speech acoustic coding apparatus.
- The transient detector 11 detects, from the input signal, either a transient frame corresponding to the beginning or end of speech, or a stationary frame corresponding to other speech sections. Depending on whether the frame detected by the transient detector 11 is a transient frame or a stationary frame, the conversion unit 12 applies a high-frequency-resolution or low-frequency-resolution transform to the frame of the input signal to obtain spectral coefficients (or transform coefficients).
- the norm estimation unit 13 divides the spectrum coefficient obtained by the conversion unit 12 into bands having different bandwidths. Further, the norm estimation unit 13 estimates the norm (or energy) of each divided band.
- Based on the norm of each band estimated by the norm estimation unit 13, the norm quantization unit 14 obtains a spectral envelope composed of the norms of all the bands and quantizes the obtained spectral envelope.
- the spectrum normalization unit 15 normalizes the spectrum coefficient obtained by the conversion unit 12 with the norm quantized by the norm quantization unit 14.
- the norm adjustment unit 16 adjusts the norm quantized by the norm quantization unit 14 based on adaptive spectrum weighting.
- the bit allocation unit 17 allocates usable bits for each band in the frame using the quantization norm adjusted by the norm adjustment unit 16.
- the lattice vector encoding unit 18 performs lattice vector encoding on the spectrum coefficient normalized by the spectrum normalization unit 15 with the bits allocated for each band by the bit allocation unit 17.
- the noise level adjustment unit 19 estimates the level of the spectrum coefficient before encoding in the lattice vector encoding unit 18 and encodes the estimated level. Thereby, the noise level adjustment index is obtained.
- The multiplexer 20 multiplexes the frame configuration of the input signal obtained by the conversion unit 12, that is, a transient signal flag indicating whether the frame is a stationary frame or a transient frame, the norm quantized by the norm quantization unit 14, the lattice code vector obtained by the lattice vector encoding unit 18, and the noise level adjustment index obtained by the noise level adjustment unit 19, to form a bit stream, and transmits the bit stream to the speech/acoustic decoding apparatus.
- FIG. 2 is a block diagram showing the configuration of the speech/acoustic decoding apparatus disclosed in Non-Patent Document 1.
- The bit stream transmitted from the speech/acoustic encoding apparatus is received by the speech/acoustic decoding apparatus and demultiplexed by the demultiplexer 21.
- The norm inverse quantization unit 22 inversely quantizes the quantized norms to obtain a spectral envelope composed of the norms of all bands, and the norm adjustment unit 23 adjusts the norms dequantized by the norm inverse quantization unit 22 based on adaptive spectral weighting.
- the bit allocation unit 24 allocates usable bits for each band in the frame using the norm adjusted by the norm adjustment unit 23. That is, the bit allocation unit 24 recalculates the bit allocation that is essential for decoding the lattice vector code of the normalized spectral coefficient.
- The lattice decoding unit 25 decodes the transient signal flag, and decodes the lattice code vector based on the frame configuration indicated by the decoded transient signal flag and the bits allocated by the bit allocation unit 24, thereby obtaining the spectral coefficients.
- the spectrum fill generator 26 regenerates low-frequency spectral coefficients to which no bits have been allocated, using a codebook created based on the spectral coefficients decoded by the lattice decoding unit 25.
- the spectral fill generator 26 adjusts the level of the regenerated spectral coefficient using the noise level adjustment index.
- the spectral fill generator 26 regenerates the high frequency uncoded spectral coefficients using the low frequency encoded spectral coefficients.
- the adder 27 combines the decoded spectral coefficient and the regenerated spectral coefficient to generate a normalized spectral coefficient.
- The envelope shaping unit 28 applies the spectral envelope dequantized by the norm inverse quantization unit 22 to the normalized spectral coefficients generated by the adder 27 to generate full-band spectral coefficients.
- the inverse transform unit 29 applies an inverse transform such as an inverse modified discrete cosine transform (IMDCT) to the full band spectrum coefficient generated by the envelope shaping unit 28 to convert it into a time domain signal.
- IMDCT inverse modified discrete cosine transform
- inverse transformation with high frequency resolution is applied in the case of a steady frame
- inverse transformation with low frequency resolution is applied in the case of a transient frame.
- In the stationary mode, the spectral coefficients are divided into spectral groups. Each spectral group is divided into sub-vector bands of equal length, as shown in FIG. 3. The sub-vector length differs between groups and increases with increasing frequency. As for the transform resolution, a higher frequency resolution is used at low frequencies and a lower frequency resolution is used at high frequencies. As described in G.719, this grouping allows efficient use of the bit budget available during encoding.
- The bit allocation method is the same in the encoding apparatus and the decoding apparatus. The bit allocation method will be described here with reference to FIG. 4.
- In step (hereinafter abbreviated as "ST") 31, the quantized norms are adjusted before bit allocation in order to account for psychoacoustic weighting and masking effects.
- In ST32, the subband having the maximum norm among all subbands is identified, and in ST33, one bit is assigned to each spectral coefficient in the subband having the maximum norm. That is, as many bits as the number of spectral coefficients are allocated.
- In ST34, the norm of that subband is decreased according to the allocated bits, and in ST35, it is determined whether or not the number of remaining allocatable bits is 8 or more. If the number of remaining allocatable bits is 8 or more, the process returns to ST32; if it is less than 8, the bit allocation procedure is terminated.
- In this way, the bit allocation method allocates the usable bits in a frame among subbands using the adjusted quantized norms. The normalized spectral coefficients are then encoded by lattice vector encoding with the bits assigned to each subband.
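- A minimal sketch of the greedy allocation loop described above (ST31-ST35) is shown below. The subband widths, the norm-decrement step, and the example values are illustrative assumptions, not values taken from the recommendation.

```python
def greedy_bit_allocation(norms, subband_widths, total_bits, norm_step=2.0):
    """norms: adjusted quantized norm per subband (higher = more important).
    subband_widths: number of spectral coefficients in each subband."""
    norms = list(norms)
    bits = [0] * len(norms)
    remaining = total_bits
    while remaining >= 8:                                   # ST35: stop below 8 bits
        k = max(range(len(norms)), key=lambda i: norms[i])  # ST32: subband with max norm
        cost = subband_widths[k]                            # ST33: 1 bit per coefficient
        if cost > remaining:                                # safeguard for wide subbands
            break
        bits[k] += cost
        remaining -= cost
        norms[k] -= norm_step                               # ST34: lower the norm
    return bits

# Example: 4 subbands of 8 coefficients each, 100 bits available.
print(greedy_bit_allocation([10.0, 6.0, 8.0, 3.0], [8, 8, 8, 8], 100))
```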
- However, this bit allocation method has a problem: since the input signal characteristics are not taken into account when the spectral bands are grouped, efficient bit allocation cannot be performed and further improvement in sound quality cannot be expected.
- An object of the present invention is to provide a speech / acoustic encoding device, a speech / acoustic decoding device, a speech / acoustic encoding method, and a speech / acoustic decoding method that perform efficient bit allocation and improve sound quality.
- A speech/acoustic encoding apparatus of the present invention includes: a conversion unit that converts an input signal from the time domain to the frequency domain; an estimation unit that estimates an energy envelope representing an energy level for each of a plurality of subbands obtained by dividing the frequency spectrum of the input signal; a quantization unit that quantizes the energy envelope; a group determination unit that groups the quantized energy envelope into a plurality of groups; a first bit allocation unit that allocates bits to the plurality of groups; a second bit allocation unit that allocates the bits allocated to the plurality of groups to subbands for each group; and an encoding unit that encodes the frequency spectrum using the bits allocated to the subbands.
- A speech/acoustic decoding apparatus of the present invention includes: an inverse quantization unit that inversely quantizes a quantized spectral envelope; a group determination unit that groups the quantized spectral envelope into a plurality of groups; a first bit allocation unit that allocates bits to the plurality of groups; a second bit allocation unit that allocates the bits allocated to the plurality of groups to subbands for each group; a decoding unit that decodes the frequency spectrum of a speech/acoustic signal using the bits allocated to the subbands; an envelope shaping unit that applies the inversely quantized spectral envelope to the decoded frequency spectrum to reproduce a decoded spectrum; and an inverse conversion unit that inversely converts the decoded spectrum from the frequency domain to the time domain.
- The speech/acoustic encoding method of the present invention converts an input signal from the time domain to the frequency domain, estimates an energy envelope representing an energy level for each of a plurality of subbands obtained by dividing the frequency spectrum of the input signal, quantizes the energy envelope, groups the quantized energy envelope into a plurality of groups, allocates bits to the plurality of groups, allocates the bits allocated to the plurality of groups to subbands for each group, and encodes the frequency spectrum using the bits allocated to the subbands.
- The speech/acoustic decoding method of the present invention inversely quantizes a quantized spectral envelope, groups the quantized spectral envelope into a plurality of groups, allocates bits to the plurality of groups, allocates the bits allocated to the plurality of groups to subbands for each group, decodes the frequency spectrum of a speech/acoustic signal using the bits allocated to the subbands, applies the inversely quantized spectral envelope to the decoded frequency spectrum to reproduce a decoded spectrum, and inversely transforms the decoded spectrum from the frequency domain to the time domain.
- According to the present invention, efficient bit allocation can be performed and sound quality can be improved.
- FIG. 1 is a block diagram showing the configuration of the speech/acoustic encoding apparatus disclosed in Non-Patent Document 1.
- FIG. 2 is a block diagram showing the configuration of the speech/acoustic decoding apparatus disclosed in Non-Patent Document 1.
- FIG. 3 is a diagram showing the grouping of spectral coefficients in the stationary mode disclosed in Non-Patent Document 1.
- FIG. 4 is a flow chart showing the bit allocation method disclosed in Non-Patent Document 1.
- FIG. 5 is a block diagram showing the configuration of a speech/acoustic encoding apparatus according to an embodiment of the present invention.
- FIG. 6 is a block diagram showing the configuration of a speech/acoustic decoding apparatus according to an embodiment of the present invention.
- FIG. 7 is a block diagram showing the internal configuration of the bit allocation unit shown in FIG. 5.
- FIG. 5 is a block diagram showing a configuration of speech acoustic coding apparatus 100 according to an embodiment of the present invention.
- An input signal sampled at 48 kHz is input to the transient detector 101 and the conversion unit 102 of the speech acoustic coding apparatus 100.
- The transient detector 101 detects, from the input signal, either a transient frame corresponding to the beginning or end of speech, or a stationary frame corresponding to other speech sections, and outputs the detection result to the conversion unit 102.
- The conversion unit 102 applies a high-frequency-resolution or low-frequency-resolution transform to the frame of the input signal according to whether the detection result output from the transient detector 101 indicates a transient frame or a stationary frame, and outputs the resulting spectral coefficients (or transform coefficients) to the norm estimation unit 103 and the spectrum normalization unit 105.
- the conversion unit 102 also outputs to the multiplexer 110 a frame configuration that is a detection result output from the transient detector 101, that is, a transient signal flag indicating whether the frame is a steady frame or a transient frame.
- the norm estimation unit 103 divides the spectrum coefficient output from the conversion unit 102 into bands having different bandwidths, and estimates the norm (or energy) of each divided band.
- the norm estimation unit 103 outputs the estimated norm of each band to the norm quantization unit 104.
- Based on the norm of each band output from the norm estimation unit 103, the norm quantization unit 104 obtains a spectral envelope composed of the norms of all bands, quantizes the obtained spectral envelope, and outputs the quantized spectral envelope to the spectrum normalization unit 105 and the norm adjustment unit 106.
- The spectrum normalization unit 105 normalizes the spectral coefficients output from the conversion unit 102 with the quantized spectral envelope output from the norm quantization unit 104, and outputs the normalized spectral coefficients to the lattice vector encoding unit 108.
- the norm adjustment unit 106 adjusts the quantized spectrum envelope output from the norm quantization unit 104 based on adaptive spectrum weighting, and outputs the adjusted quantized spectrum envelope to the bit allocation unit 107.
- The bit allocation unit 107 allocates the usable bits in the frame to each band using the adjusted quantized spectral envelope output from the norm adjustment unit 106, and outputs the allocated bits to the lattice vector encoding unit 108. Details of the bit allocation unit 107 will be described later.
- The lattice vector encoding unit 108 performs lattice vector encoding on the spectral coefficients normalized by the spectrum normalization unit 105 with the bits allocated to each band by the bit allocation unit 107, and outputs the lattice code vector to the noise level adjustment unit 109 and the multiplexer 110.
- the noise level adjustment unit 109 estimates the level of the spectrum coefficient before encoding in the lattice vector encoding unit 108, and encodes the estimated level. Thereby, the noise level adjustment index is obtained.
- the noise level adjustment index is output to the multiplexer 110.
- The multiplexer 110 multiplexes the transient signal flag output from the conversion unit 102, the quantized spectral envelope output from the norm quantization unit 104, the lattice code vector output from the lattice vector encoding unit 108, and the noise level adjustment index output from the noise level adjustment unit 109, to form a bit stream, and transmits the bit stream to the speech/acoustic decoding apparatus.
- FIG. 6 is a block diagram showing the configuration of the audio-acoustic decoding apparatus 200 according to an embodiment of the present invention.
- the bit stream transmitted from the speech acoustic encoding apparatus 100 is received by the speech acoustic decoding apparatus 200 and demultiplexed by the demultiplexer 201.
- The norm inverse quantization unit 202 inversely quantizes the quantized spectral envelope (that is, the norms) output from the demultiplexer 201 to obtain a spectral envelope composed of the norms of all bands, and outputs the obtained spectral envelope to the norm adjustment unit 203.
- the norm adjustment unit 203 adjusts the spectrum envelope output from the norm inverse quantization unit 202 based on the adaptive spectrum weighting, and outputs the adjusted spectrum envelope to the bit allocation unit 204.
- The bit allocation unit 204 allocates the usable bits in the frame to each band using the spectral envelope output from the norm adjustment unit 203. That is, the bit allocation unit 204 recalculates the bit allocation that is essential for decoding the lattice vector code of the normalized spectral coefficients.
- the allocated bits are output to the lattice decoding unit 205.
- The lattice decoding unit 205 decodes the lattice code vector output from the demultiplexer 201 based on the frame configuration indicated by the transient signal flag output from the demultiplexer 201 and the bits output from the bit allocation unit 204, and obtains the spectral coefficients.
- the spectral coefficient is output to the spectral fill generator 206 and the adder 207.
- the spectrum fill generator 206 regenerates the low-frequency spectral coefficients to which no bits have been allocated, using a codebook created based on the spectral coefficients output from the lattice decoding unit 205. In addition, the spectrum fill generator 206 adjusts the level of the regenerated spectrum coefficient using the noise level adjustment index output from the demultiplexer 201. In addition, the spectral fill generator 206 regenerates the high frequency uncoded spectral coefficients using the low frequency encoded spectral coefficients. The low-frequency spectral coefficient whose level is adjusted and the regenerated high-frequency spectral coefficient are output to the adder 207.
- The adder 207 combines the spectral coefficients output from the lattice decoding unit 205 and the spectral coefficients output from the spectrum fill generator 206 to generate normalized spectral coefficients, and outputs the normalized spectral coefficients to the envelope shaping unit 208.
- The envelope shaping unit 208 applies the spectral envelope output from the norm inverse quantization unit 202 to the normalized spectral coefficients generated by the adder 207 to generate full-band spectral coefficients (corresponding to a decoded spectrum).
- the generated full band spectral coefficient is output to the inverse transform unit 209.
- The inverse transform unit 209 applies an inverse transform such as the inverse modified discrete cosine transform (IMDCT) to the full-band spectral coefficients output from the envelope shaping unit 208 to convert them into a time-domain signal, and outputs the output signal.
- IMDCT inverse modified discrete cosine transform
- inverse transformation with high frequency resolution is applied in the case of a steady frame
- inverse transformation with low frequency resolution is applied in the case of a transient frame.
- Details of the bit allocation unit 107 described above will now be given with reference to FIG. 7. Since the bit allocation unit 107 of the speech/acoustic encoding apparatus 100 and the bit allocation unit 204 of the speech/acoustic decoding apparatus 200 have the same configuration, only the bit allocation unit 107 is described here, and a description of the bit allocation unit 204 is omitted.
- FIG. 7 is a block diagram showing an internal configuration of the bit allocation unit 107 shown in FIG.
- Based on the quantized spectral envelope output from the norm adjustment unit 106, the dominant frequency band identification unit 301 identifies dominant frequency bands, which are subbands in which the norm coefficient value in the spectrum has a local maximum, and outputs the identified dominant frequency bands to the dominant group determination units 302-1 to 302-N, respectively.
- The band having the maximum norm coefficient value among all subbands may be set as the dominant frequency band, or a band whose norm coefficient value exceeds a predetermined threshold, or a threshold calculated from the norms of all subbands, may be regarded as a dominant frequency band.
- The dominant group determination units 302-1 to 302-N adaptively determine the group width according to the input signal characteristics, centering on the dominant frequency band output from the dominant frequency band identification unit 301. Specifically, the group width is defined so that the group extends on both sides of the dominant frequency band until the downward slope of the norm coefficient values stops.
- The dominant group determination units 302-1 to 302-N determine the frequency bands included in the group width as a dominant group, and output the determined dominant group to the non-dominant group determination unit 303. When the dominant frequency band is at an edge (an end of the usable frequency range), only the downward slope on one side is included in the group.
- The non-dominant group determination unit 303 determines consecutive subbands other than the dominant groups output from the dominant group determination units 302-1 to 302-N as non-dominant groups, which contain no dominant frequency band.
- The non-dominant group determination unit 303 outputs the dominant groups and the non-dominant groups to the group energy calculation unit 304 and the norm variance calculation unit 306.
- The group energy calculation unit 304 calculates the energy of each group for the dominant groups and non-dominant groups output from the non-dominant group determination unit 303, and outputs the calculated energy to the total energy calculation unit 305 and the group bit distribution unit 308.
- The energy of each group is calculated by the following equation (1).
- the total energy calculation unit 305 adds all the energy for each group output from the group energy calculation unit 304 and calculates the total energy of all the groups.
- the calculated total energy is output to the group bit distribution unit 308.
- the total energy is calculated by the following equation (2).
- Energy total is the total energy of all groups
- N is the total number of groups in the spectrum
- k is the index of the group
- Energy (G (k)) is the energy of group k.
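- The bodies of equations (1) and (2) are not reproduced in this text. A form consistent with the variable definitions above (an assumption, not a verbatim reproduction of the patent's equations) is:

$$\text{Energy}(G(k)) = \sum_{i \in G(k)} \text{Norm}(i), \qquad \text{Energy}_{\text{total}} = \sum_{k=1}^{N} \text{Energy}(G(k))$$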
- The norm variance calculation unit 306 calculates the norm variance of each group for the dominant groups and non-dominant groups output from the non-dominant group determination unit 303, and outputs the calculated norm variances to the total norm variance calculation unit 307 and the group bit distribution unit 308.
- The norm variance of each group is calculated by the following equation (3).
- k is the group index
- Norm var (G (k)) is the norm variance of group k
- Norm max (G (k)) is the maximum norm coefficient value of group k
- Norm min (G (k)) represents the minimum norm coefficient value of group k.
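- The body of equation (3) is not reproduced in this text. Given that the norm variance is later described as the difference between peak energy and valley energy, a plausible form (an assumption) is:

$$\text{Norm}_{\text{var}}(G(k)) = \text{Norm}_{\max}(G(k)) - \text{Norm}_{\min}(G(k))$$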
- the total norm variance calculation unit 307 calculates the total norm variance of all groups based on the norm variance for each group output from the norm variance calculation unit 306.
- the calculated total norm variance is output to group bit distribution section 308.
- the total norm variance is calculated by the following equation (4).
- Norm total represents the total norm variance of all groups
- N represents the total number of groups in the spectrum
- k represents the group index
- Norm var (G (k)) represents the norm variance of group k.
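- The body of equation (4) is not reproduced in this text. A form consistent with the variable definitions above (an assumption) is:

$$\text{Norm}_{\text{total}} = \sum_{k=1}^{N} \text{Norm}_{\text{var}}(G(k))$$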
- The group bit distribution unit 308 (corresponding to the first bit allocation means) performs bit allocation for each group based on the energy of each group output from the group energy calculation unit 304, the total energy of all groups output from the total energy calculation unit 305, the norm variance of each group output from the norm variance calculation unit 306, and the total norm variance of all groups output from the total norm variance calculation unit 307, and outputs the bits allocated to each group to the subband bit distribution unit 309. The bits allocated to each group are calculated by the following equation (5).
- k is the index of the group
- Bits (G (k)) is the number of bits allocated to group k
- Bits total is the number of all available bits
- scale1 is the percentage of bits allocated by energy
- Energy ( G (k)) represents the energy of group k
- Energy total represents the total energy of all groups
- Normvar (G (k)) represents the norm variance of group k.
- scale1 takes a value in the range [0, 1] and adjusts the ratio between bits allocated according to energy and bits allocated according to norm variance.
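- The body of equation (5) is not reproduced in this text. A plausible form consistent with the description of scale1 as a weighting between the energy-based and variance-based shares (an assumption, not a verbatim reproduction) is:

$$\text{Bits}(G(k)) = \text{Bits}_{\text{total}} \left[ \text{scale1} \cdot \frac{\text{Energy}(G(k))}{\text{Energy}_{\text{total}}} + (1 - \text{scale1}) \cdot \frac{\text{Norm}_{\text{var}}(G(k))}{\text{Norm}_{\text{total}}} \right]$$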
- Since the group bit distribution unit 308 allocates bits for each group, more bits can be allocated to dominant groups and fewer bits to non-dominant groups.
- The perceptual importance of a group is determined by its energy and norm variance, so that dominant groups can be emphasized more.
- Norm variance is consistent with masking theory and can be used to more accurately determine perceptual importance.
- The subband bit distribution unit 309 (corresponding to the second bit allocation means) allocates bits to the subbands in each group based on the bits of each group output from the group bit distribution unit 308, and outputs the bits allocated to the subbands of each group to the lattice vector encoding unit 108 as the bit allocation result.
- perceptually more important subbands are allocated more bits, and perceptually less important subbands are allocated fewer bits.
- the bit allocated to each subband in the group is calculated by the following equation (6).
- Bits G (k) sb (i) is the bit assigned to subband i of group k
- i is the subband index of group k
- Bits (G (k)) is the bit assigned to group k
- Energy (G (k)) represents the energy of group k
- Norm (i) represents the norm coefficient value of subband i of group k.
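- The body of equation (6) is not reproduced in this text. Given the later description that bits within a group are distributed according to the ratio of the subband norm to the group energy, a plausible form (an assumption) is:

$$\text{Bits}_{G(k)}^{sb}(i) = \text{Bits}(G(k)) \cdot \frac{\text{Norm}(i)}{\text{Energy}(G(k))}$$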
- The dominant frequency band identification unit 301 identifies the dominant frequency bands 9 and 20 based on the input quantized spectral envelope (see FIG. 8(b)).
- A dominant group is determined to extend on both sides of each of the dominant frequency bands 9 and 20 until the descending slope of the norm coefficient values stops.
- Subbands 6 to 12 are determined as a dominant group (group 2), and subbands 17 to 22 are determined as a dominant group (group 4) (see FIG. 8(c)).
- In the non-dominant group determination unit 303, consecutive frequency bands other than the dominant groups are determined as non-dominant groups containing no dominant frequency band.
- Subbands 1 to 5 (group 1), subbands 13 to 16 (group 3), and subbands 23 to 25 (group 5) are each determined as non-dominant groups (see FIG. 8(c)).
- the quantized spectral envelopes are grouped into five groups: two dominant groups (groups 2, 4) and three non-dominant groups (groups 1, 3, 5).
- Such a grouping method can adaptively determine the group width according to the input signal characteristics. Moreover, since this method uses quantized norm coefficients that are also available in the speech/acoustic decoding apparatus, it is not necessary to transmit additional information to the speech/acoustic decoding apparatus.
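- A minimal sketch of this adaptive grouping is shown below; the simple local-maximum test and the strict down-slope criterion are illustrative assumptions about details the text leaves open.

```python
def group_subbands(norm):
    """norm: quantized norm coefficient per subband.
    Returns a sorted list of ("dominant" | "non-dominant", start, end) ranges."""
    n = len(norm)
    # Dominant bands: local maxima of the quantized norm coefficients.
    dominant = [i for i in range(n)
                if (i == 0 or norm[i] > norm[i - 1])
                and (i == n - 1 or norm[i] > norm[i + 1])]
    groups, covered = [], set()
    for d in dominant:
        lo = d
        while lo > 0 and norm[lo - 1] < norm[lo] and lo - 1 not in covered:
            lo -= 1                                    # follow the left down slope
        hi = d
        while hi < n - 1 and norm[hi + 1] < norm[hi] and hi + 1 not in covered:
            hi += 1                                    # follow the right down slope
        groups.append(("dominant", lo, hi))
        covered.update(range(lo, hi + 1))
    # Remaining consecutive subbands form non-dominant groups.
    i = 0
    while i < n:
        if i in covered:
            i += 1
            continue
        j = i
        while j + 1 < n and j + 1 not in covered:
            j += 1
        groups.append(("non-dominant", i, j))
        i = j + 1
    return sorted(groups, key=lambda g: g[1])
```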
- the norm variance calculation unit 306 calculates a norm variance for each group.
- The norm variance Norm var(G(2)) of group 2 in the example of FIG. 8 is shown in the figure.
- A peak is composed of spectral components (dominant sound components) located at the dominant frequencies of a speech/acoustic signal. Peaks are perceptually very important.
- The perceptual importance of a peak can be determined by the difference between the peak energy and the valley energy, that is, by the norm variance. Theoretically, if a peak has sufficient energy compared with adjacent frequency bands, the peak should be encoded with a sufficient number of bits; if it is encoded with an insufficient number of bits, the resulting coding noise becomes prominent and the sound quality deteriorates.
- A valley is not composed of dominant sound components of the speech/acoustic signal and is not perceptually important.
- The dominant frequency bands correspond to the peaks of the spectrum, and grouping the frequency bands separates them into peaks (dominant groups containing a dominant frequency band) and valleys (non-dominant groups containing no dominant frequency band).
- The group bit distribution unit 308 determines the perceptual importance of the peaks.
- In G.719, the perceptual importance is determined only by energy.
- In the present embodiment, the perceptual importance is determined by both the energy and the norm (energy) variance, and the bits to be allocated to each group are determined based on the perceptual importance thus determined.
- If the norm variance within a group is large, this means that the group is one of the peaks; the peak is perceptually more important, and the norm coefficient having the maximum value should be encoded accurately. Therefore, more bits are allocated to the peak subbands.
- If the norm variance within a group is very small, this means that the group is one of the valleys; the valleys are not perceptually important and need not be encoded as accurately. For this reason, fewer bits are allocated to each subband of this group.
- As described above, in the present embodiment, the dominant frequency bands in which the norm coefficient value in the spectrum of the input speech/acoustic signal has a local maximum are identified, and all subbands are grouped into dominant groups that contain a dominant frequency band and non-dominant groups that contain no dominant frequency band.
- Bits are allocated to each group based on the energy and norm variance of each group, and the bits allocated to each group are further allocated to each subband according to the ratio of the subband norm to the group energy. As a result, many bits can be allocated to perceptually important groups and subbands, efficient bit allocation can be performed, and the sound quality can be improved.
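- A minimal end-to-end sketch of this two-stage allocation is shown below, using the hedged equation forms reconstructed above (the exact formulas in the patent figures may differ). `groups` is assumed to be a list of (start, end) subband index ranges, such as those produced by the grouping sketch earlier, and `norm` is the quantized norm coefficient per subband.

```python
def allocate_bits(norm, groups, bits_total, scale1=0.5):
    """Two-stage allocation: bits per group, then bits per subband."""
    g_energy = [sum(norm[lo:hi + 1]) for lo, hi in groups]           # eq. (1), assumed form
    g_var = [max(norm[lo:hi + 1]) - min(norm[lo:hi + 1])             # eq. (3), assumed form
             for lo, hi in groups]
    e_total = sum(g_energy) or 1.0                                   # eq. (2)
    v_total = sum(g_var) or 1.0                                      # eq. (4)
    sb_bits = [0.0] * len(norm)
    for (lo, hi), e, v in zip(groups, g_energy, g_var):
        g_bits = bits_total * (scale1 * e / e_total
                               + (1.0 - scale1) * v / v_total)       # eq. (5), assumed form
        for i in range(lo, hi + 1):
            sb_bits[i] = g_bits * norm[i] / (e or 1.0)               # eq. (6), assumed form
    return sb_bits
```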
- The norm coefficient in this embodiment represents subband energy and is also called an energy envelope.
- The speech/acoustic encoding apparatus, speech/acoustic decoding apparatus, speech/acoustic encoding method, and speech/acoustic decoding method according to the present invention can be applied to a radio communication terminal apparatus, a radio communication base station apparatus, a teleconference terminal apparatus, a video conference terminal apparatus, a voice over Internet protocol (VoIP) terminal apparatus, and the like.
- VoIP voice over internet protocol
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
Description
FIG. 5 is a block diagram showing the configuration of speech/acoustic encoding apparatus 100 according to an embodiment of the present invention. An input signal sampled at 48 kHz is input to the transient detector 101 and the conversion unit 102 of speech/acoustic encoding apparatus 100.
102 Conversion unit
103 Norm estimation unit
104 Norm quantization unit
105 Spectrum normalization unit
106, 203 Norm adjustment unit
107, 204 Bit allocation unit
108 Lattice vector encoding unit
109 Noise level adjustment unit
110 Multiplexer
201 Demultiplexer
202 Norm inverse quantization unit
205 Lattice decoding unit
206 Spectrum fill generator
207 Adder
208 Envelope shaping unit
209 Inverse transform unit
301 Dominant frequency band identification unit
302-1 to 302-N Dominant group determination units
303 Non-dominant group determination unit
304 Group energy calculation unit
305 Total energy calculation unit
306 Norm variance calculation unit
307 Total norm variance calculation unit
308 Group bit distribution unit
309 Subband bit distribution unit
Claims (10)
- A speech/acoustic encoding apparatus comprising: conversion means for converting an input signal from the time domain to the frequency domain; estimation means for estimating, for each of a plurality of subbands obtained by dividing the frequency spectrum of the input signal, an energy envelope representing an energy level; quantization means for quantizing the energy envelope; group determination means for grouping the quantized energy envelope into a plurality of groups; first bit allocation means for allocating bits to the plurality of groups; second bit allocation means for allocating the bits allocated to the plurality of groups to subbands for each group; and encoding means for encoding the frequency spectrum using the bits allocated to the subbands.
- The speech/acoustic encoding apparatus according to claim 1, further comprising dominant frequency band identification means for identifying, in the frequency spectrum, a dominant frequency band that is a subband in which the energy envelope has a local maximum value, wherein the group determination means determines, as a dominant group, the dominant frequency band and the subbands forming the downward slopes of the energy envelope on both sides of the dominant frequency band, and determines consecutive subbands other than the dominant frequency band as a non-dominant group.
- The speech/acoustic encoding apparatus according to claim 1, further comprising: energy calculation means for calculating energy for each group; and variance calculation means for calculating an energy envelope variance for each group, wherein, based on the calculated energy for each group and the calculated energy envelope variance for each group, the first bit allocation means allocates more bits to a group as at least one of the energy and the energy envelope variance is larger, and allocates fewer bits to a group as at least one of the energy and the energy envelope variance is smaller.
- The speech/acoustic encoding apparatus according to claim 1, wherein the second bit allocation means allocates more bits to a subband as the energy envelope of the subband is larger, and allocates fewer bits to a subband as the energy envelope of the subband is smaller.
- A speech/acoustic decoding apparatus comprising: inverse quantization means for inversely quantizing a quantized spectral envelope; group determination means for grouping the quantized spectral envelope into a plurality of groups; first bit allocation means for allocating bits to the plurality of groups; second bit allocation means for allocating the bits allocated to the plurality of groups to subbands for each group; decoding means for decoding the frequency spectrum of a speech/acoustic signal using the bits allocated to the subbands; envelope shaping means for applying the inversely quantized spectral envelope to the decoded frequency spectrum to reproduce a decoded spectrum; and inverse conversion means for inversely converting the decoded spectrum from the frequency domain to the time domain.
- The speech/acoustic decoding apparatus according to claim 5, further comprising dominant frequency band identification means for identifying, in the frequency spectrum, a dominant frequency band that is a subband in which the energy envelope has a local maximum value, wherein the group determination means determines, as a dominant group, the dominant frequency band and the subbands forming the downward slopes of the energy envelope on both sides of the dominant frequency band, and determines consecutive subbands other than the dominant frequency band as a non-dominant group.
- The speech/acoustic decoding apparatus according to claim 5, further comprising: energy calculation means for calculating energy for each group; and variance calculation means for calculating an energy envelope variance for each group, wherein, based on the calculated energy for each group and the calculated energy envelope variance for each group, the first bit allocation means allocates more bits to a group as at least one of the energy and the energy envelope variance is larger, and allocates fewer bits to a group as at least one of the energy and the energy envelope variance is smaller.
- The speech/acoustic decoding apparatus according to claim 5, wherein the second bit allocation means allocates more bits to a subband as the energy envelope of the subband is larger, and allocates fewer bits to a subband as the energy envelope of the subband is smaller.
- A speech/acoustic encoding method comprising: converting an input signal from the time domain to the frequency domain; estimating, for each of a plurality of subbands obtained by dividing the frequency spectrum of the input signal, an energy envelope representing an energy level; quantizing the energy envelope; grouping the quantized energy envelope into a plurality of groups; allocating bits to the plurality of groups; allocating the bits allocated to the plurality of groups to subbands for each group; and encoding the frequency spectrum using the bits allocated to the subbands.
- A speech/acoustic decoding method comprising: inversely quantizing a quantized spectral envelope; grouping the quantized spectral envelope into a plurality of groups; allocating bits to the plurality of groups; allocating the bits allocated to the plurality of groups to subbands for each group; decoding the frequency spectrum of a speech/acoustic signal using the bits allocated to the subbands; applying the inversely quantized spectral envelope to the decoded frequency spectrum to reproduce a decoded spectrum; and inversely transforming the decoded spectrum from the frequency domain to the time domain.
Priority Applications (15)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020157016672A KR102200643B1 (ko) | 2012-12-13 | 2013-11-26 | 음성 음향 부호화 장치, 음성 음향 복호 장치, 음성 음향 부호화 방법 및 음성 음향 복호 방법 |
BR112015013233A BR112015013233B8 (pt) | 2012-12-13 | 2013-11-26 | dispositivo e método de codificação de voz/áudio |
JP2014551851A JP6535466B2 (ja) | 2012-12-13 | 2013-11-26 | 音声音響符号化装置、音声音響復号装置、音声音響符号化方法及び音声音響復号方法 |
ES13862073.7T ES2643746T3 (es) | 2012-12-13 | 2013-11-26 | Dispositivo de codificación de audio de voz, dispositivo de descodificación de audio de voz, método de codificación de audio de voz y método de descodificación de audio de voz |
CN201380063794.XA CN104838443B (zh) | 2012-12-13 | 2013-11-26 | 语音声响编码装置、语音声响解码装置、语音声响编码方法及语音声响解码方法 |
EP13862073.7A EP2933799B1 (en) | 2012-12-13 | 2013-11-26 | Voice audio encoding device, voice audio decoding device, voice audio encoding method, and voice audio decoding method |
RU2015121716A RU2643452C2 (ru) | 2012-12-13 | 2013-11-26 | Устройство кодирования аудио/голоса, устройство декодирования аудио/голоса, способ кодирования аудио/голоса и способ декодирования аудио/голоса |
EP17173916.2A EP3232437B1 (en) | 2012-12-13 | 2013-11-26 | Voice audio encoding device, voice audio decoding device, voice audio encoding method, and voice audio decoding method |
EP18202397.8A EP3457400B1 (en) | 2012-12-13 | 2013-11-26 | Voice audio encoding device, voice audio decoding device, voice audio encoding method, and voice audio decoding method |
MX2015006161A MX341885B (es) | 2012-12-13 | 2013-11-26 | Dispositivo de codificacion de sonido de voz, dispositivo de decodificacion de sonido de voz, metodo de codificacion de sonido de voz y metodo de decodificacion de sonido de voz. |
PL17173916T PL3232437T3 (pl) | 2012-12-13 | 2013-11-26 | Urządzenie do kodowania głosowego audio, urządzenie do dekodowania głosowego audio, sposób kodowania głosowego audio i sposób dekodowania głosowego audio |
PL13862073T PL2933799T3 (pl) | 2012-12-13 | 2013-11-26 | Urządzenie kodujące głos, urządzenie dekodujące głos, sposób kodowania głosu i sposób dekodowania głosu |
US14/650,093 US9767815B2 (en) | 2012-12-13 | 2013-11-26 | Voice audio encoding device, voice audio decoding device, voice audio encoding method, and voice audio decoding method |
US15/673,957 US10102865B2 (en) | 2012-12-13 | 2017-08-10 | Voice audio encoding device, voice audio decoding device, voice audio encoding method, and voice audio decoding method |
US16/141,934 US10685660B2 (en) | 2012-12-13 | 2018-09-25 | Voice audio encoding device, voice audio decoding device, voice audio encoding method, and voice audio decoding method |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2012272571 | 2012-12-13 | ||
JP2012-272571 | 2012-12-13 |
Related Child Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US14/650,093 A-371-Of-International US9767815B2 (en) | 2012-12-13 | 2013-11-26 | Voice audio encoding device, voice audio decoding device, voice audio encoding method, and voice audio decoding method |
US15/673,957 Continuation US10102865B2 (en) | 2012-12-13 | 2017-08-10 | Voice audio encoding device, voice audio decoding device, voice audio encoding method, and voice audio decoding method |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2014091694A1 true WO2014091694A1 (ja) | 2014-06-19 |
Family
ID=50934002
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2013/006948 WO2014091694A1 (ja) | 2012-12-13 | 2013-11-26 | 音声音響符号化装置、音声音響復号装置、音声音響符号化方法及び音声音響復号方法 |
Country Status (13)
Country | Link |
---|---|
US (3) | US9767815B2 (ja) |
EP (3) | EP3232437B1 (ja) |
JP (3) | JP6535466B2 (ja) |
KR (1) | KR102200643B1 (ja) |
CN (2) | CN104838443B (ja) |
BR (1) | BR112015013233B8 (ja) |
ES (3) | ES2706148T3 (ja) |
HK (1) | HK1249651A1 (ja) |
MX (1) | MX341885B (ja) |
PL (3) | PL2933799T3 (ja) |
PT (2) | PT2933799T (ja) |
RU (1) | RU2643452C2 (ja) |
WO (1) | WO2014091694A1 (ja) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2016009026A (ja) * | 2014-06-23 | 2016-01-18 | 富士通株式会社 | オーディオ符号化装置、オーディオ符号化方法、オーディオ符号化プログラム |
CN109286922A (zh) * | 2018-09-27 | 2019-01-29 | 珠海市杰理科技股份有限公司 | 蓝牙提示音处理方法、系统、可读存储介质和蓝牙设备 |
JP2020518030A (ja) * | 2017-04-25 | 2020-06-18 | ディーティーエス・インコーポレイテッドDTS,Inc. | デジタルオーディオ信号における差分データ |
Families Citing this family (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104838443B (zh) * | 2012-12-13 | 2017-09-22 | 松下电器(美国)知识产权公司 | 语音声响编码装置、语音声响解码装置、语音声响编码方法及语音声响解码方法 |
EP3066760B1 (en) * | 2013-11-07 | 2020-01-15 | Telefonaktiebolaget LM Ericsson (publ) | Methods and devices for vector segmentation for coding |
EP4407609A3 (en) * | 2013-12-02 | 2024-08-21 | Top Quality Telephony, Llc | A computer-readable storage medium and a computer software product |
CN106409303B (zh) * | 2014-04-29 | 2019-09-20 | 华为技术有限公司 | 处理信号的方法及设备 |
PL3174050T3 (pl) | 2014-07-25 | 2019-04-30 | Fraunhofer Ges Forschung | Urządzenie do kodowania sygnałów audio, urządzenie do dekodowania sygnałów audio i ich sposoby |
KR102709737B1 (ko) * | 2016-11-30 | 2024-09-26 | 삼성전자주식회사 | 오디오 신호를 전송하는 전자 장치 및 오디오 신호를 전송하는 전자 장치의 제어 방법 |
KR20190069192A (ko) | 2017-12-11 | 2019-06-19 | 한국전자통신연구원 | 오디오 신호의 채널 파라미터 예측 방법 및 장치 |
US10559315B2 (en) | 2018-03-28 | 2020-02-11 | Qualcomm Incorporated | Extended-range coarse-fine quantization for audio coding |
US10586546B2 (en) | 2018-04-26 | 2020-03-10 | Qualcomm Incorporated | Inversely enumerated pyramid vector quantizers for efficient rate adaptation in audio coding |
US10734006B2 (en) | 2018-06-01 | 2020-08-04 | Qualcomm Incorporated | Audio coding based on audio pattern recognition |
US10762910B2 (en) | 2018-06-01 | 2020-09-01 | Qualcomm Incorporated | Hierarchical fine quantization for audio coding |
US10580424B2 (en) * | 2018-06-01 | 2020-03-03 | Qualcomm Incorporated | Perceptual audio coding as sequential decision-making problems |
KR20200142787A (ko) * | 2019-06-13 | 2020-12-23 | 네이버 주식회사 | 멀티미디어 신호 인식을 위한 전자 장치 및 그의 동작 방법 |
CN112037802B (zh) * | 2020-05-08 | 2022-04-01 | 珠海市杰理科技股份有限公司 | 基于语音端点检测的音频编码方法及装置、设备、介质 |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS6358500A (ja) * | 1986-08-25 | 1988-03-14 | インターナシヨナル・ビジネス・マシーンズ・コーポレーシヨン | 副帯域音声コ−ダ用ビツト割振り方法 |
JP2001044844A (ja) * | 1999-07-26 | 2001-02-16 | Matsushita Electric Ind Co Ltd | サブバンド符号化方式 |
JP2002542522A (ja) * | 1999-04-16 | 2002-12-10 | ドルビー・ラボラトリーズ・ライセンシング・コーポレーション | 音声符号化のための利得−適応性量子化及び不均一符号長の使用 |
JP2009063623A (ja) * | 2007-09-04 | 2009-03-26 | Nec Corp | 符号化装置および符号化方法、ならびに復号化装置および復号化方法 |
WO2012016126A2 (en) * | 2010-07-30 | 2012-02-02 | Qualcomm Incorporated | Systems, methods, apparatus, and computer-readable media for dynamic bit allocation |
WO2012144128A1 (ja) * | 2011-04-20 | 2012-10-26 | パナソニック株式会社 | 音声音響符号化装置、音声音響復号装置、およびこれらの方法 |
Family Cites Families (36)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5222189A (en) * | 1989-01-27 | 1993-06-22 | Dolby Laboratories Licensing Corporation | Low time-delay transform coder, decoder, and encoder/decoder for high-quality audio |
US5893065A (en) * | 1994-08-05 | 1999-04-06 | Nippon Steel Corporation | Apparatus for compressing audio data |
US5956674A (en) | 1995-12-01 | 1999-09-21 | Digital Theater Systems, Inc. | Multi-channel predictive subband audio coder using psychoacoustic adaptive bit allocation in frequency, time and over the multiple channels |
JP3189660B2 (ja) * | 1996-01-30 | 2001-07-16 | ソニー株式会社 | 信号符号化方法 |
US6246945B1 (en) * | 1996-08-10 | 2001-06-12 | Daimlerchrysler Ag | Process and system for controlling the longitudinal dynamics of a motor vehicle |
JPH10233692A (ja) * | 1997-01-16 | 1998-09-02 | Sony Corp | オーディオ信号符号化装置および符号化方法並びにオーディオ信号復号装置および復号方法 |
KR100261254B1 (ko) | 1997-04-02 | 2000-07-01 | 윤종용 | 비트율 조절이 가능한 오디오 데이터 부호화/복호화방법 및 장치 |
KR100261253B1 (ko) | 1997-04-02 | 2000-07-01 | 윤종용 | 비트율 조절이 가능한 오디오 부호화/복호화 방법및 장치 |
EP0966109B1 (en) * | 1998-06-15 | 2005-04-27 | Matsushita Electric Industrial Co., Ltd. | Audio coding method and audio coding apparatus |
JP3466507B2 (ja) * | 1998-06-15 | 2003-11-10 | 松下電器産業株式会社 | 音声符号化方式、音声符号化装置、及びデータ記録媒体 |
JP3434260B2 (ja) * | 1999-03-23 | 2003-08-04 | 日本電信電話株式会社 | オーディオ信号符号化方法及び復号化方法、これらの装置及びプログラム記録媒体 |
US6246345B1 (en) | 1999-04-16 | 2001-06-12 | Dolby Laboratories Licensing Corporation | Using gain-adaptive quantization and non-uniform symbol lengths for improved audio coding |
JP4168976B2 (ja) * | 2004-05-28 | 2008-10-22 | ソニー株式会社 | オーディオ信号符号化装置及び方法 |
KR100888474B1 (ko) * | 2005-11-21 | 2009-03-12 | 삼성전자주식회사 | 멀티채널 오디오 신호의 부호화/복호화 장치 및 방법 |
JP4548348B2 (ja) | 2006-01-18 | 2010-09-22 | カシオ計算機株式会社 | 音声符号化装置及び音声符号化方法 |
KR101434198B1 (ko) * | 2006-11-17 | 2014-08-26 | 삼성전자주식회사 | 신호 복호화 방법 |
KR101412255B1 (ko) * | 2006-12-13 | 2014-08-14 | 파나소닉 인텔렉츄얼 프로퍼티 코포레이션 오브 아메리카 | 부호화 장치, 복호 장치 및 이들의 방법 |
CN101868821B (zh) * | 2007-11-21 | 2015-09-23 | Lg电子株式会社 | 用于处理信号的方法和装置 |
EP2144230A1 (en) * | 2008-07-11 | 2010-01-13 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Low bitrate audio encoding/decoding scheme having cascaded switches |
WO2010031003A1 (en) * | 2008-09-15 | 2010-03-18 | Huawei Technologies Co., Ltd. | Adding second enhancement layer to celp based core layer |
KR101301245B1 (ko) * | 2008-12-22 | 2013-09-10 | 한국전자통신연구원 | 스펙트럼 계수의 서브대역 할당 방법 및 장치 |
US8386266B2 (en) | 2010-07-01 | 2013-02-26 | Polycom, Inc. | Full-band scalable audio codec |
CN102081927B (zh) * | 2009-11-27 | 2012-07-18 | 中兴通讯股份有限公司 | 一种可分层音频编码、解码方法及系统 |
WO2011080916A1 (ja) | 2009-12-28 | 2011-07-07 | パナソニック株式会社 | 音声符号化装置および音声符号化方法 |
US20130030796A1 (en) | 2010-01-14 | 2013-01-31 | Panasonic Corporation | Audio encoding apparatus and audio encoding method |
JP5695074B2 (ja) | 2010-10-18 | 2015-04-01 | パナソニック インテレクチュアル プロパティ コーポレーション オブアメリカPanasonic Intellectual Property Corporation of America | 音声符号化装置および音声復号化装置 |
CN102741831B (zh) * | 2010-11-12 | 2015-10-07 | 宝利通公司 | 多点环境中的可伸缩音频 |
EP2681734B1 (en) * | 2011-03-04 | 2017-06-21 | Telefonaktiebolaget LM Ericsson (publ) | Post-quantization gain correction in audio coding |
EP2701144B1 (en) * | 2011-04-20 | 2016-07-27 | Panasonic Intellectual Property Corporation of America | Device and method for execution of huffman coding |
TWI606441B (zh) | 2011-05-13 | 2017-11-21 | 三星電子股份有限公司 | 解碼裝置 |
CN102208188B (zh) * | 2011-07-13 | 2013-04-17 | 华为技术有限公司 | 音频信号编解码方法和设备 |
WO2013061531A1 (ja) * | 2011-10-28 | 2013-05-02 | パナソニック株式会社 | 音声符号化装置、音声復号装置、音声符号化方法及び音声復号方法 |
US9454972B2 (en) | 2012-02-10 | 2016-09-27 | Panasonic Intellectual Property Corporation Of America | Audio and speech coding device, audio and speech decoding device, method for coding audio and speech, and method for decoding audio and speech |
CN104838443B (zh) * | 2012-12-13 | 2017-09-22 | 松下电器(美国)知识产权公司 | 语音声响编码装置、语音声响解码装置、语音声响编码方法及语音声响解码方法 |
EP4407609A3 (en) * | 2013-12-02 | 2024-08-21 | Top Quality Telephony, Llc | A computer-readable storage medium and a computer software product |
JP6358500B2 (ja) | 2014-06-06 | 2018-07-18 | 株式会社リコー | クリーニングブレード、画像形成装置、及びプロセスカートリッジ |
-
2013
- 2013-11-26 CN CN201380063794.XA patent/CN104838443B/zh active Active
- 2013-11-26 CN CN201710759624.5A patent/CN107516531B/zh active Active
- 2013-11-26 EP EP17173916.2A patent/EP3232437B1/en active Active
- 2013-11-26 RU RU2015121716A patent/RU2643452C2/ru active
- 2013-11-26 EP EP13862073.7A patent/EP2933799B1/en active Active
- 2013-11-26 MX MX2015006161A patent/MX341885B/es active IP Right Grant
- 2013-11-26 ES ES17173916T patent/ES2706148T3/es active Active
- 2013-11-26 ES ES13862073.7T patent/ES2643746T3/es active Active
- 2013-11-26 BR BR112015013233A patent/BR112015013233B8/pt active Search and Examination
- 2013-11-26 US US14/650,093 patent/US9767815B2/en active Active
- 2013-11-26 PL PL13862073T patent/PL2933799T3/pl unknown
- 2013-11-26 EP EP18202397.8A patent/EP3457400B1/en active Active
- 2013-11-26 PT PT138620737T patent/PT2933799T/pt unknown
- 2013-11-26 PL PL17173916T patent/PL3232437T3/pl unknown
- 2013-11-26 ES ES18202397T patent/ES2970676T3/es active Active
- 2013-11-26 PL PL18202397.8T patent/PL3457400T3/pl unknown
- 2013-11-26 KR KR1020157016672A patent/KR102200643B1/ko active IP Right Grant
- 2013-11-26 WO PCT/JP2013/006948 patent/WO2014091694A1/ja active Application Filing
- 2013-11-26 JP JP2014551851A patent/JP6535466B2/ja active Active
- 2013-11-26 PT PT17173916T patent/PT3232437T/pt unknown
-
2017
- 2017-08-10 US US15/673,957 patent/US10102865B2/en active Active
-
2018
- 2018-06-22 HK HK18108017.2A patent/HK1249651A1/zh unknown
- 2018-09-25 US US16/141,934 patent/US10685660B2/en active Active
-
2019
- 2019-06-03 JP JP2019103964A patent/JP7010885B2/ja active Active
-
2022
- 2022-01-13 JP JP2022003475A patent/JP2022050609A/ja active Pending
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS6358500A (ja) * | 1986-08-25 | 1988-03-14 | インターナシヨナル・ビジネス・マシーンズ・コーポレーシヨン | 副帯域音声コ−ダ用ビツト割振り方法 |
JP2002542522A (ja) * | 1999-04-16 | 2002-12-10 | ドルビー・ラボラトリーズ・ライセンシング・コーポレーション | 音声符号化のための利得−適応性量子化及び不均一符号長の使用 |
JP2001044844A (ja) * | 1999-07-26 | 2001-02-16 | Matsushita Electric Ind Co Ltd | サブバンド符号化方式 |
JP2009063623A (ja) * | 2007-09-04 | 2009-03-26 | Nec Corp | 符号化装置および符号化方法、ならびに復号化装置および復号化方法 |
WO2012016126A2 (en) * | 2010-07-30 | 2012-02-02 | Qualcomm Incorporated | Systems, methods, apparatus, and computer-readable media for dynamic bit allocation |
WO2012144128A1 (ja) * | 2011-04-20 | 2012-10-26 | パナソニック株式会社 | 音声音響符号化装置、音声音響復号装置、およびこれらの方法 |
Non-Patent Citations (1)
Title |
---|
"Low-complexity full-band audio coding for high-quality conversational applications", ITU-T RECOMMENDATION G.719, 2009 |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2016009026A (ja) * | 2014-06-23 | 2016-01-18 | 富士通株式会社 | オーディオ符号化装置、オーディオ符号化方法、オーディオ符号化プログラム |
JP2020518030A (ja) * | 2017-04-25 | 2020-06-18 | ディーティーエス・インコーポレイテッドDTS,Inc. | デジタルオーディオ信号における差分データ |
JP7257965B2 (ja) | 2017-04-25 | 2023-04-14 | ディーティーエス・インコーポレイテッド | デジタルオーディオ信号における差分データ |
CN109286922A (zh) * | 2018-09-27 | 2019-01-29 | 珠海市杰理科技股份有限公司 | 蓝牙提示音处理方法、系统、可读存储介质和蓝牙设备 |
CN109286922B (zh) * | 2018-09-27 | 2021-09-17 | 珠海市杰理科技股份有限公司 | 蓝牙提示音处理方法、系统、可读存储介质和蓝牙设备 |
Also Published As
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP7010885B2 (ja) | 音声または音響符号化装置、音声または音響復号装置、音声または音響符号化方法及び音声または音響復号方法 | |
CN104485111B (zh) | 音频/语音编码装置、音频/语音解码装置及其方法 | |
JP6600054B2 (ja) | 方法、符号化器、復号化器、及び移動体機器 | |
JP2011509428A (ja) | オーディオ信号処理方法及び装置 | |
WO2013143221A1 (zh) | 信号编码和解码的方法和设备 | |
WO2015151451A1 (ja) | 符号化装置、復号装置、符号化方法、復号方法、およびプログラム | |
JP2012118205A (ja) | オーディオ符号化装置、オーディオ符号化方法及びオーディオ符号化用コンピュータプログラム | |
JP2019070823A (ja) | 音響信号符号化装置、音響信号復号装置、音響信号符号化方法および音響信号復号方法 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 13862073 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2014551851 Country of ref document: JP Kind code of ref document: A |
|
WWE | Wipo information: entry into national phase |
Ref document number: MX/A/2015/006161 Country of ref document: MX |
|
REEP | Request for entry into the european phase |
Ref document number: 2013862073 Country of ref document: EP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 14650093 Country of ref document: US Ref document number: 2013862073 Country of ref document: EP |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
ENP | Entry into the national phase |
Ref document number: 20157016672 Country of ref document: KR Kind code of ref document: A |
|
REG | Reference to national code |
Ref country code: BR Ref legal event code: B01A Ref document number: 112015013233 Country of ref document: BR |
|
ENP | Entry into the national phase |
Ref document number: 2015121716 Country of ref document: RU Kind code of ref document: A |
|
ENP | Entry into the national phase |
Ref document number: 112015013233 Country of ref document: BR Kind code of ref document: A2 Effective date: 20150608 |