WO2014008786A1 - Method and apparatus for bit allocation of an audio signal - Google Patents
- Publication number
- WO2014008786A1 (PCT/CN2013/076393)
- Authority
- WO
- WIPO (PCT)
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/002—Dynamic bit allocation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/0204—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using subband decomposition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/032—Quantisation or dequantisation of spectral components
- G10L19/035—Scalar quantisation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/16—Vocoder architecture
- G10L19/18—Vocoders using multiple modes
- G10L19/22—Mode decision, i.e. based on audio signal content versus external parameters
Definitions
- Embodiments of the present invention relate to the field of audio technology and, more particularly, to methods and apparatus for bit allocation of audio signals.
Background
- Transform coding usually divides the frequency-domain coefficients into bands, obtains the normalized energy of each band, normalizes the in-band coefficients by that energy, performs bit allocation, and finally quantizes the coefficients in each band according to the bits allocated to that band. Bit allocation is a critical step in this chain.
- Bit allocation means that, when the spectral coefficients are quantized, the bits available for quantizing them are distributed over the sub-bands according to the sub-band characteristics of the spectrum. In other words, the coding resources available to the audio signal are allocated to the individual sub-bands; coding resources are generally measured in bits.
- The existing bit allocation process includes: dividing the spectrum of the frequency-domain signal into bands, for example with bandwidth gradually increasing from low frequency to high frequency according to critical band theory; after banding, finding the normalized energy norm of each subband and quantizing it to obtain the subband normalization factor wnorm; arranging the subbands in descending order of wnorm; and allocating bits, for example by iteratively assigning the number of bits of each subband according to the value of wnorm.
- The iterative allocation can be refined into the following steps. Step 1: initialize the number of bits of each subband and an iteration factor fac. Step 2: find the band with the largest subband normalization factor wnorm. Step 3: add the bandwidth value to the number of bits allocated to that band, and subtract the iteration factor fac from its subband normalization factor wnorm. Step 4: repeat steps 2 and 3 until the bit allocation is complete. In the prior art, therefore, the unit of bits allocated at each step is the bandwidth value, although the minimum number of bits required for quantization is smaller than the bandwidth value. This coarse, whole-bandwidth allocation is inefficient at low bit rates: many bands receive no bits while other bands receive too many. Because the bits are allocated by iterating over the full band, with the same iteration parameters for subbands of different bandwidths, the allocation result is very random, the quantization is scattered, and the allocation is discontinuous between previous and subsequent frames.
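- The four steps above can be sketched in Python as follows. This is only an illustrative reading of the prior-art loop, not a reference implementation: the function name `iterative_allocation`, the stopping rule (stop when no band's bandwidth-sized step fits in the remaining budget) and the sample numbers are assumptions that do not appear in the source.

```python
def iterative_allocation(wnorm, bandwidth, total_bits, fac):
    """Greedy loop: the band with the largest wnorm receives one bandwidth-sized step."""
    work = list(wnorm)           # working copy; the caller's list is left untouched
    bits = [0] * len(wnorm)      # Step 1: initialise the per-band bit counts
    remaining = total_bits
    while True:
        # Assumed stopping rule: only bands whose step still fits may receive bits.
        candidates = [i for i in range(len(work)) if bandwidth[i] <= remaining]
        if not candidates:
            break
        best = max(candidates, key=lambda i: work[i])  # Step 2: largest wnorm
        bits[best] += bandwidth[best]                  # Step 3: add the bandwidth value
        remaining -= bandwidth[best]
        work[best] -= fac                              # Step 3: subtract fac from wnorm
    return bits                                        # Step 4 is the loop itself


print(iterative_allocation([10, 6, 3], [8, 8, 16], 24, 3))
```

- With these hypothetical inputs the third band receives nothing, which illustrates the drawback described above: at a low bit budget the whole-bandwidth step leaves some bands without bits.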
- Bit allocation therefore has a large impact on performance.
- The usual bit allocation distributes bits over the whole frequency band according to the normalized energy of each sub-band. When the bit rate is insufficient, the allocation is random and scattered, and quantization discontinuities appear in the time domain.
Summary of the invention
- Embodiments of the present invention provide a method and apparatus for bit allocation of an audio signal, which address the problem that, at low and medium bit rates, the existing bit allocation method produces a random and scattered allocation and thereby causes quantization discontinuities in the time domain.
- A method for bit allocation of an audio signal is provided, including: dividing a frequency band of the audio signal into a plurality of sub-bands, and quantizing a sub-band normalization factor of each sub-band; dividing the plurality of sub-bands into a plurality of groups, each group comprising one or more sub-bands, and obtaining group parameters of each group, wherein the group parameters characterize the signal characteristics and energy attributes of the audio signal of the corresponding group; allocating coded bits to at least one group according to the group parameters of each group, wherein the sum of the numbers of coded bits allocated to the at least one group is the number of coded bits of the audio signal; and, according to the sub-band normalization factor of each sub-band in each of the at least one group, allocating the coded bits allocated to the at least one group to the sub-bands of each of the at least one group.
- An apparatus for bit allocation of an audio signal is provided, comprising: a subband quantization unit configured to divide a frequency band of the audio signal into a plurality of sub-bands and to quantize a sub-band normalization factor of each sub-band; a grouping unit configured to divide the plurality of sub-bands into a plurality of groups, each group comprising one or more sub-bands, and to obtain group parameters of each group, wherein the group parameters characterize the signal characteristics and energy attributes of the audio signal of the corresponding group; a first allocation unit configured to allocate coded bits to at least one group according to the group parameters of each group, wherein the sum of the numbers of coded bits allocated to the at least one group is the number of coded bits of the audio signal; and a second allocation unit configured to allocate, according to the sub-band normalization factor of each sub-band in each of the at least one group, the coded bits allocated to the at least one group to the sub-bands of each of the at least one group.
- FIG. 1 is a flow chart of a method of bit allocation of an audio signal in accordance with an embodiment of the present invention.
- FIG. 2 is a block diagram showing the structure of an apparatus for bit allocation of an audio signal according to an embodiment of the present invention.
- FIG. 3 is a block diagram showing the structure of an apparatus for bit allocation of an audio signal according to another embodiment of the present invention.
Detailed description
- Encoding and decoding solutions are widely used in electronic devices such as mobile phones, wireless devices, personal data assistants (PDAs), handheld or portable computers, GPS receivers/navigators, cameras, audio/video players, camcorders, video recorders and surveillance equipment.
- Such an electronic device includes an audio encoder or an audio decoder. The encoder or decoder may be implemented directly by a digital circuit or a chip such as a DSP (digital signal processor), or by a processor executing the procedure defined in software code.
- An audio time-domain signal is first converted into a frequency-domain signal; coded bits are then allocated to the frequency-domain signal for encoding, and the encoded signal is transmitted to the decoding end through a communication system.
- The decoding end decodes the encoded signal and recovers the audio signal.
- The present invention performs bit allocation based on grouping and on the characteristics of the signal.
- The bands are first grouped. The energy within each group is weighted according to the characteristics of the group, bits are allocated to each group according to the weighted energy, and bits are then assigned to each band according to the characteristics of the signal within the group. Because bits are first allocated to whole groups, a discontinuous allocation is avoided, which improves the coding quality for different signals.
- Because the characteristics of the signal are taken into account in the intra-group allocation, the limited bits can be allocated to the audio bands that matter most for perception.
- FIG. 1 is a flow chart of a method of bit allocation of an audio signal in accordance with an embodiment of the present invention.
- 101: Divide the frequency band of the audio signal into a plurality of sub-bands, and quantize the sub-band normalization factor of each sub-band.
- The MDCT transform is taken as an example in the following description.
- The input audio signal is subjected to the MDCT transform to obtain frequency-domain coefficients.
- The MDCT transform here can include windowing, time-domain aliasing and a discrete cosine transform (DCT).
- The frequency-domain envelope is then extracted from the MDCT coefficients and quantized.
- The entire frequency band is divided into subbands of different frequency-domain resolutions, the normalization factor of each subband is extracted, and the subband normalization factor is quantized.
- For example, for a frame length of 20 ms (640 samples), the frequency band corresponding to a 16 kHz bandwidth can be divided into 44 subbands.
- In the definition of the normalization factor, $L_p$ is the number of coefficients in subband $p$, $s_p$ is the starting point of the subband, $e_p$ is the ending point of the subband, and $P$ is the total number of subbands.
- After the normalization factor is obtained, it can be quantized in the log domain to obtain the quantized subband normalization factor wnorm.
- Subbands having the same bandwidth may be placed in one group; preferably, adjacent subbands having the same bandwidth are placed in one group.
- For example, all subbands can be divided into four groups. At low bit rates only the first two or the first three groups are used, and the remaining groups are not allocated bits.
- Alternatively, subbands whose subband normalization factors wnorm are close to each other can be grouped together.
- For example, each subband normalization factor wnorm[i] is compared with a predetermined threshold K. If wnorm[i] is greater than K, the subband number i is recorded. The subbands whose wnorm[i] is greater than K are finally placed in one group, and the remaining subbands are placed in another group. A plurality of predetermined thresholds may be set according to different needs, thereby obtaining more groups.
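- Both grouping rules described above can be sketched in a few lines of Python. The function names `group_by_bandwidth` and `group_by_threshold` and the sample values are hypothetical; groups are represented as lists of subband indices, which is a choice made for this sketch only.

```python
from itertools import groupby


def group_by_bandwidth(bandwidth):
    """Adjacent subbands with the same bandwidth end up in the same group."""
    groups = []
    for _, run in groupby(enumerate(bandwidth), key=lambda item: item[1]):
        groups.append([index for index, _ in run])
    return groups


def group_by_threshold(wnorm, k):
    """Subbands with wnorm[i] > k form one group; the remaining subbands form another."""
    above = [i for i, w in enumerate(wnorm) if w > k]
    rest = [i for i, w in enumerate(wnorm) if w <= k]
    return [above, rest]


print(group_by_bandwidth([8, 8, 8, 16, 16, 24]))   # [[0, 1, 2], [3, 4], [5]]
print(group_by_threshold([5, 1, 7, 2], 4))         # [[0, 2], [1, 3]]
```

- Note that threshold-based grouping, unlike bandwidth-based grouping, does not by itself guarantee that the subbands of a group are contiguous.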
- The group parameters of each group can be obtained to characterize the energy properties of the group.
- The group parameters may include one or more of the following: the sum of the subband normalization factors within the group, group_wnorm, and the peak-to-average ratio of the subband normalization factors within the group, group_sharp.
- The peak-to-average ratio group_sharp is the ratio of the peak value of the subband normalization factors in the group to the mean value of the subband normalization factors in the group:
$$\text{group\_sharp}[i] = \frac{\text{group\_peak}[i]}{\text{group\_avg}[i]}$$
- where group_peak[i] is the peak of the subband normalization factors of the i-th group, and group_avg[i] is the average of the subband normalization factors of the i-th group.
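- As a minimal sketch, the two group parameters can be computed as follows. The function name `group_parameters`, the representation of a group as a list of subband indices, and the handling of an all-zero group (ratio reported as 0.0) are assumptions made for illustration.

```python
def group_parameters(wnorm, groups):
    """Return (group_wnorm, group_sharp) for every group of subband indices."""
    params = []
    for members in groups:
        values = [wnorm[i] for i in members]
        group_wnorm = sum(values)              # sum of the normalization factors
        group_peak = max(values)               # largest factor in the group
        group_avg = group_wnorm / len(values)  # mean factor in the group
        # Assumed guard: an all-zero group gets a peak-to-average ratio of 0.0.
        group_sharp = group_peak / group_avg if group_avg else 0.0
        params.append((group_wnorm, group_sharp))
    return params


print(group_parameters([4, 2, 6, 3, 3], [[0, 1, 2], [3, 4]]))
```

- A flat group gives a ratio of 1.0, while a group dominated by a single strong subband gives a larger ratio, which is why group_sharp can serve as an indicator of how peaky (for example, harmonic) the group's spectrum is.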
- The bits of the audio signal can be assigned to each group according to the group parameters.
- Grouping allows the energy properties of each group to be taken into account, so that the bit allocation of the audio signal is more concentrated and the bit allocation between frames is more continuous.
- The group parameters are not limited to those listed here; other parameters that characterize the energy properties of the group may also be used.
- In some cases bits are allocated to only some of the groups. For example, a group whose sum of intra-group subband normalization factors is zero is not allocated bits; likewise, when very few bits are available, some groups will receive no bits.
- The coded bits may be allocated to at least one group according to the sum of the subband normalization factors in each group, wherein the sum of the coded bits allocated to the at least one group is the number of coded bits of the audio signal.
- The result of assigning bits to the groups can also be optimized by adjusting the group parameters, for example by assigning different weights to the group parameters of different groups according to different allocation requirements. The limited number of bits is first allocated to the appropriate groups and then allocated within each group, so that the bit allocation is no longer scattered, which benefits the encoding of the audio signal.
- The sum of intra-group subband normalization factors group_wnorm may be weighted according to the peak-to-average ratio group_sharp of the subband normalization factors in the group, giving the weighted sum of intra-group subband normalization factors group_wnorm_w.
- For example, the peak-to-average ratio group_sharp[i] of the first group is compared with the peak-to-average ratio group_sharp[i-1] of the second group. If the ratio of the first group exceeds that of the second group by more than a first threshold, the sum of intra-group subband normalization factors of the first group is adjusted according to a first weighting factor, and that of the second group is adjusted according to a second weighting factor.
- Conversely, if the ratio of the second group exceeds that of the first group by more than a threshold, the sum of intra-group subband normalization factors of the second group is adjusted according to the first weighting factor, and that of the first group is adjusted according to the second weighting factor.
- The above only schematically illustrates a simple weighting method.
- Other weighting methods that adjust the weights of the subbands with different weighting coefficients will be readily apparent to those skilled in the art. For example, the weight of subbands that need more bits can be increased, while the weight of subbands that need few or no bits can be reduced.
- The bits of the audio signal are assigned to each group based on the weighted sum of intra-group subband normalization factors. For example, the group bit number of a group is determined from the ratio of the group's weighted sum group_wnorm_w to the sum sum_wnorm of the subband normalization factors of all subbands, and that number of bits of the audio signal is assigned to the group.
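- The ratio-based split between groups might look like the sketch below. Several details are assumptions rather than statements from the source: the function name `allocate_group_bits`, the rounding (floor, with leftover bits handed to the group with the largest weighted sum), and the denominator, which is taken here as the sum of the weighted group sums so that the shares add up to one, whereas the text above refers to sum_wnorm.

```python
import math


def allocate_group_bits(group_wnorm_w, total_bits):
    """Split total_bits between groups in proportion to their weighted sums."""
    denom = sum(group_wnorm_w)  # assumption: normalise by the weighted sums
    if denom <= 0:
        return [0] * len(group_wnorm_w)
    bits = [math.floor(total_bits * g / denom) for g in group_wnorm_w]
    # Assumed rounding rule: bits lost to flooring go to the strongest group,
    # so the group totals add up to the bit budget of the audio signal.
    leftover = total_bits - sum(bits)
    strongest = max(range(len(bits)), key=lambda i: group_wnorm_w[i])
    bits[strongest] += leftover
    return bits


print(allocate_group_bits([12, 6, 0], 100))
```

- In this example the third group has a weighted sum of zero and therefore receives no bits, while the other two groups share the whole budget.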
- The subbands within a group can be allocated bits with the existing iterative cyclic allocation method. However, that method still makes the allocation result within the group very random and discontinuous between previous and subsequent frames.
- Alternatively, the bits of the audio signal assigned to the group can be allocated to the subbands within the group according to the subband normalization factors within the group, taking into account the signal characteristics of the audio signal, i.e. the signal type.
- For example, the number of subbands band_num that receive bits in the group can be determined first. Then, according to the type of the audio signal, the bits allocated to the group are distributed, according to the subband normalization factors in the group, to the subbands that receive bits; the number of those subbands equals band_num.
- The number of subbands of the initial bit allocation in each group may be determined from the group bit number and a third threshold, where the third threshold represents the minimum number of bits used to quantize a normalized spectral coefficient. For example, if a group is assigned 13 bits and the third threshold is 7 bits, the number of subbands of the initial bit allocation in the group is 2. The number of subbands band_num for bit allocation in the group is then determined from the number of subbands of the initial bit allocation and the total number of subbands in the group.
- band_num is the total number of subbands in the group, otherwise the value of band_num is a group.
- Bits may then be allocated to the band_num subbands in the group according to the subband normalization factors in the group.
- The signal type of the audio signal of the group may be determined according to the peak-to-average ratio group_sharp of the subband normalization factors in the group.
- If the audio signal of the group is not determined to be a harmonic signal, the existing iterative cyclic allocation method may be used to perform bit allocation for the group. If the audio signal of the group is determined to be a harmonic signal, the existing iterative cyclic allocation method may be used, or the following method a or method b may be used.
- Method a:
- Step 1: Sort the subband normalization factors of all subbands in the group in descending order and select the top N subbands, where N is the number of subbands band_num that receive bits in the group.
- Step 2: Initialize the number of bits of the N subbands to 1, and initialize the loop count j to 0.
- Step 3: Determine the sum band_wnorm of the subband normalization factors of those subbands among the N subbands whose subband normalization factor is greater than zero.
- Step 4: Allocate a number of bits to each subband among the N subbands whose subband normalization factor is greater than zero.
- Step 5: Determine whether the number of bits allocated to the last of the N subbands is less than a fixed threshold fac; if so, set the number of bits allocated to that subband to zero.
- Step 6: Add 1 to the loop count j.
- Step 7: Restore the original ordering of all subbands within the group, i.e. the ordering the subbands had before they were sorted.
- Method b:
- Step 1: Sort the subband normalization factors of all subbands in the group in descending order and select the top N subbands, where N is the number of subbands band_num that receive bits in the group.
- Step 2: Initialize the number of bits of the N subbands to 1, initialize the loop count j to 0, and initialize the allocated number of bits bit_sum to 0.
- Step 3: Determine the sum band_wnorm of the subband normalization factors of those subbands among the N subbands whose subband normalization factor is greater than zero.
- Step 4: Allocate a number of bits to each subband among the N subbands whose subband normalization factor is greater than zero.
- Step 5: Determine whether the number of bits allocated to a subband among the N subbands is less than a fixed threshold fac; if so, set the number of bits allocated to that subband to zero.
- Step 6: Calculate the sum temp_sum of the numbers of bits allocated to all N subbands.
- Step 7: Add 1 to the loop count j.
- Step 8: Determine whether temp_sum and bit_sum are equal. If they are equal, go to step 10; otherwise continue with step 9.
- Step 9: Update bit_sum by assigning it the value of temp_sum. Repeat steps 3 to 9 until the loop count j equals N.
- Step 10: Restore the original ordering of all subbands within the group.
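- The control flow of method b can be illustrated with the Python sketch below. The source does not state the allocation rule used in step 4, so the proportional rule here (a subband's share of the group's bits equals its share of band_wnorm, rounded down) is an assumption, as are the function name `allocate_method_b`, the integer inputs, the reading of step 5 as testing the weakest subband that is still active, and the use of a non-zero bit count to mark a subband as active.

```python
def allocate_method_b(wnorm, group_bits, band_num, fac):
    """Return per-subband bit counts in the original subband order."""
    # Step 1: indices of the band_num subbands with the largest wnorm.
    order = sorted(range(len(wnorm)), key=lambda i: wnorm[i], reverse=True)
    top = order[:band_num]
    # Step 2: a bit count of 1 marks a selected subband as still active.
    bits = {i: 1 for i in top}
    bit_sum = 0
    for _ in range(len(top)):                        # loop count j = 0 .. N - 1
        active = [i for i in top if bits[i] > 0 and wnorm[i] > 0]
        for i in top:
            if i not in active:
                bits[i] = 0
        if not active:
            break
        band_wnorm = sum(wnorm[i] for i in active)   # Step 3
        for i in active:                             # Step 4: assumed proportional rule
            bits[i] = group_bits * wnorm[i] // band_wnorm
        last = active[-1]                            # Step 5: weakest active subband
        if bits[last] < fac:
            bits[last] = 0
        temp_sum = sum(bits.values())                # Step 6
        if temp_sum == bit_sum:                      # Step 8: allocation has settled
            break
        bit_sum = temp_sum                           # Step 9
    # Step 10: report in the original order; unselected subbands get 0 bits.
    return [bits.get(i, 0) for i in range(len(wnorm))]


print(allocate_method_b([10, 3, 8, 1], 20, 4, 3))
```

- Under these assumptions, each pass drops the weakest subband whose share falls below fac and redistributes the group's bits over the remaining subbands, so the bits concentrate on the subbands with the largest normalization factors.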
- Method a and method b can also be combined with the method of determining band_num, that is, the intra-group allocation can be combined with different audio signal characteristics. For example, if the number of subbands of the initial bit allocation in the group is greater than the total number of subbands in the group multiplied by a scale factor k, method a is used; if it is less than or equal to the total number of subbands in the group multiplied by the scale factor k, method b is used.
- In summary, the process of bit allocation for the subbands in a group is as follows: select, from all the subbands in the group, the N subbands with the largest subband normalization factors as the subbands to be allocated, where N is the number of subbands band_num that receive bits in the group; then allocate bits to the N subbands in turn according to their subband normalization factors; finally, restore the original ordering of all subbands in the group.
- In this way the bits can be allocated effectively to the frequency bands that reflect the auditory perception of the signal. For example, for a signal with strong harmonics, the bits should be concentrated on the bands that contain harmonics, whereas for a signal whose spectral energy is relatively even, the bits should be distributed evenly.
- A group can be further subdivided: the subbands in the group are divided into a plurality of subgroups and the group parameters of each subgroup are obtained; then, according to those group parameters, the bits assigned to the group are assigned to the subgroups. Finally, based on the subband normalization factors, the bits assigned to each subgroup are assigned to the subbands within it. One possibility is to continue subdividing until each group contains only one band.
- The grouping approach of the embodiments of the present invention keeps the allocation relatively stable between previous and subsequent frames, and allocates bits within a group differently according to the signal characteristics, so that the allocated bits are used to quantize the important spectral information, thereby improving the coding quality of the audio signal.
- By grouping, the method for bit allocation of an audio signal according to an embodiment of the present invention keeps the allocation stable between previous and subsequent frames and reduces the influence of the global allocation on local discontinuity.
- In addition, different threshold parameters can be set for the bit allocation in each group, so that bits are allocated more adaptively, and the bits within a group are allocated differently according to the spectral characteristics of the signal. For example, for harmonic-like signals whose spectrum is more concentrated, the bits are concentrated on the subbands with large energy, and the subbands between the harmonics do not need many bits; for signals with a flatter spectrum, the bit allocation tries to keep the allocation smooth across subbands. In both cases the allocated bits are used to quantize the important spectral information.
- A schematic structure of an apparatus for bit allocation of an audio signal according to an embodiment of the present invention is described below with reference to FIG. 2.
- The apparatus 20 for bit allocation of audio signals includes a subband quantization unit 21, a grouping unit 22, a first allocation unit 23 and a second allocation unit 24.
- The subband quantization unit 21 is configured to divide the frequency band of the audio signal into a plurality of subbands, and to quantize the subband normalization factor of each subband.
- The grouping unit 22 is configured to divide the plurality of sub-bands into a plurality of groups, each group including one or more sub-bands, and to obtain group parameters of each group, where the group parameters characterize the signal characteristics and energy attributes of the audio signal of the corresponding group.
- The first allocation unit 23 is configured to allocate coded bits to at least one group according to the group parameters of each group, wherein the sum of the numbers of coded bits allocated to the at least one group is the number of coded bits of the audio signal.
- The grouping unit 22 may be configured to place the sub-bands having the same bandwidth in one group, so that the plurality of sub-bands are divided into a plurality of groups.
- The grouping unit 22 may be configured to group the sub-bands whose sub-band normalization factors are close to each other, so that the plurality of sub-bands are divided into a plurality of groups.
- The subbands in each group can be contiguous.
- The grouping unit 22 is configured to obtain the sum of the intra-group sub-band normalization factors of each group and the peak-to-average ratio of the intra-group sub-band normalization factors of each group. The sum of the intra-group sub-band normalization factors is the sum of the sub-band normalization factors of all sub-bands in the group. The peak-to-average ratio is the ratio of the peak of the sub-band normalization factors in the group to their average, where the peak is the maximum of the sub-band normalization factors of all sub-bands within the group and the average is the mean of the sub-band normalization factors of all sub-bands within the group.
- The grouping unit 22 may be further configured to weight the sum of the intra-group sub-band normalization factors of each group according to the peak-to-average ratio of the sub-band normalization factors of each group, obtaining the weighted sum of intra-group sub-band normalization factors of each group.
- For example, the grouping unit 22 may be configured to compare the peak-to-average ratio of the intra-group sub-band normalization factors of the first group with that of the second group. If the peak-to-average ratio of the first group exceeds that of the second group by more than a first threshold, the sum of the intra-group sub-band normalization factors of the first group is adjusted according to a first weighting factor, and the sum of the intra-group sub-band normalization factors of the second group is adjusted according to a second weighting factor.
- The first allocating unit 23 may be configured to allocate coded bits to the at least one group according to the sum of the intra-group sub-band normalization factors of each group, where the sum of the coded bits allocated to the at least one group equals the number of coded bits of the audio signal.
- The first allocating unit 23 may be configured to allocate coded bits to the at least one group according to the weighted sum of the intra-group sub-band normalization factors, where the sum of the coded bits allocated to the at least one group equals the number of coded bits of the audio signal.
- The first allocating unit 23 may be configured to determine the number of group bits of a group according to the ratio of the weighted sum of the intra-group sub-band normalization factors of that group to the sum of the sub-band normalization factors of all sub-bands, and to allocate coded bits of the audio signal to the group according to that number of group bits.
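
A proportional split of the frame's coded bits across groups could look like the sketch below. It normalizes by the total of the group sums, which coincides with the sum over all sub-bands when no weighting is applied; that choice, the integer truncation, and the rule of handing leftover bits to the group with the largest sum are assumptions made so that the group bits add up to the frame total. The function name `allocate_group_bits` is hypothetical, and the group sums are assumed to be positive.

```python
def allocate_group_bits(total_bits, group_sums):
    """Split total_bits across groups in proportion to their (weighted) sums."""
    denom = sum(group_sums)  # assumed positive
    # Proportional share for each group, truncated to whole bits.
    bits = [int(total_bits * s / denom) for s in group_sums]
    # Assumption: bits lost to truncation go to the group with the largest sum.
    leftover = total_bits - sum(bits)
    largest = max(range(len(group_sums)), key=lambda g: group_sums[g])
    bits[largest] += leftover
    return bits


if __name__ == "__main__":
    print(allocate_group_bits(100, [30.0, 10.0, 10.0]))
```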
- The second allocating unit 24 is configured to distribute the coded bits allocated to the at least one group among the sub-bands of each such group, according to the sub-band normalization factor of each sub-band in the group.
- The second allocating unit 24 may include a determining module 241 and an allocating module 242.
- The determining module 241 is configured to determine the number of sub-bands, band_num, on which bit allocation is performed in the group.
- The allocating module 242 is configured to allocate the bits of the audio signal assigned to the group to the sub-bands in the group on which bit allocation is performed, according to the sub-band normalization factors in the group; the number of such sub-bands equals band_num.
- The determining module 241 is configured to determine the number of sub-bands for initial bit allocation in the group according to the number of group bits and a third threshold, where the third threshold is the minimum number of bits used to quantize one normalized spectral coefficient; the smaller of this initial number and the total number of sub-bands in the group is taken as band_num, the number of sub-bands on which bit allocation is performed in the group.
- The determining module 241 may be configured to determine the number of sub-bands for initial bit allocation in the group according to the number of group bits and the third threshold, where the third threshold represents the minimum number of bits used to quantize one normalized spectral coefficient.
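
The determination of band_num can be sketched as below. The passage does not give the formula that combines the number of group bits with the third threshold, so the integer division used here is an assumption; only the final "smaller of the two" rule comes from the text. The function and parameter names are hypothetical.

```python
def determine_band_num(group_bits, third_threshold, total_subbands):
    """Return band_num, the number of sub-bands in the group that receive bits."""
    # Assumption: the initial count is how many sub-bands could each receive
    # at least third_threshold bits out of the group's budget.
    initial = group_bits // third_threshold
    # From the text: take the smaller of the initial count and the group size.
    return min(initial, total_subbands)


if __name__ == "__main__":
    print(determine_band_num(50, 7, 10))
    print(determine_band_num(100, 7, 10))
```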
- The allocating module 242 may be configured to select, from all sub-bands in the group, the N sub-bands with the largest sub-band normalization factors as the sub-bands to be allocated, where N is the number of sub-bands in the group on which bit allocation is performed; to allocate bits to the N sub-bands in turn according to their sub-band normalization factors; and to restore the original ordering of all sub-bands in the group.
- The allocating module 242 specifically performs the following steps:
- Step 1: Sort the sub-band normalization factors of all sub-bands in the group in descending order and select the top N sub-bands, where N is the number of sub-bands in the group on which bit allocation is performed.
- Step 2: Initialize the number of bits of each of the N sub-bands to 1, and initialize the loop counter j to 0.
- Step 3: Determine band_wnorm, the sum of the sub-band normalization factors of those of the N sub-bands whose sub-band normalization factor is greater than zero.
- Step 4: Allocate a number of bits to each of the N sub-bands whose sub-band normalization factor is greater than zero.
- Step 5: Determine whether the number of bits allocated to the last of the N sub-bands is less than a fixed threshold fac; if it is, set the number of bits allocated to that sub-band to zero.
- Step 6: Add 1 to the loop counter j.
- Step 7: Restore the original ordering of all sub-bands in the group.
- Alternatively, the allocating module 242 may perform the following specific steps:
- Step 1: Sort the sub-band normalization factors of all sub-bands in the group in descending order and select the top N sub-bands, where N is the number of sub-bands in the group on which bit allocation is performed.
- Step 2: Initialize the number of bits of each of the N sub-bands to 1, initialize the loop counter j to 0, and initialize the allocated number of bits bit_sum to 0.
- Step 3: Determine band_wnorm, the sum of the sub-band normalization factors of those of the N sub-bands whose sub-band normalization factor is greater than zero.
- Step 4: Allocate a number of bits to each of the N sub-bands whose sub-band normalization factor is greater than zero.
- Step 5: Determine whether the number of bits allocated to a sub-band among the N sub-bands is less than a fixed threshold fac; if it is, set the number of bits allocated to that sub-band to zero.
- Step 6: Calculate temp_sum, the sum of the numbers of bits allocated to all N sub-bands.
- Step 7: Add 1 to the loop counter j.
- Step 8: Determine whether temp_sum and bit_sum are equal. If they are equal, go to step 10; otherwise, continue with step 9.
- Step 9: Update bit_sum by assigning the value of temp_sum to bit_sum.
- Step 10: Restore the original ordering of all sub-bands in the group.
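
The ten steps above can be sketched as a small loop. Several details are not given in the listing and are assumptions here: the proportional formula `group_bits * factor / band_wnorm` for step 4, the reading of step 5 as a test on the still-active sub-band with the smallest factor, the removal of a zeroed sub-band from later passes by zeroing its working factor, the return to step 3 after step 9, and the use of fractional bit counts. The function name `allocate_within_group` is hypothetical.

```python
def allocate_within_group(norms, group_bits, band_num, fac):
    """Distribute group_bits over the band_num sub-bands with the largest factors."""
    # Step 1: indices of the top band_num sub-bands, largest factor first.
    order = sorted(range(len(norms)), key=lambda i: norms[i], reverse=True)[:band_num]
    work = {i: float(norms[i]) for i in order}  # working factors (assumption: zeroed on drop)
    # Step 2: one bit per selected sub-band; loop counter and bit_sum start at 0.
    bits = {i: 1.0 for i in order}
    j = 0
    bit_sum = 0.0
    while True:
        active = [i for i in order if work[i] > 0]
        if not active:
            break
        # Step 3: sum of the factors that are still greater than zero.
        band_wnorm = sum(work[i] for i in active)
        # Step 4 (assumed formula): proportional share of the group's bits.
        for i in active:
            bits[i] = group_bits * work[i] / band_wnorm
        # Step 5 (assumed reading): drop the weakest active sub-band if below fac.
        last = active[-1]
        if bits[last] < fac:
            bits[last] = 0.0
            work[last] = 0.0
        # Steps 6 and 7: total allocated so far, then count the pass.
        temp_sum = sum(bits.values())
        j += 1
        # Step 8: stop once the total no longer changes between passes.
        if temp_sum == bit_sum:
            break
        # Step 9: remember the total and run another pass.
        bit_sum = temp_sum
    # Step 10: report the bits in the original sub-band order.
    result = [0.0] * len(norms)
    for i in order:
        result[i] = bits[i]
    return result


if __name__ == "__main__":
    print(allocate_within_group([8.0, 1.0, 4.0, 0.5], 27, 3, 3.0))
```

In this sketch each pass either drops one sub-band or leaves the allocation unchanged, so the loop ends after at most a few passes more than the number of selected sub-bands.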
- The first allocating unit 23 may further divide the sub-bands in a group into a plurality of sub-groups and obtain group parameters for each sub-group; the bits assigned to the group are then assigned to each sub-group based on the group parameters of that sub-group.
- The second allocating unit 24 is configured to assign the bits of the audio signal assigned to each of the groups to the sub-bands in that group in accordance with the sub-band normalization factors.
- Through grouping, the apparatus for bit allocation of the audio signal keeps the allocation relatively stable between preceding and following frames and reduces the influence of the global allocation on local discontinuity.
- The bit allocation within each group can use different threshold parameters, so that bits are allocated more adaptively, and the allocation within a group can follow the spectral characteristics of the signal. For example, for harmonic-like signals whose energy is concentrated in frequency, the allocation concentrates on the sub-bands with high energy, and the sub-bands between harmonics need not receive many bits; for signals with a flatter spectrum, the allocation tries to preserve smoothness between sub-bands. In this way the allocated bits are spent on quantizing the important spectral information.
- The disclosed systems, devices, and methods may be implemented in other ways.
- The device embodiments described above are merely illustrative.
- For example, the division into units is only a division by logical function; in an actual implementation there may be other divisions: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed.
- The mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device, or unit, and may be electrical, mechanical, or of another form.
- The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solution of the embodiment.
- Each functional unit in the various embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
- If the functions are implemented as software functional units and sold or used as separate products, they may be stored in a computer-readable storage medium.
- Accordingly, the technical solution of the present invention in essence, or the part that contributes over the prior art, or a part of the technical solution, may be embodied in the form of a software product that is stored in a storage medium and includes instructions causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the various embodiments of the present invention.
- The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
- FIG. 3 is a schematic block diagram of another embodiment of an apparatus 30 for bit allocation of an audio signal according to the present invention.
- The device 30 includes a processor 31, a memory 32, an input device 33, an output device 34, and the like, which communicate with one another via a bus.
- The processor 31 calls the program stored in the memory 32 to execute the steps of the embodiment of the bit allocation method for the audio signal.
- The processor 31 is configured to execute the program of the embodiment of the present invention stored in the memory 32 and to communicate bidirectionally with other devices via the bus.
- The memory 32 may include RAM and ROM, or any fixed storage medium or removable storage medium.
- The memory 32 and the processor 31 may also be integrated into a physical module to which embodiments of the present invention are applied, on which the programs implementing the embodiments of the present invention are stored and executed.
- The input device 33 may include any suitable means, such as a keyboard or a mouse, for receiving user input or input from other devices and transmitting it to the processor 31.
- The output device 34 outputs the result of the bit allocation of the audio signal and may be a display, a printer, or the like.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Computational Linguistics (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Mathematical Physics (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Software Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020157003447A KR101661868B1 (ko) | 2012-07-13 | 2013-05-29 | 오디오 신호를 위한 비트 할당 방법 및 장치 |
JP2015520801A JP6092383B2 (ja) | 2012-07-13 | 2013-05-29 | オーディオ信号中でビットを割り当てる方法及び装置 |
KR1020167026037A KR101736705B1 (ko) | 2012-07-13 | 2013-05-29 | 오디오 신호를 위한 비트 할당 방법 및 장치 |
EP13816528.7A EP2863388B1 (en) | 2012-07-13 | 2013-05-29 | Bit allocation method and device for audio signal |
US14/595,672 US9424850B2 (en) | 2012-07-13 | 2015-01-13 | Method and apparatus for allocating bit in audio signal |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210243316.4 | 2012-07-13 | ||
CN201210243316.4A CN103544957B (zh) | 2012-07-13 | 2012-07-13 | 音频信号的比特分配的方法和装置 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US14/595,672 Continuation US9424850B2 (en) | 2012-07-13 | 2015-01-13 | Method and apparatus for allocating bit in audio signal |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2014008786A1 true WO2014008786A1 (zh) | 2014-01-16 |
Family
ID=49915373
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2013/076393 WO2014008786A1 (zh) | 2012-07-13 | 2013-05-29 | 音频信号的比特分配的方法和装置 |
Country Status (6)
Country | Link |
---|---|
US (1) | US9424850B2 (zh) |
EP (1) | EP2863388B1 (zh) |
JP (2) | JP6092383B2 (zh) |
KR (2) | KR101661868B1 (zh) |
CN (2) | CN106941004B (zh) |
WO (1) | WO2014008786A1 (zh) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105632505A (zh) * | 2014-11-28 | 2016-06-01 | 北京天籁传音数字技术有限公司 | 主成分分析pca映射模型的编解码方法及装置 |
JP2019152871A (ja) * | 2014-04-29 | 2019-09-12 | 華為技術有限公司Huawei Technologies Co.,Ltd. | 信号処理方法及び装置 |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3117432B1 (en) | 2014-03-14 | 2019-05-08 | Telefonaktiebolaget LM Ericsson (publ) | Audio coding method and apparatus |
CN106409300B (zh) * | 2014-03-19 | 2019-12-24 | 华为技术有限公司 | 用于信号处理的方法和装置 |
EP3483882A1 (en) | 2017-11-10 | 2019-05-15 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Controlling bandwidth in encoders and/or decoders |
US11133891B2 (en) | 2018-06-29 | 2021-09-28 | Khalifa University of Science and Technology | Systems and methods for self-synchronized communications |
US10951596B2 (en) * | 2018-07-27 | 2021-03-16 | Khalifa University of Science and Technology | Method for secure device-to-device communication using multilayered cyphers |
US11355139B2 (en) | 2020-09-22 | 2022-06-07 | International Business Machines Corporation | Real-time vs non-real time audio streaming |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1910656A (zh) * | 2004-01-20 | 2007-02-07 | 杜比实验室特许公司 | 基于块分组的音频编码 |
EP1852849A1 (en) * | 2006-05-05 | 2007-11-07 | Deutsche Thomson-Brandt Gmbh | Method and apparatus for lossless encoding of a source signal, using a lossy encoded data stream and a lossless extension data stream |
CN101101755A (zh) * | 2007-07-06 | 2008-01-09 | 北京中星微电子有限公司 | 一种音频编码的比特分配及量化方法及音频编码装置 |
CN101499279A (zh) * | 2009-03-06 | 2009-08-05 | 武汉大学 | 空间参数逐级精细的比特分配方法及其装置 |
US20090313029A1 (en) * | 2006-07-14 | 2009-12-17 | Anyka (Guangzhou) Software Technologiy Co., Ltd. | Method And System For Backward Compatible Multi Channel Audio Encoding and Decoding with the Maximum Entropy |
Family Cites Families (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE3270212D1 (en) * | 1982-04-30 | 1986-05-07 | Ibm | Digital coding method and device for carrying out the method |
GB8421498D0 (en) * | 1984-08-24 | 1984-09-26 | British Telecomm | Frequency domain speech coding |
US5752225A (en) * | 1989-01-27 | 1998-05-12 | Dolby Laboratories Licensing Corporation | Method and apparatus for split-band encoding and split-band decoding of audio information using adaptive bit allocation to adjacent subbands |
WO1995032499A1 (fr) * | 1994-05-25 | 1995-11-30 | Sony Corporation | Procede de codage, procede de decodage, procede de codage-decodage, codeur, decodeur et codeur-decodeur |
JP3491425B2 (ja) * | 1996-01-30 | 2004-01-26 | ソニー株式会社 | 信号符号化方法 |
JP3521596B2 (ja) * | 1996-01-30 | 2004-04-19 | ソニー株式会社 | 信号符号化方法 |
DE69924922T2 (de) * | 1998-06-15 | 2006-12-21 | Matsushita Electric Industrial Co., Ltd., Kadoma | Audiokodierungsmethode und Audiokodierungsvorrichtung |
JP3466507B2 (ja) * | 1998-06-15 | 2003-11-10 | 松下電器産業株式会社 | 音声符号化方式、音声符号化装置、及びデータ記録媒体 |
JP4242516B2 (ja) * | 1999-07-26 | 2009-03-25 | パナソニック株式会社 | サブバンド符号化方式 |
JP4287545B2 (ja) * | 1999-07-26 | 2009-07-01 | パナソニック株式会社 | サブバンド符号化方式 |
JP2001094433A (ja) * | 1999-09-17 | 2001-04-06 | Matsushita Electric Ind Co Ltd | サブバンド符号化・復号方法 |
JP2002091498A (ja) * | 2000-09-19 | 2002-03-27 | Victor Co Of Japan Ltd | オーディオ信号符号化装置 |
DE60135487D1 (de) * | 2000-12-22 | 2008-10-02 | Sony Corp | Codierer |
US7725313B2 (en) * | 2004-09-13 | 2010-05-25 | Ittiam Systems (P) Ltd. | Method, system and apparatus for allocating bits in perceptual audio coders |
KR100754389B1 (ko) * | 2005-09-29 | 2007-08-31 | 삼성전자주식회사 | 음성 및 오디오 신호 부호화 장치 및 방법 |
GB2454190A (en) | 2007-10-30 | 2009-05-06 | Cambridge Silicon Radio Ltd | Minimising a cost function in encoding data using spectral partitioning |
US8207875B2 (en) | 2009-10-28 | 2012-06-26 | Motorola Mobility, Inc. | Encoder that optimizes bit allocation for information sub-parts |
US8386266B2 (en) * | 2010-07-01 | 2013-02-26 | Polycom, Inc. | Full-band scalable audio codec |
CN102081926B (zh) * | 2009-11-27 | 2013-06-05 | 中兴通讯股份有限公司 | 格型矢量量化音频编解码方法和系统 |
US8831932B2 (en) | 2010-07-01 | 2014-09-09 | Polycom, Inc. | Scalable audio in a multi-point environment |
US9536534B2 (en) * | 2011-04-20 | 2017-01-03 | Panasonic Intellectual Property Corporation Of America | Speech/audio encoding apparatus, speech/audio decoding apparatus, and methods thereof |
CN102208188B (zh) | 2011-07-13 | 2013-04-17 | 华为技术有限公司 | 音频信号编解码方法和设备 |
2012
- 2012-07-13 CN CN201710079399.0A patent/CN106941004B/zh active Active
- 2012-07-13 CN CN201210243316.4A patent/CN103544957B/zh active Active
2013
- 2013-05-29 KR KR1020157003447A patent/KR101661868B1/ko active IP Right Grant
- 2013-05-29 WO PCT/CN2013/076393 patent/WO2014008786A1/zh active Application Filing
- 2013-05-29 JP JP2015520801A patent/JP6092383B2/ja active Active
- 2013-05-29 EP EP13816528.7A patent/EP2863388B1/en active Active
- 2013-05-29 KR KR1020167026037A patent/KR101736705B1/ko active IP Right Grant
2015
- 2015-01-13 US US14/595,672 patent/US9424850B2/en active Active
2017
- 2017-02-08 JP JP2017021030A patent/JP6351770B2/ja active Active
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2019152871A (ja) * | 2014-04-29 | 2019-09-12 | 華為技術有限公司Huawei Technologies Co.,Ltd. | 信号処理方法及び装置 |
JP2021043453A (ja) * | 2014-04-29 | 2021-03-18 | 華為技術有限公司Huawei Technologies Co.,Ltd. | 信号処理方法及び装置 |
US11081121B2 (en) | 2014-04-29 | 2021-08-03 | Huawei Technologies Co., Ltd. | Signal processing method and device |
JP7144499B2 (ja) | 2014-04-29 | 2022-09-29 | 華為技術有限公司 | 信号処理方法及び装置 |
US11580996B2 (en) | 2014-04-29 | 2023-02-14 | Huawei Technologies Co., Ltd. | Signal processing method and device |
US11881226B2 (en) | 2014-04-29 | 2024-01-23 | Huawei Technologies Co., Ltd. | Signal processing method and device |
CN105632505A (zh) * | 2014-11-28 | 2016-06-01 | 北京天籁传音数字技术有限公司 | 主成分分析pca映射模型的编解码方法及装置 |
CN105632505B (zh) * | 2014-11-28 | 2019-12-20 | 北京天籁传音数字技术有限公司 | 主成分分析pca映射模型的编解码方法及装置 |
Also Published As
Publication number | Publication date |
---|---|
JP6092383B2 (ja) | 2017-03-08 |
KR20160114192A (ko) | 2016-10-04 |
CN103544957B (zh) | 2017-04-12 |
EP2863388A1 (en) | 2015-04-22 |
EP2863388B1 (en) | 2018-09-12 |
CN103544957A (zh) | 2014-01-29 |
KR101736705B1 (ko) | 2017-05-16 |
CN106941004B (zh) | 2021-05-18 |
KR101661868B1 (ko) | 2016-09-30 |
US20150162011A1 (en) | 2015-06-11 |
CN106941004A (zh) | 2017-07-11 |
JP6351770B2 (ja) | 2018-07-04 |
KR20150032737A (ko) | 2015-03-27 |
JP2017107224A (ja) | 2017-06-15 |
JP2015524574A (ja) | 2015-08-24 |
US9424850B2 (en) | 2016-08-23 |
EP2863388A4 (en) | 2015-08-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP6702593B2 (ja) | 音声信号の符号化と復号化の方法および装置 | |
WO2014008786A1 (zh) | 音频信号的比特分配的方法和装置 | |
JP6351783B2 (ja) | オーディオ信号のビットを割り当てる方法及び装置 | |
JP6574820B2 (ja) | 高周波帯域信号を予測するための方法、符号化デバイス、および復号デバイス | |
JP6202545B2 (ja) | 帯域幅拡張周波数帯域信号を予測する方法、および復号デバイス | |
US10789964B2 (en) | Dynamic bit allocation methods and devices for audio signal | |
RU2688259C2 (ru) | Способ и устройство обработки сигналов | |
WO2012139401A1 (zh) | 一种音频编码方法及装置 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 13816528 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2015520801 Country of ref document: JP Kind code of ref document: A |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2013816528 Country of ref document: EP |
|
ENP | Entry into the national phase |
Ref document number: 20157003447 Country of ref document: KR Kind code of ref document: A |