WO2014008786A1 - Method and apparatus for bit allocation of audio signals - Google Patents

Method and apparatus for bit allocation of audio signals

Info

Publication number
WO2014008786A1
Authority
WO
WIPO (PCT)
Prior art keywords
group
sub
subbands
band
bits
Prior art date
Application number
PCT/CN2013/076393
Inventor
齐峰岩
苗磊
刘泽新
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to KR1020157003447A (patent KR101661868B1)
Priority to JP2015520801A (patent JP6092383B2)
Priority to KR1020167026037A (patent KR101736705B1)
Priority to EP13816528.7A (patent EP2863388B1)
Publication of WO2014008786A1
Priority to US14/595,672 (patent US9424850B2)

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/002Dynamic bit allocation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/0204Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using subband decomposition
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/032Quantisation or dequantisation of spectral components
    • G10L19/035Scalar quantisation
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16Vocoder architecture
    • G10L19/18Vocoders using multiple modes
    • G10L19/22Mode decision, i.e. based on audio signal content versus external parameters

Definitions

  • Embodiments of the present invention relate to the field of audio technology and, more particularly, to methods and apparatus for bit allocation of audio signals.

Background

  • Transform coding usually divides the frequency-domain coefficients into bands, obtains the normalized energy of each band, normalizes the in-band coefficients by that energy, performs bit allocation, and finally quantizes the coefficients in each band according to the bits allocated to it. Bit allocation is a critical step in this process.
  • Bit allocation means that, in the process of quantizing the spectral coefficients, the bits of the audio signal available for quantizing the spectral coefficients are distributed over the subbands according to the subband characteristics of the spectrum. In other words, the coding resources available to the audio signal are distributed over the subbands; the coding resources are generally expressed in bits.
  • The existing bit allocation process includes: spectrum banding, for example, gradually increasing the bandwidth from low frequency to high frequency according to critical band theory; after banding, finding the normalized energy norm of each subband and quantizing it to obtain the subband normalization factor wnorm; arranging the subbands in descending order of the subband normalization factor wnorm; and bit allocation, for example, iteratively allocating the number of bits of each subband according to the value of the subband normalization factor wnorm.
  • The iterative loop allocation can be refined into the following steps: Step 1, initialize the number of bits of each subband and an iteration factor fac; Step 2, find the band corresponding to the largest subband normalization factor wnorm; Step 3, add the bandwidth value to the number of bits allocated to that band, and reduce the band's subband normalization factor wnorm by the iteration factor fac; Step 4, repeat Steps 2 and 3 until the bit allocation is completed. In the prior art, therefore, the unit of bits allocated each time is the bandwidth value, whereas the minimum number of bits required for quantization is smaller than the bandwidth value. This makes such whole-unit allocation inefficient at low bit rates: many bands receive no bits while other bands receive too many. Moreover, because bits are allocated by cyclic iteration over the full band, the iteration parameters are the same for subbands of different bandwidths, which makes the allocation result very random, the quantization scattered, and consecutive frames discontinuous.
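  • The prior-art loop described above can be sketched as follows. This is a minimal illustration only: the function name, the default value of fac, and the stop rule (stop when the strongest band can no longer be granted a full bandwidth of bits) are assumptions, since the text only says the loop runs until the allocation is completed.

```python
def iterative_allocation(wnorm, bandwidth, total_bits, fac=1.0):
    """Prior-art style loop: repeatedly grant the strongest band one bandwidth of bits."""
    wnorm = list(wnorm)              # working copy, lowered as bands receive bits
    bits = [0] * len(wnorm)          # Step 1: initialize per-subband bit counts
    remaining = total_bits
    while True:
        # Step 2: find the band with the largest normalization factor
        p = max(range(len(wnorm)), key=lambda i: wnorm[i])
        if bandwidth[p] > remaining:
            break                    # assumed stop rule, not stated in the text
        # Step 3: add the bandwidth value to the band's bits and lower its factor by fac
        bits[p] += bandwidth[p]
        remaining -= bandwidth[p]
        wnorm[p] -= fac
    return bits                      # Step 4 is the loop itself


print(iterative_allocation([5, 3, 1], [8, 8, 16], total_bits=24))
```

  • With these hypothetical inputs the first band receives every bit and the other two receive none, which is the kind of uneven result at low bit rates that the text criticizes.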
  • Bit allocation thus has a large impact on performance. The usual bit allocation distributes bits over the whole frequency band according to the normalized energy of each subband. When the bit rate is insufficient, the allocation is random and scattered, and quantization discontinuity arises in the time domain.

Summary of the invention

  • Embodiments of the present invention provide a method and apparatus for bit allocation of an audio signal, which can solve the problem that, at low and medium bit rates, existing bit allocation methods produce random and scattered allocations and thereby cause quantization discontinuity in the time domain.
  • In one aspect, a method for bit allocation of an audio signal is provided, including: dividing the frequency band of an audio signal into a plurality of subbands, and quantizing the subband normalization factor of each subband; dividing the plurality of subbands into a plurality of groups, each of which includes one or more subbands, and obtaining group parameters of each group, where the group parameters characterize the signal characteristics and energy attributes of the audio signal of the corresponding group; allocating coded bits to at least one group according to the group parameters of each group, where the sum of the numbers of coded bits allocated to the at least one group is the number of coded bits of the audio signal; and allocating the coded bits allocated to the at least one group to the subbands of each of the at least one group according to the subband normalization factor of each subband in each of the at least one group.
  • In another aspect, an apparatus for bit allocation of an audio signal is provided, comprising: a subband quantization unit, configured to divide the frequency band of the audio signal into a plurality of subbands and quantize the subband normalization factor of each subband; a grouping unit, configured to divide the plurality of subbands into a plurality of groups, each of which includes one or more subbands, and obtain group parameters of each group, where the group parameters characterize the signal characteristics and energy attributes of the audio signal of the corresponding group; a first allocation unit, configured to allocate coded bits to at least one group according to the group parameters of each group, where the sum of the numbers of coded bits allocated to the at least one group is the number of coded bits of the audio signal; and a second allocation unit, configured to allocate the coded bits allocated to the at least one group to the subbands of each of the at least one group according to the subband normalization factor of each subband in each of the at least one group.
  • FIG. 1 is a flow chart of a method of bit allocation of an audio signal in accordance with an embodiment of the present invention.
  • FIG. 2 is a block diagram showing the structure of an apparatus for bit allocation of an audio signal according to an embodiment of the present invention.
  • FIG. 3 is a block diagram showing the structure of an apparatus for bit allocation of an audio signal according to another embodiment of the present invention.

Detailed description

  • Coding and decoding technology solutions are widely used in various electronic devices, such as mobile phones, wireless devices, personal data assistants (PDAs), handheld or portable computers, GPS receivers/navigators, cameras, audio/video players, camcorders, video recorders, and surveillance equipment.
  • Such an electronic device includes an audio encoder or an audio decoder. The audio encoder or decoder may be implemented directly by a digital circuit or a chip such as a DSP (digital signal processor), or by software code driving a processor to execute the procedures in the software code.
  • An audio time-domain signal is first converted into a frequency-domain signal; coded bits are then allocated to the frequency-domain signal for encoding, and the encoded signal is transmitted to a decoding end through a communication system.
  • The decoding end decodes and recovers the encoded signal.
  • The present invention performs bit allocation based on grouping and on the characteristics of the signal.
  • The bands are first grouped; then, according to the characteristics of each group, the energy in the group is weighted, and bits are allocated to each group according to the weighted energy; finally, bits are assigned to each band according to the characteristics of the signal within the group. Because bits are first allocated to whole groups, discontinuous allocation is avoided, which improves the coding quality for different signals.
  • In addition, the characteristics of the signal are taken into account in the intra-group allocation, so that the limited bits can be allocated to the perceptually important audio bands.
  • FIG. 1 is a flow chart of a method of bit allocation of an audio signal in accordance with an embodiment of the present invention.
  • 101: Divide the frequency band of the audio signal into a plurality of subbands, and quantize the subband normalization factor of each subband.
  • The MDCT transform is taken as an example for description.
  • The input audio signal is subjected to the MDCT transform to obtain frequency-domain coefficients.
  • The MDCT transform here can include windowing, time-domain aliasing, and a discrete cosine transform (DCT).
  • The frequency-domain envelope is then extracted from the MDCT coefficients and quantized.
  • The entire frequency band is divided into subbands of different frequency-domain resolutions, the normalization factor of each subband is extracted, and the subband normalization factor is quantized.
  • For example, the frequency band corresponding to a 16 kHz bandwidth with a frame length of 20 ms (640 samples) can be divided into the following 44 subbands:
  • Here, $L_p$ is the number of coefficients in subband $p$, $s_p$ is the starting point of the subband, $e_p$ is the ending point of the subband, and $P$ is the total number of subbands.
  • After the normalization factor is obtained, it can be quantized in the log domain to obtain the quantized subband normalization factor wnorm.
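  • As a rough illustration of this step, the sketch below computes one normalization factor per subband and quantizes it in the log domain. The formula itself did not survive in this text, so the root-mean-square definition, the log2 quantization with a step of 1, the `floor` guard, and all names are assumptions rather than the patent's definition.

```python
import math


def subband_norms(coeffs, edges):
    """Assumed RMS norm per subband; subband p covers coeffs[edges[p]:edges[p + 1]]."""
    norms = []
    for start, stop in zip(edges[:-1], edges[1:]):
        band = coeffs[start:stop]                     # stop is exclusive here
        norms.append(math.sqrt(sum(c * c for c in band) / len(band)))
    return norms


def quantize_log(norms, floor=1e-9):
    """Quantize each norm to an integer index in the log2 domain (assumed step of 1)."""
    return [round(math.log2(max(n, floor))) for n in norms]


norms = subband_norms([3, 4, 0, 0, 8, 8, 8, 8], edges=[0, 2, 4, 8])
wnorm = quantize_log(norms)
```

  • The half-open `edges` list is a convenience of the sketch; the text instead names an inclusive start point and end point for each subband.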
  • Subbands having the same bandwidth may be divided into one group; preferably, adjacent subbands having the same bandwidth are divided into one group.
  • For example, all subbands can be divided into four groups; at low bit rates only the first two or the first three groups are used, and the remaining groups are not allocated bits.
  • Alternatively, subbands whose subband normalization factors wnorm are close to each other can be grouped together.
  • For example, it is determined whether the subband normalization factor wnorm[i] of each subband is greater than a predetermined threshold K, and if so the subband number i is recorded. The subbands whose wnorm[i] is greater than K are finally grouped into one group, and the remaining subbands are grouped into another group. It should be understood that a plurality of predetermined thresholds may be set according to different needs, thereby obtaining more groups.
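  • The two grouping rules can be sketched as follows. The function names and the representation of a group as a list of subband indices are illustrative assumptions.

```python
from itertools import groupby


def group_by_bandwidth(bandwidths):
    """Group adjacent subbands that share the same bandwidth."""
    groups, start = [], 0
    for _, run in groupby(bandwidths):
        n = len(list(run))                       # length of this run of equal bandwidths
        groups.append(list(range(start, start + n)))
        start += n
    return groups


def group_by_threshold(wnorm, K):
    """Put subbands with wnorm[i] > K in one group and the rest in another."""
    above = [i for i, w in enumerate(wnorm) if w > K]
    rest = [i for i, w in enumerate(wnorm) if w <= K]
    return [above, rest]


print(group_by_bandwidth([8, 8, 16, 16, 16, 24]))
print(group_by_threshold([10, 3, 7, 1], K=5))
```

  • Using several thresholds, as the text suggests, would amount to applying `group_by_threshold` repeatedly to the remaining subbands.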
  • The group parameters of each group can be obtained to characterize the energy properties of the group.
  • The group parameters may include one or more of the following: the sum of the intra-group subband normalization factors, group_wnorm, and the peak-to-average ratio of the intra-group subband normalization factors, group_sharp.
  • The peak-to-average ratio group_sharp is the ratio of the peak value of the subband normalization factors in the group to their mean value:
  • $\mathrm{group\_sharp}[i] = \mathrm{group\_peak}[i] / \mathrm{group\_avg}[i]$, where group_peak[i] is the peak of the subband normalization factors of the i-th group, and group_avg[i] is the average of the subband normalization factors of the i-th group.
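  • A minimal sketch of computing both group parameters from the definitions above. The function name and the guard for a group whose factors are all zero are assumptions.

```python
def group_parameters(wnorm, groups):
    """Return (group_wnorm, group_sharp), one entry per group of subband indices."""
    group_wnorm, group_sharp = [], []
    for members in groups:
        values = [wnorm[i] for i in members]
        total = sum(values)                      # group_wnorm: sum of the factors
        avg = total / len(values)                # group_avg
        peak = max(values)                       # group_peak
        group_wnorm.append(total)
        # Guard for an all-zero group is an assumption, not taken from the text
        group_sharp.append(peak / avg if avg > 0 else 0.0)
    return group_wnorm, group_sharp


gw, gs = group_parameters([8, 4, 2, 2], groups=[[0, 1], [2, 3]])
```

  • A flat group yields a ratio of 1, and the ratio grows as the energy concentrates in a few subbands.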
  • the bits of the audio signal can be assigned to each group according to the group parameters.
  • the principle of grouping is used to consider the energy properties of the group, so that the bit allocation of the audio signal is more concentrated, and the bit allocation between frames is more continuous.
  • the group parameters are not limited to the ones listed herein, but may be other parameters that can characterize the energy properties of the group.
  • In some cases, bits are allocated to only some of the groups. For example, a group whose sum of intra-group subband normalization factors is zero is not allocated bits; likewise, when the number of bits is very small, some groups will not be allocated bits.
  • For example, coded bits may be allocated to at least one group according to the sum of the subband normalization factors in each group, where the sum of the numbers of coded bits allocated to the at least one group is the number of coded bits of the audio signal.
  • the result of assigning bits of the audio signal to each group can also be optimized by adjusting the group parameters. For example, different weights are assigned to group parameters of different groups according to different allocation requirements. The limited number of bits is allocated in the appropriate group and then allocated in the group so that the bit allocation is no longer dispersed, which will facilitate the encoding of the audio signal.
  • For example, the sum of the intra-group subband normalization factors group_wnorm may be weighted according to the peak-to-average ratio of the subband normalization factors in the group, to obtain the weighted sum of intra-group subband normalization factors group_wnorm_w.
  • Specifically, the peak-to-average ratio group_sharp[i] of the intra-group subband normalization factors of a first group is compared with the peak-to-average ratio group_sharp[i-1] of a second group. If group_sharp[i] of the first group is larger than group_sharp[i-1] of the second group by a first threshold, the sum of the intra-group subband normalization factors of the first group is adjusted according to a first weighting factor, and that of the second group is adjusted according to a second weighting factor.
  • The converse also applies: when the peak-to-average ratio of the second group is larger than that of the first group by the corresponding threshold, the sum of the intra-group subband normalization factors of the second group is adjusted according to the first weighting factor, and that of the first group is adjusted according to the second weighting factor.
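  • The comparison of neighbouring groups might look like the sketch below. The text does not say how a weighting factor is applied or what values are used, so multiplying the group sum by the factor, using one threshold for both directions, and the example numbers are all assumptions.

```python
def weight_group_sums(group_wnorm, group_sharp, threshold, w_first, w_second):
    """Return group_wnorm_w after comparing each group i with group i-1."""
    weighted = list(group_wnorm)
    for i in range(1, len(weighted)):
        diff = group_sharp[i] - group_sharp[i - 1]
        if diff > threshold:             # group i has the larger peak-to-average ratio
            weighted[i] *= w_first
            weighted[i - 1] *= w_second
        elif -diff > threshold:          # converse case: group i-1 has the larger ratio
            weighted[i - 1] *= w_first
            weighted[i] *= w_second
    return weighted


print(weight_group_sums([10.0, 10.0], [1.0, 3.0], threshold=1.0, w_first=1.5, w_second=0.5))
```

  • With these hypothetical values the second group's sum grows and the first group's sum shrinks, so the second group would later receive a larger share of the bits.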
  • Only a simple weighting method is illustrated here.
  • Other weighting methods that adjust the weights of the subbands by different weighting coefficients should be readily apparent to those skilled in the art. For example, the weight of subbands that need more signal bits can be increased, while the weight of subbands that need few or no signal bits can be reduced.
  • The bits of the audio signal are then assigned to each group based on the weighted sum of intra-group subband normalization factors. For example, the number of group bits of a group is determined according to the ratio of the group's weighted sum of intra-group subband normalization factors to the sum sum_wnorm of the subband normalization factors of all subbands, and the bits of the audio signal are assigned to the group according to the determined number of group bits.
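  • The proportional split can be sketched as below. Two choices here are assumptions rather than statements from the text: the denominator is taken as the sum of the weighted group sums, so that the shares add up to one, and the bits lost to integer truncation are handed to the strongest group so that the group totals equal the number of coded bits.

```python
def allocate_group_bits(group_wnorm_w, total_bits):
    """Split total_bits across groups in proportion to their weighted sums."""
    sum_wnorm = sum(group_wnorm_w)               # assumed denominator, see lead-in
    if sum_wnorm <= 0:
        return [0] * len(group_wnorm_w)
    bits = [int(total_bits * g / sum_wnorm) for g in group_wnorm_w]
    leftover = total_bits - sum(bits)            # bits lost to truncation
    strongest = max(range(len(bits)), key=lambda i: group_wnorm_w[i])
    bits[strongest] += leftover                  # assumed tie-break for the remainder
    return bits


print(allocate_group_bits([5.0, 15.0, 0.0], total_bits=100))
```

  • A group whose weighted sum is zero receives no bits under this rule.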
  • the subbands within the group can be bit allocated using existing iterative loop allocation methods. However, the iterative loop allocation method still makes the bit allocation result in the group very random, and the front and back frames are not continuous.
  • Therefore, the bits of the audio signal assigned to a group can be allocated to the subbands within the group according to the subband normalization factors within the group, taking into account the signal characteristics of different audio signals, i.e., different signal types.
  • The number of subbands in the group that can be allocated bits can be determined first. Then, according to the type of the audio signal, the bits of the audio signal allocated to the group are allocated, according to the subband normalization factors in the group, to the subbands in the group in which bit allocation is performed, where the number of such subbands is equal to band_num.
  • the number of sub-bands of the initial bit allocation in each group may be determined according to the number of group bits and the third threshold, wherein the third threshold represents the minimum number of bits used to quantize a normalized spectral coefficient. For example, if a group is assigned 13 bits and the third threshold is 7 bits, then the number of subbands allocated by the initial bits in the group is 2. Then, the number of subbands band_num for bit allocation in the group is determined according to the number of subbands allocated in the initial bit in the group and the total number of subbands in the group.
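  • For example, if the number of subbands of the initial bit allocation exceeds the total number of subbands in the group, band_num is the total number of subbands in the group; otherwise, band_num is the number of subbands of the initial bit allocation in the group.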
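  • A minimal sketch of determining band_num. The rounding rule is an assumption: the example in the text (13 group bits with a 7-bit threshold giving 2 subbands) rules out rounding down, and rounding up is used here. Taking the smaller of the two counts follows the description of the determining module.

```python
import math


def subbands_for_allocation(group_bits, total_subbands, min_bits):
    """Return band_num: how many subbands of a group take part in bit allocation."""
    initial = math.ceil(group_bits / min_bits)   # assumed rounding (13 / 7 -> 2)
    return min(initial, total_subbands)          # never more than the group contains


print(subbands_for_allocation(group_bits=13, total_subbands=5, min_bits=7))
```

  • Here `min_bits` plays the role of the third threshold, the minimum number of bits used to quantize one normalized spectral coefficient.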
  • The bits may then be allocated to band_num subbands in the group according to the subband normalization factors in the group.
  • The signal type of a group may be judged from the peak-to-average ratio group_sharp of its intra-group subband normalization factors.
  • Depending on the outcome, the existing iterative loop allocation method may be used to perform bit allocation for the group. If the audio signal of the group is determined to be a harmonic signal, the existing iterative loop allocation method may also be used, or method a or method b below may be used instead.
  • Method a:
  • Step 1: Sort the subbands in the group by subband normalization factor from largest to smallest, and select the top N subbands, where N is the number of subbands in the group for bit allocation.
  • Step 2: Initialize the number of bits of each of the N subbands to 1, and initialize the loop count j to 0.
  • Step 3: Determine the sum band_wnorm of the subband normalization factors of those of the N subbands whose subband normalization factor is greater than zero.
  • Step 4: Allocate a number of bits to each of the N subbands whose subband normalization factor is greater than zero.
  • Step 5: Determine whether the number of bits allocated to the last of the N subbands is less than a fixed threshold fac; if so, set the number of bits allocated to that subband to zero.
  • Step 6: Add 1 to the loop count j.
  • Step 7: Restore the original ordering of all subbands within the group, i.e., revert to the ordering the subbands had when the subband normalization factor of each subband was quantized.
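  • The steps of method a can be sketched as follows. Several details are not recoverable from this text and are assumptions: the proportional rule used in Step 4, repeating Steps 3 to 6 until the loop count reaches N, excluding a subband from later passes once its bits have been set to zero, and starting zero-factor subbands at 0 bits rather than 1.

```python
def allocate_method_a(wnorm, group_bits, n, fac):
    """Sketch of method a; returns bits per subband in the original order."""
    # Step 1: indices of the n subbands with the largest normalization factors
    order = sorted(range(len(wnorm)), key=lambda i: wnorm[i], reverse=True)[:n]
    # Step 2: start each selected subband at 1 bit (0 if its factor is not positive)
    bits = {i: (1 if wnorm[i] > 0 else 0) for i in order}
    for _ in range(n):                           # Step 6: loop count j = 0 .. n-1 (assumed)
        active = [i for i in order if wnorm[i] > 0 and bits[i] > 0]
        if not active:
            break
        band_wnorm = sum(wnorm[i] for i in active)               # Step 3
        for i in active:                                         # Step 4 (assumed rule)
            bits[i] = int(group_bits * wnorm[i] / band_wnorm)
        last = active[-1]                                        # Step 5: weakest active band
        if bits[last] < fac:
            bits[last] = 0
    # Step 7: report in the original subband order; unselected subbands get 0
    return [bits.get(i, 0) for i in range(len(wnorm))]


print(allocate_method_a([1, 10, 0.5, 6], group_bits=20, n=3, fac=3))
```

  • In this hypothetical run the weakest selected subband falls below fac and is dropped, after which its share is redistributed to the two strong subbands.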
  • Method b:
  • Step 1: Sort the subbands in the group by subband normalization factor from largest to smallest, and select the top N subbands, where N is the number of subbands in the group for bit allocation.
  • Step 2: Initialize the number of bits of each of the N subbands to 1, the loop count j to 0, and the allocated number of bits bit_sum to 0.
  • Step 3: Determine the sum band_wnorm of the subband normalization factors of those of the N subbands whose subband normalization factor is greater than zero.
  • Step 4: Allocate a number of bits to each of the N subbands whose subband normalization factor is greater than zero.
  • Step 5: For each of the N subbands, determine whether the number of bits allocated is less than a fixed threshold fac; if so, set the number of bits allocated to that subband to zero.
  • Step 6: Calculate the sum temp_sum of the numbers of bits allocated to all N subbands.
  • Step 7: Add 1 to the loop count j.
  • Step 8: Determine whether temp_sum and bit_sum are equal; if they are equal, go to Step 10; otherwise, continue with Step 9.
  • Step 9: Update bit_sum by assigning the value of temp_sum to it. Repeat Steps 3 to 9 until the loop count j equals N.
  • Step 10: Restore the original ordering of all subbands within the group.
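  • Method b adds a convergence check on the total number of allocated bits. The sketch below follows the ten steps; the proportional rule in Step 4, the exclusion of subbands already set to zero, and starting zero-factor subbands at 0 bits are assumptions, since the text does not give them.

```python
def allocate_method_b(wnorm, group_bits, n, fac):
    """Sketch of method b; returns bits per subband in the original order."""
    # Step 1: indices of the n subbands with the largest normalization factors
    order = sorted(range(len(wnorm)), key=lambda i: wnorm[i], reverse=True)[:n]
    # Step 2: initial bits, loop count j and bit_sum
    bits = {i: (1 if wnorm[i] > 0 else 0) for i in order}
    bit_sum, j = 0, 0
    while j < n:
        active = [i for i in order if wnorm[i] > 0 and bits[i] > 0]
        if not active:
            break
        band_wnorm = sum(wnorm[i] for i in active)               # Step 3
        for i in active:                                         # Step 4 (assumed rule)
            bits[i] = int(group_bits * wnorm[i] / band_wnorm)
        for i in active:                                         # Step 5
            if bits[i] < fac:
                bits[i] = 0
        temp_sum = sum(bits.values())                            # Step 6
        j += 1                                                   # Step 7
        if temp_sum == bit_sum:                                  # Step 8: converged
            break
        bit_sum = temp_sum                                       # Step 9
    # Step 10: report in the original subband order; unselected subbands get 0
    return [bits.get(i, 0) for i in range(len(wnorm))]


print(allocate_method_b([1, 10, 0.5, 6], group_bits=20, n=3, fac=3))
```

  • The loop stops early once a pass leaves the total unchanged, which is the only structural difference from method a in this sketch besides checking every subband against fac.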
  • Method a and method b can also be combined with the method of determining band_num, that is, intra-group allocation can be adapted to different audio signal characteristics. For example, if the number of subbands of the initial bit allocation in the group is greater than the total number of subbands in the group multiplied by a scale factor k, method a is used; if it is less than or equal to the total number of subbands in the group multiplied by k, method b is used.
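  • The selection rule reduces to a single comparison, sketched below. The function name and the example value of k are hypothetical; the text does not fix k here.

```python
def choose_method(initial_subbands, total_subbands, k):
    """Return 'a' when the initial count exceeds k times the group size, else 'b'."""
    return "a" if initial_subbands > total_subbands * k else "b"


print(choose_method(initial_subbands=7, total_subbands=8, k=0.75))
```

  • Intuitively, method a is chosen when the group has enough bits to cover most of its subbands, and method b when only a few subbands can be served.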
  • In general, the process of bit allocation for the subbands in a group is as follows: from all the subbands in the group, select the N subbands with the largest subband normalization factors as the subbands to be allocated, where N is the number of subbands band_num for bit allocation in the group; then allocate numbers of bits to the N subbands in turn according to their subband normalization factors; finally, restore the original ordering of all subbands of the group.
  • In this way, the bits can be effectively allocated to the frequency bands that reflect the auditory perception of the signal. For example, for a signal with strong harmonics, the bits need to be distributed to the bands containing harmonics, whereas for a signal whose spectral energy is relatively even, the bits need to be distributed evenly.
  • A group can also be further subdivided: the subbands in the group are subdivided into a plurality of subgroups, and the group parameters of each subgroup are obtained; the bits assigned to the group are then assigned to the subgroups according to their group parameters; finally, based on the subband normalization factors, the bits assigned to each subgroup are assigned to the subbands within it. One possibility is to continue refining until there is only one band in each group.
  • The grouping of the embodiments of the present invention keeps the allocation relatively stable between preceding and following frames, and bits are allocated differently within a group according to the signal characteristics, so that the allocated bits are used to quantize the important frequency information, thereby improving the coding quality of the audio signal.
  • By grouping, the method for bit allocation of an audio signal according to an embodiment of the present invention keeps the allocation stable between preceding and following frames and reduces the influence of global allocation on local discontinuity.
  • The bit allocation in each group can use different threshold parameters, so that bits are allocated more adaptively, and the allocation within a group can differ according to the spectral characteristics of the signal. For example, for harmonic-like signals whose spectrum is concentrated, bits are concentrated on the subbands with large energy, and the subbands between the harmonics need not be allocated many bits; for signals with a flatter spectrum, the bit allocation tries to ensure smoothness between the subbands. In this way the allocated bits are used to quantize the important spectral information.
  • A schematic structure of an apparatus for bit allocation of an audio signal according to an embodiment of the present invention will be described below with reference to FIG. 2.
  • The apparatus 20 for bit allocation of audio signals includes a subband quantization unit 21, a grouping unit 22, a first allocation unit 23, and a second allocation unit 24.
  • The subband quantization unit 21 is configured to divide the frequency band of the audio signal into a plurality of subbands, and quantize the subband normalization factor of each subband.
  • The grouping unit 22 is configured to divide the plurality of subbands into a plurality of groups, each of which includes one or more subbands, and obtain group parameters of each group, where the group parameters characterize the signal characteristics and energy attributes of the audio signal of the corresponding group.
  • The first allocation unit 23 is configured to allocate coded bits to at least one group according to the group parameters of each group, where the sum of the numbers of coded bits allocated to the at least one group is the number of coded bits of the audio signal.
  • the grouping unit 22 may be configured to divide the sub-bands having the same bandwidth into one group, so that the plurality of sub-bands are divided into a plurality of groups.
  • the grouping unit 22 may be configured to group the sub-bands whose sub-band normalization factors are close, so that the plurality of sub-bands are divided into a plurality of groups.
  • the subbands in each group can be contiguous.
  • The grouping unit 22 is configured to obtain the sum of the intra-group subband normalization factors of each group and the peak-to-average ratio of the intra-group subband normalization factors of each group. The sum of the intra-group subband normalization factors is the sum of the subband normalization factors of all subbands in the group. The peak-to-average ratio is the ratio of the peak to the mean of the subband normalization factors in the group, where the peak is the maximum of the subband normalization factors of all subbands in the group and the mean is their average.
  • The grouping unit 22 is further configured to weight the sum of the intra-group subband normalization factors of each group according to the peak-to-average ratio of the intra-group subband normalization factors of each group, to obtain the weighted sum of intra-group subband normalization factors of each group.
  • The grouping unit 22 may be configured to compare the peak-to-average ratio of the intra-group subband normalization factors of the first group with that of the second group. If the peak-to-average ratio of the first group is larger than that of the second group by a first threshold, the sum of the intra-group subband normalization factors of the first group is adjusted according to the first weighting factor, and the sum of the intra-group subband normalization factors of the second group is adjusted according to the second weighting factor.
  • The first allocating unit 23 may be configured to allocate coding bits to the at least one group according to the sum of the intra-group sub-band normalization factors of each group, where the sum of the numbers of coding bits allocated to the at least one group is the number of coding bits of the audio signal.
  • Alternatively, the first allocating unit 23 may be configured to allocate coding bits to the at least one group according to the weighted sum of the intra-group sub-band normalization factors, where the sum of the numbers of coding bits allocated to the at least one group is the number of coding bits of the audio signal.
  • Alternatively, the first allocating unit 23 may be configured to determine the number of group bits of a group according to the ratio of the weighted sum of the intra-group sub-band normalization factors of the group to the sum of the sub-band normalization factors of all sub-bands, and to allocate the coding bits of the audio signal to the group according to that number of group bits.
  • The second allocating unit 24 is configured to allocate the coding bits allocated to the at least one group to each sub-band of each of the at least one group, according to the sub-band normalization factor of each sub-band of each of the at least one group.
  • The second allocating unit 24 may include a determining module 241 and an allocating module 242.
  • The determining module 241 is configured to determine the number band_num of sub-bands in the group that are to receive bits.
  • The allocating module 242 is configured to allocate the bits of the audio signal allocated to the group to the sub-bands selected for bit allocation, according to the sub-band normalization factors in the group, where the number of sub-bands selected for bit allocation equals band_num.
  • The determining module 241 may be configured to determine the number of sub-bands for initial bit allocation in the group according to the number of group bits and a third threshold, where the third threshold represents the minimum number of bits used to quantize one normalized spectral coefficient, and to take the smaller of that number and the total number of sub-bands in the group as band_num.
  • Alternatively, the determining module 241 may be configured to determine the number of sub-bands for initial bit allocation in the same way and then compare it with the product of the total number of sub-bands in the group and a scale factor k, where k is used to adjust the total number of sub-bands in the group. If the number of sub-bands for initial bit allocation is less than the product, band_num is the number of sub-bands for initial bit allocation; otherwise band_num is the total number of sub-bands in the group.
  • The allocating module 242 may be configured to select, from all sub-bands in the group, the N sub-bands with the largest sub-band normalization factors as the sub-bands to be allocated, where N is the number of sub-bands in the group for bit allocation; to allocate bits to the N sub-bands in turn according to their sub-band normalization factors; and to restore the original order of all sub-bands in the group.
  • For example, the allocating module 242 performs the following steps:
  • Step 1: Sort the sub-band normalization factors of all sub-bands in the group in descending order and select the first N sub-bands, where N is the number band_num of sub-bands in the group for bit allocation.
  • Step 2: Initialize the number of bits of the N sub-bands to 1, and initialize the loop count j to 0.
  • Step 3: Determine band_wnorm, the sum of the sub-band normalization factors of those of the N sub-bands whose sub-band normalization factor is greater than zero.
  • Step 4: Allocate bits to those of the N sub-bands whose sub-band normalization factor is greater than zero.
  • Step 5: Determine whether the number of bits allocated to the last of the N sub-bands is less than a fixed threshold fac; if so, set the number of bits allocated to that sub-band to zero.
  • Step 6: Increment the loop count j by 1.
  • Repeat steps 3 to 6 until the loop count j equals N.
  • Step 7: Restore the original order of all sub-bands in the group.
  • Alternatively, the allocating module 242 may perform the following steps:
  • Step 1: Sort the sub-band normalization factors of all sub-bands in the group in descending order and select the first N sub-bands, where N is the number band_num of sub-bands in the group for bit allocation.
  • Step 2: Initialize the number of bits of the N sub-bands to 1, initialize the loop count j to 0, and initialize the number of allocated bits bit_sum to 0.
  • Step 3: Determine band_wnorm, the sum of the sub-band normalization factors of those of the N sub-bands whose sub-band normalization factor is greater than zero.
  • Step 4: Allocate bits to those of the N sub-bands whose sub-band normalization factor is greater than zero.
  • Step 5: For each of the N sub-bands, determine whether the number of bits allocated to it is less than a fixed threshold fac; if so, set the number of bits allocated to that sub-band to zero.
  • Step 6: Compute temp_sum, the sum of the numbers of bits allocated to all N sub-bands.
  • Step 7: Increment the loop count j by 1.
  • Step 8: Determine whether temp_sum equals bit_sum. If so, go to step 10; otherwise continue with step 9.
  • Step 9: Update bit_sum by assigning it the value of temp_sum.
  • Repeat steps 3 to 9 until the loop count j equals N.
  • Step 10: Restore the original order of all sub-bands in the group.
  • The first allocating unit 23 may further divide the sub-bands in a group into a plurality of sub-groups and obtain a sub-group parameter of each sub-group; the bits allocated to the group are then allocated to each sub-group according to its sub-group parameter.
  • The second allocating unit 24 is then configured to allocate the bits of the audio signal allocated to each sub-group to each sub-band in that sub-group according to the sub-band normalization factors.
  • The apparatus for bit allocation of an audio signal uses grouping to keep the allocation relatively stable across consecutive frames and to reduce the influence of the global allocation on local discontinuities.
  • In addition, different threshold parameters may be set for the bit allocation in each group, so that bits are allocated more adaptively, and bits are allocated within a group with different emphasis according to the characteristics of the spectral signal. For example, for harmonic-like signals with a concentrated spectrum, bits are concentrated on the high-energy sub-bands and the sub-bands between harmonics need no additional bits; for signals with a flatter spectrum, the allocation keeps the sub-bands as smooth as possible. In this way the allocated bits are used to quantize the important spectral information.
  • In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
  • the units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solution of the embodiment.
  • In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
  • If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part contributing to the prior art, or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention.
  • The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
  • FIG. 3 is a schematic block diagram of another embodiment of an apparatus 30 for bit allocation of audio signals of the present invention.
  • The apparatus 30 includes a processor 31, a memory 32, an input device 33, an output device 34, and the like, which communicate with each other via a bus.
  • the processor 31 calls the program stored in the memory 32 to execute the steps of the embodiment of the bit allocation method of the audio signal.
  • the processor 31 is operative to execute the program of the embodiment of the present invention stored in the memory 32 and to communicate bidirectionally with other devices via the bus.
  • The memory 32 may include RAM and ROM, or any fixed or removable storage medium, and holds the data to be processed.
  • Memory 32 and processor 31 may also be integrated into a physical module to which embodiments of the present invention are applied, on which the programs implementing the embodiments of the present invention are stored and executed.
  • Input device 33 may include any suitable means, such as a keyboard, mouse, etc., for receiving user input or input from other devices and transmitting to processor 31.
  • the output device 34 is for outputting the result of the bit allocation of the audio signal, which may be a display, a printer or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Mathematical Physics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

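A method and an apparatus for bit allocation of an audio signal. The method includes: dividing the frequency band of an audio signal into a plurality of sub-bands and quantizing a sub-band normalization factor of each sub-band (101); dividing the plurality of sub-bands into a plurality of groups and obtaining a group parameter of each group, where the group parameter characterizes the signal characteristics and energy attributes of the audio signal of the corresponding group (102); allocating coding bits to at least one group according to the group parameter of each group, where the sum of the numbers of coding bits allocated to the at least one group is the number of coding bits of the audio signal (103); and allocating the coding bits allocated to the at least one group to each sub-band of each of the at least one group according to the sub-band normalization factor of each sub-band of each of the at least one group (104). At medium and low bit rates, the method and apparatus use grouping to keep the allocation relatively stable across consecutive frames and to reduce the influence of the global allocation on local discontinuities.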

Description

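Method and Apparatus for Bit Allocation of an Audio Signal
This application claims priority to Chinese Patent Application No. 201210243316.4, filed with the Chinese Patent Office on July 13, 2012 and entitled "Method and Apparatus for Bit Allocation of an Audio Signal", which is incorporated herein by reference in its entirety.
Technical Field
Embodiments of the present invention relate to the field of audio technologies, and more specifically, to a method and an apparatus for bit allocation of an audio signal.
Background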
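Current communication places increasing emphasis on audio quality, so codecs are required to improve music quality as much as possible while preserving speech quality. Because a music signal carries a very large amount of information, the CELP (Code Excited Linear Prediction) coding mode of conventional speech coding cannot be used. Instead, transform coding is typically used to process the music signal in the frequency domain and improve its coding quality. How to encode information efficiently with a limited number of coding bits has therefore become a main research topic in audio coding.
Current audio coding technologies generally use an FFT (Fast Fourier Transform) or an MDCT (Modified Discrete Cosine Transform) to convert a time-domain signal into the frequency domain and then encode the frequency-domain signal. Transform coding usually divides the frequency-domain coefficients into bands, computes the normalized energy of each band, normalizes the energy of the coefficients within each band, performs bit allocation, and finally quantizes the coefficients in each band according to the bits allocated to that band. Bit allocation is a critical step. In quantizing the spectral coefficients, bit allocation distributes the bits of the audio signal that are available for quantizing spectral coefficients over the sub-bands according to the sub-band characteristics of the spectrum; that is, the coding resources available to the audio signal, generally expressed in bits, are distributed over the sub-bands.
Specifically, an existing bit allocation process includes: dividing the spectral signal into bands, for example with bandwidth increasing gradually from low to high frequency according to critical band theory; computing the normalized energy norm of each sub-band and quantizing it to obtain the sub-band normalization factor wnorm; sorting the sub-bands in descending order of wnorm; and allocating bits, for example by iteratively allocating the number of bits of each sub-band according to the value of wnorm. The iterative allocation can be broken down into the following steps: step 1, initialize the number of bits of each sub-band and the iteration factor fac; step 2, find the band with the largest sub-band normalization factor wnorm; step 3, increase the number of bits allocated to this band by the bandwidth value, and subtract the iteration factor fac from the value of wnorm; step 4, repeat steps 2 and 3 until all bits are allocated. In the prior art, therefore, the smallest unit of allocation is the bandwidth value, whereas the minimum number of bits needed for quantization is smaller than the bandwidth value. This integer allocation is inefficient at low bit rates: many bands receive no bits while other bands receive too many. Because bits are allocated iteratively over the full band with the same iteration parameters for sub-bands of different bandwidths, the allocation result is rather random, quantization is scattered, and consecutive frames are discontinuous.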
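The prior-art loop described above can be sketched in a few lines of Python. This is only an illustration: the function name, the stopping rule (stop when the bandwidth of the selected band no longer fits in the remaining bits), and the sample numbers are assumptions, since the text does not specify them.

```python
def iterative_allocation(wnorm, widths, total_bits, fac):
    """Greedy prior-art allocation: always serve the band with the largest wnorm.

    Assumes every entry of widths is a positive integer.
    """
    if not wnorm:
        return []
    w = list(wnorm)                 # working copy; reduced by fac after each grant
    bits = [0] * len(w)             # step 1: initialise the bits of every band
    remaining = total_bits
    while True:
        p = max(range(len(w)), key=lambda i: w[i])   # step 2: band with largest wnorm
        if widths[p] > remaining:   # assumed stopping rule, not stated in the text
            break
        bits[p] += widths[p]        # step 3: grant one bandwidth worth of bits
        remaining -= widths[p]
        w[p] -= fac                 # step 3: lower its priority by fac
    return bits
```

With three bands of widths 8, 8 and 16 and 24 bits in total, `iterative_allocation([10, 8, 3], [8, 8, 16], 24, 3)` gives the first band 16 bits and the third band none, which illustrates the uneven result the text criticizes.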
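It follows that bit allocation has a large effect on performance at low bit rates. Conventional bit allocation distributes bits over the full band mainly according to the normalized energy of each sub-band. When the bit rate is insufficient, this allocation is random and scattered, which produces quantization discontinuities in the time domain.
Summary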
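Embodiments of the present invention provide a method and an apparatus for bit allocation of an audio signal, which address the problem that, at medium and low bit rates, existing bit allocation methods produce random and scattered allocations and thus quantization discontinuities in the time domain.
According to one aspect, a method for bit allocation of an audio signal is provided, including: dividing the frequency band of the audio signal into a plurality of sub-bands, and quantizing a sub-band normalization factor of each sub-band; dividing the plurality of sub-bands into a plurality of groups, where one of the plurality of groups includes one or more sub-bands, and obtaining a group parameter of each group, where the group parameter characterizes the signal characteristics and energy attributes of the audio signal of the corresponding group; allocating coding bits to at least one group according to the group parameter of each group, where the sum of the numbers of coding bits allocated to the at least one group is the number of coding bits of the audio signal; and allocating the coding bits allocated to the at least one group to each sub-band of each of the at least one group according to the sub-band normalization factor of each sub-band of each of the at least one group.
According to another aspect, an apparatus for bit allocation of an audio signal is provided, including: a band-division and quantization unit, configured to divide the frequency band of the audio signal into a plurality of sub-bands and quantize a sub-band normalization factor of each sub-band; a grouping unit, configured to divide the plurality of sub-bands into a plurality of groups, where one of the plurality of groups includes one or more sub-bands, and obtain a group parameter of each group, where the group parameter characterizes the signal characteristics and energy attributes of the audio signal of the corresponding group; a first allocating unit, configured to allocate coding bits to at least one group according to the group parameter of each group, where the sum of the numbers of coding bits allocated to the at least one group is the number of coding bits of the audio signal; and a second allocating unit, configured to allocate the coding bits allocated to the at least one group to each sub-band of each of the at least one group according to the sub-band normalization factor of each sub-band of each of the at least one group.
At medium and low bit rates, the embodiments of the present invention use grouping to keep the allocation relatively stable across consecutive frames and to reduce the influence of the global allocation on local discontinuities.
Brief Description of Drawings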
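To describe the technical solutions of the embodiments of the present invention more clearly, the accompanying drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a flowchart of a method for bit allocation of an audio signal according to an embodiment of the present invention.
FIG. 2 is a schematic structural diagram of an apparatus for bit allocation of an audio signal according to an embodiment of the present invention.
FIG. 3 is a schematic structural diagram of an apparatus for bit allocation of an audio signal according to another embodiment of the present invention.
Detailed Description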
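The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Encoding and decoding solutions are widely used in electronic devices such as mobile phones, wireless devices, personal data assistants (PDAs), handheld or portable computers, GPS receivers/navigators, cameras, audio/video players, camcorders, video recorders, and surveillance devices. Such devices typically include an audio encoder or an audio decoder, which may be implemented directly by a digital circuit or chip such as a DSP (digital signal processor), or by software code that drives a processor to execute the procedure in the code.
As an example, in one audio coding solution, the time-domain audio signal is first transformed into a frequency-domain signal, coding bits are then allocated to the frequency-domain signal for encoding, and the encoded signal is transmitted through a communication system to the decoder, which decodes and restores it.
The present invention allocates bits based on grouping and on the characteristics of the signal. The bands are first grouped; the energy within each group is then weighted according to the characteristics of the group; bits are allocated to the groups according to the weighted energy; and the bits are then allocated to each band according to the signal characteristics within the group. Because bits are first allocated to whole groups, discontinuous allocation is avoided, which improves the coding quality of different signals. Because signal characteristics are considered in the allocation within a group, the limited bits can be allocated to the perceptually important audio bands.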
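FIG. 1 is a flowchart of a method for bit allocation of an audio signal according to an embodiment of the present invention.
101. Divide the frequency band of the audio signal into a plurality of sub-bands, and quantize the sub-band normalization factor of each sub-band.
The MDCT is used as an example below. First, an MDCT is applied to the input audio signal to obtain frequency-domain coefficients. The MDCT here may include windowing, time-domain aliasing, and a discrete DCT.
For example, a sine window is applied to the input time-domain signal $x(n)$:
$$h(n) = \sin\left[\left(n + \tfrac{1}{2}\right)\frac{\pi}{2L}\right], \quad n = 0, \ldots, 2L-1 \tag{1}$$
where $L$ is the frame length of the signal. The windowed signal is:
$$x_w(n) = \begin{cases} h(n)\,x_{OLD}(n), & n = 0, \ldots, L-1 \\ h(n)\,x(n-L), & n = L, \ldots, 2L-1 \end{cases} \tag{2}$$
Time-domain aliasing is then performed:
[Equation (3): Figure imgf000006_0001]
Here $I_{L/2}$ and $J_{L/2}$ denote diagonal matrices of order $L/2$:
[Figure imgf000006_0002]
A discrete DCT is applied to the time-domain aliased signal to obtain the MDCT coefficients in the frequency domain:
[Figure imgf000006_0003]
The frequency-domain envelope is then extracted from the MDCT coefficients and quantized. The whole band is divided into sub-bands of different frequency-domain resolution, the normalization factor of each sub-band is extracted, and the sub-band normalization factors are quantized.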
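The windowing stage of equations (1) and (2) can be illustrated with a short Python sketch. The function names are hypothetical, and the window argument (n + 1/2)π/(2L) is the standard MDCT sine window, which is assumed here because the printed formula is damaged in the source. The aliasing and DCT stages are not sketched because their equations survive only as images.

```python
import math

def sine_window(frame_len):
    """Sine window h(n), n = 0 .. 2L-1, for a frame of L samples (equation (1))."""
    two_l = 2 * frame_len
    return [math.sin((n + 0.5) * math.pi / two_l) for n in range(two_l)]

def apply_window(prev_frame, cur_frame):
    """Windowed signal x_w(n) of equation (2).

    The first L samples come from the previous frame x_OLD,
    the last L samples from the current frame x.
    """
    frame_len = len(cur_frame)
    h = sine_window(frame_len)
    first = [h[n] * prev_frame[n] for n in range(frame_len)]
    second = [h[n + frame_len] * cur_frame[n] for n in range(frame_len)]
    return first + second
```

A useful sanity check on this window is that h(n)² + h(n + L)² = 1 for every n in 0 .. L-1, which is the condition that lets overlapping frames add back to the original signal.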
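For example, for an audio signal sampled at 32 kHz, corresponding to a band of 16 kHz bandwidth, with a frame length of 20 ms (640 samples), the band may be divided into the following 44 sub-bands: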
8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 16, 16, 16, 16, 16, 16, 16, 16, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24,
32, 32, 32, 32, 32, 32, 32, 32
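The band is first divided into several groups, and the sub-bands are then refined within each group. The normalization factor of each sub-band may be defined as:
[Figure imgf000007_0001]
where $L_p$ is the number of coefficients in the sub-band, $s_p$ is the start point of the sub-band, $e_p$ is the end point of the sub-band, and $P$ is the total number of sub-bands.
After the normalization factor is obtained, it may be quantized in the logarithmic domain to obtain the quantized sub-band normalization factor wnorm.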
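102. Divide all sub-bands into a plurality of groups and obtain a group parameter of each group, where the group parameter characterizes the signal characteristics and energy attributes of the audio signal of the corresponding group, and one of the plurality of groups includes one or more sub-bands.
Sub-bands with similar characteristics and energy may be placed in the same group. For example, sub-bands with the same bandwidth may be placed in one group, preferably adjacent sub-bands with the same bandwidth. For example, all sub-bands may be divided into four groups; at a low bit rate, only the first two or three groups are used and no bits are allocated to the remaining groups.
Alternatively, grouping may be based on the relationship between the normalized energies norm of the sub-bands; that is, sub-bands with close sub-band normalization factors wnorm may be placed in one group. For example, whether sub-band normalization factors are close may be determined as follows: compare the sub-band normalization factor wnorm[i] (i = 1 ... P-1, where P is the total number of sub-bands) with a predetermined threshold K. If wnorm[i] is greater than K, record the sub-band index i. Finally, the sub-bands whose wnorm[i] is greater than K form one group and the remaining sub-bands form another group. It should be understood that multiple predetermined thresholds may be set according to different requirements to obtain more groups.
Optionally, adjacent sub-bands with close sub-band normalization factors may be placed in one group. For example, whether the sub-band normalization factors of adjacent sub-bands are close may be determined as follows: first compute the difference wnorm_diff[i] between the sub-band normalization factors of adjacent sub-bands, where wnorm_diff[i] = abs(wnorm[i] - wnorm[i-1]), i = 1 ... P-1, and P is the total number of sub-bands. If wnorm_diff[i] is less than a predetermined threshold K', the sub-band normalization factors of the adjacent sub-bands are close, which determines the indices of the adjacent sub-bands that can be placed in one group.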
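The adjacent-difference rule above can be sketched as follows. The function name and the parameter name `k_prime` are hypothetical, and starting a new group whenever the difference reaches the threshold is one straightforward reading of the text, not a confirmed detail of the method.

```python
def group_adjacent(wnorm, k_prime):
    """Split sub-band indices into runs of neighbours with close wnorm values.

    Sub-band i joins the group of sub-band i-1 when
    abs(wnorm[i] - wnorm[i-1]) < k_prime; otherwise it starts a new group.
    """
    if not wnorm:
        return []
    groups = [[0]]
    for i in range(1, len(wnorm)):
        if abs(wnorm[i] - wnorm[i - 1]) < k_prime:
            groups[-1].append(i)
        else:
            groups.append([i])
    return groups
```

For instance, `group_adjacent([10, 9, 9, 4, 3, 3], 2)` yields two groups, `[0, 1, 2]` and `[3, 4, 5]`, because only the jump from 9 to 4 reaches the threshold.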
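Once the sub-bands are grouped, the group parameter of each group can be obtained to characterize the energy attributes of the group. In general, the group parameter may include one or more of the following: the sum of intra-group sub-band normalization factors group_wnorm, and the peak-to-average ratio of intra-group sub-band normalization factors group_sharp.
Specifically, the sum of intra-group sub-band normalization factors group_wnorm is the sum of the sub-band normalization factors of all sub-bands in the group, that is,
$$\mathrm{group\_wnorm}[i] = \sum_{b=S_i}^{E_i} \mathrm{wnorm}[b]$$
where $S_i$ is the first sub-band of the i-th group and $E_i$ is the last sub-band of the i-th group.
The peak-to-average ratio of intra-group sub-band normalization factors group_sharp is the ratio of the peak of the intra-group sub-band normalization factors to their average, that is,
$$\mathrm{group\_sharp}[i] = \frac{\mathrm{group\_peak}[i]}{\mathrm{group\_avg}[i]}$$
where group_peak[i] is the peak and group_avg[i] is the average of the intra-group sub-band normalization factors of the i-th group.
The peak group_peak is the maximum of the sub-band normalization factors of all sub-bands in the group, that is,
$$\mathrm{group\_peak}[i] = \max\left(\mathrm{wnorm}[S_i], \ldots, \mathrm{wnorm}[E_i]\right)$$
where wnorm[$S_i$] is the sub-band normalization factor of the first sub-band of the i-th group and wnorm[$E_i$] is that of the last sub-band of the i-th group.
The average group_avg is the mean of the sub-band normalization factors of all sub-bands in the group, that is,
$$\mathrm{group\_avg}[i] = \frac{\mathrm{group\_wnorm}[i]}{E_i - S_i + 1}$$
where group_wnorm[i] is the sum of intra-group sub-band normalization factors of the i-th group, $S_i$ is the first sub-band of the i-th group, and $E_i$ is the last sub-band of the i-th group.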
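The four group quantities defined above translate directly into code. The sketch below uses a hypothetical function name and returns only the two quantities the text names as group parameters; returning 0.0 for an all-zero group is an assumption made to avoid a division by zero.

```python
def group_params(wnorm, start, end):
    """Return (group_wnorm, group_sharp) for sub-bands start..end inclusive."""
    vals = wnorm[start:end + 1]
    group_wnorm = sum(vals)                       # sum of factors in the group
    group_peak = max(vals)                        # largest factor in the group
    group_avg = group_wnorm / (end - start + 1)   # mean factor in the group
    # Assumption: an all-zero group gets a peak-to-average ratio of 0.0.
    group_sharp = group_peak / group_avg if group_avg > 0 else 0.0
    return group_wnorm, group_sharp
```

For a group with factors 4, 8, 2, 2 the sum is 16, the peak is 8 and the average is 4, so the peak-to-average ratio is 2.0. A flat group has a ratio of 1.0, while a group dominated by one strong sub-band has a larger ratio.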
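103. Allocate coding bits to at least one group according to the group parameter of each group, where the sum of the numbers of coding bits allocated to the at least one group is the number of coding bits of the audio signal.
Because the group parameters characterize the energy attributes of the groups, the bits of the audio signal can be allocated to each group according to the group parameters. When the bit rate is insufficient, grouping and the energy attributes of the groups make the bit allocation of the audio signal more concentrated and more continuous between frames. It should be understood that the group parameters are not limited to those listed here and may be other parameters that characterize the energy attributes of a group. In an embodiment, when the bit rate is insufficient, bits are allocated to only some of the groups. For example, a group whose sum of intra-group sub-band normalization factors is zero receives no bits; likewise, when the number of bits is very small, some groups may receive no bits. That is, once the group parameters above are obtained, coding bits may be allocated to at least one group based only on the sum of intra-group sub-band normalization factors of each group, where the sum of the numbers of coding bits allocated to the at least one group is the number of coding bits of the audio signal.
Further, the result of allocating the bits of the audio signal to the groups may be optimized by adjusting the group parameters. For example, different weights may be assigned to the group parameters of different groups according to different allocation requirements, so that the limited bits are allocated to the appropriate groups and then allocated within those groups. The allocation is then no longer scattered, which benefits the coding of the audio signal.
One implementation is given below only as an example. After the sum of intra-group sub-band normalization factors group_wnorm and the peak-to-average ratio group_sharp of each group are obtained, group_wnorm may be weighted according to group_sharp to obtain the weighted sum of intra-group sub-band normalization factors group_wnorm_w.
Specifically, the peak-to-average ratio group_sharp[i] of a first group is compared with the peak-to-average ratio group_sharp[i-1] of a second group. If the peak-to-average ratio of the first group exceeds that of the second group by more than a first threshold, the sum of intra-group sub-band normalization factors of the first group is adjusted according to a first weighting factor, and that of the second group is adjusted according to a second weighting factor. The converse also applies: if the peak-to-average ratio of the second group exceeds that of the first group by more than a second threshold, the sum of intra-group sub-band normalization factors of the second group is adjusted according to the first weighting factor, and that of the first group is adjusted according to the second weighting factor.
For example, if group_sharp[i] - group_sharp[i-1] > a, then group_wnorm_w[i] = b * group_wnorm[i]. Alternatively, if group_sharp[i-1] - group_sharp[i] > c, then group_wnorm_w[i-1] = b * group_wnorm[i-1]. Here the group index i = 1 ... P-1, where P is the total number of sub-bands, b is the weight, a is the first threshold, and c is the second threshold. It should be understood that a, b and c may be chosen according to the bit allocation requirements.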
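The example rule with thresholds a and c and weight b can be sketched as below. The function name is hypothetical. The sketch follows the single-weight example (only the group with the sharper spectrum is scaled by b); the first and second weighting factors mentioned in the general description are not modelled, and weights are always applied to the unweighted sums, which is an assumption.

```python
def weight_group_sums(group_wnorm, group_sharp, a, b, c):
    """Return group_wnorm_w, the weighted per-group sums.

    For each pair of neighbouring groups (i-1, i):
      - if group i is sharper than group i-1 by more than a, scale group i by b;
      - if group i-1 is sharper than group i by more than c, scale group i-1 by b.
    Groups that are never selected keep their unweighted sum.
    """
    weighted = list(group_wnorm)
    for i in range(1, len(group_wnorm)):
        if group_sharp[i] - group_sharp[i - 1] > a:
            weighted[i] = b * group_wnorm[i]
        elif group_sharp[i - 1] - group_sharp[i] > c:
            weighted[i - 1] = b * group_wnorm[i - 1]
    return weighted
```

With sums `[10, 20, 30]`, ratios `[1.0, 3.0, 1.5]`, a = c = 1.0 and b = 1.5, only the middle group is sharper than both neighbours, so its sum becomes 30 while the others are unchanged.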
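Only a simple weighting method is illustrated here. A person skilled in the art can readily conceive of other weighting methods that adjust the weights of sub-bands through different weighting coefficients. For example, the weight of sub-bands that need more signal bits may be increased, and the weight of sub-bands that need few or no signal bits may be reduced.
Next, the bits of the audio signal are allocated to each group according to the weighted sum of intra-group sub-band normalization factors. For example, the number of group bits of a group is determined from the ratio of the weighted sum of intra-group sub-band normalization factors group_wnorm[i] to the sum of the sub-band normalization factors of all sub-bands sum_wnorm, and the bits of the audio signal are allocated to the group according to that number. The total number of bits of each group, group_bits, is determined by group_bits[i] = sum_bits * group_wnorm[i] / sum_wnorm, where sum_bits is the total number of bits of the audio signal to be allocated and sum_wnorm is the sum of the sub-band normalization factors of all sub-bands.
After the bits are distributed among the groups, the coding bits of each group can be further distributed among the sub-bands in the group.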
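A minimal sketch of the proportional group allocation follows. Two details are assumptions rather than part of the text: the shares are normalized by the sum of the (weighted) group sums so that they add up to the bit budget, and the bits left over after rounding down are given to the group with the largest sum.

```python
def allocate_group_bits(group_wnorm_w, sum_bits):
    """Split sum_bits over the groups in proportion to their (weighted) sums."""
    total = sum(group_wnorm_w)
    if total <= 0:
        return [0] * len(group_wnorm_w)
    # group_bits[i] = sum_bits * group_wnorm[i] / sum_wnorm, rounded down
    bits = [int(sum_bits * g / total) for g in group_wnorm_w]
    # Assumption: rounding leftovers go to the group with the largest sum,
    # so that the group totals add up to sum_bits exactly.
    leftover = sum_bits - sum(bits)
    bits[group_wnorm_w.index(max(group_wnorm_w))] += leftover
    return bits
```

A group whose sum is zero receives no bits, which matches the remark in the text that some groups may be left without bits when the bit rate is insufficient.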
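104. Allocate the coding bits allocated to the at least one group to each sub-band of each of the at least one group according to the sub-band normalization factor of each sub-band of each of the at least one group.
It should be understood that the existing iterative allocation method may be used to allocate bits to the sub-bands in a group. However, the iterative method still makes the allocation within the group random and discontinuous between frames. Therefore, the bits of the audio signal allocated to a group may be allocated to the sub-bands in the group according to the sub-band normalization factors in the group, taking into account the signal characteristics of different audio signals, that is, different signal types.
In addition, in the embodiments of the present invention, when the bit rate is insufficient, spreading the limited bits over all sub-bands in a group degrades the allocation. Therefore, the number band_num of sub-bands in the group that are to receive bits may be determined first. Then, according to the type of the audio signal and the sub-band normalization factors in the group, the bits of the audio signal allocated to the group are allocated to the sub-bands selected for bit allocation, where the number of such sub-bands equals band_num.
Here, the number of sub-bands for initial bit allocation in each group may be determined according to the number of group bits and a third threshold, where the third threshold represents the minimum number of bits used to quantize one normalized spectral coefficient. For example, if a group is allocated 13 bits and the third threshold is 7 bits, the number of sub-bands for initial bit allocation in the group is 2. Then band_num is determined according to the number of sub-bands for initial bit allocation in the group and the total number of sub-bands in the group.
For example, if the number of sub-bands for initial bit allocation in the group is greater than the total number of sub-bands in the group multiplied by a scale factor k, band_num is set to the total number of sub-bands in the group; otherwise band_num is set to the number of sub-bands for initial bit allocation. The scale factor k is an empirical factor and may be 0.75 or another value. The process may also be simplified so that band_num is the smaller of the number of sub-bands for initial bit allocation in the group and the total number of sub-bands in the group.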
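Both ways of choosing band_num fit in one small function. The function and parameter names are hypothetical. The text gives only one worked value (13 bits with a 7-bit threshold gives 2 sub-bands), which is consistent with rounding up; using `math.ceil` is therefore an assumption.

```python
import math

def band_num(group_bits, min_bits, total_bands, k=None):
    """Number of sub-bands in a group that will receive bits.

    group_bits  : bits allocated to the group
    min_bits    : third threshold (minimum bits per normalized coefficient)
    total_bands : total number of sub-bands in the group
    k           : optional scale factor; None selects the simplified min() rule
    """
    # Assumption: round up, which reproduces the 13 / 7 -> 2 example.
    initial = math.ceil(group_bits / min_bits)
    if k is None:
        return min(initial, total_bands)
    return total_bands if initial > k * total_bands else initial
```

With k = 0.75 and 8 sub-bands in the group, a budget of 50 bits gives an initial count of 8, which exceeds 6, so all 8 sub-bands are used. A budget of 20 bits gives an initial count of 3, so only 3 sub-bands are used.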
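Next, bits may be allocated to the band_num sub-bands according to the type of the audio signal of the group and the sub-band normalization factors in the group. The type of the audio signal of the group may be determined from the peak-to-average ratio group_sharp of the intra-group sub-band normalization factors of the group. If group_sharp indicates that the audio signal is a normal signal, the existing iterative allocation method may be used to allocate bits for the group. If the audio signal of the group is determined to be a harmonic signal, either the existing iterative allocation method or method a or method b below may be used.
Method a:
Step 1: Sort the sub-band normalization factors of all sub-bands in the group in descending order and select the first N sub-bands, where N is the number band_num of sub-bands in the group for bit allocation.
Step 2: Initialize the number of bits of the N sub-bands to 1, and initialize the loop count j to 0.
Step 3: Determine band_wnorm, the sum of the sub-band normalization factors of those of the N sub-bands whose sub-band normalization factor is greater than zero.
Step 4: Allocate bits to those of the N sub-bands whose sub-band normalization factor is greater than zero.
Step 5: Determine whether the number of bits allocated to the last of the N sub-bands is less than a fixed threshold fac; if so, set the number of bits allocated to that sub-band to zero.
Step 6: Increment the loop count j by 1.
Repeat steps 3 to 6 until the loop count j equals N.
Step 7: Restore the original order of all sub-bands in the group, that is, the order of all sub-bands before the sub-band normalization factor of each sub-band was quantized.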
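Method a leaves several details open, so the following Python sketch fills them with explicit assumptions: step 4 is taken to be proportional to wnorm (bits = group_bits * wnorm / band_wnorm), the "last" sub-band is taken to be the lowest-ranked sub-band that is still active, and a sub-band whose bits are set to zero is excluded from later rounds by zeroing its working factor. None of these choices is confirmed by the text.

```python
def allocate_method_a(wnorm, group_bits, n, fac):
    """Sketch of method a; returns fractional bits per sub-band in original order."""
    # Step 1: rank sub-bands by wnorm (largest first) and keep the top n.
    order = sorted(range(len(wnorm)), key=lambda i: wnorm[i], reverse=True)[:n]
    w = [float(wnorm[i]) for i in order]
    # Step 2: start every selected sub-band at 1 bit.
    # (A selected sub-band whose wnorm is already 0 keeps this initial value.)
    bits = [1.0] * len(order)
    for _ in range(len(order)):                    # loop count j = 0 .. N-1
        active = [p for p in range(len(order)) if w[p] > 0]
        if not active:
            break
        band_wnorm = sum(w[p] for p in active)     # step 3
        for p in active:                           # step 4 (assumed proportional)
            bits[p] = group_bits * w[p] / band_wnorm
        last = active[-1]                          # step 5: weakest active sub-band
        if bits[last] < fac:
            bits[last] = 0.0
            w[last] = 0.0                          # assumption: drop it from later rounds
    # Step 7: put the results back in the original sub-band order.
    result = [0.0] * len(wnorm)
    for p, i in enumerate(order):
        result[i] = bits[p]
    return result
```

With factors `[5, 1, 10, 4]`, 20 bits, N = 3 and fac = 5, the weakest selected sub-band first receives about 4.2 bits, falls below fac and is dropped, and the 20 bits are then shared by the two strongest sub-bands.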
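Method b:
Step 1: Sort the sub-band normalization factors of all sub-bands in the group in descending order and select the first N sub-bands, where N is the number band_num of sub-bands in the group for bit allocation.
Step 2: Initialize the number of bits of the N sub-bands to 1, initialize the loop count j to 0, and initialize the number of allocated bits bit_sum to 0.
Step 3: Determine band_wnorm, the sum of the sub-band normalization factors of those of the N sub-bands whose sub-band normalization factor is greater than zero.
Step 4: Allocate bits to those of the N sub-bands whose sub-band normalization factor is greater than zero.
Step 5: For each of the N sub-bands, determine whether the number of bits allocated to it is less than a fixed threshold fac; if so, set the number of bits allocated to that sub-band to zero.
Step 6: Compute temp_sum, the sum of the numbers of bits allocated to all N sub-bands.
Step 7: Increment the loop count j by 1.
Step 8: Determine whether temp_sum equals bit_sum. If so, go to step 10; otherwise continue with step 9.
Step 9: Update bit_sum by assigning it the value of temp_sum.
Repeat steps 3 to 9 until the loop count j equals N.
Step 10: Restore the original order of all sub-bands in the group.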
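Method b differs from method a in that every selected sub-band is checked against fac and the loop stops early once the total stops changing. The sketch below makes the same assumptions as a proportional reading of step 4 would require, and additionally rounds each share down to an integer so that the equality test in step 8 is exact. These choices, and the function name, are illustrative only.

```python
def allocate_method_b(wnorm, group_bits, n, fac):
    """Sketch of method b; returns integer bits per sub-band in original order."""
    # Step 1: rank sub-bands by wnorm (largest first) and keep the top n.
    order = sorted(range(len(wnorm)), key=lambda i: wnorm[i], reverse=True)[:n]
    w = [float(wnorm[i]) for i in order]
    bits = [1] * len(order)          # step 2: 1 bit each
    bit_sum = 0                      # step 2: nothing counted as allocated yet
    for _ in range(len(order)):      # loop count j = 0 .. N-1
        band_wnorm = sum(x for x in w if x > 0)          # step 3
        if band_wnorm <= 0:
            break
        for p, x in enumerate(w):                        # step 4 (assumed proportional)
            if x > 0:
                bits[p] = int(group_bits * x / band_wnorm)
        for p in range(len(order)):                      # step 5: check every sub-band
            if bits[p] < fac:
                bits[p] = 0
                w[p] = 0.0           # assumption: drop it from later rounds
        temp_sum = sum(bits)                             # step 6
        if temp_sum == bit_sum:                          # step 8: allocation is stable
            break
        bit_sum = temp_sum                               # step 9
    result = [0] * len(wnorm)                            # step 10: original order
    for p, i in enumerate(order):
        result[i] = bits[p]
    return result
```

For factors `[5, 1, 10, 4]`, 20 bits, N = 3 and fac = 5, the first round gives 10, 5 and 4 bits; the sub-band with 4 bits is dropped, the second round gives 13 and 6 bits, and the third round reproduces the same total, so the loop stops.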
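It should be understood that methods other than method a or method b may be used for bit allocation within a group. Methods a and b may also be combined with the method for determining band_num, that is, the allocation within a group may take different audio signal characteristics into account. For example, if the number of sub-bands for initial bit allocation in the group is greater than the total number of sub-bands in the group multiplied by the scale factor k, method a is used; if it is less than or equal to that product, method b is used.
In summary, bits are allocated to the sub-bands in a group as follows: first, the N sub-bands with the largest sub-band normalization factors are selected from all sub-bands in the group as the sub-bands to be allocated, where N is the number band_num of sub-bands in the group for bit allocation; then bits are allocated to the N sub-bands in turn according to their sub-band normalization factors; finally, the original order of all sub-bands in the group is restored.
Taking the signal characteristics of different audio signals into account allows bits to be allocated effectively to the bands that matter for auditory perception. For strongly harmonic bands, bits should be concentrated on the bands that contain harmonics; for signals whose spectral energy is relatively even, bits should be allocated more evenly.
Following the grouping approach above, groups may be subdivided further: the sub-bands in a group are divided into a plurality of sub-groups and a sub-group parameter of each sub-group is obtained; the bits allocated to the group are then allocated to each sub-group according to its sub-group parameter; finally, the bits of the audio signal allocated to each sub-group are allocated to each sub-band in the sub-group according to the sub-band normalization factors. One possibility is to keep refining until each group contains only one band.
The grouping in the embodiments of the present invention keeps the allocation relatively stable across consecutive frames, and allocating bits within a group with different emphasis according to signal characteristics ensures that the allocated bits are used to quantize important spectral information, which improves the coding quality of the audio signal.
It can be seen that the method for bit allocation of an audio signal according to the embodiments of the present invention uses grouping to keep the allocation relatively stable across consecutive frames and to reduce the influence of the global allocation on local discontinuities. In addition, different threshold parameters may be set for the bit allocation in each group, so that bits are allocated more adaptively, and bits are allocated within a group with different emphasis according to the characteristics of the spectral signal. For example, for harmonic-like signals with a concentrated spectrum, bits are concentrated on the high-energy sub-bands and the sub-bands between harmonics need no additional bits; for signals with a flatter spectrum, the allocation keeps the sub-bands as smooth as possible. In this way the allocated bits are used to quantize the important spectral information.
The schematic structure of an apparatus for bit allocation of an audio signal according to an embodiment of the present invention is described below with reference to FIG. 2.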
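In FIG. 2, the apparatus 20 for bit allocation of an audio signal includes a band-division and quantization unit 21, a grouping unit 22, a first allocating unit 23, and a second allocating unit 24.
The band-division and quantization unit 21 is configured to divide the frequency band of the audio signal into a plurality of sub-bands and quantize the sub-band normalization factor of each sub-band.
The grouping unit 22 is configured to divide the plurality of sub-bands into a plurality of groups, where one of the plurality of groups includes one or more sub-bands, and obtain a group parameter of each group, where the group parameter characterizes the signal characteristics and energy attributes of the audio signal of the corresponding group.
The first allocating unit 23 is configured to allocate coding bits to at least one group according to the group parameter of each group, where the sum of the numbers of coding bits allocated to the at least one group is the number of coding bits of the audio signal.
Optionally, the grouping unit 22 may be configured to place sub-bands with the same bandwidth in one group, so that the plurality of sub-bands are divided into a plurality of groups. Alternatively, the grouping unit 22 may be configured to place sub-bands with close sub-band normalization factors in one group, so that the plurality of sub-bands are divided into a plurality of groups. In general, the sub-bands in each group may be adjacent.
Optionally, the grouping unit 22 is configured to obtain, for each group, the sum of intra-group sub-band normalization factors and the peak-to-average ratio of intra-group sub-band normalization factors, where the sum is the sum of the sub-band normalization factors of all sub-bands in the group, the peak-to-average ratio is the ratio of the peak of the intra-group sub-band normalization factors to their average, the peak is the maximum of the sub-band normalization factors of all sub-bands in the group, and the average is the mean of the sub-band normalization factors of all sub-bands in the group.
Further, the grouping unit 22 may weight the sum of intra-group sub-band normalization factors of each group according to the peak-to-average ratio of that group, to obtain a weighted sum of intra-group sub-band normalization factors for each group.
Optionally, the grouping unit 22 may be configured to compare the peak-to-average ratio of the intra-group sub-band normalization factors of a first group with that of a second group; if the peak-to-average ratio of the first group exceeds that of the second group by more than a first threshold, the sum of intra-group sub-band normalization factors of the first group is adjusted according to a first weighting factor, and that of the second group is adjusted according to a second weighting factor.
Optionally, the first allocating unit 23 may be configured to allocate coding bits to at least one group according to the sum of intra-group sub-band normalization factors of each group, where the sum of the numbers of coding bits allocated to the at least one group is the number of coding bits of the audio signal. Alternatively, the first allocating unit 23 may be configured to allocate coding bits to at least one group according to the weighted sum of intra-group sub-band normalization factors, where the sum of the numbers of coding bits allocated to the at least one group is the number of coding bits of the audio signal. Alternatively, the first allocating unit 23 may be configured to determine the number of group bits of a group according to the ratio of the weighted sum of intra-group sub-band normalization factors of the group to the sum of the sub-band normalization factors of all sub-bands, and allocate the coding bits of the audio signal to the group according to that number of group bits.
The second allocating unit 24 is configured to allocate the coding bits allocated to the at least one group to each sub-band of each of the at least one group according to the sub-band normalization factor of each sub-band of each of the at least one group.
Further, the second allocating unit 24 may include a determining module 241 and an allocating module 242. The determining module 241 is configured to determine the number band_num of sub-bands in the group that are to receive bits. The allocating module 242 is configured to allocate the bits of the audio signal allocated to the group to the sub-bands selected for bit allocation according to the sub-band normalization factors in the group, where the number of sub-bands selected for bit allocation equals band_num.
Optionally, the determining module 241 may be configured to determine the number of sub-bands for initial bit allocation in the group according to the number of group bits and a third threshold, where the third threshold represents the minimum number of bits used to quantize one normalized spectral coefficient, and to take the smaller of the number of sub-bands for initial bit allocation and the total number of sub-bands in the group as band_num.
Alternatively, the determining module 241 may be configured to determine the number of sub-bands for initial bit allocation in the group according to the number of group bits and the third threshold, where the third threshold represents the minimum number of bits used to quantize one normalized spectral coefficient; compare the number of sub-bands for initial bit allocation with the product of the total number of sub-bands in the group and a scale factor k, where k is used to adjust the total number of sub-bands in the group; and, if the number of sub-bands for initial bit allocation is less than the product, determine that the number of sub-bands for bit allocation in the group is the number of sub-bands for initial bit allocation, or otherwise determine that it is the total number of sub-bands in the group.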
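Optionally, the allocating module 242 may be configured to select, from all sub-bands in the group, the N sub-bands with the largest sub-band normalization factors as the sub-bands to be allocated, where N is the number of sub-bands in the group for bit allocation; allocate bits to the N sub-bands in turn according to their sub-band normalization factors; and restore the original order of all sub-bands in the group.
For example, the allocating module 242 performs the following steps:
Step 1: Sort the sub-band normalization factors of all sub-bands in the group in descending order and select the first N sub-bands, where N is the number band_num of sub-bands in the group for bit allocation.
Step 2: Initialize the number of bits of the N sub-bands to 1, and initialize the loop count j to 0.
Step 3: Determine band_wnorm, the sum of the sub-band normalization factors of those of the N sub-bands whose sub-band normalization factor is greater than zero.
Step 4: Allocate bits to those of the N sub-bands whose sub-band normalization factor is greater than zero.
Step 5: Determine whether the number of bits allocated to the last of the N sub-bands is less than a fixed threshold fac; if so, set the number of bits allocated to that sub-band to zero.
Step 6: Increment the loop count j by 1.
Repeat steps 3 to 6 until the loop count j equals N.
Step 7: Restore the original order of all sub-bands in the group.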
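Optionally, the allocating module 242 may be configured to perform the following steps:
Step 1: Sort the sub-band normalization factors of all sub-bands in the group in descending order and select the first N sub-bands, where N is the number band_num of sub-bands in the group for bit allocation.
Step 2: Initialize the number of bits of the N sub-bands to 1, initialize the loop count j to 0, and initialize the number of allocated bits bit_sum to 0.
Step 3: Determine band_wnorm, the sum of the sub-band normalization factors of those of the N sub-bands whose sub-band normalization factor is greater than zero.
Step 4: Allocate bits to those of the N sub-bands whose sub-band normalization factor is greater than zero.
Step 5: For each of the N sub-bands, determine whether the number of bits allocated to it is less than a fixed threshold fac; if so, set the number of bits allocated to that sub-band to zero.
Step 6: Compute temp_sum, the sum of the numbers of bits allocated to all N sub-bands.
Step 7: Increment the loop count j by 1.
Step 8: Determine whether temp_sum equals bit_sum. If so, go to step 10; otherwise continue with step 9.
Step 9: Update bit_sum by assigning it the value of temp_sum.
Repeat steps 3 to 9 until the loop count j equals N.
Step 10: Restore the original order of all sub-bands in the group.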
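In addition, in the apparatus for bit allocation of an audio signal according to the embodiment shown in FIG. 2, the first allocating unit 23 may further divide the sub-bands in a group into a plurality of sub-groups and obtain a sub-group parameter of each sub-group; the bits allocated to the group are then allocated to each sub-group according to its sub-group parameter. The second allocating unit 24 is then configured to allocate the bits of the audio signal allocated to each sub-group to each sub-band in that sub-group according to the sub-band normalization factors.
It can be seen that the apparatus for bit allocation of an audio signal according to the embodiments of the present invention uses grouping to keep the allocation relatively stable across consecutive frames and to reduce the influence of the global allocation on local discontinuities. In addition, different threshold parameters may be set for the bit allocation in each group, so that bits are allocated more adaptively, and bits are allocated within a group with different emphasis according to the characteristics of the spectral signal. For example, for harmonic-like signals with a concentrated spectrum, bits are concentrated on the high-energy sub-bands and the sub-bands between harmonics need no additional bits; for signals with a flatter spectrum, the allocation keeps the sub-bands as smooth as possible. In this way the allocated bits are used to quantize the important spectral information.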
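A person of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the particular application and the design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each particular application, but such an implementation should not be considered to go beyond the scope of the present invention.
A person skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above may be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part contributing to the prior art, or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.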
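FIG. 3 is a schematic block diagram of another embodiment of an apparatus 30 for bit allocation of an audio signal according to the present invention. The apparatus 30 includes a processor 31, a memory 32, an input device 33, an output device 34, and the like, which communicate with each other via a bus. The processor 31 calls the program stored in the memory 32 and can perform the steps of the foregoing embodiments of the method for bit allocation of an audio signal.
The processor 31 is configured to execute the program of the embodiments of the present invention stored in the memory 32 and to communicate bidirectionally with other apparatuses via the bus.
The memory 32 may include RAM and ROM, or any fixed or removable storage medium, and holds the data to be processed.
The memory 32 and the processor 31 may also be integrated into a physical module to which the embodiments of the present invention are applied, on which the program implementing the embodiments of the present invention is stored and run.
The input device 33 may include any suitable device, such as a keyboard or a mouse, for receiving input from a user or from other devices and sending it to the processor 31.
The output device 34 is configured to output the result of the bit allocation of the audio signal and may be a display, a printer, or the like.
The foregoing describes only specific implementations of the present invention, but the protection scope of the present invention is not limited thereto. Any variation or replacement readily conceived by a person skilled in the art within the technical scope disclosed in the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.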

Claims

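1. A method for bit allocation of an audio signal, comprising:
dividing the frequency band of an audio signal into a plurality of sub-bands, and quantizing a sub-band normalization factor of each sub-band;
dividing the plurality of sub-bands into a plurality of groups, wherein one of the plurality of groups comprises one or more sub-bands, and obtaining a group parameter of each group, wherein the group parameter characterizes the signal characteristics and energy attributes of the audio signal of the corresponding group;
allocating coding bits to at least one group according to the group parameter of each group, wherein the sum of the numbers of coding bits allocated to the at least one group is the number of coding bits of the audio signal; and
allocating the coding bits allocated to the at least one group to each sub-band of each of the at least one group according to the sub-band normalization factor of each sub-band of each of the at least one group.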
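2. The method according to claim 1, wherein
the obtaining a group parameter of each group comprises:
obtaining, for each group, a sum of intra-group sub-band normalization factors and a peak-to-average ratio of intra-group sub-band normalization factors, wherein the sum of intra-group sub-band normalization factors is the sum of the sub-band normalization factors of all sub-bands in the group, the peak-to-average ratio of intra-group sub-band normalization factors is the ratio of the peak of the intra-group sub-band normalization factors to the average of the intra-group sub-band normalization factors, the peak is the maximum of the sub-band normalization factors of all sub-bands in the group, and the average is the mean of the sub-band normalization factors of all sub-bands in the group; and
the allocating coding bits to at least one group according to the group parameter of each group, wherein the sum of the numbers of coding bits allocated to the at least one group is the number of coding bits of the audio signal, comprises:
allocating coding bits to at least one group according to the sum of intra-group sub-band normalization factors of each group, wherein the sum of the numbers of coding bits allocated to the at least one group is the number of coding bits of the audio signal.
3. The method according to claim 1, wherein
the obtaining a group parameter of each group comprises:
obtaining, for each group, a sum of intra-group sub-band normalization factors and a peak-to-average ratio of intra-group sub-band normalization factors, wherein the sum of intra-group sub-band normalization factors is the sum of the sub-band normalization factors of all sub-bands in the group, the peak-to-average ratio of intra-group sub-band normalization factors is the ratio of the peak of the intra-group sub-band normalization factors to the average of the intra-group sub-band normalization factors, the peak is the maximum of the sub-band normalization factors of all sub-bands in the group, and the average is the mean of the sub-band normalization factors of all sub-bands in the group; and
weighting the sum of intra-group sub-band normalization factors of each group according to the peak-to-average ratio of intra-group sub-band normalization factors of that group, to obtain a weighted sum of intra-group sub-band normalization factors of each group; and
the allocating coding bits to at least one group according to the group parameter of each group, wherein the sum of the numbers of coding bits allocated to the at least one group is the number of coding bits of the audio signal, comprises:
allocating coding bits to at least one group according to the weighted sum of intra-group sub-band normalization factors of each group, wherein the sum of the numbers of coding bits allocated to the at least one group is the number of coding bits of the audio signal.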
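4. The method according to claim 3, wherein the weighting the sum of intra-group sub-band normalization factors of each group according to the peak-to-average ratio of intra-group sub-band normalization factors of that group comprises:
comparing the peak-to-average ratio of intra-group sub-band normalization factors of a first group with the peak-to-average ratio of intra-group sub-band normalization factors of a second group; and
if the peak-to-average ratio of the first group exceeds the peak-to-average ratio of the second group by more than a first threshold, adjusting the sum of intra-group sub-band normalization factors of the first group according to a first weighting factor, and adjusting the sum of intra-group sub-band normalization factors of the second group according to a second weighting factor.
5. The method according to claim 3 or 4, wherein the allocating coding bits to at least one group according to the weighted sum of intra-group sub-band normalization factors of each group, wherein the sum of the numbers of coding bits allocated to the at least one group is the number of coding bits of the audio signal, comprises:
determining the number of group bits of the group according to the ratio of the weighted sum of intra-group sub-band normalization factors of each group to the sum of the sub-band normalization factors of all sub-bands, and allocating the bits of the audio signal to the group according to the number of group bits.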
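6. The method according to any one of claims 1 to 5, wherein the allocating the coding bits allocated to the at least one group to each sub-band of each of the at least one group according to the sub-band normalization factor of each sub-band of each of the at least one group comprises:
determining the number of sub-bands in the group for bit allocation; and
allocating the coding bits of the audio signal allocated to the group to the sub-bands in the group for bit allocation according to the sub-band normalization factors in the group, wherein the number of sub-bands in the group that receive bits equals the number of sub-bands in the group for bit allocation.
7. The method according to claim 6, wherein the determining the number of sub-bands in the group for bit allocation comprises:
determining the number of sub-bands for initial bit allocation in the group according to the number of group bits and a third threshold, wherein the third threshold represents the minimum number of bits used to quantize one normalized spectral coefficient; and
determining the number of sub-bands in the group for bit allocation according to the number of sub-bands for initial bit allocation in the group and the total number of sub-bands in the group.
8. The method according to claim 7, wherein the determining the number of sub-bands in the group for bit allocation according to the number of sub-bands for initial bit allocation in the group and the total number of sub-bands in the group comprises:
determining the smaller of the number of sub-bands for initial bit allocation in the group and the total number of sub-bands in the group as the number of sub-bands in the group for bit allocation.
9. The method according to claim 7, wherein the determining the number of sub-bands in the group for bit allocation according to the number of sub-bands for initial bit allocation in the group and the total number of sub-bands in the group comprises:
comparing the number of sub-bands for initial bit allocation in the group with the product of the total number of sub-bands in the group and a scale factor k, wherein the scale factor k is used to adjust the total number of sub-bands in the group; and
if the number of sub-bands for initial bit allocation in the group is less than the product of the total number of sub-bands in the group and the scale factor k, determining that the number of sub-bands in the group for bit allocation is the number of sub-bands for initial bit allocation in the group; otherwise, determining that the number of sub-bands in the group for bit allocation is the total number of sub-bands in the group.
10. The method according to any one of claims 6 to 9, wherein the allocating the coding bits of the audio signal allocated to the group to the sub-bands in the group for bit allocation according to the sub-band normalization factors in the group comprises:
selecting, from all sub-bands in the group, the N sub-bands with the largest sub-band normalization factors as the sub-bands to be allocated, wherein N is the number of sub-bands in the group for bit allocation; and
allocating bits to the N sub-bands in turn according to the sub-band normalization factors of the N sub-bands.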
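11. The method according to claim 1, wherein the allocating coding bits to at least one group according to the group parameter of each group, wherein the sum of the numbers of coding bits allocated to the at least one group is the number of coding bits of the audio signal, comprises:
dividing the sub-bands in the group into a plurality of sub-groups, and obtaining a sub-group parameter of each sub-group; and
allocating the bits allocated to the group to each sub-group according to the sub-group parameter of each sub-group; and
the allocating the coding bits allocated to the at least one group to each sub-band of each of the at least one group according to the sub-band normalization factor of each sub-band of each of the at least one group comprises:
allocating the bits of the audio signal allocated to each sub-group to each sub-band in each sub-group according to the sub-band normalization factors.
12. The method according to any one of claims 1 to 11, wherein the dividing the plurality of sub-bands into a plurality of groups comprises:
placing sub-bands with the same bandwidth in one group, so that the plurality of sub-bands are divided into a plurality of groups; or
placing sub-bands with close sub-band normalization factors in one group, so that the plurality of sub-bands are divided into a plurality of groups.
13. The method according to claim 12, wherein the sub-bands in each of the plurality of groups are adjacent.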
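14. An apparatus for bit allocation of an audio signal, comprising:
a band-division and quantization unit, configured to divide the frequency band of an audio signal into a plurality of sub-bands and quantize a sub-band normalization factor of each sub-band;
a grouping unit, configured to divide the plurality of sub-bands into a plurality of groups, wherein one of the plurality of groups comprises one or more sub-bands, and obtain a group parameter of each group, wherein the group parameter characterizes the signal characteristics and energy attributes of the audio signal of the corresponding group;
a first allocating unit, configured to allocate coding bits to at least one group according to the group parameter of each group, wherein the sum of the numbers of coding bits allocated to the at least one group is the number of coding bits of the audio signal; and
a second allocating unit, configured to allocate the coding bits allocated to the at least one group to each sub-band of each of the at least one group according to the sub-band normalization factor of each sub-band of each of the at least one group.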
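15. The apparatus according to claim 14, wherein
the grouping unit is configured to:
obtain, for each group, a sum of intra-group sub-band normalization factors and a peak-to-average ratio of intra-group sub-band normalization factors, wherein the sum of intra-group sub-band normalization factors is the sum of the sub-band normalization factors of all sub-bands in the group, the peak-to-average ratio of intra-group sub-band normalization factors is the ratio of the peak of the intra-group sub-band normalization factors to the average of the intra-group sub-band normalization factors, the peak is the maximum of the sub-band normalization factors of all sub-bands in the group, and the average is the mean of the sub-band normalization factors of all sub-bands in the group; and
the first allocating unit is configured to:
allocate coding bits to at least one group according to the sum of intra-group sub-band normalization factors of each group, wherein the sum of the numbers of coding bits allocated to the at least one group is the number of coding bits of the audio signal.
16. The apparatus according to claim 14, wherein
the grouping unit is configured to:
obtain, for each group, a sum of intra-group sub-band normalization factors and a peak-to-average ratio of intra-group sub-band normalization factors, wherein the sum of intra-group sub-band normalization factors is the sum of the sub-band normalization factors of all sub-bands in the group, the peak-to-average ratio of intra-group sub-band normalization factors is the ratio of the peak of the intra-group sub-band normalization factors to the average of the intra-group sub-band normalization factors, the peak is the maximum of the sub-band normalization factors of all sub-bands in the group, and the average is the mean of the sub-band normalization factors of all sub-bands in the group; and
weight the sum of intra-group sub-band normalization factors of each group according to the peak-to-average ratio of intra-group sub-band normalization factors of that group, to obtain a weighted sum of intra-group sub-band normalization factors of each group; and
the first allocating unit is configured to:
allocate coding bits to at least one group according to the weighted sum of intra-group sub-band normalization factors of each group, wherein the sum of the numbers of coding bits allocated to the at least one group is the number of coding bits of the audio signal.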
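17. The apparatus according to claim 16, wherein the grouping unit is specifically configured to:
compare the peak-to-average ratio of intra-group sub-band normalization factors of a first group with the peak-to-average ratio of intra-group sub-band normalization factors of a second group; and
if the peak-to-average ratio of the first group exceeds the peak-to-average ratio of the second group by more than a first threshold, adjust the sum of intra-group sub-band normalization factors of the first group according to a first weighting factor, and adjust the sum of intra-group sub-band normalization factors of the second group according to a second weighting factor.
18. The apparatus according to claim 17, wherein the first allocating unit is specifically configured to:
determine the number of group bits of the group according to the ratio of the weighted sum of intra-group sub-band normalization factors of each group to the sum of the sub-band normalization factors of all sub-bands, and allocate the bits of the audio signal to the group according to the number of group bits.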
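19. The apparatus according to any one of claims 14 to 18, wherein the second allocating unit comprises:
a determining module, configured to determine the number of sub-bands in the group for bit allocation; and
an allocating module, configured to allocate the coding bits of the audio signal allocated to the group to the sub-bands in the group for bit allocation according to the sub-band normalization factors in the group, wherein the number of sub-bands in the group that receive bits equals the number of sub-bands in the group for bit allocation.
20. The apparatus according to claim 19, wherein the determining module is specifically configured to:
determine the number of sub-bands for initial bit allocation in the group according to the number of group bits and a third threshold, wherein the third threshold represents the minimum number of bits used to quantize one normalized spectral coefficient; and
determine the smaller of the number of sub-bands for initial bit allocation in the group and the total number of sub-bands in the group as the number of sub-bands in the group for bit allocation.
21. The apparatus according to claim 19, wherein the determining module is specifically configured to:
determine the number of sub-bands for initial bit allocation in the group according to the number of group bits and a third threshold, wherein the third threshold represents the minimum number of bits used to quantize one normalized spectral coefficient;
compare the number of sub-bands for initial bit allocation in the group with the product of the total number of sub-bands in the group and a scale factor k, wherein the scale factor k is used to adjust the total number of sub-bands in the group; and
if the number of sub-bands for initial bit allocation in the group is less than the product of the total number of sub-bands in the group and the scale factor k, determine that the number of sub-bands in the group for bit allocation is the number of sub-bands for initial bit allocation in the group; otherwise, determine that the number of sub-bands in the group for bit allocation is the total number of sub-bands in the group.
22. The apparatus according to any one of claims 19 to 21, wherein the allocating module is specifically configured to:
select, from all sub-bands in the group, the N sub-bands with the largest sub-band normalization factors as the sub-bands to be allocated, wherein N is the number of sub-bands in the group for bit allocation; and
allocate bits to the N sub-bands in turn according to the sub-band normalization factors of the N sub-bands.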
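23. The apparatus according to claim 14, wherein the first allocating unit is specifically configured to:
divide the sub-bands in the group into a plurality of sub-groups, and obtain a sub-group parameter of each sub-group; and
allocate the bits allocated to the group to each sub-group according to the sub-group parameter of each sub-group; and
the second allocating unit is specifically configured to:
allocate the bits of the audio signal allocated to each sub-group to each sub-band in each sub-group according to the sub-band normalization factors.
24. The apparatus according to any one of claims 14 to 23, wherein the grouping unit is specifically configured to:
place sub-bands with the same bandwidth in one group, so that the plurality of sub-bands are divided into a plurality of groups; or
place sub-bands with close sub-band normalization factors in one group, so that the plurality of sub-bands are divided into a plurality of groups.
25. The apparatus according to claim 24, wherein the sub-bands in each of the plurality of groups are adjacent.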
PCT/CN2013/076393 2012-07-13 2013-05-29 Method and apparatus for bit allocation of an audio signal WO2014008786A1 (zh)

Priority Applications (5)

Application Number Priority Date Filing Date Title
KR1020157003447A KR101661868B1 (ko) 2012-07-13 2013-05-29 Bit allocation method and apparatus for audio signal
JP2015520801A JP6092383B2 (ja) 2012-07-13 2013-05-29 Method and apparatus for allocating bits in audio signal
KR1020167026037A KR101736705B1 (ko) 2012-07-13 2013-05-29 Bit allocation method and apparatus for audio signal
EP13816528.7A EP2863388B1 (en) 2012-07-13 2013-05-29 Bit allocation method and device for audio signal
US14/595,672 US9424850B2 (en) 2012-07-13 2015-01-13 Method and apparatus for allocating bit in audio signal

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201210243316.4 2012-07-13
CN201210243316.4A CN103544957B (zh) 2012-07-13 2012-07-13 Method and apparatus for bit allocation of an audio signal

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US14/595,672 Continuation US9424850B2 (en) 2012-07-13 2015-01-13 Method and apparatus for allocating bit in audio signal

Publications (1)

Publication Number Publication Date
WO2014008786A1 true WO2014008786A1 (zh) 2014-01-16

Family

ID=49915373

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2013/076393 WO2014008786A1 (zh) 2012-07-13 2013-05-29 Method and apparatus for bit allocation of an audio signal

Country Status (6)

Country Link
US (1) US9424850B2 (zh)
EP (1) EP2863388B1 (zh)
JP (2) JP6092383B2 (zh)
KR (2) KR101661868B1 (zh)
CN (2) CN106941004B (zh)
WO (1) WO2014008786A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105632505A (zh) * 2014-11-28 2016-06-01 北京天籁传音数字技术有限公司 Encoding and decoding method and device for principal component analysis (PCA) mapping model
JP2019152871A (ja) * 2014-04-29 2019-09-12 華為技術有限公司Huawei Technologies Co.,Ltd. Signal processing method and device

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3117432B1 (en) 2014-03-14 2019-05-08 Telefonaktiebolaget LM Ericsson (publ) Audio coding method and apparatus
CN106409300B (zh) * 2014-03-19 2019-12-24 华为技术有限公司 Method and apparatus for signal processing
EP3483882A1 (en) 2017-11-10 2019-05-15 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Controlling bandwidth in encoders and/or decoders
US11133891B2 (en) 2018-06-29 2021-09-28 Khalifa University of Science and Technology Systems and methods for self-synchronized communications
US10951596B2 (en) * 2018-07-27 2021-03-16 Khalifa University of Science and Technology Method for secure device-to-device communication using multilayered cyphers
US11355139B2 (en) 2020-09-22 2022-06-07 International Business Machines Corporation Real-time vs non-real time audio streaming

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1910656A (zh) * 2004-01-20 2007-02-07 杜比实验室特许公司 Audio coding based on block grouping
EP1852849A1 (en) * 2006-05-05 2007-11-07 Deutsche Thomson-Brandt Gmbh Method and apparatus for lossless encoding of a source signal, using a lossy encoded data stream and a lossless extension data stream
CN101101755A (zh) * 2007-07-06 2008-01-09 北京中星微电子有限公司 Bit allocation and quantization method for audio coding, and audio coding device
CN101499279A (zh) * 2009-03-06 2009-08-05 武汉大学 Bit allocation method with stepwise refinement of spatial parameters, and device thereof
US20090313029A1 (en) * 2006-07-14 2009-12-17 Anyka (Guangzhou) Software Technologiy Co., Ltd. Method And System For Backward Compatible Multi Channel Audio Encoding and Decoding with the Maximum Entropy

Family Cites Families (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE3270212D1 (en) * 1982-04-30 1986-05-07 Ibm Digital coding method and device for carrying out the method
GB8421498D0 (en) * 1984-08-24 1984-09-26 British Telecomm Frequency domain speech coding
US5752225A (en) * 1989-01-27 1998-05-12 Dolby Laboratories Licensing Corporation Method and apparatus for split-band encoding and split-band decoding of audio information using adaptive bit allocation to adjacent subbands
WO1995032499A1 (fr) * 1994-05-25 1995-11-30 Sony Corporation Encoding method, decoding method, encoding-decoding method, encoder, decoder, and encoder-decoder
JP3491425B2 (ja) * 1996-01-30 2004-01-26 ソニー株式会社 Signal encoding method
JP3521596B2 (ja) * 1996-01-30 2004-04-19 ソニー株式会社 Signal encoding method
DE69924922T2 (de) * 1998-06-15 2006-12-21 Matsushita Electric Industrial Co., Ltd., Kadoma Audio coding method and audio coding device
JP3466507B2 (ja) * 1998-06-15 2003-11-10 松下電器産業株式会社 Audio coding method, audio coding device, and data recording medium
JP4242516B2 (ja) * 1999-07-26 2009-03-25 パナソニック株式会社 Subband coding system
JP4287545B2 (ja) * 1999-07-26 2009-07-01 パナソニック株式会社 Subband coding system
JP2001094433A (ja) * 1999-09-17 2001-04-06 Matsushita Electric Ind Co Ltd Subband encoding and decoding method
JP2002091498A (ja) * 2000-09-19 2002-03-27 Victor Co Of Japan Ltd Audio signal encoding device
DE60135487D1 (de) * 2000-12-22 2008-10-02 Sony Corp Encoder
US7725313B2 (en) * 2004-09-13 2010-05-25 Ittiam Systems (P) Ltd. Method, system and apparatus for allocating bits in perceptual audio coders
KR100754389B1 (ko) * 2005-09-29 2007-08-31 삼성전자주식회사 Apparatus and method for encoding speech and audio signals
GB2454190A (en) 2007-10-30 2009-05-06 Cambridge Silicon Radio Ltd Minimising a cost function in encoding data using spectral partitioning
US8207875B2 (en) 2009-10-28 2012-06-26 Motorola Mobility, Inc. Encoder that optimizes bit allocation for information sub-parts
US8386266B2 (en) * 2010-07-01 2013-02-26 Polycom, Inc. Full-band scalable audio codec
CN102081926B (zh) * 2009-11-27 2013-06-05 中兴通讯股份有限公司 Lattice vector quantization audio encoding and decoding method and system
US8831932B2 (en) 2010-07-01 2014-09-09 Polycom, Inc. Scalable audio in a multi-point environment
US9536534B2 (en) * 2011-04-20 2017-01-03 Panasonic Intellectual Property Corporation Of America Speech/audio encoding apparatus, speech/audio decoding apparatus, and methods thereof
CN102208188B (zh) 2011-07-13 2013-04-17 华为技术有限公司 Audio signal encoding and decoding method and device

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2019152871A (ja) * 2014-04-29 2019-09-12 華為技術有限公司Huawei Technologies Co.,Ltd. Signal processing method and device
JP2021043453A (ja) * 2014-04-29 2021-03-18 華為技術有限公司Huawei Technologies Co.,Ltd. Signal processing method and device
US11081121B2 (en) 2014-04-29 2021-08-03 Huawei Technologies Co., Ltd. Signal processing method and device
JP7144499B2 (ja) 2014-04-29 2022-09-29 華為技術有限公司 Signal processing method and device
US11580996B2 (en) 2014-04-29 2023-02-14 Huawei Technologies Co., Ltd. Signal processing method and device
US11881226B2 (en) 2014-04-29 2024-01-23 Huawei Technologies Co., Ltd. Signal processing method and device
CN105632505A (zh) * 2014-11-28 2016-06-01 北京天籁传音数字技术有限公司 Encoding and decoding method and device for principal component analysis (PCA) mapping model
CN105632505B (zh) * 2014-11-28 2019-12-20 北京天籁传音数字技术有限公司 Encoding and decoding method and device for principal component analysis (PCA) mapping model

Also Published As

Publication number Publication date
JP6092383B2 (ja) 2017-03-08
KR20160114192A (ko) 2016-10-04
CN103544957B (zh) 2017-04-12
EP2863388A1 (en) 2015-04-22
EP2863388B1 (en) 2018-09-12
CN103544957A (zh) 2014-01-29
KR101736705B1 (ko) 2017-05-16
CN106941004B (zh) 2021-05-18
KR101661868B1 (ko) 2016-09-30
US20150162011A1 (en) 2015-06-11
CN106941004A (zh) 2017-07-11
JP6351770B2 (ja) 2018-07-04
KR20150032737A (ko) 2015-03-27
JP2017107224A (ja) 2017-06-15
JP2015524574A (ja) 2015-08-24
US9424850B2 (en) 2016-08-23
EP2863388A4 (en) 2015-08-12

Similar Documents

Publication Publication Date Title
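JP6702593B2 (ja) Method and apparatus for encoding and decoding audio signals
WO2014008786A1 (zh) Method and apparatus for bit allocation of an audio signal
JP6351783B2 (ja) Method and apparatus for allocating bits of audio signal
JP6574820B2 (ja) Method for predicting high-frequency band signal, encoding device, and decoding device
JP6202545B2 (ja) Method for predicting bandwidth extension frequency band signal, and decoding device
US10789964B2 (en) Dynamic bit allocation methods and devices for audio signal
RU2688259C2 (ru) Signal processing method and device
WO2012139401A1 (zh) Audio coding method and device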

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application. Ref document number: 13816528; Country of ref document: EP; Kind code of ref document: A1
ENP Entry into the national phase. Ref document number: 2015520801; Country of ref document: JP; Kind code of ref document: A
NENP Non-entry into the national phase. Ref country code: DE
WWE Wipo information: entry into national phase. Ref document number: 2013816528; Country of ref document: EP
ENP Entry into the national phase. Ref document number: 20157003447; Country of ref document: KR; Kind code of ref document: A