WO2014063489A1 - Method and apparatus for bit allocation of an audio signal - Google Patents

Method and apparatus for bit allocation of an audio signal

Info

Publication number
WO2014063489A1
Authority
WO
WIPO (PCT)
Prior art keywords
group
bits
sub
bit
subbands
Prior art date
Application number
PCT/CN2013/076392
Other languages
English (en)
French (fr)
Inventor
齐峰岩
刘泽新
苗磊
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 filed Critical 华为技术有限公司
Priority to KR1020157010413A priority Critical patent/KR20150058483A/ko
Priority to EP13849179.0A priority patent/EP2892052B1/en
Priority to BR112015008609-8A priority patent/BR112015008609B1/pt
Priority to JP2015538257A priority patent/JP6121551B2/ja
Priority to SG11201502355PA priority patent/SG11201502355PA/en
Publication of WO2014063489A1 publication Critical patent/WO2014063489A1/zh
Priority to US14/675,031 priority patent/US9530420B2/en
Priority to US15/354,641 priority patent/US9972326B2/en

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/002Dynamic bit allocation
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/032Quantisation or dequantisation of spectral components
    • G10L19/035Scalar quantisation
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/0204Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using subband decomposition
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/032Quantisation or dequantisation of spectral components

Definitions

  • Embodiments of the present invention relate to the field of audio technology and, more particularly, to methods and apparatus for bit allocation of audio signals.
  • Background
  • Transform coding usually divides the frequency-domain coefficients into bands, obtains the normalized energy of each band, normalizes the in-band coefficients by that energy, then performs bit allocation, and finally quantizes the coefficients in each band according to the bits allocated to that band.
  • Bit allocation is a critical step in this process. It means that, when the spectral coefficients are quantized, the bits available for quantizing the spectral coefficients of the audio signal are distributed to the respective sub-bands according to the sub-band characteristics of the spectrum.
  • The existing bit allocation process includes: spectrum banding, for example, gradually increasing the bandwidth from low frequency to high frequency according to critical band theory; finding the normalized energy norm of each sub-band and quantizing it to obtain the sub-band normalization factor wnorm; arranging the sub-bands in descending order of the sub-band normalization factor wnorm; and bit allocation, for example, iteratively allocating the number of bits of each sub-band according to the value of its sub-band normalization factor wnorm.
  • The iterative allocation can be further broken down into the following steps: Step 1, initialize the number of bits of each sub-band and the iteration factor fac; Step 2, find the band corresponding to the largest sub-band normalization factor wnorm; Step 3, add the bandwidth value to the number of bits allocated to this band, and subtract the iteration factor fac from the value of its sub-band normalization factor wnorm; Step 4, repeat Step 2 and Step 3 until the bit allocation is complete.
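The iterative loop described above can be sketched as follows. This is a minimal illustration in Python, not the reference implementation: the function name `iterative_allocate`, the stopping rule (stop as soon as the selected band's bandwidth no longer fits in the remaining budget), and the example values are assumptions.

```python
def iterative_allocate(wnorm, bandwidth, total_bits, fac):
    """Prior-art style loop: repeatedly give the band with the largest
    working factor one more 'bandwidth' worth of bits."""
    work = list(wnorm)            # working copy of the normalization factors
    bits = [0] * len(work)        # Step 1: initialize per-band bit counts
    remaining = total_bits
    while True:
        # Step 2: find the band with the largest working factor
        k = max(range(len(work)), key=lambda i: work[i])
        # Assumed stopping rule: the chosen band no longer fits the budget
        if bandwidth[k] > remaining:
            break
        # Step 3: add the bandwidth value, then reduce the factor by fac
        bits[k] += bandwidth[k]
        remaining -= bandwidth[k]
        work[k] -= fac
    return bits


# Hypothetical two-band example: both bands are 8 coefficients wide.
example_bits = iterative_allocate([5.0, 3.0], [8, 8], total_bits=24, fac=1.5)
```

Because every step hands out a whole bandwidth value, a small budget is exhausted after only a few bands have been served, which is the granularity problem discussed in this section.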
  • In this scheme, the unit of bits allocated in each step is the bandwidth value, whereas the minimum number of bits required for quantization is smaller than the bandwidth value. This whole-bandwidth granularity makes bit allocation inefficient at low bit rates: many bands receive no bits while other bands receive too many. Because the bits are allocated by cyclic iteration over the full band with the same iteration parameters for sub-bands of different bandwidths, the allocation result is quite random, the quantization is scattered, and consecutive frames are discontinuous.
  • bit allocation has a large impact on performance.
  • The usual bit allocation distributes bits over the whole frequency band according to the normalized energy of each sub-band. When the bit rate is insufficient, the allocation is random and scattered, which produces quantization discontinuity in the time domain.
  • Summary of the invention
  • Embodiments of the present invention provide a method and apparatus for bit allocation of an audio signal that address the problem that, at low and medium bit rates, existing bit allocation methods produce random and scattered allocations and therefore quantization discontinuity in the time domain.
  • A method for bit allocation of an audio signal is provided, including: dividing the frequency band of the audio signal into a plurality of sub-bands, and quantizing the sub-band normalization factor of each sub-band; dividing the plurality of sub-bands into a plurality of groups, and obtaining the sum of the intra-group sub-band normalization factors of each group, where the sum of the intra-group sub-band normalization factors is the sum of the sub-band normalization factors of all sub-bands within the group; performing an initial inter-group bit allocation according to the sum of the intra-group sub-band normalization factors of each group, to determine an initial number of bits for each group; performing a secondary inter-group bit allocation based on the initial number of bits of each group, to allocate the coded bits of the audio signal to at least one group, where the sum of the bits allocated to the at least one group equals the coded bits of the audio signal; and allocating the bits of the audio signal assigned to each group to the sub-bands within the group.
  • Performing the secondary inter-group bit allocation includes: performing the secondary inter-group bit allocation using a bit allocation saturation algorithm.
  • Performing the bit allocation saturation algorithm includes: determining the saturation number of bits of each group; determining, according to the saturation number of bits and the initial number of bits, the bit-saturated groups and the number of excess bits, where the number of excess bits is the number of bits by which the initial number of bits of a bit-saturated group exceeds its saturation number of bits; and allocating the excess bits to the bit-unsaturated groups. Here a bit-saturated group is a group whose initial number of bits is greater than its saturation number of bits, and a bit-unsaturated group is a group whose initial number of bits is less than its saturation number of bits.
  • Allocating the excess bits to the bit-unsaturated groups includes: evenly allocating the excess bits to the bit-unsaturated groups.
  • The method further includes: determining, according to the code rate and/or the difference between the averages of the intra-group sub-band normalization factors, whether to adopt the bit allocation saturation algorithm, where the average of the intra-group sub-band normalization factors is the average of the sub-band normalization factors of all sub-bands within the group; if so, the bit allocation saturation algorithm is used, and if not, the weighting algorithm is used.
  • Performing the secondary inter-group bit allocation may further include: performing the secondary inter-group bit allocation using a weighting algorithm.
  • Performing the secondary inter-group bit allocation using a weighting algorithm includes: weighting the sum of the intra-group sub-band normalization factors of each group to obtain the weighted sum of the intra-group sub-band normalization factors of each group; and performing the secondary inter-group bit allocation on the initial number of bits according to the weighted sum of the intra-group sub-band normalization factors of each group.
  • Allocating the bits of the audio signal assigned to a group to the sub-bands in the group includes: weighting the sub-band normalization factors to obtain weighted sub-band normalization factors; and allocating, according to the weighted sub-band normalization factors, the bits of the audio signal assigned to the group to some or all of the sub-bands in the group, where the partial set of sub-bands is selected from all sub-bands within the group in descending order of the weighted sub-band normalization factor.
  • Dividing the plurality of sub-bands into a plurality of groups includes: placing sub-bands having the same bandwidth in one group, so that the plurality of sub-bands are divided into a plurality of groups; or placing sub-bands whose sub-band normalization factors are close to each other in one group, so that the plurality of sub-bands are divided into a plurality of groups.
  • A second aspect provides an apparatus for bit allocation of an audio signal, comprising: a sub-band quantization unit, configured to divide the frequency band of the audio signal into a plurality of sub-bands and quantize the sub-band normalization factor of each sub-band; a grouping unit, configured to divide the plurality of sub-bands into a plurality of groups and obtain the sum of the intra-group sub-band normalization factors of each group, where the sum of the intra-group sub-band normalization factors is the sum of the sub-band normalization factors of all sub-bands in the group; a first allocation unit, configured to perform an initial inter-group bit allocation according to the sum of the intra-group sub-band normalization factors of each group, to determine the initial number of bits of each group; a second allocation unit, configured to perform a secondary inter-group bit allocation based on the initial number of bits of each group, to allocate the coded bits of the audio signal to at least one group, wherein the at least one group of allocated
  • the second allocating unit is specifically configured to: perform a second inter-group bit allocation by using a bit allocation saturation algorithm.
  • The second allocation unit includes: a first determining module, configured to determine the saturation number of bits of each group; a second determining module, configured to determine the bit-saturated groups and the number of excess bits according to the saturation number of bits and the initial number of bits, where the number of excess bits is the number of bits by which the initial number of bits of a bit-saturated group exceeds its saturation number of bits; and an allocation module, configured to allocate the excess bits to the bit-unsaturated groups. Here a bit-saturated group is a group whose initial number of bits is greater than its saturation number of bits, and a bit-unsaturated group is a group whose initial number of bits is less than its saturation number of bits.
  • The allocation module is specifically configured to: evenly allocate the excess bits to the bit-unsaturated groups.
  • The apparatus for bit allocation of the audio signal further includes: a determining unit, configured to determine, after the initial inter-group bit allocation and before the secondary inter-group bit allocation, whether to use the bit allocation saturation algorithm according to the code rate and/or the difference between the averages of the intra-group sub-band normalization factors, where the average of the intra-group sub-band normalization factors is the average of the sub-band normalization factors of all sub-bands in the group; if so, the bit allocation saturation algorithm is used, and if not, the weighting algorithm is used.
  • the second allocating unit is further configured to: perform a secondary inter-group bit allocation by using a weighting algorithm.
  • The second allocation unit further includes: a weighting module, configured to weight the sum of the intra-group sub-band normalization factors of each group and obtain the weighted sum of the intra-group sub-band normalization factors of each group; and an allocation module, configured to perform the secondary inter-group bit allocation on the initial number of bits according to the weighted sum of the intra-group sub-band normalization factors of each group.
  • The third allocation unit includes: a weighting module, configured to weight the sub-band normalization factors to obtain weighted sub-band normalization factors; and an allocation module, configured to allocate, according to the weighted sub-band normalization factors, the bits of the audio signal assigned to the group to some or all of the sub-bands in the group, where the partial set of sub-bands is selected from all sub-bands within the group in descending order of the weighted sub-band normalization factor.
  • The grouping unit is specifically configured to: place sub-bands having the same bandwidth in one group, so that the plurality of sub-bands are divided into a plurality of groups; or place sub-bands whose sub-band normalization factors are close to each other in one group, so that the plurality of sub-bands are divided into a plurality of groups.
  • the subbands in each group have the same bandwidth, or a substantially similar subband normalization factor.
  • FIG. 1 is a flow chart of a method of bit allocation of an audio signal in accordance with an embodiment of the present invention.
  • FIG. 2 is a block diagram showing the structure of an apparatus for bit allocation of an audio signal according to an embodiment of the present invention.
  • FIG. 3 is a block diagram showing the structure of a second allocation unit in the apparatus for bit allocation of an audio signal according to an embodiment of the present invention.
  • FIG. 4 is a block diagram showing another structure of an apparatus for bit allocation of an audio signal according to an embodiment of the present invention.
  • FIG. 5 is a block diagram showing the structure of a third allocation unit in an apparatus for bit allocation of an audio signal according to an embodiment of the present invention.
  • FIG. 6 is a block diagram showing still another structure of an apparatus for bit allocation of an audio signal according to an embodiment of the present invention.
  • Detailed description
  • Coding technology solutions and decoding technology solutions are widely used in various electronic devices, such as: mobile phones, wireless devices, personal data assistants (PDAs), handheld or portable computers, GPS receivers/navigators, cameras, audio/video Players, camcorders, video recorders, surveillance equipment, etc.
  • Such an electronic device includes an audio encoder or an audio decoder. The audio encoder or decoder may be implemented directly by a digital circuit or a chip such as a DSP (digital signal processor), or by a processor executing the corresponding procedure in software code.
  • an audio time domain signal is first converted into a frequency domain signal, and then a coded bit is allocated to an audio frequency domain signal for encoding, and the encoded signal is transmitted to a decoding end through a communication system.
  • the decoding end decodes and recovers the encoded signal.
  • The present invention performs bit allocation based on grouping and on the characteristics of the signal.
  • the bands are grouped, and according to the characteristics of each group, the energy in the group is weighted, the bits are allocated according to the weighted energy, and the bits are allocated to each band according to the characteristics of the signals in the group. Because the entire group is allocated first, the phenomenon of discontinuous distribution is avoided, thereby improving the coding quality of different signals.
  • the characteristics of the signal are also taken into account in the intra-group allocation, so that limited bits can be allocated to important audio bands that affect perception.
  • FIG. 1 is a flow chart of a method of bit allocation of an audio signal in accordance with an embodiment of the present invention.
  • the MDCT transform is taken as an example for description below.
  • the input audio signal is subjected to MDCT transform to obtain frequency domain coefficients.
  • The MDCT here can include three processes: windowing, time-domain aliasing, and the discrete cosine transform (DCT).
  • The signal after windowing is given by a piecewise definition whose second branch covers n = L, ..., 2L − 1 [equation missing in source].
  • I_{L/2} and J_{L/2} denote diagonal matrices of order L/2 [matrix definitions missing in source].
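For orientation, the transform stage can be sketched with the standard direct-form MDCT definition. This is an illustrative Python sketch, not the decomposition into windowing, time-domain aliasing, and DCT used in the text: it evaluates the cosine sum directly in O(L²) time, and the sine window is an assumption, since the window used by the embodiment is not recoverable here. The names `mdct` and `sine_window` are hypothetical.

```python
import math


def sine_window(n_total):
    """Sine window of length 2L (an assumed window choice)."""
    return [math.sin(math.pi * (n + 0.5) / n_total) for n in range(n_total)]


def mdct(frame, window):
    """Direct-form MDCT: 2L windowed input samples -> L coefficients."""
    n_total = len(frame)
    if n_total % 2 != 0 or len(window) != n_total:
        raise ValueError("frame and window must share an even length 2L")
    half = n_total // 2                             # L
    xw = [w * x for w, x in zip(window, frame)]     # windowing
    n0 = 0.5 + half / 2.0                           # standard MDCT phase offset
    return [
        sum(xw[n] * math.cos(math.pi / half * (n + n0) * (k + 0.5))
            for n in range(n_total))
        for k in range(half)
    ]


# Hypothetical 8-sample frame (L = 4), producing 4 frequency-domain coefficients.
coeffs = mdct([0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0, -1.0], sine_window(8))
```

A production codec would use a fast algorithm instead; the direct sum is only meant to make the input and output sizes concrete.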
  • the frequency domain envelope is then extracted from the MDCT coefficients and quantized.
  • the entire frequency band is divided into subbands of different frequency domain resolutions, the normalization factor of each subband is extracted, and the subband normalization factor is quantized.
  • For example, the frequency band corresponding to an 8 kHz bandwidth, with a frame length of 20 ms and a total of 320 spectral coefficients, can be divided into the following 26 sub-bands:
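  • The normalization factor of each sub-band can be defined as:

$$\mathrm{Norm}(p) = \sqrt{\frac{1}{L_p} \sum_{k=s_p}^{e_p} y(k)^2}, \quad p = 0, \ldots, P-1 \qquad (6)$$

  • where L_p is the number of coefficients in the sub-band, s_p is the starting point of the sub-band, e_p is the ending point of the sub-band, y(k) is the k-th frequency-domain coefficient, and P is the total number of sub-bands.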
  • After the normalization factor is obtained, it can be quantized in the log domain to obtain the quantized sub-band normalization factor wnorm.
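The computation of the sub-band normalization factor and its log-domain quantization can be sketched as below. This is a minimal illustration under stated assumptions: the normalization factor is taken to be the root-mean-square value of the coefficients in the sub-band, and the quantizer (uniform rounding of log2 with a 0.5 step and a small floor to avoid log of zero) is a hypothetical choice, since the text does not specify the quantizer. The names `subband_norms` and `quantize_log_domain` are illustrative.

```python
import math


def subband_norms(coeffs, band_edges):
    """RMS of the coefficients in each sub-band.

    band_edges holds (start, end) index pairs, with end inclusive.
    """
    norms = []
    for start, end in band_edges:
        band = coeffs[start:end + 1]
        norms.append(math.sqrt(sum(c * c for c in band) / len(band)))
    return norms


def quantize_log_domain(norms, step=0.5, floor=2.0 ** -10):
    """Assumed uniform quantizer on log2(norm); step and floor are illustrative."""
    return [round(math.log2(max(v, floor)) / step) for v in norms]


# Hypothetical frame with two sub-bands of two coefficients each.
norms = subband_norms([3.0, 4.0, 0.0, 0.0], [(0, 1), (2, 3)])
wnorm = quantize_log_domain(norms)
```

The integer indices in `wnorm` play the role of the quantized factors that drive the grouping and allocation steps that follow.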
  • The plurality of sub-bands are divided into a plurality of groups, and the sum of the intra-group sub-band normalization factors of each group is obtained, where the sum of the intra-group sub-band normalization factors is the sum of the sub-band normalization factors of all sub-bands in the group.
  • all sub-bands are divided into a plurality of groups, and group parameters of each group are obtained, wherein the group parameters may be a sum of intra-group sub-band normalization factors used to characterize the signal characteristics and energy attributes of the group.
  • subbands having the same bandwidth may be divided into one group, and adjacent subbands having the same bandwidth are preferably divided into one group.
  • all subbands can be divided into three groups, and at a low bit rate, only the first group or the first two groups are used, and the remaining groups are not allocated bits.
  • subbands with subband normalization factors wnorm close to each other can be grouped.
  • For example, if wnorm[i] is greater than a predetermined threshold K, the sub-band number i is recorded; finally, the sub-bands whose sub-band normalization factor wnorm[i] is greater than the predetermined threshold K are placed in one group, and the remaining sub-bands are placed in another group. It should be understood that a plurality of predetermined thresholds may be set according to different needs, thereby obtaining more groups.
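Both grouping strategies can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, the threshold version assumes the thresholds are supplied in descending order, and the bandwidth version groups runs of adjacent sub-bands with equal bandwidth.

```python
def group_by_threshold(wnorm, thresholds):
    """Split sub-band indices into len(thresholds) + 1 groups.

    thresholds must be sorted in descending order; a sub-band joins the
    first group whose threshold its factor exceeds.
    """
    groups = [[] for _ in range(len(thresholds) + 1)]
    for i, value in enumerate(wnorm):
        for g, k in enumerate(thresholds):
            if value > k:
                groups[g].append(i)
                break
        else:
            groups[-1].append(i)      # not above any threshold
    return groups


def group_adjacent_same_bandwidth(bandwidths):
    """Put each run of adjacent sub-bands with equal bandwidth in one group."""
    groups = []
    for i, bw in enumerate(bandwidths):
        if groups and bandwidths[groups[-1][-1]] == bw:
            groups[-1].append(i)
        else:
            groups.append([i])
    return groups


two_groups = group_by_threshold([9, 2, 7, 1], [5])
by_width = group_adjacent_same_bandwidth([8, 8, 16, 16, 24])
```

With a single threshold K the first function yields exactly two groups; passing more thresholds yields more groups, as the text notes.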
  • the group parameters for each group can be obtained to characterize the energy properties of the group.
  • The group parameters may include one or more of the following: the sum of the intra-group sub-band normalization factors, group_wnorm, and the peak-to-average ratio of the intra-group sub-band normalization factors, group_sharp.
  • the bits of the audio signal can be assigned to each group according to the group parameters.
  • the principle of grouping is used to consider the energy properties of the group, so that the bit allocation of the audio signal is more concentrated, and the bit allocation between frames is more continuous.
  • the group parameters are not limited to the ones listed herein, but may be other parameters that can characterize the energy properties of the group.
  • Coded bits may be allocated to at least one group according to the sum of the intra-group sub-band normalization factors of each group, where the sum of the bits allocated to the at least one group equals the bits of the audio signal.
  • According to the sum of the intra-group sub-band normalization factors of each group, group_wnorm[i], the initial number of bits allocated to each group is obtained.
  • After the initial number of bits of each group is determined, a secondary inter-group bit allocation can be performed.
  • For example, the secondary inter-group bit allocation can be performed using the bit allocation saturation algorithm.
  • the number of saturated bits is generally an empirical value, such as an average of 1 to 2 bits per spectral coefficient.
  • the number of saturated bits can also be related to the encoding rate and signal characteristics.
  • the bit saturation group and the excess number of bits are determined according to the number of saturated bits and the initial number of bits, and finally the number of redundant bits is allocated to the bit unsaturated group. For example, the number of extra bits can be equally assigned to the bit unsaturated group.
  • bit saturation group refers to a group whose initial number of bits is more than the number of saturated bits
  • bit unsaturated group refers to a group whose initial number of bits is less than the saturation bit number.
  • the number of extra bits refers to the number of bits in which the initial bit number of the bit saturation group is larger than the saturation bit number of the group.
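The saturation algorithm can be sketched as follows. This is a minimal single-pass illustration in Python, assuming integer bit counts and an even split of the excess; the function name and the rule that leftover bits from integer division go to the first unsaturated groups are assumptions.

```python
def saturation_reallocate(initial_bits, saturation_bits):
    """Clip bit-saturated groups and share the excess evenly among
    bit-unsaturated groups."""
    bits = list(initial_bits)
    excess = 0
    unsaturated = []
    for g, (b, cap) in enumerate(zip(initial_bits, saturation_bits)):
        if b > cap:                   # bit-saturated group
            excess += b - cap
            bits[g] = cap
        elif b < cap:                 # bit-unsaturated group
            unsaturated.append(g)
    if not unsaturated:
        return list(initial_bits)     # nowhere to move bits; keep the input
    share, leftover = divmod(excess, len(unsaturated))
    for rank, g in enumerate(unsaturated):
        # Assumption: leftover bits go to the first unsaturated groups
        bits[g] += share + (1 if rank < leftover else 0)
    return bits


# Hypothetical initial allocation checked against saturation values 288, 256, 96.
rebalanced = saturation_reallocate([300, 200, 50], [288, 256, 96])
```

The total number of bits is preserved. A single pass can push a receiving group above its own saturation value; repeating the pass until nothing changes is one possible refinement.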
  • Alternatively, the secondary inter-group bit allocation can be performed using a weighting algorithm.
  • Specifically, according to group_sharp, the peak-to-average ratio of the intra-group sub-band normalization factors, the sum of the intra-group sub-band normalization factors group_wnorm is weighted to obtain the weighted sum group_wnorm_w.
  • two adjacent groups are successively selected, such as the first group and the second group.
  • The peak-to-average ratio of the intra-group sub-band normalization factors of the first group, group_sharp[i-1], is compared with that of the second group, group_sharp[i]. If the peak-to-average ratio of the first group exceeds that of the second group by more than a first threshold, the sum of the intra-group sub-band normalization factors of the first group is adjusted according to a first weighting factor, and that of the second group is adjusted according to a second weighting factor. If the peak-to-average ratio of the second group exceeds that of the first group by more than a second threshold, the sum of the intra-group sub-band normalization factors of the second group is adjusted according to the first weighting factor, and that of the first group is adjusted according to the second weighting factor.
  • Here the group index is i = 1, ..., P − 1, where P is the total number of sub-bands; b is the weight, a is the first threshold, and c is the second threshold. It should be understood that a, b, and c can be selected according to the needs of bit allocation.
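One possible reading of this pairwise weighting can be sketched in Python. Several points are assumptions rather than statements of the method: the comparison is taken as a difference of peak-to-average ratios, the adjustment is taken as multiplication, the two weighting factors are passed separately as the hypothetical parameters `w1` and `w2`, and adjustments accumulate as adjacent pairs are visited in order.

```python
def weight_group_sums(group_wnorm, group_sharp, a, c, w1, w2):
    """Adjust adjacent group sums according to their peak-to-average ratios.

    a, c   : first and second thresholds on the ratio difference
    w1, w2 : first and second weighting factors (assumed multiplicative)
    """
    weighted = list(group_wnorm)
    for i in range(1, len(group_wnorm)):
        first, second = i - 1, i
        if group_sharp[first] - group_sharp[second] > a:
            weighted[first] *= w1
            weighted[second] *= w2
        elif group_sharp[second] - group_sharp[first] > c:
            weighted[second] *= w1
            weighted[first] *= w2
    return weighted


# Hypothetical values: group 0 is much peakier than group 1.
group_wnorm_w = weight_group_sums(
    [10, 10, 10], [5.0, 2.0, 2.5], a=2.0, c=2.0, w1=1.5, w2=0.5
)
```

Whether the peakier group should gain or lose weight depends on the choice of `w1` and `w2`, which, like the thresholds, would be tuned to the needs of bit allocation.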
  • The above only schematically illustrates a simple weighting method.
  • Other weighting methods that adjust the weights of the sub-bands through different weighting coefficients should be readily apparent to those skilled in the art. For example, the weight of sub-bands that need more bits can be increased, and the weight of sub-bands that need no bits or fewer bits can be reduced.
  • The bits of the audio signal are then assigned to each group based on the weighted sum of the intra-group sub-band normalization factors. For example, the number of bits of a group is determined by the ratio of the group's weighted sum of sub-band normalization factors to sum_wnorm, the sum of the sub-band normalization factors of all sub-bands, and that number of bits of the audio signal is assigned to the group.
  • group_bits[i] = sum_bits * group_wnorm[i] / sum_wnorm, where sum_bits is the total number of bits of the audio signal to be allocated, and sum_wnorm is the sum of the sub-band normalization factors of all sub-bands.
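The proportional rule can be sketched as a short Python function. Two details are assumptions made so that the example yields integer bit counts that add up to sum_bits: each share is truncated, and the rounding remainder is given to the first group. Here sum_wnorm is computed as the sum of the group values passed in, which matches the text when the groups cover all sub-bands.

```python
def allocate_between_groups(group_wnorm, sum_bits):
    """group_bits[i] = sum_bits * group_wnorm[i] / sum_wnorm, in whole bits."""
    sum_wnorm = sum(group_wnorm)
    if sum_wnorm <= 0:
        raise ValueError("the group sums must add up to a positive value")
    bits = [int(sum_bits * w / sum_wnorm) for w in group_wnorm]
    # Assumption: bits lost to truncation are given to the first group
    bits[0] += sum_bits - sum(bits)
    return bits


# Hypothetical three-group example with 600 bits to distribute.
group_bits = allocate_between_groups([30, 20, 10], 600)
```

The same function serves for the initial allocation (unweighted sums) and for the weighted secondary allocation (weighted sums).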
  • The secondary inter-group bit allocation described above can be further optimized, for example by choosing between different secondary inter-group bit allocation schemes, such as the saturation algorithm or the weighting algorithm, according to the code rate and/or the difference between the averages of the intra-group sub-band normalization factors.
  • That is, whether to use the bit allocation saturation algorithm or the weighting algorithm is determined according to the code rate and/or the difference between the averages of the intra-group sub-band normalization factors, where the average of the intra-group sub-band normalization factors is the average of the sub-band normalization factors of all sub-bands within the group.
  • The bits assigned to each group can be further allocated to the individual sub-bands within the group.
  • The bits of the audio signal assigned to a group can be allocated to the sub-bands in the group according to the signal characteristics of the audio signal, that is, the signal type, and according to the sub-band normalization factors of the respective sub-bands in the group.
  • The sub-band normalization factors are weighted to obtain weighted sub-band normalization factors; according to the weighted sub-band normalization factors, the bits of the audio signal assigned to the group are allocated to some or all of the sub-bands within the group, where the partial set of sub-bands is selected from all sub-bands within the group in descending order of the weighted sub-band normalization factor.
  • A typical implementation of allocating the bits of the audio signal assigned to the group to all sub-bands within the group is as follows: after the weighted sub-band normalization factors of all sub-bands are determined, the sum of the weighted sub-band normalization factors of all sub-bands in the group is calculated; the bits assigned to the group are then allocated to each sub-band according to the ratio of that sub-band's weighted sub-band normalization factor to the sum of the weighted sub-band normalization factors of all sub-bands.
  • A typical implementation of allocating the bits of the audio signal assigned to the group to a partial set of sub-bands within the group is as follows: the weighted sub-band normalization factors of the sub-bands within the group are sorted, for example in descending order; according to this ordering, the sub-bands corresponding to the top-ranked weighted sub-band normalization factors are selected; and the bits of the audio signal assigned to the group are allocated to the selected sub-bands within the group.
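The selection of top-ranked sub-bands followed by a proportional split can be sketched as below. This is an illustrative Python sketch: the function name, the truncation to whole bits, and passing the number of selected sub-bands as the parameter `n_selected` are assumptions. Sub-bands whose weighted factor is not greater than zero receive no bits.

```python
def allocate_within_group(weighted_wnorm, group_bits, n_selected):
    """Give the group's bits to the n_selected sub-bands with the largest
    weighted factors, in proportion to those factors."""
    order = sorted(range(len(weighted_wnorm)),
                   key=lambda i: weighted_wnorm[i], reverse=True)
    chosen = [i for i in order[:n_selected] if weighted_wnorm[i] > 0]
    bits = [0] * len(weighted_wnorm)
    total = sum(weighted_wnorm[i] for i in chosen)
    if total == 0:
        return bits                   # no sub-band qualifies
    for i in chosen:
        bits[i] = int(group_bits * weighted_wnorm[i] / total)
    return bits


# Hypothetical group of four sub-bands; only the top two receive bits.
band_bits = allocate_within_group([4, 1, 3, 2], group_bits=70, n_selected=2)
```

Setting `n_selected` to the number of sub-bands in the group reproduces the allocation to all sub-bands.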
  • The method for bit allocation of an audio signal according to this embodiment keeps the allocation of consecutive frames relatively stable through grouping and reduces the influence of global allocation on local discontinuity; through the secondary allocation, the redundant bits of saturated sub-bands are used effectively, which makes the bit allocation more reasonable.
  • The plurality of sub-bands of the audio signal are divided into a plurality of groups, and the initial number of bits allocated to each group is obtained according to the sum of the intra-group sub-band normalization factors of each group, group_wnorm[i]. For example, all sub-bands are divided into three groups.
  • The initial number of bits of the second group is B2 = sum_bits * group_wnorm[1] / sum_wnorm.
  • Step 1: Calculate the differences between the averages of the intra-group sub-band normalization factors:
  • Avg_diff[0] = group_avg[0] - group_avg[1];
  • Avg_diff[1] = group_avg[1] - group_avg[2];
  • Step 2: Select a secondary inter-group bit allocation scheme, for example by determining whether to use the bit allocation saturation algorithm or the weighting algorithm according to the code rate and the difference between the averages of the intra-group sub-band normalization factors.
  • Step 3: Post-processing. If group_wnorm[2] of the group of highest sub-bands is less than a certain value, the bits allocated to this group are reallocated to the group of lower sub-bands. For example, when group_wnorm[2] is less than the threshold d, the bits allocated to the highest group are reallocated to the second-highest group, and the number of bits of the highest group is set to zero.
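The post-processing step can be sketched as a small Python function. The function name is hypothetical, and the sketch assumes the groups are ordered from the lowest to the highest frequency, so that the last entry is the highest group.

```python
def fold_weak_top_group(group_bits, group_wnorm, d):
    """If the highest group's sum of factors is below d, move its bits to
    the second-highest group and set its own bit count to zero."""
    bits = list(group_bits)
    if group_wnorm[-1] < d:
        bits[-2] += bits[-1]
        bits[-1] = 0
    return bits


# Hypothetical three-group example with threshold d = 5.
post_bits = fold_weak_top_group([200, 100, 20], [50, 30, 4], d=5)
```

The total number of bits is unchanged; only their placement between the two highest groups differs.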
  • B_saved = B_saved + (B1 - B1_UP);
  • B_saved = B_saved + (B2 - B2_UP);
  • B3 = B3_UP;
  • B1_UP, B2_UP, B3_UP are empirical factors, which can be 288, 256, 96 respectively.
  • B_saved is evenly distributed to the other groups. If the bits allocated to the first group are not saturated, half of B_saved is first added to B1; it is then determined whether the bits allocated to the second group are saturated. For example, if the bits allocated to the first group are not saturated, B2 is reassigned to sum_bits - B1 - B3; otherwise B3 is reassigned to sum_bits - B1 - B2. The pseudo code of the algorithm is as follows:
  • B2 = sum_bits - B1 - B3;
  • B3 = sum_bits - B1 - B2;
  • sum_bits is the total number of bits
  • FAC1 and FAC2 are empirical factors, which can be 2.0, 1.5 or 2.0, 3.0, etc., respectively.
  • Step 2: Sort the subband normalization factors wnorm of all subbands in the group in descending order to obtain wnorm_index(i).
  • Step 3: Weight the sorted wnorm_index(i) according to the weighting parameters factor[] as follows:
  • wnorm_index(i) = wnorm_index(i) … (the weighting term is illegible in the source), 0 ≤ i < band_num
  • Step 4: According to the values of the sorted wnorm_index(i), the bits allocated to the group are further allocated to the subbands in the group.
  • Step 4.1: Divide the total number of bits Bx of the group by the threshold Thr to obtain the initial number of subbands to receive bits in the group, BitBand_num.
  • Step 4.2: Determine the number N of subbands for bit allocation from the relationship between BitBand_num and the total number of subbands in the group, sumBand_num. For example, if BitBand_num is greater than k*sumBand_num, where k is a coefficient such as 0.75 or 0.8, then N equals sumBand_num; otherwise N equals BitBand_num.
  • Step 4.3: Select the first N subbands, where N is the number of subbands in the group that take part in bit allocation.
  • Step 4.4: Initialize the number of bits of the N subbands to 1, and initialize the loop counter j to 0.
  • Step 4.5: Determine band_wnorm, the sum of the subband normalization factors of those of the N subbands whose subband normalization factor is greater than zero.
  • Step 4.6: Allocate bits to those of the N subbands whose subband normalization factor is greater than zero:
  • band_bits[i] = Bx * wnorm_index(i) / band_wnorm;
  • Here Bx is the number of bits allocated to the group.
  • In the embodiment above, the numbers of bits of the three groups are B1, B2 and B3 respectively.
  • Step 4.7: Determine whether the number of bits allocated to the last of the N subbands is less than a fixed threshold fac. If it is less than fac, set the number of bits allocated to that subband to zero; if it is greater than or equal to fac, jump directly to step 4.9; otherwise go to step 4.8.
  • Step 4.8: Add 1 to the loop counter j;
  • Step 4.9: Restore the original ordering of all subbands in the group, that is, the ordering of all subbands before the subband normalization factor of each subband was quantized.
  • The grouping used in the embodiments of the present invention keeps the allocation stable between consecutive frames, and bits are allocated within each group with an emphasis that depends on the signal characteristics, so that the allocated bits are spent on quantizing the important spectral information, thereby improving the coding quality of the audio signal.
  • The method for bit allocation of an audio signal can, through grouping, keep the allocation stable between consecutive frames and reduce the influence of global allocation on local discontinuity.
  • The bit allocation in each group can use different threshold parameters, so that bits are allocated more adaptively, and the emphasis of the allocation within a group follows the characteristics of the spectral signal. For example, for a harmonic-like signal with a concentrated spectrum, the allocation concentrates on the subbands with large energy and the subbands between the harmonics need no additional bits; for a signal with a flatter spectrum, the allocation tries to stay smooth across subbands. In this way the allocated bits are spent on quantizing the important spectral information.
  • A schematic structure of an apparatus for bit allocation of an audio signal according to an embodiment of the present invention is described below with reference to FIG. 2.
  • The apparatus 20 for bit allocation of an audio signal includes a subband quantization unit 21, a grouping unit 22, a first allocation unit 23, a second allocation unit 24, and a third allocation unit 25, where:
  • the subband quantization unit 21 is configured to divide the frequency band of the audio signal into a plurality of subbands, and quantize the subband normalization factor of each subband.
  • the grouping unit 22 is configured to divide the plurality of sub-bands into a plurality of groups, and obtain a sum of intra-group sub-band normalization factors of each group, wherein a sum of the sub-band normalization factors in the group is all in the group The sum of the subband normalization factors of the subbands.
  • Optionally, the grouping unit 22 is specifically configured to put subbands having the same bandwidth into one group, so that the plurality of subbands is divided into a plurality of groups; or to put subbands whose subband normalization factors are close into one group, so that the plurality of subbands is divided into a plurality of groups.
  • Preferably, the subbands in each group have the same bandwidth, or have close subband normalization factors.
  • the first allocating unit 23 is configured to perform initial inter-group bit allocation according to the sum of the intra-group sub-band normalization factors of each group to determine the initial number of bits of each group.
  • The second allocation unit 24 is configured to perform secondary inter-group bit allocation based on the initial number of bits of each group, so as to allocate the coded bits of the audio signal to at least one group, where the sum of the bits allocated to the at least one group is the coded bits of the audio signal.
  • The second allocation unit 24 may be configured to perform the secondary inter-group bit allocation by using a saturation algorithm for bit allocation.
  • The second allocation unit 24 may include a first determining module 241, a second determining module 242, and an allocation module 243, where:
  • the first determining module 241 is configured to determine a saturation bit number of each group;
  • the second determining module 242 is configured to determine, according to the saturation bit number and the initial number of bits, a bit-saturated group and an excess bit number, where the excess bit number is the number of bits by which the initial number of bits of the bit-saturated group exceeds the saturation bit number;
  • the allocation module 243 is configured to allocate the excess bit number to a bit-unsaturated group, where a bit-saturated group is a group whose initial number of bits is greater than its saturation bit number, and a bit-unsaturated group is a group whose initial number of bits is less than its saturation bit number.
  • The allocation module 243 may be configured to distribute the excess bit number evenly among the bit-unsaturated groups.
  • The second allocation unit may also be configured to perform the secondary inter-group bit allocation by using a weighting algorithm.
  • The second allocation unit 24 may further include a weighting module 244 and the allocation module 243, where:
  • the weighting module 244 is configured to weight the sum of intra-group subband normalization factors of each group to obtain a weighted sum of intra-group subband normalization factors of each group;
  • the allocation module 243 is configured to perform the secondary inter-group bit allocation on the initial number of bits according to the weighted sum of intra-group subband normalization factors of each group.
  • The apparatus 20 for bit allocation of an audio signal may further include a determining unit 26, configured to determine, after the initial inter-group bit allocation and before the secondary inter-group bit allocation, according to the differences between the group means of the subband normalization factors and/or the code rate, whether to use the saturation algorithm for bit allocation, where the group mean of the subband normalization factors is the mean of the subband normalization factors of all subbands in the group. If so, the determining unit 26 determines to use the saturation algorithm; otherwise it determines to use the weighting algorithm, as shown in FIG. 4.
  • the third allocation unit 25 is for assigning bits of the audio signal assigned to the group to sub-bands within the group.
  • The third allocation unit 25 may include a weighting module 251 and an allocation module 252, where:
  • the weighting module 251 is configured to weight the subband normalization factors to obtain weighted subband normalization factors;
  • the allocation module 252 is configured to allocate, according to the weighted subband normalization factors, the bits of the audio signal allocated to the group to some or all of the subbands in the group, where the partial set of subbands is selected from all subbands in the group in descending order of the weighted subband normalization factor.
  • The apparatus for bit allocation of an audio signal can, through grouping, keep the allocation stable between consecutive frames and reduce the influence of global allocation on local discontinuity. The grouping used in the embodiments of the present invention thus keeps the allocation stable between consecutive frames, and bits are allocated within each group with an emphasis that depends on the signal characteristics, so that the allocated bits are spent on quantizing the important spectral information, thereby improving the coding quality of the audio signal.
  • An embodiment of the present invention further provides another apparatus 60 for bit allocation of an audio signal. The apparatus includes a memory 61 and a processor 62, where the memory 61 is configured to store code implementing the steps of the foregoing method embodiments, and the processor 62 is configured to process the code stored in the memory.
  • The apparatus for bit allocation of the audio signal can, through grouping, keep the allocation stable between consecutive frames and reduce the influence of global allocation on local discontinuity.
  • The bit allocation in each group can use different threshold parameters, so that bits are allocated more adaptively, and the emphasis of the allocation within a group follows the characteristics of the spectral signal. For example, for a harmonic-like signal with a concentrated spectrum, the allocation concentrates on the subbands with large energy and the subbands between the harmonics need no additional bits; for a signal with a flatter spectrum, the allocation tries to stay smooth across subbands. In this way the allocated bits are spent on quantizing the important spectral information.
  • the disclosed systems, devices, and methods may be implemented in other ways.
  • the device embodiments described above are merely illustrative.
  • the division of the unit is only a logical function division.
  • In actual implementation there may be another division manner; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be electrical, mechanical or otherwise.
  • the units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solution of the embodiment.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • The functions, if implemented in the form of a software functional unit and sold or used as a standalone product, may be stored in a computer-readable storage medium.
  • The technical solution of the present invention, in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention.
  • The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

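A method and apparatus for bit allocation of an audio signal. The method includes: dividing the frequency band of an audio signal into a plurality of subbands and quantizing a subband normalization factor of each subband (101); dividing the plurality of subbands into a plurality of groups and obtaining, for each group, a sum of intra-group subband normalization factors, where the sum of intra-group subband normalization factors is the sum of the subband normalization factors of all subbands in the group (102); performing initial inter-group bit allocation according to each group's sum of intra-group subband normalization factors, to determine an initial number of bits of each group (103); performing secondary inter-group bit allocation based on the initial number of bits of each group, to allocate the coded bits to at least one group, where the sum of the bits allocated to the at least one group is the coded bits of the audio signal (104); and allocating the bits of the audio signal allocated to a group to the subbands in that group (105). At low and medium bit rates, the method and apparatus use grouping to keep the allocation stable between consecutive frames and to reduce the influence of global allocation on local discontinuity.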

Description

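Method and apparatus for bit allocation of an audio signal

This application claims priority to Chinese Patent Application No. 201210415253.6, filed with the Chinese Patent Office on October 26, 2012 and entitled "Method and apparatus for bit allocation of an audio signal", which is incorporated herein by reference in its entirety.

Technical Field

Embodiments of the present invention relate to the field of audio technologies, and more specifically, to a method and apparatus for bit allocation of an audio signal.

Background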
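Current communication places increasing emphasis on audio quality, so codecs are required to improve music quality as much as possible while preserving speech quality. Because a music signal carries a very large amount of information, the CELP (Code Excited Linear Prediction) coding mode of traditional speech coding cannot be used. Instead, transform coding is generally used to process the music signal in the frequency domain and improve its coding quality. How to encode the information efficiently with a limited number of coding bits has therefore become a major research topic in audio coding.
Current audio coding technologies generally use the FFT (Fast Fourier Transform) or the MDCT (Modified Discrete Cosine Transform) to convert a time-domain signal to the frequency domain and then encode the frequency-domain signal. Transform coding usually divides the frequency-domain coefficients into bands, computes the normalized energy of each band, normalizes the energy of the coefficients in each band, performs bit allocation, and finally quantizes the coefficients in each band according to the bits allocated to that band. Bit allocation is a critical step: in the process of quantizing the spectral coefficients, the bits of the audio signal used to quantize the spectral coefficients are allocated to the subbands according to the subband characteristics of the spectrum.
Specifically, the existing bit allocation process includes: dividing the spectral signal into bands, for example with the bandwidth increasing gradually from low to high frequency according to critical-band theory; after the spectrum is divided, computing the normalized energy norm of each subband and quantizing it to obtain the subband normalization factor wnorm; sorting the subbands in descending order of wnorm; and allocating bits, for example by iteratively allocating the number of bits of each subband according to the value of wnorm. The iterative allocation can be broken down into the following steps: step 1, initialize the number of bits of each subband and an iteration factor fac; step 2, find the band with the largest subband normalization factor wnorm; step 3, add the bandwidth value to the number of bits allocated to that band, and subtract the iteration factor fac from its wnorm; step 4, repeat steps 2 and 3 until all bits have been allocated. In the prior art, therefore, the smallest unit allocated each time is the bandwidth value, whereas the minimum number of bits required for quantization is smaller than the bandwidth value. Such integer bit allocation is inefficient at low bit rates: many bands receive no bits while other bands receive too many. Because bits are allocated iteratively over the full band, with the same iteration parameters for subbands of different bandwidths, the allocation result is rather random, the quantization is scattered, and consecutive frames are discontinuous.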
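The prior-art loop described above (steps 1 to 4) can be sketched in Python as follows. This is only an illustrative reading of the text, not code from the patent: the function name, the stopping rule when the remaining bits are fewer than the selected band's width, and the sample values are assumptions.

```python
def iterative_allocate(wnorm, widths, total_bits, fac=1.0):
    """Greedy full-band allocation: repeatedly give one bandwidth's worth of
    bits to the band with the largest (decayed) normalization factor."""
    work = list(wnorm)              # working copy; decays by fac per grant
    bits = [0] * len(wnorm)         # step 1: initialise bits per subband
    remaining = total_bits
    while remaining > 0:
        # Step 2: band with the largest current factor (first one on ties).
        idx = max(range(len(work)), key=work.__getitem__)
        # Assumption: stop when the grant no longer fits in the budget.
        if widths[idx] > remaining:
            break
        # Step 3: grant the bandwidth value and decay the factor.
        bits[idx] += widths[idx]
        remaining -= widths[idx]
        work[idx] -= fac
    return bits


example_bits = iterative_allocate([10, 8, 3], [8, 8, 16], total_bits=32, fac=2)
```

With these sample values the third band receives nothing, which illustrates the drawback noted in the text: at low budgets some bands get no bits while others get several grants.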
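Thus, at low bit rates, bit allocation has a large impact on performance. Conventional bit allocation is mainly performed over the full band according to the normalized energy of each subband. When the bit rate is insufficient, this allocation is random and scattered, which produces quantization discontinuities in the time domain.

Summary

Embodiments of the present invention provide a method and apparatus for bit allocation of an audio signal, which address the problem that, at low and medium bit rates, existing bit allocation methods produce random and scattered allocations and thereby cause quantization discontinuities in the time domain.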
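According to a first aspect, a method for bit allocation of an audio signal is provided, including: dividing the frequency band of an audio signal into a plurality of subbands and quantizing a subband normalization factor of each subband; dividing the plurality of subbands into a plurality of groups and obtaining a sum of intra-group subband normalization factors of each group, where the sum of intra-group subband normalization factors is the sum of the subband normalization factors of all subbands in the group; performing initial inter-group bit allocation according to the sum of intra-group subband normalization factors of each group, to determine an initial number of bits of each group; performing secondary inter-group bit allocation based on the initial number of bits of each group, to allocate the coded bits of the audio signal to at least one group, where the sum of the bits allocated to the at least one group is the coded bits of the audio signal; and allocating the bits of the audio signal allocated to the group to the subbands in the group.
With reference to the first aspect, in a first implementation of the first aspect, performing the secondary inter-group bit allocation includes: performing the secondary inter-group bit allocation by using a saturation algorithm for bit allocation.
With reference to the first implementation of the first aspect, in a second implementation of the first aspect, performing the secondary inter-group bit allocation by using the saturation algorithm includes: determining a saturation bit number of each group; determining, according to the saturation bit number and the initial number of bits, a bit-saturated group and an excess bit number, where the excess bit number is the number of bits by which the initial number of bits of the bit-saturated group exceeds the saturation bit number; and allocating the excess bit number to a bit-unsaturated group, where a bit-saturated group is a group whose initial number of bits is greater than its saturation bit number, and a bit-unsaturated group is a group whose initial number of bits is less than its saturation bit number.
With reference to the second implementation of the first aspect, in a third implementation of the first aspect, allocating the excess bit number to a bit-unsaturated group includes: distributing the excess bit number evenly among the bit-unsaturated groups.
With reference to the first, second and third implementations of the first aspect, in a fourth implementation of the first aspect, after the initial inter-group bit allocation and before the secondary inter-group bit allocation, the method further includes: determining, according to the differences between the group means of the subband normalization factors and/or the code rate, whether to use the saturation algorithm for bit allocation, where the group mean of the subband normalization factors is the mean of the subband normalization factors of all subbands in the group; if yes, determining to use the saturation algorithm; if no, determining to use a weighting algorithm.
With reference to the first aspect and the fourth implementation of the first aspect, in a fifth implementation of the first aspect, performing the secondary inter-group bit allocation may further include: performing the secondary inter-group bit allocation by using a weighting algorithm.
With reference to the fifth implementation of the first aspect, in a sixth implementation of the first aspect, performing the secondary inter-group bit allocation by using the weighting algorithm includes: weighting the sum of intra-group subband normalization factors of each group to obtain a weighted sum of intra-group subband normalization factors of each group; and performing the secondary inter-group bit allocation on the initial number of bits according to the weighted sum of intra-group subband normalization factors of each group.
With reference to the first aspect and its foregoing implementations, in a seventh implementation of the first aspect, allocating the bits of the audio signal allocated to the group to the subbands in the group includes: weighting the subband normalization factors to obtain weighted subband normalization factors; and allocating, according to the weighted subband normalization factors, the bits of the audio signal allocated to the group to some or all of the subbands in the group, where the partial set of subbands is selected from all subbands in the group in descending order of the weighted subband normalization factor.
With reference to the first aspect and its foregoing implementations, in an eighth implementation of the first aspect, dividing the plurality of subbands into a plurality of groups includes: putting subbands having the same bandwidth into one group, so that the plurality of subbands is divided into a plurality of groups; or putting subbands whose subband normalization factors are close into one group, so that the plurality of subbands is divided into a plurality of groups.
With reference to the eighth implementation of the first aspect, in a ninth implementation of the first aspect, the subbands in each group have the same bandwidth, or have close normalization factors.
According to a second aspect, an apparatus for bit allocation of an audio signal is provided, including: a subband quantization unit, configured to divide the frequency band of an audio signal into a plurality of subbands and quantize a subband normalization factor of each subband; a grouping unit, configured to divide the plurality of subbands into a plurality of groups and obtain a sum of intra-group subband normalization factors of each group, where the sum of intra-group subband normalization factors is the sum of the subband normalization factors of all subbands in the group; a first allocation unit, configured to perform initial inter-group bit allocation according to the sum of intra-group subband normalization factors of each group, to determine an initial number of bits of each group; a second allocation unit, configured to perform secondary inter-group bit allocation based on the initial number of bits of each group, to allocate the coded bits of the audio signal to at least one group, where the sum of the bits allocated to the at least one group is the coded bits of the audio signal; and a third allocation unit, configured to allocate the bits of the audio signal allocated to the group to the subbands in the group.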
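With reference to the second aspect, in a first implementation of the second aspect, the second allocation unit is specifically configured to perform the secondary inter-group bit allocation by using a saturation algorithm for bit allocation.
With reference to the first implementation of the second aspect, in a second implementation of the second aspect, the second allocation unit includes: a first determining module, configured to determine a saturation bit number of each group; a second determining module, configured to determine, according to the saturation bit number and the initial number of bits, a bit-saturated group and an excess bit number, where the excess bit number is the number of bits by which the initial number of bits of the bit-saturated group exceeds the saturation bit number; and an allocation module, configured to allocate the excess bit number to a bit-unsaturated group, where a bit-saturated group is a group whose initial number of bits is greater than its saturation bit number, and a bit-unsaturated group is a group whose initial number of bits is less than its saturation bit number.
With reference to the second implementation of the second aspect, in a third implementation of the second aspect, the allocation module is specifically configured to distribute the excess bit number evenly among the bit-unsaturated groups.
With reference to the first, second and third implementations of the second aspect, in a fourth implementation of the second aspect, the apparatus further includes: a determining unit, configured to determine, after the initial inter-group bit allocation and before the secondary inter-group bit allocation, according to the differences between the group means of the subband normalization factors and/or the code rate, whether to use the saturation algorithm for bit allocation, where the group mean of the subband normalization factors is the mean of the subband normalization factors of all subbands in the group; if yes, to determine to use the saturation algorithm; if no, to determine to use a weighting algorithm.
With reference to the second aspect and the fourth implementation of the second aspect, in a fifth implementation of the second aspect, the second allocation unit is further configured to perform the secondary inter-group bit allocation by using a weighting algorithm.
With reference to the fifth implementation of the second aspect, in a sixth implementation of the second aspect, the second allocation unit further includes: a weighting module, configured to weight the sum of intra-group subband normalization factors of each group to obtain a weighted sum of intra-group subband normalization factors of each group; and the allocation module, configured to perform the secondary inter-group bit allocation on the initial number of bits according to the weighted sum of intra-group subband normalization factors of each group.
With reference to the second aspect and its foregoing implementations, in a seventh implementation of the second aspect, the third allocation unit includes: a weighting module, configured to weight the subband normalization factors to obtain weighted subband normalization factors; and an allocation module, configured to allocate, according to the weighted subband normalization factors, the bits of the audio signal allocated to the group to some or all of the subbands in the group, where the partial set of subbands is selected from all subbands in the group in descending order of the weighted subband normalization factor.
With reference to the second aspect and its foregoing implementations, in an eighth implementation of the second aspect, the grouping unit is specifically configured to: put subbands having the same bandwidth into one group, so that the plurality of subbands is divided into a plurality of groups; or put subbands whose subband normalization factors are close into one group, so that the plurality of subbands is divided into a plurality of groups.
With reference to the eighth implementation of the second aspect, in a ninth implementation of the second aspect, the subbands in each group have the same bandwidth, or have close subband normalization factors.
At low and medium bit rates, the embodiments of the present invention can use grouping to keep the allocation stable between consecutive frames and to reduce the influence of global allocation on local discontinuity.

Brief Description of the Drawings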
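To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a flowchart of a method for bit allocation of an audio signal according to an embodiment of the present invention.
FIG. 2 is a schematic structural diagram of an apparatus for bit allocation of an audio signal according to an embodiment of the present invention.
FIG. 3 is a schematic structural diagram of a second allocation unit in an apparatus for bit allocation of an audio signal according to an embodiment of the present invention.
FIG. 4 is another schematic structural diagram of an apparatus for bit allocation of an audio signal according to an embodiment of the present invention.
FIG. 5 is a schematic structural diagram of a third allocation unit in an apparatus for bit allocation of an audio signal according to an embodiment of the present invention.
FIG. 6 is yet another schematic structural diagram of an apparatus for bit allocation of an audio signal according to an embodiment of the present invention.

Detailed Description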
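The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Encoding and decoding technical solutions are widely used in electronic devices such as mobile phones, wireless apparatuses, personal digital assistants (PDAs), handheld or portable computers, GPS receivers and navigators, cameras, audio/video players, camcorders, video recorders, and monitoring devices. Such devices usually include an audio encoder or an audio decoder, which may be implemented directly by a digital circuit or a chip such as a DSP (digital signal processor), or by software code that drives a processor to execute the procedures in the code.
As an example, in one audio coding solution, the time-domain audio signal is first transformed into a frequency-domain signal, coded bits are then allocated to the frequency-domain audio signal for encoding, the encoded signal is transmitted to the decoder through a communication system, and the decoder decodes and restores the encoded signal.
The present invention performs bit allocation based on grouping and on the characteristics of the signal. The bands are first grouped; the energy within each group is then weighted according to the characteristics of the group; bits are allocated to the groups according to the weighted energy; and the bits are then allocated to each band according to the signal characteristics within the group. Because bits are first allocated to whole groups, discontinuous allocation is avoided, which improves the coding quality for different signals. Because the signal characteristics are taken into account within each group, the limited bits can be allocated to the audio bands that matter most for perception.
FIG. 1 is a flowchart of a method for bit allocation of an audio signal according to an embodiment of the present invention.
101: Divide the frequency band of the audio signal into a plurality of subbands, and quantize the subband normalization factor of each subband.
The MDCT is used as an example in the following description. An MDCT is first applied to the input audio signal to obtain frequency-domain coefficients. The MDCT here may include windowing, time-domain aliasing, and a discrete DCT.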
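For example, a sine window h(n) is applied to the input time-domain signal x(n):

(Equation (1), the definition of h(n): image imgf000009_0001 in the source.)

The windowed signal is:

$$x_w(n) = \begin{cases} h(n)\,x_{OLD}(n), & n = 0, \ldots, L-1 \\ h(n)\,x(n-L), & n = L, \ldots, 2L-1 \end{cases} \qquad (2)$$

A time-domain aliasing operation is then performed on the windowed signal, where $I_{L/2}$ and $J_{L/2}$ denote diagonal matrices of order $L/2$:

(Definition of the two matrices: image imgf000009_0003 in the source.)

The MDCT coefficients are then obtained from the time-domain-aliased signal:

(MDCT coefficient equation: image imgf000009_0004 in the source.)

The frequency-domain envelope is then extracted from the MDCT coefficients and quantized. The whole band is divided into subbands of different frequency-domain resolution, the normalization factor of each subband is extracted, and the subband normalization factors are quantized.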
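For example, for an audio signal sampled at 16 kHz, corresponding to a band of 8 kHz bandwidth, a frame length of 20 ms gives 320 spectral coefficients in total, which can be divided into the following 26 subbands:
8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 16, 16, 16, 16, 16, 16, 16, 16,
24, 24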
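As a small illustration of the subband layout listed above, the following Python sketch turns the list of widths into inclusive start and end coefficient indices. The names `SUBBAND_WIDTHS` and `subband_edges` are illustrative and do not come from the patent.

```python
# Subband widths from the list above: 16 bands of 8, 8 bands of 16, 2 of 24.
SUBBAND_WIDTHS = [8] * 16 + [16] * 8 + [24] * 2


def subband_edges(widths):
    """Return (start, end) coefficient indices, inclusive, for each subband."""
    edges = []
    start = 0
    for width in widths:
        edges.append((start, start + width - 1))
        start += width
    return edges


edges = subband_edges(SUBBAND_WIDTHS)
```

The widths in the list sum to 304 coefficients, so under this layout the last subband ends at index 303.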
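The band is first divided into several groups, and the subbands are then refined within each group. The normalization factor of each subband can be defined as:

(Equation (6): image imgf000009_0005 in the source.)

Here $L_p$ is the number of coefficients in the subband, the two remaining symbols (lost in extraction) are the start point and the end point of the subband, and P is the total number of subbands.
After the normalization factor is obtained, it can be quantized in the logarithmic domain to obtain the quantized subband normalization factor wnorm.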
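102: Divide the plurality of subbands into a plurality of groups, and obtain the sum of intra-group subband normalization factors of each group, where the sum of intra-group subband normalization factors is the sum of the subband normalization factors of all subbands in the group.
That is, all subbands are divided into a plurality of groups and a group parameter is obtained for each group, where the group parameter may be the sum of intra-group subband normalization factors, which characterizes the signal characteristics and energy attribute of the group.
Subbands with similar characteristics and energy are put into the same group. For example, subbands having the same bandwidth may be put into one group, preferably adjacent subbands having the same bandwidth. For example, all subbands may be divided into three groups; at a low bit rate, only the first group or the first two groups are used, and no bits are allocated to the remaining groups.
Alternatively, the subbands may be grouped according to the relationship between their normalized energies norm. That is, subbands whose subband normalization factors wnorm are close may be put into one group. For example, the following method can be used to decide whether the subband normalization factors are close: the subband normalization factor wnorm[i] of each subband (i = 1 ... P-1, where P is the total number of subbands) is compared with a predetermined threshold K. If wnorm[i] is greater than K, the subband index i is recorded. Finally, the subbands whose wnorm[i] is greater than K form one group, and the remaining subbands form another group. It should be understood that several predetermined thresholds can be set according to different needs, to obtain more groups.
Optionally, adjacent subbands whose subband normalization factors are close may be put into one group. For example, the following method can be used to decide whether the subband normalization factors of adjacent subbands are close: first compute the difference wnorm_diff[i] between the subband normalization factors of adjacent subbands, where wnorm_diff[i] = abs(wnorm[i] - wnorm[i-1]), i = 1 ... P-1, and P is the total number of subbands. If wnorm_diff[i] is less than a predetermined threshold K', the subband normalization factors of the adjacent subbands are close, which determines the indices of the adjacent subbands that can be put into one group.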
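The adjacent-difference rule above can be sketched as follows. This is a minimal Python illustration that assumes a new group starts whenever wnorm_diff[i] is not below the threshold; the function name `group_adjacent` and the sample values are hypothetical.

```python
def group_adjacent(wnorm, k_prime):
    """Group consecutive subbands whose normalization factors differ by less
    than k_prime; returns a list of lists of subband indices."""
    if not wnorm:
        return []
    groups = [[0]]
    for i in range(1, len(wnorm)):
        wnorm_diff = abs(wnorm[i] - wnorm[i - 1])
        if wnorm_diff < k_prime:
            groups[-1].append(i)      # close to its neighbour: same group
        else:
            groups.append([i])        # assumption: otherwise open a new group
    return groups


example_groups = group_adjacent([20, 19, 18, 12, 11, 5], k_prime=3)
```

Because grouping only ever joins neighbours, every group produced this way is a contiguous run of subbands, which matches the later use of a start subband and an end subband per group.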
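Once the subbands have been grouped, the group parameters of each group can be obtained to characterize the energy attribute of the group. In general, the group parameters may include one or more of the following: the sum of intra-group subband normalization factors, group_wnorm, and the peak-to-average ratio of the intra-group subband normalization factors, group_sharp.
Specifically, the sum of intra-group subband normalization factors group_wnorm is the sum of the subband normalization factors of all subbands in the group, that is,

$$\mathrm{group\_wnorm}[i] = \sum_{b=S_i}^{E_i} \mathrm{wnorm}[b],$$

where $S_i$ is the start subband of the i-th group and $E_i$ is the end subband of the i-th group.
The group mean of the subband normalization factors, group_avg, is the mean of the subband normalization factors of all subbands in the group, that is,

$$\mathrm{group\_avg}[i] = \frac{\mathrm{group\_wnorm}[i]}{E_i - S_i + 1},$$

where group_wnorm[i] is the sum of intra-group subband normalization factors of the i-th group, $S_i$ is the start subband of the i-th group, and $E_i$ is the end subband of the i-th group.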
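103: Perform initial inter-group bit allocation according to the sum of intra-group subband normalization factors of each group, to determine the initial number of bits of each group.
Because the group parameters characterize the energy attribute of each group, the bits of the audio signal can be allocated to the groups according to the group parameters. When the bit rate is insufficient, grouping and the energy attribute of each group make the bit allocation more concentrated and make the allocation more continuous from frame to frame. It should be understood that the group parameters are not limited to those listed here and may be other parameters that characterize the energy attribute of a group.
In one embodiment, when the bit rate is insufficient, bits are allocated to only some of the groups. For example, a group whose sum of intra-group subband normalization factors is zero receives no bits; likewise, when the number of bits is very small, some groups may receive no bits. That is, based on the group parameters above, coded bits can be allocated to at least one group according only to each group's sum of intra-group subband normalization factors, where the sum of the bits allocated to the at least one group is the bits of the audio signal.
The initial number of bits allocated to each group is obtained from group_wnorm[i]. The simplest method is to allocate bits in proportion to the ratio of each group's sum of intra-group subband normalization factors to the normalized energy of all subbands, that is, the initial number of bits of the i-th group is Bi = sum_bits * group_wnorm[i] / sum_norm, where sum_bits is the total number of bits to be allocated and sum_norm is the normalized energy of all subbands.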
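The group parameters and the proportional initial allocation can be sketched in Python as below. This is an illustration under stated assumptions rather than the patent's implementation: groups are given as inclusive (start, end) subband ranges, bit counts are rounded down, and the last group receives the remainder so that the total equals sum_bits (mirroring B3 = sum_bits - B1 - B2 in the three-group example later in the text).

```python
def initial_group_allocation(wnorm, group_ranges, sum_bits):
    """Return (group_wnorm, group_avg, initial_bits) for the given groups.

    group_ranges holds inclusive (start, end) subband indices per group.
    """
    group_wnorm = [sum(wnorm[s:e + 1]) for s, e in group_ranges]
    group_avg = [g / (e - s + 1) for g, (s, e) in zip(group_wnorm, group_ranges)]
    sum_norm = sum(group_wnorm)
    if sum_norm == 0:
        # Degenerate case: nothing to weight by, so no group gets bits.
        return group_wnorm, group_avg, [0] * len(group_ranges)
    # Bi = sum_bits * group_wnorm[i] / sum_norm, rounded down (assumption).
    bits = [sum_bits * g // sum_norm for g in group_wnorm[:-1]]
    bits.append(sum_bits - sum(bits))   # last group takes the remainder
    return group_wnorm, group_avg, bits


wnorm_example = [10] * 16 + [5] * 8 + [2] * 2
ranges_example = [(0, 15), (16, 23), (24, 25)]
g_wnorm, g_avg, g_bits = initial_group_allocation(wnorm_example, ranges_example, 408)
```

A group whose sum of normalization factors is zero receives zero bits under this rule, consistent with the embodiment described above.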
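104: Perform secondary inter-group bit allocation based on the initial number of bits of each group, to allocate the coded bits of the audio signal to at least one group, where the sum of the bits allocated to the at least one group is the coded bits of the audio signal. Alternatively, the sum of the bits allocated to the at least one group is the quantization bits of the audio signal, where the quantization bits are the bits used to quantize the spectral coefficients.
After the initial number of bits of each group is determined, the secondary inter-group bit allocation can be performed.
For example, the secondary inter-group bit allocation can be performed by using a saturation algorithm for bit allocation.
First, the saturation bit number of each group is determined. The saturation bit number is generally an empirical value, for example an average of 1 to 2 bits per spectral coefficient, and may also depend on the coding rate and the signal characteristics. Then, the bit-saturated groups and the excess bit number are determined from the saturation bit numbers and the initial numbers of bits. Finally, the excess bit number is allocated to the bit-unsaturated groups, for example by distributing it evenly among them. Here, a bit-saturated group is a group whose initial number of bits is greater than its saturation bit number, and a bit-unsaturated group is a group whose initial number of bits is less than its saturation bit number. The excess bit number is the number of bits by which the initial number of bits of a bit-saturated group exceeds the saturation bit number of that group.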
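A generic version of this saturation step, for any number of groups, might look like the following Python sketch. The function name, the integer arithmetic, and the way a remainder of the even split is handed out one bit at a time are assumptions made for illustration.

```python
def redistribute_excess(initial_bits, saturation_bits):
    """Clip bit-saturated groups to their saturation bit number and share the
    excess evenly among the bit-unsaturated groups."""
    bits = list(initial_bits)
    excess = 0
    unsaturated = []
    for i, (b, cap) in enumerate(zip(initial_bits, saturation_bits)):
        if b > cap:                 # bit-saturated group
            excess += b - cap
            bits[i] = cap
        elif b < cap:               # bit-unsaturated group
            unsaturated.append(i)
    if unsaturated and excess > 0:
        share, remainder = divmod(excess, len(unsaturated))
        for n, i in enumerate(unsaturated):
            # Assumption: leftover bits go to the first unsaturated groups.
            bits[i] += share + (1 if n < remainder else 0)
    return bits


example_redistributed = redistribute_excess([400, 200, 40], [288, 256, 96])
```

Note that an even split can push a previously unsaturated group above its own saturation bit number; the text does not say how that case is handled, so the sketch leaves it as is.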
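Alternatively, for example, the secondary inter-group bit allocation can be performed by using a weighting algorithm.
That is, the result of allocating the bits of the audio signal to the groups is optimized by adjusting the group parameters. For example, according to different allocation needs, different weights are assigned to the group parameters of different groups, so that the limited number of bits is allocated to the appropriate groups and then allocated within those groups. The bit allocation is then no longer scattered, which benefits the encoding of the audio signal.
One implementation is given below as an example. The sum of intra-group subband normalization factors of each group is weighted to obtain the weighted sum of intra-group subband normalization factors of each group; the secondary inter-group bit allocation is then performed on the initial number of bits according to the weighted sum of each group.
Another implementation is given below as an example. After the sum of intra-group subband normalization factors group_wnorm and the peak-to-average ratio group_sharp of each group are obtained, group_wnorm can be weighted according to group_sharp to obtain the weighted sum of intra-group subband normalization factors, group_wnorm_w.
Specifically, going through the groups from low frequency to high frequency, two adjacent groups are selected at a time, say a first group and a second group. The peak-to-average ratio group_sharp[i] of the first group is compared with the peak-to-average ratio group_sharp[i-1] of the second group. If the peak-to-average ratio of the first group exceeds that of the second group by more than a first threshold, the sum of intra-group subband normalization factors of the first group is adjusted by a first weighting factor and that of the second group by a second weighting factor. If the peak-to-average ratio of the second group exceeds that of the first group by more than a second threshold, the sum of the second group is adjusted by the first weighting factor and that of the first group by the second weighting factor.
For example, if group_sharp[i] - group_sharp[i-1] > a, then group_wnorm_w[i-1] = b * group_wnorm[i-1] and group_wnorm_w[i] = (b-1) * group_wnorm[i]. Alternatively, if group_sharp[i-1] - group_sharp[i] > c, then group_wnorm_w[i] = b * group_wnorm[i] and group_wnorm_w[i-1] = (b-1) * group_wnorm[i-1]. Here the group index i = 1 ... P-1, P is the total number of subbands, b is the weight, a is the first threshold, and c is the second threshold. It should be understood that a, b and c can be chosen according to the needs of the bit allocation.
This is only a schematic description of a simple weighting method. A person skilled in the art can readily conceive of other weighting methods that adjust the weights of the subbands through different weighting coefficients. For example, the weight of a subband that needs more signal bits can be increased, and the weight of a subband that needs no bits or fewer bits can be decreased.
Next, the bits of the audio signal are allocated to each group according to the weighted sum of intra-group subband normalization factors. For example, the number of bits of a group is determined by the ratio of the weighted sum of intra-group subband normalization factors group_wnorm[i] to the sum of the subband normalization factors of all subbands, sum_wnorm, and the bits of the audio signal are allocated to the group accordingly. The total number of bits of each group, group_bits, is determined by the following formula: group_bits[i] = sum_bits * group_wnorm[i] / sum_wnorm, where sum_bits is the total number of bits of the audio signal to be allocated and sum_wnorm is the sum of the subband normalization factors of all subbands.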
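The secondary inter-group bit allocation can be further optimized, for example by choosing between different secondary inter-group bit allocation schemes, such as the saturation algorithm or the weighting algorithm, according to the code rate and/or the differences between the group means of the subband normalization factors.
For example, whether to use the saturation algorithm for bit allocation or the weighting algorithm is determined according to the differences between the group means of the subband normalization factors and/or the code rate, where the group mean of the subband normalization factors is the mean of the subband normalization factors of all subbands in the group.
After the bits have been allocated to the groups, the bits of each group can be further allocated to the subbands in the group.
105: Allocate the bits of the audio signal allocated to the group to the subbands in the group.
It should be understood that the existing iterative allocation method can be used to allocate bits to the subbands in a group. However, the iterative method still makes the allocation within the group rather random and discontinuous between consecutive frames. Therefore, the bits of the audio signal allocated to the group can be allocated to the subbands in the group according to the subband normalization factors of the subbands in the group, taking into account the signal characteristics of different audio signals, that is, different signal types.
In one implementation, the subband normalization factors are weighted to obtain weighted subband normalization factors, and the bits of the audio signal allocated to the group are allocated, according to the weighted subband normalization factors, to some or all of the subbands in the group, where the partial set of subbands is selected from all subbands in the group in descending order of the weighted subband normalization factor.
In a typical implementation of allocating the bits to all subbands in the group according to the weighted subband normalization factors, after the weighted subband normalization factors of all subbands are determined, their sum over the group is computed, and the bits allocated to the group are then distributed to each subband according to the ratio of that subband's weighted subband normalization factor to the sum.
In a typical implementation of allocating the bits to some of the subbands in the group according to the weighted subband normalization factors, the weighted subband normalization factors of the subbands in the group are sorted, for example in descending order; the subbands corresponding to the highest-ranked weighted subband normalization factors are selected; and the bits of the audio signal allocated to the group are allocated to these selected subbands.
For example, the weighting parameters factor[0] and factor[1] of the subband normalization factors wnorm of the subbands in the group are first determined; the subband normalization factors wnorm of the subbands in the group are sorted to obtain wnorm_index[i]; wnorm_index[i] is weighted using the weighting parameters; and bits are finally allocated to the subbands in the group according to the weighted wnorm_index[i].
It can be seen that the method for bit allocation of an audio signal according to the embodiments of the present invention can, through grouping, keep the allocation stable between consecutive frames and reduce the influence of global allocation on local discontinuity, and can, through the secondary allocation, make effective use of the excess bits of saturated subbands, so that the bit allocation is more reasonable.
The following specific embodiment uses pseudocode to describe in detail how different secondary inter-group bit allocation schemes are chosen according to the code rate and/or the differences between the group means of the subband normalization factors, and how bits are then allocated to the subbands in each group.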
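First, the subbands of the audio signal are divided into a plurality of groups, and the initial number of bits allocated to each group is obtained from each group's sum of subband normalization factors, group_wnorm[i]. For example, all subbands are divided into three groups:
the initial number of bits of the first group is B1 = sum_bits * group_wnorm[0] / sum_norm,
the initial number of bits of the second group is B2 = sum_bits * group_wnorm[1] / sum_norm,
the initial number of bits of the third group is B3 = sum_bits * group_wnorm[2] / sum_norm,
where sum_bits is the total number of bits to be allocated, so that B3 = sum_bits - B1 - B2, and sum_norm = group_wnorm[0] + group_wnorm[1] + group_wnorm[2].
Then, different secondary inter-group bit allocation schemes are adopted according to the code rate (bit_rate) and the differences between the group means of the subband normalization factors (avg_diff).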
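Step 1: Calculate the differences between the group means of the subband normalization factors:
avg_diff[0] = group_avg[0] - group_avg[1];
avg_diff[1] = group_avg[1] - group_avg[2];
Step 2: Select the secondary inter-group bit allocation scheme, for example by deciding, from the differences between the group means of the subband normalization factors and from the code rate, whether to use the saturation algorithm for bit allocation or the weighting algorithm:
if (bit_rate > a && avg_diff[0] < b && avg_diff[1] < c)
    saturation algorithm
else
    weighting algorithm
Here a, b and c are empirical factors.
Step 3: Post-processing: if group_wnorm[2] of the highest-subband group is less than a certain value, the bits allocated to this group are reallocated to a lower-subband group. For example, when group_wnorm[2] is less than a threshold d, the bits allocated to the highest group are given to the second-highest group, and the number of bits allocated to the highest group is set to zero.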
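Steps 1 to 3 above can be sketched in Python for the three-group case. The thresholds a, b, c and d are empirical in the text, so the numeric values used in the example are purely hypothetical, as are the function names.

```python
def choose_scheme(bit_rate, group_avg, a, b, c):
    """Steps 1-2: pick 'saturation' or 'weighted' from the code rate and the
    differences between adjacent group means."""
    avg_diff = [group_avg[0] - group_avg[1], group_avg[1] - group_avg[2]]
    if bit_rate > a and avg_diff[0] < b and avg_diff[1] < c:
        return "saturation"
    return "weighted"


def post_process(bits, group_wnorm, d):
    """Step 3: if the highest group is weak, hand its bits to the
    second-highest group and set its own allocation to zero."""
    b1, b2, b3 = bits
    if group_wnorm[2] < d:
        return [b1, b2 + b3, 0]
    return [b1, b2, b3]


# Hypothetical thresholds, for illustration only.
scheme_high = choose_scheme(24000, [10.0, 9.0, 8.0], a=16000, b=3.0, c=3.0)
scheme_low = choose_scheme(8000, [10.0, 9.0, 8.0], a=16000, b=3.0, c=3.0)
bits_after = post_process([300, 200, 50], [160, 40, 4], d=10)
```

The text does not state whether the post-processing runs before or after the chosen secondary allocation, so the two functions are kept independent here.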
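For the saturation algorithm, the principle is that when the bits allocated to a group approach saturation, the excess bits are given to the other groups. For example:
1) First set the saturation bit numbers of the groups to B1_UP, B2_UP and B3_UP respectively;
2) Calculate the excess bits:
B_saved = 0;
if (B1 > B1_UP) {
    B_saved = B_saved + (B1 - B1_UP);
    B1 = B1_UP;
}
if (B2 > B2_UP) {
    B_saved = B_saved + (B2 - B2_UP);
    B2 = B2_UP;
}
if (B3 > B3_UP) {
    B_saved = B_saved + (B3 - B3_UP);
    B3 = B3_UP;
}
Here B1_UP, B2_UP and B3_UP are empirical factors, for example 288, 256 and 96 respectively.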
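3) Redistribute the excess bits. For example, when the bits allocated to the first group have reached saturation, B_saved is distributed evenly among the other groups. When the first group is not saturated, half of B_saved is first added to B1, and it is then determined whether the bits allocated to the second group are saturated: if the second group is not saturated, B2 is reassigned to sum_bits - B1 - B3; otherwise B3 is reassigned to sum_bits - B1 - B2. The pseudocode of the algorithm is as follows:
if (B_saved > 0) {
    if (B1 == B1_UP) {
        B2 = B2 + B_saved/2;
        B3 = sum_bits - B1 - B2;
    } else {
        B1 = B1 + B_saved/2;
        if (B2 == B2_UP)
            B3 = sum_bits - B1 - B2;
        else
            B2 = sum_bits - B1 - B3;
    }
}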
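The three-group saturation pseudocode above can be written as runnable Python roughly as follows. The default saturation values 288, 256 and 96 are the empirical examples given in the text; the function name and the use of integer halving are assumptions.

```python
def saturation_allocate(b1, b2, b3, sum_bits, caps=(288, 256, 96)):
    """Clip each group to its saturation bit number, then hand the saved bits
    back following the three-group pseudocode."""
    b1_up, b2_up, b3_up = caps
    b_saved = 0
    # 2) Collect the bits above each group's saturation bit number.
    if b1 > b1_up:
        b_saved += b1 - b1_up
        b1 = b1_up
    if b2 > b2_up:
        b_saved += b2 - b2_up
        b2 = b2_up
    if b3 > b3_up:
        b_saved += b3 - b3_up
        b3 = b3_up
    # 3) Redistribute the saved bits.
    if b_saved > 0:
        if b1 == b1_up:
            b2 += b_saved // 2            # half to group 2 ...
            b3 = sum_bits - b1 - b2       # ... remainder to group 3
        else:
            b1 += b_saved // 2
            if b2 == b2_up:
                b3 = sum_bits - b1 - b2
            else:
                b2 = sum_bits - b1 - b3
    return b1, b2, b3


first_saturated = saturation_allocate(400, 200, 40, sum_bits=640)
second_saturated = saturation_allocate(100, 300, 40, sum_bits=440)
```

In both examples the three returned values still sum to sum_bits, because the last reassignment in each branch is defined as the remainder of the total. As in the pseudocode, the groups that receive the saved bits are not clipped a second time.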
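For the weighting algorithm:
B1' = a1 * B1,
B2' = a2 * B2,
B3' = sum_bits - B1' - B2',
Here sum_bits is the total number of bits,
sum_norm = group_wnorm[0] + group_wnorm[1] + group_wnorm[2],
and a1 and a2 are weighting coefficients, for example a1 = 1.0 and a2 = 0.92.
Finally, the bits allocated to each group are allocated to the subbands in the group by the following method.
Step 1: Determine the weighting parameters factor[] of the subband normalization factors wnorm of the subbands in each group, for example factor[0] = FAC1 and factor[1] = FAC2,
where FAC1 and FAC2 are empirical factors, for example 2.0 and 1.5, or 2.0 and 3.0, respectively.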
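Step 2: Sort the subband normalization factors wnorm of all subbands in the group in descending order to obtain wnorm_index(i).
Step 3: Weight the values of the sorted wnorm_index(i) according to the weighting parameters factor[] as follows:
wnorm_index(i) = wnorm_index(i) … (the weighting term is illegible in the source), 0 ≤ i < band_num
Here band_num is the number of subbands contained in the group, and α and β can be set according to conditions, for example to different values for different groups. For the low-frequency components of the first group, α = factor[0] and β = 1/band_num can be set; for groups above the first group, α = factor[1] and β = …/band_num can be set.
Step 4: According to the values of the sorted wnorm_index(i), the bits allocated to the group are further allocated to the subbands in the group.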
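Step 4.1: Divide the total number of bits Bx of the group by the threshold Thr to obtain the initial number of subbands to receive bits in the group, BitBand_num.
Step 4.2: Determine the number N of subbands for bit allocation from the relationship between BitBand_num and the total number of subbands in the group, sumBand_num. For example, if BitBand_num is greater than k*sumBand_num, where k is a coefficient such as 0.75 or 0.8, then N equals sumBand_num; otherwise N equals BitBand_num.
Step 4.3: Select the first N subbands, where N is the number of subbands in the group that take part in bit allocation.
Step 4.4: Initialize the number of bits of the N subbands to 1, and initialize the loop counter j to 0.
Step 4.5: Determine band_wnorm, the sum of the subband normalization factors of those of the N subbands whose subband normalization factor is greater than zero.
Step 4.6: Allocate bits to those of the N subbands whose subband normalization factor is greater than zero:
band_bits[i] = Bx * wnorm_index(i) / band_wnorm;
Here Bx is the number of bits allocated to the group; in the embodiment above, the numbers of bits of the three groups are B1, B2 and B3 respectively.
Step 4.7: Determine whether the number of bits allocated to the last of the N subbands is less than a fixed threshold fac. If it is less than fac, set the number of bits allocated to that subband to zero; if it is greater than or equal to fac, jump directly to step 4.9; otherwise go to step 4.8.
Step 4.8: Add 1 to the loop counter j;
repeat steps 4.5 to 4.8 until the loop counter j equals N.
Step 4.9: Restore the original ordering of all subbands in the group, that is, the ordering of all subbands before the subband normalization factor of each subband was quantized.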
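Steps 4.1 to 4.9 can be sketched in Python as below. The text leaves several details open, so this is one possible reading rather than the patent's implementation: the input is taken to be the already weighted factors in their original subband order, the "last subband" in step 4.7 is taken to be the last one still holding a positive factor, and a subband whose allocation is zeroed is excluded from the next pass. The function name and the sample thresholds are hypothetical.

```python
def allocate_within_group(wnorm_w, group_bits, thr, fac, k=0.75):
    """Distribute group_bits over the subbands of one group.

    wnorm_w: weighted subband normalization factors in original order.
    Returns the bits per subband, again in original order.
    """
    total = len(wnorm_w)
    # Sort in descending order, remembering the original positions.
    order = sorted(range(total), key=lambda i: wnorm_w[i], reverse=True)
    sorted_w = [wnorm_w[i] for i in order]

    bit_band_num = int(group_bits // thr)                  # step 4.1
    n = total if bit_band_num > k * total else bit_band_num  # step 4.2
    n = min(n, total)
    active = sorted_w[:n]                                  # step 4.3
    band_bits = [0.0] * total

    for _ in range(n):                                     # loop counter j
        band_wnorm = sum(w for w in active if w > 0)       # step 4.5
        if band_wnorm <= 0:
            break
        for i, w in enumerate(active):                     # step 4.6
            band_bits[i] = group_bits * w / band_wnorm if w > 0 else 0.0
        last = max(i for i, w in enumerate(active) if w > 0)
        if band_bits[last] >= fac:                         # step 4.7
            break
        band_bits[last] = 0.0
        active[last] = 0.0      # assumption: drop it from the next pass

    result = [0.0] * total                                 # step 4.9
    for pos, idx in enumerate(order):
        result[idx] = band_bits[pos]
    return result


example_alloc = allocate_within_group([2, 4, 1, 3], group_bits=100, thr=20, fac=15)
```

In the example, the weakest subband would receive 10 bits on the first pass, which is below fac, so it is zeroed and the 100 bits are re-spread over the remaining three subbands in proportion to their factors.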
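It can be understood that the method for intra-group bit allocation in the embodiments of the present invention is not limited to the example described in steps 4.1 to 4.9 above.
The grouping used in the embodiments of the present invention keeps the allocation stable between consecutive frames, and bits are allocated within each group with an emphasis that depends on the signal characteristics, so that the allocated bits are spent on quantizing the important spectral information, thereby improving the coding quality of the audio signal.
It can be seen that the method for bit allocation of an audio signal according to the embodiments of the present invention can, through grouping, keep the allocation stable between consecutive frames and reduce the influence of global allocation on local discontinuity. In addition, the bit allocation in each group can use different threshold parameters, so that bits are allocated more adaptively, and the emphasis of the allocation within a group follows the characteristics of the spectral signal. For example, for a harmonic-like signal with a concentrated spectrum, the allocation concentrates on the subbands with large energy and the subbands between the harmonics need no additional bits; for a signal with a flatter spectrum, the allocation tries to stay smooth across subbands. In this way the allocated bits are spent on quantizing the important spectral information.
A schematic structure of an apparatus for bit allocation of an audio signal according to an embodiment of the present invention is described below with reference to FIG. 2.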
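In FIG. 2, the apparatus 20 for bit allocation of an audio signal includes a subband quantization unit 21, a grouping unit 22, a first allocation unit 23, a second allocation unit 24, and a third allocation unit 25, where:
The subband quantization unit 21 is configured to divide the frequency band of the audio signal into a plurality of subbands and quantize the subband normalization factor of each subband.
The grouping unit 22 is configured to divide the plurality of subbands into a plurality of groups and obtain the sum of intra-group subband normalization factors of each group, where the sum of intra-group subband normalization factors is the sum of the subband normalization factors of all subbands in the group.
Optionally, the grouping unit 22 is specifically configured to put subbands having the same bandwidth into one group, so that the plurality of subbands is divided into a plurality of groups; or to put subbands whose subband normalization factors are close into one group, so that the plurality of subbands is divided into a plurality of groups. Preferably, the subbands in each group have the same bandwidth, or have close subband normalization factors.
The first allocation unit 23 is configured to perform initial inter-group bit allocation according to the sum of intra-group subband normalization factors of each group, to determine the initial number of bits of each group.
The second allocation unit 24 is configured to perform secondary inter-group bit allocation based on the initial number of bits of each group, to allocate the coded bits of the audio signal to at least one group, where the sum of the bits allocated to the at least one group is the coded bits of the audio signal.
Optionally, the second allocation unit 24 may be configured to perform the secondary inter-group bit allocation by using a saturation algorithm for bit allocation. For example, as shown in FIG. 3, the second allocation unit 24 may include a first determining module 241, a second determining module 242, and an allocation module 243, where:
the first determining module 241 is configured to determine the saturation bit number of each group;
the second determining module 242 is configured to determine, according to the saturation bit number and the initial number of bits, a bit-saturated group and an excess bit number, where the excess bit number is the number of bits by which the initial number of bits of the bit-saturated group exceeds the saturation bit number;
the allocation module 243 is configured to allocate the excess bit number to a bit-unsaturated group, where a bit-saturated group is a group whose initial number of bits is greater than its saturation bit number, and a bit-unsaturated group is a group whose initial number of bits is less than its saturation bit number. Optionally, the allocation module 243 may be configured to distribute the excess bit number evenly among the bit-unsaturated groups.
Alternatively, the second allocation unit may be configured to perform the secondary inter-group bit allocation by using a weighting algorithm. For example, the second allocation unit 24 may further include a weighting module 244 and the allocation module 243, where:
the weighting module 244 is configured to weight the sum of intra-group subband normalization factors of each group to obtain the weighted sum of intra-group subband normalization factors of each group;
the allocation module 243 is configured to perform the secondary inter-group bit allocation on the initial number of bits according to the weighted sum of intra-group subband normalization factors of each group.
Accordingly, the apparatus 20 for bit allocation of an audio signal may further include a determining unit 26, configured to determine, after the initial inter-group bit allocation and before the secondary inter-group bit allocation, according to the differences between the group means of the subband normalization factors and/or the code rate, whether to use the saturation algorithm for bit allocation, where the group mean of the subband normalization factors is the mean of the subband normalization factors of all subbands in the group. If so, the determining unit 26 determines to use the saturation algorithm; otherwise it determines to use the weighting algorithm, as shown in FIG. 4.
The third allocation unit 25 is configured to allocate the bits of the audio signal allocated to the group to the subbands in the group.
For example, as shown in FIG. 5, the third allocation unit 25 may include a weighting module 251 and an allocation module 252, where:
the weighting module 251 is configured to weight the subband normalization factors to obtain weighted subband normalization factors;
the allocation module 252 is configured to allocate, according to the weighted subband normalization factors, the bits of the audio signal allocated to the group to some or all of the subbands in the group, where the partial set of subbands is selected from all subbands in the group in descending order of the weighted subband normalization factor.
It can be seen that the apparatus for bit allocation of an audio signal according to the embodiments of the present invention can, through grouping, keep the allocation stable between consecutive frames and reduce the influence of global allocation on local discontinuity. The grouping used in the embodiments of the present invention thus keeps the allocation stable between consecutive frames, and bits are allocated within each group with an emphasis that depends on the signal characteristics, so that the allocated bits are spent on quantizing the important spectral information, thereby improving the coding quality of the audio signal.
In addition, in FIG. 6, an embodiment of the present invention further provides another apparatus 60 for bit allocation of an audio signal. The apparatus includes a memory 61 and a processor 62, where the memory 61 is configured to store code implementing the steps of the foregoing method embodiments, and the processor 62 is configured to process the code stored in the memory.
It can be seen that the apparatus for bit allocation of an audio signal according to the embodiments of the present invention can, through grouping, keep the allocation stable between consecutive frames and reduce the influence of global allocation on local discontinuity. In addition, the bit allocation in each group can use different threshold parameters, so that bits are allocated more adaptively, and the emphasis of the allocation within a group follows the characteristics of the spectral signal. For example, for a harmonic-like signal with a concentrated spectrum, the allocation concentrates on the subbands with large energy and the subbands between the harmonics need no additional bits; for a signal with a flatter spectrum, the allocation tries to stay smooth across subbands. In this way the allocated bits are spent on quantizing the important spectral information.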
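A person of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described with reference to the embodiments disclosed herein can be implemented by electronic hardware, or by a combination of computer software and electronic hardware. Whether these functions are performed by hardware or software depends on the particular application and the design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each particular application, but such implementation should not be considered to go beyond the scope of the present invention.
A person skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses and units described above may be found in the corresponding processes of the foregoing method embodiments, and are not described again here.
In the embodiments provided in this application, it should be understood that the disclosed systems, apparatuses and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is merely a division by logical function, and in actual implementation there may be another division manner; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, apparatuses or units, and may be electrical, mechanical or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist physically on its own, or two or more units may be integrated into one unit.
The functions, if implemented in the form of a software functional unit and sold or used as a standalone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing is merely a specific implementation of the present invention, and the protection scope of the present invention is not limited thereto. Any variation or replacement readily conceivable by a person skilled in the art within the technical scope disclosed in the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.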

Claims

权利要求
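1. A method for allocating bits of an audio signal, comprising:
dividing a frequency band of an audio signal into a plurality of sub-bands, and quantizing a sub-band normalization factor of each sub-band;
grouping the plurality of sub-bands into a plurality of groups, and obtaining a sum of intra-group sub-band normalization factors of each group, wherein the sum of intra-group sub-band normalization factors is the sum of the sub-band normalization factors of all sub-bands in the group;
performing initial inter-group bit allocation according to the sum of intra-group sub-band normalization factors of each group, to determine an initial number of bits of each group;
performing secondary inter-group bit allocation based on the initial number of bits of each group, to allocate coding bits of the audio signal to at least one group, wherein the sum of the bits allocated to the at least one group equals the number of coding bits of the audio signal; and
allocating the bits of the audio signal allocated to the group to the sub-bands in the group.
2. The method according to claim 1, wherein the performing secondary inter-group bit allocation comprises:
performing the secondary inter-group bit allocation by using a saturation algorithm for bit allocation.
3. The method according to claim 2, wherein the performing the secondary inter-group bit allocation by using a saturation algorithm for bit allocation comprises:
determining a number of saturation bits of each group;
determining, according to the number of saturation bits and the initial number of bits, a bit-saturated group and a number of surplus bits, wherein the number of surplus bits is the number of bits by which the initial number of bits of the bit-saturated group exceeds its number of saturation bits; and
allocating the surplus bits to a bit-unsaturated group;
wherein the bit-saturated group is a group whose initial number of bits is greater than its number of saturation bits, and the bit-unsaturated group is a group whose initial number of bits is less than its number of saturation bits.
4. The method according to claim 3, wherein the allocating the surplus bits to a bit-unsaturated group comprises:
allocating the surplus bits evenly to the bit-unsaturated groups.
5. The method according to any one of claims 2 to 4, further comprising, after the initial inter-group bit allocation and before the secondary inter-group bit allocation:
determining, according to a difference between the averages of intra-group sub-band normalization factors and/or according to a bit rate, whether to use the saturation algorithm for bit allocation, wherein the average of intra-group sub-band normalization factors is the average of the sub-band normalization factors of all sub-bands in the group;
if yes, determining to use the saturation algorithm for bit allocation; and
if no, determining to use a weighting algorithm.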
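6. The method according to claim 1 or 5, wherein the performing secondary inter-group bit allocation comprises:
performing the secondary inter-group bit allocation by using a weighting algorithm.
7. The method according to claim 6, wherein the performing the secondary inter-group bit allocation by using a weighting algorithm comprises:
weighting the sum of intra-group sub-band normalization factors of each group to obtain a weighted sum of intra-group sub-band normalization factors of each group; and
performing the secondary inter-group bit allocation on the initial numbers of bits according to the weighted sum of intra-group sub-band normalization factors of each group.
8. The method according to any one of claims 1 to 7, wherein the allocating the bits of the audio signal allocated to the group to the sub-bands in the group comprises:
weighting the sub-band normalization factors to obtain weighted sub-band normalization factors; and
allocating, according to the weighted sub-band normalization factors, the bits of the audio signal allocated to the group to some or all of the sub-bands in the group, wherein the some sub-bands are selected from all the sub-bands in the group in descending order of the weighted sub-band normalization factors.
9. The method according to any one of claims 1 to 8, wherein the grouping the plurality of sub-bands into a plurality of groups comprises:
grouping sub-bands having the same bandwidth into one group, so that the plurality of sub-bands are grouped into a plurality of groups; or
grouping sub-bands having close sub-band normalization factors into one group, so that the plurality of sub-bands are grouped into a plurality of groups.
10. The method according to claim 9, wherein the sub-bands in each group have the same bandwidth or have close sub-band normalization factors.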
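11. An apparatus for allocating bits of an audio signal, comprising:
a sub-band quantization unit, configured to divide a frequency band of an audio signal into a plurality of sub-bands and quantize a sub-band normalization factor of each sub-band;
a grouping unit, configured to group the plurality of sub-bands into a plurality of groups and obtain a sum of intra-group sub-band normalization factors of each group, wherein the sum of intra-group sub-band normalization factors is the sum of the sub-band normalization factors of all sub-bands in the group;
a first allocation unit, configured to perform initial inter-group bit allocation according to the sum of intra-group sub-band normalization factors of each group, to determine an initial number of bits of each group;
a second allocation unit, configured to perform secondary inter-group bit allocation based on the initial number of bits of each group, to allocate coding bits of the audio signal to at least one group, wherein the sum of the bits allocated to the at least one group equals the number of coding bits of the audio signal; and
a third allocation unit, configured to allocate the bits of the audio signal allocated to the group to the sub-bands in the group.
12. The apparatus according to claim 11, wherein the second allocation unit is specifically configured to:
perform the secondary inter-group bit allocation by using a saturation algorithm for bit allocation.
13. The apparatus according to claim 12, wherein the second allocation unit comprises:
a first determining module, configured to determine a number of saturation bits of each group;
a second determining module, configured to determine, according to the number of saturation bits and the initial number of bits, a bit-saturated group and a number of surplus bits, wherein the number of surplus bits is the number of bits by which the initial number of bits of the bit-saturated group exceeds its number of saturation bits; and
an allocation module, configured to allocate the surplus bits to a bit-unsaturated group;
wherein the bit-saturated group is a group whose initial number of bits is greater than its number of saturation bits, and the bit-unsaturated group is a group whose initial number of bits is less than its number of saturation bits.
14. The apparatus according to claim 13, wherein the allocation module is specifically configured to:
allocate the surplus bits evenly to the bit-unsaturated groups.
15. The apparatus according to any one of claims 12 to 14, further comprising:
a determining unit, configured to determine, after the initial inter-group bit allocation and before the secondary inter-group bit allocation, according to a difference between the averages of intra-group sub-band normalization factors and/or according to a bit rate, whether to use the saturation algorithm for bit allocation, wherein the average of intra-group sub-band normalization factors is the average of the sub-band normalization factors of all sub-bands in the group;
if yes, determine to use the saturation algorithm for bit allocation; and
if no, determine to use a weighting algorithm.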
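16. The apparatus according to claim 11 or 15, wherein the second allocation unit is further configured to:
perform the secondary inter-group bit allocation by using a weighting algorithm.
17. The apparatus according to claim 16, wherein the second allocation unit further comprises:
a weighting module, configured to weight the sum of intra-group sub-band normalization factors of each group to obtain a weighted sum of intra-group sub-band normalization factors of each group; and
the allocation module is configured to perform the secondary inter-group bit allocation on the initial numbers of bits according to the weighted sum of intra-group sub-band normalization factors of each group.
18. The apparatus according to any one of claims 11 to 17, wherein the third allocation unit comprises:
a weighting module, configured to weight the sub-band normalization factors to obtain weighted sub-band normalization factors; and
an allocation module, configured to allocate, according to the weighted sub-band normalization factors, the bits of the audio signal allocated to the group to some or all of the sub-bands in the group, wherein the some sub-bands are selected from all the sub-bands in the group in descending order of the weighted sub-band normalization factors.
19. The apparatus according to any one of claims 11 to 18, wherein the grouping unit is specifically configured to:
group sub-bands having the same bandwidth into one group, so that the plurality of sub-bands are grouped into a plurality of groups; or
group sub-bands having close sub-band normalization factors into one group, so that the plurality of sub-bands are grouped into a plurality of groups.
20. The apparatus according to claim 19, wherein the sub-bands in each group have the same bandwidth or have close sub-band normalization factors.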
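For readers who want a concrete picture of the saturation algorithm recited in claims 3, 4, 13 and 14, a minimal sketch follows. The function name `redistribute_surplus`, the single-pass behaviour, and the handling of the case with no unsaturated group are assumptions of this example and are not taken from the claims.

```python
def redistribute_surplus(initial_bits, saturation_bits):
    """Move bits above each group's saturation level to unsaturated groups."""
    bits = list(initial_bits)
    surplus = 0
    unsaturated = []
    for i, (b, cap) in enumerate(zip(initial_bits, saturation_bits)):
        if b > cap:
            # Bit-saturated group: clip to the saturation level, keep the excess.
            surplus += b - cap
            bits[i] = cap
        elif b < cap:
            # Bit-unsaturated group: eligible to receive surplus bits.
            unsaturated.append(i)
    if surplus == 0 or not unsaturated:
        # Nothing to move, or nowhere to move it (assumed fallback).
        return list(initial_bits)
    # Even split; the first few groups absorb the remainder bit by bit.
    share, extra = divmod(surplus, len(unsaturated))
    for k, i in enumerate(unsaturated):
        bits[i] += share + (1 if k < extra else 0)
    return bits
```

The total number of bits is preserved, which matches the requirement that the bits allocated to the groups add up to the coding bits of the audio signal. This sketch makes a single pass, so a receiving group may end up above its own saturation level; whether to repeat the step in that case is left open here.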
PCT/CN2013/076392 2012-10-26 2013-05-29 Method and apparatus for allocating bits of audio signal WO2014063489A1 (zh)

Priority Applications (7)

Application Number Priority Date Filing Date Title
KR1020157010413A KR20150058483A (ko) 2012-10-26 2013-05-29 Method and apparatus for allocating bits of audio signal
EP13849179.0A EP2892052B1 (en) 2012-10-26 2013-05-29 Bit allocation method and device for audio signal
BR112015008609-8A BR112015008609B1 (pt) 2012-10-26 2013-05-29 Method and apparatus for allocating bits of an audio signal
JP2015538257A JP6121551B2 (ja) 2012-10-26 2013-05-29 Method and apparatus for allocating bits of audio signal
SG11201502355PA SG11201502355PA (en) 2012-10-26 2013-05-29 Method and apparatus for allocating bits of audio signal
US14/675,031 US9530420B2 (en) 2012-10-26 2015-03-31 Method and apparatus for allocating bits of audio signal
US15/354,641 US9972326B2 (en) 2012-10-26 2016-11-17 Method and apparatus for allocating bits of audio signal

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201210415253.6A CN103778918B (zh) 2012-10-26 2012-10-26 Method and apparatus for allocating bits of audio signal
CN201210415253.6 2012-10-26

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US14/675,031 Continuation US9530420B2 (en) 2012-10-26 2015-03-31 Method and apparatus for allocating bits of audio signal

Publications (1)

Publication Number Publication Date
WO2014063489A1 true WO2014063489A1 (zh) 2014-05-01

Family

ID=50543952

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2013/076392 WO2014063489A1 (zh) 2012-10-26 2013-05-29 Method and apparatus for allocating bits of audio signal

Country Status (8)

Country Link
US (2) US9530420B2 (zh)
EP (1) EP2892052B1 (zh)
JP (2) JP6121551B2 (zh)
KR (1) KR20150058483A (zh)
CN (1) CN103778918B (zh)
BR (1) BR112015008609B1 (zh)
SG (2) SG11201502355PA (zh)
WO (1) WO2014063489A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104200227A (zh) * 2014-05-17 2014-12-10 北京工业大学 Feature normalization method and system for human cognitive pattern recognition

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
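CN103778918B (zh) * 2012-10-26 2016-09-07 华为技术有限公司 Method and apparatus for allocating bits of audio signal
WO2014199449A1 (ja) * 2013-06-11 2014-12-18 株式会社東芝 Digital watermark embedding device, digital watermark detection device, digital watermark embedding method, digital watermark detection method, digital watermark embedding program, and digital watermark detection program
CN106409300B (zh) * 2014-03-19 2019-12-24 华为技术有限公司 Method and apparatus for signal processing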
US11354536B2 (en) * 2017-07-19 2022-06-07 Audiotelligence Limited Acoustic source separation systems
EP3547765B1 (en) * 2018-03-28 2021-08-18 Institut Mines-Telecom Power distribution to sub-bands in multiple access communications systems
US11133891B2 (en) * 2018-06-29 2021-09-28 Khalifa University of Science and Technology Systems and methods for self-synchronized communications
US10951596B2 (en) * 2018-07-27 2021-03-16 Khalifa University of Science and Technology Method for secure device-to-device communication using multilayered cyphers
EP3751567B1 (en) * 2019-06-10 2022-01-26 Axis AB A method, a computer program, an encoder and a monitoring device
US11823698B2 (en) 2020-01-17 2023-11-21 Audiotelligence Limited Audio cropping

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5537510A (en) * 1994-12-30 1996-07-16 Daewoo Electronics Co., Ltd. Adaptive digital audio encoding apparatus and a bit allocation method thereof
US5761636A (en) * 1994-03-09 1998-06-02 Motorola, Inc. Bit allocation method for improved audio quality perception using psychoacoustic parameters
US6745162B1 (en) * 2000-06-22 2004-06-01 Sony Corporation System and method for bit allocation in an audio encoder
CN102208188A (zh) * 2011-07-13 2011-10-05 华为技术有限公司 Audio signal encoding and decoding method and device
CN102467910A (zh) * 2010-11-09 2012-05-23 索尼公司 Encoding device, encoding method, and program

Family Cites Families (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3466507B2 (ja) * 1998-06-15 2003-11-10 松下電器産業株式会社 Speech coding method, speech coding apparatus, and data recording medium
JP4287545B2 (ja) * 1999-07-26 2009-07-01 パナソニック株式会社 Sub-band coding system
JP4242516B2 (ja) * 1999-07-26 2009-03-25 パナソニック株式会社 Sub-band coding system
JP2001249699A (ja) * 2000-03-07 2001-09-14 Hitachi Ltd Audio compression device
KR100728428B1 (ko) * 2002-09-19 2007-06-13 마츠시타 덴끼 산교 가부시키가이샤 Audio decoding apparatus and audio decoding method
US20060172862A1 (en) 2003-06-05 2006-08-03 Flexiped As Physical exercise apparatus and footrest platform for use with the apparatus
CN101101755B (zh) 2007-07-06 2011-04-27 北京中星微电子有限公司 Bit allocation and quantization method for audio coding, and audio coding apparatus
ES2403410T3 (es) * 2007-08-27 2013-05-17 Telefonaktiebolaget L M Ericsson (Publ) Adaptive transition frequency between noise filling and bandwidth extension
JP5539203B2 (ja) * 2007-08-27 2014-07-02 テレフオンアクチーボラゲット エル エム エリクソン(パブル) Improved transform coding of speech and audio signals
GB2454190A (en) 2007-10-30 2009-05-06 Cambridge Silicon Radio Ltd Minimising a cost function in encoding data using spectral partitioning
PL2304719T3 (pl) * 2008-07-11 2017-12-29 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audio encoder, methods for providing an audio stream, and computer program
TWI433542B (zh) * 2009-05-25 2014-04-01 Mstar Semiconductor Inc Inverse quantization processing method and apparatus
US8207875B2 (en) 2009-10-28 2012-06-26 Motorola Mobility, Inc. Encoder that optimizes bit allocation for information sub-parts
FR2973551A1 (fr) * 2011-03-29 2012-10-05 France Telecom Allocation, by sub-bands, of quantization bits of spatial information parameters for parametric coding
JP6179087B2 (ja) * 2012-10-24 2017-08-16 富士通株式会社 Audio encoding device, audio encoding method, and computer program for audio encoding
CN103778918B (zh) * 2012-10-26 2016-09-07 华为技术有限公司 Method and apparatus for allocating bits of audio signal

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP2892052A4 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
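CN104200227A (zh) * 2014-05-17 2014-12-10 北京工业大学 Feature normalization method and system for human cognitive pattern recognition
CN104200227B (zh) * 2014-05-17 2016-05-11 北京工业大学 Feature normalization method and system for human cognitive pattern recognition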

Also Published As

Publication number Publication date
BR112015008609A2 (pt) 2017-07-04
SG11201502355PA (en) 2015-05-28
JP6121551B2 (ja) 2017-04-26
JP2015534129A (ja) 2015-11-26
EP2892052A1 (en) 2015-07-08
CN103778918A (zh) 2014-05-07
KR20150058483A (ko) 2015-05-28
US20150206541A1 (en) 2015-07-23
SG10201703301UA (en) 2017-06-29
EP2892052B1 (en) 2016-07-27
CN103778918B (zh) 2016-09-07
BR112015008609B1 (pt) 2021-10-26
EP2892052A4 (en) 2015-09-09
JP2017138614A (ja) 2017-08-10
US20170069329A1 (en) 2017-03-09
JP6351783B2 (ja) 2018-07-04
US9530420B2 (en) 2016-12-27
US9972326B2 (en) 2018-05-15

Similar Documents

Publication Publication Date Title
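WO2014063489A1 (zh) Method and apparatus for allocating bits of audio signal
JP6702593B2 (ja) Method and apparatus for encoding and decoding audio signals
KR101736705B1 (ko) Bit allocation method and apparatus for audio signal
US10789964B2 (en) Dynamic bit allocation methods and devices for audio signal
JP2018041091A (ja) Signal processing method and apparatus
WO2012139401A1 (zh) Audio coding method and apparatus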

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 13849179; Country of ref document: EP; Kind code of ref document: A1)
REEP Request for entry into the european phase (Ref document number: 2013849179; Country of ref document: EP)
WWE Wipo information: entry into national phase (Ref document number: 2013849179; Country of ref document: EP)
ENP Entry into the national phase (Ref document number: 20157010413; Country of ref document: KR; Kind code of ref document: A)
ENP Entry into the national phase (Ref document number: 2015538257; Country of ref document: JP; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)
REG Reference to national code (Ref country code: BR; Ref legal event code: B01A; Ref document number: 112015008609; Country of ref document: BR)
ENP Entry into the national phase (Ref document number: 112015008609; Country of ref document: BR; Kind code of ref document: A2; Effective date: 20150416)