CN1471236A - Signal adaptive multi resolution wave filter set for sensing audio encoding - Google Patents

Signal-adaptive multiresolution filter bank for perceptual audio coding

Info

Publication number
CN1471236A
Authority
CN
China
Prior art keywords
filters
signal
bank
cosine
modulation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CNA031485154A
Other languages
Chinese (zh)
Inventor
潘兴德
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEIJING FUGUO DIGITAL TECHN Co Ltd
Original Assignee
BEIJING FUGUO DIGITAL TECHN Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BEIJING FUGUO DIGITAL TECHN Co Ltd filed Critical BEIJING FUGUO DIGITAL TECHN Co Ltd
Priority to CNA031485154A priority Critical patent/CN1471236A/en
Publication of CN1471236A publication Critical patent/CN1471236A/en
Pending legal-status Critical Current

Landscapes

  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention relates to filtering used in data compression and signal processing. Specifically, it decorrelates audio signals, providing a method and a device for removing redundancy. Based on a psychoacoustic model, it can also separate signal components of differing perceptual importance. Its characteristic feature is that cosine-modulated filtering or the MDCT is used to construct filter banks with several different time-frequency partitions, and that the filter bank structure is switched adaptively according to the signal. Statistical redundancy and perceptually irrelevant components are thereby removed effectively during audio coding, yielding high coding efficiency.

Description

Signal-adaptive multiresolution filter bank for perceptual audio coding
Technical field
The present invention relates to filter banks used in data compression and signal processing. More specifically, it is used to decorrelate audio signals, providing a method and an apparatus for removing redundancy. In addition, based on a psychoacoustic model, the invention can be used to separate signal components of differing importance.
Background art
Digital audio compression coding achieves high-quality results at low bit rates. Its basic principles are: 1) eliminate as much of the redundancy in the audio signal as possible; 2) make full use of the characteristics of human hearing.
It is well known that some linear transforms produce high-frequency coefficients close to zero. In other words, most of the information in a time-domain signal can be converted into, or concentrated in, a subset of the frequency-domain or time-frequency coefficients. Signal compression therefore widely uses different filter structures as a means of improving coding efficiency.
In psychoacoustics, a pure tone can be masked by continuous noise of a certain bandwidth centred on it. If the noise power in that band equals the power of the tone, the tone is at the threshold of being just audible, and the band is called a critical band (unit: Bark). The human ear analyses audio signals on the basis of critical bands, much like a bank of filters of unequal bandwidth whose widths differ greatly from subband to subband. The critical band is therefore the psychoacoustic basis for subband division in coding. In perceptual audio coding the subband division should match the critical bandwidths of the ear as closely as possible, so as to fit its auditory characteristics better. In practical coders, however, implementation cost means that this requirement cannot be met completely: designing a non-uniform filter bank that approximates the auditory characteristics of the ear, together with the associated psychoacoustic analysis and quantization, presents technical difficulties.
In general, a basic operation of a perceptual audio coder is to map the input audio signal from the time domain to the frequency domain or the time-frequency domain. The basic idea is as follows: the signal is decomposed into components in individual frequency bands; once the input is expressed in the frequency domain, the psychoacoustic model can be used to remove irrelevant information; the components in each band are then grouped; finally, bits are allocated sensibly to represent each group of frequency parameters. Because audio signals are strongly quasi-periodic, this process greatly reduces the amount of data and improves coding efficiency.
In recent years a series of time-frequency mapping algorithms (also called transforms or filtering) have been developed for separating signal components and extracting redundancy. These methods, which differ in performance, include:
(1) the discrete Fourier transform (DFT);
(2) the discrete cosine transform (DCT);
(3) quadrature mirror filters (QMF);
(4) pseudo quadrature mirror filters (Pseudo QMF, PQMF);
(5) cosine-modulated filters (CMF), including the modified discrete cosine transform (MDCT);
(6) the discrete wavelet (packet) transform (DW(P)T).
Each of these transforms has its own advantages and disadvantages, and different systems choose a suitable transform as the basis of their filter bank according to their needs.
MPEG-1/2 Layers I and II use a PQMF as the filter bank. Its advantages are a relatively simple structure and good time resolution. Its disadvantages are: there is significant frequency overlap between adjacent subbands, so a change in a single-frequency signal can affect the two adjacent subbands; below 2000 Hz the subband bandwidth is much larger than the psychoacoustic bandwidth, so bits cannot be allocated optimally; and the real-time computational load is rather high.
MPEG-1/2 Layer III uses a cascade of PQMF and MDCT as its filter bank. Although the MDCT raises the frequency resolution and hence the coding efficiency, the frequency overlap of the PQMF between adjacent subbands still causes aliasing, and the spreading of frequency-domain quantization noise in the time domain is comparatively severe.
MPEG-2/4 AAC uses the MDCT as its filter bank (stationary signals: 1024-point MDCT; transient signals: 128-point MDCT), with two overlapping window shapes, SINE and KBD. Its advantage is very good frequency resolution; its disadvantage is rather low time resolution.
The filter bank of MPEG-4 Twin VQ is similar to that of MPEG-2/4 AAC. In addition, it uses a linear filter bank to whiten the spectral coefficients and normalizes them before the quantization stage.
The AC-3 filter bank uses a 256-point MDCT for stationary signals and a 128-point MDCT for transient signals. Its block-length selection mechanism is fairly simple, and the resulting selection is suboptimal.
The systems above either use a single transform configuration to compress and represent an input frame, or use a filter bank or transform with a shorter time-domain analysis interval to compress and represent rapidly changing (fast-varying) signals, in order to suppress the effect of pre-echo on the decoded signal. When a frame contains components with different transient characteristics, a single transform configuration cannot meet the basic needs of the different sub-frames for optimal compression. And if a filter bank or transform with a short time-domain support is simply applied to fast-varying signals, the frequency resolution of the resulting coefficients is low: the width of a frequency bin in the low-frequency region is much larger than the critical bandwidth of the ear, which seriously degrades coding efficiency.
The ATRAC filter bank is a cascade of pre-echo gain control, QMF and MDCT. It also uses a window-switching mechanism to adjust the time-frequency resolution according to the characteristics of the input signal.
The DTS filter bank is a 512-tap, 32-subband PQMF. To extract further redundancy, a linear filter bank can be cascaded after the PQMF.
Deepen Sinha and J. D. Johnston proposed a coding technique based on signal-adaptive switching between the MDCT and the wavelet transform (Deepen Sinha and J. D. Johnston, "Audio compression at low bit rates using a signal adaptive switched filterbank", in Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing, volume 2, pages 1053-1056, Atlanta, USA, 1996). Slowly varying signals are coded with the MDCT, which has higher frequency resolution, and rapidly changing signals with a wavelet transform, giving higher coding efficiency.
Marcus Purat and Peter Noll filtered the output of a cosine-modulated filter bank a second time, providing a new multiresolution filtering technique for audio coding (Marcus Purat and Peter Noll, "A new orthonormal wavelet packet decomposition for audio coding using frequency-varying modulated lapped transforms", IEEE 1995 Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, N.Y., USA, 1995), which also gave higher coding efficiency.
Summary of the invention
To improve the quality of audio coding, the statistical redundancy and the perceptually irrelevant components in the coded signal must be removed effectively. A filter bank provides an effective way to remove statistical and perceptual redundancy and to reduce coding side information. According to its function, the design goals of the filtering include:
(1) For different signal types, adjust the time and frequency resolution of the filter bank so as to optimally separate signal components with different perceptual characteristics.
(2) Use basis functions of modified-cosine form that are as long as possible, to remove or reduce the statistical redundancy in the audio signal effectively.
(3) Through adaptive switching of the time-frequency resolution of the filter bank and overlapped windowing between adjacent frames, minimize pre-echo noise and the audible blocking effect caused by discontinuities at frame boundaries.
(4) Because the statistical redundancy and perceptual irrelevance of the audio signal are removed effectively, improve compression efficiency while preserving audio quality.
(5) The filtering technique used introduces only a small encoding/decoding delay.
(6) Fast algorithms are used, so the computational load is small.
To achieve these goals, the invention uses cosine-modulated filter bank techniques to design a set of filter bank structures that are switched according to a transient measure of the input audio signal. Statistical redundancy between symbols is eliminated or reduced while the characteristics of human hearing are fully exploited, so as to improve coding efficiency.
The signal-adaptive filter bank structure proposed by the invention is a technique that, in audio coding, dynamically adjusts the filter structure according to the type of the signal being coded. Slowly varying signals use an equal-bandwidth cosine-modulated filter bank. Fast-varying signals use a multiresolution time-frequency partitioning filter bank based on cosine modulation: low-frequency components use a cosine-modulated filter bank with higher frequency resolution, and high-frequency components use one with higher time resolution. This multiresolution structure can moreover be changed according to the statistical, masking and/or time-frequency characteristics of the current signal, so that the signal is analysed and represented adaptively and the number of bits needed for coding is reduced.
The signal-adaptive filtering technique uses different filter structures, all based on cosine-modulated filter bank techniques, for different signal types: slowly varying signals use an equal-bandwidth cosine-modulated filter bank; the low-frequency and high-frequency parts of fast-varying signals use equal-bandwidth cosine-modulated filter banks of different time-frequency resolution; and transition signals use an equal-bandwidth cosine-modulated filter bank. The equal-bandwidth cosine-modulated filter bank can take two forms: the classical cosine-modulated filter bank and the MDCT.
Used in audio coding and decoding, the filter bank of the invention achieves very high coding performance without a significant increase in the required computation.
Brief description of the drawings
Fig. 1 is a block diagram of the analysis and synthesis filter banks of a cosine-modulated filter bank.
Fig. 2 is a block diagram of the operating principle of the filter bank of the invention.
Fig. 3 shows the analysis filter structure for fast-varying signals (encoder side).
Fig. 4 shows the synthesis filter structure for fast-varying signals (decoder side).
Fig. 5 is a schematic of the analysis/synthesis windows of a fast-varying signal frame.
Fig. 6 is a schematic of the analysis/synthesis windows for the transition from slowly varying to fast-varying signal processing.
Fig. 7 is a workflow diagram of the filter bank of the invention.
Fig. 8 is a flow chart of a typical audio encoder.
Fig. 9 is a schematic of the window shapes when the filter bank switches as the signal type changes "stationary → fast-varying → stationary".
Fig. 10 shows the analysis windows of the fast-varying signal filter.
Fig. 11 is a schematic of the time-frequency partition of a fast-varying block.
Detailed description
The signal-adaptive filter bank structure proposed by the invention is a technique that, in audio coding, dynamically adjusts the filter structure according to the type of the signal being coded. Unlike the long/short MDCT block strategy of AAC, the invention uses an equal-bandwidth cosine-modulated filter bank for slowly varying signals. For fast-varying signals it uses a multiresolution time-frequency partitioning filter bank based on cosine modulation: low-frequency components use a cosine-modulated filter bank with higher frequency resolution, and high-frequency components use one with higher time resolution. This multiresolution structure can moreover be changed according to the statistical, masking and/or time-frequency characteristics of the current signal, so that the signal is analysed and represented adaptively and the number of bits needed for coding is reduced. The operating principle is shown in Fig. 2: the input signal is analysed by the transient measure module and classified as a slowly varying signal, a fast-varying signal (which can be subdivided further, for example into type I and type II fast-varying signals), or a transition signal that occurs when the signal type changes. Each signal type is then filtered with a different filter structure to obtain the required time-frequency coefficients.
In the signal-adaptive filtering technique proposed by the invention, different signal types use different filter structures, all based on cosine-modulated filter bank techniques: slowly varying signals use an equal-bandwidth cosine-modulated filter bank; the low-frequency and high-frequency parts of fast-varying signals use equal-bandwidth cosine-modulated filter banks of different time-frequency resolution; and transition signals use an equal-bandwidth cosine-modulated filter bank. The equal-bandwidth cosine-modulated filter bank can take two forms: the classical cosine-modulated filter bank and the MDCT. One multiresolution filter structure for fast-varying signals is shown in Fig. 3 and Fig. 4; Fig. 3 is the filter structure at the encoder and Fig. 4 the filter structure at the decoder.
The workflow of the signal-adaptive filter bank technique of the invention is shown in Fig. 7. Its steps are:
(1) split the audio signal into frames and feed them into the processing flow;
(2) select the transient measure;
(3) compute the transient measure of the current frame;
(4) determine the type of the current signal;
(5) select the filter structure for the current frame;
(6) perform cosine-modulated filtering;
(7) organize the filter coefficients in time and frequency;
(8) output the filtered result.
For convenience of description, this application introduces two concepts: the "slowly varying signal" and the "fast-varying signal". Because audio signals are time-varying, the current frame is classified as a "slowly varying signal" or a "fast-varying signal" according to its characteristics, such as the degree of change of its statistics, the flatness of its time-domain and frequency-domain waveforms, and its own temporal masking ability (whether pre-echo can arise). Note that the "slowly varying signal" defined here differs from what is usually called a "quasi-stationary" or "slowly varying in the time domain" signal, and the "fast-varying signal" likewise differs from what is usually called a "non-stationary" or "transient" signal.
Implementing the filter bank of the invention requires a simple and effective mechanism for deciding the signal type; this mechanism can be chosen according to the actual coding application.
In the present invention, the transient measure of an audio signal is defined as:
$$Z = \left(\sum_{j=1}^{N}\left|s_j - \frac{1}{N}\sum_{i=1}^{N}s_i\right|^{2} + \lambda\right) \Big/ \sum_{j=1}^{N}\left|s_j\right|^{2} + \lambda$$
where $s_j$ is the j-th sample of the current frame, N is the frame length, and λ is a real number greater than zero and less than 1. λ is introduced to emphasize the importance of variation.
When Z as defined above is below a threshold $X_1$, the signal is classified as a slowly varying signal; otherwise, if Z is below a second threshold $X_2$, it is a fast-varying signal of type $K_1$. A series of fast-varying signal types can be defined in this way. If there are K signal types in total, the thresholds $X_i$ (i = 1, ..., K) can be adapted as the signal changes. K and the thresholds $X_i$ (i = 1, ..., K) are determined as follows: if the filter structure information is to occupy at most L bits per frame, then $K \le 2^L$; the distribution function of the signal transient measure is estimated statistically, and the range of the measure is divided into K intervals of equal probability.
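The measure and the equal-probability threshold rule can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions: the printed formula is read as (numerator + λ) / (denominator + λ), the default λ = 0.618 is taken from the example later in the description, and the function names, the use of empirical quantiles, and the choice of K − 1 interior thresholds are hypothetical choices, not details given in the patent.

```python
import numpy as np

def transient_measure(frame, lam=0.618):
    """Z for one frame: mean-removed energy over total energy.

    Assumption: lam is added to both numerator and denominator.
    """
    s = np.asarray(frame, dtype=float)
    num = np.sum(np.abs(s - s.mean()) ** 2) + lam
    den = np.sum(np.abs(s) ** 2) + lam
    return num / den

def equiprobable_thresholds(z_values, bits):
    """Interior thresholds that split observed Z values into
    K = 2**bits intervals of equal empirical probability."""
    k = 2 ** bits
    levels = np.arange(1, k) / k          # quantile levels 1/K, ..., (K-1)/K
    return np.quantile(z_values, levels)

def classify(z, thresholds):
    """Interval index of z: 0 = slowly varying, 1.. = fast-varying types."""
    return int(np.searchsorted(thresholds, z, side="right"))
```

In use, `equiprobable_thresholds` would be run on Z values collected from training material, and `classify` would then be applied to each new frame; with L bits per frame the returned index fits in the filter structure side information.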
In the present invention, slowly varying signals use an equal-bandwidth cosine-modulated filter bank. Fast-varying signals use a multiresolution time-frequency partitioning filter bank based on cosine-modulated filtering: low-frequency components use a cosine-modulated filter bank with higher frequency resolution, and high-frequency components use one with higher time resolution. This time-frequency partition matches the distribution of the critical bands of the human auditory system. Moreover, because the fast-varying components of a signal appear mainly in the mid and high frequencies, such a filter structure is better suited to audio coding than a filter bank of a single structure or a filter bank that is simply switched. See the block diagram of the operating principle in Fig. 2.
In the present invention, several parameters and mechanisms must be chosen sensibly. They include:
(a) the multiresolution filter structure and its selection;
(b) the shape of the overlapping windows;
(c) the length of the overlapping windows.
As noted above, the filtering of both slowly varying and fast-varying signals is based on cosine-modulated filter bank techniques, which comprise two forms of filtering: traditional cosine-modulated filtering and the MDCT. A source coding/decoding system based on cosine-modulated filtering is shown in Fig. 1. At the encoder, the analysis filter bank decomposes the input signal into M subbands, and the subband coefficients are quantized and entropy coded. At the decoder, the subband coefficients are recovered by entropy decoding and inverse quantization and are then passed through the synthesis filter bank to reconstruct the audio signal.
The impulse responses of traditional cosine-modulated filtering are:
$$h_k(n) = 2\,p_a(n)\cos\!\left(\frac{\pi}{M}(k+0.5)\left(n-\frac{D}{2}\right)+\theta_k\right), \quad n = 0, 1, \ldots, N_h-1 \tag{1}$$
$$f_k(n) = 2\,p_s(n)\cos\!\left(\frac{\pi}{M}(k+0.5)\left(n-\frac{D}{2}\right)-\theta_k\right), \quad n = 0, 1, \ldots, N_f-1 \tag{2}$$
where $0 \le k \le M-1$, $0 \le n \le 2KM-1$, K is an integer greater than zero, and $\theta_k = (-1)^k \frac{\pi}{4}$.
Here the analysis window (analysis prototype filter) $p_a(n)$ of the M-subband cosine-modulated filter bank has impulse response length $N_a$, and the synthesis window (synthesis prototype filter) $p_s(n)$ has impulse response length $N_s$. The delay D of the overall system can then be confined to the range $[M-1,\ N_s+N_a-M+1]$, and the system delay is $D = 2sM + d$ ($0 \le d \le 2M-1$).
When the analysis and synthesis windows are equal, that is, when
$$p_a(n) = p_s(n) \quad \text{and} \quad N_a = N_s, \tag{3}$$
the cosine-modulated filter bank given by (1) and (2) is an orthogonal filter bank, and the matrices H and F ($[H]_{n,k} = h_k(n)$, $[F]_{n,k} = f_k(n)$) are orthogonal transform matrices. To obtain a linear-phase filter bank, the window is further required to be symmetric:
$$p_a(2KM-1-n) = p_a(n). \tag{4}$$
The conditions that the window function must satisfy to guarantee perfect reconstruction of orthogonal and biorthogonal systems are given in the literature (P. P. Vaidyanathan, "Multirate Systems and Filter Banks", Prentice Hall, Englewood Cliffs, NJ, 1993).
The other form of filtering is the MDCT (Modified Discrete Cosine Transform), also called the TDAC (Time Domain Aliasing Cancellation) cosine-modulated filter bank. Its impulse responses are:
$$h_k(n) = p_a(n)\sqrt{\frac{2}{M}}\cos\!\left(\frac{\pi}{M}(k+0.5)\left(n+\frac{M+1}{2}\right)\right) \tag{5}$$
$$f_k(n) = p_s(n)\sqrt{\frac{2}{M}}\cos\!\left(\frac{\pi}{M}(k+0.5)\left(n+\frac{M+1}{2}\right)\right) \tag{6}$$
where $0 \le k \le M-1$, $0 \le n \le 2KM-1$, and K is an integer greater than zero. $p_a(n)$ and $p_s(n)$ are the analysis window (analysis prototype filter) and the synthesis window (synthesis prototype filter), respectively.
Likewise, when the analysis and synthesis windows are equal, that is, when
$$p_a(n) = p_s(n), \tag{7}$$
the cosine-modulated filter bank given by (5) and (6) is an orthogonal filter bank, and the matrices H and F ($[H]_{n,k} = h_k(n)$, $[F]_{n,k} = f_k(n)$) are orthogonal transform matrices. To obtain a linear-phase filter bank, the window is further required to be symmetric:
$$p_a(2KM-1-n) = p_a(n). \tag{8}$$
For perfect reconstruction, the analysis and synthesis windows must then satisfy
$$\sum_{m=0}^{2K-1-2s} p_a(mM+n)\,p_a\big((m+2s)M+n\big) = \delta(s) \tag{9}$$
where s = 0, ..., K-1 and n = 0, ...,
If the constraint of (7) is relaxed, that is, the analysis and synthesis windows are no longer required to be equal, the cosine-modulated filter bank becomes a biorthogonal modulated filter bank. Although a biorthogonal modulated filter bank loses the orthogonality of the transform, it may gain other properties of greater practical value.
Time-domain analysis shows that the biorthogonal modulated filter bank given by (5) and (6) still achieves perfect reconstruction, provided that
$$\sum_{m=0}^{2K-1-2s} p_s(mM+n)\,p_a\big((m+2s)M+n\big) = \delta(s) \tag{10}$$
$$\sum_{m=0}^{2K-1-2s} (-1)^m\,p_s(mM+n)\,p_a\big((m+2s)M+(M-n-1)\big) = 0 \tag{11}$$
where s = 0, ..., K-1 and n = 0, ..., M-1.
The analysis and synthesis windows used in the invention may be any windows that satisfy the perfect reconstruction condition of the filter bank, such as the SINE and KBD windows commonly used in audio coding.
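As an illustration of perfect reconstruction with such a window, the following Python sketch builds the MDCT impulse responses of equation (5) for the simplest case K = 1 (window length 2M) with a sine window, uses the same window for synthesis, and checks numerically that overlap-add recovers the input. The function names, the zero-padding scheme and the choice K = 1 are assumptions made for this sketch only; the patent does not prescribe an implementation.

```python
import numpy as np

def sine_window(M):
    """Sine window of length 2M; w(n)**2 + w(n + M)**2 == 1."""
    n = np.arange(2 * M)
    return np.sin(np.pi * (n + 0.5) / (2 * M))

def mdct_basis(M, window):
    """Row k holds h_k(n), n = 0..2M-1, following equation (5) with K = 1."""
    n = np.arange(2 * M)
    k = np.arange(M)[:, None]
    phase = np.pi / M * (k + 0.5) * (n + (M + 1) / 2)
    return window * np.sqrt(2.0 / M) * np.cos(phase)

def analyze(x, H, M):
    """Block transform with hop M; x is zero-padded by M samples per side."""
    xp = np.concatenate([np.zeros(M), x, np.zeros(M)])
    n_blocks = len(xp) // M - 1
    return np.array([H @ xp[b * M:(b + 2) * M] for b in range(n_blocks)])

def synthesize(coeffs, F, M):
    """Overlap-add of the inverse-transformed blocks; the time-domain
    aliasing of neighbouring blocks cancels in the overlap."""
    out = np.zeros((len(coeffs) + 1) * M)
    for b, c in enumerate(coeffs):
        out[b * M:(b + 2) * M] += F.T @ c
    return out[M:-M]

M = 8
H = mdct_basis(M, sine_window(M))        # synthesis uses the same window
x = np.random.default_rng(0).standard_normal(8 * M)
y = synthesize(analyze(x, H, M), H, M)
print(np.allclose(x, y))
```

The input length is taken as a multiple of M so that every sample is covered by exactly two overlapping blocks; replacing the sine window by a window that violates the power-complementarity condition makes the final comparison fail.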
To guarantee the perfect reconstruction property of the filter bank structure of the invention, the cosine-modulated filter banks must satisfy the following conditions:
(a) The analysis and synthesis windows of the analysis and synthesis filter banks used to encode and decode slowly varying frames must satisfy the window constraints for perfect reconstruction of a cosine-modulated filter bank, that is, equations (10) and (11) above or other constraints.
(b) When encoding and decoding a fast-varying frame, the analysis and synthesis filter banks must satisfy the following condition: the time-ordered sum of the squares of the analysis windows of the cosine-modulated filter bank of higher time resolution equals the square of the analysis window of the cosine-modulated filter bank of higher frequency resolution; or, more generally, the time-ordered sum of the squares of the M analysis windows of the filter bank of higher time resolution equals the time-ordered sum of the squares of the N analysis windows of the filter bank of higher frequency resolution (where N ≤ M). When filter banks of more than two time resolutions are used, the same condition must hold.
For example, when M = 2 and N = 1, the analysis and synthesis windows of the low-frequency filter bank are 3/2 times as long as the analysis and synthesis windows of the high-frequency filter bank 501. Let the analysis window of the low-frequency filter bank be x(i), i = 0, ..., L-1, and its synthesis window be y(i), i = 0, ..., L-1 503. Let the first analysis window of the high-frequency filter bank be $x_1(i)$, i = 0, ..., $L_1$-1, and the corresponding synthesis window be $y_1(i)$, i = 0, ..., $L_1$-1 505. Let the second analysis window be $x_2(i)$, i = 0, ..., $L_1$-1, and the corresponding synthesis window be $y_2(i)$, i = 0, ..., $L_1$-1 507, with $L = L_1 \times \frac{3}{2}$ (see Fig. 5).
For this multiresolution filter bank to satisfy the perfect reconstruction condition, the analysis and synthesis windows must satisfy
$$z_1^2(i + L/3) + z_2^2(i) = 1 \tag{13}$$
where z stands for x in the analysis filters and for y in the synthesis filters. In addition, x and y must satisfy equations (10) and (11) above or other constraints.
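Condition (13) can be checked numerically for a candidate pair of short windows. The sketch below assumes that i runs over the region where the first short window, shifted by L/3, overlaps the second one, and it uses sine windows and a short-window length of 256 purely as hypothetical test values; neither the index range nor these window choices are stated in the text.

```python
import numpy as np

def sine_window(length):
    """Sine window sin(pi * (n + 0.5) / length), n = 0..length-1."""
    n = np.arange(length)
    return np.sin(np.pi * (n + 0.5) / length)

def satisfies_condition_13(z1, z2, L):
    """Check z1(i + L/3)**2 + z2(i)**2 == 1 over the assumed overlap
    i = 0 .. len(z1) - L/3 - 1."""
    shift = L // 3
    overlap = len(z1) - shift
    total = z1[shift:shift + overlap] ** 2 + z2[:overlap] ** 2
    return bool(np.allclose(total, 1.0))

L1 = 256                 # hypothetical short-window length
L = L1 * 3 // 2          # long-window length from L = L1 * 3/2
x1 = x2 = sine_window(L1)
print(satisfies_condition_13(x1, x2, L))
```

With L = 3/2 · L1 the shift L/3 equals half a short window, so the check reduces to the familiar power-complementarity of the overlapping halves of two consecutive short windows.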
(c) To switch without distortion between the equal-bandwidth filter bank and the multiresolution filter bank, a transition filter bank is needed. The transition filter bank may be an equal-bandwidth cosine-modulated filter bank or a multiresolution filter bank based on cosine modulation. Its windows must be determined from the windows used by the equal-bandwidth filter bank and the multiresolution filter bank so that the system satisfies the perfect reconstruction condition.
In addition, using the aliasing properties of cosine-modulated filters, the aliasing between the transition filter bank and its neighbouring filters must cancel, which guarantees perfect reconstruction of the whole analysis and synthesis system. The relationship between transition filter 603 and its neighbouring filters 601 and 605 is shown in Fig. 6.
(d) When different fast-varying frames use different analysis and synthesis windows, a multiresolution transition filter bank based on cosine modulation is needed when switching between frames with different windows. The windows of this transition filter bank must be determined from the windows used by the preceding and following filter banks, and the analysis windows of the filter banks of different resolution must satisfy the constraint in (b), so that the system satisfies the perfect reconstruction condition.
The multiresolution filter structure of the invention is therefore built on cosine-modulated filter bank (including MDCT) techniques and is realized by making the analysis and synthesis windows of the filter banks of different resolution satisfy the window constraints. (These windows are sometimes called analysis or synthesis prototype filters; here a prototype filter is the baseband filter that is modulated to obtain the other filters of the bank.)
When designing the encoder and decoder, several cosine-modulated filter bank configurations of different time-frequency resolution can be designed according to the characteristics of the signal, so that the resulting multiresolution time-frequency partition represents the signal efficiently and exploits the auditory characteristics of the human ear. The analysis and synthesis windows of these cosine-modulated filter banks of different time resolution must all satisfy the window constraints above, to guarantee the perfect reconstruction property of the filter structure.
Example
The following example illustrates one specific implementation of the invention and does not limit the scope of the claims, since a skilled researcher or engineer can realize similar designs on the basis of the invention.
The implementation platform is shown in Fig. 8. The input audio signal is sampled at 44.1 kHz, and the sampled signal is divided into frames of 1024 samples each (about 23.22 ms). First, the coding block type of the current frame 801 is determined from the transient measure of the current signal, and a different filter bank 805 structure is used for each block type. According to the selected filter bank configuration 803, the psychoacoustic model exploits the masking effect of the human auditory system to remove imperceptible content from the input frame and at the same time determines the bit budget 807 for coding the current frame. The filter bank then performs the time-frequency mapping 805. Finally, the preprocessed data are quantized 809 and coded 811 (the quantization and coding methods correspond to the selected transform configuration), and the index values and side information are packed into the bit stream 811. The implementation of the filter bank and of the switching method is described by the following steps:
Step 1. Split the input audio data into frames of 1024 samples.
Step 2. Evaluate the transient measure of the current input frame:
$$Z = \left(\sum_{j=1}^{1024}\left[\left|s_j - \frac{1}{1024}\sum_{i=1}^{1024}s_i\right|\right]^{2} + 0.618\right) \Big/ \sum_{j=1}^{1024}\left|s_j\right|^{2} + 0.618$$
Step 3. Determine the filter bank structure of the current frame from the value of Z and the history information.
Depending on the type of the current signal, this filter bank uses four filter structures, as follows:
Stationary block SMOOTH_TYPE,
Fast-varying block QUICK_TYPE,
Start block RAISE_TYPE,
Stop block STOP_TYPE.
The fast-varying block uses a 1152-point multiresolution time-frequency transform based on the MDCT technique, the stationary block uses a 2048-point MDCT, the start block uses a 1024-point MDCT together with a 1152-point multiresolution time-frequency transform, and the stop block uses a 1024-point MDCT.
When the signal type changes in the order "stationary → fast-varying → stationary", the filter bank switches as shown in Figure 9, which contains two stationary blocks 901 and 909, one fast-varying block 905, one start block 903 and one stop block 907.
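The block order of Figure 9 can be reproduced with a small state machine. This is only a sketch: the transition rule and the per-frame transient flag are assumptions made for illustration, since the text states only that the structure is chosen from Z and historical information. The identifiers SMOOTH_TYPE, QUICK_TYPE, RAISE_TYPE and STOP_TYPE come from the description; `next_block_type` and `block_sequence` are hypothetical names.

```python
from enum import Enum

class BlockType(Enum):
    SMOOTH_TYPE = "stationary"
    QUICK_TYPE = "fast-varying"
    RAISE_TYPE = "start"
    STOP_TYPE = "stop"

def next_block_type(prev, transient):
    """Pick the block type of the current frame (assumed rule).

    prev: block type of the previous frame (the "historical information").
    transient: True if the transient measure flags the current frame.
    """
    if prev is BlockType.RAISE_TYPE:
        return BlockType.QUICK_TYPE  # a start block always leads into a fast-varying block
    if prev is BlockType.QUICK_TYPE:
        return BlockType.QUICK_TYPE if transient else BlockType.STOP_TYPE
    # prev is SMOOTH_TYPE or STOP_TYPE
    return BlockType.RAISE_TYPE if transient else BlockType.SMOOTH_TYPE

def block_sequence(flags, start=BlockType.SMOOTH_TYPE):
    """Map a list of per-frame transient flags to block types."""
    seq, prev = [], start
    for flag in flags:
        prev = next_block_type(prev, flag)
        seq.append(prev)
    return seq

if __name__ == "__main__":
    for block in block_sequence([False, True, False, False, False]):
        print(block.name)
```

With a single flagged frame the sketch yields SMOOTH_TYPE, RAISE_TYPE, QUICK_TYPE, STOP_TYPE, SMOOTH_TYPE, which is the order of blocks 901, 903, 905, 907 and 909.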
The filter structure of the fast-varying frame is shown in Figure 10: the low-frequency components are filtered by a filter bank 1003 with higher frequency resolution, and the high-frequency components by a filter bank 1005 with higher time resolution. The resulting time-frequency partition of the signal frame is shown in Figure 11, where:
Coefficients 1 to 96 have a time resolution of 2048/fs s and a frequency resolution of fs/2048 Hz 603;
Coefficients 97 to 1024 have a time resolution of 256/fs s and a frequency resolution of fs/256 Hz 601.
The time-frequency partitions and filter structures of the other block types can be obtained in a similar way.
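To put numbers on the two resolutions, the following sketch evaluates 2048/fs, fs/2048, 256/fs and fs/256 at the 44.1 kHz sampling rate used in this example. The helper name `resolutions` is hypothetical.

```python
def resolutions(transform_len, fs):
    """Return (time resolution in seconds, frequency resolution in Hz)."""
    return transform_len / fs, fs / transform_len

if __name__ == "__main__":
    fs = 44100.0  # sampling rate of the example platform
    for label, length in (("coefficients 1 to 96", 2048),
                          ("coefficients 97 to 1024", 256)):
        t, f = resolutions(length, fs)
        print(f"{label}: {t * 1e3:.2f} ms, {f:.2f} Hz")
```

At fs = 44.1 kHz the low band has a resolution of about 46.44 ms and 21.53 Hz, and the high band about 5.80 ms and 172.27 Hz, so the high band is eight times finer in time and eight times coarser in frequency.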
During encoding, note that once the filter bank structure has been selected, the filter bank must handle its data buffers correctly so that the data are not misaligned when the filter bank switches.

Claims (10)

1. A signal-adaptive multiresolution filter bank for perceptual audio coding, characterized in that: the filter bank uses cosine-modulated filtering to construct filter structures with several different time-frequency partitions, and switches among them adaptively according to the input signal.
2. The signal-adaptive filter bank according to claim 1, characterized in that: the filter bank structure used for coding is switched adaptively according to the transient measure of the current signal frame,
$$Z = \left( \sum_{j=1}^{N} \left| s_j - \frac{1}{N} \sum_{j=1}^{N} s_j \right|^2 + \lambda \right) \Big/ \sum_{j=1}^{N} |s_j|^2 + \lambda$$
as follows:
for slowly varying signals, an equal-bandwidth cosine-modulated filter bank is used;
for fast-varying signals, a multiresolution time-frequency-partition filter bank based on cosine modulation is used;
for transitional signals, an equal-bandwidth cosine-modulated filter bank is used.
3. The signal-adaptive filter bank according to claim 2, characterized in that: the multiresolution time-frequency-partition filter bank based on cosine modulation can, according to
$$h_k(n) = p_a(n) \sqrt{\frac{2}{M}} \cos\left( \frac{\pi}{M} (k + 0.5) \left( n + \frac{M+1}{2} \right) \right) \qquad (5)$$
$$f_k(n) = p_s(n) \sqrt{\frac{2}{M}} \cos\left( \frac{\pi}{M} (k + 0.5) \left( n + \frac{M+1}{2} \right) \right) \qquad (6)$$
where 0≤k<M-1, 0≤n<2KM-1 and K is an integer greater than zero, construct multiresolution filter structures that satisfy different performance requirements.
4. The signal-adaptive filter bank according to claim 3, characterized in that: the multiresolution filter structure applies a time-frequency transform/filtering to the signal under analysis, and can map a time-domain signal to a time-frequency-domain signal whose time and frequency resolution are dynamically adjustable.
5. The signal-adaptive filter bank according to claim 2, characterized in that: with the multiresolution filter structures that satisfy different performance requirements, the filter structure can be adjusted adaptively according to the statistical properties, masking properties and/or time-frequency properties of the current signal.
6. The signal-adaptive filter bank according to claim 2, characterized in that: when transforming/filtering the input signal, the multiresolution time-frequency-partition filter bank based on cosine modulation uses cosine-modulated filter banks of different time-frequency resolution in different frequency intervals, which yields a multiresolution time-frequency partition and makes the system satisfy perfect reconstruction, the perfect-reconstruction conditions being:
$$\sum_{m=0}^{2K-1-2s} p_s(mM + n)\, p_a((m + 2s)M + n) = \delta(s) \qquad (10)$$
$$\sum_{m=0}^{2K-1-2s} (-1)^m\, p_s(mM + n)\, p_a((m + 2s)M + (M - n - 1)) = 0 \qquad (11)$$
where s = 0, ..., K-1 and n = 0, ..., M-1.
7. The signal-adaptive filter bank according to claim 2, characterized in that: the encoding and decoding structure of the equal-bandwidth cosine-modulated filter bank comprises:
a cosine-modulated filter bank for filtering slowly varying signals;
a transitional cosine-modulated filter bank for switching from the slowly-varying-signal filter bank to the fast-varying-signal filter bank;
a transitional cosine-modulated filter bank for switching from the fast-varying-signal filter bank to the slowly-varying-signal filter bank;
multiresolution filtering based on the cosine-modulated filtering technique for filtering fast-varying signals, and/or transitional filter banks for switching between different fast-varying-signal filter structures; the transitional filter banks are used to guarantee the perfect-reconstruction property when the filter bank is switched.
8. The signal-adaptive filter bank according to claim 2, characterized in that: the multiresolution time-frequency-partition filter bank based on cosine modulation uses the cosine-modulated filter bank technique
$$h_k(n) = 2 p_a(n) \cos\left( \frac{\pi}{M} (k + 0.5) \left( n - \frac{D}{2} \right) + \theta_k \right) \qquad (1)$$
n = 0, 1, …, N_h - 1
$$f_k(n) = 2 p_s(n) \cos\left( \frac{\pi}{M} (k + 0.5) \left( n - \frac{D}{2} \right) - \theta_k \right) \qquad (2)$$
n = 0, 1, …, N_f - 1
where 0≤k<M-1, 0≤n<2KM-1, K is an integer greater than zero and $\theta_k = (-1)^k \frac{\pi}{4}$, to construct a specific multiresolution analysis structure that meets the requirements of compressing audio signals by exploiting the statistical redundancy of the signal and the masking properties of the human auditory system.
9. The signal-adaptive filter bank according to claim 2, characterized in that: in the multiresolution filter bank based on cosine modulation, the low-frequency components use a cosine-modulated filter bank with higher frequency resolution and the high-frequency components use a cosine-modulated filter bank with higher time resolution, so that the resulting transform/filter coefficients have a multiresolution structure.
10. The signal-adaptive filter bank according to claim 9, characterized in that: in the multiresolution filter bank technique composed of cosine-modulated filter banks of different resolutions, the cosine-modulated filter banks of different resolutions must satisfy the window constraint, namely that within a given coding time segment the sequential sum of squares of the analysis windows of the higher-time-resolution cosine-modulated filter bank equals the square of the analysis window of the higher-frequency-resolution cosine-modulated filter bank, so that the adaptive filtering structure of the whole system guarantees perfect reconstruction.
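The perfect-reconstruction property that the claims rely on can be illustrated with a small lapped transform built from equation (5). This is a sketch under stated assumptions, not the claimed implementation: it takes K = 1, a sine window for both p_a and p_s, a scale factor of sqrt(2/M) in equations (5) and (6), and treats h_k(n) as basis functions applied to overlapping 2M-sample segments with a hop of M. The names `mlt_basis`, `analyze` and `synthesize` are hypothetical.

```python
import math

def mlt_basis(M):
    """Basis h_k(n) of equation (5) for K = 1 with an assumed sine window."""
    p = [math.sin(math.pi * (n + 0.5) / (2 * M)) for n in range(2 * M)]
    c = math.sqrt(2.0 / M)
    return [[p[n] * c * math.cos(math.pi / M * (k + 0.5) * (n + (M + 1) / 2))
             for n in range(2 * M)] for k in range(M)]

def analyze(x, basis, M):
    """M coefficients per 2M-sample segment, with a hop of M samples."""
    blocks = []
    for start in range(0, len(x) - 2 * M + 1, M):
        seg = x[start:start + 2 * M]
        blocks.append([sum(s * h for s, h in zip(seg, row)) for row in basis])
    return blocks

def synthesize(blocks, basis, M, length):
    """Overlap-add reconstruction using the same basis (f_k = h_k here)."""
    y = [0.0] * length
    for b, coeffs in enumerate(blocks):
        for n in range(2 * M):
            y[b * M + n] += sum(c * basis[k][n] for k, c in enumerate(coeffs))
    return y

if __name__ == "__main__":
    M = 8
    x = [math.sin(0.3 * n) + 0.5 * math.cos(1.7 * n) for n in range(6 * M)]
    basis = mlt_basis(M)
    y = synthesize(analyze(x, basis, M), basis, M, len(x))
    # Only samples covered by two overlapping segments are reconstructed.
    print(max(abs(a - b) for a, b in zip(x[M:-M], y[M:-M])))
```

The first and last M samples are covered by a single segment and therefore still contain time-domain aliasing; every interior sample is recovered to rounding error.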
CNA031485154A 2003-07-01 2003-07-01 Signal adaptive multi resolution wave filter set for sensing audio encoding Pending CN1471236A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNA031485154A CN1471236A (en) 2003-07-01 2003-07-01 Signal adaptive multi resolution wave filter set for sensing audio encoding

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNA031485154A CN1471236A (en) 2003-07-01 2003-07-01 Signal adaptive multi resolution wave filter set for sensing audio encoding

Publications (1)

Publication Number Publication Date
CN1471236A true CN1471236A (en) 2004-01-28

Family

ID=34156265

Family Applications (1)

Application Number Title Priority Date Filing Date
CNA031485154A Pending CN1471236A (en) 2003-07-01 2003-07-01 Signal adaptive multi resolution wave filter set for sensing audio encoding

Country Status (1)

Country Link
CN (1) CN1471236A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101290774B (en) * 2007-01-31 2011-09-07 广州广晟数码技术有限公司 Audio encoding and decoding system
CN101930740B (en) * 2004-11-02 2012-05-30 杜比国际公司 Multichannel audio signal decoding using de-correlated signals
CN101615393B (en) * 2008-06-25 2013-01-02 汤姆森许可贸易公司 Method and apparatus for encoding or decoding a speech and/or non-speech audio input signal
CN112968688A (en) * 2021-02-10 2021-06-15 西南电子技术研究所(中国电子科技集团公司第十研究所) Method for realizing digital filter with selectable pass band

Similar Documents

Publication Publication Date Title
Srinivasan et al. High-quality audio compression using an adaptive wavelet packet decomposition and psychoacoustic modeling
CN1210689C (en) Improved spectral translation/folding in subband domain
CN1181467C (en) Enhancing perceptual performance of SBR and related HFR coding methods by adaptive noise-floor addition and noise substitution limiting
CN1272911C (en) Audio signal decoding device and audio signal encoding device
CN101521014B (en) Audio bandwidth expansion coding and decoding devices
CN1527995A (en) Encoding device and decoding device
CN1708787A (en) Method for encoding digital audio using advanced psychoacoustic model and apparatus thereof
CN102473417A (en) Band enhancement method, band enhancement apparatus, program, integrated circuit and audio decoder apparatus
CN1310210C (en) Audio coding system using characteristics of a decoded signal to adapt synthesized spectral components
KR100472442B1 (en) Method for compressing audio signal using wavelet packet transform and apparatus thereof
JP2004206129A (en) Improved method and device for audio encoding and/or decoding using time-frequency correlation
CN1460992A (en) Low-time-delay adaptive multi-resolution filter group for perception voice coding/decoding
Kumar et al. The optimized wavelet filters for speech compression
He et al. An enhanced psychoacoustic model based on the discrete wavelet packet transform
CN1471236A (en) Signal adaptive multi resolution wave filter set for sensing audio encoding
CN1862969A (en) Adaptive block length, constant converting audio frequency decoding method
CN1388517A (en) Audio coding/decoding technology based on pseudo wavelet filtering
CN1123865C (en) Block effect eliminating method in wavelet voice frequency signal processing
Manohar et al. Audio compression using daubechie wavelet
He et al. Psychoacoustic Music Analysis Based on the Discrete Wavelet Packet Transform.
Aloui et al. Optimized speech compression algorithm based on wavelets techniques and its real time implementation on DSP
WO2011052221A1 (en) Encoder, decoder and methods thereof
Gunjal et al. Traditional Psychoacoustic Model and Daubechies Wavelets for Enhanced Speech Coder Performance
CN1127054C (en) Signal processing method and flexible filter for perception audio encoding
Luo et al. High quality wavelet-packet based audio coder with adaptive quantization

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
PP01 Preservation of patent right

Effective date of registration: 20051209

Pledge (preservation): Preservation

PP01 Preservation of patent right
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Open date: 20040128