CN101388213B - Preecho control method - Google Patents
- Publication number
- CN101388213B
- Authority
- CN
- China
- Prior art keywords
- transient
- planarization
- data block
- transition
- frame
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Abstract
The invention discloses a pre-echo control method that applies time-domain planarization (flattening) to the original signal according to the transient positions and transient intensities detected in it. The method comprises the following steps: the audio frame data to be encoded is divided into a plurality of data blocks; the transient intensity of each data block is calculated; each data block whose transient intensity exceeds a threshold is marked as a transient data block; the transient start block, i.e. the transient start position, is marked; and redundant transient positions are eliminated according to the masking effect of the human ear. A time-domain planarization curve is then drawn according to the transient intensity; if there are several transient positions, their curves are combined into a single time-domain planarization curve; the curve is aligned with the transient position; and the frame signal is windowed with it to flatten the signal. Compared with conventional schemes, the method suppresses noise directly at the transient position and thereby effectively controls the pre-echo phenomenon.
Description
Technical field
The invention belongs to the technical field of digital audio processing and specifically relates to a new pre-echo control method and device.
Background art
In audio coding, pre-echo distortion has always been a rather stubborn problem. It becomes more noticeable and more severe at low bit rates, that is, at high compression ratios. The main cause of pre-echo distortion is insufficient time resolution, which lets quantization noise spread in the time domain. In particular, when a transient signal is transformed (or filtered) block by block into the frequency domain and quantized there, the quantization noise spreads over the whole transform block (or filter-bank window); if the signal cannot mask it, pre-echo appears. Fig. 1 shows the waveform of a signal distorted by pre-echo: distinct quantization noise appears before the onset of the burst, and the human ear is very sensitive to this kind of distortion.
Besides the intensity of the quantization noise, the temporal masking of the human ear also plays an important role in how strongly pre-echo distortion affects sound quality. Temporal masking takes two forms: pre-masking, which acts before the masking sound, and post-masking, which acts after it. Pre-masking can last up to 20 ms, but is generally considered effective only within 0.5 to 2 ms. Post-masking lasts longer, up to about 200 ms, and is generally considered effective within 10 to 50 ms. Because post-masking acts for longer, quantization noise after a transient is usually masked well and does not affect subjective sound quality, so perceptual audio coders rarely need to consider it. Pre-masking is much weaker, so the time-domain profile of the quantization noise must be designed carefully to stay below the pre-masking level of the ear; the pre-echo distortion is then inaudible and transparent coding quality is preserved.
To suppress this quantization noise, nearly all modern mainstream audio coding standards handle transient signals by switching between long and short windows: a shorter transform block is used for transients, which improves time resolution and limits the time-domain spread of frequency-domain quantization noise. In addition, each standard adopts further measures against pre-echo: MPEG-1 Layer 3 and MPEG-2 use bit-rate control, MPEG-2 AAC uses temporal noise shaping and gain control, and AC-3 uses exponent/mantissa coding and channel coupling.
The AVS audio standard does not use window switching; it uses a long-window transform throughout. When the signal is transient, however, the coefficients produced by the time-frequency transform additionally undergo a frequency-domain multi-resolution analysis, i.e. a hybrid filter bank is used to suppress pre-echo and improve coding efficiency. The computational complexity and precision requirements of this method are both very high, which seriously reduces codec speed and in particular the real-time performance of the decoder.
Current mainstream encoders process transient signals in units of blocks. Although these methods can detect that a signal is transient, none of them accurately locates the position and intensity of the transient, and none processes the signal according to that position and intensity. Noticeable pre-echo can therefore still occur in some cases.
Summary of the invention
In view of the above technical problems, the invention proposes a new pre-echo control method. It starts from the basic cause of pre-echo (a time-domain transient signal needs high time resolution) and exploits the fact that the human ear is insensitive to the details of high-frequency signals, so that some high-frequency detail can be discarded without being perceived. By locating the position of the time-domain transient and flattening the signal in the time domain according to the intensity of that transient, the method suppresses noise directly at the transient position and thereby effectively controls the pre-echo phenomenon.
The method divides the audio frame data to be encoded into a plurality of data blocks. In the AVS audio coding standard a frame is 1024 samples long and lasts 23.22 ms at a 44.1 kHz sampling rate; taking 32 samples as one data block, each data block lasts approximately 0.7 ms;
The transient intensity of each data block is calculated;
Each data block whose transient intensity exceeds a threshold is marked as a transient data block;
The transient start block, i.e. the transient start position, is marked, and redundant transient positions are rejected according to the masking effect of the human ear;
A time-domain planarization curve is drawn according to the transient intensity, which comprises the following steps:
The minimum value $C_{\min}$ to which the planarization curve decays is determined from the transient intensity $C(k)$ obtained above, where $T_c$ is the transient intensity threshold used for transient block detection:

$$
C_{\min} =
\begin{cases}
1/2, & T_c < C(k) < 4 \\
1/4, & 4 \le C(k) < 8 \\
1/8, & 8 \le C(k) < 16 \\
1/16, & 16 \le C(k) < 32 \\
1/32, & 32 \le C(k)
\end{cases}
$$
If the k-th data block is the first transient block in the frame, the planarization curve $y(x)$ is calculated as follows:

$$
y(x) = 1, \quad x = 0, 1, \ldots, 32(k-1)-1
$$

$$
y(x) = C_{\min}, \quad x = 32k, 32k+1, \ldots, 1024-1
$$
The starting point of the planarization curve y(x) is aligned with the start position of the transient block, i.e. the curve begins at the start of the transient block;
If there are several transient positions, the time-domain planarization curves are combined;
The planarization curve is aligned with the transient position, and the frame signal is windowed with it and thereby flattened.
The step of marking data blocks whose transient intensity exceeds the threshold as transient data blocks further comprises marking the frame as a transient frame.
Compared with existing mainstream encoders, the invention accurately locates the position and intensity of a transient and processes the signal according to both. The new pre-echo control method starts from the basic cause of pre-echo (a time-domain transient signal needs high time resolution) and exploits the insensitivity of the human ear to the details of high-frequency signals (so that some high-frequency detail can be discarded without being perceived). By locating the position of the time-domain transient and flattening the signal in the time domain according to the intensity of that transient, it suppresses noise directly at the transient position and thereby effectively controls the pre-echo phenomenon.
Description of drawings
Fig. 1 shows the pre-echo phenomenon;
Fig. 2 is a flow chart of the invention;
Fig. 3 is the waveform of the original castanets sequence;
Fig. 4 is the waveform of the castanets sequence after processing with the multi-resolution analysis method on the AVS audio codec platform;
Fig. 5 is the waveform of the castanets sequence after processing with the new method on the AVS audio codec platform.
Embodiment
The technical solution of the invention is further described below with reference to the drawings and specific embodiments:
Embodiment 1: using the AVS audio codec platform, tests on a typical castanets sequence (mono, 44.1 kHz sampling rate, 16-bit sample precision, encoding bit rate 32 kbps/ch) compared how well the invention and the hybrid filter-bank method adopted by the AVS audio standard control pre-echo distortion.
The embodiment is illustrated as follows. In the AVS audio encoder, a frame is 1024 samples long (PCM[i], i = 0, 1, ..., 1023). At a 44.1 kHz sampling rate a frame lasts about 23.22 ms. Taking 32 samples as one data block, each data block lasts about 0.7 ms and each frame contains 32 data blocks. Let P(k), k = 1, 2, ..., 32, be the total power of the samples in data block k.
Definition 1: the transient intensity $C(k)$ is the ratio of the power of data block k to that of the preceding data block, $C(k) = P(k)/P(k-1)$.
Definition 2: a data block whose transient intensity $C(k)$ exceeds a threshold $T_c$ is called a transient block.
Step 1: block division. Following the definition above, the signal in a frame is divided into data blocks of 32 consecutive samples each.
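The block division of Step 1 can be sketched as follows. This is a minimal illustration, assuming the frame is held as a plain Python list of PCM samples; the names `split_into_blocks`, `FRAME_LEN`, `BLOCK_LEN` and `SAMPLE_RATE` are hypothetical and are not taken from the AVS standard.

```python
FRAME_LEN = 1024      # samples per frame (value given in the text)
BLOCK_LEN = 32        # samples per data block (value given in the text)
SAMPLE_RATE = 44100   # Hz

def split_into_blocks(frame, block_len=BLOCK_LEN):
    """Split one frame into consecutive, non-overlapping blocks."""
    if len(frame) % block_len != 0:
        raise ValueError("frame length must be a multiple of the block length")
    return [frame[i:i + block_len] for i in range(0, len(frame), block_len)]

frame = list(range(FRAME_LEN))            # placeholder samples, not real audio
blocks = split_into_blocks(frame)

frame_ms = 1000.0 * FRAME_LEN / SAMPLE_RATE
block_ms = 1000.0 * BLOCK_LEN / SAMPLE_RATE
print(len(blocks), round(frame_ms, 2), round(block_ms, 3))  # prints: 32 23.22 0.726
```

The printed values reproduce the figures quoted above: 32 blocks per frame, about 23.22 ms per frame and about 0.7 ms per block.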
Step 2: calculate and test the transient intensity of each data block in the frame.
The transient intensity $C(k)$ of each data block is calculated in turn and compared with the threshold $T_c$. When $C(k)$ is greater than $T_c$ (here $T_c = 2$), the data block is regarded as a transient block.
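A sketch of the detection in Step 2 is given below. It rests on several assumptions that the text does not spell out: the block power P(k) is taken as the sum of squared samples, the first block (which has no predecessor inside the frame) gets no intensity value, and a small `eps` floor avoids division by zero for silent blocks. All function names are hypothetical.

```python
def block_powers(frame, block_len=32):
    """Total power of each block (ASSUMPTION: sum of squared samples)."""
    return [sum(s * s for s in frame[i:i + block_len])
            for i in range(0, len(frame), block_len)]

def transient_intensities(powers, eps=1e-12):
    """C(k) = P(k) / P(k-1); entry 0 (block k = 1) is None because it has
    no preceding block in the frame. eps is an assumed guard against P = 0."""
    c = [None]
    for idx in range(1, len(powers)):
        c.append(powers[idx] / max(powers[idx - 1], eps))
    return c

def detect_transient_blocks(intensities, threshold=2.0):
    """Return the 1-based indices k of blocks with C(k) > T_c."""
    return [idx + 1 for idx, c in enumerate(intensities)
            if c is not None and c > threshold]

# Synthetic frame: a quiet signal with a sharp attack starting at block k = 10.
frame = [1.0] * (32 * 9) + [8.0] * (1024 - 32 * 9)
powers = block_powers(frame)
C = transient_intensities(powers)
transient_blocks = detect_transient_blocks(C)
is_transient_frame = bool(transient_blocks)
print(transient_blocks, C[9])  # prints: [10] 64.0
```

In this synthetic frame the amplitude rises by a factor of 8 at block 10, so the power ratio is 64 and only that block exceeds the threshold of 2.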
Step 3: mark all transient blocks in the frame as transient positions, and mark the frame as a transient frame.
Step 4: reject redundant transient positions.
Pre-masking generally lasts up to 20 ms, but is usually considered effective only within 0.5 to 2 ms. A transient occurring in any of the three data blocks k = 2, 3, 4 therefore need not be considered, because only about 2 ms of signal precedes the transient position and it is masked by pre-masking. Post-masking lasts longer, about 200 ms, with an effective time generally taken as about 20 ms, so once the first transient position has been detected, the remaining transient positions can be rejected.
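One possible reading of Step 4 is sketched below. It assumes that transients in blocks 2 to 4 are dropped first and that the earliest remaining transient is the one kept; the text does not state the order of these two rules, so this ordering and the name `reject_redundant` are assumptions.

```python
def reject_redundant(transient_blocks, skip_blocks=(2, 3, 4)):
    """Keep at most one transient position per frame (1-based block indices).

    Blocks 2..4 are ignored because the roughly 2 ms of signal before them
    is covered by masking; of the remaining positions only the first is
    kept, since the rest of the frame lies within the masking that follows it.
    """
    candidates = [k for k in sorted(transient_blocks) if k not in skip_blocks]
    return candidates[:1]

kept = reject_redundant([3, 10, 17, 25])
print(kept)                      # prints: [10]
print(reject_redundant([2, 4]))  # prints: []
```

An empty result means no transient position needs treatment in that frame under this reading.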
Step 5: time-domain planarization.
The minimum value $C_{\min}$ to which the planarization curve decays is determined from the transient intensity $C(k)$ obtained above, as follows, where $T_c$ is the transient intensity threshold used for transient block detection:

$$
C_{\min} =
\begin{cases}
1/2, & T_c < C(k) < 4 \\
1/4, & 4 \le C(k) < 8 \\
1/8, & 8 \le C(k) < 16 \\
1/16, & 16 \le C(k) < 32 \\
1/32, & 32 \le C(k)
\end{cases}
$$
The planarization curve $y(x)$ is calculated as follows (assuming the k-th data block is the first transient block in the frame):

$$
y(x) = 1, \quad x = 0, 1, \ldots, 32(k-1)-1
$$

$$
y(x) = C_{\min}, \quad x = 32k, 32k+1, \ldots, 1024-1
$$
The frame signal is windowed with $y(x)$, which suppresses the amplitude of the transient part of the signal and flattens it.
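The selection of the attenuation floor and the construction of the curve can be sketched as follows. The two formulas above leave the samples inside block k itself (x from 32(k-1) to 32k-1) unspecified; the sketch fills that range with a linear ramp from 1 down to the floor, which is purely an assumption made for illustration. The function names are hypothetical.

```python
def c_min_for(c_k, t_c=2.0):
    """Attenuation floor C_min selected from the transient intensity C(k)."""
    if c_k <= t_c:
        raise ValueError("block is not a transient block")
    if c_k < 4:
        return 1 / 2
    if c_k < 8:
        return 1 / 4
    if c_k < 16:
        return 1 / 8
    if c_k < 32:
        return 1 / 16
    return 1 / 32

def planarization_curve(k, c_min, frame_len=1024, block_len=32):
    """y(x) = 1 before block k and C_min after block k (k is 1-based).

    ASSUMPTION: inside block k the source gives no values, so a linear
    ramp from 1 to C_min is used here.
    """
    start, end = block_len * (k - 1), block_len * k
    y = [1.0] * frame_len
    for x in range(start, end):
        t = (x - start + 1) / block_len   # equals 1.0 at the last sample of block k
        y[x] = 1.0 + t * (c_min - 1.0)
    for x in range(end, frame_len):
        y[x] = c_min
    return y

def apply_curve(frame, y):
    """Window the frame sample by sample with the curve."""
    return [s * g for s, g in zip(frame, y)]

y = planarization_curve(k=10, c_min=c_min_for(64.0))
flattened = apply_curve([1.0] * 288 + [8.0] * 736, y)
print(y[0], y[287], y[320], flattened[-1])  # prints: 1.0 1.0 0.03125 0.25
```

With the synthetic attack used here, the loud part of the frame is scaled from 8.0 to 0.25, below the level of the quiet part that precedes it.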
Step 6: pack the transient information, such as the transient frame flag, the transient position and the transient intensity, into the AVS audio bitstream.
All other processing is the same as for steady-state signals. At the decoder, the decoded signal is restored to the original data by applying the inverse operation according to the transient information in the bitstream.
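The encoder and decoder sides of this step can be sketched as a round trip. The sketch assumes that the inverse operation is a division by the same curve, that the side information carries the attenuation floor directly, and that the curve ramps linearly inside the transient block; `TransientInfo` is a hypothetical container and does not reflect the actual AVS bitstream syntax.

```python
from dataclasses import dataclass

@dataclass
class TransientInfo:
    """Hypothetical side information; NOT the real AVS bitstream layout."""
    is_transient: bool
    position: int      # 1-based index k of the first transient block
    c_min: float       # attenuation floor derived from the transient intensity

def curve(info, frame_len=1024, block_len=32):
    """Rebuild y(x) from the side information.
    ASSUMPTION: linear ramp inside block k, which the text leaves open."""
    if not info.is_transient:
        return [1.0] * frame_len
    start = block_len * (info.position - 1)
    y = []
    for x in range(frame_len):
        if x < start:
            y.append(1.0)
        elif x < start + block_len:
            y.append(1.0 + (x - start + 1) / block_len * (info.c_min - 1.0))
        else:
            y.append(info.c_min)
    return y

def flatten(frame, info):
    """Encoder side: window the frame with y(x)."""
    return [s * g for s, g in zip(frame, curve(info))]

def restore(decoded, info):
    """Decoder side: ASSUMED inverse operation, division by y(x)."""
    return [s / g for s, g in zip(decoded, curve(info))]

info = TransientInfo(is_transient=True, position=10, c_min=1 / 8)
original = [1.0] * 288 + [8.0] * 736
restored = restore(flatten(original, info), info)
max_err = max(abs(a - b) for a, b in zip(original, restored))
print(max_err < 1e-9)  # prints: True
```

In a real codec the quantizer sits between `flatten` and `restore`, so the division also scales the quantization noise: it is left unchanged before the transient and amplified only after it.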
The test results are shown in Figs. 3 to 5. Fig. 3 is the waveform of the original castanets sequence: the transients are pronounced and the noise before each transient is very low. Fig. 4 is the waveform of the castanets sequence after processing with the multi-resolution analysis method on the AVS audio codec platform: the encoded and decoded signal retains good time-domain resolution, and the noise before the transient is suppressed, with no obvious noise appearing. Fig. 5 is the waveform of the castanets sequence after processing with the method of the invention on the AVS audio codec platform: the encoded and decoded signal likewise retains good time-domain resolution, the noise before the transient is effectively suppressed, and the noise is clearly lower than with the multi-resolution analysis method.
Embodiment 2: an embodiment using the Dolby AC-3 audio codec platform on a typical castanets sequence (mono, 44.1 kHz sampling rate, 16-bit sample precision, encoding bit rate 64 kbps/ch).
The embodiment is illustrated as follows. In the AC-3 audio encoder, a frame is 512 samples long (PCM[i], i = 0, 1, ..., 511). At a 44.1 kHz sampling rate a frame lasts about 11.6 ms. Taking 32 samples as one data block, each data block lasts about 0.7 ms and each frame contains 16 data blocks. Let P(k), k = 1, 2, ..., 16, be the total power of the samples in data block k.
Definition 1: the transient intensity $C(k)$ is the ratio of the power of data block k to that of the preceding data block, $C(k) = P(k)/P(k-1)$.
Definition 2: a data block whose transient intensity $C(k)$ exceeds a threshold $T_c$ is called a transient block.
Step 1: block division. Following the definition above, the signal in a frame is divided into data blocks of 32 consecutive samples each.
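The frame and block figures for this embodiment follow directly from the sampling rate. As a quick check (the symbols $N_{\text{blocks}}$, $T_{\text{frame}}$ and $T_{\text{block}}$ are introduced here only for illustration):

$$
N_{\text{blocks}} = \frac{512}{32} = 16, \qquad
T_{\text{frame}} = \frac{512}{44100\ \text{Hz}} \approx 11.61\ \text{ms}, \qquad
T_{\text{block}} = \frac{32}{44100\ \text{Hz}} \approx 0.726\ \text{ms}
$$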
Step 2: calculate and test the transient intensity of each data block in the frame.
The transient intensity $C(k)$ of each data block is calculated in turn and compared with the threshold $T_c$. When $C(k)$ is greater than $T_c$ (here $T_c = 2$), the data block is regarded as a transient block.
Step 3: mark all transient blocks in the frame as transient positions, and mark the frame as a transient frame.
Step 4: reject redundant transient positions.
Pre-masking generally lasts up to 20 ms, but is usually considered effective only within 0.5 to 2 ms. A transient occurring in any of the three data blocks k = 2, 3, 4 therefore need not be considered, because only about 2 ms of signal precedes the transient position and it is masked by pre-masking. Post-masking lasts longer, about 200 ms, with an effective time generally taken as about 20 ms, so once the first transient position has been detected, the remaining transient positions can be rejected.
Step 5: time-domain planarization.
The minimum value $C_{\min}$ to which the planarization curve decays is determined from the transient intensity $C(k)$ obtained above, as follows, where $T_c$ is the transient intensity threshold used for transient block detection:

$$
C_{\min} =
\begin{cases}
1/2, & T_c < C(k) < 4 \\
1/4, & 4 \le C(k) < 8 \\
1/8, & 8 \le C(k) < 16 \\
1/16, & 16 \le C(k) < 32 \\
1/32, & 32 \le C(k)
\end{cases}
$$
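The planarization curve $y(x)$ is calculated as follows (assuming the k-th data block is the first transient block in the frame; the upper limit is 512-1 because a frame in this embodiment has 512 samples):

$$
y(x) = 1, \quad x = 0, 1, \ldots, 32(k-1)-1
$$

$$
y(x) = C_{\min}, \quad x = 32k, 32k+1, \ldots, 512-1
$$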
The frame signal is windowed with $y(x)$, which suppresses the amplitude of the transient part of the signal and flattens it.
Step 6: pack the transient information, such as the transient frame flag, the transient position and the transient intensity, into the encoded audio bitstream.
All other processing is the same as for steady-state signals. At the decoder, the decoded signal is restored to the original data by applying the inverse operation according to the transient information in the bitstream.
Claims (2)
1. A pre-echo control method that applies time-domain planarization to an original signal according to the transient positions and transient intensities detected in the transient signal, the method comprising the following steps:
dividing the audio frame data to be encoded into a plurality of data blocks, wherein in the AVS audio coding standard a frame is 1024 samples long and lasts 23.22 milliseconds (ms) at a 44.1 kHz sampling rate, and 32 samples form one data block, so that each data block lasts approximately 0.7 ms;
calculating the transient intensity of each data block;
marking each data block whose transient intensity exceeds a threshold as a transient data block;
marking the transient start block, i.e. the transient start position, and rejecting redundant transient positions according to the masking effect of the human ear;
drawing a time-domain planarization curve according to the transient intensity, which comprises the following steps:
determining, from the transient intensity $C(k)$ obtained above, the minimum value $C_{\min}$ to which the planarization curve decays, where $T_c$ is the transient intensity threshold used for transient block detection:

$$
C_{\min} =
\begin{cases}
1/2, & T_c < C(k) < 4 \\
1/4, & 4 \le C(k) < 8 \\
1/8, & 8 \le C(k) < 16 \\
1/16, & 16 \le C(k) < 32 \\
1/32, & 32 \le C(k)
\end{cases}
$$
if the k-th data block is the first transient block in the frame, calculating the planarization curve $y(x)$ as follows:

$$
y(x) = 1, \quad x = 0, 1, \ldots, 32(k-1)-1
$$

$$
y(x) = C_{\min}, \quad x = 32k, 32k+1, \ldots, 1024-1
$$
aligning the starting point of the planarization curve y(x) with the start position of the audio frame;
if there are several transient positions, combining the time-domain planarization curves;
windowing the frame signal with the planarization curve to flatten it.
2. The pre-echo control method according to claim 1, characterized in that the step of marking data blocks whose transient intensity exceeds the threshold as transient data blocks further comprises marking the frame as a transient frame.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2008100537463A CN101388213B (en) | 2008-07-03 | 2008-07-03 | Preecho control method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2008100537463A CN101388213B (en) | 2008-07-03 | 2008-07-03 | Preecho control method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN101388213A CN101388213A (en) | 2009-03-18 |
CN101388213B true CN101388213B (en) | 2012-02-22 |
Family
ID=40477583
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN2008100537463A Expired - Fee Related CN101388213B (en) | 2008-07-03 | 2008-07-03 | Preecho control method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN101388213B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101908342B (en) * | 2010-07-23 | 2012-09-26 | 北京理工大学 | Method for inhibiting pre-echoes of audio transient signals by utilizing frequency domain filtering post-processing |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1253418A (en) * | 1998-10-29 | 2000-05-17 | 松下电器产业株式会社 | Block size determination used in audio frequency conversion coding and self adapting method |
CN1458646A (en) * | 2003-04-21 | 2003-11-26 | 北京阜国数字技术有限公司 | Filter parameter vector quantization and audio coding method via predicting combined quantization model |
CN1514997A (en) * | 2001-06-08 | 2004-07-21 | 皇家菲利浦电子有限公司 | Editing of audio signals |
US7099830B1 (en) * | 2000-03-29 | 2006-08-29 | At&T Corp. | Effective deployment of temporal noise shaping (TNS) filters |
CN1934619A (en) * | 2004-03-17 | 2007-03-21 | 皇家飞利浦电子股份有限公司 | Audio coding |
Also Published As
Publication number | Publication date |
---|---|
CN101388213A (en) | 2009-03-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Moattar et al. | A simple but efficient real-time voice activity detection algorithm | |
JP2023015055A (en) | Harmonic dependency control for harmonic filter tool | |
Harma et al. | A comparison of warped and conventional linear predictive coding | |
CN1926608B (en) | Device and method for processing a multi-channel signal | |
US8082157B2 (en) | Apparatus for encoding and decoding audio signal and method thereof | |
Liu et al. | Compression artifacts in perceptual audio coding | |
CN1938758B (en) | Method and apparatus for determining an estimate | |
EP1808851A1 (en) | System and method for low power stereo perceptual audio coding using adaptive masking threshold | |
US20100223054A1 (en) | Single-microphone wind noise suppression | |
MX2011000368A (en) | Providing a time warp activation signal and encoding an audio signal therewith. | |
EP3113184B1 (en) | Method and device for voice activity detection | |
EP3739582B1 (en) | Voice detection | |
WO2005055197A3 (en) | Noise suppressor for speech coding and speech recognition | |
CN1997988A (en) | Method of making a window type decision based on MDCT data in audio encoding | |
CN101908342B (en) | Method for inhibiting pre-echoes of audio transient signals by utilizing frequency domain filtering post-processing | |
AU666612B2 (en) | Method and apparatus for encoding/decoding of background sounds | |
EP3614384B1 (en) | Method for estimating noise in an audio signal, noise estimator, audio encoder, audio decoder, and system for transmitting audio signals | |
CN101388213B (en) | Preecho control method | |
CN100492495C (en) | Apparatus and method for detecting noise | |
CN110709926A (en) | Apparatus and method for post-processing audio signals using prediction-based shaping | |
CN101393744A (en) | Method for regulating threshold and detection module | |
CN101271691A (en) | Time-domain noise reshaping instrument start-up judging method and device | |
JP2656069B2 (en) | Voice detection device | |
CN113205826B (en) | LC3 audio noise elimination method, device and storage medium | |
JP2006126372A (en) | Audio signal coding device, method, and program |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
C17 | Cessation of patent right | ||
CF01 | Termination of patent right due to non-payment of annual fee |
Granted publication date: 20120222 Termination date: 20120703 |