CN102280103A

CN102280103A - Audio signal transient-state segment detection method based on variance

Info

Publication number: CN102280103A
Application number: CN201110219767XA
Authority: CN
Inventors: 张涛; 周延献; 张瑞生; 邢东亮
Original assignee: Tianjin University
Current assignee: Tianjin University
Priority date: 2011-08-02
Filing date: 2011-08-02
Publication date: 2011-12-14

Abstract

The invention relates to audio technology and provides an audio signal detection method that effectively reduces redundant detection of low-energy transient segments. The technical scheme is as follows. The variance-based audio signal transient segment detection method comprises the following steps: inputting an audio signal and dividing it into frames of length N; calculating the mean and standard deviation of each frame according to the sample mean and standard deviation formulas; comparing the standard deviation with a threshold to judge whether the frame is a pre-transient frame; dividing the pre-transient frame into short segments of length M; judging the transient/steady state of the frame a second time; determining the transient/steady state of the processed frame; and, if the frame is a transient frame, marking the position at which the transient occurs. The method is mainly used for audio signal detection.

Description

Audio signal transient segment detection method based on variance

Technical field

The present invention relates to audio technology, in particular to audio signal transient segment detection, and specifically to an audio signal transient segment detection method based on variance.

Background technology

In current high-quality, low-bit-rate audio coding, the coding algorithm generally uses block transform processing: a windowing method divides the continuous audio signal into audio data blocks, and each block undergoes time-frequency transformation and quantization coding before being stored or transmitted. Audio coding based on block transforms helps remove signal redundancy and improves the compression ratio. However, in block-based audio coding, pre-echo distortion has always been a difficult problem. The root cause of pre-echo distortion is insufficient temporal resolution, which lets quantization noise spread in the time domain. In particular, when a transient part of the audio signal is transformed to the frequency domain block by block and quantized, the quantization noise spreads over the entire transform block and pre-echo appears. The lower the target bit rate or the higher the compression ratio, the more noticeable the pre-echo.

Current methods for addressing the pre-echo problem mainly include the bit reservoir, temporal noise shaping, long/short block switching, hybrid filter banks, and gain control. All of these presuppose accurate detection of audio signal transient segments. Transient segment detection algorithms fall into three classes: time-domain energy detection, frequency-domain energy detection, and combined time-frequency energy detection. However, these methods fall short in algorithmic complexity, accuracy (missed and false detections), temporal resolution, and practicality. For example, MPEG-2 AAC uses a "perceptual entropy" algorithm to detect transient signals, but its thresholds vary widely, which makes it inconvenient to use. Time-frequency detection algorithms use inter-block energy differences and unpredictability to detect transient segments, but a preceding high-energy transient segment can affect detection of the next one, which causes false detections. The flatness measure detection algorithm remedies these shortcomings to some extent and has practical potential. However, it also detects low-energy transient segments, causing some unnecessary detections.

Summary of the invention

To overcome the deficiencies of the prior art, an audio signal transient segment detection method based on variance is provided, which effectively reduces redundant detection of low-energy transient segments. To this end, the technical scheme adopted by the present invention is an audio signal transient segment detection method based on variance, comprising the following steps:

Step 1: Input an audio signal and divide it into frames of length N. Using the formulas

Sample mean:

\bar{X} = \frac{1}{N} \sum_{i=1}^{N} |X_i| \qquad (1)

Standard deviation:

\sigma = \sqrt{\frac{\sum_{i=1}^{N} (|X_i| - \bar{X})^2}{N - 1}} \qquad (2)

where X_i denotes the i-th sample value in the current block, |X_i| denotes the absolute value of X_i, N is the total number of samples in the current block, and \bar{X} denotes the mean of the sample values in the current block, calculate the mean \bar{X} and the standard deviation \sigma of each frame, then compare \sigma with the threshold SD_TH: if \sigma > SD_TH, the frame is judged to be a pre-transient frame; if \sigma < SD_TH, the frame is judged to be a steady-state frame;

Step 2: Divide the pre-transient frame into short segments of length M; examine each short segment with one of the following detection methods: flatness measure, perceptual entropy, time-domain energy, or frequency-domain energy; and mark the state of each short segment;

Step 3: Based on the states of the short segments obtained in Step 2, make a second decision on the transient/steady state of the frame;

Step 4: Based on the results of Step 1 and Step 3, determine the transient/steady state of the processed frame; if it is a transient frame, mark the position at which the transient occurs.
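The four steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `frame_std`, `detect_transients`, and `toy_detector` are hypothetical, the segment detector is passed in as a callable because the text allows any of several methods, a trailing partial frame is ignored, and a frame whose standard deviation exactly equals the threshold is treated as steady (the text does not specify that case).

```python
import math
from typing import Callable, List, Sequence, Tuple


def frame_std(frame: Sequence[float]) -> float:
    """Standard deviation of |x| over one frame, following formulas (1) and (2)."""
    n = len(frame)
    mags = [abs(x) for x in frame]
    mean = sum(mags) / n                                              # formula (1)
    return math.sqrt(sum((m - mean) ** 2 for m in mags) / (n - 1))    # formula (2)


def detect_transients(
    signal: Sequence[float],
    segment_is_transient: Callable[[Sequence[float]], bool],
    frame_len: int,
    seg_len: int,
    sd_th: float,
) -> List[Tuple[int, List[int]]]:
    """Return (frame index, sample offsets of transient segments) for each transient frame."""
    results = []
    for f in range(len(signal) // frame_len):   # assumption: trailing partial frame is dropped
        frame = signal[f * frame_len:(f + 1) * frame_len]
        # Step 1: frames that do not exceed the threshold are steady and skipped.
        if frame_std(frame) <= sd_th:           # assumption: equality counts as steady
            continue
        # Step 2: examine each short segment of the pre-transient frame.
        hits = [start for start in range(0, frame_len, seg_len)
                if segment_is_transient(frame[start:start + seg_len])]
        # Steps 3 and 4: the frame is transient if any segment was flagged.
        if hits:
            results.append((f, [f * frame_len + start for start in hits]))
    return results


def toy_detector(segment: Sequence[float]) -> bool:
    """Hypothetical stand-in detector: peak magnitude above twice the mean magnitude."""
    mags = [abs(x) for x in segment]
    return max(mags) > 2 * sum(mags) / len(mags)


# Frame 0 is quiet with a small blip; frame 1 contains a large spike.
signal = [0.1, 0.1, 0.1, 0.1, 0.1, 1.0, 0.1, 0.1,
          1.0, 1.0, 1.0, 1.0, 1.0, 20.0, 1.0, 1.0]
print(detect_transients(signal, toy_detector, frame_len=8, seg_len=4, sd_th=2.8))
# [(1, [12])]
```

In this toy run the relative detector would flag the blip in frame 0 on its own, but the standard-deviation gate of Step 1 marks that frame as steady first, which is the redundancy reduction the method aims for.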

The method is further refined as follows: the input audio signal is divided into frames with frame length N = 1024. First, the mean \bar{X} of the samples of a frame is calculated according to formula (1) and the standard deviation \sigma according to formula (2); \sigma is then compared with the threshold SD_TH, where SD_TH = 2.8: if \sigma > SD_TH, the frame is judged to be a pre-transient frame; if \sigma < SD_TH, the frame is judged to be a steady-state frame;

When the frame is a pre-transient frame, it is divided into blocks, and the flatness measure detection technique is used to detect whether each block is a transient segment:

First, the frame data are divided into blocks of length M = 128, and the time-domain flatness measure TFM and the frequency-domain flatness measure FFM are computed for each short segment;

Second, the time-domain flatness decision is made: if TFM > T_TH, where T_TH = 0.12, the segment is judged to be a transient segment and marked. Otherwise the frequency-domain flatness decision is made: if FFM < F_TH, where F_TH = 0.38, the segment is judged to be a transient segment and marked. Otherwise the combined time-frequency decision is made with threshold TF_TH = 1.4: if it is satisfied, the segment is judged to be a transient segment and marked; otherwise the segment is a steady-state segment;

Finally, whether the frame is a transient frame is determined by whether the pre-transient frame contains any transient segment, and the position of each transient segment is marked.
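The decision cascade for a single segment, and the frame-level conclusion drawn from it, can be sketched as below. This is an illustration only: the formulas for TFM, FFM, and the combined time-frequency criterion are not reproduced in this text, so the sketch takes TFM and FFM values as inputs and receives the combined criterion as a hypothetical callable `combined_rule`. The function names and the measure values in the demo are invented for illustration.

```python
from typing import Callable, List, Optional, Sequence, Tuple

T_TH = 0.12   # time-domain flatness threshold given in the text
F_TH = 0.38   # frequency-domain flatness threshold given in the text


def classify_segment(
    tfm: float,
    ffm: float,
    combined_rule: Callable[[float, float], bool],
) -> Optional[str]:
    """Return which rule flagged the segment as transient, or None if it is steady."""
    if tfm > T_TH:                  # time-domain flatness decision
        return "time"
    if ffm < F_TH:                  # frequency-domain flatness decision
        return "frequency"
    if combined_rule(tfm, ffm):     # combined decision (criterion not given here)
        return "combined"
    return None


def classify_frame(
    measures: Sequence[Tuple[float, float]],
    combined_rule: Callable[[float, float], bool],
    seg_len: int = 128,
) -> Tuple[bool, List[int]]:
    """Given (TFM, FFM) per segment, return (frame is transient, offsets of transient segments)."""
    offsets = [i * seg_len
               for i, (tfm, ffm) in enumerate(measures)
               if classify_segment(tfm, ffm, combined_rule) is not None]
    return bool(offsets), offsets


def never(tfm: float, ffm: float) -> bool:
    """Placeholder combined rule that never fires (assumption for the demo)."""
    return False


# Eight segments of 128 samples make up one 1024-sample frame; values are invented.
measures = [(0.05, 0.90), (0.30, 0.90), (0.05, 0.20), (0.05, 0.90),
            (0.05, 0.90), (0.05, 0.90), (0.05, 0.90), (0.05, 0.90)]
print(classify_frame(measures, never))
# (True, [128, 256])
```

Returning the name of the rule that fired, rather than a bare boolean, makes it easier to see which stage of the cascade is responsible for each detection when tuning thresholds.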

The present invention has the following technical effects:

Compared with using an existing detection method on its own, the variance-based general framework for transient segment detection proposed by the invention produces almost no unnecessary detections on low-energy transient segments. It can also take signal segments shorter than a frame as the detection object, which improves temporal resolution and brings the detection result closer to the actual behavior of the signal; this reduces the complexity of subsequent work in the encoder to some extent and improves coding quality. In addition, the framework offers high detection accuracy and a simple algorithm.

Description of drawings

Fig. 1 is the flow chart of transient segment detection based on this method;

Fig. 2 shows transient segment detection test results based on this method: (a) is the time-domain waveform, (b) shows the per-block transient status of the signal detected with the flatness measure, and (c) shows the per-block transient status of the signal detected with the combined standard deviation and flatness measure method.

Embodiment

To address this redundant detection, the present invention proposes a variance-based general framework for detecting audio signal transient segments: when the energy of a whole frame is low but its sample values fluctuate strongly, the variance of the whole frame is used to reduce redundant detection.

The transient phenomena and irregular structures of a signal (such as jumps in the waveform, discontinuities, and rapid oscillations) usually carry most of the signal's information; a signal containing such transient phenomena and irregular structures is generally called a transient signal. Because the frequency domain and the time domain are dual, a signal that changes abruptly in the time domain is flat in its spectrum; conversely, a signal that is flat in the time domain has an abruptly changing spectrum. The range of the flatness measure (FM) therefore gives a good indication of abruptness versus flatness, and hence of whether a signal is transient. However, general frequency-domain transient detection algorithms also detect low-energy transient signals, causing unnecessary detections; the main reason is that a frame whose overall energy is low but whose sample values fluctuate strongly is detected as transient. For the standard deviation, by contrast, the differences between samples in such a frame are comparatively small, so the frame can be excluded by setting an appropriate threshold. Computing the variance over the whole frame therefore adequately solves the problem of low-energy transient segments being detected.
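The time-frequency duality described above can be illustrated numerically. The sketch below compares the spectra of an impulse and a pure tone using a common definition of spectral flatness, the ratio of the geometric mean to the arithmetic mean of the power spectrum. That definition is an assumption made for illustration and is not necessarily the FFM used by the method; the function names and the small constant `eps` are likewise hypothetical.

```python
import cmath
import math
from typing import List, Sequence


def power_spectrum(x: Sequence[float]) -> List[float]:
    """Power spectrum via a direct DFT (adequate for short illustrative signals)."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))) ** 2
            for k in range(n)]


def flatness(values: Sequence[float], eps: float = 1e-12) -> float:
    """Geometric mean divided by arithmetic mean; eps keeps log() defined at zero."""
    vals = [v + eps for v in values]
    geometric = math.exp(sum(math.log(v) for v in vals) / len(vals))
    arithmetic = sum(vals) / len(vals)
    return geometric / arithmetic


n = 64
impulse = [0.0] * n
impulse[10] = 1.0                                              # abrupt in time
tone = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]   # smooth in time

flat_impulse = flatness(power_spectrum(impulse))
flat_tone = flatness(power_spectrum(tone))
print(f"impulse: {flat_impulse:.3f}, tone: {flat_tone:.3e}")
```

The impulse yields a flatness close to 1 because every frequency bin carries the same power, while the tone concentrates its power in two bins and yields a flatness close to 0.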

The standard deviation is a measure of how widely a set of data is dispersed about its mean. A large standard deviation means that most values differ greatly from the mean; a small standard deviation means that the values lie close to the mean.

Sample mean:

\bar{X} = \frac{1}{N} \sum_{i=1}^{N} |X_i| \qquad (1)

Standard deviation:

\sigma = \sqrt{\frac{\sum_{i=1}^{N} (|X_i| - \bar{X})^2}{N - 1}} \qquad (2)

where \bar{X} denotes the mean of the sample values in the current block.
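As a worked example of formulas (1) and (2), the sketch below computes the mean and standard deviation of the absolute sample values for a small frame and for the same frame attenuated by a factor of 10. The sample values and the function name `frame_mean_std` are invented for illustration, and the sample scaling relative to the threshold 2.8 is an assumption. Because scaling a frame by a factor a scales \sigma by |a|, a low-energy frame with the same shape of fluctuation falls below a fixed threshold.

```python
import math
import statistics
from typing import Sequence, Tuple


def frame_mean_std(frame: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, standard deviation) of |x| per formulas (1) and (2)."""
    n = len(frame)
    mags = [abs(x) for x in frame]
    mean = sum(mags) / n                                              # formula (1)
    sigma = math.sqrt(sum((m - mean) ** 2 for m in mags) / (n - 1))   # formula (2)
    return mean, sigma


loud = [0.0, 12.0, -1.0, 9.0, -15.0, 2.0, 0.5, -7.0]   # invented sample values
quiet = [0.1 * x for x in loud]                          # same shape, 10x lower amplitude

mean_loud, sigma_loud = frame_mean_std(loud)
mean_quiet, sigma_quiet = frame_mean_std(quiet)

# Formula (2) divides by N - 1, so it matches the sample standard deviation
# that statistics.stdev computes on the absolute values.
reference = statistics.stdev([abs(x) for x in loud])

print(f"loud:  mean={mean_loud:.4f} sigma={sigma_loud:.3f}")
print(f"quiet: mean={mean_quiet:.4f} sigma={sigma_quiet:.3f}")
```

With these values the loud frame has \sigma of about 5.78 and the quiet frame about 0.58, so only the loud frame would exceed a threshold of 2.8.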

The present invention markedly improves commonly used transient segment detection algorithms in eliminating unnecessary detection of low-energy transient segments. The algorithm is implemented according to the flow chart of Fig. 1, with the following steps:

Step 1: Input an audio signal and divide it into frames (frame length N). Calculate the mean \bar{X} and the standard deviation \sigma of each frame according to formulas (1) and (2), then compare \sigma with the threshold SD_TH: if \sigma > SD_TH, the frame is judged to be a pre-transient frame; if \sigma < SD_TH, the frame is judged to be a steady-state frame.

Step 2: Divide the pre-transient frame into short segments of length M. Each short segment can be examined with a commonly used detection method, such as the flatness measure, perceptual entropy, time-domain energy, or frequency-domain energy. Mark the state of each short segment.

Step 3: Based on the states of the short segments obtained in Step 2, make a second decision on the transient/steady state of the frame.

The present invention is further described below with reference to the drawings and embodiments.

Following the transient segment detection flow chart of Fig. 1, the input audio signal is divided into frames (frame length N = 1024). First, the mean \bar{X} of the samples of a frame is calculated according to formula (1) and the standard deviation \sigma according to formula (2); \sigma is then compared with the threshold SD_TH (SD_TH = 2.8). If \sigma > SD_TH, the frame is judged to be a pre-transient frame; if \sigma < SD_TH, the frame is judged to be a steady-state frame.

When the frame is a pre-transient frame, it is divided into blocks, and the flatness measure detection technique is used to detect whether each block is a transient segment. First, the frame data are divided into blocks of length M = 128, and the time-domain flatness measure TFM and the frequency-domain flatness measure FFM are computed for each short segment.

Second, the time-domain flatness decision is made: if TFM > T_TH (T_TH = 0.12), the segment is judged to be a transient segment and marked. Otherwise the frequency-domain flatness decision is made: if FFM < F_TH (F_TH = 0.38), the segment is judged to be a transient segment and marked. Otherwise the combined time-frequency decision is made (TF_TH = 1.4): if it is satisfied, the segment is judged to be a transient segment and marked. Otherwise the segment is a steady-state segment.

The audio signal smlsplas.wav bundled with MATLAB is used as the test signal; its time-domain waveform is shown in Fig. 2(a). The transient segments are first detected with the flatness measure method, with the result shown in Fig. 2(b), and then with the combined variance and flatness measure method, with the result shown in Fig. 2(c). Comparing Fig. 2(b) and Fig. 2(c) shows that the variance and flatness measure detection algorithm performs better when the energy of the transient segments is small. For low-energy transient signals, it is notably effective at eliminating unnecessary detections of transient segments that do not affect sound quality. In application, this reduces the complexity of subsequent work in the encoder and improves coding efficiency.

Experiments were carried out on 5 audio test signals using the audio signal transient segment detection algorithm based on the standard deviation and the flatness measure. In each case 100 blocks were examined (100 × 128 samples), and the detection results are shown in Table 1.

Table 1: Comparison of transient segment detection results

Claims

1. An audio signal transient segment detection method based on variance, characterized in that it comprises the following steps:

Step 1: Input an audio signal and divide it into frames of length N, according to the formulas:

Sample mean:

\bar{X} = \frac{1}{N} \sum_{i=1}^{N} |X_i| \qquad (1)

Standard deviation:

\sigma = \sqrt{\frac{\sum_{i=1}^{N} (|X_i| - \bar{X})^2}{N - 1}} \qquad (2)

2. the method for claim 1 is characterized in that, described method is refined as:

Divide frame to handle input audio signal, frame length N=1024, at first basis (1) formula is calculated the mean value of a frame signal sample