CN116129924A

CN116129924A - Speex-based noise identification method

Info

Publication number: CN116129924A
Application number: CN202310056306.8A
Authority: CN
Inventors: 金海林; 俞振飞; 吕连新; 余海; 宋晹; 汪丽芳; 李一君; 刘旭伟; 张得军; 肖非
Original assignee: Radio and Television Group of Zhejiang; Hangzhou Linker Technology Co ltd
Current assignee: Radio and Television Group of Zhejiang; Hangzhou Linker Technology Co ltd
Priority date: 2023-01-17
Filing date: 2023-01-17
Publication date: 2023-05-16

Abstract

The invention discloses a Speex-based noise identification method comprising the following steps: S1, applying Speex noise reduction to an original audio signal to obtain a primary noise-reduced audio signal and a filtered-out primary noise signal; S2, calculating a noise value for every T milliseconds of the primary noise signal and recording it in a primary noise array D; S3, calculating the primary noise average difference MD of D; S4, applying Speex noise reduction again to the primary noise-reduced audio signal to obtain a secondary noise-reduced audio signal and a filtered-out secondary noise signal; S5, calculating a noise value for every T2 milliseconds of the secondary noise signal and recording it in a secondary noise array D2; S6, calculating the secondary noise average difference MD2 of D2; S7, calculating the variation coefficient K of MD and MD2, where a K greater than a noise threshold indicates that noise is present. The method has high accuracy and is suitable for the field of broadcast audio signal processing.

Description

Speex-based noise identification method

Technical Field

The invention relates to the field of audio signal analysis, in particular to a Speex-based noise identification method for broadcast audio signals.

Background

In practical broadcasting station scenarios, noise is sometimes picked up because of environmental factors, and it must be detected and flagged so that staff can deal with it promptly. When the signal-to-noise ratio (SNR) of the audio signal is low, noise is misjudged with high probability during audio analysis, which reduces the usability of the system.

Speex is an open-source, free, patent-free software suite aimed mainly at speech, which includes not only a codec but also practical modules such as AEC (echo cancellation) and NS (noise suppression). However, Speex noise reduction used alone is not accurate enough, and its results on broadcast audio signals are unsatisfactory.

Disclosure of Invention

The invention mainly addresses the technical problem of low noise recognition accuracy in the prior art, and provides a Speex-based noise identification method with a high recognition rate.

The invention solves the above technical problem mainly through the following technical solution: a Speex-based noise identification method comprising the steps of:

S1, applying Speex noise reduction to an original audio signal to obtain a primary noise-reduced audio signal and a filtered-out primary noise signal;

S2, calculating a noise value for every T milliseconds of the primary noise signal, and recording the calculated noise value in a primary noise array D, T being a preset parameter;

S3, calculating the primary noise arithmetic mean V of the primary noise array D, and from it the primary noise average difference MD;

S4, applying Speex noise reduction again to the primary noise-reduced audio signal to obtain a secondary noise-reduced audio signal and a filtered-out secondary noise signal;

S5, calculating a noise value for every T2 milliseconds of the secondary noise signal, and recording the calculated noise value in a secondary noise array D2, T2 being a preset parameter;

S6, calculating the secondary noise arithmetic mean V2 of the secondary noise array D2, and from it the secondary noise average difference MD2;

S7, calculating the variation coefficient K of the primary noise average difference MD and the secondary noise average difference MD2 according to the formula:

K = (MD2 / MD) × 100%

If K is greater than the noise threshold, noise is present.
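As a minimal sketch of the decision in step S7, the Python snippet below computes K from two already-computed average differences and compares it with a threshold. The function names, the default threshold of 80, and the handling of MD = 0 (a case the text does not address) are assumptions made for illustration only.

```python
def variation_coefficient(md: float, md2: float) -> float:
    """Return K = (MD2 / MD) * 100, expressed in percent."""
    if md == 0:
        # Assumption: the source does not say what to do when MD is zero,
        # so this sketch refuses to compute K rather than guess.
        raise ValueError("MD is zero; K is undefined")
    return md2 / md * 100.0


def noise_present(md: float, md2: float, threshold_percent: float = 80.0) -> bool:
    """Step S7: report noise when K exceeds the threshold."""
    return variation_coefficient(md, md2) > threshold_percent
```

For example, MD = 2.0 and MD2 = 1.0 give K = 50%, which is below an 80% threshold, so no noise would be reported.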

Preferably, T and T2 are both 50.

Preferably, the noise threshold in step S7 is 80%.

Preferably, Speex noise reduction specifically comprises:

A1, performing windowing, overlapping, and a time-frequency Fourier transform on the input signal;

A2, calculating the frequency-domain energy of the signal and the energy of the noisy signal in the critical frequency bands;

A3, updating the noise energy using a fixed-iteration-factor smoothing algorithm;

A4, updating the noise spectrum energy;

A5, calculating the posterior signal-to-noise ratio and updating the prior signal-to-noise ratio;

A6, smoothing the prior signal-to-noise ratio;

A7, calculating the EM algorithm gain in the critical frequency bands and the EM algorithm gain in the linear frequency domain;

A8, applying the amplitude spectrum gain (gain2) to the Fourier transform amplitude spectrum;

A9, post-processing, including the inverse Fourier transform, windowing, and overlap-add, to finally obtain the denoised time-domain signal.

Preferably, the noise value is the number of noise updates within the statistical time period: the noise value starts at 0 and is incremented by 1 each time a noise update occurs while the noise energy is being updated by the fixed-iteration-factor smoothing algorithm during that period; the value at the end of the period is the calculated result.
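One way to picture this counting is sketched below in Python. It assumes that the noise-reduction stage can report, for each processed frame, whether the noise estimate was updated. The function `noise_values`, the list of flags, and the 10 ms frame length in the example are hypothetical and are not part of the Speex API.

```python
from typing import List, Sequence


def noise_values(update_flags: Sequence[bool], frame_ms: int,
                 window_ms: int = 50) -> List[int]:
    """Count noise updates in each window of `window_ms` milliseconds.

    update_flags[i] is True when the noise estimate was updated while
    frame i was processed (hypothetical input, one flag per frame).
    """
    if frame_ms <= 0 or window_ms % frame_ms != 0:
        raise ValueError("window_ms must be a positive multiple of frame_ms")
    frames_per_window = window_ms // frame_ms
    values = []
    for start in range(0, len(update_flags), frames_per_window):
        count = 0  # the noise value starts at 0 for every window
        for updated in update_flags[start:start + frames_per_window]:
            if updated:
                count += 1  # one more noise update in this window
        values.append(count)  # a trailing partial window is kept as-is
    return values
```

With 10 ms frames and a 50 ms window, ten frames yield two noise values, one per window.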

Preferably, the primary noise arithmetic mean V is given by:

$$V = \frac{1}{n}\sum_{i=1}^{n} D_i$$

wherein n is the number of noise values in the primary noise array D, and D_i is the ith noise value in the primary noise array D;

the secondary noise arithmetic mean V2 is calculated in the same way as the primary noise arithmetic mean.

Preferably, the primary noise average difference MD, taken here as the mean absolute deviation of the noise values from V, is expressed as:

$$MD = \frac{1}{n}\sum_{i=1}^{n} \left| D_i - V \right|$$

the secondary noise average difference MD2 is calculated in the same way as the primary noise average difference.
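The two statistics can be sketched in Python as follows, assuming that "average difference" denotes the mean absolute deviation of the noise values about their arithmetic mean. The function names are illustrative and do not come from the source.

```python
from typing import Sequence


def arithmetic_mean(d: Sequence[float]) -> float:
    """V: the sum of the noise values divided by their count n."""
    if not d:
        raise ValueError("noise array is empty")
    return sum(d) / len(d)


def average_difference(d: Sequence[float]) -> float:
    """MD: mean of |D_i - V| over the array (assumed definition)."""
    v = arithmetic_mean(d)
    return sum(abs(x - v) for x in d) / len(d)
```

The same two functions would apply unchanged to the secondary noise array D2 to obtain V2 and MD2.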

The substantial effect of the invention is that noise in the broadcast audio signal is identified accurately, providing a basis for timely subsequent handling.

Drawings

Fig. 1 is a flow chart of the present invention.

Detailed Description

The technical scheme of the invention is further specifically described below through examples and with reference to the accompanying drawings.

Embodiment: the Speex-based noise identification method of this embodiment, as shown in FIG. 1, includes the following steps:

S2, using the noise estimate produced by Speex noise reduction, calculating a noise value for every 50 milliseconds of the primary noise signal, so that 20 values are obtained per second, and recording the calculated noise values in the primary noise array D;

S3, calculating the primary noise arithmetic mean V of the primary noise array D, and from it the primary noise average difference MD (which reflects the overall degree of variation among the individual values);

S5, calculating a noise value for every 50 milliseconds of the secondary noise signal and recording the calculated noise value in the secondary noise array D2;

K = (MD2 / MD) × 100%

if K is greater than the noise threshold (80%), then it indicates that noise is present.

Number of audio data bytes per second = sample size (in bits) × sampling rate × number of channels / 8. For example, for a PCM-encoded WAV file with a sampling rate of 48 kHz, a sample size of 16 bits, and two channels, the PCM bit rate is 48000 × 16 × 2 = 1,536,000 bit/s (1536 kb/s), and the number of audio data bytes per second is 48000 × 16 × 2 / 8 = 192,000.
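The arithmetic above can be checked with a short Python sketch. The helper `bytes_per_window`, which scales the per-second figure to one 50 ms statistical window, is an illustrative addition and is not described in the source.

```python
def bytes_per_second(sample_rate_hz: int, sample_bits: int, channels: int) -> int:
    """PCM data rate in bytes: sample size x sampling rate x channels / 8."""
    return sample_rate_hz * sample_bits * channels // 8


def bytes_per_window(sample_rate_hz: int, sample_bits: int, channels: int,
                     window_ms: int = 50) -> int:
    """Bytes of PCM data covered by one window of `window_ms` milliseconds."""
    return bytes_per_second(sample_rate_hz, sample_bits, channels) * window_ms // 1000


if __name__ == "__main__":
    print(bytes_per_second(48000, 16, 2))   # 48 kHz, 16-bit, stereo
    print(bytes_per_window(48000, 16, 2))   # the same stream, per 50 ms
```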

Speex noise reduction is implemented directly by calling library functions, and specifically comprises the following steps:

1) preprocess_analysis() mainly performs common signal-processing operations such as windowing, overlapping, and the Fourier transform (FFT);

2) update_noise_prob() updates the noise energy; whenever a noise update occurs here, the noise value is incremented by 1;

3) Updating the mel noise spectrum energy;

4) Calculating the posterior signal-to-noise ratio and updating the prior signal-to-noise ratio;

5) Smoothing the prior signal-to-noise ratio (zeta[i]), which is used for calculating the background gain; the calculation covers both the FFT domain and the Bark domain;

6) Calculating the EM algorithm gain in the Bark bands (critical frequency bands) and the EM algorithm gain in the linear frequency domain;

7) Applying the amplitude spectrum gain (gain2) to the FFT amplitude spectrum;

8) Post-processing, including the inverse Fourier transform (IFFT), windowing, and overlap-add, to finally obtain the denoised time-domain signal.

The noise value is the number of noise updates within the statistical time period: the noise value starts at 0 and is incremented by 1 each time a noise update occurs while the noise energy is being updated by the fixed-iteration-factor smoothing algorithm during that period; the value at the end of the period is the calculated result.
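The interplay between smoothing and counting might look like the simplified Python sketch below. It is not the Speex source: the update condition, the per-bin `speech_likely` flags, the smoothing factor `beta`, and the choice to report one update per frame are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def smooth_noise(noise: Sequence[float], power: Sequence[float],
                 speech_likely: Sequence[bool],
                 beta: float = 0.05) -> Tuple[List[float], bool]:
    """Update a per-bin noise estimate with a fixed smoothing factor.

    Returns the new estimate and a flag telling whether any bin was
    updated in this frame. A caller could add 1 to the current noise
    value whenever the flag is True.
    """
    new_noise = list(noise)
    updated = False
    for i, (n, p) in enumerate(zip(noise, power)):
        # Assumed rule: refresh the estimate when speech is unlikely in
        # this bin, or when the observed power falls below the estimate.
        if not speech_likely[i] or p < n:
            new_noise[i] = (1.0 - beta) * n + beta * p
            updated = True
    return new_noise, updated
```

Because `beta` is fixed, each update moves the estimate the same fraction of the way toward the observed power, which is what makes the smoothing a fixed-factor one in this sketch.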

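The formula of the primary noise arithmetic mean V is:

$$V = \frac{1}{n}\sum_{i=1}^{n} D_i$$

where n is the number of noise values in the primary noise array D and D_i is the ith noise value in D.

The formula of the primary noise average difference MD, taken here as the mean absolute deviation of the noise values from V, is:

$$MD = \frac{1}{n}\sum_{i=1}^{n} \left| D_i - V \right|$$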

The specific embodiments described herein are offered by way of example only to illustrate the spirit of the invention. Those skilled in the art may make various modifications or additions to the described embodiments or substitutions thereof without departing from the spirit of the invention or exceeding the scope of the invention as defined in the accompanying claims.

Although terms such as noise-reduced audio signal, noise value, and variation coefficient are used frequently herein, the possibility of using other terms is not excluded. These terms are used merely to describe and explain the nature of the invention more conveniently; construing them as any additional limitation would be contrary to the spirit of the invention.

Claims

1. A Speex-based noise identification method, comprising the steps of:

K = (MD2 / MD) × 100%

2. The Speex-based noise identification method of claim 1, wherein T and T2 are both 50.

3. The Speex-based noise identification method according to claim 1 or 2, wherein the noise threshold in step S7 is 80%.

4. The Speex-based noise identification method of claim 1, wherein Speex noise reduction is specifically:

A4, updating the noise spectrum energy;

A6, smoothing the prior signal-to-noise ratio;

5. The Speex-based noise identification method according to claim 4, wherein the noise value is the number of noise updates within a statistical time period: the noise value starts at 0 and is incremented by 1 each time a noise update occurs while the noise energy is being updated by the fixed-iteration-factor smoothing algorithm during that period, and the value at the end of the period is the calculated result.

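6. The Speex-based noise identification method according to claim 1 or 2, wherein the formula of the arithmetic mean of the primary noise is:

$$V = \frac{1}{n}\sum_{i=1}^{n} D_i$$

wherein n is the number of noise values in the primary noise array D, and D_i is the ith noise value in the primary noise array D.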

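7. The Speex-based noise identification method according to claim 6, wherein the primary noise average difference MD, taken here as the mean absolute deviation of the noise values from V, is formulated as:

$$MD = \frac{1}{n}\sum_{i=1}^{n} \left| D_i - V \right|$$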