CN116129924A - Speex-based noise identification method - Google Patents
- Publication number
- CN116129924A (application CN202310056306.8A)
- Authority
- CN
- China
- Prior art keywords
- noise
- primary
- signal
- calculating
- speex
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L21/0232—Processing in the frequency domain
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/45—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of analysis window
Abstract
The invention discloses a Speex-based noise identification method comprising the following steps: S1, applying Speex noise reduction to an original audio signal to obtain a primary noise-reduced audio signal and a filtered primary noise signal; S2, calculating a noise value for every T milliseconds of the primary noise signal and recording it in a primary noise array D; S3, calculating the primary noise average difference MD of D; S4, applying Speex noise reduction again to the primary noise-reduced audio signal to obtain a secondary noise-reduced audio signal and a filtered secondary noise signal; S5, calculating a noise value for every T2 milliseconds of the secondary noise signal and recording it in a secondary noise array D2; S6, calculating the secondary noise average difference MD2 of D2; S7, calculating the variation coefficient K of MD and MD2, where K greater than a noise threshold indicates that noise is present. The method has high accuracy and is suited to the field of broadcast audio signal processing.
Description
Technical Field
The invention relates to the field of audio signal analysis, in particular to a Speex-based noise identification method for broadcast audio signals.
Background
In practical broadcasting-station scenarios, environmental factors sometimes cause noise to be received. This noise needs to be detected and staff alerted so that they can deal with it promptly. When the signal-to-noise ratio (SNR) of the audio signal being analyzed is low, noise is misjudged with high probability, which reduces the usability of the system.
Speex is an open-source, free, patent-free software suite designed mainly for speech. It includes not only codecs but also practical modules such as AEC (echo cancellation) and NS (noise suppression). However, Speex noise reduction alone is not accurate enough, and its results on broadcast audio signals are unsatisfactory.
Disclosure of Invention
The invention mainly addresses the low noise identification accuracy of the prior art and provides a Speex-based noise identification method with a high recognition rate.
The invention solves this technical problem mainly through the following technical scheme: a Speex-based noise identification method comprising the steps of:
S1, applying Speex noise reduction to an original audio signal to obtain a primary noise-reduced audio signal and a filtered primary noise signal;
S2, calculating a noise value for every T milliseconds of the primary noise signal, and recording each calculated noise value in a primary noise array D; T is a preset parameter;
S3, calculating the primary noise arithmetic mean V of the primary noise array D, and from it the primary noise average difference MD;
S4, applying Speex noise reduction again to the primary noise-reduced audio signal to obtain a secondary noise-reduced audio signal and a filtered secondary noise signal;
S5, calculating a noise value for every T2 milliseconds of the secondary noise signal, and recording each calculated noise value in a secondary noise array D2; T2 is a preset parameter;
S6, calculating the secondary noise arithmetic mean V2 of the secondary noise array D2, and from it the secondary noise average difference MD2;
S7, calculating the variation coefficient K of the primary noise average difference MD and the secondary noise average difference MD2 according to the formula:
K = (MD2 / MD) × 100%
If K is greater than the noise threshold, noise is present.
Preferably, T and T2 are both 50.
Preferably, the noise threshold in step S7 is 80%.
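The decision in step S7 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, K is handled as a fraction (0.80 for 80%), and raising an error when MD is zero is an assumption, since the text does not say how that case is treated.

```python
def variation_coefficient(md: float, md2: float) -> float:
    """Return K = MD2 / MD as a fraction (0.80 corresponds to 80%)."""
    if md == 0:
        # Assumption: the source does not define K when MD is zero.
        raise ValueError("MD is zero, so K is undefined")
    return md2 / md


def noise_present(md: float, md2: float, threshold: float = 0.80) -> bool:
    """Step S7: noise is reported when K exceeds the noise threshold."""
    return variation_coefficient(md, md2) > threshold
```

For example, MD = 2.0 and MD2 = 1.8 give K = 90%, which exceeds the preferred 80% threshold, whereas MD2 = 1.0 gives K = 50%, which does not.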
Preferably, the Speex noise reduction specifically comprises:
A1, applying windowing, frame overlap and a Fourier transform to the input signal;
A2, calculating the frequency-domain energy of the signal and the energy of the noisy signal in the critical bands;
A3, updating the noise energy using a smoothing algorithm with a fixed iteration factor;
A4, updating the noise spectrum energy;
A5, calculating the a posteriori signal-to-noise ratio and updating the a priori signal-to-noise ratio;
A6, smoothing the a priori signal-to-noise ratio;
A7, calculating the EM algorithm gain in the critical bands and the EM algorithm gain on the linear frequency scale;
A8, applying the amplitude spectrum gain gain2 to the Fourier-transform amplitude spectrum;
A9, post-processing, including the inverse Fourier transform, windowing and overlap-add, to obtain the denoised time-domain signal.
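The framing in A1 and the overlap-add in A9 can be illustrated with a small sketch. It is not Speex code: the square-root Hann window, the 50% overlap and the function names are assumptions chosen for illustration, and the spectral processing of A2 to A8 is replaced by a pass-through so that only the windowing and reconstruction are shown.

```python
import math


def sqrt_hann(n: int) -> list:
    """Square root of a periodic Hann window of length n."""
    return [math.sqrt(0.5 - 0.5 * math.cos(2 * math.pi * i / n)) for i in range(n)]


def overlap_add_identity(signal: list, frame_len: int = 8) -> list:
    """Window overlapping frames, leave them unprocessed, and overlap-add them."""
    hop = frame_len // 2  # 50% overlap (assumption)
    window = sqrt_hann(frame_len)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - frame_len + 1, hop):
        # Analysis window (A1). A denoiser would transform the frame,
        # apply its gains and transform back at this point.
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        # Synthesis window and overlap-add (A9).
        for i in range(frame_len):
            out[start + i] += frame[i] * window[i]
    return out
```

Because the analysis and synthesis windows multiply to a Hann window, and Hann windows at 50% overlap sum to one, samples covered by two frames are reconstructed exactly; only the first and last half-frames are attenuated.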
Preferably, the noise value is the number of noise updates within the statistical time period: the noise value starts at 0 and is incremented by 1 each time the noise energy is updated by the fixed-iteration-factor smoothing algorithm during that period; the value at the end of the period is the result.
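The counting rule above can be sketched as follows. This is a hedged illustration only: the update condition (a bin's power falling below the current noise estimate), the smoothing factor `beta`, counting at most one update per frame, and the `frames_per_window` parameter are all assumptions, not details taken from the text or from the Speex library.

```python
def smooth_noise(noise, power, beta=0.05):
    """One fixed-factor smoothing step; returns (new_noise, updated)."""
    new_noise = list(noise)
    updated = False
    for i, (n, p) in enumerate(zip(noise, power)):
        if p < n:  # hypothetical update condition
            new_noise[i] = (1 - beta) * n + beta * p
            updated = True
    return new_noise, updated


def noise_values(frames_power, initial_noise, frames_per_window):
    """Count noise updates per window; a trailing partial window is dropped."""
    noise = list(initial_noise)
    values, count = [], 0
    for index, power in enumerate(frames_power, start=1):
        noise, updated = smooth_noise(noise, power)
        if updated:
            count += 1  # the noise value is incremented by 1 per update
        if index % frames_per_window == 0:
            values.append(count)
            count = 0  # each window starts again from 0
    return values
```

Each entry of the returned list plays the role of one noise value in the array D (or D2).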
Preferably, the primary noise arithmetic mean V is given by:
$$V = \frac{1}{n}\sum_{i=1}^{n} D_i$$
where n is the number of noise values in the primary noise array D and D_i is the i-th noise value in the primary noise array D;
the secondary noise arithmetic mean is calculated in the same way as the primary noise arithmetic mean.
Preferably, the primary noise average difference MD is given by:
$$MD = \frac{1}{n}\sum_{i=1}^{n} \left| D_i - V \right|$$
the secondary noise average difference is calculated in the same way as the primary noise average difference.
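The two statistics can be sketched in a few lines. The sketch assumes that "average difference" means the mean absolute deviation from the arithmetic mean, which is the usual meaning of the term; the function names are hypothetical.

```python
def arithmetic_mean(values):
    """Arithmetic mean V of a noise array."""
    if not values:
        raise ValueError("the noise array is empty")
    return sum(values) / len(values)


def average_difference(values):
    """Average difference MD: mean absolute deviation from the mean (assumption)."""
    v = arithmetic_mean(values)
    return sum(abs(d - v) for d in values) / len(values)
```

For D = [2, 4, 6, 8] the mean is 5 and the absolute deviations are 3, 1, 1 and 3, so MD = 2. The same functions apply unchanged to D2.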
The invention has the substantial effect of accurately identifying the noise in the broadcast audio signal and providing basis for subsequent timely processing.
Drawings
Fig. 1 is a flow chart of the present invention.
Detailed Description
The technical scheme of the invention is described in further detail below through an embodiment and with reference to the accompanying drawing.
Embodiment: the Speex-based noise identification method of this embodiment, as shown in Fig. 1, comprises the following steps:
S1, applying Speex noise reduction to an original audio signal to obtain a primary noise-reduced audio signal and a filtered primary noise signal;
S2, using the noise estimate from the Speex noise reduction, calculating a noise value for every 50 milliseconds of the primary noise signal (20 values per second), and recording each calculated noise value in the primary noise array D;
S3, calculating the primary noise arithmetic mean V of the primary noise array D, and from it the primary noise average difference MD (which reflects the overall degree of variation among the values of the population);
S4, applying Speex noise reduction again to the primary noise-reduced audio signal to obtain a secondary noise-reduced audio signal and a filtered secondary noise signal;
S5, calculating a noise value for every 50 milliseconds of the secondary noise signal, and recording each calculated noise value in the secondary noise array D2;
S6, calculating the secondary noise arithmetic mean V2 of the secondary noise array D2, and from it the secondary noise average difference MD2;
S7, calculating the variation coefficient K of the primary noise average difference MD and the secondary noise average difference MD2 according to the formula:
K = (MD2 / MD) × 100%
If K is greater than the noise threshold (80%), noise is present.
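The two-pass flow of steps S1 to S7 can be sketched end to end. Everything about the interface here is hypothetical: `denoise` stands in for a wrapper around the Speex noise reduction and is assumed to return both the denoised samples and the per-window noise values, and the average difference is assumed to be the mean absolute deviation.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical interface: samples in, (denoised samples, noise values) out.
Denoiser = Callable[[Sequence[float]], Tuple[List[float], List[int]]]


def _average_difference(values: Sequence[float]) -> float:
    mean = sum(values) / len(values)
    return sum(abs(v - mean) for v in values) / len(values)


def identify_noise(audio: Sequence[float], denoise: Denoiser,
                   threshold: float = 0.80) -> Tuple[float, bool]:
    """Return (K, noise_present) for one piece of audio."""
    primary_audio, d = denoise(audio)       # S1 and S2
    md = _average_difference(d)             # S3
    _, d2 = denoise(primary_audio)          # S4 and S5
    md2 = _average_difference(d2)           # S6
    if md == 0:
        # Assumption: the source does not define K when MD is zero.
        raise ValueError("MD is zero, so K is undefined")
    k = md2 / md                            # S7
    return k, k > threshold
```

Passing the denoiser in as a parameter keeps the statistics independent of the audio library, so the flow can be exercised with a stub before a real Speex wrapper is available.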
Number of audio data bytes per second = sample size × sampling rate × number of channels / 8. For example, for a PCM-encoded WAV file with a sampling rate of 48 kHz, a sample size of 16 bits and two channels, the PCM bit rate is 48k × 16 × 2 = 1536 kb/s, and the number of audio data bytes per second is 48000 × 16 × 2 / 8 = 192000.
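The byte-rate arithmetic above can be checked with a short sketch; the function names are illustrative only.

```python
def pcm_bits_per_second(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Uncompressed PCM bit rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def pcm_bytes_per_second(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Uncompressed PCM data rate in bytes per second (8 bits per byte)."""
    return pcm_bits_per_second(sample_rate_hz, bits_per_sample, channels) // 8
```

With 48000 Hz, 16 bits and 2 channels this gives 1536000 bits per second and 192000 bytes per second, matching the figures in the example.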
Speex noise reduction is implemented directly by calling library functions, and specifically comprises the following steps:
1) preprocess_analysis() mainly performs common signal processing operations such as windowing, frame overlap and the Fourier transform (FFT);
2) update_noise_prob() updates the noise energy; whenever the noise is updated here, the noise value is incremented by 1;
3) Updating the Mel noise spectrum energy;
4) Calculating the a posteriori signal-to-noise ratio and updating the a priori signal-to-noise ratio;
5) Smoothing the a priori signal-to-noise ratio (zeta[i]), which is used to calculate the background gain; the calculation covers both the FFT domain and the Bark domain;
6) Calculating the EM algorithm gain in the Bark bands (critical bands) and the EM algorithm gain on the linear frequency scale;
7) Applying the amplitude spectrum gain gain2 to the FFT amplitude spectrum;
8) Post-processing, including the inverse Fourier transform (IFFT), windowing and overlap-add, to obtain the denoised time-domain signal.
The noise value is the number of noise updates within the statistical time period: the noise value starts at 0 and is incremented by 1 each time the noise energy is updated by the fixed-iteration-factor smoothing algorithm during that period; the value at the end of the period is the result.
The primary noise arithmetic mean V is given by:
$$V = \frac{1}{n}\sum_{i=1}^{n} D_i$$
where n is the number of noise values in the primary noise array D and D_i is the i-th noise value in the primary noise array D;
the secondary noise arithmetic mean is calculated in the same way as the primary noise arithmetic mean.
The primary noise average difference MD is given by:
$$MD = \frac{1}{n}\sum_{i=1}^{n} \left| D_i - V \right|$$
the secondary noise average difference is calculated in the same way as the primary noise average difference.
The specific embodiments described herein are offered by way of example only to illustrate the spirit of the invention. Those skilled in the art may make various modifications or additions to the described embodiments or substitutions thereof without departing from the spirit of the invention or exceeding the scope of the invention as defined in the accompanying claims.
Although terms such as noise-reduced audio signal, noise value and variation coefficient are used frequently herein, the use of other terms is not excluded. These terms are used merely to describe and explain the nature of the invention more conveniently; construing them as any additional limitation would be contrary to the spirit of the invention.
Claims (7)
1. A Speex-based noise identification method, comprising the steps of:
S1, applying Speex noise reduction to an original audio signal to obtain a primary noise-reduced audio signal and a filtered primary noise signal;
S2, calculating a noise value for every T milliseconds of the primary noise signal, and recording each calculated noise value in a primary noise array D; T is a preset parameter;
S3, calculating the primary noise arithmetic mean V of the primary noise array D, and from it the primary noise average difference MD;
S4, applying Speex noise reduction again to the primary noise-reduced audio signal to obtain a secondary noise-reduced audio signal and a filtered secondary noise signal;
S5, calculating a noise value for every T2 milliseconds of the secondary noise signal, and recording each calculated noise value in a secondary noise array D2; T2 is a preset parameter;
S6, calculating the secondary noise arithmetic mean V2 of the secondary noise array D2, and from it the secondary noise average difference MD2;
S7, calculating the variation coefficient K of the primary noise average difference MD and the secondary noise average difference MD2 according to the formula:
K = (MD2 / MD) × 100%
If K is greater than the noise threshold, noise is present.
2. The Speex-based noise identification method of claim 1, wherein T and T2 are both 50.
3. The Speex-based noise identification method according to claim 1 or 2, wherein the noise threshold in step S7 is 80%.
4. The Speex-based noise identification method of claim 1, wherein the Speex noise reduction specifically comprises:
A1, applying windowing, frame overlap and a Fourier transform to the input signal;
A2, calculating the frequency-domain energy of the signal and the energy of the noisy signal in the critical bands;
A3, updating the noise energy using a smoothing algorithm with a fixed iteration factor;
A4, updating the noise spectrum energy;
A5, calculating the a posteriori signal-to-noise ratio and updating the a priori signal-to-noise ratio;
A6, smoothing the a priori signal-to-noise ratio;
A7, calculating the EM algorithm gain in the critical bands and the EM algorithm gain on the linear frequency scale;
A8, applying the amplitude spectrum gain gain2 to the Fourier-transform amplitude spectrum;
A9, post-processing, including the inverse Fourier transform, windowing and overlap-add, to obtain the denoised time-domain signal.
5. The Speex-based noise identification method according to claim 4, wherein the noise value is the number of noise updates within a statistical time period: the noise value starts at 0 and is incremented by 1 each time the noise energy is updated by the fixed-iteration-factor smoothing algorithm during that period; the value at the end of the period is the result.
6. The Speex-based noise identification method according to claim 1 or 2, wherein the primary noise arithmetic mean V is given by:
$$V = \frac{1}{n}\sum_{i=1}^{n} D_i$$
where n is the number of noise values in the primary noise array D and D_i is the i-th noise value in the primary noise array D;
the secondary noise arithmetic mean is calculated in the same way as the primary noise arithmetic mean.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310056306.8A CN116129924A (en) | 2023-01-17 | 2023-01-17 | Speex-based noise identification method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310056306.8A CN116129924A (en) | 2023-01-17 | 2023-01-17 | Speex-based noise identification method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116129924A (en) | 2023-05-16 |
Family
ID=86298882
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310056306.8A Pending CN116129924A (en) | 2023-01-17 | 2023-01-17 | Speex-based noise identification method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116129924A (en) |
-
2023
- 2023-01-17 CN CN202310056306.8A patent/CN116129924A/en active Pending
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||