CN112201273B - Noise power spectral density calculation method, system, equipment and medium - Google Patents

Info

Publication number
CN112201273B
Authority
CN
China
Prior art keywords
signal
noise signal
frequency domain
time domain
domain noise
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910612851.4A
Other languages
Chinese (zh)
Other versions
CN112201273A (en)
Inventor
陈孝良
奚少亨
冯大航
常乐
苏少炜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing SoundAI Technology Co Ltd
Original Assignee
Beijing SoundAI Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing SoundAI Technology Co Ltd filed Critical Beijing SoundAI Technology Co Ltd
Priority to CN201910612851.4A priority Critical patent/CN112201273B/en
Publication of CN112201273A publication Critical patent/CN112201273A/en
Application granted granted Critical
Publication of CN112201273B publication Critical patent/CN112201273B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208Noise filtering
    • G10L21/0216Noise filtering characterised by the method used for estimating noise
    • G10L21/0224Processing in the time domain
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208Noise filtering
    • G10L21/0216Noise filtering characterised by the method used for estimating noise
    • G10L21/0232Processing in the frequency domain
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/21Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being power information

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Soundproofing, Sound Blocking, And Sound Damping (AREA)

Abstract

A noise power spectral density calculation method, comprising: collecting a time domain noise signal, and processing the time domain noise signal to obtain a frequency domain noise signal; processing the frequency domain noise signal with an adaptive filter to obtain a time domain error signal, wherein the adaptive filter algorithm solves for the error signal using a normalized least mean square (NLMS) algorithm or, to further reduce the amount of calculation, a block frequency domain adaptive filtering algorithm; and calculating the power spectral density of the time domain error signal and calculating the noise power spectral density from the power spectral density of the error signal. The invention also discloses a noise power spectral density calculation system, an electronic device and a storage medium. The invention makes the noise power spectral density estimate more accurate, reduces the amount of calculation, and improves the subsequent noise reduction effect.

Description

Noise power spectral density calculation method, system, equipment and medium
Technical Field
The present invention relates to the field of noise processing, and in particular, to a method, a system, an apparatus, and a medium for calculating noise power spectral density.
Background
With the development of communication technology, voice interaction technology has matured and is widely used in smart devices such as mobile phones, smart speakers and smart homes. However, problems remain in practical use: in far-field and noisy scenarios, for example, noise and interference suppression techniques are required to obtain clean speech signals for wake-up and speech recognition.
Because background noise makes up a large part of the signal received by the microphone under far-field conditions, noise suppression algorithms (such as spectral subtraction, Wiener filtering and energy-based filtering) need an estimate of the power spectral density of the background noise. These algorithms have drawbacks; the Wiener filtering algorithm, for example, requires a large amount of calculation and has low accuracy. There is therefore a need for an improved noise power spectral density calculation method that reduces the amount of calculation and improves the calculation accuracy.
Disclosure of Invention
In view of the technical problems described above, the main purpose of the invention is to provide a noise power spectral density calculation method, system, equipment and medium that at least partially solve these problems.
A first aspect of an embodiment of the present invention provides a method for calculating noise power spectral density, including: collecting a time domain noise signal, and processing the time domain noise signal to obtain a frequency domain noise signal; processing the frequency domain noise signal by adopting an adaptive filter to obtain a time domain error signal; the power spectral density of the time domain error signal is calculated and the noise power spectral density is calculated from the power spectral density of the error signal.
Optionally, the frequency domain noise signal includes a first frequency domain noise signal and a second frequency domain noise signal, and processing the frequency domain noise signal with an adaptive filter to obtain an error signal includes: performing normalized least mean square (NLMS) processing on the first frequency domain noise signal to obtain a filter impulse response signal corresponding to the first frequency domain noise signal; convolving the first frequency domain noise signal with the corresponding filter impulse response signal to obtain an estimated signal; and subtracting the estimated signal from the second frequency domain noise signal, and obtaining a time domain error signal through an inverse Fourier transform.
Optionally, the frequency domain noise signal includes a first frequency domain noise signal and a second frequency domain noise signal, and processing the frequency domain noise signal with an adaptive filter to obtain an error signal includes: performing block adaptive filtering on the first frequency domain noise signal with a block frequency domain adaptive filtering algorithm, so as to reduce the amount of calculation.
Optionally, the method further comprises: updating the filter coefficients of the adaptive filter according to the error signal.
Optionally, collecting the time domain noise signal and processing the time domain noise signal includes: collecting a first time domain noise signal and a second time domain noise signal; and framing, windowing and applying a fast Fourier transform to the first time domain noise signal and the second time domain noise signal to obtain a first frequency domain noise signal corresponding to the first time domain noise signal and a second frequency domain noise signal corresponding to the second time domain noise signal.
Optionally, during the block adaptive filtering of the first frequency domain noise signal, part of the point data corresponding to the first frequency domain noise signal must be discarded and zero-padded.
Optionally, the power spectral density of the time domain error signal is calculated as follows:
$$\Gamma_{EE}(k) = \mathrm{FFT}(\gamma_{ee}(\tau)) = \mathrm{FFT}(e(n) \cdot e(n+\tau))$$
where $e(n)$ represents the time domain error signal, $\gamma_{ee}(\tau)$ is the autocorrelation function of $e(n)$, $\tau$ is the time delay, $\mathrm{FFT}(\cdot)$ represents the fast Fourier transform operation, and $\Gamma_{EE}(k)$ represents the power spectral density of $e(n)$.
A second aspect of an embodiment of the present invention provides a noise power spectral density calculation system, including: a first processing module for collecting a time domain noise signal and processing the time domain noise signal to obtain a frequency domain noise signal; a second processing module for processing the frequency domain noise signal with an adaptive filter to obtain a time domain error signal; and a calculation module for calculating the power spectral density of the time domain error signal and calculating the noise power spectral density from the power spectral density of the error signal.
Optionally, the frequency domain noise signal includes a first frequency domain noise signal and a second frequency domain noise signal, and the second processing module processing the frequency domain noise signal with an adaptive filter to obtain the error signal includes: performing normalized least mean square (NLMS) processing on the first frequency domain noise signal to obtain a filter impulse response signal corresponding to the first frequency domain noise signal; convolving the first frequency domain noise signal with the corresponding filter impulse response signal to obtain an estimated signal; and subtracting the estimated signal from the second frequency domain noise signal, and obtaining a time domain error signal through an inverse Fourier transform.
Optionally, the frequency domain noise signal includes a first frequency domain noise signal and a second frequency domain noise signal, and the second processing module processing the frequency domain noise signal with an adaptive filter to obtain the error signal includes: performing block adaptive filtering on the first frequency domain noise signal with a block frequency domain adaptive filtering algorithm, so as to reduce the amount of calculation.
A third aspect of an embodiment of the present invention provides an electronic device, including a processor and a program executable by the processor, wherein the noise power spectral density calculation method provided by the first aspect of the embodiment of the invention is implemented when the processor executes the program.
A fourth aspect of the embodiments of the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method for calculating noise power spectral density provided by the first aspect of the embodiments of the present invention.
According to the embodiments of the invention, the noise signal is processed with an adaptive filter, and the NLMS algorithm of the adaptive filter replaces the Wiener solution, so that the noise estimate is more accurate, the amount of calculation is reduced, and the subsequent noise reduction effect is better. Furthermore, solving for the error signal with a block frequency domain adaptive filtering algorithm in the adaptive filter further reduces the amount of calculation.
Drawings
For a more complete understanding of the present invention, and the advantages thereof, reference is now made to the following descriptions taken in conjunction with the accompanying drawings, in which:
Fig. 1 schematically shows a flow diagram of a noise power spectral density calculation method according to an embodiment of the invention.
Fig. 2 schematically shows a flow diagram of a noise power spectral density calculation method according to an embodiment of the invention.
Fig. 3 schematically shows a flow chart of an NLMS algorithm according to an embodiment of the present invention.
Fig. 4 schematically shows a flow diagram of a noise power spectral density calculation method according to an embodiment of the invention.
Fig. 5 schematically shows a flow chart of a block frequency domain adaptive filtering algorithm according to an embodiment of the invention.
Fig. 6 schematically shows a schematic structure of a noise power spectral density calculation system according to an embodiment of the invention.
Fig. 7 schematically shows a block diagram of a hardware structure of an electronic device according to an embodiment of the invention.
Detailed Description
In order to make the objects, features and advantages of the present invention more comprehensible, the technical solutions in the embodiments of the present invention will be clearly described in conjunction with the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Noise signals mostly come from random, unspecified noise sources, such as various vibration sources, high-speed vehicles, and people speaking loudly. In public places such as parks, schools and sports venues, noise is produced by people's activities; indoors, the sounds of household appliances and other household items, such as televisions, video equipment, air conditioners, range hoods, and tableware clattering during meals, are all noise.
Noise can interfere with normal speech interaction. For example, when a smart speaker is playing songs and the user issues the voice command "play songs", the command generally contains noise generated by household appliances or other household items, so the smart speaker may fail to recognize it accurately. The noise power spectral density calculation method, system, equipment and medium provided by the invention can calculate the power spectral density of the noise quickly and accurately for use in noise reduction, which greatly improves the recognition accuracy of the smart speaker and gives users a better experience.
As another example, dual-microphone spectral subtraction needs estimates of the power spectrum X_PSD of the noisy signal and the power spectrum N_PSD of the noise signal; the two are subtracted, and the phase of the original noisy signal is then used to recover the noise-reduced signal. The noise power spectral density calculation method, system, equipment and medium provided by the invention can be used to obtain N_PSD for the spectral subtraction. That algorithm requires the N_PSD estimate to be as close to the true value as possible.
As a further example, when a Wiener filtering algorithm is used for noise reduction, the Wiener solution W (also called the gain or weight) of each frequency band must be updated from the newly calculated values of X_PSD and N_PSD. The received signal X is weighted by the conjugate of W to obtain Y, which is finally transformed back to the time domain, where the noise-reduced time domain signal is recovered using the overlap-add or overlap-save method. Here too, N_PSD can be calculated with the noise power spectral density calculation method, system, equipment and medium provided by the invention, which gives a more accurate estimate with a small amount of calculation.
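The spectral subtraction use case described above can be sketched for a single frame as follows. This is a minimal illustration, not the procedure of this embodiment: the function name `spectral_subtract`, the `floor` parameter, and the choice of power (rather than magnitude) subtraction are assumptions made for the example, and `noise_psd` stands for an N_PSD estimate obtained elsewhere.

```python
import numpy as np

def spectral_subtract(noisy_frame, noise_psd, floor=1e-3):
    """Power spectral subtraction on one frame (illustrative sketch).

    noisy_frame: real time domain samples of the noisy signal.
    noise_psd:   assumed N_PSD estimate, one value per rfft bin.
    floor:       hypothetical spectral floor that keeps the power non-negative.
    """
    X = np.fft.rfft(noisy_frame)
    x_psd = np.abs(X) ** 2                       # X_PSD of the noisy frame
    # Subtract the noise power and clamp to a small fraction of X_PSD.
    clean_psd = np.maximum(x_psd - noise_psd, floor * x_psd)
    # Recover the spectrum using the phase of the original noisy signal.
    Y = np.sqrt(clean_psd) * np.exp(1j * np.angle(X))
    return np.fft.irfft(Y, n=len(noisy_frame))
```

The closer `noise_psd` is to the true noise power, the less residual noise or speech distortion the subtraction leaves, which is why the accuracy of the N_PSD estimate matters.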
The noise power spectral density calculation method, system, equipment and medium provided by the invention have a wide range of applications and can be adopted wherever the noise signal power spectrum N_PSD must be estimated. The invention is described in detail below.
Referring to fig. 1, fig. 1 is a flowchart of a noise power spectrum density calculating method according to an embodiment of the invention, the method mainly includes the following steps:
S101, collecting a time domain noise signal, and processing the time domain noise signal to obtain a frequency domain noise signal.
The device used to collect the time domain noise signal may be a sound sensor, a microphone, or the like; the invention is not specifically limited in this respect. In this embodiment, two channels of time domain noise signals are collected. Because the later step of solving for the error signal is carried out in the frequency domain, the time domain noise signals must first be converted into frequency domain noise signals.
S102, processing the frequency domain noise signal by adopting an adaptive filter to obtain a time domain error signal.
An adaptive filter is a filter that, during iteration and according to a predetermined criterion, uses the filter parameters obtained at the previous time instant to automatically adjust the filter parameters at the current time instant, adapting to the unknown or time-varying statistical characteristics of the signal and noise and thereby achieving optimal filtering. It requires no prior knowledge of the input signal and little computation; the solution converges gradually, and the error becomes smaller and smaller as the filter converges. The convergence time is defined as the number of sampling points the algorithm must process before its result approaches a steady state.
In this embodiment, an adaptive filter, rather than the Wiener solution, is used to process the time domain noise signal and obtain the error signal. During solving, the adaptive filter adapts to the time-varying statistical characteristics of the input time domain noise signal, and the error signal is obtained by updating the filter coefficients over successive iterations.
The adaptive filter algorithm may be the NLMS algorithm or a block frequency domain adaptive filtering algorithm; the invention does not limit the specific algorithm. Other algorithms (such as the leaky NLMS algorithm) may also be used: combined with the gradually converging solution of the adaptive filter, they likewise yield the error signal of the time domain noise signal.
S103, calculating the power spectral density of the time domain error signal, and calculating the noise power spectral density according to the power spectral density of the error signal.
In this embodiment, replacing the Wiener solution with the adaptive filter improves the accuracy of the noise power spectral density calculation and reduces the amount of calculation.
Referring to fig. 2, fig. 2 is a flowchart of a noise power spectrum density calculating method according to an embodiment of the invention, the method mainly includes the following steps:
S201, collecting a time domain noise signal, and processing the time domain noise signal to obtain a frequency domain noise signal.
In the above operation S201, a noise signal is collected by using two microphones, and the collected noise signal is a time domain noise signal, wherein one microphone collects a first time domain noise signal and the other microphone collects a second time domain noise signal;
Respectively framing and windowing the acquired first time domain noise signal and the acquired second time domain noise signal; and performing fast Fourier transform on the signals subjected to framing and windowing to obtain a first frequency domain noise signal corresponding to the first time domain noise signal and a second frequency domain noise signal corresponding to the second time domain noise signal.
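The framing, windowing and fast Fourier transform of operation S201 can be sketched as below. The frame length of 128 samples, the hop of 64 samples and the Hann window are assumptions chosen for illustration (the text does not fix them here), and `frames_to_spectra` is a hypothetical helper name.

```python
import numpy as np

def frames_to_spectra(signal, frame_len=128, hop=64):
    """Split a time domain signal into overlapping frames, window each
    frame, and return the one-sided FFT of every frame.

    Assumes len(signal) >= frame_len; trailing samples that do not fill
    a whole frame are ignored.
    """
    window = np.hanning(frame_len)               # assumed window choice
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len]
                       for i in range(n_frames)])
    # rfft keeps frame_len // 2 + 1 bins per frame (65 bins for 128 samples).
    return np.fft.rfft(frames * window, axis=1)
```

Applying the same function to the first and second time domain noise signals gives the first and second frequency domain noise signals, one row of spectrum per frame.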
S202, processing the frequency domain noise signal with an adaptive filter to obtain a time domain error signal, wherein the adaptive filter uses the normalized least mean square (NLMS) algorithm.
In operation S202, the adaptive filtering algorithm used by the adaptive filter is the normalized least mean square (NLMS) algorithm, which redefines the step size μ used to adjust the weights in the least mean square (LMS) algorithm so that μ varies with the normalization of the filter input signal, thereby effectively improving convergence stability.
Referring to fig. 3, fig. 3 is a flow chart of an NLMS algorithm, and the algorithm process is as follows:
The coefficients of the adaptive filter may all be initialized to 0, or may be initialized according to other criteria; the invention does not limit the specific initialization process.
The first frequency domain noise signal obtained in operation S201 is taken as the adaptive filter input x(n) and processed by the NLMS algorithm to obtain the filter impulse response signal h(n) of the first frequency domain noise signal, wherein the formula of the filter impulse response is as follows:
where α is a fixed step size, ζ is a regularization factor, and x(n) is the input signal vector, x(n) = [x(n-m+1), x(n-m+2), ..., x(n)].
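For reference, the standard textbook form of the NLMS coefficient update, written with the symbols defined here (step size α, regularization factor ζ, input vector x(n)) and with e(n) denoting the error between the desired signal and the filter output, is shown below. It is offered as an assumption about the general shape of the update, not as the exact expression of this embodiment.

```latex
\mathbf{h}(n+1) = \mathbf{h}(n) + \frac{\alpha \, e(n) \, \mathbf{x}(n)}{\mathbf{x}^{T}(n)\,\mathbf{x}(n) + \zeta}
```

Dividing by the input energy is what makes the effective step size track the level of the input signal, and ζ keeps the denominator away from zero when the input is very quiet.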
The first frequency domain noise signal is convolved with the corresponding filter impulse response signal to obtain the estimated signal y(n) = x(n) * h(n), where the asterisk denotes convolution.
The estimated signal is subtracted from the second frequency domain noise signal obtained in operation S201, and an inverse fast Fourier transform is applied to obtain the time domain error signal e(n) = d(n) - y(n) = d(n) - x(n) * h(n), where d(n) denotes the second noise signal.
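The filter, subtract and update cycle described above can be sketched as a sample-by-sample time domain loop. This is a minimal illustration under stated assumptions: it uses the textbook NLMS update, works directly on time domain samples rather than on frequency domain frames, and the function name `nlms` and the default values of `num_taps`, `alpha` and `zeta` are hypothetical.

```python
import numpy as np

def nlms(x, d, num_taps=64, alpha=0.5, zeta=1e-6):
    """Textbook NLMS sketch: x is the filter input (first noise signal),
    d is the desired signal (second noise signal). Returns the error
    signal e(n) and the final filter coefficients h."""
    h = np.zeros(num_taps)                       # coefficients initialized to 0
    e = np.zeros(len(x))
    for n in range(num_taps - 1, len(x)):
        # Most recent num_taps input samples, newest first.
        x_vec = x[n - num_taps + 1 : n + 1][::-1]
        y = h @ x_vec                            # estimated signal y(n)
        e[n] = d[n] - y                          # error e(n) = d(n) - y(n)
        # Step size normalized by the input energy plus regularization.
        h += alpha * e[n] * x_vec / (x_vec @ x_vec + zeta)
    return e, h
```

With a noise-free desired signal that is a filtered copy of the input, the coefficients converge toward the true impulse response and the error decays toward zero, which mirrors the gradual convergence described for the adaptive filter.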
S203, calculating the power spectrum density of the time domain error signal, and calculating the noise power spectrum density according to the power spectrum density of the error signal.
The power spectral density of the time domain error signal is calculated as follows:
$$\Gamma_{EE}(k) = \mathrm{FFT}(\gamma_{ee}(\tau)) = \mathrm{FFT}(e(n) \cdot e(n+\tau))$$
where $e(n)$ represents the time domain error signal, $\gamma_{ee}(\tau)$ is the autocorrelation function of $e(n)$, $\tau$ is the time delay, $\Gamma_{EE}(k)$ represents the power spectral density of $e(n)$, and $\mathrm{FFT}(\gamma_{ee}(\tau))$ denotes the fast Fourier transform of $\gamma_{ee}(\tau)$. The Fourier transform converts a signal from its original domain (usually time or space) to a frequency domain representation, or vice versa; the FFT computes this transform quickly by decomposing the DFT matrix into a product of sparse (mostly zero) factors. According to the theory of the power spectral density (PSD) function, applying the FFT to the autocorrelation function of a time domain signal yields the power spectral density function of that signal, and hence its power spectral density.
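The relationship between the autocorrelation function and the power spectral density can be checked numerically. The sketch below assumes a circular (periodic) autocorrelation estimate over one frame, which is one possible estimator and is not specified by the text; `psd_from_autocorrelation` is a hypothetical name.

```python
import numpy as np

def psd_from_autocorrelation(e):
    """Estimate the PSD of a real frame e(n) as the FFT of its circular
    autocorrelation gamma(tau) = (1/N) * sum_n e(n) * e((n + tau) mod N)."""
    n = len(e)
    gamma = np.array([np.dot(e, np.roll(e, -tau)) for tau in range(n)]) / n
    # The autocorrelation of a real signal is even, so its FFT is real.
    return np.fft.fft(gamma).real
```

For this estimator the result equals the periodogram `np.abs(np.fft.fft(e)) ** 2 / len(e)`, so in practice the PSD can be obtained directly from the FFT of the frame without forming the autocorrelation explicitly.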
The noise power spectral density is calculated as follows:
$$\Gamma_{RR}(k) = \mathrm{FFT}(\gamma_{rr}(\tau)) = \mathrm{FFT}(r(n) \cdot r(n+\tau))$$
$$\Gamma_{LL}(k) = \mathrm{FFT}(\gamma_{ll}(\tau)) = \mathrm{FFT}(l(n) \cdot l(n+\tau))$$
where $\Gamma_{root}(k)$ is an intermediate variable and Re denotes the real part; $\gamma_{rr}(\tau)$ is the autocorrelation function of the first time domain noise signal and $\gamma_{ll}(\tau)$ is the autocorrelation function of the second time domain noise signal, with $\tau$ the time delay; $\Gamma_{EE}(k)$ represents the power spectral density of $e(n)$, $\Gamma_{RR}(k)$ represents the power spectral density of the first time domain noise signal, $\Gamma_{LL}(k)$ represents the power spectral density of the second time domain noise signal, and $\Gamma_{NN}(k)$ represents the power spectral density of the final noise; the spatial cross-correlation function models the scattered (diffuse) noise field environment, in which $d_{LR}$ is the distance between the two microphones in meters and $c$ is the propagation speed of sound in the medium in meters per second; $H(k)$ is the adaptive filter frequency domain coefficient, and $k$ is the frequency domain index.
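A commonly used model for the spatial coherence of two omnidirectional microphones in a diffuse noise field is sin(2πf·d_LR/c) / (2πf·d_LR/c). The sketch below evaluates that textbook model; whether it matches the exact expression used in this embodiment is an assumption, as are the function name `diffuse_coherence` and the default sound speed of 343 m/s.

```python
import numpy as np

def diffuse_coherence(freqs_hz, mic_distance_m, sound_speed=343.0):
    """Textbook diffuse-field coherence between two microphones.

    np.sinc(x) computes sin(pi*x)/(pi*x), so passing 2*f*d/c yields
    sin(2*pi*f*d/c) / (2*pi*f*d/c).
    """
    freqs_hz = np.asarray(freqs_hz, dtype=float)
    return np.sinc(2.0 * freqs_hz * mic_distance_m / sound_speed)
```

The coherence is 1 at 0 Hz and first reaches zero at f = c / (2·d_LR), so a wider microphone spacing decorrelates the diffuse noise at lower frequencies.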
Compared with the Wiener solution, the noise power spectral density calculation method of this embodiment has higher accuracy, a lower amount of calculation, and a better subsequent noise reduction effect.
Referring to fig. 4, fig. 4 is a flowchart of a noise power spectral density calculation method according to an embodiment of the invention. It differs from the above embodiment in that the adaptive filter uses a block frequency domain adaptive filtering algorithm to further reduce the amount of calculation. The differing part is described in detail below; for the other parts, which are not repeated here, refer to the above embodiment. The method mainly includes the following steps:
S401, collecting a time domain noise signal, and processing the time domain noise signal to obtain a frequency domain noise signal.
S402, processing the frequency domain noise signal by adopting an adaptive filter to obtain a time domain error signal, wherein an algorithm part of the adaptive filter adopts a block frequency domain adaptive filtering algorithm.
According to digital signal processing theory, the overlap-save and overlap-add methods are two efficient algorithms for fast convolution, that is, for computing a linear convolution using the DFT. Efficiency is highest when the blocks overlap by fifty percent (when the block size equals the number of filter weights), so when an adaptive filter is used to solve for the noise power spectral density, a block frequency domain adaptive filtering algorithm is adopted to further reduce the amount of calculation.
Referring to fig. 5, fig. 5 is a flowchart of a block frequency domain adaptive filtering algorithm. The algorithm proceeds as follows:
The first frequency domain noise signal obtained in operation S401 is taken as the adaptive filter input x(n). If the block length is 64 sampling points, two consecutive blocks of data are concatenated to give 128 points of time domain data, and a fast Fourier transform gives 128 points of frequency domain data X(k).
The adaptive filter has a length of 64 points; it is zero-padded at the end to 128 points, and a fast Fourier transform gives W(k). X(k) and W(k) are multiplied point by point to obtain Y(k), and an inverse fast Fourier transform of Y(k) gives the estimated signal y(n). The result still has 128 points; because the first 64 points are the result of circular convolution and are corrupted, only the last 64 points are taken as the output y(n). Subtracting y(n) from the expected response d(n) (the second frequency domain noise signal) gives the error signal e(n). Because the first 64 points of y(n) were discarded, e(n) must be zero-padded at the front to 128 points before it is transformed to the frequency domain by a fast Fourier transform to give E(k). The transpose of X(k) is then multiplied point by point with E(k), an inverse fast Fourier transform is applied, and the last 64 points are discarded; the result is again zero-padded at the end to 128 points and transformed to the frequency domain, and the frequency domain filter coefficients are updated to obtain W(k+1). The iteration continues in this way, and the error becomes smaller and smaller as the filter converges.
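One iteration of the block procedure described above can be sketched as follows, using the overlap-save structure with a block length B and 2B-point transforms. Several details are assumptions made for the example: the function name `fdaf_block`, the complex conjugate of X(k) used where the text says "transpose" (the usual textbook form), and the step size normalized by the largest bin power of the current block, which is chosen here only to keep the sketch stable.

```python
import numpy as np

def fdaf_block(x_prev, x_cur, d_cur, W, mu=0.5, eps=1e-8):
    """One overlap-save block of a frequency domain adaptive filter.

    x_prev, x_cur: previous and current input blocks (length B each).
    d_cur:         desired-signal block (length B).
    W:             current frequency domain weights (length 2B, complex).
    Returns the error block e and the updated weights.
    """
    B = len(x_cur)
    X = np.fft.fft(np.concatenate([x_prev, x_cur]))   # 2B-point input spectrum
    y = np.fft.ifft(X * W).real[B:]                   # keep the last B points
    e = d_cur - y                                     # error block
    E = np.fft.fft(np.concatenate([np.zeros(B), e]))  # zero-pad at the front
    grad = np.fft.ifft(np.conj(X) * E).real[:B]       # discard the last B points
    step = mu / (np.max(np.abs(X) ** 2) + eps)        # assumed normalization
    # Zero-pad the gradient at the end and update in the frequency domain.
    W_new = W + step * np.fft.fft(np.concatenate([grad, np.zeros(B)]))
    return e, W_new
```

Calling the function block after block, with `W` initialized to zeros and `x_prev` set to the previous input block, reproduces the iteration described above; the time domain coefficients can be read back as the first B points of `np.fft.ifft(W).real`.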
In this embodiment, each block of the block frequency domain adaptive filtering algorithm has a time domain length of 128 points and an effective frequency domain length of 65 points, and each filter may use 7 to 13 blocks. The invention does not limit the specific time domain length, effective frequency domain length or number of blocks, which are set according to the actual signal processing requirements.
S403, calculating the power spectral density of the time domain error signal, and calculating the noise power spectral density according to the power spectral density of the error signal.
In the embodiment, the noise signal is further subjected to block processing by adopting a block frequency domain adaptive filtering algorithm, so that the calculated amount can be further reduced.
Referring to fig. 6, fig. 6 is a schematic diagram of a noise power spectrum density calculating system according to an embodiment of the invention, which may be built in an electronic device, the noise power spectrum density calculating system mainly includes: a first processing module 601, a second processing module 602, and a computing module 603.
The first processing module 601 collects a time domain noise signal, and processes the time domain noise signal to obtain a frequency domain noise signal. Specifically, the first processing module 601 collects noise signals by using two microphones, where the collected noise signals are time domain signals, and one microphone collects a first time domain noise signal and the other microphone collects a second time domain noise signal; respectively framing and windowing the acquired first time domain noise signal and the acquired second time domain noise signal; the signals after framing and windowing are converted into a first frequency domain noise signal corresponding to the first time domain noise signal and a second frequency domain noise signal corresponding to the second time domain noise signal through fast Fourier transformation.
The second processing module 602 processes the frequency domain noise signal by using an adaptive filter to obtain a time domain error signal, where an algorithm part of the adaptive filter is a normalized minimum mean square error algorithm.
Specifically, the coefficients of the adaptive filter may be initialized to 0in their entirety, or may be initialized according to other criteria.
The first frequency domain noise signal obtained by the first processing module 601 is used as the adaptive filter input x(n) of the second processing module 602, and the signal is processed by the NLMS algorithm to obtain the filter impulse response signal h(n) of the first frequency domain noise signal.
In the filter impulse response formula, α is a fixed step size, ζ is a regularization factor, and X(n) is the input signal vector, X(n) = [x(n-m+1), x(n-m+2), …, x(n)].
The first frequency domain noise signal is convolved with the corresponding filter impulse response signal to obtain an estimated signal, i.e. the estimated signal y(n) = x(n) * h(n), where the asterisk represents convolution.
The second processing module 602 subtracts the estimated signal from the second frequency domain noise signal obtained by the first processing module 601 and performs an inverse Fourier transform to obtain a time domain error signal, i.e. the time domain error signal e(n) = d(n) - y(n) = d(n) - x(n) * h(n), where d(n) denotes the second frequency domain noise signal.
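The NLMS steps described above can be sketched as follows. This is a minimal sample-by-sample sketch in Python with NumPy, not the patented implementation. The update rule used is the textbook NLMS recursion h ← h + α·e(n)·X(n) / (X(n)ᵀX(n) + ζ), which is an assumption here because the formula itself is not reproduced in the text. The function name `nlms`, the tap count `m` and the default values are hypothetical.

```python
import numpy as np


def nlms(x, d, m=8, alpha=0.5, zeta=1e-6):
    """Textbook NLMS sketch (assumed update rule, hypothetical defaults).

    x: filter input, d: expected response, m: number of taps,
    alpha: fixed step size, zeta: regularization factor.
    Returns the final coefficients h and the error signal e(n).
    """
    x = np.asarray(x, dtype=float)
    d = np.asarray(d, dtype=float)
    h = np.zeros(m)                 # coefficients initialized to 0
    e = np.zeros(x.size)
    for n in range(m - 1, x.size):
        # Input vector in reversed order [x(n), ..., x(n-m+1)] so that
        # h @ X equals the convolution sum over h(k) * x(n-k).
        X = x[n - m + 1:n + 1][::-1]
        y = h @ X                   # estimated signal y(n)
        e[n] = d[n] - y             # error e(n) = d(n) - y(n)
        h = h + alpha * e[n] * X / (X @ X + zeta)
    return h, e
```

In a noise-free test where d(n) is x(n) filtered by a short known impulse response, the returned coefficients should approach that response.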
In order to further reduce the calculation amount, when the adaptive filter processes signals, a block frequency domain adaptive filtering algorithm is adopted to process the acquired noise signals.
Specifically, the first frequency domain noise signal obtained by the first processing module 601 is taken as the adaptive filter input x(n) of the second processing module 602. If the block length is 64 sampling points, two blocks of data are connected in series to obtain 128 points of time domain data, and 128 points of frequency domain data X(k) are obtained after fast Fourier transformation.
The adaptive filter has a length of 64 points; it is zero-padded at the end to 128 points and transformed by fast Fourier transform to obtain W(k). X(k) and W(k) are multiplied point by point to obtain Y(k), and Y(k) is transformed by inverse fast Fourier transform to obtain the estimated signal y(n). The data obtained here is still 128 points; because the first 64 points are the result of circular convolution and are corrupted, they are discarded and only the last 64 points are taken as the output y(n). Subtracting y(n) from the expected response d(n) (the second frequency domain noise signal) gives the error signal e(n). Because the first 64 points of y(n) were discarded, e(n) is zero-padded at the front to 128 points before being transformed by fast Fourier transform into the frequency domain signal E(k). The transpose of X(k) is then multiplied point by point with E(k), the product is transformed by inverse fast Fourier transform, and the last 64 points of the result are discarded; the remaining 64 points are zero-padded at the end to 128 points, transformed back to the frequency domain, and used to update the frequency domain filter coefficients to W(k+1). The iteration continues in this way, and the error becomes smaller and smaller as the filter converges. Here k represents the frequency domain index.
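The block procedure above corresponds to an overlap-save frequency domain adaptive filter, which can be sketched as follows in Python with NumPy. This is a sketch under stated assumptions, not the patented implementation: the text gives no step size, so a single step normalized by the energy of the current 128-point input is assumed; the complex conjugate of X(k) is used where the text says transpose, as in the standard algorithm; the names `block_fdaf`, `alpha` and `eps` are hypothetical.

```python
import numpy as np


def block_fdaf(x, d, block=64, alpha=1.0, eps=1e-12):
    """Overlap-save block frequency domain adaptive filter (sketch).

    The energy-normalized step size is an assumption of this sketch.
    Returns the 64-tap time domain filter and the error signal.
    """
    x = np.asarray(x, dtype=float)
    d = np.asarray(d, dtype=float)
    n_fft = 2 * block
    W = np.zeros(n_fft, dtype=complex)     # frequency domain coefficients
    prev = np.zeros(block)                 # previous input block
    n_blocks = x.size // block
    err = np.zeros(n_blocks * block)
    for b in range(n_blocks):
        seg = slice(b * block, (b + 1) * block)
        cur = x[seg]
        # Two blocks in series -> 128 time points -> X(k).
        X = np.fft.fft(np.concatenate([prev, cur]))
        # First 64 output points are circular-convolution artifacts.
        y = np.fft.ifft(X * W).real[block:]
        e = d[seg] - y
        err[seg] = e
        # Zero-pad the error at the front to 128 points -> E(k).
        E = np.fft.fft(np.concatenate([np.zeros(block), e]))
        grad = np.fft.ifft(np.conj(X) * E).real
        grad[block:] = 0.0                 # discard the last 64 points
        # Assumed normalization: bounds the largest eigenvalue of the
        # block correlation matrix, so the update cannot diverge.
        mu = alpha / (block * np.sum(np.abs(X) ** 2) / n_fft + eps)
        W = W + mu * np.fft.fft(grad)
        prev = cur
    return np.fft.ifft(W).real[:block], err
```

A per-bin power normalization is a common alternative; the global normalization is chosen here only because its stability is easy to reason about.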
In this embodiment, the length of each block of the block frequency domain adaptive filtering algorithm is 128 points, the effective length of the frequency domain is 65 points, and the number of blocks used by each filter can be 7 to 13. The specific time domain length, frequency domain effective length and number of blocks are not limited in the invention, and are set according to the actual signal processing requirements.
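The figure of 65 effective frequency domain points for a 128-point block follows from the conjugate symmetry of the spectrum of a real signal: only 128 / 2 + 1 bins are independent. A short check in Python with NumPy (the helper name `effective_bins` is hypothetical):

```python
import numpy as np


def effective_bins(frame):
    """Return the non-redundant spectrum of a real frame (hypothetical helper)."""
    # rfft keeps only bins 0 .. N/2 of the N-point FFT of a real signal.
    return np.fft.rfft(np.asarray(frame, dtype=float))


frame = np.random.default_rng(0).standard_normal(128)
spectrum = effective_bins(frame)
n_bins = spectrum.size                          # 128 // 2 + 1
restored = np.fft.irfft(spectrum, n=frame.size)  # full frame is recoverable
```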
The calculation module 603 calculates the power spectral density of the time domain error signal and calculates the noise power spectral density from the power spectral density of the error signal.
The power spectral density of the time domain error signal is calculated as follows:
ΓEE(k)=FFT(γee(τ))=FFT(e(n)·e(n+τ))
e(n) denotes the time domain error signal, γee(τ) is the autocorrelation function of e(n), τ is the time delay, FFT() denotes the fast Fourier transform operation, and ΓEE(k) denotes the power spectral density of e(n).
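The relation ΓEE(k) = FFT(γee(τ)) can be illustrated with a short sketch in Python with NumPy. The formula in the text does not state how the product e(n)·e(n+τ) is averaged, so this sketch assumes a circular autocorrelation averaged over one frame of N samples; the function name `psd_from_autocorrelation` is hypothetical.

```python
import numpy as np


def psd_from_autocorrelation(e):
    """PSD of e(n) as the FFT of its autocorrelation (sketch).

    Assumes a circular autocorrelation averaged over the frame:
    gamma_ee(tau) = (1/N) * sum_n e(n) * e((n + tau) mod N).
    """
    e = np.asarray(e, dtype=float)
    n = e.size
    # np.roll(e, -tau)[i] equals e[(i + tau) % n].
    gamma = np.array([e @ np.roll(e, -tau) for tau in range(n)]) / n
    # The autocorrelation is symmetric, so its FFT is real up to rounding.
    return np.fft.fft(gamma).real
```

Under this assumption the result equals the periodogram |E(k)|² / N, which can be computed directly from one FFT of e(n) and is much cheaper than forming the autocorrelation first.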
The noise power spectral density is calculated as follows:
ΓRR(k)=FFT(γrr(τ))=FFT(r(n)·r(n+τ))
ΓLL(k)=FFT(γll(τ))=FFT(l(n)·l(n+τ))
Here, Γroot(k) is an intermediate variable and Re denotes the real part; γrr(τ) is the autocorrelation function of the first time domain noise signal and γll(τ) is the autocorrelation function of the second time domain noise signal, with τ the time delay; ΓEE(k) represents the power spectral density of e(n), ΓRR(k) represents the power spectral density of the first time domain noise signal, ΓLL(k) represents the power spectral density of the second time domain noise signal, and ΓNN(k) represents the final noise power spectral density. The calculation also uses a spatial coherence function model for a diffuse (scattered) noise field environment, in which dLR represents the distance between the two microphones in meters, c represents the propagation velocity of sound in the medium in meters per second, and H(k) is the frequency domain coefficient of the adaptive filter.
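The spatial coherence model itself is not reproduced in the text. A commonly used model for two omnidirectional microphones in a spherically diffuse noise field is sin(2πf·dLR/c) / (2πf·dLR/c); the sketch below in Python with NumPy assumes that model, and it may differ from the expression used in the patent. The function name `diffuse_coherence` and the default sound speed of 343 m/s are hypothetical.

```python
import numpy as np


def diffuse_coherence(freqs_hz, d_lr, c=343.0):
    """Diffuse-field coherence between two microphones (assumed sinc model).

    freqs_hz: frequency or array of frequencies in Hz,
    d_lr: microphone distance in meters, c: sound speed in meters per second.
    """
    f = np.asarray(freqs_hz, dtype=float)
    # np.sinc(t) = sin(pi t) / (pi t), so t = 2 f d / c gives
    # sin(2 pi f d / c) / (2 pi f d / c).
    return np.sinc(2.0 * f * d_lr / c)
```

Under this model the coherence is 1 at 0 Hz and first reaches zero at f = c / (2·dLR), about 1715 Hz for a 0.1 m spacing.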
Referring to fig. 7, fig. 7 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the invention.
The electronic device described in this embodiment includes:
The memory 71, the processor 72 and a computer program stored on the memory 71 and executable on the processor, which when executed implements the noise signal power spectral density calculation method described in the embodiments shown in the foregoing fig. 1 or fig. 2 or fig. 4.
Further, the electronic device further includes:
At least one input device 73; at least one output device 74.
The memory 71, the processor 72, the input device 73 and the output device 74 are connected by a bus 75.
The input device 73 may be a camera, a touch panel, a physical button, a mouse, or the like. The output device 74 may be a display screen in particular.
The memory 71 may be a high-speed random access memory (RAM, Random Access Memory) or a non-volatile memory, such as a disk memory. The memory 71 is used to store a set of executable program code, and the processor 72 is coupled to the memory 71.
Further, the embodiments of the present disclosure also provide a computer readable storage medium, which may be provided in the terminal in each of the above embodiments, and may be the memory in the embodiment shown in fig. 7. The computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the noise signal power spectral density calculation method described in the embodiments shown in the foregoing fig. 1 or fig. 2 or fig. 4. Further, the computer readable storage medium may be a USB flash drive, a removable hard disk, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk, an optical disk, or any other medium that can store program code.
In the various embodiments provided herein, it should be understood that the disclosed apparatus and methods may be implemented in other ways. The embodiments described above are merely illustrative; for example, the division of the modules is merely a division by logical function, and other divisions are possible in an actual implementation: multiple modules or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the couplings, direct couplings or communication links shown or discussed may be indirect couplings or communication links through interfaces or modules, and may be electrical, mechanical, or in other forms.
The modules illustrated as separate components may or may not be physically separate, the components shown as modules may or may not be physical modules, i.e., may be located in one place, or may be distributed across multiple network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional module in each embodiment of the present disclosure may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module. The integrated modules may be implemented in hardware or in software functional modules.
It should be noted that, for the sake of simplicity of description, the foregoing method embodiments are all expressed as a series of combinations of actions, but it should be understood by those skilled in the art that the present disclosure is not limited by the order of actions described, as some steps may be performed in other order or simultaneously in accordance with the present disclosure. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily all necessary for the present disclosure.
In the foregoing embodiments, each embodiment is described with its own emphasis; for parts of one embodiment that are not described in detail, reference may be made to the related descriptions of other embodiments.
The foregoing describes the noise signal power spectral density calculation method, system, apparatus and medium provided by the present disclosure. Those skilled in the art may, based on the concepts of the embodiments of the present disclosure, make changes to the specific implementations and application scope; in summary, the content of this specification should not be construed as limiting the present disclosure.

Claims (10)

1. A method for calculating noise power spectral density, comprising:
Collecting a time domain noise signal, and processing the time domain noise signal to obtain a frequency domain noise signal, wherein the frequency domain noise signal comprises a first frequency domain noise signal and a second frequency domain noise signal;
Processing the frequency domain noise signal by adopting an adaptive filter to obtain a time domain error signal, wherein the method comprises the following steps:
Carrying out normalized least mean square (NLMS) processing on the first frequency domain noise signal to obtain a filter impulse response signal corresponding to the first frequency domain noise signal;
Convolving the first frequency domain noise signal and the corresponding filter impulse response signal to obtain an estimated signal;
subtracting the second frequency domain noise signal from the estimated signal, and obtaining the time domain error signal through inverse Fourier transform;
And calculating the power spectrum density of the time domain error signal, and calculating the noise power spectrum density according to the power spectrum density of the time domain error signal.
2. The method of claim 1, wherein the processing the frequency domain noise signal with an adaptive filter to obtain a time domain error signal further comprises:
and performing block adaptive filtering processing on the first frequency domain noise signal by adopting a block frequency domain adaptive filtering algorithm.
3. The noise power spectral density calculation method according to claim 1, characterized in that the method further comprises:
and updating the filter coefficient of the adaptive filter according to the time domain error signal.
4. The method of claim 1, wherein the time domain noise signal comprises a first time domain noise signal and a second time domain noise signal, and wherein the acquiring the time domain noise signal comprises:
and framing, windowing and performing fast Fourier transform on the first time domain noise signal and the second time domain noise signal to obtain a first frequency domain noise signal corresponding to the first time domain noise signal and a second frequency domain noise signal corresponding to the second time domain noise signal.
5. The method of claim 2, wherein in the block adaptive filtering process of the first frequency domain noise signal, the point location data corresponding to the first frequency domain noise signal is required to be partially removed and zero-padded.
6. The method of claim 1, wherein calculating the power spectral density of the time domain error signal is performed by:
ΓEE(k)=FFT(γee(τ))=FFT(e(n)·e(n+τ))
e(n) represents the time domain error signal, γee(τ) is the autocorrelation function of e(n), τ is the time delay, FFT() represents the fast Fourier transform operation, and ΓEE(k) represents the power spectral density of e(n).
7. A noise power spectral density calculation system, comprising:
The first processing module is used for acquiring a time domain noise signal and processing the time domain noise signal to obtain a frequency domain noise signal, wherein the frequency domain noise signal comprises a first frequency domain noise signal and a second frequency domain noise signal;
the second processing module is configured to process the frequency domain noise signal by using an adaptive filter to obtain a time domain error signal, and includes:
Carrying out normalized least mean square (NLMS) processing on the first frequency domain noise signal to obtain a filter impulse response signal corresponding to the first frequency domain noise signal;
Convolving the first frequency domain noise signal and the corresponding filter impulse response signal to obtain an estimated signal;
subtracting the second frequency domain noise signal from the estimated signal, and obtaining the time domain error signal through inverse Fourier transform;
And the calculation module is used for calculating the power spectrum density of the time domain error signal and calculating the noise power spectrum density according to the power spectrum density of the error signal.
8. The noise power spectral density calculation system of claim 7, wherein said second processing module processing said frequency domain noise signal with an adaptive filter to obtain a time domain error signal further comprises:
and performing block adaptive filtering processing on the first frequency domain noise signal by adopting a block frequency domain adaptive filtering algorithm.
9. An electronic device, comprising:
A processor;
a memory storing a computer executable program that, when executed by the processor, causes the processor to perform the noise power spectral density calculation method of any one of claims 1-6.
10. A computer readable storage medium having stored thereon a computer program, wherein the program when executed by a processor implements the noise power spectral density calculation method according to any of claims 1-6.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910612851.4A CN112201273B (en) 2019-07-08 2019-07-08 Noise power spectral density calculation method, system, equipment and medium

Publications (2)

Publication Number Publication Date
CN112201273A CN112201273A (en) 2021-01-08
CN112201273B true CN112201273B (en) 2024-08-02

Family

ID=74004516

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910612851.4A Active CN112201273B (en) 2019-07-08 2019-07-08 Noise power spectral density calculation method, system, equipment and medium

Country Status (1)

Country Link
CN (1) CN112201273B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114242096B (en) * 2021-08-20 2024-07-05 北京士昌鼎科技有限公司 Noise reduction system based on time-frequency domain
CN113808608B (en) * 2021-09-17 2023-07-25 随锐科技集团股份有限公司 Method and device for suppressing mono noise based on time-frequency masking smoothing strategy
CN114112006B (en) * 2021-11-26 2024-08-16 中科传启(苏州)科技有限公司 Noise monitoring method and device and electronic equipment
CN116528099A (en) * 2022-01-24 2023-08-01 Oppo广东移动通信有限公司 Audio signal processing method and device, earphone device and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101369427A (en) * 2007-08-13 2009-02-18 哈曼贝克自动系统股份有限公司 Noise reduction by combined beamforming and post-filtering
CN102938254A (en) * 2012-10-24 2013-02-20 中国科学技术大学 Voice signal enhancement system and method

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1188547A (en) * 1995-06-21 1998-07-22 艾利森电话股份有限公司 Power spectral density estimation method and apparatus
CN102263599B (en) * 2010-05-25 2015-06-10 中兴通讯股份有限公司 Intelligent antenna array simulation method and apparatus thereof
KR101550501B1 (en) * 2013-11-12 2015-09-04 고려대학교 산학협력단 Residual echo cancellation method and apparatus
CN103813251B (en) * 2014-03-03 2017-01-11 深圳市微纳集成电路与系统应用研究院 Hearing-aid denoising device and method allowable for adjusting denoising degree
CN107393550B (en) * 2017-07-14 2021-03-19 深圳永顺智信息科技有限公司 Voice processing method and device
EP3474280B1 (en) * 2017-10-19 2021-07-07 Goodix Technology (HK) Company Limited Signal processor for speech signal enhancement
EP3701526B1 (en) * 2017-10-26 2024-02-21 Bose Corporation Noise estimation using coherence

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TG01 Patent term adjustment