CN114220446A - Adaptive background noise detection method, system and medium - Google Patents
- Publication number: CN114220446A
- Application number: CN202111512446.9A
- Authority: CN (China)
- Prior art keywords: sound signal; sound; signal; frequency spectrum; background noise
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G (Physics) > G10 (Musical instruments; acoustics) > G10L (Speech analysis techniques or speech synthesis; speech recognition; speech or voice processing techniques; speech or audio coding or decoding)
- G10L21/0208: Noise filtering
- G10L21/0216: Noise filtering characterised by the method used for estimating noise
- G10L21/0232: Processing in the frequency domain
- G10L25/45: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of analysis window
- G10L25/84: Detection of presence or absence of voice signals for discriminating voice from noise
- G10L2021/02087: Noise filtering the noise being separate speech, e.g. cocktail party
Abstract
The invention provides an adaptive background noise detection method comprising the following steps: S1, acquiring a sound signal that belongs to the background noise detection interval; S2, performing a fast Fourier transform on each frame signal of the sound signal and estimating the spectral amplitude of the sound signal; S3, performing steady-state statistical analysis on the spectral amplitude of the sound signal to determine how the spectral amplitude changes; S4, performing dynamic statistical analysis on the spectral amplitude of the sound signal to determine how the variance of the spectral amplitude changes; and S5, judging whether the sound signal is background noise according to the change in the spectral amplitude and the change in the spectral amplitude variance. Because the judgment is based on the spectral characteristics of the sound signal, the method can adjust promptly as the detected ambient noise changes and achieves higher detection accuracy.
Description
Technical Field
The invention relates to the technical field of noise detection, in particular to an adaptive background noise detection method, an adaptive background noise detection system and an adaptive background noise detection medium.
Background
Systems that record, transmit or detect sound usually pick up background noise, which degrades sound quality, causes auditory interference and reduces detection accuracy. Eliminating or suppressing background noise is therefore an important problem.
Many sound noise reduction methods exist, but the options for a single microphone are limited, and noise reduction based on spectral subtraction is generally used. Spectral subtraction needs an effective estimate of the background noise to work well; otherwise it can be counterproductive and introduce additional noise. Voice Activity Detection (VAD) in the prior art usually decides based on the magnitude of the voice energy, but because the noise in the environment changes and such a detection method cannot adjust in time to follow it, the detection accuracy is not ideal.
Therefore, it is important to provide a background noise detection method, system and medium that can adjust in real time as the detected ambient noise changes, so as to achieve high detection accuracy.
Disclosure of Invention
The invention provides an adaptive background noise detection method, an adaptive background noise detection system and an adaptive background noise detection medium, which aim to solve the technical problem that detection accuracy is low because prior-art voice activity detection methods cannot adjust in time as the detected ambient noise changes.
According to a first aspect of the present application, an adaptive background noise detection method is provided, which includes the following steps:
S1, acquiring a sound signal belonging to the background noise detection interval;
S2, performing a fast Fourier transform on each frame signal of the sound signal, and estimating the spectral amplitude of the sound signal;
S3, performing steady-state statistical analysis on the spectral amplitude of the sound signal, and calculating how the spectral amplitude of the sound signal changes;
S4, performing dynamic statistical analysis on the spectral amplitude of the sound signal, and calculating how the variance of the spectral amplitude of the sound signal changes; and
S5, judging whether the sound signal is background noise according to the change in the spectral amplitude of the sound signal and the change in the spectral amplitude variance of the sound signal.
The sound signal in the background noise detection interval is collected and subjected to steady-state statistical analysis and dynamic statistical analysis. If the analysis results show that both the spectral amplitude and the spectral amplitude variance of the sound signal remain in a stable state, the sound signal is judged to be background noise, and its parameters are recorded as the basis for eliminating background noise. The method can adjust promptly as the detected ambient noise changes, judges background noise effectively, and has higher detection accuracy.
Preferably, the step S3 specifically includes:
S31, calculating the average spectral amplitude of the sound signal within each 1 second, and recording continuously for 5 seconds;
S32, calculating a first average standard deviation between the 5 average spectral amplitudes obtained within the 5 seconds;
S33, judging whether the first average standard deviation is smaller than a first threshold, and if so, determining that the spectral amplitude of the sound signal is maintained in a stable state.
Comparing the first average standard deviation between the 5 average spectral amplitudes obtained within 5 seconds with the first threshold shows whether the spectral amplitude of the sound signal remained stable during that time, so the comparison can serve as one of the bases for judging whether the sound signal is background noise.
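Steps S31 to S33 can be sketched in Python as follows. This is a minimal sketch rather than the patent's implementation: the function name, the array layout (frames by frequency bins), and the choice to average the per-bin standard deviation over all bins are assumptions, and the first threshold is passed in as a parameter because its value is not stated at this point.

```python
import numpy as np

def steady_state_is_stable(frame_spectra, frames_per_second, first_threshold, seconds=5):
    """Return (is_stable, avg_std) for spectra shaped (num_frames, num_bins)."""
    spectra = np.asarray(frame_spectra, dtype=float)
    needed = frames_per_second * seconds
    if spectra.shape[0] < needed:
        raise ValueError("need at least %d frames" % needed)
    # S31: one average spectrum per second -> shape (seconds, num_bins)
    per_second = spectra[:needed].reshape(seconds, frames_per_second, -1).mean(axis=1)
    # S32: standard deviation across the per-second averages for each bin,
    # then averaged over bins (assumed reading of "average standard deviation")
    avg_std = per_second.std(axis=0).mean()
    # S33: stable when the statistic is below the first threshold
    return avg_std < first_threshold, avg_std
```

A perfectly constant spectrum gives a statistic of zero and is reported as stable for any positive threshold.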
Preferably, the step S4 specifically includes:
S41, calculating the average spectral amplitude variance of the sound signal within each 1 second, and recording continuously for 5 seconds, wherein the specific calculation formula is as follows:
where $V_{x,j}(k)$ denotes the average spectral amplitude variance of the sound signal in the j-th second, $X_i(k)$ denotes the spectral amplitude of the i-th frame signal, $X_{a,j}(k)$ denotes the average spectral amplitude of the sound signal in the j-th second, and N denotes the number of frames in a 1-second sound signal;
S42, calculating a second average standard deviation between the 5 average spectral amplitude variances obtained within the 5 seconds;
S43, judging whether the second average standard deviation is smaller than a second threshold, and if so, determining that the spectral amplitude variance of the sound signal is maintained in a stable state.
Comparing the second average standard deviation between the 5 average spectral amplitude variances obtained within 5 seconds with the second threshold shows whether the spectral amplitude variance of the sound signal remained stable during that time, so the comparison can serve as one of the bases for judging whether the sound signal is background noise.
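Steps S41 to S43 can be sketched in the same way. This is a minimal sketch under stated assumptions: the function name and array layout are hypothetical, the per-second variance is taken as the population variance (divisor N) of the frame spectra around that second's average, and the second threshold is a caller-supplied parameter.

```python
import numpy as np

def dynamic_is_stable(frame_spectra, frames_per_second, second_threshold, seconds=5):
    """Return (is_stable, avg_std) for spectra shaped (num_frames, num_bins)."""
    spectra = np.asarray(frame_spectra, dtype=float)
    needed = frames_per_second * seconds
    if spectra.shape[0] < needed:
        raise ValueError("need at least %d frames" % needed)
    blocks = spectra[:needed].reshape(seconds, frames_per_second, -1)
    # S41: variance of the frame spectra within each second (assumed 1/N form)
    per_second_var = blocks.var(axis=1)  # shape (seconds, num_bins)
    # S42: standard deviation across the per-second variances, averaged over bins
    avg_std = per_second_var.std(axis=0).mean()
    # S43: stable when the statistic is below the second threshold
    return avg_std < second_threshold, avg_std
```

A signal whose frame-to-frame fluctuation is the same in every second passes this check even if the fluctuation itself is large, which is what distinguishes it from the steady-state check on the averages.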
Preferably, the step S1 specifically includes:
S11, collecting the sound signal, and preprocessing the sound signal;
S12, setting the initial detection state of the sound signal, and preprocessing the sound signal;
S13, estimating the sound energy of the sound signal and judging whether it is smaller than a third threshold; if so, executing step S2, and if not, returning to step S12.
The sound energy of the processed sound signal is estimated and compared with the third threshold; if the sound energy is smaller than the third threshold, the sound signal is in the background noise detection interval.
Further preferably, the step S11 specifically includes:
S111, collecting a sound signal, and converting the sound signal into a voltage signal;
S112, amplifying the voltage signal;
S113, filtering the amplified voltage signal so as to adjust the spectral response of the voltage signal;
S114, converting the voltage signal into a digital signal.
Through the preprocessing step, the collected sound signals are converted into digital signals which can be identified in the subsequent detection step.
Further preferably, the step S12 of setting the initial detection state of the sound signal specifically includes:
S121, setting a sampling time, computing the average value of the initial parameters of the sound signal over the sampling time, taking the average value as the initial parameters of the sound signal, and dynamically updating the initial parameters of the sound signal.
Further preferably, preprocessing the sound signal in step S12 specifically includes:
S122, extracting each frame signal from the sound signal, and performing spectrum equalization on the sound signal.
Extracting the frame signals from the sound signal can reduce distortion in the spectrum. Spectrum equalization compensates for distortion introduced during sound collection, and it can also emphasize a particular frequency band, increasing or decreasing that band's weight in background noise detection.
Further preferably, estimating the sound energy of the sound signal in step S13 specifically includes estimating the sound energy of each frame signal in the sound signal, wherein the specific estimation formula is as follows:
where $E_i$ denotes the sound energy of the i-th frame signal, x(n) denotes the frame signal corresponding to the i-th frame, and k denotes the total number of frames of the sound signal.
Further preferably, the third threshold in step S13 is set to 4 times the average sound energy of the sound signal, where the minimum sound energy among the frame signals in the first 5 seconds of the sound signal is taken as the average sound energy of the sound signal.
By comparing the sound energy of the sound signal with the third threshold, which sound signals belong to the section for detecting the background noise can be discriminated.
Preferably, the step S5 specifically includes:
S51, judging whether the spectral amplitude of the sound signal is maintained in a stable state; if so, executing step S52, otherwise returning to step S121;
S52, judging whether the spectral amplitude variance of the sound signal is maintained in a stable state; if so, executing step S53, otherwise returning to step S121;
S53, recording and updating the parameters of the sound signal into the background noise data, and returning to step S121.
Through the steps, when the spectral amplitude of the sound signal is maintained in a stable state and the variance of the spectral amplitude of the sound signal is maintained in the stable state, the sound signal is judged to be background noise, and the parameters of the sound signal are recorded and updated.
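The decision in S51 to S53 amounts to requiring both stability checks before the noise record is updated. The sketch below illustrates that control flow only; the function name, the dictionary used as the background noise record, and the parameter names are hypothetical, and the return to step S121 is left to the caller.

```python
def update_background_noise(amplitude_stable, variance_stable, signal_params, noise_data):
    """Update noise_data in place and return True only when both checks pass."""
    if not amplitude_stable:   # S51 fails: caller returns to S121
        return False
    if not variance_stable:    # S52 fails: caller returns to S121
        return False
    # S53: record and update the parameters of the sound signal
    noise_data.update(signal_params)
    return True
```

Keeping the record untouched when either check fails means a transient sound cannot overwrite a previously learned noise profile.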
Further preferably, after the step S4 and before the step S5, the method further includes:
S4a, storing the statistical data of the steady-state statistical analysis and the dynamic statistical analysis;
S5a, judging whether the storage time of the statistical data exceeds a preset time; if so, executing step S5, and if not, returning to step S122.
According to a second aspect of the present application, an adaptive background noise detection system is proposed, comprising:
a sound collection device configured to collect a sound signal and preprocess the sound signal;
a processor arithmetic unit configured to: set an initial detection state of the sound signal and preprocess the sound signal; estimate the sound energy of the sound signal and determine whether it is less than a third threshold; if not, reset the initial detection state of the sound signal and preprocess the sound signal again; if so, estimate the spectral amplitude of the sound signal, perform steady-state statistical analysis and dynamic statistical analysis on the sound signal, and determine from the results of both analyses whether the sound signal is background noise; if it is background noise, record and update the parameters of the sound signal into the background noise data, then reset the initial detection state of the sound signal and preprocess the sound signal; if it is not, directly reset the initial detection state of the sound signal and preprocess the sound signal;
a memory unit configured to store the programs or tables required by the processor arithmetic unit and to temporarily store data during operation;
a statistical record storage unit configured to store the background noise data.
Preferably, the sound collection device specifically includes:
a microphone sound receiving device configured to collect the sound signal and convert the sound signal into a voltage signal;
an amplifier configured to amplify the voltage signal;
a filter configured to filter the amplified voltage signal;
an analog-to-digital converter configured to convert the filtered voltage signal into a digital signal.
According to a third aspect of the present application, a computer-readable storage medium is proposed, which stores a computer program which, when executed by a processor, implements the adaptive background noise detection method according to the first aspect of the present application.
The application provides an adaptive background noise detection method, an adaptive background noise detection system and an adaptive background noise detection medium. A sound signal is collected and preprocessed by a sound collection device, an initial detection state is set, and the signal is preprocessed again. The sound energy of the processed sound signal is then estimated and compared with a third threshold; if the sound energy is smaller than the third threshold, the sound signal is in the background noise detection interval, and steady-state statistical analysis and dynamic statistical analysis are performed on it. If the analysis results show that both the spectral amplitude and the spectral amplitude variance of the sound signal remain in a stable state, the sound signal is judged to be background noise, and its parameters are recorded and updated into the background noise data as the basis for eliminating background noise. The method can adjust promptly as the detected ambient noise changes, judges background noise effectively, and has higher detection accuracy.
Drawings
The accompanying drawings are included to provide a further understanding of the embodiments and are incorporated in and constitute a part of this specification. The drawings illustrate embodiments and together with the description serve to explain the principles of the invention. Other embodiments and many of the intended advantages of embodiments will be readily appreciated as they become better understood by reference to the following detailed description. The elements of the drawings are not necessarily to scale relative to each other. Like reference numerals designate corresponding similar parts.
FIG. 1 is a flow chart of an adaptive background noise detection method according to an embodiment of the present invention;
FIG. 2 is a flow diagram of background noise detection according to an embodiment of the present invention;
FIG. 3 is a flow diagram of a pre-process according to an embodiment of the present invention;
FIG. 4 is a flow diagram of steady state statistical analysis in accordance with a specific embodiment of the present invention;
FIG. 5 is a flow diagram of dynamic statistical analysis in accordance with one illustrative embodiment of the present invention;
FIG. 6 is a flowchart of determining whether an audio signal is background noise according to an embodiment of the present invention;
FIG. 7 is a system block diagram of an adaptive background noise detection system according to an embodiment of the present invention;
FIG. 8 is an architecture diagram of a sound collection device according to an embodiment of the present invention.
Description of reference numerals: 1. a sound collection device; 2. a processor arithmetic unit; 3. a memory unit; 4. a statistical record storage unit; 11. a microphone sound receiving device; 12. an amplifier; 13. a filter; 14. an analog-to-digital converter.
Detailed Description
Features and exemplary embodiments of various aspects of the present invention will be described in detail below, and in order to make objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not to be construed as limiting the invention. It will be apparent to one skilled in the art that the present invention may be practiced without some of these specific details. The following description of the embodiments is merely intended to provide a better understanding of the present invention by illustrating examples of the present invention.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element introduced by the term "comprising" does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
According to a first aspect of the present application, an adaptive background noise detection method is presented. Fig. 1 shows a flowchart of an adaptive background noise detection method according to an embodiment of the present invention, and as shown in fig. 1, the adaptive background noise detection method includes the following steps:
S1, acquiring the sound signal belonging to the background noise detection interval.
In a specific embodiment, a sound signal in the environment is collected by a sound collection device, and the sound signal belongs to a background noise detection interval. Fig. 2 shows a flowchart of background noise detection according to an embodiment of the present invention, and as shown in fig. 2, the detection process is as follows:
S11, collecting the sound signal and preprocessing the sound signal.
Fig. 3 shows a flowchart of the preprocessing according to an embodiment of the present invention, and as shown in fig. 3, in an embodiment, the preprocessing specifically includes:
S111, collecting the sound signal, and converting the sound signal into a voltage signal.
In this embodiment, the conversion of the sound signal into a voltage (analog) signal is also performed on the sound collection device.
S112, amplifying the voltage signal.
Because the voltage signal obtained from the sound signal is weak, it needs to be amplified.
S113, filtering the amplified voltage signal to adjust the spectral response of the voltage signal.
In the filtering process, the spectral response of the voltage signal can be adjusted to perform sound enhancement and equalization.
S114, converting the voltage signal into a digital signal.
The analog voltage signal is converted into a digital signal in preparation for subsequent signal processing.
With continued reference to fig. 1 and 2, after S11:
S12, setting the initial detection state of the sound signal and preprocessing the sound signal.
In a specific embodiment, setting the initial detection state of the sound signal is embodied as:
S121, setting a sampling time, computing the average value of the initial parameters of the sound signal over the sampling time, taking the average value as the initial parameters of the sound signal, and dynamically updating the initial parameters of the sound signal.
Various parameters of the sound signal, such as spectral amplitude and sound energy, are measured over a short period of time; the average value of each parameter is taken as an initial parameter of the sound signal, and the initial parameters are subsequently updated dynamically as the statistical data change.
The preprocessing of the sound signal is specifically as follows:
S122, extracting each frame signal in the sound signal, and performing spectrum equalization on the sound signal.
In the detection process, detection is performed once per frame of the sound signal. Each frame signal can be extracted with, for example but not limited to, a Hanning window, which reduces distortion in the spectrum of the extracted signal. Spectrum equalization is implemented with a digital filter; it can compensate for distortion introduced when the sound collection device collects the sound signal, and it can also emphasize a particular frequency band, increasing or reducing that band's weight in noise detection and thereby improving detection efficiency.
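The frame extraction in S122 can be illustrated with a short Python sketch. It is an assumption-laden example, not the patent's implementation: the function name is hypothetical, frames are taken without overlap, and the default frame length of 512 samples is the value the embodiment assumes for a 16000 Hz sampling frequency. The spectrum equalization filter is not shown because its coefficients are not specified.

```python
import numpy as np

def split_into_windowed_frames(signal, frame_length=512):
    """Cut a 1-D signal into non-overlapping frames and apply a Hanning window."""
    signal = np.asarray(signal, dtype=float)
    num_frames = len(signal) // frame_length
    # Drop the trailing samples that do not fill a whole frame
    frames = signal[:num_frames * frame_length].reshape(num_frames, frame_length)
    # The window tapers each frame toward zero at its edges,
    # which reduces spectral leakage in the later FFT
    return frames * np.hanning(frame_length)
```

Each row of the returned array is one frame signal x(n) ready for energy estimation and the Fourier transform.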
With continued reference to fig. 2, after step S12:
S13, estimating the sound energy of the sound signal and determining whether it is less than a third threshold; if so, the sound signal belongs to the background noise detection interval and step S2 is executed; otherwise, returning to step S12.
In a specific embodiment, estimating the sound energy of the sound signal comprises estimating the sound energy of each frame signal in the sound signal, wherein the specific estimation formula is as follows:
where $E_i$ denotes the sound energy of the i-th frame signal, x(n) denotes the frame signal corresponding to the i-th frame, and k denotes the total number of frames of the sound signal.
Hereinafter, it is assumed throughout that the sound collection device samples the sound signal at 16000 Hz and that each frame signal contains 512 samples, so that each frame lasts 32 milliseconds and a 1-second sound signal contains about 32 frames. The sound energy $E_i$ of the i-th frame signal is then estimated as:
In a specific embodiment, the setting criteria of the third threshold are as follows.
Assume that the initial average sound energy of the sound signal is $E_0$. The minimum sound energy among the frame signals in the first 5 seconds of the sound signal (160 frame signals) is taken as the value of $E_0$:
Assume that the third threshold is $E_{th}$. 4 times the average sound energy of the sound signal is taken as the third threshold, namely:
$E_{th} = 4E_0$
When the sound energy of the sound signal is judged to be larger than the third threshold, the sound signal is not in the background noise detection interval, and the method returns to step S121 to reset the initial detection state of the sound signal. When the sound energy of the frame is judged to be smaller than the third threshold, the sound signal is in the background noise detection interval, and the method proceeds to step S2.
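The energy gate of S13 can be sketched as follows. This is a hedged illustration: the function names are hypothetical, and the frame energy is assumed to be the sum of squared samples in the frame, which is a common definition but is not confirmed by the text because the estimation formula is not reproduced. The factor of 4 and the use of the minimum frame energy of the first 5 seconds come from the description.

```python
import numpy as np

def frame_energy(frames):
    """Energy of each frame, assumed here to be the sum of squared samples."""
    frames = np.asarray(frames, dtype=float)
    return np.sum(frames ** 2, axis=1)

def third_threshold(initial_energies, factor=4.0):
    """E_th = 4 * E_0, with E_0 the minimum frame energy of the first 5 seconds."""
    return factor * np.min(initial_energies)

def in_detection_interval(energy, threshold):
    """A frame belongs to the background noise detection interval when E_i < E_th."""
    return energy < threshold
```

With 512-sample frames at 16000 Hz, the first 5 seconds correspond to roughly 160 frame energies passed to `third_threshold`.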
With continued reference to fig. 1, after step S1:
S2, performing a fast Fourier transform on each frame signal of the sound signal, and estimating the spectral amplitude of the sound signal.
Each frame signal x(n) in the sound signal is subjected to a fast Fourier transform to obtain:
$X_i(k) = |\mathrm{FFT}\{x(n)\}|, \quad k = 1, 2, 3, \ldots, 512$
where $X_i(k)$ denotes the spectral amplitude of the i-th frame signal. The spectral amplitude of the whole sound signal is obtained from the spectral amplitudes of the individual frame signals.
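As an illustration of S2, the spectral amplitude of every frame can be computed with NumPy's FFT. The function name and the frames-by-samples array layout are assumptions; note that NumPy indexes the 512 frequency bins from 0 to 511, whereas the text counts k from 1 to 512.

```python
import numpy as np

def spectral_amplitude(frames):
    """Return |FFT| of each frame; input shape (num_frames, frame_length)."""
    frames = np.asarray(frames, dtype=float)
    # Transform along the sample axis and keep only the magnitude
    return np.abs(np.fft.fft(frames, axis=1))
```

For a real-valued frame the result is symmetric about the middle bin, so an implementation could keep only the first half of the bins without losing information.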
With continued reference to fig. 1, after step S2:
and S3, performing steady state statistical analysis on the frequency spectrum amplitude of the sound signal, and calculating to obtain the frequency spectrum amplitude change condition of the sound signal.
Fig. 4 is a flowchart illustrating a steady-state statistical analysis according to an embodiment of the present invention, and as shown in fig. 4, in an embodiment, the steady-state statistical analysis specifically includes:
S31, calculating the average spectral amplitude of the sound signal within 1 second, and recording continuously for 5 seconds.
The average spectral amplitude of the sound signal within 1 second (32 frames) is calculated as:
$$X_{a,j}(k) = \frac{1}{32}\sum_{i=1}^{32} X_i(k)$$
where $X_{a,j}(k)$ represents the average spectral amplitude of the sound signal in the j-th second, and the sum runs over the 32 frame signals of that second.
S32, calculating a first average standard deviation among the 5 average spectral amplitudes recorded over the 5 seconds.
The first average standard deviation is defined as $SD_X$; the specific calculation formula is as follows:
wherein
S33, judging whether the first average standard deviation is smaller than a first threshold; if so, the spectral amplitude of the sound signal is maintained in a stable state.
In a specific embodiment, the first threshold is set to:
It should be noted that γ is an empirical parameter. Because the background noise contains various noises such as rain noise and fan noise, the value of γ changes dynamically and is usually between 0.5 and 2. The more stable the background noise, the lower the value of γ; in this embodiment, γ is set to 1.
Therefore, when the first average standard deviation is judged to be smaller than the first threshold, the spectral amplitude of the sound signal is statistically maintained in a stable state.
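Steps S31 to S33 can be sketched as below. The formulas for the first average standard deviation and for the first threshold are not reproduced in this text, so the sketch makes two labeled assumptions: the standard deviation is taken per frequency bin across the per-second averages (population form) and then averaged over all bins, and the threshold is passed in as a plain number. All function names are hypothetical.

```python
import math


def mean_spectrum(frame_spectra):
    """Average spectral amplitude per bin over the frames of one second (S31)."""
    n = len(frame_spectra)
    return [sum(col) / n for col in zip(*frame_spectra)]


def average_std(per_second_values):
    """Standard deviation across seconds for each bin, averaged over bins (S32).

    Assumption: population standard deviation; the source formula is not shown.
    """
    n_sec = len(per_second_values)
    stds = []
    for col in zip(*per_second_values):
        mu = sum(col) / n_sec
        stds.append(math.sqrt(sum((v - mu) ** 2 for v in col) / n_sec))
    return sum(stds) / len(stds)


def amplitude_is_stable(seconds, first_threshold):
    """S33: `seconds` holds one list of frame spectra per second (5 in the text)."""
    averages = [mean_spectrum(sec) for sec in seconds]
    return average_std(averages) < first_threshold
```

With 32 frames per second and 5 recorded seconds, `seconds` would be a list of 5 entries, each holding 32 amplitude vectors.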
With continued reference to fig. 1, after step S3:
S4, performing dynamic statistical analysis on the spectral amplitude of the sound signal, and calculating the change in the spectral amplitude variance of the sound signal.
Fig. 5 is a flowchart illustrating a dynamic statistical analysis according to an embodiment of the present invention, and as shown in fig. 5, in an embodiment, the dynamic statistical analysis specifically includes:
S41, calculating the average spectral amplitude variance of the sound signal within 1 second, and recording continuously for 5 seconds, wherein the specific calculation formula is as follows:
where $V_{x,j}(k)$ represents the average spectral amplitude variance of the sound signal in the j-th second, $X_i(k)$ represents the spectral amplitude of the i-th frame signal, $X_{a,j}(k)$ represents the average spectral amplitude of the sound signal in the j-th second, and N represents the number of frames in 1 second of the sound signal.
Therefore, in this embodiment, the average spectral amplitude variance of the sound signal within 1 second is:
S42, calculating a second average standard deviation among the 5 average spectral amplitude variances recorded over the 5 seconds.
The second average standard deviation is defined as $SD_V$; the specific calculation formula is as follows:
wherein
S43, judging whether the second average standard deviation is smaller than a second threshold; if so, the spectral amplitude variance of the sound signal is maintained in a stable state.
In a specific embodiment, the second threshold is set to:
It should be noted that β is also an empirical parameter; its value is chosen in the same way as γ above, which is not repeated here.
Therefore, when the second average standard deviation is judged to be smaller than the second threshold, the spectral amplitude variance of the sound signal is statistically maintained in a stable state.
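Steps S41 to S43 can be sketched in the same spirit. Because the variance and threshold formulas are not reproduced in this text, the code assumes a population variance of the frame amplitudes around the per-second average (division by N), a per-bin standard deviation across seconds averaged over bins, and a threshold supplied by the caller. The names are hypothetical.

```python
import math


def spectrum_variance(frame_spectra):
    """Per-bin variance of the frame amplitudes within one second (S41).

    Assumption: population variance, i.e. division by the frame count N.
    """
    n = len(frame_spectra)
    result = []
    for col in zip(*frame_spectra):
        mean = sum(col) / n
        result.append(sum((x - mean) ** 2 for x in col) / n)
    return result


def variance_is_stable(seconds, second_threshold):
    """S42-S43: compare the average std of the per-second variances to a threshold."""
    variances = [spectrum_variance(sec) for sec in seconds]
    n_sec = len(variances)
    stds = []
    for col in zip(*variances):
        mu = sum(col) / n_sec
        stds.append(math.sqrt(sum((v - mu) ** 2 for v in col) / n_sec))
    return sum(stds) / len(stds) < second_threshold
```

The point of this second test is that a signal can keep a steady average spectrum while its frame-to-frame fluctuation changes; checking the variance as well rejects such signals.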
With continued reference to fig. 1, after step S4:
S5, judging whether the sound signal is background noise according to the change in the spectral amplitude of the sound signal and the change in the spectral amplitude variance of the sound signal.
Fig. 6 shows a flowchart of determining whether the sound signal is a background noise according to an embodiment of the present invention, and as shown in fig. 6, in the embodiment, the step S5 specifically includes:
S51, judging whether the spectral amplitude of the sound signal is maintained in a stable state; if so, executing step S52, otherwise returning to step S121.
S52, judging whether the spectral amplitude variance of the sound signal is maintained in a stable state; if so, the sound signal is background noise and step S53 is executed, otherwise returning to step S121.
S53, recording and updating the parameters of the sound signal into the background noise data, and returning to step S121.
Through the double judgment of steady-state statistical analysis and dynamic statistical analysis, the sound signal is judged to be background noise only when both the spectral amplitude and the spectral amplitude variance of the sound signal are maintained in a stable state; the parameters of the sound signal are then recorded and used as the basis for eliminating the background noise.
Thus, the detection of background noise in one round is completed, and the next round of detection is entered again.
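The decision logic of steps S51 to S53 amounts to a two-stage check followed by an update. The sketch below is a simplified illustration: `noise_data` is a hypothetical dictionary standing in for the background noise data, and the choice of storing the spectrum plus an update counter is an assumption, since the text does not list which parameters are recorded.

```python
def detect_background_noise(amplitude_stable, variance_stable, spectrum, noise_data):
    """Return True and update noise_data only if both stability tests passed.

    In either case the caller is expected to return to step S121 afterwards.
    """
    if not amplitude_stable:   # S51 failed
        return False
    if not variance_stable:    # S52 failed
        return False
    # S53: record and update the parameters of the sound signal
    noise_data["spectrum"] = list(spectrum)
    noise_data["updates"] = noise_data.get("updates", 0) + 1
    return True
```

For example, calling it with both flags true stores the spectrum and returns `True`; if either flag is false, `noise_data` is left untouched.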
With continued reference to FIG. 2, in a preferred embodiment, after step S43 and before step S51, there is further included:
S4a, storing the statistical data of the steady-state statistical analysis and the dynamic statistical analysis.
S5a, judging whether the storage time of the statistical data exceeds a preset time; if so, executing step S5, and if not, returning to step S122.
Only when the accumulated statistical data covers the preset time does the process go on to judge whether the sound signal is background noise; otherwise, steady-state statistical analysis and dynamic statistical analysis continue to be performed on the sound signal until the duration of the statistical data reaches the preset time. In this embodiment, the preset time is set to 5 seconds.
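One way to picture steps S4a and S5a is a small buffer that holds one statistics record per second and reports when enough seconds have accumulated. This is a hypothetical sketch, not the patent's storage scheme: the class name, the one-record-per-second granularity, and the use of "at least 5 records" as the readiness test are all assumptions.

```python
from collections import deque


class StatisticsBuffer:
    """Hypothetical store for per-second statistics (step S4a)."""

    def __init__(self, required_seconds=5):
        self.required_seconds = required_seconds
        # Keep only the most recent `required_seconds` records.
        self._records = deque(maxlen=required_seconds)

    def store(self, per_second_record):
        """Append the statistics computed for one more second."""
        self._records.append(per_second_record)

    def ready(self):
        """S5a: True once the preset duration of statistics is available."""
        return len(self._records) >= self.required_seconds

    def records(self):
        """Return the stored records, oldest first."""
        return list(self._records)
```

While `ready()` is false the process would return to step S122 and keep analyzing; once it is true, step S5 can run on `records()`.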
According to a second aspect of the present application, an adaptive background noise detection system is provided, which is built based on the detection method described above. Fig. 7 shows a system block diagram of an adaptive background noise detection system according to an embodiment of the present invention, as shown in fig. 7, the system comprising:
a sound collection device 1 configured to collect sound signals and preprocess the sound signals;
a processor arithmetic unit 2 configured to: set an initial detection state of the sound signal and preprocess the sound signal; estimate the sound energy of the sound signal and judge whether it is smaller than a third threshold; if not, reset the initial detection state of the sound signal and preprocess the sound signal again; if so, estimate the spectral amplitude of the sound signal and perform steady-state statistical analysis and dynamic statistical analysis on the sound signal; and judge, according to the results of the steady-state statistical analysis and the dynamic statistical analysis, whether the sound signal is background noise. If it is, the parameters of the sound signal are recorded and updated into the background noise data, after which the initial detection state of the sound signal is reset and the sound signal is preprocessed; otherwise, the initial detection state is reset and the sound signal is preprocessed directly;
a memory unit 3 configured to store programs or tables required by the processor arithmetic unit 2 and to temporarily store data during operation;
a statistical record storage unit 4 configured to store the background noise data.
Fig. 8 is a structural diagram of a sound collection device according to an embodiment of the present invention, and as shown in fig. 8, the sound collection device 1 specifically includes:
a microphone sound-receiving device 11 configured to collect a sound signal and convert the sound signal into a voltage signal;
an amplifier 12 configured to amplify the voltage signal, wherein the amplifier may be preset with several sensitivities according to usage requirements, so that the voltage signal can be quickly adjusted to an appropriate magnitude;
a filter 13 configured to filter the amplified voltage signal;
an analog-to-digital converter 14 configured to convert the filtered voltage signal into a digital signal.
According to a third aspect of the present application, a computer-readable storage medium is proposed, which stores a computer program that, when executed by a processor, implements the adaptive background noise detection method described above.
The invention provides an adaptive background noise detection method, system and medium. A sound signal is collected and preprocessed by a sound collection device; an initial detection state is set and the sound signal is preprocessed; the sound energy of the processed sound signal is then estimated and compared with a third threshold. If the sound energy is smaller than the third threshold, the sound signal is in the background noise detection interval, and steady-state statistical analysis and dynamic statistical analysis are carried out on it. If the analysis results show that both the spectral amplitude and the spectral amplitude variance of the sound signal are maintained in a stable state, the sound signal is judged to be background noise, and its parameters are recorded and updated into the background noise data as a basis for eliminating the background noise. The method can thus adjust in a timely manner to changes in the detected environmental noise, judge the background noise effectively, and achieve high detection accuracy.
In the embodiments of the present application, it should be understood that the disclosed technical contents may be implemented in other ways. The above-described embodiments of the apparatus/system/method are merely illustrative, and for example, the division of the units may be a logical division, and in actual implementation, there may be another division, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and other various media capable of storing program codes.
It will be apparent to those skilled in the art that various modifications and variations can be made to the embodiments of the present invention without departing from the spirit and scope of the invention. In this way, if these modifications and changes are within the scope of the claims of the present invention and their equivalents, the present invention is also intended to cover these modifications and changes. The word "comprising" does not exclude the presence of other elements or steps than those listed in a claim. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage. Any reference signs in the claims shall not be construed as limiting the scope.
Claims (14)
1. An adaptive background noise detection method, comprising the steps of:
S1, acquiring a sound signal belonging to the background noise detection interval;
S2, performing a fast Fourier transform on each frame signal of the sound signal, and estimating the spectral amplitude of the sound signal;
S3, performing steady-state statistical analysis on the spectral amplitude of the sound signal, and calculating the change in the spectral amplitude of the sound signal;
S4, performing dynamic statistical analysis on the spectral amplitude of the sound signal, and calculating the change in the spectral amplitude variance of the sound signal; and
S5, judging whether the sound signal is background noise according to the change in the spectral amplitude of the sound signal and the change in the spectral amplitude variance of the sound signal.
2. The method according to claim 1, wherein the step S3 specifically includes:
S31, calculating the average spectral amplitude of the sound signal within 1 second, and recording continuously for 5 seconds;
S32, calculating a first average standard deviation among the 5 average spectral amplitudes recorded over the 5 seconds;
S33, judging whether the first average standard deviation is smaller than a first threshold; if so, the spectral amplitude of the sound signal is maintained in a stable state.
3. The method according to claim 2, wherein the step S4 specifically includes:
S41, calculating the average spectral amplitude variance of the sound signal within 1 second, and recording continuously for 5 seconds, wherein the specific calculation formula is as follows:
where $V_{x,j}(k)$ represents the average spectral amplitude variance of the sound signal in the j-th second, $X_i(k)$ represents the spectral amplitude of the i-th frame signal, $X_{a,j}(k)$ represents the average spectral amplitude of the sound signal in the j-th second, and N represents the number of frames in 1 second of the sound signal;
S42, calculating a second average standard deviation among the 5 average spectral amplitude variances recorded over the 5 seconds;
S43, judging whether the second average standard deviation is smaller than a second threshold; if so, the spectral amplitude variance of the sound signal is maintained in a stable state.
4. The method according to claim 3, wherein the step S1 specifically includes:
S11, collecting a sound signal, and preprocessing the sound signal;
S12, setting the initial detection state of the sound signal, and preprocessing the sound signal;
S13, estimating the sound energy of the sound signal, and judging whether the sound energy of the sound signal is smaller than a third threshold; if so, executing step S2, and if not, returning to step S12.
5. The method according to claim 4, wherein the step S11 specifically includes:
S111, collecting a sound signal, and converting the sound signal into a voltage signal;
S112, amplifying the voltage signal;
S113, filtering the amplified voltage signal so as to adjust the spectral response of the voltage signal;
S114, converting the voltage signal into a digital signal.
6. The method according to claim 4, wherein the step S12 of setting the initial detection state of the sound signal specifically comprises:
S121, setting a sampling time, computing the average value of the initial parameters of the sound signal within the sampling time, taking the average value as the initial parameters of the sound signal, and dynamically updating the initial parameters of the sound signal.
7. The method according to claim 6, wherein the preprocessing the sound signal in the step S12 specifically includes:
S122, extracting each frame signal in the sound signal, and performing spectrum equalization processing on the sound signal.
8. The method according to claim 7, wherein the estimating of the sound energy of the sound signal in step S13 specifically includes: estimating the sound energy of each frame of the sound signals, wherein the specific estimation formula is as follows:
where $E_i$ represents the sound energy of the i-th frame signal, x(n) represents the frame signal corresponding to the i-th frame, and k represents the total number of frames of the sound signal.
9. The method according to claim 8, wherein the setting criterion of the third threshold in the step S13 is: taking 4 times the average sound energy of the sound signal as the third threshold, wherein the minimum sound energy among the frame signals of the sound signal in the first 5 seconds is taken as the average sound energy of the sound signal.
10. The method according to claim 9, wherein the step S5 specifically includes:
S51, judging whether the spectral amplitude of the sound signal is maintained in a stable state; if so, executing step S52, otherwise returning to step S121;
S52, judging whether the spectral amplitude variance of the sound signal is maintained in a stable state; if so, executing step S53, otherwise returning to step S121;
S53, recording and updating the parameters of the sound signal into the background noise data, and returning to step S121.
11. The method of claim 7, further comprising, after the step S4 and before the step S5:
S4a, storing the statistical data of the steady-state statistical analysis and the dynamic statistical analysis;
S5a, judging whether the storage time of the statistical data exceeds a preset time; if so, executing step S5, and if not, returning to step S122.
12. An adaptive background noise detection system, comprising:
a sound collection device configured to collect sound signals and preprocess the sound signals;
a processor operation unit configured to set an initial detection state of the sound signal, pre-process the sound signal, estimate sound energy of the sound signal, determine whether the sound energy of the sound signal is less than a third threshold, if not, reset the initial detection state of the sound signal, pre-process the sound signal, if so, estimate spectral amplitude of the sound signal, perform steady-state statistical analysis and dynamic statistical analysis on the sound signal, determine whether the sound signal is background noise according to analysis results of the steady-state statistical analysis and the dynamic statistical analysis, if so, record and update parameters of the sound signal into background noise data, reset the initial detection state of the sound signal, and pre-process the sound signal, if not, directly resetting the initial detection state of the sound signal, and preprocessing the sound signal;
a memory unit configured to store programs or tables required by the processor operation unit and to temporarily store data during operation;
a statistical record storage unit configured to store the background noise data.
13. The system according to claim 12, wherein the sound collection device comprises:
a microphone sound-receiving device configured to collect the sound signal and convert the sound signal into a voltage signal;
an amplifier configured to amplify the voltage signal;
a filter configured to filter the amplified voltage signal;
an analog-to-digital converter configured to convert the filtered voltage signal into a digital signal.
14. A computer-readable storage medium, storing a computer program which, when executed by a processor, implements the method of any one of claims 1-3.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111512446.9A CN114220446A (en) | 2021-12-08 | 2021-12-08 | Adaptive background noise detection method, system and medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114220446A true CN114220446A (en) | 2022-03-22 |
Family
ID=80701064
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111512446.9A Pending CN114220446A (en) | 2021-12-08 | 2021-12-08 | Adaptive background noise detection method, system and medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114220446A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118262475A (en) * | 2024-05-30 | 2024-06-28 | 河北雄安亿晶云科技有限公司 | AI intelligent sound wave auxiliary campus anti-cheating system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||