CN109427345A - Wind noise detection method, apparatus and system - Google Patents
- Publication number
- CN109427345A (CN201710754716.4A)
- Authority
- CN
- China
- Prior art keywords
- frequency domain
- wind noise
- domain data
- audio data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L21/0232—Processing in the frequency domain
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0316—Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/18—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/21—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being power information
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Signal Processing (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Computational Linguistics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Quality & Reliability (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Circuit For Audible Band Transducer (AREA)
- Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)
Abstract
The embodiments of the present invention provide a wind noise detection method, apparatus and system. The method includes: converting each frame of audio data into frequency domain data, and determining, according to the spectral centroid of the frequency domain data, whether wind noise is present in the audio data. First, wind noise detection can be performed using a single channel of audio data, without acquiring and comparing two channels of audio data simultaneously, so operation is simple. Second, compared with schemes that set up two acquisition devices to capture two channels of audio data, this scheme needs only one acquisition device capturing one channel of audio data, which reduces equipment cost.
Description
Technical field
The present invention relates to the technical field of audio processing, and in particular to a wind noise detection method, apparatus and system.
Background
During audio capture, wind noise is often a key factor affecting audio quality. To improve audio quality, wind noise detection usually needs to be performed on the captured audio data.
Wind noise detection usually requires acquiring two channels of audio data simultaneously and comparing them, for example comparing attribute values such as the peak relationship and the signal-to-noise ratio of the two channels, and determining from the comparison result whether wind noise is present in the audio data.
The above scheme requires two channels of audio data to be acquired simultaneously. In scenarios where this is not possible, such as a single-microphone scenario, acquiring two channels simultaneously requires adding another acquisition device, which is inconvenient.
Summary of the invention
The purpose of the embodiments of the present invention is to provide a wind noise detection method, apparatus and system, so that wind noise detection can be performed using a single channel of audio data.
To achieve the above purpose, an embodiment of the present invention provides a wind noise detection method, comprising:
for each frame of audio data, converting the frame of audio data into frequency domain data;
calculating the power spectral density of frequency points in the frequency domain data;
calculating the spectral centroid of the frequency domain data from the power spectral density;
judging whether the spectral centroid is less than a first preset threshold;
if so, determining that wind noise is present in the frame of audio data.
Optionally, after determining that wind noise is present in the frame of audio data, the method may further include:
judging whether the spectral centroid is less than a second preset threshold, the second preset threshold being less than the first preset threshold;
if so, setting the frame of audio data to zero.
Optionally, after determining that wind noise is present in the frame of audio data, the method may further include:
judging whether the spectral centroid is greater than or equal to a third preset threshold, the third preset threshold being less than the first preset threshold;
if so, filtering the frequency domain data with a preset filter to obtain filtered frequency domain data;
converting the filtered frequency domain data into time domain data.
Optionally, when the spectral centroid is judged to be not less than the first preset threshold, the method may further include:
outputting the frame of audio data.
Optionally, calculating the power spectral density of frequency points in the frequency domain data may include:
sampling the frequency domain data to obtain multiple frequency points, and calculating the power spectral density of the multiple frequency points;
and calculating the spectral centroid of the frequency domain data from the power spectral density may include:
calculating the spectral centroid of the frequency domain data according to the sample rate and the power spectral density.
Optionally, calculating the spectral centroid of the frequency domain data from the power spectral density includes:
calculating the spectral centroid of the frequency domain data using the following formula:
$$SC = \frac{f_s}{L} \cdot \frac{\sum_{k} k\,\phi_{xx}(k)}{\sum_{k} \phi_{xx}(k)}$$
where SC denotes the spectral centroid, $f_s$ denotes the sample rate, L denotes the total number of frequency points, $\phi_{xx}(k)$ denotes the power spectral density of the k-th frequency point, $\phi_{xx}(k) = |X(k)|^2$, X(k) denotes the frequency value of the k-th frequency point, and k is a positive integer not exceeding L.
Optionally, when the spectral centroid is judged to be less than the third preset threshold, the method may further include:
recording the power spectral density corresponding to the spectral centroid as the wind noise power spectral density;
and filtering the frequency domain data with the preset filter to obtain filtered frequency domain data includes:
determining the filtered frequency domain data using the following formula:
$$Y(k) = H(k)\,X(k), \qquad H(k) = \frac{\phi_{xx}(k) - \phi_{nn}(k)}{\phi_{xx}(k)}$$
where Y(k) denotes the filtered frequency domain data, H(k) denotes the preset filter, X(k) denotes the frequency value of the k-th frequency point, $\phi_{xx}(k)$ denotes the power spectral density of the k-th frequency point, and $\phi_{nn}(k)$ denotes the most recently recorded wind noise power spectral density.
To achieve the above purpose, an embodiment of the present invention further provides a wind noise detection apparatus, comprising:
a first conversion module, configured to convert, for each frame of audio data, the frame of audio data into frequency domain data;
a first calculation module, configured to calculate the power spectral density of frequency points in the frequency domain data;
a second calculation module, configured to calculate the spectral centroid of the frequency domain data from the power spectral density;
a first judgment module, configured to judge whether the spectral centroid is less than a first preset threshold;
a determination module, configured to determine, when the judgment result of the first judgment module is yes, that wind noise is present in the frame of audio data.
Optionally, the apparatus may further include:
a second judgment module, configured to judge, when the judgment result of the first judgment module is yes, whether the spectral centroid is less than a second preset threshold, the second preset threshold being less than the first preset threshold;
a zeroing module, configured to set the frame of audio data to zero when the judgment result of the second judgment module is yes.
Optionally, the apparatus may further include:
a third judgment module, configured to judge, when the judgment result of the first judgment module is yes, whether the spectral centroid is greater than or equal to a third preset threshold, the third preset threshold being less than the first preset threshold;
a filter module, configured to filter the frequency domain data with a preset filter when the judgment result of the third judgment module is yes, to obtain filtered frequency domain data;
a second conversion module, configured to convert the filtered frequency domain data into time domain data.
Optionally, the apparatus may further include:
an output module, configured to output the frame of audio data when the judgment result of the first judgment module is no.
Optionally, the first calculation module may be specifically configured to:
sample the frequency domain data to obtain multiple frequency points, and calculate the power spectral density of the multiple frequency points;
and the second calculation module may be specifically configured to:
calculate the spectral centroid of the frequency domain data according to the sample rate and the power spectral density.
Optionally, the second calculation module may be specifically configured to:
calculate the spectral centroid of the frequency domain data using the following formula:
$$SC = \frac{f_s}{L} \cdot \frac{\sum_{k} k\,\phi_{xx}(k)}{\sum_{k} \phi_{xx}(k)}$$
where SC denotes the spectral centroid, $f_s$ denotes the sample rate, L denotes the total number of frequency points, $\phi_{xx}(k)$ denotes the power spectral density of the k-th frequency point, $\phi_{xx}(k) = |X(k)|^2$, X(k) denotes the frequency value of the k-th frequency point, and k is a positive integer not exceeding L.
Optionally, the apparatus may further include:
a recording module, configured to record, when the judgment result of the third judgment module is no, the power spectral density corresponding to the spectral centroid as the wind noise power spectral density;
and the filter module may be specifically configured to:
determine the filtered frequency domain data using the following formula:
$$Y(k) = H(k)\,X(k), \qquad H(k) = \frac{\phi_{xx}(k) - \phi_{nn}(k)}{\phi_{xx}(k)}$$
where Y(k) denotes the filtered frequency domain data, H(k) denotes the preset filter, X(k) denotes the frequency value of the k-th frequency point, $\phi_{xx}(k)$ denotes the power spectral density of the k-th frequency point, and $\phi_{nn}(k)$ denotes the wind noise power spectral density most recently recorded by the recording module.
To achieve the above purpose, an embodiment of the present invention further provides an electronic device, including a processor and a memory, wherein
the memory is configured to store a computer program;
the processor is configured to implement any of the above wind noise detection methods when executing the program stored in the memory.
To achieve the above purpose, an embodiment of the present invention further provides a wind noise detection system, comprising an audio acquisition device and a wind noise detection device, wherein
the audio acquisition device is configured to acquire audio data and send the acquired audio data to the wind noise detection device;
the wind noise detection device is configured to receive the audio data sent by the audio acquisition device and, for each frame of audio data therein, convert the frame of audio data into frequency domain data; calculate the power spectral density of frequency points in the frequency domain data; calculate the spectral centroid of the frequency domain data from the power spectral density; judge whether the spectral centroid is less than a first preset threshold; and if so, determine that wind noise is present in the frame of audio data.
With the embodiments of the present invention, each frame of audio data is converted into frequency domain data, and whether wind noise is present in the audio data is determined according to the spectral centroid of the frequency domain data. In this scheme, wind noise detection can therefore be performed using a single channel of audio data, without acquiring and comparing two channels of audio data simultaneously, so operation is simple.
Of course, implementing any product or method of the present invention does not necessarily require achieving all of the above advantages at the same time.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a first flow diagram of the wind noise detection method provided by an embodiment of the present invention;
Fig. 2 is a second flow diagram of the wind noise detection method provided by an embodiment of the present invention;
Fig. 3 is a structural diagram of a wind noise detection apparatus provided by an embodiment of the present invention;
Fig. 4 is a structural diagram of an electronic device provided by an embodiment of the present invention;
Fig. 5 is a structural diagram of a wind noise detection system provided by an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
To solve the above technical problem, the embodiments of the present invention provide a wind noise detection method, apparatus and device. The method and apparatus can be applied to any device with an audio processing function, which is not specifically limited.
The wind noise detection method provided by an embodiment of the present invention is described in detail first.
Fig. 1 is a first flow diagram of the wind noise detection method provided by an embodiment of the present invention, including:
S101: for each frame of audio data, convert the frame of audio data into frequency domain data.
S102: calculate the power spectral density of frequency points in the frequency domain data.
S103: calculate the spectral centroid of the frequency domain data from the power spectral density.
S104: judge whether the spectral centroid is less than the first preset threshold; if so, execute S105.
S105: determine that wind noise is present in the frame of audio data.
With the embodiment shown in Fig. 1, each frame of audio data is converted into frequency domain data, and whether wind noise is present in the audio data is determined according to the spectral centroid of the frequency domain data. First, wind noise detection can be performed using a single channel of audio data, without acquiring and comparing two channels of audio data simultaneously, so operation is simple. Second, compared with schemes that set up two acquisition devices to capture two channels of audio data, this scheme needs only one acquisition device capturing one channel of audio data, which reduces equipment cost.
The embodiment shown in Fig. 1 is described in detail below.
S101: for each frame of audio data, convert the frame of audio data into frequency domain data.
The device that executes this scheme (the executing entity, hereinafter "this device") may be an audio acquisition device, or another electronic device communicatively connected to an audio acquisition device. If this device is an audio acquisition device, it can perform wind noise detection with this scheme on each frame of audio data it captures. If this device is another electronic device communicatively connected to an audio acquisition device, the electronic device can obtain the captured audio data from the audio acquisition device and perform wind noise detection with this scheme on each frame of audio data.
In an optional embodiment of the present invention, the captured audio data can be divided into frames, for example with every 20 ms of audio data as one frame. The specific frame length can be set according to the actual situation.
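The framing step can be sketched as follows. This is a minimal illustration in plain Python rather than the implementation described by the patent; the function name `split_into_frames`, the use of non-overlapping frames, and the choice to drop a trailing partial frame are all assumptions made for the example.

```python
def split_into_frames(samples, sample_rate, frame_ms=20):
    """Split samples into consecutive, non-overlapping frames of frame_ms milliseconds."""
    frame_len = sample_rate * frame_ms // 1000  # samples per frame
    if frame_len <= 0:
        raise ValueError("frame length must be at least one sample")
    n_frames = len(samples) // frame_len
    # Assumption: a trailing partial frame is dropped rather than padded.
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]


# 1000 samples at 16 kHz give three full 20 ms frames of 320 samples each.
frames = split_into_frames(list(range(1000)), sample_rate=16000)
```

A different frame length only changes `frame_ms`; the rest of the processing operates on one frame at a time.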
A time-domain to frequency-domain transform algorithm can be used to convert a frame of audio data into frequency domain data, for example the Fourier transform or the fast Fourier transform (FFT); this is not specifically limited.
Assume a frame of audio data is x(t) and is converted into frequency domain data X by the FFT algorithm. The frequency domain data X contains the frequency value of each frequency point; for example, the frequency value of the k-th frequency point is X(k).
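As an illustration of this conversion, the sketch below computes X(k) with a naive discrete Fourier transform in plain Python. It stands in for the FFT only to keep the example dependency-free: it produces the same values but runs in O(n^2) time. The name `dft` is hypothetical, and the code indexes frequency points from 0, so list element `k` corresponds to the frequency k·fs/n.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform of a frame (same result as an FFT, O(n^2))."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


# A cosine with exactly 2 cycles in 8 samples concentrates its energy in bins 2 and 6.
frame = [math.cos(2 * math.pi * 2 * t / 8) for t in range(8)]
spectrum = dft(frame)
```

Bins 2 and 6 mirror each other because the input is real, which is the conjugate symmetry that lets later steps work on half of the frequency points.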
S102: calculate the power spectral density of frequency points in the frequency domain data.
As one implementation, the frequency domain data can be sampled to obtain multiple frequency points, and the power spectral density of those frequency points calculated. Assume the frequency domain data X is sampled to obtain L frequency points, and the power spectral density of each of the L frequency points is calculated, where the power spectral density of the k-th frequency point is $\phi_{xx}(k) = |X(k)|^2$ and k is a positive integer not exceeding L.
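A minimal sketch of this step is shown below. It assumes the common periodogram-style estimate in which the power spectral density of a frequency point is the squared magnitude of its frequency value, without any normalization; both that choice and the function name `power_spectral_density` are assumptions for the example.

```python
def power_spectral_density(spectrum):
    """Per-point power estimate: squared magnitude of each frequency value."""
    return [abs(x) ** 2 for x in spectrum]


psd = power_spectral_density([3 + 4j, 0, 1j])
```

Any constant scale factor applied to every frequency point would cancel in the spectral centroid, which is a ratio of two sums over the same values.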
S103: calculate the spectral centroid of the frequency domain data from the power spectral density.
It can be understood that the spectral centroid (SC) indicates the balance point of the spectral energy distribution of a frame of audio data and reflects the center of the frequency distribution of the audio data.
In general, the spectral centroid can be calculated with formula A:
$$SC = \frac{\int_{\Omega} \omega\,|F(\omega)|^2\,d\omega}{E} \qquad \text{(formula A)}$$
where $\Omega$ denotes the frequency range of the frequency domain data, F(ω) denotes the frequency value, and E denotes the energy of the frequency domain data.
Formula A calculates the spectral centroid of a continuous segment of frequency domain data. Alternatively, formula B can be used to calculate the spectral centroid of the frequency domain data:
$$SC = \frac{f_s}{L} \cdot \frac{\sum_{k} k\,\phi_{xx}(k)}{\sum_{k} \phi_{xx}(k)} \qquad \text{(formula B)}$$
where $f_s$ denotes the sample rate, L denotes the total number of frequency points, $\phi_{xx}(k)$ denotes the power spectral density of the k-th frequency point, $\phi_{xx}(k) = |X(k)|^2$, X(k) denotes the frequency value of the k-th frequency point, and k is a positive integer not exceeding L.
Sampling a continuous segment of frequency domain data at sample rate $f_s$ and substituting $\omega = f_s \cdot k / L$ into formula A yields formula B. It can be understood that the frequency domain data has conjugate symmetry, so only half of the frequency points need to be considered; that is, k can range over [1, L/2].
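The discrete calculation can be sketched as follows, weighting each frequency fs·k/L by the power at that point and summing over k in [1, L/2]. The function name `spectral_centroid` and the decision to return 0.0 for a frame with no energy are assumptions for the example; the input list is indexed from 0, so element `k` is the k-th frequency point and element 0 (DC) is skipped.

```python
def spectral_centroid(psd, sample_rate):
    """Power-weighted mean frequency over the half spectrum k = 1 .. L/2, in Hz."""
    n_points = len(psd)
    bins = range(1, n_points // 2 + 1)
    total = sum(psd[k] for k in bins)
    if total == 0:
        # Assumption: a frame with no energy is given a centroid of 0 Hz.
        return 0.0
    return sample_rate / n_points * sum(k * psd[k] for k in bins) / total


# All power at point 2 of 8 with fs = 8000 Hz puts the centroid at 2 * 8000 / 8 Hz.
centroid_hz = spectral_centroid([0, 0, 1, 0, 0, 0, 1, 0], sample_rate=8000)
```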
S104: judge whether the spectral centroid is less than the first preset threshold; if so, execute S105.
S105: determine that wind noise is present in the frame of audio data.
Since the spectral centroid of wind noise is small and the spectral centroid of speech is large:
In the first case, if the spectral centroid calculated in S103 is very small, close to that of pure wind noise, strong wind noise can be considered present in the audio data.
In the second case, if the spectral centroid calculated in S103 is very large, close to that of pure speech, the wind noise present in the audio data can be considered negligible or absent.
In the third case, if the spectral centroid calculated in S103 lies between the first two cases, the audio data can be considered a mixture of speech and wind noise.
On this basis, a threshold (the first preset threshold) can be set. If the spectral centroid calculated in S103 is less than the first preset threshold, that is, the spectral centroid is small, the frame is considered to belong to the first or third case above, and wind noise is determined to be present in the frame of audio data.
As one implementation, if the spectral centroid calculated in S103 is greater than or equal to the first preset threshold, the frame of audio data from S101 can be output directly. In that situation the frame is considered to belong to the second case above: its quality is good, so it can be output as is.
With the above scheme, wind noise detection is complete. As one implementation, wind noise suppression can further be performed on the basis of the completed detection.
As one implementation, after S105, it can further be judged whether the spectral centroid is less than a second preset threshold; if so, the frame of audio data is set to zero.
According to the description of the three cases above, if the spectral centroid calculated in S103 is small, the frame may belong to the first or third case. A threshold smaller than the first preset threshold (the second preset threshold) can therefore be set. If the spectral centroid calculated in S103 is less than the second preset threshold, the frame is considered to belong to the first case, in which strong wind noise is present in the audio data. In that situation the frame of audio data can be set to zero. It can be understood that outputting the zeroed frame outputs silence.
In this implementation, if the wind noise in a frame of audio data is excessive, silence is output, which spares the user from hearing meaningless wind noise and improves the experience.
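The zeroing rule can be sketched as below. The function name `mute_if_strong_wind` and the threshold value used in the usage lines are hypothetical; the patent does not give concrete threshold values.

```python
def mute_if_strong_wind(frame, centroid_hz, second_threshold_hz):
    """Return a silent frame when the centroid is below the second preset threshold."""
    if centroid_hz < second_threshold_hz:
        return [0.0] * len(frame)  # strong wind noise: output silence
    return list(frame)  # otherwise leave the samples unchanged


# Hypothetical threshold of 300 Hz, chosen only for illustration.
muted = mute_if_strong_wind([0.2, -0.1, 0.4], centroid_hz=120.0, second_threshold_hz=300.0)
kept = mute_if_strong_wind([0.2, -0.1, 0.4], centroid_hz=450.0, second_threshold_hz=300.0)
```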
As one implementation, after S105, it can further be judged whether the spectral centroid is greater than or equal to a third preset threshold, the third preset threshold being less than the first preset threshold;
if so, the frequency domain data is filtered with a preset filter to obtain filtered frequency domain data;
the filtered frequency domain data is converted into time domain data.
It can be understood that this implementation is executed when the spectral centroid calculated in S103 is less than the first preset threshold and greater than or equal to the third preset threshold; therefore, the third preset threshold is less than the first preset threshold.
The third preset threshold may be equal to the second preset threshold or greater than it. For example, in the Fig. 2 embodiment described later, the third preset threshold equals the second preset threshold, and every case of being greater than or equal to the second preset threshold is handled as being greater than or equal to the third preset threshold.
This implementation can be regarded as addressing the third case above, in which the audio data is a mixture of wind noise and speech. In that situation, wind noise suppression is achieved by filtering the audio data.
In one optional filtering mode, if a frame of audio data belongs to the first case above, that is, if the spectral centroid of the frame is less than the third preset threshold, the power spectral density of the frame is recorded as the wind noise power spectral density. The recorded wind noise power spectral density can then be used to filter audio data.
Several frequency points are chosen from the continuous frequency domain data, and the index of each chosen frequency point is denoted by k. The filtered frequency domain data can then be determined using the following formula:
$$Y(k) = H(k)\,X(k), \qquad H(k) = \frac{\phi_{xx}(k) - \phi_{nn}(k)}{\phi_{xx}(k)}$$
where Y(k) denotes the filtered frequency domain data, H(k) denotes the preset filter, X(k) denotes the frequency value of the k-th frequency point, $\phi_{xx}(k)$ denotes the power spectral density of the k-th frequency point, $\phi_{nn}(k)$ denotes the most recently recorded wind noise power spectral density, and k is a positive integer.
In this filtering mode, only the newest wind noise power spectral density may be kept. That is, whenever the spectral centroid of a frame of audio data is judged to be less than the third preset threshold, the power spectral density of that frame is recorded as the wind noise power spectral density and the previously recorded one is deleted. When a frame of audio data is later filtered with the above formula, $\phi_{nn}(k)$ is simply the recorded wind noise power spectral density. Alternatively, previously recorded wind noise power spectral densities need not be deleted. In that case, when a frame of audio data is filtered with the above formula, the most recently recorded wind noise power spectral density must be identified among the recorded ones and used as $\phi_{nn}(k)$ in the formula.
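The "keep only the newest record" variant can be sketched with a small helper class. The class name `WindNoiseTracker` and its interface are hypothetical, and the threshold value in the usage lines is chosen only for illustration.

```python
class WindNoiseTracker:
    """Keeps the most recently recorded wind-noise power spectral density."""

    def __init__(self, third_threshold_hz):
        self.third_threshold_hz = third_threshold_hz
        self.latest_psd = None  # nothing recorded yet

    def update(self, centroid_hz, psd):
        """Record psd when the frame looks like strong wind noise; return the latest record."""
        if centroid_hz < self.third_threshold_hz:
            self.latest_psd = list(psd)  # overwrite, discarding the earlier record
        return self.latest_psd


tracker = WindNoiseTracker(third_threshold_hz=200.0)
first = tracker.update(500.0, [1.0])   # not wind-dominated: nothing recorded
second = tracker.update(100.0, [2.0])  # wind-dominated: recorded
third = tracker.update(500.0, [3.0])   # record unchanged
```

Keeping a history instead would only require appending to a list and reading its last element when filtering.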
The preset filter is set as follows.
Assume the frequency domain data of a frame of audio data is X = s + n, where s denotes the speech part of the frequency domain data and n denotes the wind noise part.
Correspondingly, the power spectral density of the frequency domain data X is
$$\phi_{xx}(k) = \phi_{ss}(k) + \phi_{nn}(k)$$
where $\phi_{ss}(k)$ denotes the power spectral density of the speech part and $\phi_{nn}(k)$ denotes the most recently recorded wind noise power spectral density.
The filter is set as
$$H(k) = \frac{\phi_{ss}(k)}{\phi_{xx}(k)} = \frac{\phi_{xx}(k) - \phi_{nn}(k)}{\phi_{xx}(k)}$$
The range of H(k) is [0, 1]. The larger the wind noise, the smaller H(k) and therefore Y(k), meaning the audio data is suppressed more strongly; the smaller the wind noise, the larger H(k) and Y(k), meaning the audio data is suppressed less.
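A filter with the stated behaviour can be sketched as a Wiener-style gain: the estimated speech share of the power at each frequency point, clipped to [0, 1]. The exact gain expression, the clipping, and the handling of points with zero power are assumptions consistent with the description rather than details confirmed by the source; the names `wiener_gain` and `apply_filter` are hypothetical.

```python
def wiener_gain(psd_frame, psd_wind):
    """Per-point gain H(k) = (p_xx - p_nn) / p_xx, clipped to the range [0, 1]."""
    gains = []
    for p_xx, p_nn in zip(psd_frame, psd_wind):
        if p_xx <= 0.0:
            gains.append(0.0)  # assumption: no power at this point, so nothing to keep
        else:
            gains.append(min(1.0, max(0.0, (p_xx - p_nn) / p_xx)))
    return gains


def apply_filter(spectrum, gains):
    """Y(k) = H(k) * X(k) for every frequency point."""
    return [h * x for h, x in zip(gains, spectrum)]


gains = wiener_gain([4.0, 2.0, 0.0], [1.0, 3.0, 1.0])
filtered = apply_filter([2 + 2j, 1 + 0j, 0j], gains)
```

The second point shows why clipping matters: when the recorded wind noise power exceeds the current power, the raw ratio would be negative.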
Y(k) is then passed through an IFFT (Inverse Fast Fourier Transform), which converts the filtered frequency domain data into time domain data. That time domain data, in which the wind noise has been suppressed, can then be output.
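The conversion back to the time domain can be sketched with a naive inverse discrete Fourier transform, again standing in for the IFFT to avoid dependencies. The name `idft` is hypothetical, and taking the real part assumes the filtered spectrum is still conjugate-symmetric, which holds when a real, symmetric gain is applied to the spectrum of a real frame.

```python
import cmath


def idft(spectrum):
    """Naive inverse DFT; returns the real part of each reconstructed sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


# A spectrum with only a DC component of 4 over 4 points is a constant signal of 1.
samples = idft([4, 0, 0, 0])
```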
In this implementation, audio data in which wind noise is partially present (audio data whose spectral centroid is greater than or equal to the third preset threshold and less than the first preset threshold) is filtered. Compared with a scheme that discards all audio data containing wind noise, this implementation retains more valid speech data.
If wind noise detection is performed using two channels of audio data, then in scenarios that cannot acquire two channels simultaneously, such as a single-microphone scenario, another acquisition device must be added, which is inconvenient and increases equipment cost. With the embodiments of the present invention, wind noise detection can be performed using a single channel of audio data, which is simple to operate and reduces equipment cost.
Fig. 2 is that wind provided in an embodiment of the present invention is made an uproar second of flow diagram of detection method, comprising:
S201: For each frame of audio data, convert the frame into frequency-domain data.
S202: Calculate the power spectral density of the frequency points in the frequency-domain data.
S203: Calculate the spectral centroid of the frequency-domain data from the power spectral density.
S204: Judge whether the spectral centroid is less than the first preset threshold; if so, execute S205; if not, execute S206.
S205: Determine that wind noise is present in the frame of audio data, and continue with S207.
S206: Output the frame of audio data.
S207: Judge whether the spectral centroid is less than the second preset threshold; if so, execute S208; if not, execute S209. The second preset threshold is less than the first preset threshold.
S208: Set the frame of audio data to zero and output the zeroed audio data.
S209: Filter the frequency-domain data with the preset filter to obtain filtered frequency-domain data.
S210: Convert the filtered frequency-domain data into time-domain data and output the converted time-domain data.
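Steps S201 and S202 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the power spectral density is estimated with a simple periodogram, |X(k)|^2 / N, and only the first N/2 bins are kept. The function names `dft` and `power_spectral_density` are hypothetical.

```python
import cmath
import math


def dft(frame):
    """S201 (sketch): naive DFT of one frame of time-domain samples."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n)
    ]


def power_spectral_density(spectrum):
    """S202 (sketch): periodogram estimate over bins 0 .. N/2 - 1.

    Assumption: P(k) = |X(k)|^2 / N. The patent may define the
    power spectral density differently.
    """
    n = len(spectrum)
    return [abs(x_k) ** 2 / n for x_k in spectrum[: n // 2]]


# An 8-sample frame holding two full cycles of a sine: energy sits in bin 2.
frame = [math.sin(2 * math.pi * 2 * t / 8) for t in range(8)]
psd = power_spectral_density(dft(frame))
peak_bin = max(range(len(psd)), key=psd.__getitem__)
```

A real-time system would typically window each frame and use an FFT; the naive transform is used here only to keep the sketch self-contained.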
The Fig. 2 embodiment of the present invention provides a complete method for wind noise detection and wind noise suppression. In the Fig. 2 embodiment, no third preset threshold is set; instead, every case in which the spectral centroid is greater than or equal to the second preset threshold is handled as a case of being greater than or equal to the third preset threshold.
As described in the Fig. 1 embodiment, the spectral centroid of wind noise is small while the spectral centroid of speech is large, so audio data can be divided by spectral centroid into three cases: first, strong wind noise or pure wind noise; second, negligible wind noise or no wind noise (pure speech); third, a mixture of wind noise and speech.
In the Fig. 2 embodiment, a different processing scheme is used for each of these three cases. In the first case (spectral centroid less than the second preset threshold), the audio quality is very poor, so the audio data is set to zero and output; that is, silent data is output. In the second case (spectral centroid greater than or equal to the first preset threshold), the audio quality is good, so the audio data is output directly. In the third case (spectral centroid greater than or equal to the second preset threshold and less than the first preset threshold), the audio quality lies between that of the first and second cases, so the frequency-domain data of the audio data is filtered and then converted back into audio data for output.
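The three-way handling described above can be sketched as a small routing function. The function name `route_frame`, the returned labels, and the numeric thresholds in the usage lines are hypothetical; the patent does not give threshold values in this section.

```python
def route_frame(spectral_centroid, first_threshold, second_threshold):
    """Pick the processing branch for one frame from its spectral centroid.

    Returns "output" (no wind noise: pass the frame through),
    "zero"   (strong or pure wind noise: emit silence), or
    "filter" (wind noise mixed with speech: filter, then convert back).
    """
    if second_threshold >= first_threshold:
        raise ValueError("second threshold must be less than first threshold")
    if spectral_centroid >= first_threshold:
        return "output"
    if spectral_centroid < second_threshold:
        return "zero"
    return "filter"


# Hypothetical thresholds in Hz, chosen only for illustration.
FIRST_THRESHOLD = 800.0
SECOND_THRESHOLD = 200.0
decisions = [
    route_frame(c, FIRST_THRESHOLD, SECOND_THRESHOLD)
    for c in (1500.0, 100.0, 500.0)
]
```

Checking the first threshold before the second mirrors the flow in which the second judgment is made only for frames already found to contain wind noise.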
In the Fig. 2 embodiment, audio data frames in which no wind noise is detected (frames whose spectral centroid is greater than or equal to the first preset threshold) are output directly. Some other schemes apply wind noise suppression to all of the audio data regardless of whether wind noise is present in a given frame. Compared with those schemes, this embodiment reduces unnecessary suppression operations. This benefit is more pronounced in scenarios where the wind noise is weak or intermittent.
In the Fig. 2 embodiment, audio data frames that partially contain wind noise (frames whose spectral centroid is greater than or equal to the second preset threshold and less than the first preset threshold) are filtered. Some other schemes discard all audio data in which wind noise is detected. Compared with those schemes, this embodiment retains more valid speech data.
With the embodiment shown in Fig. 2, on the basis of wind noise detection, different wind noise suppression measures can be applied to different wind noise conditions, so that the output audio data is of higher quality.
Corresponding to the above method embodiments, an embodiment of the present invention further provides a wind noise detection apparatus.
Fig. 3 is a schematic structural diagram of a wind noise detection apparatus provided by an embodiment of the present invention, comprising:
A first conversion module 301, configured to convert, for each frame of audio data, the frame into frequency-domain data;
A first calculation module 302, configured to calculate the power spectral density of the frequency points in the frequency-domain data;
A second calculation module 303, configured to calculate the spectral centroid of the frequency-domain data from the power spectral density;
A first judgment module 304, configured to judge whether the spectral centroid is less than the first preset threshold;
A determination module 305, configured to determine that wind noise is present in the frame of audio data when the judgment result of the first judgment module 304 is yes.
As an implementation, the apparatus may further include a second judgment module and a zeroing module (not shown in the figure), wherein:
The second judgment module is configured to judge, when the judgment result of the first judgment module is yes, whether the spectral centroid is less than the second preset threshold, the second preset threshold being less than the first preset threshold;
The zeroing module is configured to set the frame of audio data to zero when the judgment result of the second judgment module is yes.
As an implementation, the apparatus may further include a third judgment module, a filter module, and a second conversion module (not shown in the figure), wherein:
The third judgment module is configured to judge, when the judgment result of the first judgment module is yes, whether the spectral centroid is greater than or equal to the third preset threshold, the third preset threshold being less than the first preset threshold;
The filter module is configured to filter the frequency-domain data with the preset filter when the judgment result of the third judgment module is yes, obtaining filtered frequency-domain data;
The second conversion module is configured to convert the filtered frequency-domain data into time-domain data.
As an implementation, the apparatus may further include:
An output module (not shown in the figure), configured to output the frame of audio data when the judgment result of the first judgment module 304 is no.
As an implementation, the first calculation module 302 may be specifically configured to:
Sample the frequency-domain data to obtain a plurality of frequency points, and calculate the power spectral density of the plurality of frequency points;
The second calculation module 303 may be specifically configured to:
Calculate the spectral centroid of the frequency-domain data according to the sampling rate and the power spectral density.
As an implementation, the second calculation module 303 may be specifically configured to:
Calculate the spectral centroid of the frequency-domain data using the following formula:
[formula image not reproduced in this text]
Wherein SC denotes the spectral centroid, fs denotes the sampling rate, L denotes the total number of frequency points, X(k) denotes the frequency-domain value at the k-th frequency point, and k is a positive integer not greater than L. The formula also uses the power spectral density of the k-th frequency point, whose symbol and defining expression are not reproduced in this text.
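Because the formula itself is not reproduced in this text, the following Python sketch uses the conventional definition of the spectral centroid as a power-weighted mean frequency, built from the quantities named above (sampling rate fs, L frequency points, and the power spectral density of each point). Two points are assumptions rather than statements of the patent's formula: the weighting form, and the mapping of frequency point k (k = 1..L) to the frequency k * fs / (2 * L). The function name `spectral_centroid` is hypothetical.

```python
def spectral_centroid(psd, sample_rate):
    """Power-weighted mean frequency of L frequency points.

    psd[k - 1] holds the power spectral density of frequency point k,
    for k = 1 .. L. Assumed frequency of point k: k * fs / (2 * L).
    Returns 0.0 for an all-zero spectrum (a convention chosen here).
    """
    num_points = len(psd)
    total_power = sum(psd)
    if num_points == 0 or total_power <= 0.0:
        return 0.0
    weighted = sum(
        (k * sample_rate / (2.0 * num_points)) * p
        for k, p in enumerate(psd, start=1)
    )
    return weighted / total_power


# All power in frequency point 3 of 4 at fs = 8000 Hz: centroid at 3000 Hz.
single_peak = spectral_centroid([0.0, 0.0, 1.0, 0.0], 8000)
# Equal power in points 1 and 3: centroid midway, at 2000 Hz.
two_peaks = spectral_centroid([1.0, 0.0, 1.0, 0.0], 8000)
```

Under this definition a frame dominated by low-frequency energy, as is typical of wind noise, yields a small centroid, which is what the comparison against the first preset threshold relies on.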
As an implementation, the apparatus may further include:
A recording module (not shown in the figure), configured to record, when the judgment result of the third judgment module is no, the power spectral density corresponding to the spectral centroid as the wind noise power spectral density;
The filter module selects several frequency points from the continuous frequency-domain data, with k denoting the index of a selected frequency point, and may be configured to:
Determine the filtered frequency-domain data using the following formula:
Y(k) = H(k) * X(k),
Wherein Y(k) denotes the filtered frequency-domain data, H(k) denotes the preset filter, X(k) denotes the frequency-domain value at the k-th frequency point, and k is a positive integer. The definition of H(k) uses the power spectral density of the k-th frequency point and the wind noise power spectral density most recently recorded by the recording module; the symbols for these two quantities and the expression for H(k) are not reproduced in this text.
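The relation Y(k) = H(k) * X(k) can be illustrated with the Python sketch below. The expression for H(k) is not reproduced in this text, so the gain used here is an assumption: a spectral-subtraction-style gain, H(k) = max(floor, 1 - Pw(k) / Px(k)), where Px(k) is the power spectral density of the current frame and Pw(k) is the most recently recorded wind noise power spectral density. The function names and the `floor` parameter are hypothetical and should not be read as the patent's filter.

```python
def suppression_gain(frame_psd, wind_psd, floor=0.0):
    """Per-point gain H(k) under the assumed spectral-subtraction form."""
    gains = []
    for p_x, p_w in zip(frame_psd, wind_psd):
        if p_x <= 0.0:
            # No signal power at this point: apply the minimum gain.
            gains.append(floor)
        else:
            gains.append(max(floor, 1.0 - p_w / p_x))
    return gains


def apply_filter(spectrum, gains):
    """Y(k) = H(k) * X(k), evaluated point by point."""
    return [h * x for h, x in zip(gains, spectrum)]


# Point 1 is entirely wind noise; point 2 is mostly speech.
spectrum = [2 + 0j, 4 + 0j]
frame_psd = [4.0, 16.0]
wind_psd = [4.0, 4.0]   # last PSD recorded from a pure wind noise frame
gains = suppression_gain(frame_psd, wind_psd)
filtered = apply_filter(spectrum, gains)
```

Points where the recorded wind noise power accounts for most of the frame power are attenuated strongly, while points dominated by speech pass nearly unchanged.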
With the embodiment shown in Fig. 3, each frame of audio data is converted into frequency-domain data, and whether wind noise is present in the audio data is determined from the spectral centroid of the frequency-domain data. It can be seen that, first, wind noise detection can be performed using a single channel of audio data, without acquiring and comparing two channels of audio data simultaneously, which is simple to operate; and second, compared with schemes that provide two acquisition devices to acquire two channels of audio data, this scheme needs only one acquisition device to acquire a single channel of audio data, which reduces equipment cost.
Corresponding to the above method embodiments, an embodiment of the present invention further provides an electronic device which, as shown in Fig. 4, includes a processor 401 and a memory 402, wherein:
The memory 402 is configured to store a computer program;
The processor 401 is configured to implement any of the above wind noise detection methods when executing the program stored in the memory 402.
The memory of the above electronic device may include random access memory (Random Access Memory, RAM), and may also include non-volatile memory (Non-Volatile Memory, NVM), for example at least one disk storage. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.
The above processor may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU), a network processor (Network Processor, NP), and the like; it may also be a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
An embodiment of the present invention further provides a computer-readable storage medium in which a computer program is stored; when executed by a processor, the computer program implements any of the above wind noise detection methods.
An embodiment of the present invention further provides a wind noise detection system which, as shown in Fig. 5, comprises an audio acquisition device and a wind noise detection device, wherein:
The audio acquisition device is configured to acquire audio data and send the acquired audio data to the wind noise detection device;
The wind noise detection device is configured to receive the audio data sent by the audio acquisition device; for each frame of audio data, convert the frame into frequency-domain data; calculate the power spectral density of the frequency points in the frequency-domain data; calculate the spectral centroid of the frequency-domain data from the power spectral density; judge whether the spectral centroid is less than the first preset threshold; and if so, determine that wind noise is present in the frame of audio data.
As an implementation, the number of audio acquisition devices may be one; that is, the system may include only a single audio acquisition device.
In this implementation, the single audio acquisition device is configured to acquire one channel of audio data and send the acquired audio data to the wind noise detection device;
The wind noise detection device is configured to receive the one channel of audio data; for each frame of that audio data, convert the frame into frequency-domain data; calculate the power spectral density of the frequency points in the frequency-domain data; calculate the spectral centroid of the frequency-domain data from the power spectral density; judge whether the spectral centroid is less than the first preset threshold; and if so, determine that wind noise is present in the frame of audio data.
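The single-channel arrangement can be sketched end to end in Python: one stream of samples is split into frames and each frame is flagged when its spectral centroid falls below the first preset threshold. This is a minimal illustration under stated assumptions, not the device's implementation. The centroid is taken as a power-weighted mean frequency over bins 1..N/2 of a naive DFT, and the frame length, sampling rate, threshold value, and function names are all hypothetical.

```python
import cmath
import math


def frame_centroid(frame, sample_rate):
    """Power-weighted mean frequency over bins 1 .. N/2 (assumed definition).

    Returns 0.0 for a silent frame (a convention chosen here).
    """
    n = len(frame)
    weighted, total = 0.0, 0.0
    for k in range(1, n // 2 + 1):
        x_k = sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(frame))
        power = abs(x_k) ** 2
        weighted += (k * sample_rate / n) * power
        total += power
    return weighted / total if total > 0.0 else 0.0


def split_into_frames(samples, frame_length):
    """Non-overlapping frames; a trailing partial frame is dropped."""
    last_start = len(samples) - frame_length
    return [samples[i:i + frame_length] for i in range(0, last_start + 1, frame_length)]


def detect_wind_noise(samples, sample_rate, frame_length, first_threshold):
    """One flag per frame: True where the centroid is below the threshold."""
    return [
        frame_centroid(frame, sample_rate) < first_threshold
        for frame in split_into_frames(samples, frame_length)
    ]


SAMPLE_RATE = 8000  # hypothetical
low = [math.sin(2 * math.pi * 500 * t / SAMPLE_RATE) for t in range(16)]
high = [math.sin(2 * math.pi * 2000 * t / SAMPLE_RATE) for t in range(16)]
flags = detect_wind_noise(low + high, SAMPLE_RATE, 16, first_threshold=1000.0)
```

Only one input stream appears anywhere in the sketch, which reflects the point of this implementation: detection needs no second channel for comparison.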
As an implementation, the audio acquisition device and the wind noise detection device may be integrated; the two may be the same device.
With the illustrated embodiment of the present invention, the wind noise detection device converts each received frame of audio data into frequency-domain data and determines from the spectral centroid of the frequency-domain data whether wind noise is present in the audio data. It can be seen that, first, wind noise detection can be performed using a single channel of audio data, without acquiring and comparing two channels of audio data simultaneously, which is simple to operate; and second, compared with schemes that provide two acquisition devices to acquire two channels of audio data, this system needs only one acquisition device to acquire a single channel of audio data, which reduces equipment cost.
It should be noted that, in this document, relational terms such as first and second are used merely to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article, or device that includes the element.
The embodiments in this specification are described in a related manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, the wind noise detection apparatus embodiment shown in Fig. 3, the electronic device embodiment shown in Fig. 4, and the wind noise detection system embodiment shown in Fig. 5 are described relatively briefly because they are substantially similar to the wind noise detection method embodiments shown in Figs. 1-2; for relevant details, refer to the description of the method embodiments shown in Figs. 1-2.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the scope of protection of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention falls within the scope of protection of the present invention.
Claims (16)
- 1. A wind noise detection method, characterized by comprising: for each frame of audio data, converting the frame of audio data into frequency-domain data; calculating the power spectral density of the frequency points in the frequency-domain data; calculating the spectral centroid of the frequency-domain data from the power spectral density; judging whether the spectral centroid is less than a first preset threshold; and if so, determining that wind noise is present in the frame of audio data.
- 2. The method according to claim 1, characterized in that, after determining that wind noise is present in the frame of audio data, the method further comprises: judging whether the spectral centroid is less than a second preset threshold, the second preset threshold being less than the first preset threshold; and if so, setting the frame of audio data to zero.
- 3. The method according to claim 1, characterized in that, after determining that wind noise is present in the frame of audio data, the method further comprises: judging whether the spectral centroid is greater than or equal to a third preset threshold, the third preset threshold being less than the first preset threshold; if so, filtering the frequency-domain data with a preset filter to obtain filtered frequency-domain data; and converting the filtered frequency-domain data into time-domain data.
- 4. The method according to claim 1, characterized in that, when the spectral centroid is judged to be not less than the first preset threshold, the method further comprises: outputting the frame of audio data.
- 5. The method according to claim 1, characterized in that calculating the power spectral density of the frequency points in the frequency-domain data comprises: sampling the frequency-domain data to obtain a plurality of frequency points, and calculating the power spectral density of the plurality of frequency points; and calculating the spectral centroid of the frequency-domain data from the power spectral density comprises: calculating the spectral centroid of the frequency-domain data according to the sampling rate and the power spectral density.
- 6. The method according to claim 1, characterized in that calculating the spectral centroid of the frequency-domain data from the power spectral density comprises: calculating the spectral centroid of the frequency-domain data using the following formula: [formula image not reproduced in this text] wherein SC denotes the spectral centroid, fs denotes the sampling rate, L denotes the total number of frequency points, X(k) denotes the frequency-domain value at the k-th frequency point, and k is a positive integer not greater than L; the formula also uses the power spectral density of the k-th frequency point, whose symbol and defining expression are not reproduced in this text.
- 7. The method according to claim 3, characterized in that, when the spectral centroid is judged to be less than the third preset threshold, the method further comprises: recording the power spectral density corresponding to the spectral centroid as the wind noise power spectral density; and filtering the frequency-domain data with the preset filter to obtain filtered frequency-domain data comprises: determining the filtered frequency-domain data using the following formula: Y(k) = H(k) * X(k), wherein Y(k) denotes the filtered frequency-domain data, H(k) denotes the preset filter, X(k) denotes the frequency-domain value at the k-th frequency point, and k is a positive integer; the definition of H(k) uses the power spectral density of the k-th frequency point and the most recently recorded wind noise power spectral density, and its expression is not reproduced in this text.
- 8. A wind noise detection apparatus, characterized by comprising: a first conversion module, configured to convert, for each frame of audio data, the frame of audio data into frequency-domain data; a first calculation module, configured to calculate the power spectral density of the frequency points in the frequency-domain data; a second calculation module, configured to calculate the spectral centroid of the frequency-domain data from the power spectral density; a first judgment module, configured to judge whether the spectral centroid is less than a first preset threshold; and a determination module, configured to determine that wind noise is present in the frame of audio data when the judgment result of the first judgment module is yes.
- 9. The apparatus according to claim 8, characterized in that the apparatus further comprises: a second judgment module, configured to judge, when the judgment result of the first judgment module is yes, whether the spectral centroid is less than a second preset threshold, the second preset threshold being less than the first preset threshold; and a zeroing module, configured to set the frame of audio data to zero when the judgment result of the second judgment module is yes.
- 10. The apparatus according to claim 8, characterized in that the apparatus further comprises: a third judgment module, configured to judge, when the judgment result of the first judgment module is yes, whether the spectral centroid is greater than or equal to a third preset threshold, the third preset threshold being less than the first preset threshold; a filter module, configured to filter the frequency-domain data with a preset filter when the judgment result of the third judgment module is yes, obtaining filtered frequency-domain data; and a second conversion module, configured to convert the filtered frequency-domain data into time-domain data.
- 11. The apparatus according to claim 8, characterized in that the apparatus further comprises: an output module, configured to output the frame of audio data when the judgment result of the first judgment module is no.
- 12. The apparatus according to claim 8, characterized in that the first calculation module is specifically configured to: sample the frequency-domain data to obtain a plurality of frequency points, and calculate the power spectral density of the plurality of frequency points; and the second calculation module is specifically configured to: calculate the spectral centroid of the frequency-domain data according to the sampling rate and the power spectral density.
- 13. The apparatus according to claim 8, characterized in that the second calculation module is specifically configured to: calculate the spectral centroid of the frequency-domain data using the following formula: [formula image not reproduced in this text] wherein SC denotes the spectral centroid, fs denotes the sampling rate, L denotes the total number of frequency points, X(k) denotes the frequency-domain value at the k-th frequency point, and k is a positive integer not greater than L; the formula also uses the power spectral density of the k-th frequency point, whose symbol and defining expression are not reproduced in this text.
- 14. The apparatus according to claim 10, characterized in that the apparatus further comprises: a recording module, configured to record, when the judgment result of the third judgment module is no, the power spectral density corresponding to the spectral centroid as the wind noise power spectral density; and the filter module is specifically configured to: determine the filtered frequency-domain data using the following formula: Y(k) = H(k) * X(k), wherein Y(k) denotes the filtered frequency-domain data, H(k) denotes the preset filter, X(k) denotes the frequency-domain value at the k-th frequency point, and k is a positive integer; the definition of H(k) uses the power spectral density of the k-th frequency point and the wind noise power spectral density most recently recorded by the recording module, and its expression is not reproduced in this text.
- 15. An electronic device, characterized by comprising a processor and a memory, wherein: the memory is configured to store a computer program; and the processor is configured to implement the method steps of any one of claims 1 to 7 when executing the program stored in the memory.
- 16. A wind noise detection system, characterized by comprising an audio acquisition device and a wind noise detection device, wherein: the audio acquisition device is configured to acquire audio data and send the acquired audio data to the wind noise detection device; and the wind noise detection device is configured to receive the audio data sent by the audio acquisition device; for each frame of that audio data, convert the frame into frequency-domain data; calculate the power spectral density of the frequency points in the frequency-domain data; calculate the spectral centroid of the frequency-domain data from the power spectral density; judge whether the spectral centroid is less than a first preset threshold; and if so, determine that wind noise is present in the frame of audio data.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710754716.4A CN109427345B (en) | 2017-08-29 | 2017-08-29 | Wind noise detection method, device and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710754716.4A CN109427345B (en) | 2017-08-29 | 2017-08-29 | Wind noise detection method, device and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109427345A true CN109427345A (en) | 2019-03-05 |
CN109427345B CN109427345B (en) | 2022-12-02 |
Family
ID=65501950
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710754716.4A Active CN109427345B (en) | 2017-08-29 | 2017-08-29 | Wind noise detection method, device and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109427345B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101031963A (en) * | 2004-09-16 | 2007-09-05 | 法国电信 | Method of processing a noisy sound signal and device for implementing said method |
US20110188685A1 (en) * | 2009-12-29 | 2011-08-04 | Sheikh Naim | Method for the detection of whistling in an audio system |
US20120008799A1 (en) * | 2009-04-03 | 2012-01-12 | Sascha Disch | Apparatus and method for determining a plurality of local center of gravity frequencies of a spectrum of an audio signal |
CN103345921A (en) * | 2013-07-15 | 2013-10-09 | 南京理工大学 | Nighttime sleeping sound signal analyzing method based on multiple characteristics |
US20160203833A1 (en) * | 2013-08-30 | 2016-07-14 | Zte Corporation | Voice Activity Detection Method and Device |
US20160225388A1 (en) * | 2013-10-25 | 2016-08-04 | Intel IP Corporation | Audio processing devices and audio processing methods |
CN106463106A (en) * | 2014-07-14 | 2017-02-22 | 英特尔Ip公司 | Wind noise reduction for audio reception |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112309420A (en) * | 2020-10-30 | 2021-02-02 | 出门问问(苏州)信息科技有限公司 | Method and device for detecting wind noise |
CN112309420B (en) * | 2020-10-30 | 2023-06-27 | 出门问问(苏州)信息科技有限公司 | Method and device for detecting wind noise |
Also Published As
Publication number | Publication date |
---|---|
CN109427345B (en) | 2022-12-02 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||