CN109427345A - Wind noise detection method, apparatus and system - Google Patents

Wind noise detection method, apparatus and system

Info

Publication number
CN109427345A
Authority
CN
China
Prior art keywords
frequency domain
noise
wind
domain data
audio data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710754716.4A
Other languages
Chinese (zh)
Other versions
CN109427345B (en)
Inventor
杨茜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN201710754716.4A
Publication of CN109427345A
Application granted
Publication of CN109427345B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L21/0232 Processing in the frequency domain
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0316 Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/18 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/21 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being power information

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Signal Processing (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Circuit For Audible Band Transducer (AREA)
  • Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)

Abstract

Embodiments of the present invention provide a wind noise detection method, apparatus and system. The method includes: converting each frame of audio data into frequency domain data, and determining, according to the spectral centroid of the frequency domain data, whether wind noise is present in the audio data. First, wind noise detection can be performed using a single channel of audio data, with no need to acquire two channels of audio data simultaneously and compare them, so operation is simple. Second, compared with a scheme that sets up two acquisition devices to acquire two channels of audio data, this scheme needs only one acquisition device acquiring a single channel of audio data, which reduces equipment cost.

Description

Wind noise detection method, apparatus and system
Technical field
The present invention relates to the field of audio processing technology, and in particular to a wind noise detection method, apparatus and system.
Background
During audio acquisition, wind noise is often a key factor affecting audio quality. To improve audio quality, wind noise detection usually needs to be performed on the acquired audio data.
Wind noise detection usually requires acquiring two channels of audio data simultaneously and comparing them, for example comparing attribute values such as the peak relationship and the signal-to-noise ratio of the two channels, and then determining from the comparison result whether wind noise is present in the audio data.
This scheme requires two channels of audio data to be acquired simultaneously. In scenarios that lack this capability, such as a single-microphone scenario, acquiring two channels simultaneously requires adding another acquisition device, which is inconvenient.
Summary of the invention
An object of the embodiments of the present invention is to provide a wind noise detection method, apparatus and system, so as to perform wind noise detection using a single channel of audio data.
To achieve the above object, an embodiment of the present invention provides a wind noise detection method, comprising:
for each frame of audio data, converting the frame of audio data into frequency domain data;
calculating the power spectral density of frequency points in the frequency domain data;
calculating the spectral centroid of the frequency domain data from the power spectral density;
judging whether the spectral centroid is less than a first preset threshold;
if so, determining that wind noise is present in the frame of audio data.
Optionally, after determining that wind noise is present in the frame of audio data, the method may further include:
judging whether the spectral centroid is less than a second preset threshold, the second preset threshold being less than the first preset threshold;
if so, setting the frame of audio data to zero.
Optionally, after determining that wind noise is present in the frame of audio data, the method may further include:
judging whether the spectral centroid is greater than or equal to a third preset threshold, the third preset threshold being less than the first preset threshold;
if so, filtering the frequency domain data with a preset filter to obtain filtered frequency domain data;
converting the filtered frequency domain data into time domain data.
Optionally, when the spectral centroid is judged to be not less than the first preset threshold, the method may further include:
outputting the frame of audio data.
Optionally, calculating the power spectral density of frequency points in the frequency domain data may include:
sampling the frequency domain data to obtain multiple frequency points, and calculating the power spectral density of the multiple frequency points;
and calculating the spectral centroid of the frequency domain data from the power spectral density may include:
calculating the spectral centroid of the frequency domain data according to the sample rate and the power spectral density.
Optionally, calculating the spectral centroid of the frequency domain data from the power spectral density includes:
calculating the spectral centroid of the frequency domain data using the following formula:
$$SC = \frac{f_s}{L}\cdot\frac{\sum_{k=1}^{L} k\,\Phi_X(k)}{\sum_{k=1}^{L}\Phi_X(k)}$$
where SC denotes the spectral centroid, fs denotes the sample rate, L denotes the total number of frequency points, $\Phi_X(k)$ denotes the power spectral density of the k-th frequency point, $\Phi_X(k) = |X(k)|^2$, X(k) denotes the frequency-domain value of the k-th frequency point, and k is a positive integer not exceeding L.
Optionally, when the spectral centroid is judged to be less than the third preset threshold, the method may further include:
recording the power spectral density corresponding to the spectral centroid as the wind noise power spectral density;
and filtering the frequency domain data with the preset filter to obtain filtered frequency domain data includes:
determining the filtered frequency domain data using the following formula:
$$Y(k) = H(k)\,X(k), \qquad H(k) = \frac{\Phi_X(k) - \Phi_n(k)}{\Phi_X(k)},$$
where Y(k) denotes the filtered frequency domain data, H(k) denotes the preset filter, X(k) denotes the frequency-domain value of the k-th frequency point, $\Phi_X(k)$ denotes the power spectral density of the k-th frequency point, and $\Phi_n(k)$ denotes the most recently recorded wind noise power spectral density.
To achieve the above object, an embodiment of the present invention further provides a wind noise detection apparatus, comprising:
a first conversion module, configured to convert, for each frame of audio data, the frame of audio data into frequency domain data;
a first calculation module, configured to calculate the power spectral density of frequency points in the frequency domain data;
a second calculation module, configured to calculate the spectral centroid of the frequency domain data from the power spectral density;
a first judgment module, configured to judge whether the spectral centroid is less than a first preset threshold;
a determination module, configured to determine that wind noise is present in the frame of audio data when the judgment result of the first judgment module is yes.
Optionally, the apparatus may further include:
a second judgment module, configured to judge, when the judgment result of the first judgment module is yes, whether the spectral centroid is less than a second preset threshold, the second preset threshold being less than the first preset threshold;
a zero-setting module, configured to set the frame of audio data to zero when the judgment result of the second judgment module is yes.
Optionally, the apparatus may further include:
a third judgment module, configured to judge, when the judgment result of the first judgment module is yes, whether the spectral centroid is greater than or equal to a third preset threshold, the third preset threshold being less than the first preset threshold;
a filter module, configured to filter the frequency domain data with a preset filter to obtain filtered frequency domain data when the judgment result of the third judgment module is yes;
a second conversion module, configured to convert the filtered frequency domain data into time domain data.
Optionally, the apparatus may further include:
an output module, configured to output the frame of audio data when the judgment result of the first judgment module is no.
Optionally, the first calculation module may be specifically configured to:
sample the frequency domain data to obtain multiple frequency points, and calculate the power spectral density of the multiple frequency points;
and the second calculation module may be specifically configured to:
calculate the spectral centroid of the frequency domain data according to the sample rate and the power spectral density.
Optionally, the second calculation module may be specifically configured to:
calculate the spectral centroid of the frequency domain data using the following formula:
$$SC = \frac{f_s}{L}\cdot\frac{\sum_{k=1}^{L} k\,\Phi_X(k)}{\sum_{k=1}^{L}\Phi_X(k)}$$
where SC denotes the spectral centroid, fs denotes the sample rate, L denotes the total number of frequency points, $\Phi_X(k)$ denotes the power spectral density of the k-th frequency point, $\Phi_X(k) = |X(k)|^2$, X(k) denotes the frequency-domain value of the k-th frequency point, and k is a positive integer not exceeding L.
Optionally, the apparatus may further include:
a recording module, configured to record, when the judgment result of the third judgment module is no, the power spectral density corresponding to the spectral centroid as the wind noise power spectral density;
and the filter module may be specifically configured to:
determine the filtered frequency domain data using the following formula:
$$Y(k) = H(k)\,X(k), \qquad H(k) = \frac{\Phi_X(k) - \Phi_n(k)}{\Phi_X(k)},$$
where Y(k) denotes the filtered frequency domain data, H(k) denotes the preset filter, X(k) denotes the frequency-domain value of the k-th frequency point, $\Phi_X(k)$ denotes the power spectral density of the k-th frequency point, and $\Phi_n(k)$ denotes the wind noise power spectral density most recently recorded by the recording module.
To achieve the above object, an embodiment of the present invention further provides an electronic device, including a processor and a memory, wherein:
the memory is configured to store a computer program;
the processor is configured to implement any of the wind noise detection methods described above when executing the program stored in the memory.
To achieve the above object, an embodiment of the present invention further provides a wind noise detection system, comprising an audio acquisition device and a wind noise detection device, wherein:
the audio acquisition device is configured to acquire audio data and send the acquired audio data to the wind noise detection device;
the wind noise detection device is configured to receive the audio data sent by the audio acquisition device and, for each frame of audio data therein: convert the frame of audio data into frequency domain data; calculate the power spectral density of frequency points in the frequency domain data; calculate the spectral centroid of the frequency domain data from the power spectral density; judge whether the spectral centroid is less than a first preset threshold; and if so, determine that wind noise is present in the frame of audio data.
With the illustrated embodiments of the present invention, each frame of audio data is converted into frequency domain data, and whether wind noise is present in the audio data is determined according to the spectral centroid of the frequency domain data. In this scheme, wind noise detection can therefore be performed using a single channel of audio data, with no need to acquire two channels of audio data simultaneously and compare them, so operation is simple.
Of course, implementing any product or method of the present invention does not necessarily require achieving all of the advantages described above at the same time.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a first schematic flowchart of the wind noise detection method provided by an embodiment of the present invention;
Fig. 2 is a second schematic flowchart of the wind noise detection method provided by an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a wind noise detection apparatus provided by an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of an electronic device provided by an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a wind noise detection system provided by an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
To solve the above technical problem, embodiments of the present invention provide a wind noise detection method, apparatus and device. The method and apparatus can be applied to any device with an audio processing function; the specific device is not limited.
The wind noise detection method provided by an embodiment of the present invention is described in detail first.
Fig. 1 is a first schematic flowchart of the wind noise detection method provided by an embodiment of the present invention, comprising:
S101: for each frame of audio data, convert the frame of audio data into frequency domain data.
S102: calculate the power spectral density of frequency points in the frequency domain data.
S103: calculate the spectral centroid of the frequency domain data from the power spectral density.
S104: judge whether the spectral centroid is less than the first preset threshold; if so, execute S105.
S105: determine that wind noise is present in the frame of audio data.
With the embodiment shown in Fig. 1, each frame of audio data is converted into frequency domain data, and whether wind noise is present in the audio data is determined according to the spectral centroid of the frequency domain data. First, wind noise detection can be performed using a single channel of audio data, with no need to acquire two channels of audio data simultaneously and compare them, so operation is simple. Second, compared with a scheme that sets up two acquisition devices to acquire two channels of audio data, this scheme needs only one acquisition device acquiring a single channel of audio data, which reduces equipment cost.
The embodiment shown in Fig. 1 is described in detail below:
S101: for each frame of audio data, convert the frame of audio data into frequency domain data.
The device that executes this scheme (the executing entity, hereinafter "this device") may be an audio acquisition device, or another electronic device communicatively connected to an audio acquisition device. If this device is an audio acquisition device, it can perform wind noise detection with this scheme on each frame of audio data it acquires. If this device is another electronic device communicatively connected to an audio acquisition device, the electronic device can obtain the acquired audio data from the audio acquisition device and perform wind noise detection with this scheme on each frame of audio data.
In an optional embodiment of the present invention, the acquired audio data may be divided into frames, for example with every 20 ms of audio data forming one frame. The specific frame length can be set according to the actual situation.
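The framing step can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the function name `split_into_frames`, the non-overlapping layout, and the choice to drop a trailing partial frame are all assumptions; only the 20 ms example frame length comes from the text.

```python
def split_into_frames(samples, sample_rate, frame_ms=20):
    """Split a mono sample sequence into consecutive, non-overlapping frames."""
    frame_len = int(sample_rate * frame_ms / 1000)
    if frame_len <= 0:
        raise ValueError("frame length must be at least one sample")
    # Assumption: a trailing partial frame is dropped; padding it is equally valid.
    n_frames = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]


# 1000 samples at 16 kHz: a 20 ms frame holds 320 samples, so 3 full frames fit.
frames = split_into_frames(list(range(1000)), sample_rate=16000)
```

At a 16 kHz sample rate the 20 ms frame corresponds to 320 samples; other sample rates or frame lengths change only `frame_len`.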
A time-domain to frequency-domain conversion algorithm can be used to convert a frame of audio data into frequency domain data, for example the Fourier transform or the fast Fourier transform (FFT); the specific algorithm is not limited.
Suppose a frame of audio data is x(t) and is converted by the FFT algorithm into frequency domain data X. The frequency domain data X contains the frequency-domain value of each frequency point; for example, the frequency-domain value of the k-th frequency point is X(k).
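As a small illustration of the conversion from x(t) to X(k), the sketch below uses a naive discrete Fourier transform written with the standard library only. It is an assumption-laden stand-in: a real system would call an FFT routine, the function name `dft` is hypothetical, and Python indexes bins from 0 whereas the text counts frequency points from 1.

```python
import cmath
import math


def dft(frame):
    """Naive O(N^2) discrete Fourier transform of one frame of real samples."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


# A 4-sample frame holding one cycle of a cosine: approximately [1, 0, -1, 0].
frame = [math.cos(2 * math.pi * t / 4) for t in range(4)]
X = dft(frame)  # energy appears in bin 1 and in its mirror image, bin 3
```

For a real-valued frame the output is conjugate symmetric, which is why the mirrored bin carries the same magnitude.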
S102: calculate the power spectral density of frequency points in the frequency domain data.
In one implementation, the frequency domain data may be sampled to obtain multiple frequency points, and the power spectral density of these frequency points calculated. Suppose frequency domain data X is sampled to obtain L frequency points and the power spectral density of each of these L frequency points is calculated, where the power spectral density of the k-th frequency point is $\Phi_X(k) = |X(k)|^2$ and k is a positive integer not exceeding L.
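A minimal sketch of the per-point power spectral density follows. It assumes the simplest estimate, the squared magnitude of each frequency-domain value; any smoothing or normalization is not covered here, and the function name `power_spectral_density` is hypothetical.

```python
def power_spectral_density(spectrum):
    """Per-point power estimate: squared magnitude of each frequency-domain value."""
    # Assumption: PSD(k) = |X(k)|^2, with no smoothing or normalization.
    return [abs(value) ** 2 for value in spectrum]


psd = power_spectral_density([3 + 4j, 0, 1j])
```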
S103: calculate the spectral centroid of the frequency domain data from the power spectral density.
The spectral centroid (SC) represents the balance point of the spectral energy distribution of a frame of audio data and reflects the center of the frequency distribution of the audio data.
In general, the spectral centroid can be calculated with formula A:
$$SC = \frac{1}{E}\int_{\omega_1}^{\omega_2} \omega\,|F(\omega)|^2\,d\omega$$
where $[\omega_1, \omega_2]$ denotes the frequency range of the frequency domain data, F(ω) denotes the frequency-domain value, and E denotes the energy of the frequency domain data.
Formula A calculates the spectral centroid of a continuous segment of frequency domain data. Alternatively, formula B can be used to calculate the spectral centroid of the frequency domain data:
$$SC = \frac{f_s}{L}\cdot\frac{\sum_{k=1}^{L} k\,\Phi_X(k)}{\sum_{k=1}^{L}\Phi_X(k)}$$
where fs denotes the sample rate, L denotes the total number of frequency points, $\Phi_X(k)$ denotes the power spectral density of the k-th frequency point, $\Phi_X(k) = |X(k)|^2$, X(k) denotes the frequency-domain value of the k-th frequency point, and k is a positive integer not exceeding L.
Sampling a continuous segment of frequency domain data at sample rate fs and substituting ω = fs*k/L into formula A yields formula B. Because the frequency domain data is conjugate symmetric, only half of the frequency points need to be considered, that is, k can range over [1, L/2].
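The discrete spectral centroid can be sketched as a power-weighted mean of the point frequencies fs*k/L, restricted to half of the points as described above. This is an illustrative reading under stated assumptions: `psd[k]` is taken to be the value at DFT bin k, so index 0 (DC) is skipped and k runs over 1..L/2; the function name `spectral_centroid` and the `None` return for a frame with no energy are hypothetical choices not found in the text.

```python
def spectral_centroid(psd, sample_rate):
    """Power-weighted mean frequency in Hz over bins 1..L/2."""
    n = len(psd)
    bins = range(1, n // 2 + 1)
    total = sum(psd[k] for k in bins)
    if total == 0:
        # Assumption: frames with no energy in these bins have no defined centroid.
        return None
    return sample_rate / n * sum(k * psd[k] for k in bins) / total


# 8 points at 8000 Hz: bin k sits at 1000 * k Hz.
single_tone = spectral_centroid([0, 0, 1.0, 0, 0, 0, 0, 0], 8000)   # 2000.0
two_tones = spectral_centroid([0, 1.0, 0, 1.0, 0, 0, 0, 0], 8000)   # 2000.0
```

Equal power at 1000 Hz and 3000 Hz balances at 2000 Hz, matching the "balance point" interpretation of the centroid.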
S104: judge whether the spectral centroid is less than the first preset threshold; if so, execute S105.
S105: determine that wind noise is present in the frame of audio data.
Because the spectral centroid of wind noise is small and the spectral centroid of speech is large:
In the first case, if the spectral centroid calculated in S103 is very small, close to the spectral centroid of pure wind noise, the audio data can be considered to contain strong wind noise;
In the second case, if the spectral centroid calculated in S103 is very large, close to the spectral centroid of pure speech, the wind noise in the audio data can be considered negligible or absent;
In the third case, if the spectral centroid calculated in S103 lies between the first two cases, the audio data can be considered a mixture of speech and wind noise.
On this basis, a threshold (the first preset threshold) can be set. If the spectral centroid calculated in S103 is less than the first preset threshold, that is, the spectral centroid is small, the frame is considered to belong to the first or the third case, and wind noise is determined to be present in the frame of audio data.
In one implementation, if the spectral centroid calculated in S103 is greater than or equal to the first preset threshold, the frame of audio data from S101 can be output directly. In this situation the frame is considered to belong to the second case: the quality of the frame of audio data is good, so it can be output directly.
The above scheme completes wind noise detection. In one implementation, wind noise suppression can further be performed on the basis of the completed detection.
In one implementation, after S105, it may further be judged whether the spectral centroid is less than a second preset threshold; if so, the frame of audio data is set to zero.
According to the three cases described above, a small spectral centroid calculated in S103 may correspond to the first case or the third case. A threshold smaller than the first preset threshold (the second preset threshold) can therefore be set. If the spectral centroid calculated in S103 is less than the second preset threshold, the frame is considered to belong to the first case, and strong wind noise is present in the audio data. In this case, the frame of audio data can be set to zero. Outputting the zeroed frame outputs silence.
In this implementation, if the wind noise in a frame of audio data is excessive, silence is output so that the user does not hear meaningless wind noise, which gives a better experience.
In one implementation, after S105, it may further be judged whether the spectral centroid is greater than or equal to a third preset threshold, the third preset threshold being less than the first preset threshold;
if so, the frequency domain data is filtered with a preset filter to obtain filtered frequency domain data;
the filtered frequency domain data is converted into time domain data.
This implementation is executed when the spectral centroid calculated in S103 is less than the first preset threshold and greater than or equal to the third preset threshold; the third preset threshold is therefore less than the first preset threshold.
The third preset threshold may be equal to or greater than the second preset threshold. For example, in the Fig. 2 embodiment described later, the third preset threshold equals the second preset threshold, and every case of being greater than or equal to the second preset threshold is handled as a case of being greater than or equal to the third preset threshold.
This implementation can be regarded as addressing the third case above, in which the audio data is a mixture of wind noise and speech; in this case, wind noise suppression is achieved by filtering the audio data.
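The three cases and their handling can be summarized in a short decision function. The sketch below follows the convention stated above for the Fig. 2 embodiment, in which the third preset threshold equals the second; the names `FrameAction` and `choose_action` and the threshold values in the usage lines are hypothetical.

```python
from enum import Enum


class FrameAction(Enum):
    OUTPUT_UNCHANGED = "no wind noise: output the frame as is"
    SET_TO_ZERO = "strong wind noise: output silence"
    FILTER = "speech mixed with wind noise: filter the frame"


def choose_action(centroid_hz, first_threshold_hz, second_threshold_hz):
    """Map a spectral centroid to the handling described for the three cases."""
    if second_threshold_hz >= first_threshold_hz:
        raise ValueError("the second threshold must be below the first")
    if centroid_hz >= first_threshold_hz:
        return FrameAction.OUTPUT_UNCHANGED   # second case
    if centroid_hz < second_threshold_hz:
        return FrameAction.SET_TO_ZERO        # first case
    return FrameAction.FILTER                 # third case


# Hypothetical thresholds of 1000 Hz and 300 Hz, chosen only for illustration.
actions = [choose_action(sc, 1000.0, 300.0) for sc in (1200.0, 100.0, 500.0)]
```

Suitable threshold values depend on the microphone, the sample rate and the expected speech content, and the text does not give any.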
In an optional filtering mode, if a frame of audio data belongs to the first case above, that is, if the spectral centroid of the frame is less than the third preset threshold, the power spectral density of the frame is recorded as the wind noise power spectral density. The recorded wind noise power spectral density can then be used to filter audio data.
Several frequency points are selected from the continuous frequency domain data, and the index of a selected frequency point is denoted k. The filtered frequency domain data can then be determined using the following formula:
$$Y(k) = H(k)\,X(k), \qquad H(k) = \frac{\Phi_X(k) - \Phi_n(k)}{\Phi_X(k)},$$
where Y(k) denotes the filtered frequency domain data, H(k) denotes the preset filter, X(k) denotes the frequency-domain value of the k-th frequency point, $\Phi_X(k)$ denotes the power spectral density of the k-th frequency point, $\Phi_n(k)$ denotes the most recently recorded wind noise power spectral density, and k is a positive integer.
In this filtering mode, only the latest wind noise power spectral density may be recorded: whenever the spectral centroid of a frame of audio data is judged to be less than the third preset threshold, the power spectral density of that frame is recorded as the wind noise power spectral density and the previously recorded wind noise power spectral density is deleted. Then, when a frame of audio data is filtered with the above formula, the wind noise power spectral density term in the formula is simply the recorded one. Alternatively, previously recorded wind noise power spectral densities may be retained; in that case, when a frame of audio data is filtered with the above formula, the most recently recorded wind noise power spectral density must be identified among the recorded ones and used as the wind noise term in the formula.
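The first recording strategy, keeping only the most recent wind noise power spectral density, can be sketched as a small stateful helper. The class name `WindNoiseTracker` and its `update` method are hypothetical, and the 300 Hz threshold in the usage lines is an illustrative value only.

```python
class WindNoiseTracker:
    """Keeps only the most recently recorded wind noise power spectral density."""

    def __init__(self, third_threshold_hz):
        self.third_threshold_hz = third_threshold_hz
        self.latest_psd = None  # nothing recorded yet

    def update(self, centroid_hz, psd):
        # A frame whose centroid is below the third threshold is treated as
        # wind noise, and its PSD overwrites the previous record.
        if centroid_hz < self.third_threshold_hz:
            self.latest_psd = list(psd)
        return self.latest_psd


tracker = WindNoiseTracker(third_threshold_hz=300.0)
before = tracker.update(500.0, [1.0, 1.0])   # not recorded: still None
first = tracker.update(100.0, [4.0, 2.0])    # recorded
second = tracker.update(200.0, [9.0, 1.0])   # overwrites the earlier record
kept = tracker.update(800.0, [0.0, 0.0])     # not recorded: previous record kept
```

Overwriting rather than appending gives the same filtering result as the alternative of retaining all records and looking up the newest one, while using constant memory.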
The preset filter is derived as follows:
Suppose the frequency domain data of a frame of audio data is X = s + n, where s denotes the speech part of the frequency domain data and n denotes the wind noise part of the frequency domain data.
Correspondingly, the power spectral density of frequency domain data X is $\Phi_X(k) = \Phi_s(k) + \Phi_n(k)$, where $\Phi_s(k)$ denotes the power spectral density of the speech part and $\Phi_n(k)$ denotes the most recently recorded wind noise power spectral density.
The filter is set as $H(k) = \frac{\Phi_s(k)}{\Phi_X(k)} = \frac{\Phi_X(k) - \Phi_n(k)}{\Phi_X(k)}$. The range of H(k) is [0, 1]. The larger the wind noise, the smaller H(k) and the smaller Y(k), which means stronger suppression of the audio data; the smaller the wind noise, the larger H(k) and Y(k), which means weaker suppression of the audio data.
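A gain with the properties described above (range [0, 1], smaller when the wind noise is larger) can be sketched as follows. The code assumes the spectral-subtraction form H(k) = (PSD_X(k) - PSD_n(k)) / PSD_X(k), clamped to [0, 1]; that form, the clamping, the zero gain for empty points, and the function names `wind_noise_gain` and `apply_gain` are assumptions for illustration.

```python
def wind_noise_gain(psd, noise_psd):
    """Per-point gain H(k) in [0, 1]; assumed form (PSD_X - PSD_n) / PSD_X."""
    gains = []
    for p_x, p_n in zip(psd, noise_psd):
        if p_x <= 0:
            gains.append(0.0)  # assumption: points with no power are fully suppressed
        else:
            gains.append(min(1.0, max(0.0, (p_x - p_n) / p_x)))
    return gains


def apply_gain(spectrum, gains):
    """Y(k) = H(k) * X(k) for every frequency point."""
    return [h * x for h, x in zip(gains, spectrum)]


gains = wind_noise_gain([4.0, 2.0, 1.0, 0.0], [1.0, 2.0, 3.0, 1.0])
filtered = apply_gain([2 + 2j, 1j], [0.5, 0.0])
```

The clamp covers frames whose measured power falls below the recorded noise power, where the unclamped ratio would be negative.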
Y(k) is then transformed with the IFFT (inverse fast Fourier transform), which converts the filtered frequency domain data into time domain data. The time domain data can then be output, that is, the time domain data after wind noise suppression is output.
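The conversion back to the time domain can be illustrated with a naive inverse discrete Fourier transform. As with the forward direction, this is a stand-in for an IFFT routine; the function name `idft` is hypothetical, and taking the real part assumes the filtered spectrum is still conjugate symmetric, which holds when the same real gain is applied to each pair of mirrored points.

```python
import cmath


def idft(spectrum):
    """Naive O(N^2) inverse DFT returning real time-domain samples."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


# A spectrum with value 2 in bins 1 and 3 of a 4-point frame is one cosine cycle.
samples = idft([0, 2, 0, 2])  # approximately [1, 0, -1, 0]
```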
In this implementation, the audio data that partly contains wind noise (audio data whose spectral centroid is greater than or equal to the third preset threshold and less than the first preset threshold) is filtered. Compared with a scheme that discards all audio data containing wind noise, this implementation retains more useful speech data.
If wind noise detection is performed using two channels of audio data, then in scenarios that cannot acquire two channels of audio data simultaneously, such as a single-microphone scenario, another acquisition device must be added, which is inconvenient and increases equipment cost. With the illustrated embodiments of the present invention, wind noise detection can be performed using a single channel of audio data, which is simple to operate and reduces equipment cost.
Fig. 2 is a second schematic flowchart of the wind noise detection method provided by an embodiment of the present invention, comprising:
S201: for each frame of audio data, convert the frame of audio data into frequency domain data.
S202: calculate the power spectral density of the frequency points in the frequency domain data.
S203: calculate the spectral centroid of the frequency domain data from the power spectral density.
S204: judge whether the spectral centroid is less than the first preset threshold; if so, execute S205; if not, execute S206.
S205: determine that wind noise is present in the frame of audio data.
S206: output the frame of audio data.
S207: judge whether the spectral centroid is less than the second preset threshold; if so, execute S208; if not, execute S209. The second preset threshold is less than the first preset threshold.
S208: set the frame of audio data to zero, and output the zeroed audio data.
S209: filter the frequency domain data using the preset filter to obtain filtered frequency domain data.
S210: convert the filtered frequency domain data into time domain data, and output the converted time domain data.
The embodiment of Fig. 2 provides a complete method for wind noise detection and wind noise suppression. In the Fig. 2 embodiment, no third preset threshold is set; instead, every case in which the spectral centroid is greater than or equal to the second preset threshold is handled as a case of being greater than or equal to the third preset threshold.
As described for the Fig. 1 embodiment, the spectral centroid of wind noise is small while the spectral centroid of speech is large, so audio data is divided by spectral centroid into three cases: first, strong wind noise or pure wind noise; second, negligible wind noise or no wind noise (pure speech); third, a mixture of wind noise and speech.
In the Fig. 2 embodiment, a different processing scheme is used for each of these three cases. In the first case (spectral centroid less than the second preset threshold), the audio quality is very poor, so the audio data is set to zero and output, that is, silent data is output. In the second case (spectral centroid greater than or equal to the first preset threshold), the audio quality is good, and the audio data is output directly. In the third case (spectral centroid greater than or equal to the second preset threshold and less than the first preset threshold), the audio quality lies between that of the first case and that of the second case; the frequency domain data of the audio data is filtered and then converted back into audio data for output.
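The three-way decision of the Fig. 2 embodiment can be sketched as a small function. The function name, the string labels, and the threshold values used in the usage note are hypothetical; the source does not give concrete threshold values.

```python
def choose_action(
    spectral_centroid_hz: float,
    first_threshold_hz: float,
    second_threshold_hz: float,
) -> str:
    """Map a frame's spectral centroid to the handling described for Fig. 2.

    Returns "output" (S206), "mute" (S208), or "filter" (S209 and S210).
    """
    if second_threshold_hz >= first_threshold_hz:
        # The source requires the second threshold to be below the first.
        raise ValueError("second threshold must be less than first threshold")
    if spectral_centroid_hz >= first_threshold_hz:
        return "output"  # negligible or no wind noise: pass the frame through
    if spectral_centroid_hz < second_threshold_hz:
        return "mute"    # strong or pure wind noise: output a zeroed frame
    return "filter"      # mixture of wind noise and speech: filter, then convert back
```

For instance, with an assumed first threshold of 1000 Hz and second threshold of 300 Hz, a frame whose centroid is 600 Hz would be routed to the filtering branch.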
In the Fig. 2 embodiment, audio data frames in which no wind noise is detected (frames whose spectral centroid is greater than or equal to the first preset threshold) are output directly. In some schemes, by contrast, wind noise suppression is applied to all of the audio data regardless of whether wind noise is present in a given frame. Compared with those schemes, this embodiment reduces unnecessary suppression operations. This effect is more pronounced in scenarios where the wind noise is weak or intermittent.
In the Fig. 2 embodiment, audio data frames that partly contain wind noise (frames whose spectral centroid is greater than or equal to the second preset threshold and less than the first preset threshold) are filtered. In some schemes, by contrast, all audio data in which wind noise is detected is discarded. Compared with those schemes, this embodiment can retain more valid speech data.
With the embodiment shown in Fig. 2, on the basis of wind noise detection, different wind noise suppression measures can be applied to different wind noise conditions, so the quality of the output audio data is higher.
Corresponding to the above method embodiments, an embodiment of the present invention also provides a wind noise detection device.
Fig. 3 is a schematic structural diagram of a wind noise detection device provided by an embodiment of the present invention, comprising:
a first conversion module 301, configured to, for each frame of audio data, convert the frame of audio data into frequency domain data;
a first calculation module 302, configured to calculate the power spectral density of the frequency points in the frequency domain data;
a second calculation module 303, configured to calculate the spectral centroid of the frequency domain data from the power spectral density;
a first judgment module 304, configured to judge whether the spectral centroid is less than the first preset threshold;
a determining module 305, configured to determine, when the judgment result of the first judgment module 304 is yes, that wind noise is present in the frame of audio data.
As an implementation, the device may further include a second judgment module and a zeroing module (not shown in the figure), wherein
the second judgment module is configured to judge, when the judgment result of the first judgment module is yes, whether the spectral centroid is less than the second preset threshold, the second preset threshold being less than the first preset threshold;
the zeroing module is configured to set the frame of audio data to zero when the judgment result of the second judgment module is yes.
As an implementation, the device may further include a third judgment module, a filter module, and a second conversion module (not shown in the figure), wherein
the third judgment module is configured to judge, when the judgment result of the first judgment module is yes, whether the spectral centroid is greater than or equal to the third preset threshold, the third preset threshold being less than the first preset threshold;
the filter module is configured to filter the frequency domain data using the preset filter when the judgment result of the third judgment module is yes, to obtain filtered frequency domain data;
the second conversion module is configured to convert the filtered frequency domain data into time domain data.
As an implementation, the device may further include:
an output module (not shown in the figure), configured to output the frame of audio data when the judgment result of the first judgment module 304 is no.
As an implementation, the first calculation module 302 may specifically be configured to:
sample the frequency domain data to obtain a plurality of frequency points, and calculate the power spectral density of the plurality of frequency points;
the second calculation module 303 may specifically be configured to:
calculate the spectral centroid of the frequency domain data according to the sample rate and the power spectral density.
As an implementation, the second calculation module 303 may specifically be configured to:
calculate the spectral centroid of the frequency domain data using the following formula [the formula is an image that is not reproduced in this text]:
wherein SC denotes the spectral centroid, fs denotes the sample rate, L denotes the total number of frequency points, φ_X(k) denotes the power spectral density of the k-th frequency point, X(k) denotes the frequency value of the k-th frequency point, and k is a positive integer not greater than L.
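The spectral centroid calculation and the first-threshold judgment can be sketched as follows. Because the formula itself is not available in this text, the sketch assumes the common definition SC = (fs / L) * Σ k·φ_X(k) / Σ φ_X(k) with φ_X(k) = |X(k)|², summed over the one-sided bins k = 0 .. L/2 of an L-point transform; the patent's exact summation range and PSD estimate may differ. The function names are hypothetical, and a naive DFT stands in for the FFT so that the example needs only the standard library.

```python
import cmath
from typing import List, Sequence


def dft(frame: Sequence[float]) -> List[complex]:
    """Naive O(L^2) DFT; an FFT routine would replace this in practice."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def spectral_centroid(frame: Sequence[float], sample_rate_hz: float) -> float:
    """Power-weighted mean frequency of one frame, in Hz (assumed definition)."""
    n = len(frame)
    spectrum = dft(frame)
    # One-sided PSD estimate |X(k)|^2 for k = 0 .. L/2 (assumption).
    psd = [abs(spectrum[k]) ** 2 for k in range(n // 2 + 1)]
    total = sum(psd)
    if total == 0.0:
        return 0.0  # silent frame: no energy, centroid defined as 0 here
    weighted = sum(k * p for k, p in enumerate(psd))
    return (sample_rate_hz / n) * weighted / total


def has_wind_noise(
    frame: Sequence[float], sample_rate_hz: float, first_threshold_hz: float
) -> bool:
    """Wind noise is judged present when the centroid is below the first threshold."""
    return spectral_centroid(frame, sample_rate_hz) < first_threshold_hz
```

Note that under this sketch an all-zero frame has a centroid of 0 and is therefore flagged; a real implementation might treat silence separately.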
As an implementation, the device may further include:
a recording module (not shown in the figure), configured to record, when the judgment result of the third judgment module is no, the power spectral density corresponding to the spectral centroid as the wind noise power spectral density;
the filter module may select several frequency points from the continuous frequency domain data, with k denoting the serial number of a selected frequency point, and may be configured to:
determine the filtered frequency domain data using the following formula:
Y(k) = H(k) * X(k),
wherein Y(k) denotes the filtered frequency domain data, H(k) denotes the preset filter, X(k) denotes the frequency value of the k-th frequency point, φ_X(k) denotes the power spectral density of the k-th frequency point, φ_n(k) denotes the wind noise power spectral density most recently recorded by the recording module, and k is a positive integer [the expression for H(k) is an image that is not reproduced in this text].
With the embodiment shown in Fig. 3, each frame of audio data is converted into frequency domain data, and whether wind noise is present in the audio data is determined from the spectral centroid of the frequency domain data. It can be seen that, first, wind noise detection can be performed using a single channel of audio data, without acquiring and comparing two channels of audio data simultaneously, which is simple to operate; second, compared with a scheme in which two acquisition devices are provided to acquire two channels of audio data, this scheme needs only one acquisition device to acquire one channel of audio data, which reduces equipment cost.
Corresponding to the above method embodiments, an embodiment of the present invention also provides an electronic device, as shown in Fig. 4, comprising a processor 401 and a memory 402, wherein
the memory 402 is configured to store a computer program;
the processor 401 is configured to implement any of the above wind noise detection methods when executing the program stored in the memory 402.
The memory of the above electronic device may include random access memory (RAM), and may also include non-volatile memory (NVM), for example at least one disk storage. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.
The above processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
An embodiment of the present invention also provides a computer-readable storage medium in which a computer program is stored; when executed by a processor, the computer program implements any of the above wind noise detection methods.
An embodiment of the present invention also provides a wind noise detection system, as shown in Fig. 5, comprising an audio acquisition device and a wind noise detection device, wherein
the audio acquisition device is configured to acquire audio data and send the acquired audio data to the wind noise detection device;
the wind noise detection device is configured to receive the audio data sent by the audio acquisition device; for each frame of audio data, convert the frame of audio data into frequency domain data; calculate the power spectral density of the frequency points in the frequency domain data; calculate the spectral centroid of the frequency domain data from the power spectral density; judge whether the spectral centroid is less than the first preset threshold; and if so, determine that wind noise is present in the frame of audio data.
As an implementation, the number of audio acquisition devices may be one; that is, the system may include only one audio acquisition device.
In this implementation, the one audio acquisition device is configured to acquire one channel of audio data and send the acquired audio data to the wind noise detection device;
the wind noise detection device is configured to receive the one channel of audio data; for each frame of audio data therein, convert the frame of audio data into frequency domain data; calculate the power spectral density of the frequency points in the frequency domain data; calculate the spectral centroid of the frequency domain data from the power spectral density; judge whether the spectral centroid is less than the first preset threshold; and if so, determine that wind noise is present in the frame of audio data.
As an implementation, the audio acquisition device and the wind noise detection device may be integrated; the two may be the same device.
With the illustrated embodiment of the present invention, the wind noise detection device converts each received frame of audio data into frequency domain data and determines from the spectral centroid of the frequency domain data whether wind noise is present in the audio data. It can be seen that, first, wind noise detection can be performed using a single channel of audio data, without acquiring and comparing two channels of audio data simultaneously, which is simple to operate; second, compared with a scheme in which two acquisition devices are provided to acquire two channels of audio data, this system needs only one acquisition device to acquire one channel of audio data, which reduces equipment cost.
It should be noted that, in this document, relational terms such as first and second are used merely to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article, or device that includes the element.
The embodiments in this specification are described in a related manner; for the same or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, the wind noise detection device embodiment shown in Fig. 3, the electronic device embodiment shown in Fig. 4, and the wind noise detection system embodiment shown in Fig. 5 are described relatively briefly because they are substantially similar to the wind noise detection method embodiments shown in Figs. 1-2; for the relevant details, refer to the description of the wind noise detection method embodiments shown in Figs. 1-2.
The foregoing is merely a description of preferred embodiments of the present invention and is not intended to limit the scope of protection of the present invention. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention falls within the scope of protection of the present invention.

Claims (16)

  1. A wind noise detection method, characterized by comprising:
    for each frame of audio data, converting the frame of audio data into frequency domain data;
    calculating the power spectral density of the frequency points in the frequency domain data;
    calculating the spectral centroid of the frequency domain data from the power spectral density;
    judging whether the spectral centroid is less than a first preset threshold;
    if so, determining that wind noise is present in the frame of audio data.
  2. The method according to claim 1, characterized in that, after determining that wind noise is present in the frame of audio data, the method further comprises:
    judging whether the spectral centroid is less than a second preset threshold, the second preset threshold being less than the first preset threshold;
    if so, setting the frame of audio data to zero.
  3. The method according to claim 1, characterized in that, after determining that wind noise is present in the frame of audio data, the method further comprises:
    judging whether the spectral centroid is greater than or equal to a third preset threshold, the third preset threshold being less than the first preset threshold;
    if so, filtering the frequency domain data using a preset filter to obtain filtered frequency domain data;
    converting the filtered frequency domain data into time domain data.
  4. The method according to claim 1, characterized in that, in the case where the spectral centroid is judged to be not less than the first preset threshold, the method further comprises:
    outputting the frame of audio data.
  5. The method according to claim 1, characterized in that the calculating the power spectral density of the frequency points in the frequency domain data comprises:
    sampling the frequency domain data to obtain a plurality of frequency points, and calculating the power spectral density of the plurality of frequency points;
    the calculating the spectral centroid of the frequency domain data from the power spectral density comprises:
    calculating the spectral centroid of the frequency domain data according to the sample rate and the power spectral density.
  6. The method according to claim 1, characterized in that the calculating the spectral centroid of the frequency domain data from the power spectral density comprises:
    calculating the spectral centroid of the frequency domain data using the following formula [the formula is an image that is not reproduced in this text]:
    wherein SC denotes the spectral centroid, fs denotes the sample rate, L denotes the total number of frequency points, φ_X(k) denotes the power spectral density of the k-th frequency point, X(k) denotes the frequency value of the k-th frequency point, and k is a positive integer not greater than L.
  7. The method according to claim 3, characterized in that, in the case where the spectral centroid is judged to be less than the third preset threshold, the method further comprises:
    recording the power spectral density corresponding to the spectral centroid as the wind noise power spectral density;
    the filtering the frequency domain data using a preset filter to obtain filtered frequency domain data comprises:
    determining the filtered frequency domain data using the following formula:
    Y(k) = H(k) * X(k),
    wherein Y(k) denotes the filtered frequency domain data, H(k) denotes the preset filter, X(k) denotes the frequency value of the k-th frequency point, φ_X(k) denotes the power spectral density of the k-th frequency point, φ_n(k) denotes the most recently recorded wind noise power spectral density, and k is a positive integer [the expression for H(k) is an image that is not reproduced in this text].
  8. A wind noise detection device, characterized by comprising:
    a first conversion module, configured to, for each frame of audio data, convert the frame of audio data into frequency domain data;
    a first calculation module, configured to calculate the power spectral density of the frequency points in the frequency domain data;
    a second calculation module, configured to calculate the spectral centroid of the frequency domain data from the power spectral density;
    a first judgment module, configured to judge whether the spectral centroid is less than a first preset threshold;
    a determining module, configured to determine, when the judgment result of the first judgment module is yes, that wind noise is present in the frame of audio data.
  9. The device according to claim 8, characterized in that the device further comprises:
    a second judgment module, configured to judge, when the judgment result of the first judgment module is yes, whether the spectral centroid is less than a second preset threshold, the second preset threshold being less than the first preset threshold;
    a zeroing module, configured to set the frame of audio data to zero when the judgment result of the second judgment module is yes.
  10. The device according to claim 8, characterized in that the device further comprises:
    a third judgment module, configured to judge, when the judgment result of the first judgment module is yes, whether the spectral centroid is greater than or equal to a third preset threshold, the third preset threshold being less than the first preset threshold;
    a filter module, configured to filter the frequency domain data using a preset filter when the judgment result of the third judgment module is yes, to obtain filtered frequency domain data;
    a second conversion module, configured to convert the filtered frequency domain data into time domain data.
  11. The device according to claim 8, characterized in that the device further comprises:
    an output module, configured to output the frame of audio data when the judgment result of the first judgment module is no.
  12. The device according to claim 8, characterized in that the first calculation module is specifically configured to:
    sample the frequency domain data to obtain a plurality of frequency points, and calculate the power spectral density of the plurality of frequency points;
    the second calculation module is specifically configured to:
    calculate the spectral centroid of the frequency domain data according to the sample rate and the power spectral density.
  13. The device according to claim 8, characterized in that the second calculation module is specifically configured to:
    calculate the spectral centroid of the frequency domain data using the following formula [the formula is an image that is not reproduced in this text]:
    wherein SC denotes the spectral centroid, fs denotes the sample rate, L denotes the total number of frequency points, φ_X(k) denotes the power spectral density of the k-th frequency point, X(k) denotes the frequency value of the k-th frequency point, and k is a positive integer not greater than L.
  14. The device according to claim 10, characterized in that the device further comprises:
    a recording module, configured to record, when the judgment result of the third judgment module is no, the power spectral density corresponding to the spectral centroid as the wind noise power spectral density;
    the filter module is specifically configured to:
    determine the filtered frequency domain data using the following formula:
    Y(k) = H(k) * X(k),
    wherein Y(k) denotes the filtered frequency domain data, H(k) denotes the preset filter, X(k) denotes the frequency value of the k-th frequency point, φ_X(k) denotes the power spectral density of the k-th frequency point, φ_n(k) denotes the wind noise power spectral density most recently recorded by the recording module, and k is a positive integer [the expression for H(k) is an image that is not reproduced in this text].
  15. An electronic device, characterized by comprising a processor and a memory, wherein
    the memory is configured to store a computer program;
    the processor is configured to implement the method steps of any one of claims 1 to 7 when executing the program stored in the memory.
  16. A wind noise detection system, characterized by comprising an audio acquisition device and a wind noise detection device, wherein
    the audio acquisition device is configured to acquire audio data and send the acquired audio data to the wind noise detection device;
    the wind noise detection device is configured to receive the audio data sent by the audio acquisition device; for each frame of audio data therein, convert the frame of audio data into frequency domain data; calculate the power spectral density of the frequency points in the frequency domain data; calculate the spectral centroid of the frequency domain data from the power spectral density; judge whether the spectral centroid is less than a first preset threshold; and if so, determine that wind noise is present in the frame of audio data.
CN201710754716.4A 2017-08-29 2017-08-29 Wind noise detection method, device and system Active CN109427345B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710754716.4A CN109427345B (en) 2017-08-29 2017-08-29 Wind noise detection method, device and system

Publications (2)

Publication Number Publication Date
CN109427345A true CN109427345A (en) 2019-03-05
CN109427345B CN109427345B (en) 2022-12-02

Family

ID=65501950

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710754716.4A Active CN109427345B (en) 2017-08-29 2017-08-29 Wind noise detection method, device and system

Country Status (1)

Country Link
CN (1) CN109427345B (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101031963A (en) * 2004-09-16 2007-09-05 法国电信 Method of processing a noisy sound signal and device for implementing said method
US20120008799A1 (en) * 2009-04-03 2012-01-12 Sascha Disch Apparatus and method for determining a plurality of local center of gravity frequencies of a spectrum of an audio signal
US20110188685A1 (en) * 2009-12-29 2011-08-04 Sheikh Naim Method for the detection of whistling in an audio system
CN103345921A (en) * 2013-07-15 2013-10-09 南京理工大学 Nighttime sleeping sound signal analyzing method based on multiple characteristics
US20160203833A1 (en) * 2013-08-30 2016-07-14 Zte Corporation Voice Activity Detection Method and Device
US20160225388A1 (en) * 2013-10-25 2016-08-04 Intel IP Corporation Audio processing devices and audio processing methods
CN106463106A (en) * 2014-07-14 2017-02-22 英特尔Ip公司 Wind noise reduction for audio reception

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112309420A (en) * 2020-10-30 2021-02-02 出门问问(苏州)信息科技有限公司 Method and device for detecting wind noise
CN112309420B (en) * 2020-10-30 2023-06-27 出门问问(苏州)信息科技有限公司 Method and device for detecting wind noise

Also Published As

Publication number Publication date
CN109427345B (en) 2022-12-02

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant