CN110265064A - Audio sonic boom detection method, device and storage medium - Google Patents
Audio sonic boom detection method, device and storage medium
- Publication number
- CN110265064A (application CN201910506938.3A)
- Authority
- CN
- China
- Prior art keywords
- signal
- audio
- frame
- audio signal
- short
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/18—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/78—Detection of presence or absence of voice signals
Abstract
The embodiments of the present application disclose an audio sonic boom detection method, device and storage medium. To perform sonic boom detection on an audio signal, the application obtains the audio signal to be detected and divides it into multiple frame signals; calculates the short-time energy difference between two adjacent frame signals; obtains, according to the short-time energy difference, the frame signals within an interval that satisfies a preset condition, yielding a mutation audio signal; and calculates the spectral flatness of the mutation audio signal. If the spectral flatness is greater than a preset flatness value, it is determined that a sonic boom exists in the audio signal. The scheme can accurately detect whether a sonic boom exists in an audio signal.
Description
Technical field
The present application relates to the field of communication technology, and in particular to an audio sonic boom detection method, device and storage medium.
Background
With the continuous development of Internet technology, a massive number of audio files of all kinds exist on the Internet, such as music, speech, storytelling and chat recordings. Because audio passes through a series of complex steps such as recording, processing, transmission and storage, "distortion" phenomena may occur, for example a beginning sonic boom, burrs or breakpoints. The beginning sonic boom is a relatively common distortion. A "beginning sonic boom" is a brief pulse at the beginning of the music waveform that sounds like a "click"; this harsh, unnatural sound gives the listener a poor experience. Statistics on one library show that audio with a beginning sonic boom accounts for as much as 10% of the library, and the presence of the sonic boom degrades audio quality. Correctly detecting the beginning sonic boom of audio is therefore very important.
Summary of the invention
The embodiments of the present application provide an audio sonic boom detection method, device and storage medium, which can be used to detect whether a sonic boom exists in an audio signal, so that audio files containing a sonic boom can be screened out effectively and quickly.
An embodiment of the present application provides an audio sonic boom detection method, comprising:
obtaining an audio signal to be detected, and dividing the audio signal into multiple frame signals;
calculating the short-time energy difference between two adjacent frame signals;
obtaining, according to the short-time energy difference, the frame signals within an interval that satisfies a preset condition, to obtain a mutation audio signal; and
calculating the spectral flatness of the mutation audio signal, and if the spectral flatness is greater than a preset flatness value, determining that a sonic boom exists in the audio signal.
Optionally, in some embodiments of the audio sonic boom detection method, dividing the audio signal into multiple frame signals comprises:
selecting, starting from the first frame in the time domain, the signal within a preset time period of the audio signal, to obtain a beginning audio signal; and
dividing the beginning audio signal into multiple frame signals.
Optionally, in some embodiments of the audio sonic boom detection method, calculating the short-time energy difference between two adjacent frame signals comprises:
calculating the short-time energy of each frame signal;
obtaining the time of each frame signal; and
calculating, in the time order of the frame signals, the difference between the short-time energies of each two adjacent frame signals in turn, to obtain the short-time energy difference between two adjacent frame signals.
Optionally, in some embodiments of the audio sonic boom detection method, obtaining, according to the short-time energy difference, the frame signals within an interval that satisfies a preset condition, to obtain a mutation audio signal, comprises:
obtaining the two frame signals whose short-time energy difference is greater than a preset threshold, and determining the later of the two frame signals in time order as the start frame signal;
obtaining, after the start frame signal, the two frame signals whose short-time energy difference is less than the negative of the preset threshold, and determining the later of the two frame signals in time order as the end frame signal; and
obtaining the signal between the start frame signal and the end frame signal, to obtain the mutation audio signal.
Optionally, in some embodiments of the audio sonic boom detection method, obtaining, after the start frame signal, the two frame signals whose short-time energy difference is less than the negative of the preset threshold, and determining the later of the two frame signals in time order as the end frame signal, comprises:
judging in turn, in chronological order after the start frame signal, whether the short-time energy difference is less than the negative of the preset threshold; and
when the short-time energy difference is detected for the first time to be less than the negative of the preset threshold, determining the later, in time order, of the two frame signals whose difference is less than the negative of the preset threshold as the end frame signal.
Optionally, in some embodiments of the audio sonic boom detection method, calculating the spectral flatness of the mutation audio signal comprises:
detecting the peak position of the mutation audio signal;
taking a fixed number of sampling points before and after the peak position to form a sonic boom audio frame; and
calculating the spectral flatness of the sonic boom audio frame.
Optionally, in some embodiments of the audio sonic boom detection method, determining that a sonic boom exists in the audio signal if the spectral flatness is greater than the preset flatness value comprises:
judging whether the spectral flatness is greater than the preset flatness value;
if the spectral flatness is greater than the preset flatness value, determining that a sonic boom exists in the audio signal; and
if the spectral flatness is less than the preset flatness value, determining that no sonic boom exists in the audio signal.
Optionally, in some embodiments of the audio sonic boom detection method, after determining that a sonic boom exists in the audio signal because the spectral flatness is greater than the preset flatness value, the method further comprises:
returning to the step of obtaining, according to the short-time energy difference, the frame signals within an interval that satisfies the preset condition to obtain a mutation audio signal, until detection of the audio signal to be detected is finished.
Correspondingly, an embodiment of the present application also provides an audio sonic boom detection device, comprising:
a framing module, configured to obtain an audio signal to be detected and divide the audio signal into multiple frame signals;
a calculation module, configured to calculate the short-time energy difference between two adjacent frame signals;
an obtaining module, configured to obtain, according to the short-time energy difference, the frame signals within an interval that satisfies a preset condition, to obtain a mutation audio signal; and
a judgment module, configured to calculate the spectral flatness of the mutation audio signal and, if the spectral flatness is greater than a preset flatness value, determine that a sonic boom exists in the audio signal.
Optionally, in some embodiments of the audio sonic boom detection device, the framing module comprises:
a selection submodule, configured to select, starting from the first frame in the time domain, the signal within a preset time period of the audio signal, to obtain a beginning audio signal; and
a framing submodule, configured to divide the beginning audio signal into multiple frame signals.
Optionally, in some embodiments of the audio sonic boom detection device, the calculation module comprises:
an energy submodule, configured to calculate the short-time energy of each frame signal;
an acquisition submodule, configured to obtain the time of each frame signal; and
an energy difference submodule, configured to calculate, in the time order of the frame signals, the difference between the short-time energies of each two adjacent frame signals in turn, to obtain the short-time energy difference between two adjacent frame signals.
Optionally, in some embodiments of the audio sonic boom detection device, the energy difference submodule is specifically configured to: obtain the two frame signals whose short-time energy difference is greater than a preset threshold, and determine the later of the two frame signals in time order as the start frame signal; obtain, after the start frame signal, the two frame signals whose short-time energy difference is less than the negative of the preset threshold, and determine the later of the two frame signals in time order as the end frame signal; and obtain the signal between the start frame signal and the end frame signal, to obtain the mutation audio signal.
Optionally, in some embodiments of the audio sonic boom detection device, the energy difference submodule is specifically configured to: judge in turn, in chronological order after the start frame signal, whether the short-time energy difference is less than the negative of the preset threshold; and, when the short-time energy difference is detected for the first time to be less than the negative of the preset threshold, determine the later, in time order, of the two frame signals whose difference is less than the negative of the preset threshold as the end frame signal.
Optionally, in some embodiments of the audio sonic boom detection device, the judgment module comprises:
a detection submodule, configured to detect the peak position of the mutation audio signal;
a sampling submodule, configured to take a fixed number of sampling points before and after the peak position to form a sonic boom audio frame; and
a calculation submodule, configured to calculate the spectral flatness of the sonic boom audio frame.
Optionally, in some embodiments of the audio sonic boom detection device, the judgment module is specifically configured to: judge whether the spectral flatness is greater than the preset flatness value; if the spectral flatness is greater than the preset flatness value, determine that a sonic boom exists in the audio signal; and if the spectral flatness is less than the preset flatness value, determine that no sonic boom exists in the audio signal.
Optionally, in some embodiments, the audio sonic boom detection device further comprises:
a detection module, configured to return to the step of obtaining, according to the short-time energy difference, the frame signals within an interval that satisfies the preset condition to obtain a mutation audio signal, until detection of the audio signal to be detected is finished.
In addition, an embodiment of the present application also provides a storage medium storing a plurality of instructions, the instructions being suitable for loading by a processor to execute the steps of any audio sonic boom detection method provided by the embodiments of the present application.
To perform sonic boom detection on an audio signal, the present application obtains the audio signal to be detected and divides it into multiple frame signals; calculates the short-time energy difference between two adjacent frame signals; obtains, according to the short-time energy difference, the frame signals within an interval that satisfies a preset condition, yielding a mutation audio signal; and calculates the spectral flatness of the mutation audio signal. If the spectral flatness is greater than a preset flatness value, it is determined that a sonic boom exists in the audio signal. The scheme frames the audio signal, calculates the time-domain short-time energy of each frame, uses the short-time energy difference to locate the audio frames where the energy changes abruptly and thereby finds the mutation audio signal, and then calculates its spectral flatness. The spectral flatness is used to accurately screen out audio files that contain a sonic boom.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below are only some examples of the present application; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1a is a schematic diagram of a scenario of the audio sonic boom detection method provided by the embodiments of the present application;
Fig. 1b is a first schematic flowchart of the audio sonic boom detection method provided by the embodiments of the present application;
Fig. 2a is a second schematic flowchart of the audio sonic boom detection method provided by the embodiments of the present application;
Fig. 2b is a schematic diagram of an audio signal in the audio sonic boom detection method provided by the embodiments of the present application;
Fig. 3a is a first schematic structural diagram of the audio sonic boom detection device provided by the embodiments of the present application;
Fig. 3b is a second schematic structural diagram of the audio sonic boom detection device provided by the embodiments of the present application;
Fig. 4 is a schematic structural diagram of the network device provided by the embodiments of the present application.
Detailed description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments of the present application. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those skilled in the art based on the embodiments in the present application without creative effort fall within the protection scope of the present application.
The terms "first", "second", "third" and the like in the present application are used to distinguish different objects, not to describe a particular order. In addition, the terms "include" and "have" and any variants of them are intended to cover non-exclusive inclusion.
The embodiments of the present application provide an audio sonic boom detection method, device and storage medium.
The audio sonic boom detection device can be integrated in a network device, which may be a terminal, a server or similar equipment. For example, referring to Fig. 1a, when a user needs to perform beginning sonic boom detection on a massive number of audio files, the user can trigger the network device to process these audio files. The network device obtains the audio signal to be detected and divides it into multiple frame signals; calculates the short-time energy difference between two adjacent frame signals; obtains, according to the short-time energy difference, the frame signals within an interval that satisfies a preset condition, yielding a mutation audio signal; and calculates the spectral flatness of the mutation audio signal. If the spectral flatness is greater than a preset flatness value, it determines that a sonic boom exists in the audio signal.
Each is described in detail below. It should be noted that the order of the following embodiments does not limit the preferred order of the embodiments.
This embodiment is described from the perspective of the audio sonic boom detection device. The device can be integrated in a network device, which may be a terminal, a server or similar equipment; the terminal may include a tablet computer, a laptop, a personal computer (PC) or the like.
An embodiment of the present application provides an audio sonic boom detection method, comprising: obtaining an audio signal to be detected and dividing it into multiple frame signals; calculating the short-time energy difference between two adjacent frame signals; obtaining, according to the short-time energy difference, the frame signals within an interval that satisfies a preset condition, yielding a mutation audio signal; and calculating the spectral flatness of the mutation audio signal, and if the spectral flatness is greater than a preset flatness value, determining that a sonic boom exists in the audio signal.
As shown in Fig. 1b, the detailed flow of the audio sonic boom detection method can be as follows:
101. Obtain the audio signal to be detected, and divide the audio signal into multiple frame signals.
For example, audio files can first be obtained from various sources such as a network, a mobile phone or a video, and then provided to the audio sonic boom detection device. That is, the audio sonic boom detection device can receive audio files obtained from various sources and extract the audio signals to be detected from these files. These audio signals are then divided into multiple frame signals.
Audio files can be sound files or Musical Instrument Digital Interface (MIDI) files. A sound file is the original sound recorded by a sound recording device and directly stores the binary sampled data of the real sound. A MIDI file is a sequence of musical performance instructions that is played through an audio output device or an electronic musical instrument connected to a computer. An audio signal is an information carrier of the frequency and amplitude variations of regular sound waves carrying speech, music and sound effects. According to the characteristics of the sound wave, audio information can be classified as regular audio or irregular sound. Regular audio can in turn be divided into speech, music and sound effects. Regular audio is a continuously varying analog signal that can be represented by a continuous curve, called a sound wave.
To improve detection efficiency, a detection period can be set at the beginning of the audio signal in the time domain, and framing is applied only to the audio signal within that period. That is, the step "dividing the audio signal into multiple frame signals" can be as follows:
selecting, starting from the first frame in the time domain, the signal within a preset time period of the audio signal, to obtain a beginning audio signal; and
dividing the beginning audio signal into multiple frame signals.
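The two sub-steps above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the non-overlapping framing, and the choice to drop a trailing partial frame are all assumptions, since the text does not specify frame length or overlap.

```python
def take_beginning(samples, sample_rate, seconds):
    """Return the samples that fall within the first `seconds` of the signal."""
    count = int(sample_rate * seconds)
    return samples[:count]


def split_into_frames(samples, frame_len):
    """Split samples into consecutive, non-overlapping frames of frame_len samples.

    Assumption of this sketch: a trailing partial frame is dropped.
    """
    n_frames = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]


# Hypothetical input: 10 seconds of silence sampled at 100 Hz.
signal = [0.0] * 1000
beginning = take_beginning(signal, sample_rate=100, seconds=5)
frames = split_into_frames(beginning, frame_len=64)
```

Restricting the analysis to the beginning segment keeps the number of frames small, which is the efficiency gain the text refers to.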
102. Calculate the short-time energy difference between two adjacent frame signals.
For example, the short-time energy of each frame signal can be calculated, the time of each frame signal obtained, and the difference between the short-time energies of each two adjacent frame signals calculated in turn in the time order of the frame signals, to obtain the short-time energy difference between two adjacent frame signals.
Short-time energy reflects the strength of the signal at different moments. The short-time energy E(t) of each frame signal is calculated over the sampling points of that frame, where N is the number of sampling points in each frame signal, n is the sampling point index within the frame signal, t is the position of the frame signal, and E(t) is the short-time energy of the t-th frame signal.
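A common definition of short-time energy that is consistent with the symbols listed here is the sum of the squared samples in the frame. The symbol $x_t(n)$, standing for the n-th sample of the t-th frame signal, is an assumption introduced for this illustration; the text itself does not name the sample values.

```latex
E(t) = \sum_{n=0}^{N-1} x_t(n)^2
```

Under this definition a frame containing a short, strong pulse has a much larger E(t) than its quiet neighbours, which is what the following difference step relies on.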
The short-time energy difference between two adjacent frame signals can be calculated as follows:
$$p_t = E(t) - E(t-1)$$
where t is the position of the frame and $p_t$ is the short-time energy difference between two adjacent frame signals.
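As a rough sketch of step 102, the code below computes a per-frame energy and the difference p_t = E(t) - E(t-1). The sum-of-squares energy and the convention of storing 0.0 at index 0 (where no previous frame exists) are assumptions made for illustration.

```python
def short_time_energy(frame):
    """Sum of squared samples in one frame (assumed energy definition)."""
    return sum(x * x for x in frame)


def energy_differences(frames):
    """Return a list p with p[t] = E(t) - E(t-1) for t >= 1.

    p[0] has no predecessor; this sketch stores 0.0 there so that the list
    index matches the frame position t.
    """
    energies = [short_time_energy(f) for f in frames]
    p = [0.0]
    for t in range(1, len(energies)):
        p.append(energies[t] - energies[t - 1])
    return p


# Hypothetical frames with energies 0, 5, 9 and 1.
example_frames = [[0, 0], [1, 2], [3, 0], [0, 1]]
example_p = energy_differences(example_frames)
```

A large positive entry marks a sudden rise in energy, and a large negative entry marks a sudden fall.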
103. Obtain, according to the short-time energy difference, the frame signals within an interval that satisfies a preset condition, to obtain a mutation audio signal.
The preset condition can be set in many ways. For example, it can be set flexibly according to the needs of the actual application, or set in advance and stored in the network device. In addition, the preset condition may be built into the network device, or stored in a memory and sent to the network device, and so on.
For example, the two frame signals whose short-time energy difference is greater than a preset threshold can be obtained, and the later of the two in time order determined as the start frame signal. After the start frame signal, the two frame signals whose short-time energy difference is less than the negative of the preset threshold are obtained, and the later of the two in time order is determined as the end frame signal. The signal between the start frame signal and the end frame signal is then obtained, yielding the mutation audio signal.
The preset threshold, abbreviated Th, can also be set in many ways. For example, it can be set flexibly according to the needs of the actual application, or set in advance and stored in the network device. In addition, the preset threshold may be built into the network device, or stored in a memory and sent to the network device, and so on.
So that the subsequent spectral flatness calculation is closer to the true value for the interval satisfying the preset condition, and so that the detection result is more accurate, the end frame signal can be taken as the later of the two frame signals at which the short-time energy difference is first detected, after the start frame signal, to be less than the negative of the preset threshold. That is, the step "obtaining, after the start frame signal, the two frame signals whose short-time energy difference is less than the negative of the preset threshold, and determining the later of the two frame signals in time order as the end frame signal" can be as follows:
judging in turn, in chronological order after the start frame signal, whether the short-time energy difference is less than the negative of the preset threshold; and
when the short-time energy difference is detected for the first time to be less than the negative of the preset threshold, determining the later, in time order, of the two frame signals whose difference is less than the negative of the preset threshold as the end frame signal.
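The start-frame and end-frame search described above can be sketched as follows. The function name, the `search_from` parameter and the choice to return `None` when no complete interval is found are assumptions of this sketch; `p[t]` is taken to hold E(t) - E(t-1), so index t is already the later frame of the pair.

```python
def find_mutation_segment(p, th, search_from=1):
    """Return (start, end) frame indices of the first mutation interval, or None.

    start: first t >= search_from with p[t] > th.
    end:   first t > start with p[t] < -th.
    """
    start = None
    for t in range(search_from, len(p)):
        if p[t] > th:
            start = t
            break
    if start is None:
        return None
    for t in range(start + 1, len(p)):
        if p[t] < -th:
            return start, t
    return None  # energy rose but never fell back within the analysed frames


# Hypothetical energy differences and threshold Th = 5.
p_example = [0.0, 0.1, 6.0, 0.5, -7.0, 0.2]
segment = find_mutation_segment(p_example, th=5)
```

Stopping at the first sufficiently negative difference keeps the interval as tight as possible around the pulse, which matches the stated aim of keeping the later flatness calculation close to the true value.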
104. Calculate the spectral flatness of the mutation audio signal, and if the spectral flatness is greater than a preset flatness value, determine that a sonic boom exists in the audio signal.
For example, a Fourier transform can be applied to the mutation audio signal to obtain a frequency-domain mutation audio signal, and the spectral flatness of the frequency-domain mutation audio signal calculated. It is then judged whether the spectral flatness is greater than the preset flatness value: if it is greater, it is determined that a sonic boom exists in the audio signal; if it is less, it is determined that no sonic boom exists in the audio signal.
The preset flatness value can also be set in many ways. For example, it can be set flexibly according to the needs of the actual application, or set in advance and stored in the network device. In addition, the preset flatness value may be built into the network device, or stored in a memory and sent to the network device, and so on.
Spectral flatness, also called Wiener entropy, is a measure used in digital signal processing to characterize an audio spectrum. It can be measured as the ratio of the geometric mean (GM) to the arithmetic mean (AM) of the signal's spectrum, and is also commonly called the spectral flatness measure (SFM). The mutation audio signal is windowed and transformed to the frequency domain, where w(n) is the window function, k is the frequency bin of the frequency-domain mutation audio signal, and X is the frequency-domain mutation audio signal. The window function may be a rectangular window, a triangular window, a Hanning window or the like. The spectral flatness is then:
$$F(t) = \frac{GM(t)}{AM(t)}$$
where GM(t) is the geometric mean of the frequency-domain mutation audio signal, AM(t) is the arithmetic mean of the frequency-domain mutation audio signal, and F(t) is the spectral flatness.
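A minimal sketch of the GM/AM ratio is shown below, using only the Python standard library. Several choices here are assumptions rather than details from the text: the means are taken over the magnitude spectrum (not the power spectrum), a periodic Hann window is used, a naive DFT replaces an FFT for brevity, and a small `eps` guards against the logarithm of zero.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Hann-windowed DFT magnitudes for bins 0 .. N//2 (naive O(N^2) DFT)."""
    n = len(frame)
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]
    xw = [x * w for x, w in zip(frame, window)]
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(xw[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
        mags.append(abs(acc))
    return mags


def spectral_flatness(mags, eps=1e-12):
    """F = GM / AM of the magnitudes; GM is computed in the log domain."""
    log_mean = sum(math.log(m + eps) for m in mags) / len(mags)
    gm = math.exp(log_mean)
    am = sum(mags) / len(mags) + eps
    return gm / am


# Hypothetical test signals: a single click and a pure tone, 64 samples each.
click = [0.0] * 64
click[32] = 1.0
tone = [math.sin(2 * math.pi * 4 * i / 64) for i in range(64)]

flatness_click = spectral_flatness(magnitude_spectrum(click))
flatness_tone = spectral_flatness(magnitude_spectrum(tone))

PRESET_FLATNESS = 0.5  # illustrative value only; the text leaves it configurable
click_is_boom = flatness_click > PRESET_FLATNESS
```

An impulse spreads its energy evenly over all frequency bins, so its flatness is close to 1, whereas a tone concentrates energy in a few bins and its flatness is close to 0. This is why a flatness above the preset value is treated as evidence of a sonic boom.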
For example, to further improve detection accuracy and ensure that the audio delivered to users is free of flaws, the peak position of the mutation audio signal can be detected first; then, centered on the peak position, N/2 sampling points are taken on each side to form a sonic boom audio frame, so that the sonic boom audio frame has N sampling points in total. Therefore, the step "calculating the spectral flatness of the mutation audio signal" can be as follows:
detecting the peak position of the mutation audio signal;
taking a fixed number of sampling points before and after the peak position to form a sonic boom audio frame; and
calculating the spectral flatness of the sonic boom audio frame.
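The peak-centred extraction can be sketched as below. Two points are assumptions of this illustration: the peak is taken to be the sample with the largest absolute amplitude, and the window is clipped at the segment boundaries, so it can hold fewer than N samples when the peak lies near an edge.

```python
def sonic_boom_frame(segment, n):
    """Return (peak_index, window) with about n samples centred on the peak.

    The window covers n // 2 samples on each side of the peak and is clipped
    to the bounds of `segment`.
    """
    peak = max(range(len(segment)), key=lambda i: abs(segment[i]))
    half = n // 2
    start = max(0, peak - half)
    end = min(len(segment), peak + half)
    return peak, segment[start:end]


# Hypothetical mutation segment with one negative-going spike at index 10.
mutation = [0.0] * 10 + [-0.9] + [0.0] * 10
peak_index, boom_frame = sonic_boom_frame(mutation, n=8)
```

Centring on the peak means the flatness is measured on the pulse itself rather than on whatever quiet material surrounds it in the mutation interval.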
After a sonic boom is detected, for the accuracy of subsequent repair, detection can continue: frame signals within intervals satisfying the preset condition continue to be obtained according to the short-time energy difference until all audio signals to be detected have been examined. That is, after the step "if the spectral flatness is greater than the preset flatness value, determining that a sonic boom exists in the audio signal", the method can further include:
returning to the step of obtaining, according to the short-time energy difference, the frame signals within an interval that satisfies the preset condition to obtain a mutation audio signal, until detection of the audio signal to be detected is finished.
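The repeated scan can be sketched as a loop over the energy differences. Everything named here is hypothetical: `flatness_of` is a caller-supplied function that returns the spectral flatness for a (start, end) frame interval, and resuming the search at the frame after each end frame is an assumption, since the text only says that the step is executed again.

```python
def detect_all(p, th, flatness_of, flat_threshold):
    """Collect every (start, end) interval judged to contain a sonic boom.

    p[t] holds E(t) - E(t-1); th is the preset threshold Th.
    """
    found = []
    t = 1
    while t < len(p):
        # Look for the next energy rise above th.
        while t < len(p) and p[t] <= th:
            t += 1
        if t >= len(p):
            break
        start = t
        t += 1
        # Look for the first energy fall below -th after the start frame.
        while t < len(p) and p[t] >= -th:
            t += 1
        if t >= len(p):
            break
        end = t
        if flatness_of(start, end) > flat_threshold:
            found.append((start, end))
        t = end + 1  # continue scanning after this interval
    return found


# Hypothetical data: two energy bursts, only the first one spectrally flat.
p_all = [0.0, 6.0, -6.0, 0.1, 7.0, 0.0, -8.0]
booms = detect_all(
    p_all,
    th=5,
    flatness_of=lambda s, e: 0.9 if s == 1 else 0.2,
    flat_threshold=0.5,
)
```

Collecting every interval, rather than stopping at the first, gives a later repair stage the full list of positions to fix.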
After detection of the audio signal is finished, a detection result interface can be generated. It includes a detection interface that receives the detection result of the audio signal to be detected, and after detection is complete it indicates whether a sonic boom signal was detected.
As can be seen from the above, to perform sonic boom detection on an audio signal, this embodiment obtains the audio signal to be detected and divides it into multiple frame signals; calculates the short-time energy difference between two adjacent frame signals; obtains, according to the short-time energy difference, the frame signals within an interval that satisfies a preset condition, yielding a mutation audio signal; and calculates the spectral flatness of the mutation audio signal. If the spectral flatness is greater than a preset flatness value, it is determined that a sonic boom exists in the audio signal. The scheme frames the audio signal, calculates the time-domain short-time energy of each frame, uses the short-time energy difference to locate the audio frames where the energy changes abruptly and thereby finds the mutation audio signal, and then calculates its spectral flatness. The spectral flatness is used to accurately screen out audio files that contain a sonic boom.
The method according to described in preceding embodiment will specifically be integrated in network below with the audio sonic boom detection device and set
Standby middle citing is described in further detail.
As shown in Figure 2 a, a kind of audio sonic boom detection method, detailed process can be such that
201, the network equipment obtains audio signal to be detected.
For example, user can specifically obtain audio file from the various approach such as network, mobile phone or video, and then provide
To the network equipment, the network equipment can receive the audio file that various approach are got, and extract from these files to be checked
The audio signal of survey.
202. The network device divides the audio signal into frames to obtain frame signals.
For example, to improve detection efficiency, the network device may restrict detection to a period at the beginning of the audio signal in the time domain and divide only the audio signal within that period into frames. That is, the step "dividing the audio signal into multiple frame signals" may specifically be as follows:
Select, in the time domain, the signal within a preset time period starting from the first frame of the audio signal, to obtain the beginning audio signal;
Divide the beginning audio signal into multiple frame signals.
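The two sub-steps above can be sketched as follows. This is a minimal illustration only: the function name `frame_beginning`, its parameters, and the choice of non-overlapping frames with the trailing partial frame dropped are assumptions, since the text does not specify frame length or overlap.

```python
from typing import List, Sequence


def frame_beginning(samples: Sequence[float], sample_rate: int,
                    preset_seconds: float, frame_len: int) -> List[List[float]]:
    """Split the first `preset_seconds` of `samples` into non-overlapping frames."""
    # Keep only the beginning of the signal (the "beginning audio signal").
    limit = min(len(samples), int(preset_seconds * sample_rate))
    beginning = samples[:limit]
    # Cut it into consecutive frames of `frame_len` samples; a trailing
    # partial frame is dropped in this sketch (an assumption).
    n_frames = len(beginning) // frame_len
    return [list(beginning[i * frame_len:(i + 1) * frame_len])
            for i in range(n_frames)]


# Toy input: 100 samples at 10 samples per second, keep the first 5 seconds.
frames = frame_beginning(list(range(100)), sample_rate=10,
                         preset_seconds=5.0, frame_len=16)
```

Restricting the work to the beginning of the file keeps the cost proportional to the preset period rather than to the full length of the audio.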
203. The network device calculates the short-time energy difference between every two adjacent frame signals.
For example, the network device may calculate the short-time energy of each frame signal, obtain the time of each frame signal, and then, following the time order of the frame signals, calculate in turn the difference between the short-time energies of each pair of adjacent frame signals, to obtain the short-time energy differences of adjacent frame signals.
Here, the short-time energy reflects the strength of the signal at different moments. The short-time energy E of each frame signal may be calculated as follows:
where N is the number of sampling points in each frame signal, n is the index of a sampling point within the frame signal, t denotes the position of the frame signal, and E(t) is the short-time energy of the t-th frame signal.
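The formula referred to above is not reproduced in this text. A common definition of short-time energy that is consistent with the symbols just defined is sketched below. The symbol $x_t(n)$, denoting the n-th sampling point of the t-th frame signal, is an assumption introduced here, and the original formula may differ (for example, by including a window or a normalization factor).

```latex
E(t) = \sum_{n=0}^{N-1} x_t(n)^2
```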
The short-time energy difference between two adjacent frame signals may be calculated as follows:
$p_t = E(t) - E(t-1)$
where t is the position of the frame and $p_t$ is the short-time energy difference between two adjacent frame signals.
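The difference $p_t$ can be computed directly from the per-frame energies. The sketch below assumes the sum-of-squares definition of short-time energy (the text does not show its exact formula) and uses hypothetical function names.

```python
from typing import List, Sequence


def short_time_energy(frame: Sequence[float]) -> float:
    """Sum of squared samples of one frame (assumed definition of E(t))."""
    return sum(x * x for x in frame)


def energy_differences(frames: Sequence[Sequence[float]]) -> List[float]:
    """Return p_t = E(t) - E(t-1) for every frame t that has a predecessor."""
    energies = [short_time_energy(f) for f in frames]
    return [energies[t] - energies[t - 1] for t in range(1, len(energies))]


# Three toy frames: silence, a loud frame, then a quiet frame.
frames = [[0.0, 0.0], [1.0, 2.0], [0.5, 0.5]]
diffs = energy_differences(frames)
```

A large positive difference marks a frame where the energy jumps up, and a large negative difference marks a frame where it falls back.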
204. The network device obtains, according to the short-time energy differences, the frame signals that satisfy the preset condition interval, giving the mutation audio signal.
The preset condition may be set in many ways. For example, it may be set flexibly according to the needs of the practical application, or it may be set in advance and stored in the network device. In addition, the preset condition may be built into the network device, or it may be stored in a memory and sent to the network device, and so on.
For example, the network device may find two adjacent frame signals whose short-time energy difference is greater than a preset threshold and determine the later of the two, in time order, as the start frame signal. After the start frame signal, it finds two adjacent frame signals whose short-time energy difference is less than the negative of the preset threshold and determines the later of the two, in time order, as the end frame signal. The signal from the start frame signal to the end frame signal is the mutation audio signal. For instance, as shown in Figure 2b, the short-time energy difference $p_3$ between E(2) and E(3) is calculated; if $p_3 > Th$, the start frame signal is the third frame signal a. The short-time energy differences of the adjacent frame signals after the third frame signal are then calculated; if the short-time energy difference between E(3) and E(4) satisfies $p_4 < -Th$, the end frame signal is the fourth frame signal b, and the signal from the third frame signal a to the fourth frame signal b is taken as the mutation audio signal of the audio signal.
The preset threshold may also be set in many ways. For example, it may be set flexibly according to the needs of the practical application, or it may be set in advance and stored in the network device. In addition, the preset threshold may be built into the network device, or it may be stored in a memory and sent to the network device, and so on.
So that the subsequent spectral flatness calculation is closer to the true value of the preset condition interval, and the detection result is therefore more accurate, the end frame signal may be taken as the later of the two frame signals at which the short-time energy difference is first detected to be less than the negative of the preset threshold after the start frame signal. That is, the step "after the start frame signal, obtaining two frame signals whose short-time energy difference is less than the negative of the preset threshold, and determining the later of the two frame signals in time order as the end frame signal" may specifically be as follows:
After the start frame signal, judge in time order whether each short-time energy difference is less than the negative of the preset threshold;
When a short-time energy difference less than the negative of the preset threshold is detected for the first time, determine the later, in time order, of the two frame signals concerned as the end frame signal.
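The start-frame and end-frame rules described in step 204 can be sketched as a single pass over the frame energies. The function name, the 0-based frame indexing, and the `search_from` parameter are assumptions made for illustration.

```python
from typing import Optional, Sequence, Tuple


def find_mutation_segment(energies: Sequence[float], threshold: float,
                          search_from: int = 1) -> Optional[Tuple[int, int]]:
    """Return (start_frame, end_frame) of the first energy burst, or None.

    Frame indices are 0-based; p_t = energies[t] - energies[t - 1].
    """
    start = None
    for t in range(max(search_from, 1), len(energies)):
        p_t = energies[t] - energies[t - 1]
        if start is None:
            if p_t > threshold:
                start = t          # later frame of the pair is the start frame
        elif p_t < -threshold:
            return start, t        # first drop below -threshold ends the segment
    return None


# Energy jumps up at frame 2 and falls back at frame 3.
segment = find_mutation_segment([1.0, 1.2, 9.0, 1.1, 1.0], threshold=5.0)
```

With 0-based indices, the returned pair of frames 2 and 3 corresponds to the third and fourth frame signals in the Figure 2b example.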
205. The network device calculates the spectral flatness of the mutation audio signal.
For example, the network device may apply a Fourier transform to the mutation audio signal to obtain the frequency-domain mutation audio signal, and then calculate the spectral flatness of the frequency-domain mutation audio signal.
The preset flatness value may also be set in many ways. For example, it may be set flexibly according to the needs of the practical application, or it may be set in advance and stored in the network device. In addition, the preset flatness value may be built into the network device, or it may be stored in a memory and sent to the network device, and so on.
Spectral flatness, also known as Wiener entropy, is a measure used in digital signal processing to characterize an audio spectrum. It can be measured as the ratio of the geometric mean (GM) to the arithmetic mean (AM) of the signal spectrum, and this ratio is what is generally called spectral flatness. That is:
where w(n) is the window function, k is the frequency bin of the frequency-domain mutation audio signal, and X is the frequency-domain mutation audio signal. The window function may be a rectangular window, a triangular window, a Hanning window, or the like.
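The transform that these symbols describe is not reproduced in this text. A standard windowed discrete Fourier transform that is consistent with the symbols w(n), k, and X is sketched below. The time-domain symbol $x_t(n)$ and the transform length N are assumptions, and the original formula may differ in detail.

```latex
X_t(k) = \sum_{n=0}^{N-1} x_t(n)\, w(n)\, e^{-j 2 \pi k n / N}, \qquad k = 0, 1, \dots, N-1
```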
$F(t) = GM(t) / AM(t)$
where GM(t) is the geometric mean of the frequency-domain mutation audio signal, AM(t) is the arithmetic mean of the frequency-domain mutation audio signal, and F(t) is the spectral flatness.
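A minimal sketch of the ratio F = GM / AM follows. It makes several assumptions that the text does not confirm: the means are taken over the power spectrum, a Hann window is applied, only bins 0 to N/2 are used, and a small `eps` guards against the logarithm of zero. A naive DFT is used so the example needs only the standard library.

```python
import cmath
import math
from typing import List, Sequence


def power_spectrum(frame: Sequence[float]) -> List[float]:
    """Naive DFT power spectrum of a Hann-windowed frame (bins 0 .. N/2)."""
    n_samples = len(frame)  # assumed to be at least 2
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (n_samples - 1))
              for n in range(n_samples)]
    spectrum = []
    for k in range(n_samples // 2 + 1):
        acc = sum(frame[n] * window[n]
                  * cmath.exp(-2j * math.pi * k * n / n_samples)
                  for n in range(n_samples))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_flatness(power: Sequence[float], eps: float = 1e-12) -> float:
    """F = GM / AM of the power spectrum; eps avoids log(0)."""
    log_mean = sum(math.log(p + eps) for p in power) / len(power)
    geometric_mean = math.exp(log_mean)
    arithmetic_mean = sum(power) / len(power) + eps
    return geometric_mean / arithmetic_mean


n = 32
click = [0.0] * n
click[n // 2] = 1.0                      # a single impulse, like a click
tone = [math.sin(2 * math.pi * 4 * i / n) for i in range(n)]
click_flatness = spectral_flatness(power_spectrum(click))
tone_flatness = spectral_flatness(power_spectrum(tone))
```

The impulse spreads its energy evenly across all bins, so its flatness is close to 1, while the pure tone concentrates its energy in a few bins and scores far lower. This is why a flatness above the preset value is treated as evidence of a sonic boom.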
For example, to further improve detection accuracy and ensure that the audio the user hears is free of flaws, the network device may first detect the peak position of the mutation audio signal and then, centered on the peak position, take the same number of sampling points to the left and to the right to form one sonic boom audio frame. Specifically, it may detect the peak position of the mutation audio signal; take a fixed number of sampling points before and after the peak position to form the sonic boom audio frame; and calculate the spectral flatness of the sonic boom audio frame.
For example, as shown in Figure 2b, centered on the peak position of the mutation audio signal, N/2 sampling points are taken on each side to form a sonic boom audio frame c, so that sonic boom audio frame c contains N sampling points in total; the spectral flatness of sonic boom audio frame c is then calculated.
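Extracting the peak-centered frame can be sketched as below. Taking the peak to be the sample of largest absolute amplitude, and clamping the window at the segment edges, are assumptions; the text does not say how the peak is defined or how edges are handled.

```python
from typing import List, Sequence


def sonic_boom_frame(segment: Sequence[float], frame_len: int) -> List[float]:
    """Return `frame_len` samples centred on the largest-magnitude sample."""
    # Peak position: index of the sample with the largest absolute value.
    peak = max(range(len(segment)), key=lambda i: abs(segment[i]))
    half = frame_len // 2
    # Clamp the window so that it stays inside the segment (an assumption).
    start = max(0, min(peak - half, len(segment) - frame_len))
    return list(segment[start:start + frame_len])


# A 20-sample segment with a single spike at index 10.
segment = [0.0] * 10 + [5.0] + [0.0] * 9
frame = sonic_boom_frame(segment, frame_len=8)
```

Here the spike ends up at index 4 of the 8-sample frame, with N/2 samples taken from each side of the peak.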
206. The network device judges whether the spectral flatness is greater than the preset flatness value; if the spectral flatness is greater than the preset flatness value, it determines that a sonic boom is present in the audio signal.
For example, the network device may judge whether the spectral flatness is greater than the preset flatness value. If the spectral flatness is greater than the preset flatness value, it determines that a sonic boom is present in the audio signal; if the spectral flatness is less than the preset flatness value, it determines that no sonic boom is present in the audio signal.
207. The network device judges whether the audio signal to be detected has been fully examined. If not, it returns to the step of obtaining, according to the short-time energy differences, the frame signals that satisfy the preset condition interval to give a mutation audio signal (that is, it returns to step 204), until the audio signal to be detected has been fully examined.
For example, for the sake of accurate subsequent repair, after detecting one sonic boom the network device may continue to examine the short-time energy differences to obtain frame signals that satisfy the preset condition interval, until all of the audio signal to be detected has been examined. That is, it returns to the step of obtaining, according to the short-time energy differences, the frame signals that satisfy the preset condition interval to give a mutation audio signal, until the audio signal to be detected has been fully examined. For instance, after judging whether the spectral flatness of the mutation audio signal is greater than the preset flatness value, the network device may, regardless of the result, continue to examine the frame signals after the fourth frame signal until all frame signals have been examined, and so obtain the detection result.
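The loop back to step 204 can be sketched as a scan that does not stop at the first mutation segment. In this illustration the flatness computation is passed in as a callable, `flatness_of`, so that the example stays self-contained; that parameter, the function name, and the 0-based frame indices are assumptions.

```python
from typing import Callable, List, Sequence, Tuple


def scan_all(energies: Sequence[float], threshold: float, preset_flatness: float,
             flatness_of: Callable[[int, int], float]) -> List[Tuple[int, int, bool]]:
    """Scan every frame and report (start, end, is_sonic_boom) for each burst."""
    results = []
    start = None
    for t in range(1, len(energies)):
        p_t = energies[t] - energies[t - 1]
        if start is None:
            if p_t > threshold:
                start = t                      # energy jumps up: start frame
        elif p_t < -threshold:
            flatness = flatness_of(start, t)   # step 205 on this segment
            results.append((start, t, flatness > preset_flatness))  # step 206
            start = None                       # step 207: keep scanning
    return results


# Two bursts; the stand-in flatness function rates only the first as flat.
results = scan_all([1.0, 9.0, 1.0, 1.0, 8.0, 1.0], threshold=5.0,
                   preset_flatness=0.5,
                   flatness_of=lambda s, e: 0.9 if s == 1 else 0.2)
```

Scanning continues whatever the outcome of each judgment, so every candidate segment in the examined period is recorded for later repair.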
Optionally, after detection of the audio signal is complete, an interface presenting the detection result may be generated. The interface includes a detection port that can receive the detection result of the audio signal to be detected, and once detection is complete the interface indicates whether an audio sonic boom signal was detected.
Optionally, after a sonic boom at the beginning is detected, the signals with missing frequency bands may also be repaired or replaced, to ensure that the user hears a good audio file.
As can be seen from the above, when performing sonic boom detection on an audio signal, the network device of this embodiment obtains the audio signal to be detected and divides it into multiple frame signals; it then calculates the short-time energy difference between every two adjacent frame signals; it then obtains, according to the short-time energy differences, the frame signals that satisfy the preset condition interval, giving the mutation audio signal; next, it calculates the spectral flatness of the mutation audio signal, and if the spectral flatness is greater than the preset flatness value, it determines that a sonic boom is present in the audio signal. The scheme divides the audio signal into frames, calculates the time-domain short-time energy of each frame, uses the short-time energy difference to locate the audio frames where the energy changes abruptly and so find the mutation audio signal, and then calculates its spectral flatness, so that audio files with missing frequency bands can be screened out accurately by spectral flatness.
In addition, the scheme can repair or replace a sonic boom at the beginning, which improves the quality of the audio file and the user experience.
To better implement the audio sonic boom detection method provided by the embodiments of this application, an embodiment of this application further provides an audio sonic boom detection device, which may be integrated in a network device such as a mobile phone, a tablet computer, or a palmtop computer. The terms used have the same meanings as in the audio sonic boom detection method above, and for implementation details reference may be made to the description in the method embodiments.
For example, as shown in Figure 3a, the audio sonic boom detection device may include a framing module 301, a computing module 302, an obtaining module 303, and a judgment module 304, as follows:
(1) Framing module 301;
The framing module 301 is configured to obtain the audio signal to be detected and divide the audio signal into multiple frame signals.
For example, the framing module 301 may first obtain an audio file from various sources such as a network, a mobile phone, or a video, which is then provided to the audio sonic boom detection device. That is, the audio sonic boom detection device may receive the audio files obtained from these sources and extract the audio signal to be detected from them. The audio signal is then divided into multiple frame signals.
To improve detection efficiency, detection may be restricted to a period at the beginning of the audio signal in the time domain, and only the audio signal within that period is divided into frames. That is, the framing module may include a selection submodule and a framing submodule, as follows:
Selection submodule, configured to select, in the time domain, the signal within a preset time period starting from the first frame of the audio signal, to obtain the beginning audio signal;
Framing submodule, configured to divide the beginning audio signal into multiple frame signals.
(2) Computing module 302;
The computing module 302 is configured to calculate the short-time energy difference between every two adjacent frame signals.
For example, the computing module 302 may include an energy submodule, an obtaining submodule, and an energy difference submodule, as follows:
Energy submodule, configured to calculate the short-time energy of each frame signal;
Obtaining submodule, configured to obtain the time of each frame signal;
Energy difference submodule, configured to calculate in turn, following the time order of the frame signals, the difference between the short-time energies of each pair of adjacent frame signals, to obtain the short-time energy differences of adjacent frame signals.
Here, the short-time energy reflects the strength of the signal at different moments. The short-time energy E of each frame signal may be calculated as follows:
where N is the number of sampling points in each frame signal, n is the index of a sampling point within the frame signal, t denotes the position of the frame signal, and E(t) is the short-time energy of the t-th frame signal.
The short-time energy difference between two adjacent frame signals may be calculated as follows:
$p_t = E(t) - E(t-1)$
where t is the position of the frame and $p_t$ is the short-time energy difference between two adjacent frame signals.
(3) Obtaining module 303;
The obtaining module 303 is configured to obtain, according to the short-time energy differences, the frame signals that satisfy the preset condition interval, giving the mutation audio signal.
The preset condition may be set in many ways. For example, it may be set flexibly according to the needs of the practical application, or it may be set in advance and stored in the network device. In addition, the preset condition may be built into the network device, or it may be stored in a memory and sent to the network device, and so on.
For example, the obtaining module 303 may find two adjacent frame signals whose short-time energy difference is greater than a preset threshold and determine the later of the two, in time order, as the start frame signal. After the start frame signal, it finds two adjacent frame signals whose short-time energy difference is less than the negative of the preset threshold and determines the later of the two, in time order, as the end frame signal. The signal from the start frame signal to the end frame signal is the mutation audio signal.
The preset threshold may also be set in many ways. For example, it may be set flexibly according to the needs of the practical application, or it may be set in advance and stored in the network device. In addition, the preset threshold may be built into the network device, or it may be stored in a memory and sent to the network device, and so on.
So that the subsequent spectral flatness calculation is closer to the true value of the preset condition interval, and the detection result is therefore more accurate, the end frame signal may be taken as the later of the two frame signals at which the short-time energy difference is first detected to be less than the negative of the preset threshold after the start frame signal. That is, the obtaining module may specifically perform the following operations:
After the start frame signal, judge in time order whether each short-time energy difference is less than the negative of the preset threshold;
When a short-time energy difference less than the negative of the preset threshold is detected for the first time, determine the later, in time order, of the two frame signals concerned as the end frame signal.
(4) Judgment module 304;
The judgment module 304 is configured to calculate the spectral flatness of the mutation audio signal and, if the spectral flatness is greater than the preset flatness value, determine that a sonic boom is present in the audio signal.
For example, the judgment module 304 may apply a Fourier transform to the mutation audio signal to obtain the frequency-domain mutation audio signal, calculate the spectral flatness of the frequency-domain mutation audio signal, and then judge whether the spectral flatness is greater than the preset flatness value. If the spectral flatness is greater than the preset flatness value, it determines that a sonic boom is present in the audio signal; if the spectral flatness is less than the preset flatness value, it determines that no sonic boom is present in the audio signal.
The preset flatness value may also be set in many ways. For example, it may be set flexibly according to the needs of the practical application, or it may be set in advance and stored in the network device. In addition, the preset flatness value may be built into the network device, or it may be stored in a memory and sent to the network device, and so on.
Spectral flatness, also known as Wiener entropy, is a measure used in digital signal processing to characterize an audio spectrum. It can be measured as the ratio of the geometric mean (GM) to the arithmetic mean (AM) of the signal spectrum, and this ratio is what is generally called spectral flatness. That is:
where w(n) is the window function, k is the frequency bin of the frequency-domain mutation audio signal, and X is the frequency-domain mutation audio signal. The window function may be a rectangular window, a triangular window, a Hanning window, or the like.
$F(t) = GM(t) / AM(t)$
where GM(t) is the geometric mean of the frequency-domain mutation audio signal, AM(t) is the arithmetic mean of the frequency-domain mutation audio signal, and F(t) is the spectral flatness.
For example, to further improve detection accuracy and ensure that the audio the user hears is free of flaws, the peak position of the mutation audio signal may be detected first and then, centered on the peak position, N/2 sampling points taken on each side to form one sonic boom audio frame, so that the sonic boom audio frame contains N sampling points in total. Accordingly, the judgment module may include a detection submodule, a sampling submodule, and a calculation submodule, as follows:
Detection submodule, configured to detect the peak position of the mutation audio signal;
Sampling submodule, configured to take a fixed number of sampling points before and after the peak position to form the sonic boom audio frame;
Calculation submodule, configured to calculate the spectral flatness of the sonic boom audio frame.
After one sonic boom is detected, for the sake of accurate subsequent repair, the short-time energy differences may continue to be examined to obtain frame signals that satisfy the preset condition interval, until all of the audio signal to be detected has been examined. That is, as shown in Figure 3b, the audio sonic boom detection device may further include a detection module 305, as follows:
The detection module 305 is configured to return to the step of obtaining, according to the short-time energy differences, the frame signals that satisfy the preset condition interval to give a mutation audio signal, until the audio signal to be detected has been fully examined.
Those skilled in the art will understand that the audio sonic boom detection device shown in Figure 3a does not constitute a limitation on the device, which may include more or fewer components than illustrated, combine certain components, or use a different arrangement of components. It should also be noted that, for the specific implementation of each of the above units, reference may be made to the preceding method embodiments, which is not repeated here.
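The way modules 301 to 304 hand data to one another can be sketched as a small container of callables. This is only an architectural illustration: the class name `SonicBoomDetector`, the field names, and the toy stand-in functions are all hypothetical, and a real device would plug in the framing, energy difference, segment search, and flatness judgment described in the method embodiment.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

Frame = List[float]


@dataclass
class SonicBoomDetector:
    """Hypothetical container mirroring modules 301 to 304."""
    framing: Callable[[Sequence[float]], List[Frame]]              # module 301
    computing: Callable[[List[Frame]], List[float]]                # module 302
    obtaining: Callable[[List[float]], Optional[Tuple[int, int]]]  # module 303
    judging: Callable[[List[Frame]], bool]                         # module 304

    def detect(self, samples: Sequence[float]) -> bool:
        frames = self.framing(samples)
        diffs = self.computing(frames)
        segment = self.obtaining(diffs)
        if segment is None:
            return False               # no mutation audio signal found
        start, end = segment
        return self.judging(frames[start:end + 1])


# Toy stand-ins: 2-sample frames, sum-of-squares energy, fixed thresholds.
detector = SonicBoomDetector(
    framing=lambda s: [list(s[i:i + 2]) for i in range(0, len(s) - 1, 2)],
    computing=lambda fs: [sum(x * x for x in fs[t]) - sum(x * x for x in fs[t - 1])
                          for t in range(1, len(fs))],
    obtaining=lambda d: (1, 2) if len(d) >= 2 and d[0] > 1.0 and d[1] < -1.0 else None,
    judging=lambda seg: True,
)
found = detector.detect([0.0, 0.0, 3.0, 3.0, 0.0, 0.0])
quiet = detector.detect([0.0] * 6)
```

Keeping each module behind its own callable reflects the remark above that components may be combined or arranged differently without changing the overall flow.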
As can be seen from the above, when the audio sonic boom detection device of this embodiment performs sonic boom detection on an audio signal, the framing module 301 obtains the audio signal to be detected and divides it into multiple frame signals; the computing module 302 then calculates the short-time energy difference between every two adjacent frame signals; the obtaining module 303 then obtains, according to the short-time energy differences, the frame signals that satisfy the preset condition interval, giving the mutation audio signal; next, the judgment module 304 calculates the spectral flatness of the mutation audio signal, and if the spectral flatness is greater than the preset flatness value, it determines that a sonic boom is present in the audio signal. The scheme divides the audio signal into frames, calculates the time-domain short-time energy of each frame, uses the short-time energy difference to locate the audio frames where the energy changes abruptly and so find the mutation audio signal, and then calculates its spectral flatness, so that audio files with missing frequency bands can be screened out accurately by spectral flatness.
Correspondingly, an embodiment of the present invention further provides a network device, which may be a device such as a server or a terminal and which integrates any audio sonic boom detection device provided by the embodiments of the present invention. Figure 4 shows a schematic structural diagram of the network device involved in the embodiments of the present invention. Specifically:
The network device may include components such as a processor 401 with one or more processing cores, a memory 402 comprising one or more computer-readable storage media, a power supply 403, and an input unit 404. Those skilled in the art will understand that the network device structure shown in Figure 4 does not constitute a limitation on the network device, which may include more or fewer components than illustrated, combine certain components, or use a different arrangement of components. Among them:
The processor 401 is the control center of the network device. It connects the various parts of the entire network device through various interfaces and lines, and performs the various functions of the network device and processes data by running or executing the software programs and/or modules stored in the memory 402 and calling the data stored in the memory 402, thereby monitoring the network device as a whole. Optionally, the processor 401 may include one or more processing cores. Preferably, the processor 401 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, the user interface, application programs, and the like, and the modem processor mainly handles wireless communication. It can be understood that the modem processor may also not be integrated into the processor 401.
The memory 402 may be used to store software programs and modules, and the processor 401 performs various functional applications and data processing by running the software programs and modules stored in the memory 402. The memory 402 may mainly include a program storage area and a data storage area, where the program storage area may store the operating system, the application programs required by at least one function (such as a sound playback function or an image playback function), and the like, and the data storage area may store data created through use of the network device, and the like. In addition, the memory 402 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device. Correspondingly, the memory 402 may also include a memory controller to provide the processor 401 with access to the memory 402.
The network device further includes a power supply 403 that supplies power to the components. Preferably, the power supply 403 may be logically connected to the processor 401 through a power management system, so that functions such as charging management, discharging management, and power consumption management are implemented through the power management system. The power supply 403 may also include any components such as one or more DC or AC power sources, a recharging system, a power failure detection circuit, a power converter or inverter, and a power status indicator.
The network device may also include an input unit 404, which may be used to receive input numeric or character information and to generate keyboard, mouse, joystick, optical, or trackball signal inputs related to user settings and function control.
Although not shown, the network device may also include a display unit and the like, which is not described here. Specifically, in this embodiment, the processor 401 in the network device loads the executable files corresponding to the processes of one or more application programs into the memory 402 according to the following instructions, and runs the application programs stored in the memory 402 to implement various functions, as follows:
Obtain the audio signal to be detected and divide the audio signal into multiple frame signals; then calculate the short-time energy difference between every two adjacent frame signals; then obtain, according to the short-time energy differences, the frame signals that satisfy the preset condition interval, giving the mutation audio signal; next, calculate the spectral flatness of the mutation audio signal, and if the spectral flatness is greater than the preset flatness value, determine that a sonic boom is present in the audio signal.
Optionally, dividing the audio signal into multiple frame signals may include:
Selecting, in the time domain, the signal within a preset time period starting from the first frame of the audio signal, to obtain the beginning audio signal; and dividing the beginning audio signal into multiple frame signals.
Optionally, calculating the short-time energy difference between every two adjacent frame signals may include:
Calculating the short-time energy of each frame signal; obtaining the time of each frame signal; and calculating in turn, following the time order of the frame signals, the difference between the short-time energies of each pair of adjacent frame signals, to obtain the short-time energy differences of adjacent frame signals.
Optionally, obtaining, according to the short-time energy differences, the frame signals that satisfy the preset condition interval to give the mutation audio signal may include:
Obtaining two frame signals whose short-time energy difference is greater than a preset threshold, and determining the later of the two, in time order, as the start frame signal; after the start frame signal, obtaining two frame signals whose short-time energy difference is less than the negative of the preset threshold, and determining the later of the two, in time order, as the end frame signal; and taking the signal from the start frame signal to the end frame signal as the mutation audio signal.
Optionally, obtaining, after the start frame signal, two frame signals whose short-time energy difference is less than the negative of the preset threshold, and determining the later of the two, in time order, as the end frame signal may include:
After the start frame signal, judging in time order whether each short-time energy difference is less than the negative of the preset threshold; and, when a short-time energy difference less than the negative of the preset threshold is detected for the first time, determining the later, in time order, of the two frame signals concerned as the end frame signal.
Optionally, calculating the spectral flatness of the mutation audio signal may include:
Detecting the peak position of the mutation audio signal; taking a fixed number of sampling points before and after the peak position to form a sonic boom audio frame; and calculating the spectral flatness of the sonic boom audio frame.
Optionally, determining that a sonic boom is present in the audio signal if the spectral flatness is greater than the preset flatness value may include:
Judging whether the spectral flatness is greater than the preset flatness value; if the spectral flatness is greater than the preset flatness value, determining that a sonic boom is present in the audio signal; and if the spectral flatness is less than the preset flatness value, determining that no sonic boom is present in the audio signal.
Optionally, after determining that a sonic boom is present in the audio signal because the spectral flatness is greater than the preset flatness value, the method may further include:
Returning to the step of obtaining, according to the short-time energy differences, the frame signals that satisfy the preset condition interval to give a mutation audio signal, until the audio signal to be detected has been fully examined.
For details of each of the above operations, reference may be made to the preceding embodiments, which is not repeated here.
As can be seen from the above, when performing sonic boom detection on an audio signal, the network device of this embodiment obtains the audio signal to be detected and divides it into multiple frame signals; it then calculates the short-time energy difference between every two adjacent frame signals; it then obtains, according to the short-time energy differences, the frame signals that satisfy the preset condition interval, giving the mutation audio signal; next, it calculates the spectral flatness of the mutation audio signal, and if the spectral flatness is greater than the preset flatness value, it determines that a sonic boom is present in the audio signal. The scheme divides the audio signal into frames, calculates the time-domain short-time energy of each frame, uses the short-time energy difference to locate the audio frames where the energy changes abruptly and so find the mutation audio signal, and then calculates its spectral flatness, so that audio files with missing frequency bands can be screened out accurately by spectral flatness.
It will appreciated by the skilled person that all or part of the steps in the various methods of above-described embodiment can be with
It is completed by instructing, or relevant hardware is controlled by instruction to complete, which can store computer-readable deposits in one
In storage media, and is loaded and executed by processor.
To this end, an embodiment of the present application provides a storage medium storing a plurality of instructions that can be loaded by a processor to perform the steps of any audio pop detection method provided by the embodiments of the present application. For example, the instructions may perform the following steps:
Obtain an audio signal to be detected and divide it into multiple frame signals; calculate the short-time energy difference between each two adjacent frame signals; obtain, according to the short-time energy differences, the frame signals within the interval satisfying a preset condition, yielding an abrupt-change (mutation) audio signal; then calculate the spectral flatness of the abrupt-change audio signal, and if the spectral flatness is greater than a preset flatness value, determine that the audio signal contains a pop.
For the specific implementation of each of the above operations, refer to the preceding embodiments; the details are not repeated here.
The storage medium may include read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc, and the like.
Because the instructions stored in the storage medium can perform the steps of any audio pop detection method provided by the embodiments of the present application, they can achieve the beneficial effects achievable by any such method. For details, see the preceding embodiments; they are not repeated here.
The audio pop detection method, device, and storage medium provided by the embodiments of the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the above embodiments is intended only to help in understanding the method of the present application and its core idea. Meanwhile, those skilled in the art may, following the idea of the present application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.
Claims (10)
1. An audio pop (sonic boom) detection method, characterized by comprising:
obtaining an audio signal to be detected, and dividing the audio signal into a plurality of frame signals;
calculating the short-time energy difference between each two adjacent frame signals;
obtaining, according to the short-time energy difference, the frame signals within an interval satisfying a preset condition, to obtain an abrupt-change (mutation) audio signal; and
calculating the spectral flatness of the abrupt-change audio signal, and if the spectral flatness is greater than a preset flatness value, determining that a pop exists in the audio signal.
2. The audio pop detection method according to claim 1, characterized in that dividing the audio signal into a plurality of frame signals comprises:
selecting, in the time domain, a signal of a preset time period from the audio signal starting from the first frame, to obtain a beginning audio signal; and
dividing the beginning audio signal into a plurality of frame signals.
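The selection of a beginning segment followed by framing can be sketched as below. This is a minimal illustration, not the patent's implementation: the function names, the use of non-overlapping frames, and the dropping of a trailing partial frame are all assumptions made here.

```python
def split_into_frames(samples, frame_len):
    """Split samples into consecutive, non-overlapping frames (assumed layout).

    A trailing partial frame is dropped.
    """
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, frame_len)]


def head_frames(samples, sample_rate, head_seconds, frame_len):
    """Keep only the first `head_seconds` of audio, then frame that segment."""
    head = samples[:int(sample_rate * head_seconds)]
    return split_into_frames(head, frame_len)


# 100 samples at 10 Hz, first 5 seconds, 20-sample frames.
frames = head_frames(list(range(100)), sample_rate=10, head_seconds=5, frame_len=20)
print(len(frames))  # 2
```

Restricting the analysis to a preset period at the start of the file keeps the cost bounded when only the beginning of the audio is of interest.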
3. The audio pop detection method according to claim 1, characterized in that calculating the short-time energy difference between each two adjacent frame signals comprises:
calculating the short-time energy of each frame signal;
obtaining the time of each frame signal; and
calculating, in the time order of the frame signals, the difference between the short-time energies of each two adjacent frame signals in turn, to obtain the short-time energy difference between each two adjacent frame signals.
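The three steps of this claim can be sketched as follows. The representation of a frame as a `(start_time, samples)` pair, the unnormalized sum-of-squares energy, and the function names are assumptions made for illustration; the sign convention (later frame minus earlier frame) is chosen so that an energy rise gives a positive difference.

```python
def short_time_energy(samples):
    """Sum of squared sample values in one frame (unnormalized, by assumption)."""
    return sum(s * s for s in samples)


def energy_differences(timed_frames):
    """Return E[i+1] - E[i] for frames taken in ascending time order.

    `timed_frames` is a list of (start_time, samples) pairs in any order.
    """
    ordered = sorted(timed_frames, key=lambda tf: tf[0])
    energies = [short_time_energy(samples) for _, samples in ordered]
    return [later - earlier for earlier, later in zip(energies, energies[1:])]


frames = [(0.02, [3, 4]), (0.00, [1, 1]), (0.04, [0, 0])]
print(energy_differences(frames))  # [23, -25]
```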
4. The audio pop detection method according to claim 3, characterized in that obtaining, according to the short-time energy difference, the frame signals within an interval satisfying a preset condition, to obtain an abrupt-change audio signal, comprises:
obtaining the two frame signals whose short-time energy difference is greater than a preset threshold, and determining the later of the two frame signals in time order as a start frame signal;
obtaining, after the start frame signal, the two frame signals whose short-time energy difference is less than the negative of the preset threshold, and determining the later of the two frame signals in time order as an end frame signal; and
obtaining the signal from the start frame signal to the end frame signal, to obtain the abrupt-change audio signal.
5. The audio pop detection method according to claim 4, characterized in that obtaining, after the start frame signal, the two frame signals whose short-time energy difference is less than the negative of the preset threshold, and determining the later of the two frame signals in time order as the end frame signal, comprises:
judging in turn, in time order after the start frame signal, whether the short-time energy difference is less than the negative of the preset threshold; and
when the short-time energy difference is detected for the first time to be less than the negative of the preset threshold, determining the later, in time order, of the two frame signals whose difference is less than the negative of the preset threshold as the end frame signal.
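The start and end frame selection described in claims 4 and 5 can be sketched as a single forward scan over the energy differences. The function name, the index convention `diffs[i] = E[i+1] - E[i]`, and the `search_from` parameter are assumptions introduced here for illustration.

```python
def find_abrupt_segment(diffs, threshold, search_from=0):
    """Locate the first abrupt-change interval in a list of energy differences.

    diffs[i] is assumed to be E[i+1] - E[i]. Returns (start_frame, end_frame)
    as frame indices, or None if no complete interval is found.
    """
    start = None
    for i in range(search_from, len(diffs)):
        if start is None:
            if diffs[i] > threshold:
                start = i + 1  # later frame of the rising pair
        elif diffs[i] < -threshold:
            return start, i + 1  # later frame of the first falling pair
    return None


# Frame energies 1, 100, 100, 1 give differences 99, 0, -99.
print(find_abrupt_segment([99, 0, -99], threshold=50))  # (1, 3)
```

Returning on the first sufficiently negative difference mirrors the "detected for the first time" wording of claim 5, so a long decay after the jump does not extend the interval.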
6. The audio pop detection method according to claim 1, characterized in that calculating the spectral flatness of the abrupt-change audio signal comprises:
detecting the peak position of the abrupt-change audio signal;
taking a fixed number of sample points before and after the peak position to form a pop audio frame; and
calculating the spectral flatness of the pop audio frame.
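The peak-centered frame and its spectral flatness can be sketched as below, assuming the conventional definition of flatness as the ratio of the geometric mean to the arithmetic mean of the power spectrum. The function names, the `half_width` parameter, the `eps` floor, and the plain DFT (written out only to keep the sketch dependency-free; an FFT would normally be used) are all assumptions made here.

```python
import cmath
import math


def peak_centered_frame(segment, half_width):
    """Take `half_width` samples before the peak and `half_width` samples
    starting at the peak, clipped to the segment bounds."""
    peak = max(range(len(segment)), key=lambda i: abs(segment[i]))
    start = max(0, peak - half_width)
    stop = min(len(segment), peak + half_width)
    return segment[start:stop]


def power_spectrum(frame):
    """One-sided power spectrum computed with a direct O(n^2) DFT."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_flatness(frame, eps=1e-12):
    """Geometric mean over arithmetic mean of the power spectrum (0 to 1).

    `eps` is a small floor added to every bin to avoid log(0).
    """
    power = [p + eps for p in power_spectrum(frame)]
    log_mean = sum(math.log(p) for p in power) / len(power)
    return math.exp(log_mean) / (sum(power) / len(power))
```

A single-sample click has equal power in every bin and therefore a flatness close to 1, whereas a pure sine concentrates its power in one bin and scores close to 0, which is why a high flatness is taken as evidence of a pop.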
7. The audio pop detection method according to claim 1, characterized in that, if the spectral flatness is greater than the preset flatness value, determining that a pop exists in the audio signal comprises:
judging whether the spectral flatness is greater than the preset flatness value;
if the spectral flatness is greater than the preset flatness value, determining that a pop exists in the audio signal; and
if the spectral flatness is less than the preset flatness value, determining that no pop exists in the audio signal.
8. The audio pop detection method according to claim 1, characterized in that, after determining that a pop exists in the audio signal if the spectral flatness is greater than the preset flatness value, the method further comprises:
returning to the step of obtaining, according to the short-time energy difference, the frame signals within an interval satisfying a preset condition to obtain an abrupt-change audio signal, until detection of the audio signal to be detected is complete.
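The repeated search until the whole signal has been examined can be sketched as a loop that resumes scanning after each interval. This is an illustration under assumptions: `flatness_of` is a hypothetical callback that returns the spectral flatness for a given frame interval, the index convention is `diffs[i] = E[i+1] - E[i]`, and this sketch continues scanning after every interval regardless of the verdict.

```python
def scan_for_pops(diffs, energy_threshold, flatness_of, flatness_threshold):
    """Return every (start_frame, end_frame) interval judged to be a pop.

    `flatness_of(start, end)` is a caller-supplied (hypothetical) function
    giving the spectral flatness of the signal between those frames.
    """
    pops = []
    start = None
    for i, diff in enumerate(diffs):
        if start is None:
            if diff > energy_threshold:
                start = i + 1  # energy jump: later frame opens the interval
        elif diff < -energy_threshold:
            end = i + 1  # energy drop: later frame closes the interval
            if flatness_of(start, end) > flatness_threshold:
                pops.append((start, end))
            start = None  # go back to searching for the next interval
    return pops


# Two energy bursts; only the first is assumed to be spectrally flat.
fake_flatness = lambda start, end: 0.9 if start == 1 else 0.1
print(scan_for_pops([99, -99, 0, 80, 0, -80], 50, fake_flatness, 0.5))  # [(1, 2)]
```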
9. An audio pop detection device, characterized by comprising:
a framing module, configured to obtain an audio signal to be detected and divide the audio signal into a plurality of frame signals;
a calculation module, configured to calculate the short-time energy difference between each two adjacent frame signals;
an obtaining module, configured to obtain, according to the short-time energy difference, the frame signals within an interval satisfying a preset condition, to obtain an abrupt-change audio signal; and
a judgment module, configured to calculate the spectral flatness of the abrupt-change audio signal and, if the spectral flatness is greater than a preset flatness value, determine that a pop exists in the audio signal.
10. A storage medium, characterized in that the storage medium stores a plurality of instructions, the instructions being suitable for being loaded by a processor to perform the steps of the audio pop detection method according to any one of claims 1 to 8.
Priority Applications (2)
Application Number | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|
CN201910506938.3A | CN110265064B (en) | 2019-06-12 | 2019-06-12 | Audio frequency crackle detection method, device and storage medium |
PCT/CN2019/093409 | WO2020248308A1 (en) | 2019-06-12 | 2019-06-27 | Audio pop detection method and apparatus, and storage medium |
Applications Claiming Priority (1)
Application Number | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|
CN201910506938.3A | CN110265064B (en) | 2019-06-12 | 2019-06-12 | Audio frequency crackle detection method, device and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110265064A (en) | 2019-09-20 |
CN110265064B (en) | 2021-10-08 |
Family
ID=67917850
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN201910506938.3A | Active | CN110265064B (en) | 2019-06-12 | 2019-06-12 | Audio frequency crackle detection method, device and storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110265064B (en) |
WO (1) | WO2020248308A1 (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111312285A (en) * | 2020-01-14 | 2020-06-19 | 腾讯音乐娱乐科技(深圳)有限公司 | Beginning popping detection method and device |
CN112151055A (en) * | 2020-09-25 | 2020-12-29 | 北京猿力未来科技有限公司 | Audio processing method and device |
CN112735481A (en) * | 2020-12-18 | 2021-04-30 | Oppo(重庆)智能科技有限公司 | POP sound detection method and device, terminal equipment and storage medium |
CN113035223A (en) * | 2021-03-12 | 2021-06-25 | 北京字节跳动网络技术有限公司 | Audio processing method, device, equipment and storage medium |
CN113542863A (en) * | 2020-04-14 | 2021-10-22 | 深圳Tcl数字技术有限公司 | Sound processing method, storage medium and smart television |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113611330A (en) * | 2021-07-29 | 2021-11-05 | 杭州网易云音乐科技有限公司 | Audio detection method and device, electronic equipment and storage medium |
CN113744756A (en) * | 2021-08-11 | 2021-12-03 | 浙江讯飞智能科技有限公司 | Equipment quality inspection and audio data expansion method and related device, equipment and medium |
CN113613159B (en) * | 2021-08-20 | 2023-07-21 | 贝壳找房(北京)科技有限公司 | Microphone blowing signal detection method, device and system |
CN115243183A (en) * | 2022-06-29 | 2022-10-25 | 上海勤宽科技有限公司 | Audio detection method, device and storage medium |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101952889A (en) * | 2008-02-01 | 2011-01-19 | 摩托罗拉公司 | Method and apparatus for estimating high-band energy in a bandwidth extension system |
CN103650040A (en) * | 2011-05-16 | 2014-03-19 | 谷歌公司 | Noise supression method and apparatus using multiple feature modeling for speech/noise likelihood |
CN103918030A (en) * | 2011-09-29 | 2014-07-09 | 杜比国际公司 | High quality detection in fm stereo radio signals |
CN105118520A (en) * | 2015-07-13 | 2015-12-02 | 腾讯科技(深圳)有限公司 | Elimination method and device of audio beginning sonic boom |
CN105989853A (en) * | 2015-02-28 | 2016-10-05 | 科大讯飞股份有限公司 | Audio quality evaluation method and system |
CN108198572A (en) * | 2017-12-29 | 2018-06-22 | 珠海市君天电子科技有限公司 | A kind of audio-frequency processing method and device |
CN108492837A (en) * | 2018-03-23 | 2018-09-04 | 腾讯音乐娱乐科技(深圳)有限公司 | Detection method, device and the storage medium of audio burst white noise |
CN109616135A (en) * | 2018-11-14 | 2019-04-12 | 腾讯音乐娱乐科技(深圳)有限公司 | Audio-frequency processing method, device and storage medium |
CN109658955A (en) * | 2019-01-07 | 2019-04-19 | 环鸿电子(昆山)有限公司 | Sonic boom detection method and device |
CN109801646A (en) * | 2019-01-31 | 2019-05-24 | 北京嘉楠捷思信息技术有限公司 | Voice endpoint detection method and device based on fusion features |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050120870A1 (en) * | 1998-05-15 | 2005-06-09 | Ludwig Lester F. | Envelope-controlled dynamic layering of audio signal processing and synthesis for music applications |
EP1685554A1 (en) * | 2003-10-09 | 2006-08-02 | TEAC America, Inc. | Method, apparatus, and system for synthesizing an audio performance using convolution at multiple sample rates |
CN107346665A (en) * | 2017-06-29 | 2017-11-14 | 广州视源电子科技股份有限公司 | Method, apparatus, equipment and the storage medium of audio detection |
2019
- 2019-06-12 CN CN201910506938.3A (CN110265064B), status: Active
- 2019-06-27 WO PCT/CN2019/093409 (WO2020248308A1), status: Application Filing
Patent Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101952889A (en) * | 2008-02-01 | 2011-01-19 | 摩托罗拉公司 | Method and apparatus for estimating high-band energy in a bandwidth extension system |
CN103650040A (en) * | 2011-05-16 | 2014-03-19 | 谷歌公司 | Noise supression method and apparatus using multiple feature modeling for speech/noise likelihood |
CN103918030A (en) * | 2011-09-29 | 2014-07-09 | 杜比国际公司 | High quality detection in fm stereo radio signals |
US20140226822A1 (en) * | 2011-09-29 | 2014-08-14 | Dolby International Ab | High quality detection in fm stereo radio signal |
CN105989853A (en) * | 2015-02-28 | 2016-10-05 | 科大讯飞股份有限公司 | Audio quality evaluation method and system |
CN105118520A (en) * | 2015-07-13 | 2015-12-02 | 腾讯科技(深圳)有限公司 | Elimination method and device of audio beginning sonic boom |
CN108198572A (en) * | 2017-12-29 | 2018-06-22 | 珠海市君天电子科技有限公司 | A kind of audio-frequency processing method and device |
CN108492837A (en) * | 2018-03-23 | 2018-09-04 | 腾讯音乐娱乐科技(深圳)有限公司 | Detection method, device and the storage medium of audio burst white noise |
CN109616135A (en) * | 2018-11-14 | 2019-04-12 | 腾讯音乐娱乐科技(深圳)有限公司 | Audio-frequency processing method, device and storage medium |
CN109658955A (en) * | 2019-01-07 | 2019-04-19 | 环鸿电子(昆山)有限公司 | Sonic boom detection method and device |
CN109801646A (en) * | 2019-01-31 | 2019-05-24 | 北京嘉楠捷思信息技术有限公司 | Voice endpoint detection method and device based on fusion features |
Non-Patent Citations (2)
Title |
---|
ARMIN TAGHIPOUR ET AL.: "A PSYCHOACOUSTIC MODEL WITH PARTIAL SPECTRAL FLATNESS MEASURE FOR TONALITY ESTIMATION", 《2014 22ND EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO)》 * |
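ZENG YUMIN ET AL.: "Noise-robust speaker recognition based on sub-band weighted reconstruction of the harmonic spectrum of voiced speech", 《Journal of Southeast University (Natural Science Edition)》 * |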
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111312285A (en) * | 2020-01-14 | 2020-06-19 | 腾讯音乐娱乐科技(深圳)有限公司 | Beginning popping detection method and device |
CN111312285B (en) * | 2020-01-14 | 2023-02-14 | 腾讯音乐娱乐科技(深圳)有限公司 | Beginning popping detection method and device |
CN113542863A (en) * | 2020-04-14 | 2021-10-22 | 深圳Tcl数字技术有限公司 | Sound processing method, storage medium and smart television |
CN112151055A (en) * | 2020-09-25 | 2020-12-29 | 北京猿力未来科技有限公司 | Audio processing method and device |
CN112735481A (en) * | 2020-12-18 | 2021-04-30 | Oppo(重庆)智能科技有限公司 | POP sound detection method and device, terminal equipment and storage medium |
CN112735481B (en) * | 2020-12-18 | 2022-08-05 | Oppo(重庆)智能科技有限公司 | POP sound detection method and device, terminal equipment and storage medium |
CN113035223A (en) * | 2021-03-12 | 2021-06-25 | 北京字节跳动网络技术有限公司 | Audio processing method, device, equipment and storage medium |
CN113035223B (en) * | 2021-03-12 | 2023-11-14 | 北京字节跳动网络技术有限公司 | Audio processing method, device, equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN110265064B (en) | 2021-10-08 |
WO2020248308A1 (en) | 2020-12-17 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||