CN108848435A

CN108848435A - Audio signal processing method and related apparatus

Info

Publication number: CN108848435A
Application number: CN201811141040.2A
Authority: CN
Inventors: 许慎愉
Original assignee: Guangzhou Huaduo Network Technology Co Ltd
Current assignee: Guangzhou Cubesili Information Technology Co Ltd
Priority date: 2018-09-28
Filing date: 2018-09-28
Publication date: 2018-11-20
Anticipated expiration: 2038-09-28
Also published as: CN108848435B

Abstract

An embodiment of the present application discloses an audio signal processing method and related apparatus. The method includes: obtaining, according to the sample rate and duration of a current-frame audio signal, a first average value of the amplitude of the current-frame audio signal over its sampling points; obtaining a first exponential smoothing value corresponding to the first average value; calculating a first reference value of the current-frame audio signal according to operational parameters and a preset operational formula, and determining a first audio state of the current-frame audio signal by comparing the first reference value with a preset first threshold, where the operational parameters include the first average value and the first exponential smoothing value, and the first audio state is a pure noise state or a non-pure noise state; and determining the search window length corresponding to the first audio state according to a preset first correspondence between audio states and search window lengths. This solves the technical problem that, in the existing MCRA algorithm, shortening the duration of abnormal noise reduction also adversely affects non-noise signals.

Description

Audio signal processing method and related apparatus

Technical field

This application relates to the technical field of audio signal processing, and in particular to an audio signal processing method and related apparatus.

Background art

The Minima Controlled Recursive Averaging (MCRA) method is used to estimate the noise energy of each subband in a signal for the purpose of noise reduction. Because of its good performance and low computational complexity, it is widely used in real-time voice interaction scenarios such as mobile calls, VoIP, broadcasting, live streaming, online education and training, and online KTV.

The MCRA algorithm is controlled by a tracked minimum value. If the noise energy increases continuously or suddenly, the minimum value is not updated in time, performance degrades considerably, and the end result is poorer noise reduction.

Specifically, refer to Fig. 1 and Fig. 2. Fig. 1 shows a pure noise signal, and Fig. 2 shows the signal obtained after noise reduction is applied to the pure noise signal in Fig. 1. Fig. 1 and Fig. 2 together show that, for a period after the noise signal suddenly increases, the noise is not suppressed normally, and noise reduction returns to normal only after several seconds. Because the duration of this abnormal noise reduction depends on the search window length, reducing the search window length shortens it and improves the denoising effect. However, because the search window length is fixed, doing so also adversely affects non-noise signals.
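The dependence of the recovery time on the search window length can be illustrated with a minimal sketch. It assumes a plain sliding minimum over per-frame energies as a simplified stand-in for the minimum tracking in MCRA; the function names, the energy values, and the window lengths are hypothetical and are not taken from this application.

```python
from collections import deque


def sliding_minimum(energies, window_len):
    """Minimum of the last `window_len` energies at each frame.

    Simplified stand-in for MCRA minimum tracking (assumption).
    """
    window = deque(maxlen=window_len)
    minima = []
    for energy in energies:
        window.append(energy)
        minima.append(min(window))
    return minima


def frames_until_update(energies, window_len, new_level):
    """Index of the first frame whose tracked minimum reaches `new_level`."""
    for idx, minimum in enumerate(sliding_minimum(energies, window_len)):
        if minimum >= new_level:
            return idx
    return None


# Hypothetical noise floor that jumps from 1.0 to 4.0 at frame 10.
jump_frame = 10
energies = [1.0] * jump_frame + [4.0] * 40

lag_long = frames_until_update(energies, 20, 4.0) - jump_frame
lag_short = frames_until_update(energies, 5, 4.0) - jump_frame
print(lag_long, lag_short)
```

In this sketch the tracked minimum only reaches the new noise level once every frame in the window postdates the jump, so a shorter window recovers sooner.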

Refer further to Fig. 3 and Fig. 4. Fig. 3 shows a signal to be processed that contains both a non-noise signal and a noise signal, and Fig. 4 shows the signal obtained after noise reduction is applied to the signal in Fig. 3. Note that in Fig. 3 and Fig. 4 the stable, small-amplitude signal near the axis is the noise signal, and the stable, larger-amplitude signal is the non-noise signal. Fig. 3 and Fig. 4 together show that, while the noise is suppressed, the non-noise signal inside the circle is suppressed as well.

Summary of the invention

The embodiments of the present application provide an audio signal processing method and related apparatus that shorten the duration of abnormal noise reduction without adversely affecting non-noise signals.

In view of this, a first aspect of the application provides an audio signal processing method, including:

obtaining, according to the sample rate and duration of a current-frame audio signal, a first average value of the amplitude of the current-frame audio signal over its sampling points;

obtaining a first exponential smoothing value corresponding to the first average value according to a preset exponential smoothing formula;

calculating a first reference value of the current-frame audio signal according to operational parameters and a preset operational formula, and determining a first audio state of the current-frame audio signal by comparing the first reference value with a preset first threshold, where the operational parameters include the first average value and the first exponential smoothing value, and the first audio state is a pure noise state or a non-pure noise state; and

determining the search window length corresponding to the first audio state according to a preset first correspondence between audio states and search window lengths.

Preferably,

calculating the first reference value of the current-frame audio signal according to the operational parameters and the preset operational formula, and determining the first audio state of the current-frame audio signal by comparing the first reference value with the preset first threshold, specifically includes:

calculating the difference between the first average value and the first exponential smoothing value according to a preset first difference formula, and using the difference as the first reference value of the current-frame audio signal; and

comparing the first reference value with the preset first threshold, and determining that the first audio state of the current-frame audio signal is the non-pure noise state when the first reference value is greater than the first threshold, and the pure noise state otherwise.

Preferably,

before calculating the first reference value of the current-frame audio signal according to the operational parameters and the preset operational formula, the method further includes:

obtaining a second audio state of the previous-frame audio signal, the second audio state being the pure noise state or the non-pure noise state; and

selecting the first threshold corresponding to the second audio state according to a preset second correspondence between audio states and first thresholds, and selecting the operational formula corresponding to the second audio state according to a preset third correspondence between audio states and operational formulas.

Preferably,

the operational parameters further include a second reference value corresponding to the previous-frame audio signal;

when the second audio state is the pure noise state, the corresponding operational formula is [formula not reproduced here], where attack[n] is the first reference value, attack[n-1] is the second reference value, fast[n] is the first average value, slow[n] is the first exponential smoothing value, P and Q are positive integers, K, L, M and Decay are constants, 0 < Decay < 1, and M > L > K; and

the first reference value is compared with the preset first threshold, and when the first reference value is greater than the first threshold, the first audio state of the current-frame audio signal is determined to be the non-pure noise state.

Preferably,

the processing method further includes:

if the first reference value is less than or equal to the first threshold, comparing the first reference value with a preset second threshold;

if the first reference value is greater than the second threshold, calculating the correlation of the current-frame audio signal according to a preset autocorrelation formula, and comparing the correlation with a preset fifth threshold; and

if the correlation is greater than the fifth threshold, determining that the first audio state of the current-frame audio signal is the non-pure noise state, and the pure noise state otherwise.

Preferably,

the operational parameters further include a second average value of the amplitude of the previous-frame audio signal over its sampling points and a second exponential smoothing value corresponding to the second average value;

when the second audio state is the non-pure noise state, the corresponding operational formula includes a first operational formula diff_fast[n] = |fast[n] - fast[n-1]| and a second operational formula diff_slow[n] = |slow[n] - slow[n-1]|, where fast[n] is the first average value, slow[n] is the first exponential smoothing value, fast[n-1] is the second average value, and slow[n-1] is the second exponential smoothing value;

a third reference value diff_fast[n] is calculated according to the first average value, the second average value and the first operational formula;

a fourth reference value diff_slow[n] is calculated according to the first exponential smoothing value, the second exponential smoothing value and the second operational formula;

the third reference value is compared with a third threshold and the fourth reference value is compared with a fourth threshold, and when the third reference value is less than the third threshold and the fourth reference value is less than the fourth threshold, one occurrence of the first reference value being less than the first threshold is recorded, where the first reference value includes the third reference value and the fourth reference value, and the first threshold includes the third threshold and the fourth threshold; and

when the number of consecutive occurrences of the first reference value being less than the first threshold is greater than or equal to a preset number, the first audio state of the current-frame audio signal is determined to be the pure noise state, and the non-pure noise state otherwise.

Preferably,

the exponential smoothing formula is slow[n] = fast[n] * α_s + slow[n-1] * (1 - α_s), where fast[n] is the first average value, slow[n] is the first exponential smoothing value corresponding to the current-frame audio signal, slow[n-1] is the second exponential smoothing value corresponding to the previous-frame audio signal, and α_s is the smoothing factor.

A second aspect of the application provides an audio signal processing apparatus, including:

a first acquisition unit, configured to obtain, according to the sample rate and duration of a current-frame audio signal, a first average value of the amplitude of the current-frame audio signal over its sampling points;

a second acquisition unit, configured to obtain a first exponential smoothing value corresponding to the first average value;

an audio state determining unit, configured to calculate a first reference value of the current-frame audio signal according to operational parameters and a preset operational formula, and to determine a first audio state of the current-frame audio signal by comparing the first reference value with a preset first threshold, where the operational parameters include the first average value and the first exponential smoothing value, and the first audio state is a pure noise state or a non-pure noise state; and

a search window length determining unit, configured to determine the search window length corresponding to the first audio state according to a preset first correspondence between audio states and search window lengths.

Preferably,

the audio state determining unit specifically includes:

a first reference value calculating subunit, configured to calculate the difference between the first average value and the first exponential smoothing value according to a preset first difference formula, and to use the difference as the first reference value of the current-frame audio signal; and

a first determining subunit, configured to compare the first reference value with the preset first threshold, and to determine that the first audio state of the current-frame audio signal is the non-pure noise state when the first reference value is greater than the first threshold, and the pure noise state otherwise.

Preferably,

the processing apparatus further includes:

a third acquisition unit, configured to obtain a second audio state of the previous-frame audio signal, the second audio state being the pure noise state or the non-pure noise state; and

a selecting unit, configured to select the first threshold corresponding to the second audio state according to a preset second correspondence between audio states and first thresholds, and to select the operational formula corresponding to the second audio state according to a preset third correspondence between audio states and operational formulas.

Preferably,

when the second audio state is the pure noise state, the corresponding operational formula is [formula not reproduced here], where attack[n] is the first reference value, attack[n-1] is the second reference value, fast[n] is the first average value, slow[n] is the first exponential smoothing value, P and Q are positive integers, K, L, M and Decay are constants, 0 < Decay < 1, and M > L > K;

the audio state determining unit includes:

a second determining subunit, configured to compare the first reference value with the preset first threshold, and to determine that the first audio state of the current-frame audio signal is the non-pure noise state when the first reference value is greater than the first threshold.

Preferably,

the audio state determining unit further includes:

a first comparing subunit, configured to compare the first reference value with a preset second threshold when the first reference value is less than or equal to the first threshold;

a correlation calculating subunit, configured to calculate, when the first reference value is greater than the second threshold, the correlation of the current-frame audio signal according to a preset autocorrelation formula, and to compare the correlation with a preset fifth threshold; and

a third determining subunit, configured to determine that the first audio state of the current-frame audio signal is the non-pure noise state when the correlation is greater than the fifth threshold, and the pure noise state otherwise.

Preferably,

the audio state determining unit specifically includes:

a second reference value calculating subunit, configured to calculate the third reference value diff_fast[n] according to the first average value, the second average value and the first operational formula;

a third reference value calculating subunit, configured to calculate the fourth reference value diff_slow[n] according to the first exponential smoothing value, the second exponential smoothing value and the second operational formula;

a second comparing subunit, configured to compare the third reference value with the third threshold and the fourth reference value with the fourth threshold, and, when the third reference value is less than the third threshold and the fourth reference value is less than the fourth threshold, to record one occurrence of the first reference value being less than the first threshold, where the first reference value includes the third reference value and the fourth reference value, and the first threshold includes the third threshold and the fourth threshold; and

a fourth determining subunit, configured to determine that the first audio state of the current-frame audio signal is the pure noise state when the number of consecutive occurrences of the first reference value being less than the first threshold is greater than or equal to a preset number, and the non-pure noise state otherwise.

A third aspect of the application provides an audio signal processing device, the device including a processor and a memory:

the memory is configured to store program code and to transmit the program code to the processor; and

the processor is configured to execute, according to instructions in the program code, the steps of the processing method described in the first aspect.

A fourth aspect of the application provides a computer-readable storage medium configured to store program code, the program code being used to execute the method described in the first aspect.

As can be seen from the above technical solutions, the embodiments of the present application have the following advantages:

The embodiments of the present application provide an audio signal processing method, including: obtaining, according to the sample rate and duration of a current-frame audio signal, a first average value of the amplitude of the current-frame audio signal over its sampling points; obtaining a first exponential smoothing value corresponding to the first average value; calculating a first reference value of the current-frame audio signal according to operational parameters and a preset operational formula, and determining a first audio state of the current-frame audio signal by comparing the first reference value with a preset first threshold, where the operational parameters include the first average value and the first exponential smoothing value, and the first audio state is a pure noise state or a non-pure noise state; and determining the search window length corresponding to the first audio state according to a preset first correspondence between audio states and search window lengths.

The first average value and the first exponential smoothing value can be regarded as two envelope signals that move at different speeds, and can therefore be used to predict the audio signal. The embodiments of the present application use this property to calculate the first reference value, use the first reference value to determine the audio state of the current-frame audio signal, and then determine the corresponding search window length according to the audio state. In the MCRA algorithm, if the current-frame audio signal is pure noise, the search window length can be reduced appropriately, which shortens the duration of abnormal noise reduction and improves the denoising effect; if the current-frame audio signal is non-noise, the window length can be increased appropriately to avoid adversely affecting the non-noise signal.

Brief description of the drawings

Fig. 1 shows a segment of a pure noise signal;

Fig. 2 is a schematic diagram of the pure noise signal in Fig. 1 after noise reduction with the existing MCRA algorithm;

Fig. 3 shows a segment of a signal to be processed that contains a non-noise signal and a noise signal;

Fig. 4 is a schematic diagram of the signal to be processed in Fig. 3 after noise reduction with the existing MCRA algorithm;

Fig. 5 is a flow diagram of the first embodiment of the audio signal processing method in the embodiments of the present application;

Fig. 6 is a flow diagram of the second embodiment of the audio signal processing method in the embodiments of the present application;

Fig. 7 is a schematic diagram of the signal obtained after the pure noise signal in Fig. 1 is processed using the processing method in the embodiments of the present application together with the MCRA algorithm;

Fig. 8 is a schematic diagram of the signal obtained after the non-pure noise signal in Fig. 3 is processed using the processing method in the embodiments of the present application together with the MCRA algorithm;

Fig. 9 is a structural diagram of the first embodiment of the audio signal processing apparatus in the embodiments of the present application;

Fig. 10 is a structural diagram of the audio signal processing device in the embodiments of the present application.

Detailed description of the embodiments

To help those skilled in the art better understand the solution of the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the application. All other embodiments obtained by those of ordinary skill in the art based on these embodiments without creative effort shall fall within the protection scope of this application.

The inventor found in research that, when noise is estimated with the existing MCRA algorithm, the noise is not suppressed normally for a period after the noise signal suddenly increases, and noise reduction returns to normal only after several seconds. Reducing the search window length shortens the duration of this abnormal noise reduction and improves the denoising effect, but because the search window length is fixed, it also adversely affects non-noise signals.

The search window length therefore needs to be adjusted adaptively: when the audio signal is a pure noise signal, the search window length can be shortened appropriately, and when the audio signal is a non-pure noise signal, the search window length can be increased appropriately.

Accordingly, the embodiments of the present application provide an audio signal processing method in which the search window length corresponding to each audio state is preset, the audio state of the audio signal is then determined, and the corresponding search window length is selected according to the result, so that the duration of abnormal noise reduction is shortened without adversely affecting non-noise signals.

For ease of understanding, refer to Fig. 5, which is a flow diagram of the first embodiment of the audio signal processing method in the embodiments of the present application.

This application provides a first embodiment of an audio signal processing method, including:

Step 501: obtain, according to the sample rate and duration of the current-frame audio signal, a first average value of the amplitude of the current-frame audio signal over its sampling points.

It can be understood that, for one frame of audio to be processed, the sample rate and duration are fixed. For example, the sample rate of the current-frame audio signal may be 32000 Hz and its duration 20 ms, in which case the frame contains 32000 × 0.02 = 640 sampling points.
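Step 501 can be sketched as follows. This is a minimal illustration that assumes the "amplitude" of a sampling point is the absolute value of the sample and that the first average value is the mean of those absolute values over the frame; the function names are hypothetical.

```python
def samples_per_frame(sample_rate_hz, duration_s):
    """Number of sampling points in one frame."""
    return int(round(sample_rate_hz * duration_s))


def frame_average_amplitude(frame):
    """First average value fast[n]: mean absolute amplitude of the frame.

    Using the absolute value as the amplitude is an assumption.
    """
    if not frame:
        raise ValueError("frame must contain at least one sample")
    return sum(abs(sample) for sample in frame) / len(frame)


# A 20 ms frame at 32000 Hz holds 640 sampling points.
n_samples = samples_per_frame(32000, 0.02)
fast_n = frame_average_amplitude([1.0, -1.0, 0.5, -0.5])
print(n_samples, fast_n)
```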

Step 502: obtain the first exponential smoothing value corresponding to the first average value according to a preset exponential smoothing formula.

Because exponential smoothing includes single exponential smoothing, double exponential smoothing and other higher-order forms, there are also several corresponding exponential smoothing formulas. For example, the exponential smoothing formula may be slow[n] = fast[n] * α_s + slow[n-1] * (1 - α_s), where fast[n] is the first average value, slow[n] is the first exponential smoothing value corresponding to the current-frame audio signal, slow[n-1] is the second exponential smoothing value corresponding to the previous-frame audio signal, and α_s is the smoothing factor.

It can be understood that the first average value is determined by the sample rate and duration of the current-frame audio signal, whereas the first exponential smoothing value is determined jointly by the second exponential smoothing value of the previous-frame audio signal, the first average value, and the smoothing factor. The smoothing factor determines how quickly the first exponential smoothing value tracks the first average value.
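The single exponential smoothing formula given above can be written directly in code. The sketch below follows slow[n] = fast[n] * α_s + slow[n-1] * (1 - α_s); the function names, the initial value of 0.0, and the example smoothing factors are assumptions chosen only for illustration.

```python
def exp_smooth(fast_n, slow_prev, alpha_s):
    """slow[n] = fast[n] * alpha_s + slow[n-1] * (1 - alpha_s)."""
    if not 0.0 < alpha_s <= 1.0:
        raise ValueError("alpha_s is assumed to lie in (0, 1]")
    return fast_n * alpha_s + slow_prev * (1.0 - alpha_s)


def track(fast_values, alpha_s, slow_init=0.0):
    """Apply the smoothing formula frame by frame.

    `slow_init` is a hypothetical starting value for slow[-1].
    """
    slow = slow_init
    smoothed = []
    for fast_n in fast_values:
        slow = exp_smooth(fast_n, slow, alpha_s)
        smoothed.append(slow)
    return smoothed


# A step in fast[n] from 0 to 1 is followed more quickly with a larger factor.
slow_half = track([1.0, 1.0, 1.0], 0.5)
slow_fast = track([1.0, 1.0, 1.0], 0.9)
slow_lazy = track([1.0, 1.0, 1.0], 0.1)
print(slow_half)
```

A larger α_s makes slow[n] follow fast[n] more closely, which is the tracking-speed behavior described above.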

Step 503: calculate the first reference value of the current-frame audio signal according to the operational parameters and the preset operational formula, and determine the first audio state of the current-frame audio signal by comparing the first reference value with the preset first threshold, where the operational parameters include the first average value and the first exponential smoothing value, and the first audio state is the pure noise state or the non-pure noise state.

It should be noted that, because the first average value and the first exponential smoothing value change at different rates, the embodiments of the present application use the difference between them and their trends to determine the first audio state of the current audio signal. There are many specific determination methods, and therefore many possible operational formulas; no limitation is imposed here.

Step 504: determine the search window length corresponding to the first audio state according to the preset first correspondence between audio states and search window lengths.

It can be understood that different audio states have different preset search window lengths. For example, when the first audio state is the pure noise state the corresponding search window length may be 1 ms, and when the first audio state is the non-pure noise state the corresponding search window length may be 0.5 ms.

In the embodiments of the present application, the first reference value is calculated from the first average value and the first exponential smoothing value, the audio state of the current-frame audio signal is determined from the first reference value, and the corresponding search window length is determined from the audio state. In the MCRA algorithm, if the current-frame audio signal is pure noise, the search window length can be reduced appropriately, which shortens the duration of abnormal noise reduction and improves the denoising effect; if the current-frame audio signal is non-noise, the window length can be increased appropriately to avoid adversely affecting the non-noise signal.
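The preset first correspondence in step 504 amounts to a lookup from audio state to search window length. The sketch below is one possible shape for that lookup; the enum, the table name, and the window lengths (expressed here in frames) are hypothetical placeholders, chosen only so that the pure noise state maps to the shorter window in line with the rationale in the preceding paragraph.

```python
from enum import Enum


class AudioState(Enum):
    PURE_NOISE = "pure_noise"
    NON_PURE_NOISE = "non_pure_noise"


# Hypothetical preset correspondence; the values are placeholders.
SEARCH_WINDOW_FRAMES = {
    AudioState.PURE_NOISE: 25,
    AudioState.NON_PURE_NOISE: 100,
}


def search_window_for(state):
    """Look up the preset search window length for an audio state."""
    return SEARCH_WINDOW_FRAMES[state]


print(search_window_for(AudioState.PURE_NOISE))
```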

Further, step 503 may specifically include:

calculating the difference between the first average value and the first exponential smoothing value according to a preset first difference formula, and using the difference as the first reference value of the current-frame audio signal; and

comparing the first reference value with the preset first threshold, and determining that the first audio state of the current-frame audio signal is the non-pure noise state when the first reference value is greater than the first threshold, and the pure noise state otherwise.

It can be understood that, in exponential smoothing, the first average value can be regarded as the actual value and the first exponential smoothing value as the predicted value. When the difference between the actual value and the predicted value is large enough, the first audio state of the current-frame audio signal can be determined to be the non-pure noise state.
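The difference-based decision can be sketched in a few lines. The sketch assumes the first difference formula is the signed difference fast[n] - slow[n]; the function name, the state labels, and the threshold value are hypothetical.

```python
def classify_by_difference(fast_n, slow_n, first_threshold):
    """Compare the first reference value (fast[n] - slow[n]) with the threshold.

    Using the signed difference as the first difference formula is an assumption.
    """
    reference = fast_n - slow_n
    if reference > first_threshold:
        return "non_pure_noise"
    return "pure_noise"


# Hypothetical threshold of 0.3.
state_jump = classify_by_difference(0.8, 0.2, 0.3)
state_flat = classify_by_difference(0.25, 0.2, 0.3)
print(state_jump, state_flat)
```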

For ease of understanding, refer to Fig. 6, which is a flow diagram of the second embodiment of the audio signal processing method in the embodiments of the present application.

This application provides a second embodiment of an audio signal processing method, including:

Step 601: obtain, according to the sample rate and duration of the current-frame audio signal, a first average value of the amplitude of the current-frame audio signal over its sampling points.

It should be noted that step 601 is the same as step 501 of the first embodiment; for details, refer to step 501 of the first embodiment, which is not repeated here.

Step 602: obtain the first exponential smoothing value corresponding to the first average value according to a preset exponential smoothing formula.

It should be noted that step 602 is the same as step 502 of the first embodiment; for details, refer to step 502 of the first embodiment, which is not repeated here.

Step 603: obtain the second audio state of the previous-frame audio signal, the second audio state being the pure noise state or the non-pure noise state.

It should be noted that, in this embodiment, the order of step 603 and step 601 is not limited.

Step 604: select the first threshold corresponding to the second audio state according to a preset second correspondence between audio states and first thresholds, or select the operational formula corresponding to the second audio state according to a preset third correspondence between audio states and operational formulas.

It should be noted that adjacent frames of audio are usually strongly correlated, so determining the audio state of the current-frame audio signal needs to take the audio state of the previous-frame audio signal into account.

For example, when the same frame of audio is placed after an audio signal in the pure noise state and after an audio signal in the non-pure noise state, the audio state finally determined may differ. To retain useful non-pure-noise audio as far as possible, the same frame should be more likely to be judged as being in the non-pure noise state when it follows an audio signal in the non-pure noise state than when it follows an audio signal in the pure noise state.

Therefore, in the embodiments of the present application, the operational formula and threshold can be selected according to the second audio state, so as to avoid, as far as possible, removing non-pure-noise audio because of a misjudgment.

It should further be noted that different operational formulas may operate on different operational parameters, and those parameters may relate to the previous-frame audio signal. Some operational formulas are therefore better suited to the case where the previous-frame audio signal is in the pure noise state, and others to the case where it is in the non-pure noise state.

Accordingly, by selecting the operational formula according to the second audio state, the embodiments of the present application can exploit the strengths of each operational formula and determine the audio state accurately.
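The selection in step 604 can be pictured as a table keyed by the previous frame's audio state. The sketch below is a hypothetical illustration only: it uses the same stand-in formula (fast[n] - slow[n]) for both states and varies just the threshold, with a lower threshold after a non-pure-noise frame so that the same frame is more likely to be judged non-pure noise. The names and threshold values are assumptions and are not taken from this application.

```python
from typing import Callable, Dict, NamedTuple


class Rule(NamedTuple):
    threshold: float
    formula: Callable[[float, float], float]


def difference(fast_n, slow_n):
    """Stand-in operational formula (assumption): fast[n] - slow[n]."""
    return fast_n - slow_n


# Hypothetical correspondences between the previous state and (threshold, formula).
RULES: Dict[str, Rule] = {
    "pure_noise": Rule(threshold=0.3, formula=difference),
    "non_pure_noise": Rule(threshold=0.1, formula=difference),
}


def classify(prev_state, fast_n, slow_n):
    """Select the rule for the previous state, then apply it to the current frame."""
    rule = RULES[prev_state]
    reference = rule.formula(fast_n, slow_n)
    return "non_pure_noise" if reference > rule.threshold else "pure_noise"


# The same frame is judged differently depending on the previous state.
after_noise = classify("pure_noise", 0.5, 0.3)
after_signal = classify("non_pure_noise", 0.5, 0.3)
print(after_noise, after_signal)
```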

Further, in the embodiment of the present application, operational parameter can also include previous frame audio signal corresponding second Reference value；When the second audio status is pure noise states, corresponding operational formula can beWherein attack is the first reference value, Attack [n-1] is the second reference value, and fast [n] is the first average value, and slow [n] is the first exponential smoothing value, and P and Q are positive Integer, K, L, M and Decay are constant, and 0<Decay<1, and M>L>K.

It is understood that K, L, M and Decay can be adjusted according to actual needs, for example, Decay take 0.9, M, L, K takes 0,0.1,0.5 respectively, can also pass throughThe value of attack [n] is limited System is between 0 to 1.

So, processing method provided by the embodiments of the present application further includes：

Step 605, the size for comparing the first reference value Yu preset first threshold, when the first reference value is greater than first threshold When, determine that the first audio status of current frame voice frequency signal is non-pure noise states.

Step 606: if the first reference value is less than or equal to the first threshold, compare the first reference value with a preset second threshold.

Step 607: if the first reference value is greater than the second threshold, calculate the correlation of the current-frame audio signal according to a preset autocorrelation formula, and compare the correlation with a preset fifth threshold.

The correlation formula (not reproduced in this text) is defined over s_n[i], where s_n[i] is the audio signal to be processed, i is the sampling point index, n is the sequence number of the audio signal to be processed, SR is the sample rate and T is the duration. In this correlation formula the step size is 1; the step size can also be set to 2 or another positive integer according to actual needs.

Step 608: if the correlation is greater than the fifth threshold, determine that the first audio status of the current-frame audio signal is a non-pure noise state; otherwise, determine that the first audio status of the current-frame audio signal is a pure noise state.

It is understood that in practical applications, current frame voice frequency signal may be the very faint signal of sound, i.e. amplitude Very little, at this point, if only through the first reference value compared with first threshold, it may not be possible to accurately determine current audio signals For non-pure noise states；So the application is real in order to as much as possible determine the current audio signals of non-pure noise states out It applies example and sets second threshold, when the first reference value is no more than first threshold but is greater than second threshold, meter can be passed through Correlation is calculated to determine the first audio status of current audio signals.

Further, the operational parameters may also include a second average value of the amplitude of the previous-frame audio signal at each sampling point and a second exponential smoothing value corresponding to the second average value. When the second audio status is a non-pure noise state, the corresponding operational formulas may include a first operational formula diff_fast[n] = |fast[n] - fast[n-1]| and a second operational formula diff_slow[n] = |slow[n] - slow[n-1]|, where fast[n] is the first average value, slow[n] is the first exponential smoothing value, fast[n-1] is the second average value, and slow[n-1] is the second exponential smoothing value.

Step 609: calculate a third reference value diff_fast[n] according to the first average value, the second average value and the first operational formula.

Step 610: calculate a fourth reference value diff_slow[n] according to the first exponential smoothing value, the second exponential smoothing value and the second operational formula.

Step 611: compare the third reference value with a third threshold and the fourth reference value with a fourth threshold; when the third reference value is less than the third threshold and the fourth reference value is less than the fourth threshold, record one occurrence of the first reference value being less than the first threshold. Here the first reference value comprises the third reference value and the fourth reference value, and the first threshold comprises the third threshold and the fourth threshold.

Step 612: when the number of consecutive times that the first reference value is less than the first threshold is greater than or equal to a preset number of times, determine that the first audio status of the current-frame audio signal is a pure noise state; otherwise, determine that the first audio status of the current-frame audio signal is a non-pure noise state.

It should be noted that the preset number of times can be set according to actual needs, for example to 1. To prevent a current-frame audio signal that contains a useful signal from being wrongly determined to be in a pure noise state, the preset number of times can be increased appropriately, for example to 2 or 3.
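Steps 609 to 612 can be sketched as a small stateful counter. This is a minimal sketch, not a definitive implementation: the class name, field names and threshold values are hypothetical, and resetting the counter when a frame fails the test is an assumption drawn from the word "consecutive".

```python
from dataclasses import dataclass

PURE_NOISE = "pure_noise"
NON_PURE_NOISE = "non_pure_noise"


@dataclass
class StableFrameCounter:
    third_threshold: float
    fourth_threshold: float
    preset_times: int = 2
    consecutive: int = 0

    def update(self, fast_n: float, fast_prev: float,
               slow_n: float, slow_prev: float) -> str:
        diff_fast = abs(fast_n - fast_prev)  # step 609
        diff_slow = abs(slow_n - slow_prev)  # step 610
        # Step 611: both differences must stay below their thresholds.
        if diff_fast < self.third_threshold and diff_slow < self.fourth_threshold:
            self.consecutive += 1
        else:
            # Assumption: a frame that fails the test breaks the run.
            self.consecutive = 0
        # Step 612: enough consecutive stable frames means pure noise.
        if self.consecutive >= self.preset_times:
            return PURE_NOISE
        return NON_PURE_NOISE
```

With `preset_times=2`, a single stable frame is not enough to switch to the pure noise state, which matches the stated motivation for choosing a value larger than 1.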

Step 613: determine the search window length corresponding to the first audio status according to a preset first correspondence between audio status and search window length.

It should be noted that step 613 is the same as step 504 of the first embodiment of the present application; for details, refer to step 504 in the first embodiment, which is not repeated here.

To illustrate the effect of the audio signal processing method in the embodiments of the present application, refer to Fig. 7 and Fig. 8. Fig. 7 is a schematic diagram of the signal obtained after the pure noise signal in Fig. 1 is processed with the processing method of the embodiments of the present application together with the MCRA algorithm, and Fig. 8 is a schematic diagram of the signal obtained after the non-pure noise signal in Fig. 3 is processed in the same way. Comparing Fig. 7 with Fig. 2, and Fig. 8 with Fig. 4, shows clearly that the embodiments of the present application shorten the duration of the noise-reduction anomaly while effectively avoiding adverse effects on the non-noise signal.

Refer to Fig. 9, which is a structural schematic diagram of the first embodiment of the audio signal processing apparatus in the embodiments of the present application.

The present application provides a first embodiment of an audio signal processing apparatus, including:

a first acquisition unit 901, configured to obtain, according to the sample rate and duration of the current-frame audio signal, a first average value of the amplitude of the current-frame audio signal at each sampling point;

a second acquisition unit 902, configured to obtain a first exponential smoothing value corresponding to the first average value;

an audio status judging unit 903, configured to calculate a first reference value of the current-frame audio signal according to operational parameters and a preset operational formula, and to determine a first audio status of the current-frame audio signal by comparing the first reference value with a preset first threshold, where the operational parameters include the first average value and the first exponential smoothing value, and the first audio status is a pure noise state or a non-pure noise state;

a search window length determination unit 904, configured to determine the search window length corresponding to the first audio status according to a preset first correspondence between audio status and search window length.
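The work of the first and second acquisition units can be sketched as follows. This is a sketch under stated assumptions: the function names are hypothetical, and taking the mean of the absolute amplitude over the SR * T samples of the frame is an assumption, since the averaging formula is not given in this part of the description. The smoothing step follows the exponential smoothing formula stated in the claims, slow[n] = fast[n]*α_s + slow[n-1]*(1-α_s).

```python
from typing import Sequence


def frame_average(samples: Sequence[float], sample_rate: int, duration: float) -> float:
    """fast[n]: mean absolute amplitude over the SR * T samples of a frame.

    Assumption: the average is taken over absolute amplitudes.
    """
    count = int(round(sample_rate * duration))
    if count <= 0 or len(samples) < count:
        raise ValueError("frame shorter than sample_rate * duration")
    return sum(abs(s) for s in samples[:count]) / count


def exponential_smoothing(fast_n: float, slow_prev: float, alpha_s: float) -> float:
    """slow[n] = fast[n] * alpha_s + slow[n-1] * (1 - alpha_s)."""
    return fast_n * alpha_s + slow_prev * (1.0 - alpha_s)
```

A small smoothing factor makes slow[n] follow the long-term level, so a sudden rise of fast[n] above slow[n] stands out in the later comparison.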

Further, the audio status judging unit 903 may specifically include:

a first reference value calculation subunit, configured to calculate the difference between the first average value and the first exponential smoothing value according to a preset first difference formula, and to take the difference as the first reference value of the current-frame audio signal;

a first determination subunit, configured to compare the first reference value with the preset first threshold, to determine that the first audio status of the current-frame audio signal is a non-pure noise state when the first reference value is greater than the first threshold, and otherwise to determine that the first audio status of the current-frame audio signal is a pure noise state.
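The two subunits above can be sketched as one small function. This is a minimal illustration only: the function name is hypothetical, and using the signed difference fast[n] - slow[n] is an assumption, since the first difference formula is not given in this part of the description.

```python
def classify_by_difference(fast_n: float, slow_n: float, first_threshold: float) -> str:
    """Compare the difference of fast[n] and slow[n] with the first threshold."""
    # Assumption: the first reference value is the signed difference.
    reference = fast_n - slow_n
    # Strictly greater than the threshold means non-pure noise;
    # every other case, including equality, is treated as pure noise.
    return "non_pure_noise" if reference > first_threshold else "pure_noise"
```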

Further, the audio signal processing apparatus may also include:

a third acquisition unit, configured to obtain a second audio status of the previous-frame audio signal, where the second audio status is a pure noise state or a non-pure noise state;

a selection unit, configured to select the first threshold corresponding to the second audio status according to a preset second correspondence between audio status and first threshold, and to select the operational formula corresponding to the second audio status according to a preset third correspondence between audio status and operational formula.
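The selection unit can be sketched as two lookup tables keyed by the audio status of the previous frame. All names and threshold values below are hypothetical and serve only to illustrate the idea of the second and third correspondences; formulas are represented by labels rather than implementations.

```python
from typing import Dict, NamedTuple, Tuple

PURE_NOISE = "pure_noise"
NON_PURE_NOISE = "non_pure_noise"


class Selection(NamedTuple):
    thresholds: Tuple[float, ...]  # one value, or (third, fourth) thresholds
    formula: str                   # label of the operational formula to use


# Hypothetical second correspondence: audio status -> first threshold.
SECOND_CORRESPONDENCE: Dict[str, Tuple[float, ...]] = {
    PURE_NOISE: (0.6,),
    NON_PURE_NOISE: (0.05, 0.05),
}

# Hypothetical third correspondence: audio status -> operational formula.
THIRD_CORRESPONDENCE: Dict[str, str] = {
    PURE_NOISE: "attack",
    NON_PURE_NOISE: "diff_fast_and_diff_slow",
}


def select_for_previous_status(second_audio_status: str) -> Selection:
    """Pick the threshold and formula that match the previous frame's status."""
    return Selection(
        SECOND_CORRESPONDENCE[second_audio_status],
        THIRD_CORRESPONDENCE[second_audio_status],
    )
```

Keeping both correspondences as data rather than branching logic makes it straightforward to tune thresholds per status without touching the decision code.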

Further, the operational parameters also include a second reference value corresponding to the previous-frame audio signal;

when the second audio status is a pure noise state, the corresponding operational formula (not reproduced in this text) gives attack[n], where attack[n] is the first reference value, attack[n-1] is the second reference value, fast[n] is the first average value, slow[n] is the first exponential smoothing value, P and Q are positive integers, and K, L, M and Decay are constants with 0 < Decay < 1 and M > L > K;

the audio status judging unit 903 may include:

a second determination subunit, configured to compare the first reference value with the preset first threshold and, when the first reference value is greater than the first threshold, to determine that the first audio status of the current-frame audio signal is a non-pure noise state.

Further, the audio status judging unit 903 may also include:

a first comparison subunit, configured to compare the first reference value with a preset second threshold when the first reference value is less than or equal to the first threshold;

a correlation calculation subunit, configured to calculate, when the first reference value is greater than the second threshold, the correlation of the current-frame audio signal according to a preset autocorrelation formula, and to compare the correlation with a preset fifth threshold;

a third determination subunit, configured to determine that the first audio status of the current-frame audio signal is a non-pure noise state when the correlation is greater than the fifth threshold, and otherwise to determine that the first audio status of the current-frame audio signal is a pure noise state.

Further, the operational parameters may also include a second average value of the amplitude of the previous-frame audio signal at each sampling point and a second exponential smoothing value corresponding to the second average value;

when the second audio status is a non-pure noise state, the corresponding operational formulas include a first operational formula diff_fast[n] = |fast[n] - fast[n-1]| and a second operational formula diff_slow[n] = |slow[n] - slow[n-1]|, where fast[n] is the first average value, slow[n] is the first exponential smoothing value, fast[n-1] is the second average value, and slow[n-1] is the second exponential smoothing value;

the audio status judging unit 903 may specifically include:

a second reference value calculation subunit, configured to calculate a third reference value diff_fast[n] according to the first average value, the second average value and the first operational formula;

a third reference value calculation subunit, configured to calculate a fourth reference value diff_slow[n] according to the first exponential smoothing value, the second exponential smoothing value and the second operational formula;

a second comparison subunit, configured to compare the third reference value with a third threshold and the fourth reference value with a fourth threshold and, when the third reference value is less than the third threshold and the fourth reference value is less than the fourth threshold, to record one occurrence of the first reference value being less than the first threshold, where the first reference value comprises the third reference value and the fourth reference value, and the first threshold comprises the third threshold and the fourth threshold;

a fourth determination subunit, configured to determine that the first audio status of the current-frame audio signal is a pure noise state when the number of consecutive times that the first reference value is less than the first threshold is greater than or equal to a preset number of times, and otherwise to determine that the first audio status of the current-frame audio signal is a non-pure noise state.

Refer to Fig. 10, which is a structural schematic diagram of the audio signal processing device in the embodiments of the present application.

The embodiments of the present application also provide an audio signal processing device; the device includes a processor 101 and a memory 102:

the memory 102 is configured to store program code and to transfer the program code to the processor 101;

the processor 101 is configured to execute, according to instructions in the program code, any one of the embodiments of the audio signal processing method described in the foregoing embodiments.

The embodiments of the present application also provide a computer-readable storage medium for storing program code, where the program code is used to execute any one of the embodiments of the audio signal processing method described in the foregoing embodiments.

It is clear to those skilled in the art that, for convenience and brevity of description, for the specific working processes of the system, apparatus and units described above, reference may be made to the corresponding processes in the foregoing method embodiments; details are not repeated here.

The terms "first", "second", "third", "fourth" and the like (if any) in the description of the present application and in the above drawings are used to distinguish similar objects, and are not necessarily used to describe a particular order or sequence. It should be understood that data so used are interchangeable under appropriate circumstances, so that the embodiments of the present application described herein can, for example, be implemented in an order other than those illustrated or described herein. In addition, the terms "include" and "have" and any variants thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product or device that contains a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to such a process, method, product or device.

It should be understood that, in the present application, "at least one (item)" means one or more, and "a plurality of" means two or more. "And/or" describes an association between associated objects and indicates that three relationships may exist; for example, "A and/or B" may indicate three cases: only A exists, only B exists, and both A and B exist, where A and B may be singular or plural. The character "/" generally indicates an "or" relationship between the associated objects before and after it. "At least one of the following (items)" or a similar expression refers to any combination of these items, including any combination of a single item or plural items. For example, at least one (item) of a, b or c may indicate: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b and c may each be single or multiple.

In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; the division into units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses or units, and may be electrical, mechanical or in other forms.

The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.

If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device or the like) to execute all or some of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.

The above embodiments are intended only to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims

1. A method for processing an audio signal, characterized by comprising:

obtaining, according to the sample rate and duration of a current-frame audio signal, a first average value of the amplitude of the current-frame audio signal at each sampling point;

obtaining a first exponential smoothing value corresponding to the first average value;

calculating a first reference value of the current-frame audio signal according to operational parameters and a preset operational formula, and determining a first audio status of the current-frame audio signal by comparing the first reference value with a preset first threshold, wherein the operational parameters include the first average value and the first exponential smoothing value, and the first audio status is a pure noise state or a non-pure noise state;

determining the search window length corresponding to the first audio status according to a preset first correspondence between audio status and search window length.

2. The processing method according to claim 1, characterized in that calculating the first reference value of the current-frame audio signal according to the operational parameters and the preset operational formula, and determining the first audio status of the current-frame audio signal by comparing the first reference value with the preset first threshold, specifically comprises:

calculating the difference between the first average value and the first exponential smoothing value according to a preset first difference formula, and taking the difference as the first reference value of the current-frame audio signal;

comparing the first reference value with the preset first threshold; when the first reference value is greater than the first threshold, determining that the first audio status of the current-frame audio signal is a non-pure noise state, and otherwise determining that the first audio status of the current-frame audio signal is a pure noise state.

3. The processing method according to claim 1, characterized in that, before calculating the first reference value of the current-frame audio signal according to the operational parameters and the preset operational formula, the method further comprises:

obtaining a second audio status of a previous-frame audio signal, wherein the second audio status is a pure noise state or a non-pure noise state;

selecting the first threshold corresponding to the second audio status according to a preset second correspondence between audio status and first threshold, and selecting the operational formula corresponding to the second audio status according to a preset third correspondence between audio status and operational formula.

4. The processing method according to claim 3, characterized in that the operational parameters further include a second reference value corresponding to the previous-frame audio signal;

calculating the first reference value of the current-frame audio signal according to the operational parameters and the preset operational formula, and determining the first audio status of the current-frame audio signal by comparing the first reference value with the preset first threshold, specifically comprises:

comparing the first reference value with the preset first threshold, and when the first reference value is greater than the first threshold, determining that the first audio status of the current-frame audio signal is a non-pure noise state.

5. The processing method according to claim 4, characterized by further comprising:

if the first reference value is less than or equal to the first threshold, comparing the first reference value with a preset second threshold;

if the first reference value is greater than the second threshold, calculating the correlation of the current-frame audio signal according to a preset autocorrelation formula, and comparing the correlation with a preset fifth threshold;

if the correlation is greater than the fifth threshold, determining that the first audio status of the current-frame audio signal is a non-pure noise state, and otherwise determining that the first audio status of the current-frame audio signal is a pure noise state.

6. The processing method according to claim 3, characterized in that the operational parameters further include a second average value of the amplitude of the previous-frame audio signal at each sampling point and a second exponential smoothing value corresponding to the second average value;

calculating a third reference value diff_fast[n] according to the first average value, the second average value and the first operational formula;

calculating a fourth reference value diff_slow[n] according to the first exponential smoothing value, the second exponential smoothing value and the second operational formula;

comparing the third reference value with the third threshold and the fourth reference value with the fourth threshold, and when the third reference value is less than the third threshold and the fourth reference value is less than the fourth threshold, recording one occurrence of the first reference value being less than the first threshold, wherein the first reference value comprises the third reference value and the fourth reference value, and the first threshold comprises the third threshold and the fourth threshold;

when the number of consecutive times that the first reference value is less than the first threshold is greater than or equal to a preset number of times, determining that the first audio status of the current-frame audio signal is a pure noise state, and otherwise determining that the first audio status of the current-frame audio signal is a non-pure noise state.

7. The processing method according to claim 1, characterized in that the exponential smoothing formula is slow[n] = fast[n]*α_s + slow[n-1]*(1-α_s), where fast[n] is the first average value, slow[n] is the first exponential smoothing value corresponding to the current-frame audio signal, slow[n-1] is the second exponential smoothing value corresponding to the previous-frame audio signal, and α_s is the smoothing factor.

8. An apparatus for processing an audio signal, characterized by comprising:

a first acquisition unit, configured to obtain, according to the sample rate and duration of a current-frame audio signal, a first average value of the amplitude of the current-frame audio signal at each sampling point;

an audio status judging unit, configured to calculate a first reference value of the current-frame audio signal according to operational parameters and a preset operational formula, and to determine a first audio status of the current-frame audio signal by comparing the first reference value with a preset first threshold, wherein the operational parameters include the first average value and a first exponential smoothing value, and the first audio status is a pure noise state or a non-pure noise state;

a search window length determination unit, configured to determine the search window length corresponding to the first audio status according to a preset first correspondence between audio status and search window length.

9. A device for processing an audio signal, characterized in that the device includes a processor and a memory:

the processor is configured to execute, according to instructions in the program code, the method for processing an audio signal according to any one of claims 1 to 7.

10. A computer-readable storage medium, characterized in that the computer-readable storage medium is used to store program code, and the program code is used to execute the method for processing an audio signal according to any one of claims 1 to 7.