CN111986694A

CN111986694A - Audio processing method, device, equipment and medium based on transient noise suppression

Info

Publication number: CN111986694A
Application number: CN202010905336.8A
Authority: CN
Inventors: 付姝华; 汪斌
Original assignee: Ping An Technology Shenzhen Co Ltd
Current assignee: Ping An Technology Shenzhen Co Ltd
Priority date: 2020-09-01
Filing date: 2020-09-01
Publication date: 2020-11-24
Anticipated expiration: 2040-09-01
Also published as: CN111986694B; WO2021143249A1

Abstract

The present invention relates to the field of artificial intelligence technologies, and in particular to an audio processing method, apparatus, device, and storage medium based on transient noise suppression. The audio processing method calculates a characteristic value from the maximum value and the frame energy value of the current frame audio signal, calculates a width value from the smoothed value of the characteristic value and a width gain factor, and judges whether the current frame audio signal contains transient noise based on the characteristic value and the width value. When the judgment result is yes, the method obtains the peak value of the maximum values of a specified number of frames before the current frame, and calculates the suppression gain from the peak value, the maximum value of the current frame audio signal, and an expansion factor.

Description

Audio processing method, device, equipment and medium based on transient noise suppression

[ technical field ]

The present invention relates to the field of audio processing technologies, and in particular, to an audio processing method, apparatus, device, and medium based on transient noise suppression.

[ background of the invention ]

Noise suppression is widely applied in call scenarios such as telephone calls, VoIP calls, mobile audio and video calls, and office conference equipment. It is a typical application of audio pre-processing and post-processing, and it largely determines the performance of a call product. Noise suppression has to handle many different call scenarios and many different noise sources. Prior-art noise suppression methods, typically the noise suppression algorithm in Google WebRTC, work well on stationary noise but poorly on non-stationary noise, and have almost no suppression effect on transient noise in particular. Transient noises include keyboard tapping, knocks and bumps against a mobile phone, and the rubbing of clothes against a mobile phone. When unsuppressed transient noise enters a call, the call experience is very poor, and the noise may further be amplified by AGC in subsequent processing, producing harsh, sharp sounds that can damage hearing. Prior-art noise suppression based on artificial intelligence has some effect on non-stationary noise and transient noise, but it damages the voice considerably and causes serious audio clipping.

Therefore, there is a need to provide a new audio processing method based on transient noise suppression.

[ summary of the invention ]

The invention aims to provide an audio processing method, apparatus, device, and medium based on transient noise suppression, so as to solve the technical problems in the prior art that transient noise is poorly suppressed and that transient noise suppression damages the audio.

The technical scheme of the invention is as follows: provided is an audio processing method based on transient noise suppression, comprising the following steps:

acquiring a maximum value and a frame energy value of a current frame audio signal, and acquiring a characteristic value of the current frame audio signal according to the maximum value and the frame energy value;

calculating a smooth value of the characteristic value of the current frame audio signal according to the characteristic value and a preset first smoothing factor, and calculating a width value of the current frame audio signal according to the smooth value of the characteristic value and a preset width gain factor;

judging whether the audio signal of the current frame contains transient noise or not based on the characteristic value and the width value, if so, acquiring a peak value of a maximum value of a specified number of frames before the current frame, and calculating the suppression gain of the audio signal of the current frame according to the peak value, the maximum value of the audio signal of the current frame and a preset expansion factor;

applying the suppression gain to the current frame audio signal to obtain a transient noise suppressed audio output signal.

Preferably, the characteristic value is a ratio of the maximum value and the frame energy value.

Preferably, the width value is the product of the smoothed value of the feature value of the current frame audio signal and the width gain factor, and the width gain factor is between 1 and 2.

Preferably, when the feature value is greater than the width value, it is determined that the current frame audio signal contains transient noise;

the suppression gain is the product of the peak value and the expansion factor, divided by the maximum value of the current frame audio signal.

Preferably, before applying the suppression gain to the current frame audio signal to obtain the transient noise suppressed audio output signal, the method further includes:

and when the characteristic value is smaller than or equal to the width value, judging that the current frame audio signal does not contain transient noise, and taking a first preset value as a suppression gain.

Preferably, the calculating a smoothing value of the feature value of the current frame audio signal according to the feature value and a preset first smoothing factor includes:

obtaining a smooth value of a characteristic value of the previous frame of audio signal;

calculating a first product of the characteristic value of the current frame audio signal and the first smoothing factor;

calculating a difference value of 1 minus the first smoothing factor and a second product of a smoothed value of the feature value of the previous frame of the audio signal and the difference value;

accumulating the first product and the second product to obtain a smooth value of the characteristic value of the current frame audio signal;

the audio processing method based on transient noise suppression further comprises the following steps:

uploading the feature value and the width value of the current frame audio signal to a blockchain, so that the blockchain stores the feature value and the width value in encrypted form.

Preferably, the applying the suppression gain to the current frame audio signal to obtain a transient noise suppressed audio output signal includes:

calculating a smooth value of the suppression gain of the current frame audio signal according to the suppression gain of the previous frame audio signal, the suppression gain of the current frame audio signal and a preset second smoothing factor;

and multiplying the current frame audio signal by the smooth value of the suppression gain of the current frame audio signal to obtain the transient noise suppressed audio output signal.

Preferably, the multiplying the current frame audio signal by the smooth value of the suppression gain of the current frame audio signal to obtain the transient noise suppressed audio output signal includes:

and respectively multiplying the amplitude value of each sampling point of the current frame audio signal by the smooth value of the suppression gain of the current frame audio signal to obtain the transient noise suppressed audio output signal.

The other technical scheme of the invention is as follows: there is provided an audio processing apparatus based on transient noise suppression, the apparatus comprising:

the transient noise tracking module is used for acquiring the maximum value and the frame energy value of the current frame audio signal and acquiring the characteristic value of the current frame audio signal according to the maximum value and the frame energy value;

the first calculation module is used for calculating a smooth value of the characteristic value of the current frame audio signal according to the characteristic value and a preset first smoothing factor, and calculating a width value of the current frame audio signal according to the smooth value of the characteristic value and a preset width gain factor;

the second calculation module is used for judging whether the current frame audio signal contains transient noise or not based on the characteristic value and the width value, and acquiring the suppression gain of the current frame audio signal when the judgment result is yes;

and the gain processing module is used for applying the suppression gain to the current frame audio signal to obtain an audio output signal with transient noise suppression.

The other technical scheme of the invention is as follows: an electronic device is provided, which includes a processor, and a memory coupled to the processor, the memory storing program instructions for implementing the above-mentioned transient noise suppression-based audio processing method; the processor is to execute the program instructions stored by the memory to perform transient noise suppression based audio processing.

The other technical scheme of the invention is as follows: there is provided a storage medium having stored therein program instructions capable of implementing the above-described transient noise suppression-based audio processing method.

The invention has the following beneficial effects. The audio processing method, apparatus, device, and storage medium based on transient noise suppression calculate the characteristic value of the current frame audio signal from its maximum value and frame energy value, calculate the width value of the current frame audio signal from the smoothed value of the characteristic value and the preset width gain factor, and judge whether the current frame audio signal contains transient noise based on the characteristic value and the width value. When the judgment result is yes, the peak value of the maximum values of a specified number of frames before the current frame is obtained, and the suppression gain of the current frame audio signal is calculated from the peak value, the maximum value of the current frame audio signal, and the preset expansion factor. Because the characteristic value matches the characteristics of transient noise well, transient noise can be tracked accurately and its suppression improved. Because the peak value and the maximum value used to calculate the suppression gain are both time-domain signal characteristics, the whole processing flow involves only time-domain data, which avoids damage to the audio, so transient noise is suppressed while the fidelity of the audio signal is preserved.

[ description of the drawings ]

Fig. 1 is a flowchart illustrating an audio processing method based on transient noise suppression according to a first embodiment of the invention;

FIG. 2 is a flowchart illustrating an audio processing method based on transient noise suppression according to a second embodiment of the invention;

FIG. 3 is a schematic structural diagram of an audio processing apparatus based on transient noise suppression according to a third embodiment of the present invention;

FIG. 4 is a schematic structural diagram of an electronic device according to a fourth embodiment of the present invention;

FIG. 5 is a schematic structural diagram of a storage medium according to a fifth embodiment of the present invention.

[ detailed description of the embodiments ]

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The terms "first", "second" and "third" in the present invention are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first," "second," or "third" may explicitly or implicitly include at least one of the feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise. All directional indicators (such as up, down, left, right, front, and rear … …) in the embodiments of the present invention are only used to explain the relative positional relationship between the components, the movement, and the like in a specific posture (as shown in the drawings), and if the specific posture is changed, the directional indicator is changed accordingly. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.

Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the invention. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.

In the embodiments of the present invention, each frame of audio signal is the original digital audio signal within a unit of time, and a frame of audio signal may be a transient noise frame or a non-transient noise frame. A transient noise frame is a frame of original digital audio signal that contains transient noise; a non-transient noise frame is a frame of original digital audio signal that contains no transient noise. In this embodiment, transient noise tracking detection is performed on each frame of audio signal to determine whether the current frame of audio signal is a transient noise frame. First, a characteristic value is extracted from the current frame audio signal. Then, a width value is calculated from the characteristic value. Next, whether the frame contains transient noise is judged by comparing the characteristic value with the width value, and the suppression gain of the current frame audio signal is determined based on the judgment result. Finally, the current frame audio signal is processed according to the suppression gain to suppress the transient noise.

Fig. 1 is a flowchart illustrating an audio processing method based on transient noise suppression according to a first embodiment of the present invention. It should be noted that the method of the present invention is not limited to the flow sequence shown in fig. 1 if the results are substantially the same. As shown in fig. 1, the audio processing method based on transient noise suppression includes the steps of:

S101, acquiring a maximum value and a frame energy value of the current frame audio signal, and acquiring a characteristic value of the current frame audio signal according to the maximum value and the frame energy value.

In step S101, feature parameters need to be extracted for each frame of audio signal, and then feature values of the current frame of audio signal are calculated according to the extracted feature parameters, where the extracted feature parameters are maximum values and frame energy values.

Specifically, when transient noise detection needs to be performed on an audio sample, the audio sample is first divided into frames, where each frame of audio signal includes a plurality of sampling points and each sampling point has an amplitude. The maximum value of a frame of audio signal is the maximum of the amplitudes of its sampling points. Assuming that the current frame audio signal includes n sampling points in(1), in(2), ..., in(n), where in(i) is the amplitude of the ith sampling point in the current frame audio signal, the maximum value of the current frame audio signal is maxValue = max(in(1), in(2), ..., in(n)), and the frame energy value is calculated from the amplitudes in(i) of the n sampling points.

In this embodiment, the characteristic value is the ratio of the maximum value to the frame energy value, that is, ratioME = maxValue / (frame energy value).

Transient noise varies sharply in both the time domain and the frequency domain. Using the ratio of the maximum value of the current frame audio signal to its frame energy value as the characteristic value matches this behavior well, so the characteristic value can be used to distinguish transient noise from other sounds (non-transient noise).
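As a minimal sketch of step S101, the per-frame feature extraction might look as follows. The text does not reproduce the frame energy formula, so the sum of squared amplitudes used here is an assumption; a different normalization would change the scale of ratioME. Taking absolute values before the maximum and the small `eps` guard are also assumptions, and the function name is hypothetical.

```python
from typing import Sequence, Tuple


def frame_feature(frame: Sequence[float], eps: float = 1e-12) -> Tuple[float, float, float]:
    """Return (maxValue, frame energy, ratioME) for one frame of samples."""
    # Maximum of the sample amplitudes; using absolute values is an assumption.
    max_value = max(abs(s) for s in frame)
    # ASSUMPTION: frame energy is the sum of squared amplitudes.
    frame_energy = sum(s * s for s in frame)
    # eps (hypothetical) avoids division by zero on an all-zero frame.
    ratio_me = max_value / (frame_energy + eps)
    return max_value, frame_energy, ratio_me
```

Under this assumed energy definition, a frame dominated by a single spike yields a larger ratio than a frame with the same peak spread over many samples, which is the property the characteristic value relies on.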

S102, calculating a smooth value of the feature value of the current frame audio signal according to the feature value and a preset first smoothing factor, and calculating a width value of the current frame audio signal according to the smooth value of the feature value and a preset width gain factor.

In an optional embodiment, the smoothed value is obtained by smoothing with the smoothed value of the previous frame of audio signal and the characteristic value of the current frame audio signal. First, the smoothed value aveRatioME(t-1) of the characteristic value of the previous frame of audio signal is obtained. Then, a first product of the characteristic value ratioME of the current frame audio signal and the first smoothing factor σ1 is calculated, the difference of 1 minus the first smoothing factor σ1 is calculated, and a second product of aveRatioME(t-1) and that difference is calculated. Finally, the first product and the second product are added to obtain the smoothed value aveRatioME(t) of the characteristic value of the current frame audio signal. The calculation formula is: aveRatioME(t) = σ1 * ratioME + (1 - σ1) * aveRatioME(t-1). In an alternative embodiment, the first smoothing factor σ1 is between 0 and 1.

In step S102, the width value sheldRatioME of the current frame audio signal may be the product of the smoothed value aveRatioME(t) of the characteristic value of the current frame audio signal and the width gain factor σ2, that is, sheldRatioME = aveRatioME(t) * σ2. In an alternative embodiment, the width gain factor σ2 is between 1 and 2.

In this embodiment, a width value calculated based on the characteristic value is used as a basis for judging whether the current frame audio signal is transient noise, and the width value can be well matched with the characteristic of the transient noise.
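The two calculations of step S102 can be sketched as below. The function names and the default values of `sigma1` and `sigma2` are illustrative assumptions chosen inside the ranges stated above (0 to 1 and 1 to 2); the source does not give specific values.

```python
def smooth_feature(ratio_me: float, prev_ave: float, sigma1: float = 0.1) -> float:
    """First-order recursive smoothing of the characteristic value.

    prev_ave is the smoothed value of the previous frame; sigma1 is the
    first smoothing factor (default is an illustrative assumption).
    """
    return sigma1 * ratio_me + (1.0 - sigma1) * prev_ave


def width_value(ave_ratio_me: float, sigma2: float = 1.5) -> float:
    """Width value: smoothed characteristic value times the width gain factor."""
    return ave_ratio_me * sigma2
```

Because the width gain factor is at least 1, a frame is only flagged when its characteristic value exceeds the recent average by the chosen margin, so the width value behaves as an adaptive detection threshold.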

S103, judging whether the current frame audio signal contains transient noise or not based on the characteristic value and the width value, and acquiring the suppression gain of the current frame audio signal when the judgment result is yes.

In step S103, whether the current frame audio signal is a transient noise frame or a non-transient noise frame is determined from the relationship between its characteristic value and its width value. If the current frame audio signal is a transient noise frame, its suppression gain is determined according to the transient noise suppression policy; if it is a non-transient noise frame, no transient noise suppression is performed on it.

Specifically, if the characteristic value of the current frame audio signal is greater than its width value, the current frame audio signal is judged to contain transient noise, and its suppression gain is obtained from the time-domain features of the historical audio signal. First, the peak value maxoffvalue of the maximum values of a specified number of frames before the current frame is obtained. Then, the peak value maxoffvalue is multiplied by a preset expansion factor k to obtain an expansion product, and the expansion product is divided by the maximum value maxValue of the current frame audio signal to obtain the suppression gain. The calculation formula is: suppression gain of the current frame audio signal = maxoffvalue * k / maxValue. Generally, the expansion factor k is 32868 to 32868 * 3.

In step S103, the suppression gain is applied to the time domain sampling data, so as to effectively suppress the transient noise.
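A minimal sketch of the decision and gain calculation in step S103 follows. The function and parameter names are hypothetical. The expansion factor `k` is left to the caller because its appropriate value depends on how the sample amplitudes are scaled; the fallback gain of 1.0 for non-transient frames follows the first preset value mentioned elsewhere in this document, and the zero-maximum guard is an added assumption.

```python
def suppression_gain(
    ratio_me: float,
    width: float,
    peak_of_max: float,
    max_value: float,
    k: float,
    no_noise_gain: float = 1.0,
) -> float:
    """Return the suppression gain for the current frame.

    ratio_me    : characteristic value of the current frame
    width       : width value of the current frame
    peak_of_max : peak of the maxima of the preceding frames
    max_value   : maximum value of the current frame
    k           : expansion factor, supplied by the caller
    """
    # Guard against division by zero on a silent frame (assumption).
    if max_value == 0:
        return no_noise_gain
    if ratio_me > width:
        # Transient noise detected: scale relative to the recent peak.
        return peak_of_max * k / max_value
    # No transient noise: leave the frame unchanged.
    return no_noise_gain
```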

S104, applying the suppression gain to the current frame audio signal to obtain an audio output signal with transient noise suppressed.

In step S104, the audio output signal may be obtained by directly multiplying the current frame audio signal by the suppression gain. In an optional embodiment, a smoothed value of the suppression gain of the current frame audio signal is calculated from the suppression gain of the previous frame audio signal, the suppression gain of the current frame audio signal, and a preset second smoothing factor, and the smoothed value of the suppression gain is applied to the current frame audio signal to obtain the transient noise suppressed audio output signal. Specifically, let the suppression gain of the previous frame of audio signal be gain(t-1) and the suppression gain of the current frame of audio signal be gain(t). First, a third product of gain(t) and the second smoothing factor σ3 is calculated. Then, the difference of 1 minus the second smoothing factor σ3 is calculated, and a fourth product of gain(t-1) and that difference is calculated. Next, the third product and the fourth product are added to obtain the smoothed value dotgain(t) of the suppression gain of the current frame audio signal, that is, dotgain(t) = σ3 * gain(t) + (1 - σ3) * gain(t-1). Finally, the amplitude in(i) of each sampling point of the current frame audio signal is multiplied by dotgain(t) to obtain the output value out(i) of that sampling point, and the processed audio output signal is out(1), out(2), ..., out(n). The second smoothing factor σ3 is between 0 and 1.

In this embodiment, the entire audio processing flow is carried out in the time domain and does not involve operations such as frequency-domain conversion, FFT, or audio signal reconstruction. The structure of the audio is not damaged, so the fidelity of the audio signal is preserved while transient noise is suppressed.
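The gain smoothing and per-sample multiplication of step S104 might be sketched as follows. The function names and the default `sigma3` are illustrative assumptions; the source only states that the second smoothing factor lies between 0 and 1.

```python
from typing import List, Sequence


def smooth_gain(gain_t: float, gain_prev: float, sigma3: float = 0.5) -> float:
    """Blend the current suppression gain with the previous frame's gain."""
    return sigma3 * gain_t + (1.0 - sigma3) * gain_prev


def apply_gain(frame: Sequence[float], dotgain: float) -> List[float]:
    """Multiply every sample amplitude in(i) by the smoothed gain to get out(i)."""
    return [s * dotgain for s in frame]
```

Smoothing the gain across frames limits abrupt level jumps at the boundary between a suppressed frame and its neighbors, and both operations stay entirely in the time domain.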

Fig. 2 is a flowchart illustrating an audio processing method based on transient noise suppression according to a second embodiment of the present invention. It should be noted that the method of the present invention is not limited to the flow sequence shown in fig. 2 if the results are substantially the same. As shown in fig. 2, the audio processing method based on transient noise suppression includes the steps of:

S201, framing and windowing the audio sample data.

In step S201, a time window win of length N is defined. The time window win is used to store historical time-domain data, that is, the time-domain data of the N frames of audio signal before the current frame of audio signal. In practical applications, the audio is generally divided into frames of 5 ms to 20 ms, and the time window win generally spans 100 ms to 500 ms.
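A minimal framing sketch for step S201 is given below, assuming non-overlapping frames and dropping any trailing partial frame; neither choice is stated in the source. The function names and the default durations (10 ms frames, 200 ms window) are illustrative values picked from the ranges above.

```python
from typing import List, Sequence


def split_frames(samples: Sequence[float], sample_rate: int, frame_ms: int = 10) -> List[Sequence[float]]:
    """Split samples into consecutive, non-overlapping frames of frame_ms each."""
    frame_len = sample_rate * frame_ms // 1000
    # Any trailing partial frame is dropped (assumption).
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, frame_len)]


def window_length_in_frames(window_ms: int = 200, frame_ms: int = 10) -> int:
    """Number of frames N covered by the time window win."""
    return window_ms // frame_ms
```

For example, at a 16 kHz sample rate a 10 ms frame holds 160 samples, and a 200 ms window corresponds to N = 20 frames.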

S202, acquiring the maximum value and the frame energy value of the current frame audio signal, and calculating the characteristic value of the current frame audio signal according to the ratio of the maximum value and the frame energy value.

S203, updating the time window win, storing the maximum values [maxValue1, maxValue2, ..., maxValueN] of the previous N frames of audio signal in the time window win, and at the same time updating the peak value of the maximum values of the previous N frames stored in the time window win: maxoffvalue = max{maxValue1, ..., maxValueN}.

In step S203, only the maximum values of the N frames before the current frame are stored in the time window win. When processing moves on to a new current frame, the data in the time window win is updated accordingly, and the peak value of the N maximum values in the time window win is updated with it.
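One way to maintain the time window of step S203 is a fixed-length queue of per-frame maxima, sketched below. The class name is hypothetical, and returning `None` when no history exists yet is an assumption. The sketch reads the peak before the current frame's maximum is pushed, which is one interpretation of the window holding only the frames before the current frame.

```python
from collections import deque
from typing import Optional


class MaxWindow:
    """Sliding window over the maxima of the last N frames."""

    def __init__(self, n_frames: int) -> None:
        # deque with maxlen discards the oldest entry automatically.
        self._maxima = deque(maxlen=n_frames)

    def peak(self) -> Optional[float]:
        """Peak of the stored maxima, or None if no frame has been stored yet."""
        return max(self._maxima) if self._maxima else None

    def update(self, max_value: float) -> None:
        """Store the maximum value of the frame that was just processed."""
        self._maxima.append(max_value)
```

A typical loop would call `peak()` to get the history peak for the current frame and then `update()` with that frame's maximum before moving on.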

S204, calculating a smooth value of the feature value of the current frame audio signal according to the feature value and a preset first smoothing factor, and calculating a width value of the current frame audio signal according to the smooth value of the feature value and a preset width gain factor.

Step S204 may refer to the description of step S102 in the first embodiment, and is not described in detail here.

S205, determining whether the feature value of the current frame audio signal is greater than the width value of the current frame audio signal.

S206, when the judgment result of step S205 is yes, multiplying the peak value updated in step S203 by the expansion factor k and dividing the result by the maximum value of the current frame audio signal to calculate the suppression gain of the current frame audio signal.

In step S206, the peak value of the N maximum values in the time window win is a time-domain signal feature, and the maximum value of the current frame audio signal is also time-domain data. Calculating the suppression gain from time-domain data avoids damaging the audio sample data.

S207, when the judgment result of step S205 is no, taking the first preset value as the suppression gain.

In step S207, the first preset value may be 1.0, that is, when the determination result is no, the suppression gain of the current frame audio signal is 1.0.

S208, smoothing the suppression gain to obtain a corresponding suppression gain smooth value.

Specifically, the smoothing value of the suppression gain of the current frame audio signal is calculated according to the suppression gain of the previous frame audio signal, the suppression gain of the current frame audio signal and the preset second smoothing factor, and the detailed description refers to the first embodiment.

S209, multiplying the current frame audio signal by the suppression gain smooth value to obtain a corresponding audio output signal.
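Steps S202 to S209 can be combined into one loop over already framed audio, as in the sketch below. It is an illustration under stated assumptions rather than the patented implementation: frame energy is taken as the sum of squared amplitudes, the smoothed characteristic value is seeded with the first frame, the previous gain starts at 1.0, frames without history are left unchanged, and all names and default factors are hypothetical.

```python
from collections import deque
from typing import List, Sequence


def suppress_transients(
    frames: Sequence[Sequence[float]],
    k: float,
    n_window: int = 20,
    sigma1: float = 0.1,
    sigma2: float = 1.5,
    sigma3: float = 0.5,
) -> List[List[float]]:
    """Apply time-domain transient suppression frame by frame."""
    maxima = deque(maxlen=n_window)  # time window win: maxima of previous N frames
    ave_ratio = None                 # smoothed characteristic value of previous frame
    prev_gain = 1.0                  # gain(t-1); initial value is an assumption
    out = []
    for frame in frames:
        # S202: maximum value, frame energy, characteristic value.
        max_value = max(abs(s) for s in frame)
        energy = sum(s * s for s in frame)  # ASSUMPTION: sum of squares
        ratio = max_value / energy if energy > 0 else 0.0
        # S203: peak of the stored maxima, read before storing the current one.
        peak = max(maxima) if maxima else None
        maxima.append(max_value)
        # S204: smoothed characteristic value and width value.
        if ave_ratio is None:
            ave_ratio = ratio
        else:
            ave_ratio = sigma1 * ratio + (1.0 - sigma1) * ave_ratio
        width = sigma2 * ave_ratio
        # S205 to S207: decide and compute the suppression gain.
        if ratio > width and peak is not None and max_value > 0:
            gain = peak * k / max_value
        else:
            gain = 1.0  # first preset value
        # S208: smooth the gain with the previous frame's gain.
        dotgain = sigma3 * gain + (1.0 - sigma3) * prev_gain
        prev_gain = gain
        # S209: apply the smoothed gain to every sample.
        out.append([s * dotgain for s in frame])
    return out
```

The expansion factor `k` is passed in by the caller, since its suitable magnitude depends on the amplitude scale of the input samples.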

Further, the following steps are included after step S204:

S2041, uploading the characteristic value and the width value to a blockchain, so that the blockchain stores the characteristic value and the width value in encrypted form.

Specifically, corresponding digest information is obtained from the characteristic value and the width value, for example by hashing the characteristic value and the width value with the SHA-256 algorithm. Uploading the digest information to the blockchain ensures security as well as fairness and transparency for the user. The user equipment may download the digest information from the blockchain to verify whether the characteristic value and the width value have been tampered with. The blockchain referred to in this example is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks linked by cryptographic methods, where each data block contains information on a batch of network transactions and is used to verify the validity of that information (anti-counterfeiting) and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
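The digest step can be illustrated with Python's standard `hashlib`. The source does not say how the two values are serialized before hashing, so packing them as two little-endian 64-bit floats is an assumption, as are the function names; the upload to and download from a blockchain are outside the scope of this sketch.

```python
import hashlib
import struct


def feature_digest(ratio_me: float, width: float) -> str:
    """SHA-256 digest of the characteristic value and the width value."""
    # ASSUMPTION: serialize as two little-endian IEEE 754 doubles.
    payload = struct.pack("<dd", ratio_me, width)
    return hashlib.sha256(payload).hexdigest()


def is_untampered(ratio_me: float, width: float, stored_digest: str) -> bool:
    """Recompute the digest and compare it with the one retrieved from storage."""
    return feature_digest(ratio_me, width) == stored_digest
```

A verifier recomputes the digest from the locally held values and compares it with the stored one; any change to either value produces a different digest.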

In addition, after step S201 and before step S202, the following steps may be further included:

S2011, preprocessing the audio signal in the audio sample data, wherein the preprocessing mode comprises at least one of resampling processing, noise reduction processing, howling suppression processing and echo cancellation processing.

The resampling processing comprises at least one of up-sampling and down-sampling: during up-sampling, interpolation is performed on the audio signal, and during down-sampling, decimation is performed on the audio signal. The noise reduction processing eliminates the noise part of the audio signal. The howling suppression processing eliminates howling that appears in the audio signal; for example, a frequency equalization method may be used to adjust the frequency response of the system to an approximately straight line so that the gains at all frequencies are basically consistent. The echo cancellation processing may be implemented by Echo Cancellation (EC) technology. Echoes are divided into acoustic echo and line echo, and the corresponding techniques are Acoustic Echo Cancellation (AEC) and Line Echo Cancellation (LEC).

The algorithm in this embodiment exploits the fact that transient noise is short in duration and large in energy: it extracts the typical characteristics of transient noise to track the noise, and obtains the suppression gain from the time-domain signal characteristics of historical audio data, which avoids damage to the audio. The audio processing method of this embodiment can effectively track and suppress various transient noises, such as keyboard tapping, knocks and bumps against a mobile phone, and the rubbing of clothes against a mobile phone. In call applications, and especially in increasingly popular mobile office meetings, transient noise produced by a single participant can seriously disturb everyone in the meeting and degrade the user experience. Suppressing transient noise can therefore noticeably improve the quality of the call service and of the product.

Fig. 3 is a schematic structural diagram of an audio processing apparatus based on transient noise suppression according to a third embodiment of the invention. As shown in fig. 3, the apparatus 30 includes: the transient noise tracking module 31 is configured to obtain a maximum value and a frame energy value of a current frame audio signal, and obtain a feature value of the current frame audio signal according to the maximum value and the frame energy value; the first calculating module 32 is configured to calculate a smooth value of the feature value of the current frame audio signal according to the feature value and a preset first smoothing factor, and calculate a width value of the current frame audio signal according to the smooth value of the feature value and a preset width gain factor; the second calculating module 33 is configured to determine whether the current frame audio signal contains transient noise based on the feature value and the width value, and if so, obtain a suppression gain of the current frame audio signal; the gain processing module 34 is configured to apply the suppression gain to the current frame audio signal to obtain a transient noise suppressed audio output signal.

Fig. 4 is a schematic structural diagram of an electronic device according to a fourth embodiment of the present invention. As shown in Fig. 4, the electronic device 40 includes a processor 41 and a memory 42 coupled to the processor 41.

The memory 42 stores program instructions for implementing the transient noise suppression-based audio processing method of any of the above embodiments.

The processor 41 is configured to execute the program instructions stored in the memory 42 to perform audio processing based on transient noise suppression.

The processor 41 may also be referred to as a CPU (Central Processing Unit). The processor 41 may be an integrated circuit chip having signal processing capabilities. The processor 41 may also be a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components. A general purpose processor may be a microprocessor or any conventional processor.

Fig. 5 is a schematic structural diagram of a storage medium according to a fifth embodiment of the present invention. Referring to Fig. 5, the storage medium of this embodiment stores program instructions 51 capable of implementing any of the above audio processing methods based on transient noise suppression. The program instructions 51 may be stored in the storage medium in the form of a software product, and include several instructions that enable a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the methods according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, or terminal devices such as a computer, a server, a mobile phone, or a tablet.

In the embodiments provided by the present invention, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into units is only a logical division, and an actual implementation may divide them differently: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or in another form.

In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit. The above description is only an embodiment of the present invention and is not intended to limit its scope; all equivalent structures or equivalent process modifications made using the contents of this specification and the drawings, whether applied directly or indirectly in other related technical fields, fall within the scope of protection of the present invention.

While the foregoing is directed to embodiments of the present invention, it will be understood by those skilled in the art that various changes may be made without departing from the spirit and scope of the invention.

Claims

1. An audio processing method based on transient noise suppression, comprising:

2. The transient noise suppression-based audio processing method according to claim 1, wherein the feature value is a ratio of the maximum value and the frame energy value.

3. The audio processing method based on transient noise suppression according to claim 1 or 2, wherein the width value is a product of the smoothed value of the feature value of the current frame audio signal and the width gain factor, and the width gain factor is in the range of 1 to 2.

4. The audio processing method based on transient noise suppression as claimed in claim 3, wherein when the feature value is greater than the width value, it is determined that the audio signal of the current frame contains transient noise;

5. The transient noise suppression-based audio processing method according to claim 4, wherein before applying the suppression gain to the current frame audio signal to obtain the transient noise suppressed audio output signal, further comprising:

6. The audio processing method based on transient noise suppression according to claim 1, wherein said calculating a smoothing value of the feature value of the current frame audio signal according to the feature value and a preset first smoothing factor comprises:

7. The audio processing method based on transient noise suppression according to claim 1, wherein said applying the suppression gain to the current frame audio signal to obtain a transient noise suppressed audio output signal comprises:

8. An apparatus for audio processing based on transient noise suppression, the apparatus comprising:

9. An electronic device, comprising a processor and a memory coupled to the processor, the memory storing program instructions for implementing the transient noise suppression-based audio processing method according to any one of claims 1 to 7; the processor is configured to execute the program instructions stored in the memory to perform audio processing based on transient noise suppression.

10. A storage medium having stored therein program instructions capable of implementing the transient noise suppression-based audio processing method according to any one of claims 1 to 7.