CN113452855A

CN113452855A - Howling processing method, howling processing device, electronic equipment and storage medium

Info

Publication number: CN113452855A
Application number: CN202110618560.3A
Authority: CN
Inventors: 陈耀斌; 阮良; 陈功
Original assignee: Hangzhou Langhe Technology Co Ltd
Current assignee: Hangzhou Netease Zhiqi Technology Co Ltd
Priority date: 2021-06-03
Filing date: 2021-06-03
Publication date: 2021-09-28
Anticipated expiration: 2041-06-03
Also published as: CN113452855B

Abstract

The disclosure provides a howling processing method, a howling processing device, an electronic device and a storage medium. The method comprises: performing time-frequency conversion on the time domain data of each audio frame in an audio signal to be detected to obtain frequency domain data of each audio frame; for any audio frame, acquiring the energy of each frequency band in the audio frame based on the frequency domain data of the audio frame; determining the frequency band category of a target frequency band according to the energy ratio between the target frequency band and its adjacent frequency bands in the current audio frame and the energy trend of the target frequency band and the corresponding frequency bands in a preset number of historical audio frames, where the target frequency band is any frequency band in the current audio frame and a historical audio frame is an audio frame preceding the current audio frame in time; and performing howling processing on the frequency domain data of the current audio frame carrying the target frequency band based on the frequency band category of the target frequency band. Howling is thereby avoided and user experience is improved.

Description

Howling processing method, howling processing device, electronic equipment and storage medium

Technical Field

The present disclosure relates to the field of audio processing technologies, and in particular, to a howling processing method and apparatus, an electronic device, and a storage medium.

Background

In voice communication and multimedia communication, and in telephone or multimedia conference scenarios in particular, the large number of participants makes howling caused by equipment problems or scene problems likely. Howling severely harms conference call quality, so howling suppression is an important part of audio processing. Suppressing howling during a call can greatly improve call voice quality and the subjective experience of participants.

Disclosure of Invention

The disclosed embodiments provide a howling processing method, a howling processing device, an electronic device, and a storage medium, which solve the above problems in the prior art, avoid the occurrence of howling, and improve user experience.

The specific technical scheme provided by the embodiment of the disclosure is as follows:

In a first aspect, a howling processing method is provided, and the method may include:

performing time-frequency conversion on time domain data of each audio frame in the audio signal to be detected to obtain frequency domain data of each audio frame;

for any audio frame, acquiring energy of each frequency band in the audio frame based on frequency domain data of the audio frame;

determining the band class of a target frequency band according to the energy ratio between the target frequency band and adjacent frequency bands in a current audio frame and the energy trend of the target frequency band and corresponding frequency bands in a preset number of historical audio frames; the target frequency band is any frequency band in the current audio frame, and the historical audio frame is an audio frame before the time of the current audio frame;

and based on the band class of the target frequency band, performing howling processing on the frequency domain data of the current audio frame carrying the target frequency band.

In one possible implementation, the band classes include a first band, a second band, and a third band;

determining the band class of a target frequency band according to the energy ratio between the target frequency band and adjacent frequency bands in a current audio frame and the energy trend of the target frequency band and corresponding frequency bands in a preset number of historical audio frames, wherein the determining comprises the following steps:

acquiring a first energy ratio and a second energy ratio of the energy of the target frequency band and the adjacent frequency bands;

if the first energy ratio and the second energy ratio are both not smaller than a preset maximum howling energy threshold, and the energies of the corresponding frequency bands in the preset number of historical audio frames and of the target frequency band increase successively in time order, determining that the frequency band category of the target frequency band is the first frequency band;

if the first energy ratio and the second energy ratio are both smaller than the preset maximum howling energy threshold and both not smaller than a preset minimum howling energy threshold, and the energies of the corresponding frequency bands in the preset number of historical audio frames and of the target frequency band increase successively in time order, determining that the frequency band category of the target frequency band is the second frequency band;

and if at least one of the first energy ratio and the second energy ratio is smaller than the preset minimum howling energy threshold, or the energies of the corresponding frequency bands in the preset number of historical audio frames and of the target frequency band do not increase successively in time order, determining that the frequency band category of the target frequency band is the third frequency band.

In one possible implementation, based on the band class of the target band, performing howling processing on frequency domain data of a current audio frame carrying the target band includes:

if the frequency band type of the target frequency band is the first frequency band, performing howling detection on frequency domain data of a current audio frame carrying the first frequency band by adopting a preset howling detection algorithm, and performing howling suppression on the frequency domain data of the current audio frame according to a detection result;

and if the frequency band type of the target frequency band is the second frequency band, multiplying the energy of the second frequency band by a first preset gain value to obtain new energy of the target frequency band and new frequency domain data of the current audio frame, wherein the value range of the first preset gain value is [0, 1].

In one possible implementation, based on the detection result, performing howling suppression on the frequency domain data of the current audio frame includes:

if the detection result is that the first frequency band is a howling frequency band, performing howling suppression on the frequency domain data of the current audio frame by adopting a preset howling suppression algorithm to obtain processed frequency domain data;

and if the detection result shows that the first frequency band is not the howling frequency band, multiplying the energy of the first frequency band by a second preset gain value to obtain new energy of the first frequency band and new frequency domain data of the current audio frame, wherein the value range of the second preset gain value is [0, 1].
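The per-class handling described in the two implementations above can be sketched as follows. This is only an illustration under stated assumptions: the function name `process_band`, the string labels for the band classes, the callables `is_howling` and `suppress`, and the default gain values are all hypothetical, and the sketch works on band energies rather than on full frequency domain data. Leaving a third-class band unchanged is also an assumption, since the text above only specifies the first and second classes.

```python
from typing import Callable, List

Energies = List[float]


def process_band(
    energies: Energies,
    band: int,
    band_class: str,
    is_howling: Callable[[Energies, int], bool],
    suppress: Callable[[Energies, int], Energies],
    first_preset_gain: float = 0.5,   # hypothetical value, applied to a second-class band
    second_preset_gain: float = 0.5,  # hypothetical value, applied to a non-howling first-class band
) -> Energies:
    """Return new band energies for one frame after handling one target band."""
    for gain in (first_preset_gain, second_preset_gain):
        if not 0.0 <= gain <= 1.0:
            raise ValueError("preset gains must lie in [0, 1]")

    out = list(energies)
    if band_class == "first":
        # Suspected howling band: run detection, then suppress or attenuate.
        if is_howling(out, band):
            return suppress(out, band)
        out[band] *= second_preset_gain
    elif band_class == "second":
        # Howling-forming band: attenuate before howling develops.
        out[band] *= first_preset_gain
    # Assumption: any other class is passed through unchanged.
    return out
```

Passing the detector and the suppressor in as callables keeps the sketch independent of which detection and suppression algorithms are actually chosen.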

In one possible implementation, performing howling detection on frequency domain data of an audio frame carrying the first frequency band by using a preset howling detection algorithm includes:

and performing howling detection on the frequency domain data of the current audio frame carrying the first frequency band by adopting a preset spectral flatness howling detection algorithm.

In one possible implementation, performing howling suppression on the frequency domain data of the current audio frame by using a preset howling suppression algorithm to obtain processed frequency domain data, including:

and performing howling suppression on the frequency domain data of the current audio frame by adopting a preset notch-filter howling suppression algorithm to obtain processed frequency domain data.

In a second aspect, a howling processing apparatus is provided, and the apparatus may include:

the time-frequency conversion unit is used for performing time-frequency conversion on the time domain data of each audio frame in the audio signal to be detected to obtain the frequency domain data of each audio frame;

the acquisition unit is used for acquiring the energy of each frequency band in any audio frame based on the frequency domain data of the audio frame;

the determining unit is used for determining the band type of a target frequency band according to the energy ratio between the target frequency band and adjacent frequency bands in a current audio frame and the energy trend of the target frequency band and corresponding frequency bands in a preset number of historical audio frames; the target frequency band is any frequency band in the current audio frame, and the historical audio frame is an audio frame before the time of the current audio frame;

and the howling processing unit is used for carrying out howling processing on the frequency domain data of the current audio frame carrying the target frequency band based on the frequency band type of the target frequency band.

In a possible implementation, the determining unit is specifically configured to obtain a first energy ratio and a second energy ratio between the energy of the target frequency band and the energies of the adjacent frequency bands;

if the first energy ratio and the second energy ratio are not smaller than a preset maximum howling energy threshold value, and the energy trends of the corresponding frequency bands and the target frequency band in the preset number of historical audio frames are sequentially increased according to the time sequence, determining that the frequency band type of the target frequency band is the first frequency band;

In a possible implementation, the howling processing unit is specifically configured to, if the frequency band type of the target frequency band is the first frequency band, perform howling detection on frequency domain data of a current audio frame carrying the first frequency band by using a preset howling detection algorithm, and perform howling suppression on the frequency domain data of the current audio frame according to a detection result;

In a possible implementation, the howling processing unit is further specifically configured to perform howling suppression on the frequency domain data of the current audio frame by using a preset howling suppression algorithm if the detection result indicates that the first frequency band is a howling frequency band, so as to obtain processed frequency domain data;

and if the detection result indicates that the first frequency band is not the howling frequency band, multiplying the energy of the first frequency band by a second preset gain value to obtain new energy of the first frequency band and new frequency domain data of the current audio frame, wherein the value range of the second preset gain value is [0, 1].

In a possible implementation, the howling processing unit is further specifically configured to perform howling detection on the frequency domain data of the current audio frame carrying the first frequency band by using a preset spectral flatness howling detection algorithm.

In a possible implementation, the howling processing unit is further specifically configured to perform howling suppression on the frequency domain data of the current audio frame by using a preset notch-filter howling suppression algorithm, so as to obtain processed frequency domain data.

In a third aspect, an electronic device is provided, which includes:

at least one memory for storing program instructions;

at least one processor configured to call program instructions stored in the memory, and execute the method steps according to any one of the first aspect described above according to the obtained program instructions.

In a fourth aspect, a computer-readable storage medium is provided, having stored therein a computer program which, when executed by a processor, performs the method steps of any of the above first aspects.

The howling processing method provided by the embodiment of the disclosure performs time-frequency conversion on time domain data of each audio frame in an audio signal to be detected to obtain frequency domain data of each audio frame, and then acquires energy of each frequency band in the audio frame based on the frequency domain data of the audio frame for any audio frame; determining the frequency band category of a target frequency band according to the energy ratio between the target frequency band and adjacent frequency bands in the current audio frame and the energy trend of the target frequency band and corresponding frequency bands in a preset number of historical audio frames; the target frequency band is any frequency band in the current audio frame, and the historical audio frame is an audio frame before the time of the current audio frame, so that based on the frequency band type of the target frequency band, howling processing is performed on the frequency domain data of the current audio frame carrying the target frequency band, howling is avoided, and user experience is improved.

Drawings

Fig. 1 is a schematic diagram illustrating a principle of howling generation in an embodiment of the present disclosure;

Fig. 2 is a schematic structural diagram of a notch-filter howling suppression system in an embodiment of the present disclosure;

Fig. 3 is a schematic diagram of a howling detection process in an embodiment of the present disclosure;

Fig. 4 is a flowchart illustrating a howling processing method in an embodiment of the present disclosure;

Fig. 5 is a schematic structural diagram of a howling processing apparatus in an embodiment of the present disclosure;

Fig. 6 is a schematic structural diagram of an electronic device in an embodiment of the present disclosure.

Detailed Description

The technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are only some embodiments of the present disclosure, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments disclosed herein without making any creative effort, shall fall within the protection scope of the present disclosure.

First, some terms referred to in the embodiments of the present disclosure are explained so as to be easily understood by those skilled in the art.

Terminal device: may be a mobile terminal, a fixed terminal, or a portable terminal such as a mobile handset, station, unit, device, multimedia computer, multimedia tablet, internet node, communicator, desktop computer, laptop computer, notebook computer, netbook computer, tablet computer, personal communication system device, personal navigation device, personal digital assistant, audio/video player, digital camera/camcorder, positioning device, television receiver, radio broadcast receiver, electronic book device, gaming device, or any combination thereof, including the accessories and peripherals of these devices. The terminal device may also support any type of user interface (for example, a wearable device).

Server: may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, and big data and artificial intelligence platforms.

Howling is oscillation caused by positive feedback in a sound amplification system. In such a system, a microphone receives a sound signal and converts it into an electrical signal; the power amplifier of the sound amplification path amplifies the electrical signal, and a loudspeaker converts it back into sound and radiates it. Part of the radiated sound returns to the microphone along various paths, forming a "loudspeaker-microphone-amplifier-loudspeaker" positive feedback loop that repeats continuously. When the amplitude balance condition and the phase balance condition are satisfied simultaneously, oscillation occurs, which is heard as howling.

The principle of howling generation is shown in Fig. 1. The signal x(n) input to the sound amplification path is what the microphone picks up: the sound input signal s(n) together with the feedback signal f(n). The power amplifier in the sound amplification path amplifies x(n) to give the signal u(n), which is output through the loudspeaker and then reflected along the sound field feedback path to form the feedback signal f(n); f(n) is collected by the microphone again, forming a closed-loop system. According to the Nyquist criterion, if there is a frequency $\omega_0 = 2\pi (f_0 / f_s)$, where $f_0$ denotes the howling frequency and $f_s$ the sampling rate, at which $G(\omega)F(\omega)$ satisfies the following gain and phase conditions, self-excited oscillation arises at the frequency point $\omega_0$ and howling is produced:

$$|G(\omega_0)F(\omega_0)| \ge 1 \tag{1}$$

$$\angle G(\omega_0)F(\omega_0) = 2\pi k, \quad k \text{ an integer} \tag{2}$$

In the formulas, $G(\omega)$ and $F(\omega)$ denote the frequency responses of the system amplification gain and of the sound field feedback path, respectively, and can be obtained by computing their respective short-time discrete Fourier transforms.
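As a small numerical illustration of conditions (1) and (2), the sketch below checks whether a given pair of complex responses at one frequency point satisfies both. The function names `angular_frequency` and `can_howl` and the phase tolerance are hypothetical; a tolerance is needed only because floating-point phases are rarely exactly zero.

```python
import cmath
import math


def angular_frequency(f0: float, fs: float) -> float:
    """Normalized angular frequency: omega_0 = 2 * pi * f0 / fs."""
    return 2.0 * math.pi * f0 / fs


def can_howl(g: complex, f: complex, phase_tol: float = 1e-6) -> bool:
    """Check the gain and phase conditions for G(omega_0) and F(omega_0)."""
    loop = g * f
    gain_ok = abs(loop) >= 1.0               # condition (1)
    # cmath.phase wraps the angle into [-pi, pi], so any multiple of
    # 2*pi*k shows up as a phase of (approximately) zero.
    phase_ok = abs(cmath.phase(loop)) <= phase_tol   # condition (2)
    return gain_ok and phase_ok
```

For instance, a loop response of 1.2 with zero phase satisfies both conditions, whereas a response of 0.8, or one of magnitude 1.2 with a phase of pi, does not.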

The notch-filter howling suppression method consists of two parts: howling detection and notch filter design. Its principle is to accurately detect the howling frequency points that appear in the acoustic feedback and apply notch filtering there, which lowers the gain at those frequency points, breaks the gain condition for howling, and so suppresses it. As shown in Fig. 2, the howling detection module first detects in real time, based on a certain feature, whether howling is present in the input signal x(n); if howling is judged to be present, the howling frequency point is calculated accurately, a corresponding notch filter is designed for that frequency, and the gain at the howling frequency point is reduced to suppress the howling. Specifically:

(1) Howling detection. As shown in Fig. 3, the input signal is first framed and windowed, and the signal power spectrum is obtained by short-time discrete Fourier transform for spectrum analysis; next, several frequency points with the largest power spectrum amplitude in each frame are selected as candidate howling frequency points; a feature value is then calculated at each candidate howling frequency point; finally, each feature value is tested against the set detection threshold.

(2) Notch filter design. The notch filter is essentially a band-stop filter whose stop-band center frequency is set to the howling frequency. Once a howling component has been detected and its frequency calculated accurately, a notch filter at that frequency is designed to reduce the gain at the howling frequency point and thereby suppress the howling. The most commonly used notch filter is a second-order IIR filter.
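A second-order IIR notch of the kind mentioned in (2) can be sketched as below. The coefficient formulas are the widely used "Audio EQ Cookbook" biquad notch form, which is a swapped-in design choice and not something the text above specifies; the function names and the quality factor `q` are likewise hypothetical.

```python
import cmath
import math


def notch_coefficients(f0: float, fs: float, q: float = 30.0):
    """Biquad notch centred on f0 (Hz) for sampling rate fs (Hz).

    Returns (b, a) with a[0] normalized to 1.
    """
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    cos_w0 = math.cos(w0)
    b = [1.0 / a0, -2.0 * cos_w0 / a0, 1.0 / a0]
    a = [1.0, -2.0 * cos_w0 / a0, (1.0 - alpha) / a0]
    return b, a


def gain_at(b, a, f: float, fs: float) -> float:
    """Magnitude response of the biquad at frequency f (Hz)."""
    z1 = cmath.exp(-1j * 2.0 * math.pi * f / fs)  # z^-1 on the unit circle
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return abs(num / den)
```

With `f0` set to a detected howling frequency, the response is essentially zero at `f0` and close to unity away from it; a larger `q` narrows the notch so that less of the neighbouring spectrum is affected.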

According to the spectrum flatness howling detection principle, howling detection is carried out on an audio signal according to the ratio of the geometric mean value and the arithmetic mean value of the power spectrum corresponding to the audio signal.
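The formula itself is not reproduced in this text. A form consistent with the definition just given (geometric mean over arithmetic mean) and with the symbols SFM, a(k) and K would be the following; whether the original uses the amplitude a(k) directly or its square (the power), and the exact index range, are assumptions here:

$$\mathrm{SFM}(\mathrm{num\_band}) = \frac{\left(\prod_{k=1}^{K} a(k)\right)^{1/K}}{\dfrac{1}{K}\sum_{k=1}^{K} a(k)}$$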

In the formula, SFM is the spectral flatness of the frequency band num_band, a(k) is the amplitude of the k-th frequency point in the frequency band num_band, and K is the number of frequency points in the frequency spectrum. The closer the SFM is to 0, the lower the flatness and the more likely howling is to occur; the closer the SFM is to 1, the higher the flatness and the less likely howling is to occur.
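A minimal sketch of the spectral flatness measure, computed as the geometric mean divided by the arithmetic mean of the amplitudes in one band. The function name `spectral_flatness` is hypothetical, and returning 0.0 when any amplitude is zero (or the band is empty) is an assumption made to keep the logarithm defined.

```python
import math
from typing import Sequence


def spectral_flatness(amplitudes: Sequence[float]) -> float:
    """Geometric mean / arithmetic mean of the amplitudes in one band."""
    k = len(amplitudes)
    # Assumption: an empty band or any non-positive amplitude gives flatness 0.
    if k == 0 or any(a <= 0.0 for a in amplitudes):
        return 0.0
    # Use logarithms so the product does not overflow or underflow.
    geometric_mean = math.exp(sum(math.log(a) for a in amplitudes) / k)
    arithmetic_mean = sum(amplitudes) / k
    return geometric_mean / arithmetic_mean
```

A flat band such as `[1.0, 1.0, 1.0, 1.0]` gives a value of 1, while a band dominated by a single strong peak gives a value close to 0, matching the interpretation above.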

Howling severely harms conference call quality, so howling suppression is an important part of audio processing. To suppress howling, howling detection is first performed on the audio signal to be detected, and howling suppression is then applied to the detected howling.

The current howling detection method is as follows: an audio frame of the audio signal to be detected is first converted from the time domain to the frequency domain, yielding a number of frequency points; the average power over all frequency points is calculated from the power value of each frequency point; the ratio of each frequency point's power value to this average power is then calculated, and when the ratio exceeds a preset threshold, the corresponding frequency point is judged to be a howling frequency point, so that howling is detected.
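The existing detection method just described amounts to a peak-to-average power test, which can be sketched as follows. The function name `howling_points` and the threshold passed in are hypothetical; the text does not give a threshold value.

```python
from typing import List, Sequence


def howling_points(powers: Sequence[float], threshold: float) -> List[int]:
    """Indices of frequency points whose power / average power exceeds threshold."""
    if not powers:
        return []
    mean_power = sum(powers) / len(powers)
    if mean_power == 0.0:
        return []  # a silent frame has no howling candidates
    return [k for k, p in enumerate(powers) if p / mean_power > threshold]
```

For the power values `[1, 1, 1, 1, 96]` the average is 20, so with a threshold of 4 only the last frequency point (ratio 4.8) is flagged.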

Existing howling suppression schemes fall into two main categories. The first is a machine-learning howling suppression model trained on a large number of training samples. The second is based on traditional signal processing, for example the frequency shift method and the notch-filter howling suppression method (howling detection plus a notch filter). However, the first category introduces a large amount of computation and occupies substantial device resources. In the second category, the frequency shift method also processes the audio signal when no howling is present, which distorts the audio signal to some degree, while the notch-filter howling suppression method processes the audio signal only after howling has occurred, which degrades the user's subjective experience because the howling has already occurred.

The howling processing method provided by the embodiments of the present disclosure can be applied to a terminal device or to a server, such as a cloud server or an application server. The energy of each frequency band in each audio frame is obtained from the frequency domain data of that audio frame, and the frequency band category of a target frequency band is determined according to the energy ratio between the target frequency band and its adjacent frequency bands in the audio frame and the energy trend of the target frequency band and the corresponding frequency bands in a preset number of historical audio frames. Howling processing is then performed on the frequency domain data of the current audio frame carrying the target frequency band based on that category, so that howling is avoided and user experience is improved.

The preferred embodiments of the present disclosure will be described below with reference to the accompanying drawings of the specification, it being understood that the preferred embodiments described herein are merely for illustrating and explaining the present invention, and are not intended to limit the present invention, and that the embodiments and features of the embodiments in the present disclosure may be combined with each other without conflict.

Fig. 4 is a flowchart illustrating a howling processing method according to an embodiment of the present disclosure. As shown in fig. 4, the method may include:

Step 410, performing time-frequency conversion on the time domain data of each audio frame in the audio signal to be detected to obtain the frequency domain data of each audio frame.

Each audio frame in the audio signal to be detected is acquired; each acquired audio frame is time domain data. To detect howling frequency bands, time-frequency conversion is performed on the time domain data of each audio frame, for example by Fast Fourier Transform (FFT), to obtain the frequency domain data of each audio frame.
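Step 410 can be illustrated with the sketch below, which splits a signal into non-overlapping frames and converts each frame to the frequency domain. For self-containment it evaluates the discrete Fourier transform directly instead of calling an FFT routine; the result is the same, only slower. The function names, the frame length, and the absence of windowing and overlap are all assumptions.

```python
import cmath
import math
from typing import List, Sequence


def split_frames(samples: Sequence[float], frame_len: int) -> List[Sequence[float]]:
    """Cut the signal into consecutive, non-overlapping frames of frame_len samples."""
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, frame_len)]


def dft(frame: Sequence[float]) -> List[complex]:
    """Direct DFT of a real frame, returning bins 0 .. N/2."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n // 2 + 1)
    ]
```

A frame holding a pure cosine that completes two cycles yields a spectrum whose largest magnitude sits in bin 2.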

Step 420, for any audio frame, acquiring energy of each frequency band in the audio frame based on the frequency domain data of the audio frame.

For each audio frame, the frequency domain data of the audio frame is divided into frequency bands, and the energy of each frequency band is calculated.
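A minimal sketch of step 420 follows. The text does not say how the bands are laid out or how energy is defined, so two assumptions are made here: bands are contiguous and of equal width (with any remainder bins going to the last band), and band energy is the sum of squared magnitudes of the bins in the band. The function name `band_energies` is hypothetical.

```python
from typing import List, Sequence


def band_energies(spectrum: Sequence[complex], num_bands: int) -> List[float]:
    """Sum of |X(k)|^2 over each of num_bands contiguous, equal-width bands."""
    if num_bands <= 0 or num_bands > len(spectrum):
        raise ValueError("num_bands must be between 1 and the number of bins")
    width = len(spectrum) // num_bands
    energies = []
    for b in range(num_bands):
        start = b * width
        # The last band absorbs any bins left over by the integer division.
        end = len(spectrum) if b == num_bands - 1 else start + width
        energies.append(sum(abs(c) ** 2 for c in spectrum[start:end]))
    return energies
```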

Step 430, determining the band class of the target frequency band according to the energy ratio between the target frequency band and the adjacent frequency band in the current audio frame and the energy trend of the target frequency band and the corresponding frequency bands in the preset number of historical audio frames.

The target frequency band is any frequency band in the current audio frame, and the historical audio frame is an audio frame before the time of the current audio frame. The current audio frame is any one of the audio frames in the audio signal to be detected.

To avoid both processing the audio signal when no howling is present and processing it only after howling has already occurred, the frequency bands are classified by frequency band category, and howling processing appropriate to each category is applied, so that howling is avoided and user experience is improved.

Therefore, in order to improve the accuracy of band class classification, the class of the target band can be preliminarily detected by obtaining the energy ratio between the target band and the adjacent band in the current audio frame and comparing the energy ratio with different preset thresholds.

In one embodiment, a first energy ratio and a second energy ratio between the energy of the target frequency band and the energies of its adjacent frequency bands are obtained. Specifically, the ratio of the energy of the target frequency band to the energy of its left adjacent frequency band gives the first energy ratio, and the ratio of the energy of the target frequency band to the energy of its right adjacent frequency band gives the second energy ratio.

For example, if the i-th frequency band is the target frequency band and its energy is band(i), the energy of its left adjacent frequency band is band(i-1) and the energy of its right adjacent frequency band is band(i+1).

Then the first energy ratio may be expressed as band(i)/band(i-1), and the second energy ratio as band(i)/band(i+1).
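The two ratios can be computed as in the sketch below. The function name `neighbour_ratios` and the small floor `eps` that guards against division by zero are hypothetical, and so is the restriction to interior bands: the text does not say how the lowest and highest bands, which have only one neighbour, are handled.

```python
from typing import Sequence, Tuple


def neighbour_ratios(band: Sequence[float], i: int, eps: float = 1e-12) -> Tuple[float, float]:
    """Return (band(i)/band(i-1), band(i)/band(i+1)) for an interior band i."""
    if not 0 < i < len(band) - 1:
        # Assumption: edge bands are not classified in this sketch.
        raise ValueError("band i must have both a left and a right neighbour")
    first_ratio = band[i] / max(band[i - 1], eps)
    second_ratio = band[i] / max(band[i + 1], eps)
    return first_ratio, second_ratio
```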

In order to verify the accuracy of the above preliminary detection, the category of the target frequency band may be further determined by the energy trend of the target frequency band in different audio frames. Specifically, a preset number of historical audio frames before the time of the current audio frame is obtained, for example, 10 historical audio frames are collected.

The historical audio frame may be a preset number of consecutive audio frames before the time of the current audio frame, or a preset number of audio frames according to a preset frame interval before the time of the current audio frame, and may be determined according to an actual situation, which is not limited herein in the embodiment of the present disclosure.

The preset frame interval is a preset minimum number of frames that does not affect the energy trend of consecutive audio frames.

For example, the time of the current audio frame is t, there are 8 consecutive audio frames before t, and the times of these audio frames, in chronological order, are t1 to t8.

If the preset number is 4, the 4 consecutive audio frames before t can be acquired, and the times of the corresponding audio frames are t5 to t8;

if the preset number is 4 and the preset frame interval is 1, 4 audio frames before t can be acquired according to the preset frame interval, and the times of the corresponding audio frames are, in sequence, t7, t5, t3 and t1.
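The two selection modes in this example can be reproduced with the sketch below. The function name `select_history` is hypothetical, and the interpretation of the frame interval (the number of frames skipped between successive selections, counted back from the current frame) is inferred from the t7, t5, t3, t1 example. Frames are returned newest first; they would need to be reversed before checking a trend in time order.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def select_history(history: Sequence[T], count: int, interval: int = 0) -> List[T]:
    """Pick up to `count` frames before the current one, newest first.

    `history` holds the earlier frames in chronological order; the current
    frame is the one that would follow history[-1].
    """
    step = interval + 1
    index = len(history) - step  # first selected frame, counted back from "now"
    picked: List[T] = []
    while index >= 0 and len(picked) < count:
        picked.append(history[index])
        index -= step
    return picked
```

With `history = ["t1", ..., "t8"]`, a count of 4 and an interval of 0 give t8, t7, t6, t5, and an interval of 1 gives t7, t5, t3, t1.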

Furthermore, the frequency band classes of the target frequency band can be distinguished according to the comparison result of the energy ratio between the adjacent frequency bands and the preset howling energy threshold value and the energy trends of the target frequency band and the corresponding frequency band in the historical audio frame, so that the accuracy of distinguishing the frequency band classes is improved, and howling detection of different frequency bands can be better realized.

In one embodiment, the band class of the target frequency band is determined according to the comparison result of the first energy ratio and the second energy ratio with the preset maximum howling energy threshold, and the energy trends of the target frequency band and the corresponding frequency bands in the preset number of historical audio frames according to the time sequence.

In specific implementation, if the first energy ratio and the second energy ratio are not less than the preset maximum howling energy threshold, and the energy trends of the corresponding frequency bands and the target frequency bands in the preset number of historical audio frames are sequentially increased according to the time sequence, determining that the frequency band class of the target frequency band is the first frequency band;

the preset maximum howling energy threshold is the maximum howling energy for forming howling. Since the energy ratio between adjacent frequency bands exceeds the preset maximum howling energy threshold and the corresponding energy trend is in an increasing trend, the target frequency band has a higher probability of being a howling frequency band, i.e. the first frequency band is a suspected howling frequency band.

If the first energy ratio and the second energy ratio are both smaller than the preset maximum howling energy threshold and both not smaller than a preset minimum howling energy threshold, and the energy trends of the preset number of corresponding frequency bands and the target frequency band increase successively in time order, the frequency band category of the target frequency band is determined to be the second frequency band;

the preset minimum howling energy threshold is the minimum howling energy for forming howling. At this time, the energy ratio between adjacent frequency bands is in the interval of forming howling, and the corresponding energy trend is in an increasing trend, which indicates that the target frequency band is in the process of forming the howling frequency band, i.e. the second frequency band is the howling forming frequency band.

And if at least one of the first energy ratio and the second energy ratio is smaller than a preset minimum howling energy threshold, or the energy trends of a preset number of corresponding frequency bands and the target frequency band are not increased in sequence according to the time sequence, determining that the frequency band class of the target frequency band is a third frequency band.

At this time, the energy ratio between adjacent frequency bands is below the interval for forming howling, or the corresponding energy trend is not an increasing trend, which indicates that the target frequency band is a normal frequency band, that is, neither a howling forming frequency band nor a suspected howling frequency band.

It should be noted that the preset maximum howling energy threshold and the preset minimum howling energy threshold may be set according to actual conditions or historical experiences, and the embodiment of the present disclosure is not limited herein; the energy trend is not an increasing trend, and may include a trend that the energy trend is a decreasing trend, or a trend that the energy trend increases first and then decreases periodically, or a trend that the energy trend increases first and then decreases non-periodically, and the embodiments of the present disclosure are not limited herein.

In an example, the preset maximum howling energy threshold is 12, the preset minimum howling energy threshold is 8, the first energy ratio between the ith frequency band and its adjacent frequency bands is band(i)/band(i-1), and the second energy ratio is band(i)/band(i+1).

(1) If band(i)/band(i-1) > 12 and band(i)/band(i+1) > 12, the ith frequency band is determined as the first candidate frequency band; if the energy trend of the ith frequency band and the corresponding frequency bands in the preset number of historical audio frames is an increasing trend, the frequency band category of the first candidate frequency band is determined to be a suspected howling frequency band;

(2) if 12 >= band(i)/band(i-1) > 8 and 12 >= band(i)/band(i+1) > 8, the ith frequency band is determined as the second candidate frequency band; if the energy trend of the ith frequency band and the corresponding frequency bands in the preset number of historical audio frames is an increasing trend, the frequency band category of the second candidate frequency band is determined to be a howling forming frequency band;

(3) if 8 > band(i)/band(i-1) and/or 8 > band(i)/band(i+1), the ith frequency band is determined to be a normal frequency band; or, if the energy trend of the ith frequency band and the corresponding frequency bands in the preset number of historical audio frames is not an increasing trend, the ith frequency band is determined to be a normal frequency band.
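The three cases of the example can be sketched in Python as follows. This is only an illustrative sketch: the function and constant names are hypothetical, "increasing trend" is taken to mean strictly increasing from frame to frame, band energies are floored at a small positive value to avoid division by zero, and a band whose two ratios fall into different intervals (a case the example does not cover) is treated as normal.

```python
from typing import Sequence

MAX_HOWL_RATIO = 12.0  # example preset maximum howling energy threshold
MIN_HOWL_RATIO = 8.0   # example preset minimum howling energy threshold
EPS = 1e-12            # assumed floor that keeps the ratios finite

SUSPECTED = "suspected_howling"  # first frequency band
FORMING = "howling_forming"      # second frequency band
NORMAL = "normal"                # third frequency band


def is_increasing(history: Sequence[float], current: float) -> bool:
    """True if the band energy rises strictly frame by frame (oldest first)."""
    series = list(history) + [current]
    return all(a < b for a, b in zip(series, series[1:]))


def classify_band(prev_e: float, cur_e: float, next_e: float,
                  history: Sequence[float]) -> str:
    """Classify band i from band(i-1), band(i), band(i+1) and its own history."""
    r1 = cur_e / max(prev_e, EPS)  # band(i) / band(i-1)
    r2 = cur_e / max(next_e, EPS)  # band(i) / band(i+1)
    if not is_increasing(history, cur_e):
        return NORMAL
    if r1 > MAX_HOWL_RATIO and r2 > MAX_HOWL_RATIO:
        return SUSPECTED
    if (MIN_HOWL_RATIO < r1 <= MAX_HOWL_RATIO
            and MIN_HOWL_RATIO < r2 <= MAX_HOWL_RATIO):
        return FORMING
    return NORMAL  # includes mixed cases, which is an assumption of this sketch
```

Here `history` holds the energies of the corresponding band in the preset number of historical audio frames, oldest first.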

Step 440, based on the band class of the target band, perform howling processing on the frequency domain data of the current audio frame carrying the target band.

With the above frequency band type detection scheme, the types of the frequency bands can be accurately distinguished into the normal frequency band, the howling forming frequency band and the suspected howling frequency band, so that corresponding howling protection processing is performed on the different types of frequency bands and howling is avoided.

In one embodiment, if the band class of the target frequency band is the second frequency band, the energy of the second frequency band is multiplied by a first preset gain value to obtain the new energy of the target frequency band and new frequency domain data of the current audio frame, where the value range of the first preset gain value is [0, 1]. Because the second frequency band is a howling forming frequency band, multiplying its energy by a value less than 1, that is, applying a gain value less than 1 to the energy of the band, reduces the energy of the second frequency band and of the frequency corresponding to that energy, which destroys the forming condition of the howling and thus suppresses the howling.
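A minimal sketch of this gain step is given below. It assumes that a band is a contiguous range of frequency bins and that its energy is the sum of squared bin magnitudes, so multiplying the energy by a gain g corresponds to scaling each bin amplitude by the square root of g. The function names are hypothetical.

```python
import math
from typing import List, Sequence


def band_energy(spectrum: Sequence[complex], start: int, stop: int) -> float:
    """Energy of bins [start, stop), assumed here to be the sum of |X[k]|^2."""
    return sum(abs(x) ** 2 for x in spectrum[start:stop])


def attenuate_band(spectrum: Sequence[complex], start: int, stop: int,
                   energy_gain: float) -> List[complex]:
    """Return a copy whose band energy in [start, stop) is scaled by energy_gain."""
    if not 0.0 <= energy_gain <= 1.0:
        raise ValueError("energy_gain must lie in [0, 1]")
    amp_gain = math.sqrt(energy_gain)  # energy ~ |X|^2, so amplitude scales by sqrt
    out = list(spectrum)
    for k in range(start, stop):
        out[k] = out[k] * amp_gain
    return out
```

Bins outside the band are left untouched, so only the howling forming band loses energy.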

And if the frequency band type of the target frequency band is the first frequency band, performing howling detection on the frequency domain data of the current audio frame carrying the first frequency band by adopting a preset howling detection algorithm, and performing howling suppression on the frequency domain data of the current audio frame according to a detection result.

In an example, when the band class of the target frequency band is the first frequency band, a preset spectral flatness howling detection algorithm or the howling detection algorithm shown in fig. 3 may be adopted to perform howling detection on the frequency domain data of the current audio frame carrying the first frequency band.

When the preset spectral flatness howling detection algorithm is adopted, the amplitude of each frequency point in the first frequency band and the number of frequency points in the spectrum corresponding to the frequency domain data are acquired, and the ratio of the geometric mean to the arithmetic mean of the amplitudes is calculated to obtain the spectral flatness;

if the spectral flatness is greater than a preset flatness threshold, such as 0.6, the first frequency band is a howling frequency band; if the spectral flatness is not greater than the preset flatness threshold, the first frequency band may be a howling forming frequency band or a normal frequency band.
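The flatness computation itself can be sketched as follows. The geometric mean is evaluated through logarithms, and amplitudes are floored at a small positive value (an assumption of this sketch) so that a zero amplitude does not make the logarithm undefined. The function name is hypothetical.

```python
import math
from typing import Sequence


def spectral_flatness(amplitudes: Sequence[float], eps: float = 1e-12) -> float:
    """Geometric mean divided by arithmetic mean of the bin amplitudes."""
    n = len(amplitudes)
    if n == 0:
        raise ValueError("need at least one frequency point")
    floored = [max(a, eps) for a in amplitudes]  # avoid log(0)
    geometric = math.exp(sum(math.log(a) for a in floored) / n)
    arithmetic = sum(floored) / n
    return geometric / arithmetic
```

The returned value lies in (0, 1] and is then compared with the preset flatness threshold (0.6 in the example) as described in the text.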

Further, if the detection result indicates that the first frequency band is a howling frequency band, performing howling suppression on the frequency domain data of the current audio frame by using a preset howling suppression algorithm to obtain processed frequency domain data;

In an embodiment, when the first frequency band is a howling frequency band, a preset trap (notch filter) howling suppression algorithm may be adopted to perform howling suppression on the frequency domain data of the current audio frame, so as to obtain processed frequency domain data.

In a specific implementation, since the first frequency band has been detected to be a howling frequency band, when the preset trap howling suppression algorithm is adopted, the frequency corresponding to the maximum energy in the howling frequency band is set as the center frequency of the trap. After the frequency domain data of the current audio frame passes through the trap, the gain of the howling frequency band at that frequency point is reduced, thereby suppressing the howling.
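One possible way to realise such a trap on frequency domain data is sketched below. It is not taken from the disclosure: it assumes a second-order (biquad) notch in the commonly used Audio EQ Cookbook form, evaluates its frequency response at each bin of a one-sided spectrum, and multiplies the bins by that response. The names `notch_response`, `apply_notch`, `peak_bin` and the quality factor `q` are hypothetical.

```python
import cmath
import math
from typing import List, Sequence


def notch_response(freq_hz: float, center_hz: float,
                   sample_rate: float, q: float) -> complex:
    """Frequency response of a biquad notch centred on center_hz."""
    w0 = 2.0 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    z1 = cmath.exp(-2j * math.pi * freq_hz / sample_rate)  # z^-1 on the unit circle
    num = 1.0 - 2.0 * math.cos(w0) * z1 + z1 * z1
    den = (1.0 + alpha) - 2.0 * math.cos(w0) * z1 + (1.0 - alpha) * z1 * z1
    return num / den


def apply_notch(spectrum: Sequence[complex], peak_bin: int, fft_size: int,
                sample_rate: float, q: float = 30.0) -> List[complex]:
    """Scale each bin of a one-sided spectrum by a notch centred on peak_bin."""
    if not 0 < peak_bin < fft_size // 2:
        raise ValueError("peak_bin must lie strictly between DC and Nyquist")
    center_hz = peak_bin * sample_rate / fft_size
    bin_hz = sample_rate / fft_size
    return [x * notch_response(k * bin_hz, center_hz, sample_rate, q)
            for k, x in enumerate(spectrum)]
```

A larger `q` gives a narrower notch, so fewer neighbouring bins are attenuated along with the howling frequency point.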

When the first frequency band is not a howling frequency band, the detection of the frequency band type may have been in error; in this case, the first frequency band may be a howling forming frequency band or a normal frequency band. To prevent this frequency band from developing into a howling frequency band, the energy of the frequency band is multiplied by a second preset gain value less than 1, which reduces the energy of the frequency band and of the frequency corresponding to that energy.

Therefore, in the embodiment of the disclosure, an audio signal carrying a normal frequency band is not processed, so as to protect the audio quality; howling destruction processing is performed on an audio signal carrying a howling forming frequency band, for example by applying a gain value smaller than 1 to the energy of the frequency band, so as to destroy the forming condition of the howling and prevent the howling from forming; and howling suppression processing, such as a trap howling suppression algorithm, is performed on an audio signal carrying a howling frequency band, so that howling is avoided and the subjective experience of the user is improved.

It should be noted that the first preset gain value and the second preset gain value may be the same or different, and may be configured according to actual needs, and the embodiment of the disclosure is not limited herein.
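The overall handling of the three band classes can be summarised by the following control-flow sketch. The detection, suppression and attenuation steps are passed in as callables so that the sketch makes no assumption about their internals; the class labels and the default gain values of 0.5 are purely hypothetical.

```python
from typing import Callable, List

Spectrum = List[complex]


def process_band(spectrum: Spectrum, band_class: str,
                 detect: Callable[[Spectrum], bool],
                 suppress: Callable[[Spectrum], Spectrum],
                 attenuate: Callable[[Spectrum, float], Spectrum],
                 first_gain: float = 0.5,
                 second_gain: float = 0.5) -> Spectrum:
    """Dispatch howling processing according to the band class."""
    if band_class == "normal":
        return spectrum  # third band: left unprocessed
    if band_class == "howling_forming":
        return attenuate(spectrum, first_gain)  # second band
    if band_class == "suspected_howling":  # first band
        if detect(spectrum):
            return suppress(spectrum)  # confirmed howling band
        return attenuate(spectrum, second_gain)  # not confirmed
    raise ValueError(f"unknown band class: {band_class}")
```

Keeping the two gain values as separate parameters reflects that they may be the same or different.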

Based on the same technical concept, the embodiments of the present disclosure further provide a howling processing apparatus. The principle by which the howling processing apparatus solves the problem is similar to that of the howling processing method described above, so the implementation of the howling processing apparatus may refer to the implementation of the howling processing method, and repeated details are not described again. Fig. 5 is a schematic structural diagram of a howling processing apparatus according to an embodiment of the present disclosure, and as shown in fig. 5, the howling processing apparatus includes: a time-frequency conversion unit 510, an acquisition unit 520, a determination unit 530 and a howling processing unit 540;

a time-frequency conversion unit 510, configured to perform time-frequency conversion on time domain data of each audio frame in the audio signal to be detected, so as to obtain frequency domain data of each audio frame;

A fast Fourier transform (FFT) may be used to perform time-frequency transformation on the time domain data of each audio frame in the audio signal to be detected, so as to obtain the frequency domain data of each audio frame.

An obtaining unit 520, configured to obtain, for any audio frame, energy of each frequency band in the audio frame based on frequency domain data of the audio frame;

a determining unit 530, configured to determine a band class of a target frequency band according to an energy ratio between the target frequency band and an adjacent frequency band in a current audio frame and an energy trend of the target frequency band and corresponding frequency bands in a preset number of historical audio frames; the target frequency band is any frequency band in the current audio frame, and the historical audio frame is an audio frame before the time of the current audio frame;

First, the energy ratios between the target frequency band and its adjacent frequency bands, that is, the first energy ratio and the second energy ratio, are obtained, together with a preset number of historical audio frames before the time of the current audio frame. The frequency band category of the target frequency band is then determined based on the result of comparing these energy ratios with the preset howling energy thresholds and on the energy trend of the target frequency band and the corresponding frequency bands in the historical audio frames. The specific determination process may be implemented with reference to step 430.

A howling processing unit 540, configured to perform howling processing on the frequency domain data of the current audio frame carrying the target frequency band based on the frequency band class of the target frequency band.

If the frequency band type of the target frequency band is a howling forming frequency band, or is a suspected howling frequency band that is found not to be a howling frequency band, a gain value smaller than 1 is applied to the energy of that frequency band to reduce its energy, thereby suppressing the howling. If the frequency band type of the target frequency band is a suspected howling frequency band and the suspected howling frequency band is found to be a howling frequency band, howling suppression is performed on that frequency band.
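The first two units (time-frequency conversion and energy acquisition) can be illustrated with the following sketch. To stay self-contained it uses a naive O(N^2) discrete Fourier transform as a stand-in for the FFT, groups a fixed number of bins into each band, and takes the band energy to be the sum of squared bin magnitudes; the grouping rule and the function names are assumptions of this sketch.

```python
import cmath
import math
from typing import List, Sequence


def dft(frame: Sequence[float]) -> List[complex]:
    """Naive one-sided DFT (bins 0..N/2), standing in for an FFT."""
    n = len(frame)
    return [sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame))
            for k in range(n // 2 + 1)]


def band_energies(spectrum: Sequence[complex], bins_per_band: int) -> List[float]:
    """Sum |X[k]|^2 over consecutive groups of bins_per_band bins."""
    return [sum(abs(x) ** 2 for x in spectrum[i:i + bins_per_band])
            for i in range(0, len(spectrum), bins_per_band)]


# Usage: a 32-sample frame holding a pure tone at bin 4.
frame = [math.cos(2.0 * math.pi * 4 * t / 32) for t in range(32)]
energies = band_energies(dft(frame), bins_per_band=4)
```

The resulting per-band energies are what the determining unit compares between adjacent bands and across historical frames.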

a determining unit 530, specifically configured to obtain a first energy ratio and a second energy ratio between the energy of the target frequency band and the adjacent frequency band;

In a possible implementation, the howling processing unit 540 is specifically configured to, if the frequency band type of the target frequency band is the first frequency band, perform howling detection on frequency domain data of a current audio frame carrying the first frequency band by using a preset howling detection algorithm, and perform howling suppression on the frequency domain data of the current audio frame according to a detection result;

In a possible implementation, the howling processing unit 540 is further specifically configured to perform howling suppression on the frequency domain data of the current audio frame by using a preset howling suppression algorithm if the detection result indicates that the first frequency band is a howling frequency band, so as to obtain processed frequency domain data;

In a possible implementation, the howling processing unit 540 is further specifically configured to perform howling detection on the frequency domain data of the current audio frame carrying the first frequency band by using a preset spectral flatness howling detection algorithm.

In a possible implementation, the howling processing unit 540 is further specifically configured to perform howling suppression on the frequency domain data of the current audio frame by using a preset trap howling suppression algorithm, so as to obtain processed frequency domain data.

The functions of the functional units of the howling processing apparatus provided in the above embodiment of the present disclosure can be implemented by the above method steps; therefore, the detailed working processes and beneficial effects of the units in the howling processing apparatus provided in the embodiment of the present disclosure are not repeated herein.

Based on the above embodiments, refer to fig. 6, which is a schematic structural diagram of an electronic device in an embodiment of the present disclosure.

The present disclosure provides an electronic device, which may include a processor 610 (CPU), a memory 620, an input device 630, an output device 640, and the like, wherein the input device 630 may include a keyboard, a mouse, a touch screen, and the like, and the output device 640 may include a display device, such as a Liquid Crystal Display (LCD) or a Cathode Ray Tube (CRT).

Memory 620 may include Read Only Memory (ROM) and Random Access Memory (RAM), and provides processor 610 with program instructions and data stored in memory 620. In the embodiment of the present disclosure, the memory 620 may be used to store a program of any howling processing method in the embodiment of the present disclosure.

The processor 610 is configured to execute any howling processing method in the embodiments of the present disclosure according to the obtained program instructions by calling the program instructions stored in the memory 620.

Based on the above embodiments, in the embodiments of the present disclosure, a computer-readable storage medium is provided, on which a computer program is stored, and the computer program, when executed by a processor, implements a howling processing method in any of the above method embodiments.

As will be appreciated by one skilled in the art, embodiments of the present disclosure may be provided as a method, system, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and so forth) having computer-usable program code embodied therein.

The present disclosure is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to the present disclosure. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

It will be apparent to those skilled in the art that various changes and modifications can be made in the present disclosure without departing from the spirit and scope of the disclosure. Thus, if such modifications and variations of the present disclosure fall within the scope of the claims of the present disclosure and their equivalents, the present disclosure is intended to include such modifications and variations as well.

Claims

1. A howling processing method, characterized in that the method comprises:

2. The method of claim 1, wherein the band classes comprise a first band, a second band, and a third band;

3. The method of claim 2, wherein performing howling processing on the frequency domain data of the current audio frame carrying the target frequency band based on the band class of the target frequency band comprises:

4. The method of claim 3, wherein performing howling suppression on the frequency domain data of the current audio frame according to the detection result comprises:

5. The method of claim 3, wherein performing howling detection on the frequency domain data of the audio frame carrying the first frequency band by using a preset howling detection algorithm comprises:

6. The method as claimed in claim 4, wherein performing howling suppression on the frequency domain data of the current audio frame by using a preset howling suppression algorithm to obtain processed frequency domain data, comprises:

7. A howling processing apparatus, characterized in that the apparatus comprises:

8. The apparatus of claim 7, wherein the band classes comprise a first band, a second band, and a third band;

9. An electronic device, comprising:

at least one memory for storing program instructions;

at least one processor for calling program instructions stored in said memory and for executing the method steps of any of the preceding claims 1-6 in accordance with the program instructions obtained.

10. A computer-readable storage medium, on which a computer program is stored, which, when the computer program is run on an electronic device, is adapted to cause the electronic device to carry out the method steps of any of claims 1-6.