CN111988707B - Howling detection method, howling detection device and storage medium

Info

Publication number: CN111988707B
Application number: CN202010815008.9A
Authority: CN
Inventors: 董培; 张晨
Original assignee: Beijing Dajia Internet Information Technology Co Ltd
Current assignee: Beijing Dajia Internet Information Technology Co Ltd
Priority date: 2020-08-13
Filing date: 2020-08-13
Publication date: 2022-02-08
Anticipated expiration: 2040-08-13
Also published as: CN111988707A

Abstract

The disclosure relates to the technical field of signal processing, and particularly provides a howling detection method, a howling detection device and a storage medium. The method comprises the following steps: acquiring a first audio signal and a second audio signal; transforming the first audio signal and the second audio signal from a time domain to a frequency domain to obtain first frequency domain data and second frequency domain data; setting the energy value of the frequency band signal with the energy value larger than a first preset threshold value as a first value and otherwise as a second value for each frequency band signal in the first frequency domain data and the second frequency domain data; obtaining a correlation characteristic according to the energy value of each frequency band signal in the first frequency domain data and the energy value of each frequency band signal in the second frequency domain data; and when the correlation characteristics meet a preset condition, determining that howling occurs. The method disclosed by the invention establishes the correlation between the current time interval signal and the historical time interval signal by utilizing the time domain characteristics on the basis of frequency domain detection, thereby avoiding the false detection of frequency domain detection and improving the accuracy of howling detection.

Description

Howling detection method, howling detection device and storage medium

Technical Field

The present disclosure relates to the field of signal processing technologies, and in particular, to a howling detection method, apparatus, and storage medium.

Background

Howling refers to the phenomenon of self-excitation of energy between a device sound source and a power amplifier due to some reasons. During a call, a communication device such as a mobile phone generates howling due to a close distance between a microphone and a power amplifier (speaker). Howling not only interferes with normal use of equipment, but also generates a large power output when howling occurs, which may exceed the bearing range of a power amplifier, causing damage to the equipment. Therefore, how to detect the occurrence of howling is an important part of suppressing the howling.

Disclosure of Invention

In order to solve the technical problem of howling detection, embodiments of the present disclosure provide a howling detection method, an apparatus, and a storage medium.

In a first aspect, an embodiment of the present disclosure provides a howling detection method, including:

acquiring a first audio signal of a first time interval before the current time and a second audio signal of a second time interval before the current time, wherein the first time interval is larger than the second time interval;

transforming the first audio signal and the second audio signal from a time domain to a frequency domain to obtain first frequency domain data and second frequency domain data;

for each frequency band signal in the first frequency domain data and the second frequency domain data, setting the energy value of the frequency band signal with the energy value larger than a first preset threshold value as a first value, and setting the energy value of the frequency band signal with the energy value not larger than the first preset threshold value as a second value;

obtaining a correlation characteristic according to the energy value of each frequency band signal in the first frequency domain data and the energy value of each frequency band signal in the second frequency domain data; the correlation feature represents a correlation of the second audio signal with the first audio signal in a time domain;

and when the correlation characteristic meets a preset condition, determining that howling occurs.

In some embodiments, the determining that howling occurs when the correlation characteristic satisfies a preset condition includes:

screening out element values larger than a second preset threshold value from the correlation characteristics as peak values;

and when the time interval of any two adjacent peak values in the peak values of the continuous target number meets a first preset time threshold, determining that the correlation characteristic meets a preset condition.

In some embodiments, the first period and the second period each comprise a plurality of unit periods;

the obtaining a correlation characteristic according to the energy value of each frequency band signal in the first frequency domain data and the energy value of each frequency band signal in the second frequency domain data includes:

constructing a first energy matrix according to the energy values of the frequency domain data of a plurality of unit periods included in the first frequency domain data;

constructing a second energy matrix according to the energy values of the frequency domain data of a plurality of unit periods included in the second frequency domain data;

obtaining a correlation characteristic set according to the first energy matrix and the second energy matrix; the feature values in the set of correlation features represent correlations of the second audio signal with different time-domain positions of the first audio signal.

In some embodiments, the calculating a set of correlation features according to the first energy matrix and the second energy matrix includes:

and sequentially carrying out correlation processing on the second energy matrix and the partial matrixes with the same size in the first energy matrix to obtain the correlation characteristic set.

In some embodiments, said constructing a first energy matrix from said energy values of a plurality of unit periods of frequency domain data included in said first frequency domain data comprises:

selecting a plurality of target frequency band signals of a first preset frequency band from the frequency domain data of each unit time interval of the first frequency domain data;

constructing a first energy matrix according to the number of unit periods in the first frequency domain data, the number of target frequency band signals, and the energy value of the target frequency band signal.

In some embodiments, said constructing a second energy matrix from said energy values of a plurality of unit periods of frequency domain data included in said second frequency domain data comprises:

selecting a plurality of target frequency band signals of a second preset frequency band from the frequency domain data of each unit time interval of the second frequency domain data;

constructing a second energy matrix according to the number of unit periods, the number of target frequency band signals, and the energy value of the target frequency band signal in the second frequency domain data.

In a second aspect, an embodiment of the present disclosure provides a howling detection apparatus, including:

an acquisition module, configured to acquire a first audio signal of a first time interval before the current time and a second audio signal of a second time interval before the current time, wherein the first time interval is larger than the second time interval;

the transformation module is used for transforming the first audio signal and the second audio signal from a time domain to a frequency domain to obtain first frequency domain data and second frequency domain data;

an energy value reduction module, configured to, for each frequency band signal in the first frequency domain data and the second frequency domain data, set an energy value of the frequency band signal having an energy value greater than a first preset threshold as a first value, and set an energy value of the frequency band signal having an energy value not greater than the first preset threshold as a second value;

the processing module is used for obtaining correlation characteristics according to the energy value of each frequency band signal in the first frequency domain data and the energy value of each frequency band signal in the second frequency domain data; the correlation feature represents a correlation of the second audio signal with the first audio signal in a time domain;

and the determining module is used for determining that howling occurs when the correlation characteristics meet the preset conditions.

In some embodiments, the determining module is specifically configured to:

In some embodiments, the first period and the second period each comprise a plurality of unit periods; the processing module is specifically configured to:

In some embodiments, the processing module, when configured to obtain the correlation feature set according to the first energy matrix and the second energy matrix, is specifically configured to:

In some embodiments, the processing module, when configured to construct the first energy matrix from the energy values of the frequency domain data of the plurality of unit periods comprised in the first frequency domain data, is specifically configured to:

In some embodiments, the processing module, when configured to construct the second energy matrix from the energy values of the frequency domain data of the plurality of unit periods included in the second frequency domain data, is specifically configured to:

In a third aspect, an embodiment of the present disclosure provides a howling detection apparatus, including:

a processor; and

a memory communicatively coupled to the processor, the memory storing computer readable instructions readable by the processor, the processor being configured to perform the method according to any of the embodiments of the first aspect when the computer readable instructions are read.

In a fourth aspect, the embodiments of the present disclosure provide a storage medium storing computer-readable instructions for causing a computer to execute the method according to any one of the embodiments of the first aspect.

The howling detection method provided by the embodiments of the disclosure acquires a first audio signal of a first time period before the current time and a second audio signal of a second time period before the current time, where the first time period is longer than the second time period. That is, the second audio signal represents the signal of the current time period, and the first audio signal represents the signal of a longer historical time period before the current time. The first audio signal and the second audio signal are transformed from the time domain to the frequency domain to obtain the corresponding first frequency domain data and second frequency domain data. For each frequency band signal in the first frequency domain data and the second frequency domain data, the energy value is set to a first value if it is larger than a first preset threshold, and to a second value otherwise. A correlation characteristic is then obtained from the energy values of the frequency band signals in the first frequency domain data and the second frequency domain data; it represents the correlation of the second audio signal with the first audio signal in the time domain. When the correlation characteristic meets a preset condition, howling is determined to occur. By comparing the correlation between the current-period signal and the historical-period signal, the method determines howling when the howling characteristics are met, which enables real-time detection of howling.

In addition, on the basis of frequency domain detection, the method uses time domain characteristics to establish the correlation between the current-period signal and the historical-period signal, which avoids the false detections of frequency domain detection and improves the accuracy of howling detection. Reducing the energy values of the frequency band signals to the first value and the second value also greatly reduces the subsequent computation, saves computing power, and improves the efficiency of real-time howling detection.

In the howling detection method provided by the embodiments of the present disclosure, determining whether the correlation characteristic satisfies the preset condition includes screening out, from the correlation characteristic, element values larger than a second preset threshold as peak values. A peak value indicates that the second audio signal of the second time period is highly correlated with the first audio signal at the corresponding time domain position and that the energy value there is also high, so the signal may be howling. Howling is then determined when, among a target number of consecutive peak values, the time interval between any two adjacent peak values meets a first preset time threshold. Because howling recurs at essentially fixed intervals, the screened peak values are treated as howling only if the time interval between them falls within the delay threshold range of the system. Checking the intervals between a target number of consecutive peak values against that range identifies the howling characteristic effectively and improves detection accuracy.

In the howling detection method provided by the embodiments of the disclosure, the first time interval and the second time interval both include a plurality of unit time intervals. When the correlation characteristic is obtained, the first energy matrix and the second energy matrix are constructed from the energy values of the frequency domain data of the plurality of unit time intervals in the first frequency domain data and the second frequency domain data, respectively. Each energy matrix collects the energy values of the frequency band signals in the frequency domain data for each unit time interval, and a correlation characteristic set is obtained from the second energy matrix and the first energy matrix. Starting from frequency domain characteristics, the method further establishes the correlation between the current-period signal and the historical-period signal by combining time domain characteristics, which improves the accuracy of howling detection.

In the howling detection method provided by the embodiment of the disclosure, when a first energy matrix and a second energy matrix are constructed, a plurality of target frequency band signals of a first preset frequency band are selected from frequency domain data of each unit time interval, and an energy matrix is constructed based on the target frequency band signals. The first preset frequency band is, for example, a frequency band in which howling is easy to occur, and the target frequency band signal meeting the first preset frequency band is screened from the frequency domain data, so that the interference of irrelevant data is greatly reduced, the size of an energy matrix is reduced, and the operation amount is further reduced.

Drawings

In order to more clearly illustrate the embodiments of the present disclosure or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present disclosure, and other drawings can be obtained by those skilled in the art without creative efforts.

Fig. 1 is a flow chart of a howling detection method according to some embodiments of the present disclosure.

Fig. 2 is a flow chart of a howling detection method according to other embodiments of the present disclosure.

Fig. 3 is a flow chart of a howling detection method according to other embodiments of the present disclosure.

Fig. 4 is a flow chart of a howling detection method according to other embodiments of the present disclosure.

Fig. 5 is a flow chart of a howling detection method according to other embodiments of the present disclosure.

Fig. 6 is a flowchart of a howling detection method according to an embodiment of the present disclosure.

Fig. 7 is a block diagram of a howling detection apparatus according to some embodiments of the present disclosure.

Fig. 8 is a block diagram of a computer system suitable for use in implementing the methods of embodiments of the present disclosure.

Detailed Description

The technical solutions of the present disclosure will be described clearly and completely with reference to the accompanying drawings, and it is to be understood that the described embodiments are only some embodiments of the present disclosure, but not all embodiments. All other embodiments, which can be derived by one of ordinary skill in the art from the embodiments disclosed herein without making any creative effort, shall fall within the protection scope of the present disclosure. In addition, technical features involved in different embodiments of the present disclosure described below may be combined with each other as long as they do not conflict with each other.

In a first aspect, the howling detection method provided by the embodiments of the present disclosure may be used to detect the howling of an audio device. Howling refers to the phenomenon of self-excitation of energy generated between a sound pickup (sound source) of a device and a power amplifier (loudspeaker), for example, when the distance between a microphone and the power amplifier is too close, the audio amplified by the power amplifier enters the sound pickup again to generate feedback sound, i.e., howling. Howling not only interferes with the normal use of the equipment, but also produces a large power output when howling occurs, which may exceed the tolerance range of the equipment, and if howling continues all the time, there is a risk of equipment burnout.

When suppressing howling, the howling is monitored first, and when the occurrence of howling is detected, the howling is suppressed. The howling detection method provided by the embodiment of the disclosure is used for detecting the howling of the device in real time, so as to effectively detect the generation of the howling. From the foregoing, it can be seen that the howling detection method provided by the embodiments of the present disclosure is applicable to any audio device that may generate howling, such as a smart phone, a sound system, and the like, and the method may be executed by a processor (CPU) of the device, and those skilled in the art will understand that the present disclosure is not limited thereto.

First, in order to facilitate understanding of the following schemes, some terms appearing hereinafter are explained:

1) Time domain signal and frequency domain signal. A time domain signal is a signal whose waveform expresses how the signal changes over time. A frequency domain signal describes the same waveform in terms of frequency: any waveform can be synthesized from sine waves of a plurality of different frequency bands. That is, the time domain signal and the frequency domain signal are two forms of the same waveform.

2) FFT transformation: the fast Fourier transform (FFT) is used to convert an audio signal from a time domain signal to a frequency domain signal.

As shown in fig. 1, in some embodiments, the howling detection method of the present disclosure includes:

S10, acquiring a first audio signal of a first time period before the current time and a second audio signal of a second time period before the current time, the first time period being longer than the second time period.

It is worth mentioning that at least one inventive concept of the method of the embodiments of the present disclosure lies in: whether the howling characteristic is met is determined by using the correlation in the time domain of the audio signal of the current time period and the audio signal of a longer time period before the current time.

Specifically, in step S10, the first period is a longer period of time before the current time. In one example, the first audio signal is the signal of the 2 s immediately before the current time.

The second time period is a short time period before the current time, that is, the second time period can be used to indicate the current time period. In one example, the second audio signal is the signal of the 100 ms immediately before the current time.

Of course, it can be understood by those skilled in the art that the first time period and the second time period can be selected accordingly according to different embodiments, and are not limited to the above examples.
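As a minimal illustrative sketch only, and not a limitation of the present disclosure, step S10 could be written in Python as follows. The function name `latest_windows`, the 8 kHz sampling rate, and the use of a plain list as the sample buffer are assumptions made for illustration; the 2 s and 100 ms durations are taken from the examples above.

```python
def latest_windows(samples, sample_rate, first_seconds=2.0, second_seconds=0.1):
    """Return the most recent long (first) and short (second) sample windows."""
    n_first = int(round(first_seconds * sample_rate))
    n_second = int(round(second_seconds * sample_rate))
    if not 0 < n_second < n_first:
        raise ValueError("the first period must be longer than the second period")
    if len(samples) < n_first:
        raise ValueError("not enough history has been buffered yet")
    # Both windows end at the current time, so the first window contains the second.
    return samples[-n_first:], samples[-n_second:]


sample_rate = 8000                      # assumed sampling rate in Hz
samples = list(range(3 * sample_rate))  # stand-in for 3 s of buffered audio
first, second = latest_windows(samples, sample_rate)
```

In this sketch both windows end at the newest sample, which reflects the statement that the second period indicates the current time period.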

S20, transforming the first audio signal and the second audio signal from the time domain to the frequency domain to obtain first frequency domain data and second frequency domain data.

Specifically, after the first audio signal and the second audio signal are acquired, each audio signal is transformed from a time domain signal to a frequency domain signal by FFT. That is, an FFT is performed on the first audio signal to obtain the first frequency domain data corresponding to the first audio signal, and an FFT is performed on the second audio signal to obtain the second frequency domain data corresponding to the second audio signal.

It will be appreciated that, in embodiments of the present disclosure, the second audio signal represents the signal of a short period before the current time and the first audio signal represents the signal of a long period before the current time, i.e., the first period includes the second period. Therefore, in another example, after the FFT of the first audio signal, the portion of the first frequency domain data corresponding to the length of the second period is selected as the second frequency domain data of the second audio signal, so that no separate FFT of the second audio signal is required.
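The transformation of step S20 can be sketched as follows, purely for illustration. To stay self-contained, the sketch uses a direct discrete Fourier transform rather than an FFT; it yields the same frequency domain values, only more slowly. The function name `band_energies`, the squared-magnitude definition of energy, the 64-sample frame, and the 1 kHz test tone are all assumptions, not details given in the disclosure.

```python
import cmath
import math


def band_energies(frame):
    """Return the energy (squared magnitude) of each frequency bin of one frame."""
    n = len(frame)
    energies = []
    for k in range(n // 2 + 1):  # non-negative frequencies only
        acc = 0j
        for i, x in enumerate(frame):
            acc += x * cmath.exp(-2j * cmath.pi * k * i / n)
        energies.append(abs(acc) ** 2)
    return energies


sample_rate = 8000  # assumed sampling rate in Hz
frame = [math.sin(2 * math.pi * 1000 * i / sample_rate) for i in range(64)]
energies = band_energies(frame)
# Index of the strongest bin; each bin spans sample_rate / 64 = 125 Hz.
peak_bin = max(range(len(energies)), key=energies.__getitem__)
```

With these assumed parameters the 1 kHz tone falls exactly on bin 8, so nearly all of the energy appears in a single frequency band signal.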

S30, for each frequency band signal in the first frequency domain data and the second frequency domain data, setting the energy value of the frequency band signal with the energy value larger than the first preset threshold as a first value, and setting the energy value of the frequency band signal with the energy value not larger than the first preset threshold as a second value.

Specifically, as can be seen from the foregoing, the frequency domain data obtained after the FFT indicates signals of a plurality of different frequency bands, that is, the first frequency domain data includes a plurality of frequency band signals, and the second frequency domain data also includes a plurality of frequency band signals.

It should be noted that, in the embodiments of the present disclosure, each frequency band signal corresponds to one energy value. Without simplification, the plurality of frequency band signals would carry a plurality of different energy values, and the amount of calculation would be large. Therefore, the first preset threshold is set as a reference value: the energy value of each frequency band signal higher than the first preset threshold is represented as the first value, and the energy value of each frequency band signal not higher than the first preset threshold is represented as the second value. Reducing the energy values of the frequency band signals to the first value and the second value in this way greatly reduces the subsequent amount of computation.

In one example, the first value may be set to 1 and the second value may be set to 0, i.e., the energy values of the respective frequency band signals are reduced to a representation of 1s and 0s. The details are set forth below.
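The reduction of step S30 can be illustrated with a short Python sketch. The function name `binarize` and the sample energies and threshold are hypothetical; the choice of 1 and 0 follows the example above.

```python
def binarize(energies, threshold, first_value=1, second_value=0):
    """Map each band energy to first_value if it exceeds threshold, else second_value."""
    return [first_value if e > threshold else second_value for e in energies]


band_energy_values = [0.2, 5.0, 3.0, 3.1]  # made-up energies of four bands
reduced = binarize(band_energy_values, threshold=3.0)
```

An energy equal to the threshold is "not larger than" it and therefore maps to the second value, as in step S30.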

S40, obtaining correlation characteristics according to the energy value of each frequency band signal in the first frequency domain data and the energy value of each frequency band signal in the second frequency domain data; the correlation feature represents a correlation of the second audio signal with the first audio signal in the time domain.

Specifically, as can be seen from the foregoing, in step S40, a correlation characteristic in the time domain is constructed by using the energy values of the respective frequency band signals in the frequency domain data of the audio signals of the current period and the history period, the correlation characteristic representing the correlation of the second period signal with the respective segment signals of different time domain positions in the first period. In one example, the first audio signal is a 2s signal and the second audio signal is a 100ms signal, and the correlation characteristic represents the correlation of the second audio signal with any 100ms duration signal of the first audio signal.

In one example, an energy matrix may be constructed using the first frequency domain data and the second frequency domain data, respectively, correlation eigenvalues of the two may be calculated using the energy matrix, and the set of correlation eigenvalues may be used to represent the correlation characteristic. This example is described in detail below, and is not presented here.

S50, when the correlation characteristic meets the preset condition, determining that howling occurs.

Specifically, as can be seen from step S40, the correlation characteristic represents the correlation between the audio signal of the second time period and the segment signals at different time domain positions in the first time period. Therefore, the larger the correlation characteristic value between the second period and a certain segment of the first period, the higher the correlation between the two. That is, if the audio signal of the current period is a howling signal, the segments of the historical period with larger correlation characteristic values may also be howling signals.

After it is determined that a howling signal is possible, it is further determined whether the correlation characteristic satisfies a preset condition of howling, and when the howling condition is satisfied, it is determined that howling occurs.

In one example, considering that howling must occur at intervals of a fixed duration, it is further determined whether an interval between periods of higher correlation satisfies a howling characteristic among the obtained correlation characteristics, and if a preset time interval is satisfied, it is determined that howling occurs. This example is described in detail below, and is not presented here.
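One possible reading of this interval check is sketched below in Python, for illustration only. The disclosure states that elements above a second preset threshold are peaks and that the intervals between adjacent peaks among a target number of consecutive peaks must meet a first preset time threshold; expressing that threshold as an inclusive range `[min_gap, max_gap]` measured in feature-set positions, and the name `howling_condition_met`, are assumptions of this sketch.

```python
def howling_condition_met(features, peak_threshold, target_count, min_gap, max_gap):
    """Return True if target_count consecutive peaks are spaced within [min_gap, max_gap]."""
    if target_count < 2:
        raise ValueError("target_count must be at least 2 to define an interval")
    # Peaks are the positions whose correlation value exceeds the peak threshold.
    peaks = [i for i, v in enumerate(features) if v > peak_threshold]
    run = 1  # length of the current chain of acceptably spaced peaks
    for prev, cur in zip(peaks, peaks[1:]):
        if min_gap <= cur - prev <= max_gap:
            run += 1
            if run >= target_count:
                return True
        else:
            run = 1  # spacing broken; restart the chain at the current peak
    return False


regular = [0, 9, 0, 0, 9, 0, 0, 9, 0, 0, 9]  # peaks every 3 positions
irregular = [9, 9, 0, 0, 0, 0, 0, 9, 0, 9]   # peaks with uneven spacing
regular_hit = howling_condition_met(regular, 5, 3, 2, 4)
irregular_hit = howling_condition_met(irregular, 5, 3, 2, 4)
```

In this sketch every element above the threshold counts as a separate peak, so a practical implementation might first merge neighboring above-threshold elements into one peak.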

As can be seen from the above description, the howling detection method according to the embodiments of the present disclosure uses the audio signal of the current second period and the audio signal of the first period to establish a correlation characteristic in the time domain, and determines whether howling occurs according to the correlation characteristic and a preset condition of the howling. That is, correlation detection in the time domain is added on top of the frequency domain characteristics, which reduces the probability of false detection and improves the accuracy of howling detection, as briefly described below.

In howling detection in the related art, which is based on the spectral characteristics of howling, howling is considered to occur when a frequency band signal relatively close to the frequency domain characteristic of howling is detected in the frequency domain signal of an audio signal, and that frequency band signal is then suppressed. However, this approach is prone to false detection for signals whose spectral characteristics are relatively close to howling: a valid signal is detected as a howling signal and is suppressed by mistake.

In the embodiments of the present disclosure, after the frequency domain conversion, spectrum detection and comparison are not performed directly on the frequency band signals in the frequency domain data. Instead, a correlation characteristic is established in the time domain by using the audio signal of the current second time period and the audio signal of the historical first time period, and whether howling occurs is determined according to the correlation characteristic and a preset condition of the howling. The method avoids false detection of valid signals whose spectral characteristics are close to howling, and improves the accuracy of howling detection. In addition, reducing the energy values of the frequency band signals to the first value and the second value greatly reduces the subsequent computation, saves computing power, and improves the efficiency of real-time howling detection.

In some embodiments, the first period and the second period described above each include a plurality of unit periods, where a unit period is the smallest unit of time into which the first period and the second period are divided.

In one example, the first period is a period 2s before the current time, the second period is a period 100ms before the current time, and the unit period represents each period of 1 ms. That is, the first period includes 2000 unit periods of 1ms, and the second period includes 100 unit periods of 1 ms.

Of course, it is understood that the unit time period may be a time period represented by other time periods in different embodiments, and the above example does not limit this.
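The division into unit periods can be sketched as follows, as an illustration under stated assumptions. The function name `split_into_unit_periods` and the 16 kHz sampling rate are hypothetical; the 1 ms unit period and the 2 s and 100 ms durations come from the example above.

```python
def split_into_unit_periods(samples, sample_rate, unit_ms=1):
    """Split samples into consecutive unit periods of unit_ms milliseconds each."""
    per_unit = sample_rate * unit_ms // 1000  # samples per unit period
    if per_unit <= 0:
        raise ValueError("unit period is shorter than one sample")
    count = len(samples) // per_unit          # incomplete trailing samples are dropped
    return [samples[i * per_unit:(i + 1) * per_unit] for i in range(count)]


sample_rate = 16000                              # assumed sampling rate in Hz
first_signal = [0.0] * (2 * sample_rate)         # 2 s of placeholder samples
second_signal = [0.0] * (sample_rate // 10)      # 100 ms of placeholder samples
first_units = split_into_unit_periods(first_signal, sample_rate)
second_units = split_into_unit_periods(second_signal, sample_rate)
```

Under these assumptions the first period yields 2000 unit periods and the second period yields 100 unit periods, matching the counts given in the example.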

Fig. 2 shows an embodiment of obtaining the correlation characteristic according to the energy value of each frequency band signal in the first frequency domain data and the energy value of each frequency band signal in the second frequency domain data in the step S40, which is described in detail below with reference to fig. 2.

As shown in fig. 2, in the present embodiment, the method of the present disclosure includes:

S201, constructing a first energy matrix according to energy values of the frequency domain data of a plurality of unit time intervals included in the first frequency domain data.

S202, a second energy matrix is constructed according to the energy values of the frequency domain data of the plurality of unit periods included in the second frequency domain data.

Specifically, in steps S201 to S202, as can be seen from the foregoing, the frequency domain data obtained after the FFT includes signals of a plurality of different frequency bands. The matrix construction is described in detail with reference to the example below.

In one example, the first period includes T unit periods, and for each unit period, k frequency band signals are included in the frequency domain data thereof, each frequency band signal corresponding to one energy value. Thus, the first energy matrix can be constructed as: spec_Tk＝spec(1)(1)……spec(1)(k)， spec(2)(1)……spec(2)(k)，……，spec(T)(1)……spec(T)(k)。

Similarly, the second period includes t unit periods, and the second energy matrix can be constructed as: spec_tk = spec(1)(1)……spec(1)(k), spec(2)(1)……spec(2)(k), ……, spec(t)(1)……spec(t)(k).

It should be understood that, in the constructed first energy matrix and second energy matrix, each element spec(i)(j) represents the energy value of the j-th frequency band signal in the i-th unit period. For example, spec(2)(1) represents the energy value of the 1st frequency band signal in the 2nd unit period.
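By way of illustration only, the matrix layout described above can be sketched as a list of rows, one row per unit period and one column per frequency band. The names `build_energy_matrix` and `spec` are hypothetical helpers chosen to mirror the notation of the text, and the energy values are made up for the example.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def build_energy_matrix(frames: Sequence[Sequence[float]]) -> Matrix:
    """Stack per-unit-period band energies into a matrix.

    Rows are unit periods (1..T or 1..t), columns are frequency bands (1..k).
    """
    if not frames:
        raise ValueError("at least one unit period is required")
    k = len(frames[0])
    if any(len(frame) != k for frame in frames):
        raise ValueError("every unit period must have the same number of bands")
    return [list(frame) for frame in frames]


def spec(matrix: Matrix, i: int, j: int) -> float:
    """1-based accessor matching spec(i)(j): band j in unit period i."""
    return matrix[i - 1][j - 1]


# Two unit periods with three band energies each (illustrative values).
m = build_energy_matrix([[0.1, 0.9, 0.3], [0.7, 0.2, 0.4]])
print(spec(m, 2, 1))
```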

It should be understood by those skilled in the art that the foregoing examples are for illustrative purposes only and are not limiting to the present disclosure, and will not be described in detail herein.

S203, obtaining a correlation characteristic set according to the first energy matrix and the second energy matrix; the feature values in the set of correlation features represent correlations of the second audio signal with different time domain positions of the first audio signal.

In some embodiments, deriving the set of correlation features from the first energy matrix and the second energy matrix comprises: sequentially performing correlation processing on the second energy matrix and partial matrices of the same size in the first energy matrix to obtain the correlation feature set. In one example, the correlation processing includes: sequentially performing dot multiplication (that is, element-wise multiplication) of the second energy matrix with a partial matrix of the same size in the first energy matrix, and summing the elements of each matrix obtained by the dot multiplication to obtain the correlation feature set of the first audio signal and the second audio signal.

Specifically, as can be seen from the examples in steps S201 and S202, the first energy matrix is a large matrix with T rows and k columns, and the second energy matrix is a small matrix with t rows and k columns, while dot multiplication requires the two matrices to have the same number of rows and the same number of columns. Therefore, the second energy matrix is dot-multiplied with a partial matrix of the first energy matrix that has the same size as the second energy matrix. This is equivalent to translating the second energy matrix step by step through the first energy matrix and performing a dot multiplication at each position, for a total of (T-t) dot multiplication operations.

According to the definition of dot multiplication, the result of dot-multiplying two matrices is still a matrix, so (T-t) matrices are obtained after the dot multiplications. For each obtained matrix, all element values in the matrix are summed to obtain a sum value. Each sum value represents the correlation between the signal in the second period and the signal at one time domain position in the first period; the larger the sum value, the higher that correlation.

After the (T-t) matrices are summed in sequence, a correlation feature set consisting of (T-t) sum values is obtained. Each element in the set represents the correlation between the signal in the second period and the signal at a different time domain position in the first period, and the larger the value of the element, the higher the correlation.
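The sliding dot multiplication and summation described above can be sketched as follows. This is a minimal illustration rather than the implementation of the disclosure. The function name `correlation_features` is hypothetical. The choice of window positions is also an assumption: sliding over every position would give T-t+1 windows, whereas the text counts (T-t) operations, so the sketch starts at the second row of the first matrix, as in the example given later for step S606.

```python
from typing import List, Sequence


def correlation_features(
    first: Sequence[Sequence[float]],
    second: Sequence[Sequence[float]],
) -> List[float]:
    """Element-wise multiply `second` with same-size windows of `first` and sum.

    `first` has T rows, `second` has t rows, both with the same column count.
    Returns T - t sums, one per window position (assumed offsets 1..T-t).
    """
    T, t = len(first), len(second)
    if t == 0 or T <= t:
        raise ValueError("the first matrix must have more rows than the second")
    n = len(second[0])
    if any(len(row) != n for row in first) or any(len(row) != n for row in second):
        raise ValueError("both matrices must have the same number of columns")

    features = []
    for offset in range(1, T - t + 1):
        window = first[offset:offset + t]  # partial matrix with t rows
        total = 0
        for window_row, second_row in zip(window, second):
            for a, b in zip(window_row, second_row):
                total += a * b  # dot multiplication, then summation
        features.append(total)
    return features


first_matrix = [[1, 0], [0, 1], [1, 1], [0, 0]]  # T = 4 unit periods
second_matrix = [[0, 1], [1, 1]]                 # t = 2 unit periods
acc = correlation_features(first_matrix, second_matrix)
print(acc)
```

In this toy input the second matrix is identical to rows 2 and 3 of the first matrix, so the first window produces the largest sum.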

In some embodiments, considering that the energy value of each frequency band signal in the frequency domain data is different, if simplified processing is not performed, a huge amount of calculation is required, which undoubtedly imposes a burden on the computer. Therefore, the detection method of the present disclosure further includes:

for each frequency band signal in the frequency domain data of each unit time interval, if the energy value of the frequency band signal is greater than a first preset threshold value, setting the energy value of the frequency band signal as a first numerical value; and if the energy value of the frequency band signal is less than or equal to the first preset threshold value, setting the energy value of the frequency band signal to be a second value.

In one example, the first value is 1 and the second value is 0. Of course, one skilled in the art will appreciate that the first and second values may be other values, and the disclosure is not limited thereto.

Specifically, the frequency domain data of one unit period includes k frequency band signals in total, each corresponding to an energy value. If no simplification is performed, there are k different energy values to process, and the amount of calculation is large. In the present embodiment, the first preset threshold is used as a reference value: the energy value of each frequency band signal whose energy value is higher than the first preset threshold is represented by 1, and the energy value of each frequency band signal whose energy value is not higher than the first preset threshold is represented by 0. That is, the energy values of the k frequency band signals are simplified to 1 and 0.

In one example, the frequency domain data of one unit period includes k frequency band signals in total, denoted as spec(1), spec(2), ……, spec(k). The first preset threshold is the mean of the energy values of the k frequency band signals, expressed as [spec(1) + spec(2) + … + spec(k)]/k. The energy value of each frequency band signal whose energy value is higher than the mean is represented as 1, and the energy value of each frequency band signal whose energy value is not higher than the mean is represented as 0.

It will be understood by those skilled in the art that the first preset threshold is not limited to the mean value, but may be other reference values suitable for implementation, and the disclosure is not limited thereto.
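A minimal sketch of the simplification for one unit period is given below, assuming the mean energy is used as the first preset threshold as in the example above. The function name `binarize_frame` and the sample energies are hypothetical.

```python
from typing import List, Sequence


def binarize_frame(
    energies: Sequence[float],
    first_value: int = 1,
    second_value: int = 0,
) -> List[int]:
    """Map each band energy to first_value if above the mean, else second_value."""
    if not energies:
        raise ValueError("at least one band energy is required")
    threshold = sum(energies) / len(energies)  # first preset threshold (mean)
    return [first_value if e > threshold else second_value for e in energies]


# Mean of these four illustrative energies is 4.0.
bits = binarize_frame([1.0, 5.0, 2.0, 8.0])
print(bits)
```

An energy exactly equal to the threshold maps to the second value, consistent with the "less than or equal to" condition stated above.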

After the simplification processing, only elements represented by 1 and 0 are included in the constructed first energy matrix and the constructed second energy matrix, so that the operation amount is greatly reduced during matrix construction and dot product operation, and the howling can be quickly detected under low calculation power.

In some embodiments, in order to further reduce the amount of computation, when constructing the first energy matrix and the second energy matrix, not all band signals are processed, but only band signals of frequency bands in which howling is likely to occur are processed, so that the number of elements of the matrix is reduced, and the amount of computation is further reduced. This will be described in detail with reference to fig. 3.

As shown in fig. 3, in constructing the first energy matrix, the detection method in the embodiment of the present disclosure further includes:

S301, selecting a plurality of target frequency band signals of a first preset frequency band from the frequency domain data of each unit period of the first frequency domain data.

S302, constructing a first energy matrix according to the number of unit time intervals in the first frequency domain data, the number of target frequency band signals and the energy value of the target frequency band signals.

Specifically, before constructing the energy matrix, a plurality of target frequency band signals in a frequency band where howling is likely to occur are first selected from the plurality of frequency band signals included in the first frequency domain data, the number of target frequency band signals being smaller than the total number of frequency band signals. The first preset frequency band denotes the frequency band in which howling is likely to occur. This parameter may be obtained by collecting statistics on the frequency bands in which howling occurred in historical data, which can be understood by those skilled in the art and will not be described herein again.

In one example, the frequency domain data of one unit period includes k frequency band signals in total, denoted as spec (1), spec (2), … …, spec (k). Of the k band signals, n target band signals, denoted as spec (1), spec (2), … …, spec (n), are selected based on a first preset frequency band in which howling is likely to occur, which is obtained in advance.

The first energy matrix is constructed similarly to the previous step S201, except that the number of columns of the energy matrix constructed in this step is greatly reduced, i.e. the number of elements is greatly reduced. In step S201, the constructed first energy matrix includes T rows and k columns, i.e., T × k element values. In the present embodiment, since only n target band signals (n is smaller than k) are selected from k band signals, the first energy matrix is constructed to include T rows and n columns, i.e., T × n element values.

Similarly, as shown in fig. 4, when constructing the second energy matrix, the detection method in the embodiment of the disclosure further includes:

S401, selecting a plurality of target frequency band signals of a second preset frequency band from the frequency domain data of each unit period of the second frequency domain data.

S402, constructing a second energy matrix according to the number of unit time intervals in the second frequency domain data, the number of target frequency band signals and the energy value of the target frequency band signals.

Specifically, the construction of the second energy matrix in the present embodiment is similar to the construction of the first energy matrix in the embodiment of fig. 3, and those skilled in the art should understand based on the above description, and will not be described herein again.

As can be seen from the above, in step S202, the constructed second energy matrix includes t rows and k columns, i.e., t × k element values. In the present embodiment, since only n target band signals (n is smaller than k) are selected from the k band signals, the constructed second energy matrix includes t rows and n columns, i.e., t × n element values.

Therefore, in the embodiment, the target frequency band signal meeting the first preset frequency band is screened from the frequency domain data, so that the interference of irrelevant data is greatly reduced, the size of an energy matrix is reduced, and the operation amount is reduced.
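The band screening can be illustrated with the following sketch. It assumes that band j of the frequency domain data is FFT bin j with center frequency j × sample_rate / fft_size, and it uses a 16 kHz sample rate and a 512-point FFT; these values and the function names are assumptions for illustration and are not stated in the disclosure. The 500 Hz to 4000 Hz range is the example band mentioned for the embodiment of fig. 6.

```python
from typing import List, Sequence


def target_bin_indices(
    num_bins: int,
    sample_rate_hz: float,
    fft_size: int,
    low_hz: float,
    high_hz: float,
) -> List[int]:
    """Indices of bins whose center frequency lies in [low_hz, high_hz]."""
    bin_width = sample_rate_hz / fft_size
    return [j for j in range(num_bins) if low_hz <= j * bin_width <= high_hz]


def select_target_bands(frame: Sequence[float], indices: Sequence[int]) -> List[float]:
    """Keep only the target band energies of one unit period."""
    return [frame[j] for j in indices]


# Assumed parameters: 16 kHz audio, 512-point FFT, hence 257 non-negative bins.
indices = target_bin_indices(257, 16000.0, 512, 500.0, 4000.0)
frame = [float(j) for j in range(257)]  # placeholder energies
reduced = select_target_bands(frame, indices)
print(len(frame), len(reduced))
```

Under these assumed parameters the number of columns per row drops from k = 257 to n = 113, which shrinks every energy matrix and every dot multiplication by the same ratio.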

After the correlation feature set is obtained in step S203, it is necessary to determine, according to the correlation feature set, whether the howling characteristic is met, so as to determine whether howling occurs. Specifically, one embodiment is shown in fig. 5, which is described in detail below in conjunction with fig. 5.

As shown in fig. 5, when determining whether howling occurs according to the correlation characteristics, the detection method of the embodiment of the present disclosure includes:

S501, screening out element values larger than a second preset threshold value from the correlation characteristics as peak values.

S502, when the time interval of any two adjacent peak values in the peak values of the continuous target number meets a first preset time threshold value, determining that the correlation characteristics meet preset conditions.

In one example, as can be seen from the foregoing step S203, the obtained correlation feature set is composed of (T-t) sum values, where each sum value represents the correlation between the second period signal and the portion of the first period signal corresponding to that sum value, and the larger the sum value, the higher the correlation of the two.

After the correlation feature set is obtained, the element values greater than the second preset threshold need to be screened out from the set as peak values. A peak value indicates that the correlation between the second period signal and the first period signal at the corresponding position is high.

In one example, the correlation feature set consisting of (T-t) sum values is denoted acc(1), acc(2), ……, acc(T-t). The second preset threshold may be taken as 3 times the mean of the set, expressed as [acc(1) + acc(2) + … + acc(T-t)]/(T-t) × 3. The element values in the set that are greater than the second preset threshold are screened out as peak values. Because each peak is obtained by dot-multiplying the second energy matrix with a partial matrix of the first energy matrix and then summing the elements, each peak corresponds to a definite and known time position in the first period.

After the plurality of peaks are obtained, whether howling is generated is determined according to the characteristics of the howling signal. The judgment condition for howling has two aspects. First, howling recurs at a basic time interval, so the time intervals between adjacent peaks among the plurality of peaks should be uniform for howling to be considered possible. Second, the interval between two adjacent occurrences of howling should fall within the delay threshold range of the communication system, so the time intervals between adjacent peaks among the plurality of peaks should all be within the delay threshold range of the system.
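The peak screening in this example can be sketched as follows, assuming the second preset threshold is 3 times the mean of the set as stated above. The function name `find_peaks` and the sample values are hypothetical. Indices are 0-based here, so index i corresponds to acc(i+1) in the notation of the text.

```python
from typing import List, Sequence, Tuple


def find_peaks(features: Sequence[float], factor: float = 3.0) -> List[Tuple[int, float]]:
    """Return (index, value) pairs whose value exceeds factor * mean(features)."""
    if not features:
        raise ValueError("the correlation feature set must not be empty")
    threshold = factor * sum(features) / len(features)  # second preset threshold
    return [(i, v) for i, v in enumerate(features) if v > threshold]


# Ten illustrative sum values with two clear peaks; mean is 4.8, threshold 14.4.
acc_values = [1, 1, 1, 1, 20, 1, 1, 1, 1, 20]
peaks = find_peaks(acc_values)
print(peaks)
```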

Based on the above factors, the following should be satisfied when judging howling: when the time interval of any two adjacent peaks among the continuous target number of peaks satisfies the delay threshold range of the communication system, it is determined that howling occurs.

In one example, the delay threshold range of a certain VoIP call is 100 ms to 300 ms. If, among three consecutive peaks A1, A2, and A3, the time interval between A1 and A2 and the time interval between A2 and A3 are both between 100 ms and 300 ms, for example both within 240 ms to 250 ms, it is determined that howling occurs. Conversely, if the time interval between A1 and A2 and the time interval between A2 and A3 are both 340 ms to 350 ms, which exceeds the system delay threshold range, it is not considered that howling occurs. It is to be understood that the numerical values in this example are merely illustrative and do not limit the disclosure.

As can be seen from the above description, in the howling detection method according to the embodiment of the present disclosure, the audio signal in the current second period and the audio signal in the historical first period are used to establish a correlation characteristic in the time domain, and whether howling occurs is determined according to the correlation characteristic and the preset condition for howling. That is, time domain correlation detection is added on top of the frequency domain characteristics, which reduces the possibility of false detection and improves the accuracy of howling detection. In addition, the energy values of the frequency band signals are simplified to the first value and the second value, and the target frequency band signals falling within the first preset frequency band are screened out from the frequency domain data, so that the interference of irrelevant data is greatly reduced, the amount of computation is reduced, and the detection efficiency is improved.
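The interval test in this example can be sketched as follows. It is an illustration under stated assumptions: peak positions have already been converted to times in milliseconds (for instance, index × unit period duration), the target number of consecutive peaks is 3, and only the delay range is checked, not the uniformity of the intervals. The function name `howling_detected` is hypothetical.

```python
from typing import Sequence


def howling_detected(
    peak_times_ms: Sequence[float],
    min_delay_ms: float,
    max_delay_ms: float,
    target_count: int = 3,
) -> bool:
    """True if target_count consecutive peaks are each spaced within the delay range."""
    if target_count < 2:
        raise ValueError("target_count must be at least 2")
    times = sorted(peak_times_ms)
    run = 1  # number of peaks in the current qualifying run
    for previous, current in zip(times, times[1:]):
        if min_delay_ms <= current - previous <= max_delay_ms:
            run += 1
            if run >= target_count:
                return True
        else:
            run = 1  # interval out of range: restart from the current peak
    return False


# Delay threshold range from the example: 100 ms to 300 ms.
in_range = howling_detected([0, 245, 490], 100, 300)
out_of_range = howling_detected([0, 345, 690], 100, 300)
print(in_range, out_of_range)
```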

One embodiment of the detection method of the present disclosure is shown in fig. 6, and the method of the present disclosure is further described below with reference to the embodiment of fig. 6.

As shown in fig. 6, in the present embodiment, the howling detection method of the present disclosure includes:

S601, acquiring a first audio signal in a first period before the current time and a second audio signal in a second period before the current time, wherein the first period is longer than the second period.

In the present embodiment, the first period is a period of duration T before the current time, that is, the first audio signal is the signal in the duration T before the current time. The second period is a period of duration t before the current time, that is, the second audio signal is the signal in the duration t before the current time. The first period T and the second period t each include a plurality of unit periods.

In one example, T may take a value of 2 s, t may take a value of 100 ms, and the unit period may take a value of 1 ms.

S602, transforming the first audio signal and the second audio signal from the time domain to the frequency domain to obtain first frequency domain data and second frequency domain data.

Specifically, refer to step S20, which is not described in detail.
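For orientation only, the per-band energy values used in the later steps can be obtained from one unit period of samples with a discrete Fourier transform, as in the sketch below. It uses a direct DFT from the standard library for clarity, whereas a practical implementation would use an FFT routine. The function name `band_energies` and the use of squared magnitude as the energy value are assumptions, since step S20 is not detailed in this part of the description.

```python
import cmath
import math
from typing import List, Sequence


def band_energies(samples: Sequence[float]) -> List[float]:
    """Squared DFT magnitude for bins 0..N/2 of one unit period of samples."""
    n = len(samples)
    if n == 0:
        raise ValueError("at least one sample is required")
    energies = []
    for k in range(n // 2 + 1):
        coefficient = sum(
            x * cmath.exp(-2j * math.pi * k * m / n) for m, x in enumerate(samples)
        )
        energies.append(abs(coefficient) ** 2)
    return energies


# A pure tone that completes exactly 2 cycles in 8 samples lands in bin 2.
tone = [math.cos(2 * math.pi * 2 * m / 8) for m in range(8)]
energies = band_energies(tone)
print(max(range(len(energies)), key=energies.__getitem__))
```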

S603, selecting a plurality of target frequency band signals of a first preset frequency band from the frequency domain data of each unit time interval of the first frequency domain data and the second frequency domain data.

Specifically, taking one unit period as an example, in the frequency domain data, k band signals are included in total, which are denoted as spec (1), spec (2), … …, spec (k). Of the k band signals, n target band signals are selected from the first preset frequency band in which howling is likely to occur, and are denoted as spec (1), spec (2), … …, spec (n).

In one example, the first predetermined frequency band is 500Hz to 4000 Hz.

S604, in the frequency domain data of each unit time interval, representing the energy value of the frequency band signal with the energy value greater than the first preset threshold as 1, and vice versa as 0.

Specifically, in the present embodiment, the first preset threshold is the mean energy value of the k frequency band signals, expressed as [spec(1) + spec(2) + … + spec(k)]/k. Among the n selected target band signals spec(1), spec(2), ……, spec(n), the energy value of each band signal whose energy value is higher than the mean is represented by 1, and the energy value of each band signal whose energy value is not higher than the mean is represented by 0.

S605, constructing a first energy matrix according to the number of unit time intervals, the number of target frequency band signals and the energy of a target frequency band in the first frequency domain data; and constructing a second energy matrix according to the number of unit time intervals, the number of target frequency band signals and the energy of the target frequency band in the second frequency domain data.

Specifically, for the building process of the first energy matrix and the second energy matrix, reference may be made to the foregoing embodiment of fig. 3 and fig. 4, and details are not repeated.

The constructed first energy matrix includes T rows and n columns, i.e., T × n element values, which are represented as spec(1)(1)……spec(1)(n), spec(2)(1)……spec(2)(n), ……, spec(T)(1)……spec(T)(n).

The constructed second energy matrix includes t rows and n columns, i.e., t × n element values, which are represented as spec(1)(1)……spec(1)(n), spec(2)(1)……spec(2)(n), ……, spec(t)(1)……spec(t)(n).

S606, sequentially performing dot multiplication on the second energy matrix and a partial matrix with the same size in the first energy matrix, and summing elements in each matrix obtained by the dot multiplication to obtain a correlation feature set of the first audio signal and the second audio signal.

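Specifically, as can be seen from the foregoing, a total of (T-t) dot multiplication operations are performed between the second energy matrix and the first energy matrix. In each operation, the left operand is the complete second energy matrix and the right operand is a partial matrix of t consecutive rows of the first energy matrix. The operations are described as follows:

First dot product:

spec(1)(1)……spec(1)(n), spec(2)(1)……spec(2)(n), ……, spec(t)(1)……spec(t)(n) .*

spec(2)(1)……spec(2)(n), spec(3)(1)……spec(3)(n), ……, spec(t+1)(1)……spec(t+1)(n)

Second dot product:

spec(1)(1)……spec(1)(n), spec(2)(1)……spec(2)(n), ……, spec(t)(1)……spec(t)(n) .*

spec(3)(1)……spec(3)(n), spec(4)(1)……spec(4)(n), ……, spec(t+2)(1)……spec(t+2)(n)

The third dot product:

……

The (T-t)-th dot product:

spec(1)(1)……spec(1)(n), spec(2)(1)……spec(2)(n), ……, spec(t)(1)……spec(t)(n) .*

spec(T-t+1)(1)……spec(T-t+1)(n), spec(T-t+2)(1)……spec(T-t+2)(n), ……, spec(T)(1)……spec(T)(n)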

A result matrix is obtained after each dot multiplication, and all element values in each result matrix are summed to obtain a sum value. It should be noted that, since the element values in the simplified energy matrices are only 0 and 1, the sum of the element values can be obtained simply by counting the number of 1s in the result matrix. It can be seen that the amount of computation in the present embodiment is greatly reduced.

After the processing is completed, (T-t) sum values are obtained, forming a correlation feature set expressed as acc(1), acc(2), ……, acc(T-t). Each element in the set represents the correlation of the second period signal and the first period signal in the time domain, and the larger the value of the element, the higher the correlation of the two.

S607, screening out element values larger than a second preset threshold value from the correlation feature set as peak values.

Specifically, in the present embodiment, the second preset threshold is 3 times the mean value of the set, expressed as [acc(1) + acc(2) + … + acc(T-t)]/(T-t) × 3. Element values larger than the second preset threshold are screened out from the obtained correlation feature set acc(1), acc(2), ……, acc(T-t) as peak values.

S608, when the time interval between any two adjacent peaks among the consecutive target number of peaks meets a first preset time threshold, determining that howling is generated.

Specifically, in the present embodiment, for the determination of howling, reference may be made to the foregoing step S502, which is not described herein again.
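Because the simplified matrices contain only 0 and 1, the dot multiplication followed by summation is equivalent to counting the positions where both matrices hold a 1. One possible way to exploit this, shown purely as an illustration and not as the method of the disclosure, is to pack each matrix into an integer bitmask and count the set bits of the bitwise AND. The names `pack_bits` and `count_common_ones` are hypothetical.

```python
from typing import Sequence


def pack_bits(matrix: Sequence[Sequence[int]]) -> int:
    """Pack a 0/1 matrix into one integer, row by row."""
    bits = 0
    for row in matrix:
        for value in row:
            bits = (bits << 1) | (1 if value else 0)
    return bits


def count_common_ones(a: Sequence[Sequence[int]], b: Sequence[Sequence[int]]) -> int:
    """Number of positions where both same-shaped 0/1 matrices contain a 1."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrices must have the same shape")
    return bin(pack_bits(a) & pack_bits(b)).count("1")


window = [[0, 1], [1, 1]]
second = [[0, 1], [1, 0]]
common = count_common_ones(window, second)
# Reference value: explicit element-wise multiplication followed by summation.
reference = sum(x * y for ra, rb in zip(window, second) for x, y in zip(ra, rb))
print(common, reference)
```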

The detection method of the present disclosure has been described in detail above with reference to the embodiment of fig. 6. For parts not described in detail in the present embodiment, reference may be made to the foregoing description.

As can be seen from the above description, in the detection method in this embodiment, the audio signal in the current second period and the audio signal in the historical first period are used to establish a correlation characteristic in the time domain, and whether howling occurs is determined according to the correlation characteristic and the preset condition for howling. That is, time domain correlation detection is added on top of the frequency domain characteristics, which reduces the possibility of false detection and improves the accuracy of howling detection. In addition, the energy values of the frequency band signals are simplified to the first value and the second value, and the target frequency band signals falling within the first preset frequency band are screened out from the frequency domain data, so that the interference of irrelevant data is greatly reduced, the amount of computation is reduced, and the detection efficiency is improved.

In a second aspect, the disclosed embodiments provide a howling detection apparatus. As shown in fig. 7, in some embodiments, the howling detection apparatus of the present disclosure includes:

an obtaining module 10, configured to obtain a first audio signal in a first time period before a current time and a second audio signal in a second time period before the current time, where the first time period is greater than the second time period;

a transforming module 20, configured to transform the first audio signal and the second audio signal from a time domain to a frequency domain to obtain first frequency domain data and second frequency domain data;

an energy value simplifying module 30, configured to, for each frequency band signal in the first frequency domain data and the second frequency domain data, set an energy value of the frequency band signal having an energy value greater than a first preset threshold as a first numerical value, and set an energy value of the frequency band signal having an energy value not greater than the first preset threshold as a second numerical value;

a processing module 40, configured to obtain a correlation characteristic according to an energy value of each frequency band signal in the first frequency domain data and an energy value of each frequency band signal in the second frequency domain data; the correlation characteristic represents a correlation of the second audio signal with the first audio signal in a time domain;

and the determining module 50 is configured to determine that howling occurs when the correlation characteristic meets a preset condition.

In some embodiments, the determining module 50 is specifically configured to:

In some embodiments, the first period and the second period each comprise a plurality of unit periods; the processing module 40 is specifically configured to:

constructing a first energy matrix according to energy values of frequency domain data of a plurality of unit periods included in the first frequency domain data;

constructing a second energy matrix according to the energy values of the frequency domain data of the plurality of unit periods included in the second frequency domain data;

obtaining a correlation characteristic set according to the first energy matrix and the second energy matrix; the feature values in the set of correlation features represent correlations of the second audio signal with different time domain positions of the first audio signal.

In some embodiments, the processing module 40, when configured to obtain the correlation feature set according to the first energy matrix and the second energy matrix, is specifically configured to:

and sequentially carrying out correlation processing on the second energy matrix and the partial matrixes with the same size in the first energy matrix to obtain a correlation characteristic set.

In some embodiments, the processing module 40, when configured to construct the first energy matrix according to the energy values of the frequency domain data of the plurality of unit periods included in the first frequency domain data, is specifically configured to:

and constructing a first energy matrix according to the number of unit time intervals, the number of the target frequency band signals and the energy value of the target frequency band signals in the first frequency domain data.

In some embodiments, the processing module 40, when configured to construct the second energy matrix according to the energy values of the frequency domain data of the plurality of unit periods included in the second frequency domain data, is specifically configured to:

and constructing a second energy matrix according to the number of unit time intervals, the number of the target frequency band signals and the energy value of the target frequency band signals in the second frequency domain data.

a processor; and

Specifically, fig. 8 is a schematic structural diagram of a computer system 600 suitable for implementing the method of the present disclosure, and the system shown in fig. 8 implements the corresponding functions of the multi-screen terminal device and the storage medium.

As shown in fig. 8, the computer system 600 includes a Central Processing Unit (CPU)601 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM)602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data necessary for the operation of the system 600 are also stored. The CPU 601, ROM 602, and RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.

The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card, a modem, or the like. The communication section 609 performs communication processing via a network such as the internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 610 as necessary, so that a computer program read out therefrom is installed into the storage section 608 as necessary.

In particular, the above method processes may be implemented as a computer software program according to embodiments of the present disclosure. For example, embodiments of the present disclosure include a computer program product comprising a computer program tangibly embodied on a machine-readable medium, the computer program comprising program code for performing the above-described method. In such embodiments, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611.

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

It should be understood that the above embodiments are only examples for clearly illustrating the present invention, and are not intended to limit the present invention. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. It is neither necessary nor possible to enumerate all embodiments here. Obvious variations or modifications derived therefrom remain within the scope of the present disclosure.

Claims

1. A howling detection method, comprising:

for each frequency band signal in the first frequency domain data and the second frequency domain data, setting the energy value of the frequency band signal with the energy value larger than a first preset threshold value as a first value, and setting the energy value of the frequency band signal with the energy value not larger than the first preset threshold value as a second value, wherein the first value is larger than the second value;

2. The method according to claim 1, wherein the determining that howling occurs when the correlation characteristic satisfies a preset condition comprises:

3. The method of claim 1, wherein the first period of time and the second period of time each comprise a plurality of unit periods of time;

4. The method of claim 3, wherein the deriving a set of correlation features from the first energy matrix and the second energy matrix comprises:

5. The method of claim 3, wherein constructing a first energy matrix from the energy values of the frequency domain data for a plurality of unit periods included in the first frequency domain data comprises:

6. The method of claim 3, wherein constructing a second energy matrix from the energy values of the frequency domain data for a plurality of unit periods included in the second frequency domain data comprises:

7. A howling detection apparatus, comprising:

an energy value reduction module, configured to, for each frequency band signal in the first frequency domain data and the second frequency domain data, set an energy value of the frequency band signal having an energy value greater than a first preset threshold as a first value, and set an energy value of the frequency band signal having an energy value not greater than the first preset threshold as a second value, where the first value is greater than the second value;

8. The apparatus of claim 7, wherein the determining module is specifically configured to:

9. The apparatus of claim 7, wherein the first time period and the second time period each comprise a plurality of unit time periods; the processing module is specifically configured to:

10. The apparatus according to claim 9, wherein the processing module, when being configured to obtain the correlation feature set according to the first energy matrix and the second energy matrix, is specifically configured to:

11. The apparatus of claim 9, wherein the processing module, when configured to construct a first energy matrix from the energy values of the frequency domain data for a plurality of unit time periods included in the first frequency domain data, is specifically configured to:

12. The apparatus according to claim 9, wherein the processing module, when being configured to construct the second energy matrix from the energy values of the frequency domain data of the plurality of unit time periods comprised in the second frequency domain data, is specifically configured to:

13. A howling detection apparatus, comprising:

a processor; and

a memory communicatively coupled with the processor, the memory storing computer-readable instructions readable by the processor, wherein the processor, when reading the computer-readable instructions, performs the method of any one of claims 1 to 6.

14. A storage medium having stored thereon computer-readable instructions for causing a computer to perform the method of any one of claims 1 to 6.
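
For illustration only, the energy-value setting step recited in claims 1 and 7 can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the function name `binarize_band_energies`, the default values 1.0 and 0.0 for the first and second values, and the sample energies and threshold are all hypothetical. The claims require only that the first value be larger than the second value.

```python
from typing import List, Sequence


def binarize_band_energies(
    energies: Sequence[float],
    threshold: float,
    first_value: float = 1.0,   # assumed; the claims only require first > second
    second_value: float = 0.0,  # assumed
) -> List[float]:
    """Replace each band energy by one of two values according to a threshold.

    Bands whose energy is larger than the threshold get first_value; bands
    whose energy is not larger than the threshold get second_value.
    """
    if first_value <= second_value:
        raise ValueError("first_value must be larger than second_value")
    return [first_value if e > threshold else second_value for e in energies]


# Hypothetical band energies for one frame of the first and second
# frequency domain data, with a hypothetical first preset threshold of 1.0.
first_bands = binarize_band_energies([0.2, 5.0, 0.1, 7.5], threshold=1.0)
second_bands = binarize_band_energies([0.3, 4.2, 2.0, 0.4], threshold=1.0)
```

Because the comparison is strict, a band whose energy exactly equals the threshold is assigned the second value, matching the "not larger than" wording of the claims.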