WO2018014673A1 - Howling detection method and apparatus - Google Patents

Howling detection method and apparatus

Info

Publication number
WO2018014673A1
Authority
WO
WIPO (PCT)
Prior art keywords
howling
value
frequency point
analysis window
preset
Prior art date
Application number
PCT/CN2017/087878
Other languages
English (en)
French (fr)
Inventor
梁俊斌
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Priority to EP17830309.5A (EP3451697B1)
Publication of WO2018014673A1
Priority to US16/043,837 (US10339953B2)

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L21/0232 Processing in the frequency domain
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00 Circuits for transducers, loudspeakers or microphones
    • H04R3/02 Circuits for transducers, loudspeakers or microphones for preventing acoustic reaction, i.e. acoustic oscillatory feedback
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/21 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being power information
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/45 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of analysis window
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00 Circuits for transducers, loudspeakers or microphones
    • H04R3/04 Circuits for transducers, loudspeakers or microphones for correcting frequency response
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03G CONTROL OF AMPLIFICATION
    • H03G5/00 Tone control or bandwidth control in amplifiers
    • H03G5/16 Automatic control
    • H03G5/165 Equalizers; Volume or gain control in limited frequency bands

Definitions

  • The present application relates to the field of audio processing, and in particular to a howling detection method and apparatus.
  • Howling is a sharp, harsh sound that occurs in settings where a sound pickup (such as a microphone) is used.
  • Howling is generally caused when sound output by a playback device (such as a sound system or loudspeaker) is repeatedly captured by the pickup and fed back to the playback device.
  • The power amplifier of the playback device amplifies the returned sound and outputs it again, and this loop produces positive acoustic feedback.
  • Existing howling suppression schemes determine whether howling occurs by detecting the energy of the output signal, and then suppress the howling.
  • Embodiments of the present application provide a howling detection method and apparatus.
  • The sensitivity of the human ear to sound at different frequency points is incorporated into the howling detection scheme, so that the detection result is more accurate.
  • An audio signal is windowed to obtain a plurality of analysis windows, and the following processing is performed for at least one of the analysis windows:
  • a signal energy indication value is obtained for each preset frequency point in the analysis window, and the signal energy indication value of each frequency point is processed with the preset perceptual coefficient corresponding to that frequency point to obtain a perceptual energy indication value of each frequency point, where the perceptual coefficient corresponding to each frequency point indicates the sensitivity of the human ear to sound at that frequency point;
  • A windowing module is configured to window the audio signal to obtain a plurality of analysis windows.
  • A calculation module is configured to perform the following processing for at least one of the plurality of analysis windows: obtaining a signal energy indication value of each preset frequency point in the analysis window; and processing the signal energy indication value of each frequency point with the preset perceptual coefficient corresponding to that frequency point to obtain a perceptual energy indication value of each frequency point, where the perceptual coefficient corresponding to each frequency point represents the sensitivity of the human ear to that frequency point.
  • A determining module is configured to determine whether howling occurs according to the perceptual energy indication values of the frequency points in the at least one analysis window.
  • Because psychoacoustic perception is taken into account, the measured energy of each frequency point of the audio is weighted, which better matches the perceptual characteristics of the human ear and makes the howling detection result more accurate.
  • FIG. 1 is a flowchart of a howling detection method according to an embodiment of the present application.
  • FIG. 2 is a graph of the perceptual coefficients calculated in one example.
  • FIG. 3 is a flowchart of a howling detection method according to an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of a howling detection apparatus of the present invention.
  • In the embodiments, the sensitivity of the human ear to sounds of different frequencies is included in the detection scheme: the signal energy indication value of each frequency point of the audio signal is weighted, and howling detection is performed on the weighted signal energy indication values (hereinafter referred to as perceptual energy indication values).
  • This makes the detection result better match the auditory characteristics of the human ear, and thus more accurate.
  • FIG. 1 is a flowchart of a howling detection method according to an embodiment of the present application.
  • The method 10 may include the following steps.
  • In step S11, the audio signal is windowed to obtain a plurality of analysis windows.
  • In step S12, the following processing is performed for at least one of the analysis windows: a signal energy indication value is obtained for each preset frequency point in the analysis window, and the signal energy indication value of each frequency point is processed with the preset perceptual coefficient corresponding to that frequency point to obtain a perceptual energy indication value of each frequency point.
  • In step S13, whether howling occurs is determined according to the perceptual energy indication values of the frequency points in the at least one analysis window.
  • In this way, processing the measured energy of each frequency point according to psychoacoustic perception factors better matches the perceptual characteristics of the human ear and makes the howling detection result more accurate.
  • The schemes of the embodiments can be applied to various scenarios that use a sound pickup device and a playback device, such as audio and video calls, broadcasting, conferences, and live events that use loudspeakers.
  • In some embodiments, because some audio processing apparatuses can only process signals of limited length, the audio signal may be windowed in step S11 into successive segments, that is, a plurality of analysis windows, and only the audio signal within one analysis window is processed at a time.
  • Windowing usually uses an analysis window with a duration of 10 ms or 20 ms.
  • The window function may be a Hanning window, a Hamming window, or the like.
  • The signal energy indication value is a value that indicates the amount of energy in the audio signal.
  • In some examples, the signal energy indication value may be the signal energy, the signal power, or the like. Signal energy or power can be obtained by measuring the signal.
  • In some examples, the signal energy indication value may also be a value obtained by processing the signal energy or signal power with a predetermined algorithm. The specific algorithm can be set as needed and is not limited here.
  • Because howling is a subjective sensation, the human ear perceives sounds of the same energy but different frequencies differently. For example, some frequency points lie in a band to which the human ear is sensitive; although the measured sound energy at such a frequency point is not high, the human ear already perceives it clearly as howling.
  • In the embodiments, the signal energy indication value of each frequency point is weighted by the perceptual coefficient of that frequency point to obtain a perceptual energy indication value.
  • The perceptual energy indication value indicates how strong the sound is as perceived by the human ear.
  • Each frequency point corresponds to one frequency value or one frequency band.
  • For example, given frequency points 0, 1, 2, ..., M, where M is an integer greater than 1, frequency point 1 may correspond to the frequencies from 100 Hz to 200 Hz.
  • This is only an example; the number of frequency points selected in each embodiment may differ, and the frequency points may correspond to different frequency values or frequency bands.
  • The number of frequency points and the frequency value of each frequency point (when a frequency point corresponds to a band, the center frequency of that band) can be determined as needed. For example, more frequency points can be selected in the band to which the human ear is more sensitive. The more numerous and denser the selected frequency points, the more accurate the detection result, at the cost of more computation and higher processing complexity.
  • The perceptual coefficient of each frequency point indicates the sensitivity of the human ear to sound at that frequency point. It can be set empirically, determined by experiment, or determined in other ways.
  • In some examples, it may be specified that, within a frequency range to which the human ear is sensitive, such as 1000 Hz to 4000 Hz, for any pair of a first frequency point and a second frequency point, when the first frequency point is higher than the second frequency point, the perceptual coefficient corresponding to the first frequency point is greater than the perceptual coefficient corresponding to the second frequency point.
  • The value of each perceptual coefficient can be set as needed.
  • In some examples, the relationship between the perceptual coefficients and the frequency points follows the equal-loudness contours.
  • An equal-loudness contour describes the relationship between sound pressure level and sound frequency under equal-loudness conditions. Loudness indicates how loud a sound is heard to be. Loudness varies mainly with the intensity of the sound but is also affected by frequency; that is, sounds of the same intensity and different frequencies are perceived differently by the human ear.
  • The international acoustics standards organization has measured the equal-loudness contours, which give the sound pressure level that a pure tone at each frequency must reach to produce the same perceived loudness for the listener.
  • The perceptual coefficients can be set with reference to the equal-loudness contours. For example, they can be calculated from the psychoacoustic equal-loudness contour data of BS 3383, Specification for normal equal-loudness level contours for pure tones under free-field listening conditions.
  • The following is a calculation method that linearly interpolates existing equal-loudness contour data to obtain the loudness value at a preset frequency point.
  • afy(freq) = af(k-1)+(freq-ff(k-1))*(af(k)-af(k-1))/(ff(k)-ff(k-1));
  • bfy(freq) = bf(k-1)+(freq-ff(k-1))*(bf(k)-bf(k-1))/(ff(k)-ff(k-1));
  • cfy(freq) = cf(k-1)+(freq-ff(k-1))*(cf(k)-cf(k-1))/(ff(k)-ff(k-1));
  • loud(freq) = 4.2+afy*(dB-cfy)/(1+bfy*(dB-cfy));
  • cof(freq) = (10^loud(freq)/20)/1000;
  • freq is the frequency value of the frequency point whose perceptual coefficient is to be calculated (for example, the center frequency of the band corresponding to the frequency point);
  • k is a frequency index (that is, a frequency point value) in the existing equal-loudness contour data table.
  • Each frequency index in the equal-loudness contour data table corresponds to one frequency value; freq is less than or equal to the frequency value corresponding to index k in the table and greater than or equal to the frequency value corresponding to index k-1; ff, af, bf, and cf are data in the equal-loudness contour data table published in BS 3383; loud(freq) is the loudness at frequency point freq, and cof(freq) is the perceptual coefficient corresponding to frequency point freq.
  • FIG. 2 is a graph of the perceptual coefficients calculated in one example.
  • In some examples, in step S13, for each of the at least one analysis window, a howling indication value of the analysis window may be determined from the perceptual energy indication values of the frequency points in the analysis window and compared with a preset howling threshold. If, for a preset number of the at least one analysis window, the comparison between the howling indication value and the howling threshold meets a preset condition, it is determined that howling occurs.
  • The howling indication value indicates the probability that howling occurs.
  • The howling indication value may be calculated from the perceptual energy indication values of the frequency points using a predetermined algorithm; the specific algorithm is not limited here.
  • In some examples, a larger howling indication value indicates a greater likelihood of howling; in other examples, a smaller howling indication value indicates a greater likelihood of howling.
  • For example, the howling indication value can be the spectral entropy of the audio signal.
  • Specifically, the spectral entropy of the signal in the analysis window may be determined from the perceptual energy indication values of the frequency points in the analysis window, with the frequency points numbered m = 0, 1, 2, ..., M-1, where M is the total number of preset frequency points.
  • In other examples, the frequency points may be numbered differently.
  • For example, the frequency points may be numbered m = 1, 2, 3, ..., M, in which case the spectral entropy formula needs to be adjusted to match the numbering.
  • In this example, a smaller spectral entropy indicates a greater likelihood of howling, so the preset condition is that the spectral entropy is smaller than the howling threshold. The accuracy of the detection result may be increased by using the comparison results of several analysis windows: if the spectral entropy of the signal in a preset number of the at least one analysis window is smaller than the howling threshold, it is determined that howling occurs.
  • As another example, the howling indication value can be the peak-to-average ratio of the audio signal.
  • The peak-to-average ratio of the signal in the analysis window may be determined from the perceptual energy indication values of the frequency points in the analysis window as Rpm = Peak/PM, where Peak = Max(p'(m)) is the peak and PM is the mean of the perceptual energy indication values of the frequency points in the analysis window.
  • In this example, the comparison between the howling indication value and the howling threshold meets the preset condition when the peak-to-average ratio is greater than the howling threshold.
  • The accuracy of the detection result may be increased by using the comparison results of several analysis windows: if the peak-to-average ratio of the signal in a preset number of the at least one analysis window is greater than the howling threshold, it is determined that howling occurs.
  • When several analysis windows are used, they may be consecutive analysis windows or analysis windows separated by intervals.
  • As another example, a first count value and a second count value may be set, each starting at its own preset initial value. If the comparison between the howling indication value of the current analysis window and the howling threshold meets the preset condition, a preset first step value is added to the first count value. If the comparison does not meet the preset condition, a preset second step value is added to the second count value. When the first count value equals a preset first value, it is determined that howling occurs. When the second count value reaches a preset second value, the first count value is restored to its initial value. The first step value, the second step value, the first value, and the second value may be preset according to specific needs.
  • FIG. 3 is a flowchart of a howling detection method according to an embodiment of the present invention.
  • The method 30 may include the following steps.
  • Before the method is performed, a first count value and a second count value may be set, together with a variable i.
  • For example, the initial values of the first count value and the second count value may be set to 0, and the initial value of i may be set to 1.
  • In step S31, the input audio signal is windowed to obtain a multi-frame signal.
  • Here, one frame of signal is the signal in one analysis window.
  • In step S32, a fast Fourier transform (FFT) is performed on the i-th frame signal to obtain the signal energy value p(i, m) of each frequency point in the i-th frame signal, where m is the frequency point number, m = 0, 1, 2, ..., M-1, and M is the total number of frequency points.
  • In step S33, the signal energy value p(i, m) of each frequency point of the i-th frame signal is multiplied by the perceptual coefficient preset for that frequency point to obtain the perceptual energy value of each frequency point: p'(i, m) = p(i, m) * cof(fc(m)).
  • cof(fc(m)) is the perceptual coefficient corresponding to frequency point m.
  • fc(m) is the center frequency of frequency point m.
  • In step S34, the spectral entropy H(i) of the i-th frame signal is calculated.
  • Pd(i, m) is the probability density function of the m-th frequency point of the i-th frame signal, which can be calculated according to Formula 10.
  • In this example, the spectral entropy H(i) is used as the howling indication value.
  • In other examples, a value obtained by another calculation method may be used as the howling indication value; for example, the peak-to-average ratio of the energy of the frequency points may be calculated and used as the howling indication value.
  • The calculation method for the howling indication value can be designed as needed, and any feasible algorithm can be used; it is not limited here.
  • In step S35, it is determined whether the spectral entropy H(i) is smaller than the preset spectral entropy threshold T0. If yes, step S36 is performed; if no, step S38 is performed.
  • In step S36, 1 is added to the first count value, and it is determined whether the first count value equals the preset first value. If yes, step S37 is performed; if no, step S40 is performed.
  • In step S37, it is determined that howling occurs.
  • In step S38, 1 is added to the second count value, and it is determined whether the second count value equals the preset second value. If yes, step S39 is performed; if no, step S40 is performed.
  • In step S39, the first count value and the second count value are cleared.
  • In step S40, 1 is added to i, and step S32 is performed.
  • In the embodiments, the spectral entropy threshold T0, the preset first value, and the preset second value may be determined according to actual conditions or empirically.
  • For example, the preset first value may be 3, and the preset second value may be 5.
  • An embodiment of the invention further provides a howling detection apparatus.
  • FIG. 4 is a schematic diagram of a howling detection apparatus of the present invention.
  • The apparatus 40 may include a processor 41, a communication interface 44, a storage device 46, and a bus 49.
  • The storage device 46 includes an operating system 47, a communication module 48, and a howling detection module 43.
  • There may be one or more processors 41, located in the same physical device or distributed among multiple physical devices.
  • The howling detection apparatus 40 can obtain the input audio signal through the communication interface 44 and provide the howling detection result to other devices through the communication interface 44.
  • The howling detection module 43 may include a windowing module 431, a calculation module 432, and a determining module 433.
  • The windowing module 431 is configured to window the audio signal to obtain a plurality of analysis windows.
  • The calculation module 432 is configured to perform the following processing for at least one of the plurality of analysis windows: obtaining a signal energy indication value of each preset frequency point in the analysis window; and processing the signal energy indication value of each frequency point with the preset perceptual coefficient corresponding to that frequency point
  • to obtain a perceptual energy indication value of each frequency point, where the perceptual coefficient corresponding to each frequency point represents the sensitivity of the human ear to that frequency point.
  • The determining module 433 is configured to determine whether howling occurs according to the perceptual energy indication values of the frequency points in the at least one analysis window.
  • In some examples, the determining module 433 may: for each of the at least one analysis window, determine a howling indication value of the analysis window from the perceptual energy indication values of the frequency points in the analysis window;
  • the howling indication value indicates the probability that howling occurs; the module compares it with a preset howling threshold and determines that howling occurs if the comparison results of a preset number of analysis windows meet a preset condition.
  • In some examples, the determining module 433 may determine that howling occurs when the comparison results of a preset number of consecutive analysis windows meet the preset condition.
  • The determining module 433 may calculate the howling indication value using the methods of the embodiments; details are not repeated here.
  • In some examples, the determining module 433 may set a first count value and a second count value, add a preset first step value to the first count value when the comparison result of an analysis window meets the preset condition, and add a preset second step value to the second count value when it does not.
  • Determining that howling occurs includes: when the first count value equals a preset first value, determining that howling occurs.
  • When the second count value reaches a preset second value, the first count value is restored to its initial value.
  • The hardware modules in the embodiments may be implemented in hardware or as a hardware platform plus software.
  • The software includes machine-readable instructions stored in a non-volatile storage medium.
  • The embodiments can therefore also be embodied as software products.
  • In the examples, the hardware may be implemented by dedicated hardware or by hardware that executes machine-readable instructions.
  • For example, the hardware can be a specially designed permanent circuit or logic device (such as a dedicated processor, for example an FPGA or ASIC) for performing a particular operation.
  • The hardware may also include programmable logic devices or circuits (for example, a general-purpose processor or other programmable processor) that are temporarily configured by software to perform particular operations.
  • The machine-readable instructions corresponding to the modules in the figures may cause an operating system or the like running on a computer to perform some or all of the operations described herein.
  • The non-volatile computer-readable storage medium may be a memory provided in an expansion board inserted into the computer, or a memory provided in an expansion unit connected to the computer.
  • A CPU or the like installed on the expansion board or the expansion unit can perform some or all of the actual operations according to the instructions.
  • The non-volatile computer-readable storage medium includes a floppy disk, a hard disk, a magneto-optical disk, an optical disk (such as a CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM, DVD-RW, or DVD+RW), a magnetic tape, a non-volatile memory card, and a ROM.
  • Alternatively, the program code can be downloaded from a server computer over a communication network.

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Otolaryngology (AREA)
  • General Health & Medical Sciences (AREA)
  • Quality & Reliability (AREA)
  • Circuit For Audible Band Transducer (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)

Abstract

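This application discloses a howling detection method and apparatus. An audio signal is windowed to obtain a plurality of analysis windows. For at least one of the analysis windows, a signal energy indication value is obtained for each preset frequency point in the analysis window, and the signal energy indication value of each frequency point is processed with the preset perceptual coefficient corresponding to that frequency point to obtain a perceptual energy indication value of each frequency point. The perceptual coefficient corresponding to each frequency point represents the sensitivity of the human ear to sound at that frequency point. Whether howling occurs is determined according to the perceptual energy indication values of the frequency points in the at least one analysis window.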

Description

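Howling detection method and apparatus
Related application
This application claims priority to Chinese patent application No. 201610576227.X, entitled "一种啸叫检测方法和装置" (Howling detection method and apparatus), filed with the Chinese Patent Office on July 20, 2016, the entire contents of which are incorporated herein by reference.
Technical field
This application relates to the field of audio processing, and in particular to a howling detection method and apparatus.
Background
Howling is a sharp, harsh sound that occurs in settings where a sound pickup (such as a microphone) is used. Howling is generally a positive acoustic feedback phenomenon: sound output by a playback device (such as a sound system or loudspeaker) is repeatedly captured by the pickup and returned to the playback device, amplified by the playback device's power amplifier, and output again, over and over. Existing howling suppression schemes determine whether howling occurs by detecting the energy of the output signal, and then suppress the howling.
Summary
Embodiments of this application provide a howling detection method and apparatus that incorporate the sensitivity of the human ear to sound at different frequency points into the howling detection scheme, so that the detection result is more accurate.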
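A howling detection method according to embodiments of this application may include:
windowing an audio signal to obtain a plurality of analysis windows, and performing the following processing for at least one of the analysis windows:
obtaining a signal energy indication value of each preset frequency point in the analysis window; and
processing the signal energy indication value of each frequency point with the preset perceptual coefficient corresponding to that frequency point to obtain a perceptual energy indication value of each frequency point, where the perceptual coefficient corresponding to each frequency point represents the sensitivity of the human ear to sound at that frequency point; and
determining whether howling occurs according to the perceptual energy indication values of the frequency points in the at least one analysis window.
A howling detection apparatus according to embodiments of this application may include:
a windowing module, configured to window an audio signal to obtain a plurality of analysis windows;
a calculation module, configured to perform the following processing for at least one of the plurality of analysis windows: obtaining a signal energy indication value of each preset frequency point in the analysis window; and processing the signal energy indication value of each frequency point with the preset perceptual coefficient corresponding to that frequency point to obtain a perceptual energy indication value of each frequency point, where the perceptual coefficient corresponding to each frequency point represents the sensitivity of the human ear to that frequency point;
a determining module, configured to determine whether howling occurs according to the perceptual energy indication values of the frequency points in the at least one analysis window.
According to the technical solutions of the embodiments of this application, psychoacoustic perception is taken into account and the measured energy of each frequency point of the audio is weighted, which better matches the perceptual characteristics of the human ear and makes the howling detection result more accurate.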
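Brief description of the drawings
FIG. 1 is a flowchart of a howling detection method according to an embodiment of this application;
FIG. 2 is a graph of the perceptual coefficients calculated in one example;
FIG. 3 is a flowchart of a howling detection method according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a howling detection apparatus of the present invention.
Mode for carrying out the invention
For brevity and clarity, the solutions of the present invention are explained below by describing several representative embodiments. Not all implementations are shown here. The many details in the embodiments are only intended to aid understanding of the solutions, and implementations of the technical solutions need not be limited to these details. To avoid unnecessarily obscuring the solutions, some implementations are not described in detail and only a framework is given. Hereinafter, "including" means "including but not limited to", and "according to ..." means "according to at least ..., but not limited to only ...". "Including" in the specification and claims means including at least to some extent, and should be interpreted to mean that other features may be present in addition to the features mentioned afterwards.
The embodiments incorporate the sensitivity of the human ear to sounds of different frequencies into the detection scheme: the signal energy indication value of each frequency point of the audio signal is weighted, and howling detection is performed on the weighted signal energy indication values (hereinafter referred to as perceptual energy indication values), so that the detection result better matches the auditory characteristics of the human ear and is therefore more accurate.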
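FIG. 1 is a flowchart of a howling detection method according to an embodiment of this application. The method 10 may include the following steps.
Step S11: window the audio signal to obtain a plurality of analysis windows.
Step S12: perform the following processing for at least one of the analysis windows: obtain a signal energy indication value of each preset frequency point in the analysis window, and process the signal energy indication value of each frequency point with the preset perceptual coefficient corresponding to that frequency point to obtain a perceptual energy indication value of each frequency point.
Step S13: determine whether howling occurs according to the perceptual energy indication values of the frequency points in the at least one analysis window.
In this way, processing the measured energy of each frequency point of the audio according to psychoacoustic perception factors better matches the perceptual characteristics of the human ear and makes the howling detection result more accurate.
The schemes of the embodiments can be applied to various scenarios that use a sound pickup device and a playback device, such as audio and video calls, broadcasting, conferences, and live events that use loudspeakers.
In some embodiments, because some audio processing apparatuses can only process signals of limited length, the audio signal may be windowed in step S11 into successive segments, that is, a plurality of analysis windows, and only the audio signal within one analysis window may be processed at a time. Windowing usually uses an analysis window with a duration of 10 ms or 20 ms, and the window function may be a Hanning window, a Hamming window, or the like.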
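The windowing of step S11 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function names `hann` and `split_into_windows` are hypothetical, the windows are assumed not to overlap (the text does not say whether they do), and trailing samples that do not fill a whole window are simply dropped.

```python
import math


def hann(n):
    """Return a periodic Hann (Hanning) window of length n."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * k / n) for k in range(n)]


def split_into_windows(samples, sample_rate, window_ms=20):
    """Split samples into consecutive analysis windows of window_ms each.

    Assumption: windows do not overlap, and a trailing partial window
    is discarded. Each window is tapered with a Hann window.
    """
    size = int(sample_rate * window_ms / 1000)
    taper = hann(size)
    frames = []
    for start in range(0, len(samples) - size + 1, size):
        segment = samples[start:start + size]
        frames.append([s * w for s, w in zip(segment, taper)])
    return frames
```

At a sample rate of 8000 Hz, a 20 ms window holds 160 samples, so 400 input samples yield two full analysis windows.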
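The signal energy indication value is a value that indicates the amount of energy in the audio signal. In some examples, the signal energy indication value may be the signal energy, the signal power, or the like. Signal energy or power can be obtained by measuring the signal. In some examples, the signal energy indication value may also be a value obtained by processing the signal energy or signal power with a predetermined algorithm. The specific algorithm can be set as needed and is not limited here.
Because howling is a subjective sensation, the human ear perceives sounds of the same energy but different frequencies differently. For example, some frequency points lie in a band to which the human ear is sensitive; although the measured sound energy at such a frequency point is not high, the human ear already perceives it clearly as howling. The embodiments of this application weight the signal energy indication value of each frequency point with the perceptual coefficient of that frequency point to obtain a perceptual energy indication value. The perceptual energy indication value indicates how strong the sound is as perceived by the human ear.
Each frequency point corresponds to one frequency value or one frequency band. For example, given frequency points 0, 1, 2, ..., M, where M is an integer greater than 1, frequency point 1 may correspond to the frequencies from 100 Hz to 200 Hz. This is only an example; the number of frequency points selected in each embodiment may differ, and the frequency points may correspond to different frequency values or frequency bands. The number of frequency points and the frequency value of each frequency point (when a frequency point corresponds to a band, the center frequency of that band) can be determined as needed. For example, more frequency points can be selected in the band to which the human ear is more sensitive. The more numerous and denser the selected frequency points, the more accurate the detection result, at the cost of more computation and higher processing complexity.
The perceptual coefficient of each frequency point represents the sensitivity of the human ear to sound at that frequency point. It can be set empirically, determined by experiment, or determined in other ways.
In some examples, it may be specified that, within a frequency range to which the human ear is sensitive, for example 1000 Hz to 4000 Hz, for any pair of a first frequency point and a second frequency point, when the first frequency point is higher than the second frequency point, the perceptual coefficient corresponding to the first frequency point is greater than the perceptual coefficient corresponding to the second frequency point. The value of each perceptual coefficient can be set as needed.
In some examples, the relationship between the perceptual coefficients and the frequency points follows the equal-loudness contours. An equal-loudness contour describes the relationship between sound pressure level and sound frequency under equal-loudness conditions. Loudness indicates how loud a sound is heard to be. Loudness varies mainly with the intensity of the sound but is also affected by frequency; that is, sounds of the same intensity and different frequencies are perceived differently by the human ear. The international acoustics standards organization has measured the equal-loudness contours, which give the sound pressure level that a pure tone at each frequency must reach to produce the same perceived loudness for the listener. The perceptual coefficients can be set with reference to the equal-loudness contours. For example, they can be calculated from the psychoacoustic equal-loudness contour data of the BS 3383 standard, "BS 3383 Specification for normal equal-loudness level contours for pure tones under free-field listening conditions".
The following is a calculation method that linearly interpolates existing equal-loudness contour data to obtain the loudness value at a preset frequency point.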
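afy(freq)=af(k-1)+(freq-ff(k-1))*(af(k)-af(k-1))/(ff(k)-ff(k-1));       (Formula 1)
bfy(freq)=bf(k-1)+(freq-ff(k-1))*(bf(k)-bf(k-1))/(ff(k)-ff(k-1));       (Formula 2)
cfy(freq)=cf(k-1)+(freq-ff(k-1))*(cf(k)-cf(k-1))/(ff(k)-ff(k-1));       (Formula 3)
loud(freq)=4.2+afy*(dB-cfy)/(1+bfy*(dB-cfy));       (Formula 4)
cof(freq)=(10^loud(freq)/20)/1000;       (Formula 5)
Here, freq is the frequency value of the frequency point whose perceptual coefficient is to be calculated (for example, the center frequency of the band corresponding to the frequency point); k is a frequency index (that is, a frequency point value) in the existing equal-loudness contour data table, and each frequency index in the table corresponds to one frequency value; freq is less than or equal to the frequency value corresponding to index k in the table and greater than or equal to the frequency value corresponding to index k-1; ff, af, bf, and cf are data in the equal-loudness contour data table published in BS 3383; loud(freq) is the loudness at frequency point freq, and cof(freq) is the perceptual coefficient corresponding to frequency point freq. FIG. 2 is a graph of the perceptual coefficients calculated in one example.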
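Formulas 1 to 5 can be sketched in code as follows. This is an illustrative reading under stated assumptions, not the authors' implementation: the function names are hypothetical; the tables `ff`, `af`, `bf`, and `cf` must be supplied by the caller (the BS 3383 values are not reproduced here); `level_db` stands for the `dB` term of Formula 4, which the text does not define further; and Formula 5 is read as 10^(loud/20)/1000, which is an assumption about how the original expression is grouped.

```python
import bisect


def interpolate(freq, ff, values):
    """Linearly interpolate a table column at freq (Formulas 1 to 3).

    ff must be sorted in ascending order, hold at least two entries,
    and satisfy ff[0] <= freq <= ff[-1].
    """
    if not ff[0] <= freq <= ff[-1]:
        raise ValueError("freq lies outside the table range")
    # Choose k so that ff[k-1] <= freq <= ff[k].
    k = max(1, bisect.bisect_left(ff, freq))
    t = (freq - ff[k - 1]) / (ff[k] - ff[k - 1])
    return values[k - 1] + t * (values[k] - values[k - 1])


def perceptual_coefficient(freq, level_db, ff, af, bf, cf):
    """Return cof(freq) from interpolated table data (Formulas 4 and 5)."""
    afy = interpolate(freq, ff, af)
    bfy = interpolate(freq, ff, bf)
    cfy = interpolate(freq, ff, cf)
    loud = 4.2 + afy * (level_db - cfy) / (1.0 + bfy * (level_db - cfy))
    # Assumed grouping of Formula 5: 10 ** (loud / 20) / 1000.
    return 10.0 ** (loud / 20.0) / 1000.0
```

In practice the coefficients would be computed once for the center frequency of every preset frequency point and stored, since they do not depend on the input signal.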
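In some examples, in step S13, for each of the at least one analysis window, a howling indication value of the analysis window may be determined from the perceptual energy indication values of the frequency points in the analysis window, and the howling indication value of the analysis window may be compared with a preset howling threshold. If, for a preset number of the at least one analysis window, the comparison between the howling indication value and the howling threshold meets a preset condition, it is determined that howling occurs.
The howling indication value indicates the probability that howling occurs. It may be calculated from the perceptual energy indication values of the frequency points using a predetermined algorithm; the specific algorithm is not limited here. In some examples, a larger howling indication value indicates a greater likelihood of howling; in other examples, a smaller howling indication value indicates a greater likelihood of howling.
For example, the howling indication value can be the spectral entropy of the audio signal. Specifically, the spectral entropy of the signal in the analysis window may be determined from the perceptual energy indication values of the frequency points in the analysis window according to the following formula:
Figure PCTCN2017087878-appb-000001
where
Figure PCTCN2017087878-appb-000002
is the probability density function of the signal in the analysis window at the m-th frequency point; p'(m)=p(m)×cof(fc(m)) is the perceptual energy indication value of the signal in the analysis window at frequency point m; p(m) is the signal energy indication value of frequency point m in the analysis window; cof(fc(m)) is the perceptual coefficient corresponding to frequency point m; fc(m) is the center frequency of frequency point m; m=0,1,2,…,M-1, j=0,1,2,…,M-1, and M is the total number of preset frequency points. That is, when there are M frequency points in total, the frequency points may be numbered m = 0, 1, 2, ..., M-1 in order.
In other examples, the frequency points may be numbered differently; for example, they may be numbered m = 1, 2, 3, ..., M in order, in which case the above formula needs to be adjusted to match the numbering.
In this example, a smaller spectral entropy indicates a greater likelihood of howling. The preset condition for the comparison between the howling indication value and the howling threshold is therefore that the spectral entropy is smaller than the howling threshold. In some examples, if the spectral entropy of the signal in the analysis window is smaller than the howling threshold, it is determined that howling occurs. In other examples, the accuracy of the detection result may be increased by using the comparison results of several analysis windows: if the spectral entropy of the signal in a preset number of the at least one analysis window is smaller than the howling threshold, it is determined that howling occurs.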
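The spectral entropy indicator can be sketched as follows. Because the formula images are not reproduced in this text, the sketch assumes the usual definition: Pd(m) is p'(m) divided by the sum of p'(j) over all frequency points, and the entropy is the negative sum of Pd(m) log Pd(m), with the natural logarithm. The function name and the handling of an all-zero window are likewise assumptions.

```python
import math


def spectral_entropy(energies, coefficients):
    """Spectral entropy of one analysis window after perceptual weighting.

    energies[m] is p(m) and coefficients[m] is cof(fc(m)); their
    product is the perceptual energy indication value p'(m).
    """
    weighted = [p * c for p, c in zip(energies, coefficients)]
    total = sum(weighted)
    if total <= 0.0:
        # Assumption: treat a silent window as maximally flat so that
        # silence is never mistaken for a howling tone.
        return math.log(len(weighted))
    entropy = 0.0
    for value in weighted:
        if value > 0.0:
            pd = value / total
            entropy -= pd * math.log(pd)
    return entropy
```

A flat spectrum gives the maximum value log(M), while energy concentrated in a single frequency point gives 0, which is why a value below the threshold points to howling.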
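As another example, the howling indication value can be the peak-to-average ratio of the audio signal. Specifically, the peak-to-average ratio of the signal in the analysis window may be determined from the perceptual energy indication values of the frequency points in the analysis window:
Rpm=Peak/PM       (Formula 7)
where Peak=Max(p'(m)) is the peak of the perceptual energy indication values of the frequency points in the analysis window; p'(m)=p(m)×cof(fc(m)) is the perceptual energy indication value of the signal in the analysis window at frequency point m; p(m) is the signal energy indication value of frequency point m in the analysis window; cof(fc(m)) is the perceptual coefficient corresponding to frequency point m, and fc(m) is the center frequency of frequency point m;
Figure PCTCN2017087878-appb-000003
is the mean of the perceptual energy indication values of the frequency points in the analysis window; m=0,1,2,…,M-1, j=0,1,2,…,M-1, and M is the total number of preset frequency points.
In this example, a larger peak-to-average ratio indicates a greater likelihood of howling. The preset condition for the comparison between the howling indication value and the howling threshold is therefore that the peak-to-average ratio is greater than the howling threshold. In some examples, if the peak-to-average ratio of the signal in the analysis window is greater than the howling threshold, it is determined that howling occurs. In other examples, the accuracy of the detection result may be increased by using the comparison results of several analysis windows: if the peak-to-average ratio of the signal in a preset number of the at least one analysis window is greater than the howling threshold, it is determined that howling occurs.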
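The peak-to-average ratio of Formula 7 can be sketched in the same style. The function name is hypothetical, PM is taken to be the arithmetic mean of the perceptual energy indication values as the text describes, and returning 0.0 for an all-zero window is an assumption made so that silence never exceeds the threshold.

```python
def peak_to_average_ratio(energies, coefficients):
    """Rpm = Peak / PM over the perceptual energies of one analysis window."""
    weighted = [p * c for p, c in zip(energies, coefficients)]
    mean = sum(weighted) / len(weighted)
    if mean == 0.0:
        # Assumption: a silent window has no peak to speak of.
        return 0.0
    return max(weighted) / mean
```

A flat spectrum gives a ratio of 1, and a single dominant frequency point among M points gives a ratio close to M, so the threshold would lie between these extremes.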
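When whether howling occurs is determined from the comparison results of several analysis windows, these may be consecutive analysis windows or analysis windows separated by intervals.
For example, when the comparison results of a preset number of consecutive analysis windows meet the preset condition, it is determined that howling occurs.
As another example, a first count value and a second count value may be set, each starting at its own preset initial value. If the comparison between the howling indication value of the current analysis window and the howling threshold meets the preset condition, a preset first step value is added to the first count value. If the comparison between the howling indication value of the analysis window and the howling threshold does not meet the preset condition, a preset second step value is added to the second count value. When the first count value equals a preset first value, it is determined that howling occurs. When the second count value reaches a preset second value, the first count value is restored to its initial value. The first step value, the second step value, the first value, and the second value may be preset according to specific needs.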
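The two-counter decision described above can be sketched as a small class. The class name `HowlingCounter` and its parameter names are hypothetical. Two points go beyond the paragraph and are assumptions: both counters are cleared when the second count reaches its value (as step S39 of FIG. 3 does, whereas the paragraph only mentions the first count), and both are cleared again after a detection so that monitoring can continue.

```python
class HowlingCounter:
    """Accumulate per-window comparison results into a howling decision."""

    def __init__(self, first_value=3, second_value=5,
                 first_step=1, second_step=1):
        self.first_value = first_value
        self.second_value = second_value
        self.first_step = first_step
        self.second_step = second_step
        self.reset()

    def reset(self):
        """Restore both counters to their initial value (assumed to be 0)."""
        self.first_count = 0
        self.second_count = 0

    def update(self, condition_met):
        """Feed one window's result; return True when howling is declared."""
        if condition_met:
            self.first_count += self.first_step
            if self.first_count == self.first_value:
                self.reset()  # assumption: start over after a detection
                return True
        else:
            self.second_count += self.second_step
            if self.second_count >= self.second_value:
                self.reset()
        return False
```

With the example values 3 and 5, three windows that meet the condition declare howling unless five windows that miss it occur first, so the matching windows do not have to be consecutive.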
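FIG. 3 is a flowchart of a howling detection method according to an embodiment of the present invention. The method 30 may include the following steps.
Before the method is performed, a first count value and a second count value may be set, together with a variable i. For example, the initial values of the first count value and the second count value may be set to 0, and the initial value of i may be set to 1.
Step S31: window the input audio signal to obtain a multi-frame signal.
Here, one frame of signal is the signal in one analysis window.
Step S32: perform a fast Fourier transform (FFT) on the i-th frame signal to obtain the signal energy value p(i,m) of each frequency point in the i-th frame signal, where m is the frequency point number, m=0,1,2,…,M-1, and M is the total number of frequency points.
Step S33: multiply the signal energy value p(i,m) of each frequency point of the i-th frame signal by the perceptual coefficient preset for that frequency point to obtain the perceptual energy value p'(i,m) of each frequency point.
p'(i,m)=p(i,m)*cof(fc(m))    (Formula 8)
where cof(fc(m)) is the perceptual coefficient corresponding to frequency point m, and fc(m) is the center frequency of frequency point m.
Step S34: calculate the spectral entropy H(i) of the i-th frame signal.
Figure PCTCN2017087878-appb-000004
where Pd(i,m) is the probability density function of the m-th frequency point of the i-th frame signal, which can be calculated according to Formula 10:
Figure PCTCN2017087878-appb-000005
In this example, the spectral entropy H(i) is used as the howling indication value. In other examples, a value obtained by another calculation method may be used as the howling indication value; for example, the peak-to-average ratio of the energy of the frequency points may be calculated and used as the howling indication value. The calculation method for the howling indication value can be designed as needed, and any feasible algorithm can be used; it is not limited here.
Step S35: determine whether the spectral entropy H(i) is smaller than the preset spectral entropy threshold T0. If yes, perform step S36; if no, perform step S38.
Step S36: add 1 to the first count value and determine whether the first count value equals the preset first value. If yes, perform step S37; if no, perform step S40.
Step S37: determine that howling occurs.
Step S38: add 1 to the second count value and determine whether the second count value equals the preset second value. If yes, perform step S39; if no, perform step S40.
Step S39: clear the first count value and the second count value.
Step S40: add 1 to i and perform step S32.
In the embodiments, the spectral entropy threshold T0, the preset first value, and the preset second value may be determined according to actual conditions or empirically. For example, the preset first value may be 3, and the preset second value may be 5.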
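Steps S32 to S40 can be put together in one loop, sketched below on frames that have already been windowed (step S31). This is an illustration under assumptions rather than the authors' code: a naive DFT stands in for the FFT so that the block needs only the standard library, the M frequency points are taken to be the first half of the DFT bins, the spectral entropy uses the usual natural-logarithm definition because the formula images are not reproduced, and all function names are hypothetical.

```python
import cmath
import math


def frame_power_spectrum(frame):
    """p(i, m) for m = 0..M-1, computed with a naive DFT (step S32)."""
    n = len(frame)
    spectrum = []
    for m in range(n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * m * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_entropy(values):
    """H(i) over perceptual energies; assumed natural-log definition."""
    total = sum(values)
    if total <= 0.0:
        return math.log(len(values))  # assumption: silence counts as flat
    return -sum((v / total) * math.log(v / total) for v in values if v > 0.0)


def detect_howling(frames, coefficients, threshold,
                   first_value=3, second_value=5):
    """Return the 0-based index of the frame where howling is declared.

    Returns None if howling is never declared.
    """
    first_count = second_count = 0
    for i, frame in enumerate(frames):                 # S40 advances i
        power = frame_power_spectrum(frame)            # S32
        perceived = [p * c for p, c in zip(power, coefficients)]  # S33
        if spectral_entropy(perceived) < threshold:    # S34, S35
            first_count += 1                           # S36
            if first_count == first_value:
                return i                               # S37
        else:
            second_count += 1                          # S38
            if second_count == second_value:
                first_count = second_count = 0         # S39
    return None
```

A real implementation would replace the naive DFT with an FFT routine and take the coefficients from the equal-loudness data; the threshold of 1.0 used in the test below is arbitrary.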
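An embodiment of the present invention further provides a howling detection apparatus. FIG. 4 is a schematic diagram of a howling detection apparatus of the present invention. The apparatus 40 may include a processor 41, a communication interface 44, a storage device 46, and a bus 49. The storage device 46 includes an operating system 47, a communication module 48, and a howling detection module 43.
There may be one or more processors 41, located in the same physical device or distributed among multiple physical devices.
The howling detection apparatus 40 can obtain the input audio signal through the communication interface 44 and provide the howling detection result to other devices through the communication interface 44.
The howling detection module 43 may include a windowing module 431, a calculation module 432, and a determining module 433.
The windowing module 431 is configured to window the audio signal to obtain a plurality of analysis windows.
The calculation module 432 is configured to perform the following processing for at least one of the plurality of analysis windows: obtaining a signal energy indication value of each preset frequency point in the analysis window; and processing the signal energy indication value of each frequency point with the preset perceptual coefficient corresponding to that frequency point to obtain a perceptual energy indication value of each frequency point, where the perceptual coefficient corresponding to each frequency point represents the sensitivity of the human ear to that frequency point.
The determining module 433 is configured to determine whether howling occurs according to the perceptual energy indication values of the frequency points in the at least one analysis window.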
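In some examples, the determining module 433 may:
for each of the at least one analysis window, determine a howling indication value of the analysis window from the perceptual energy indication values of the frequency points in the analysis window, the howling indication value indicating the probability that howling occurs;
compare the howling indication value of the analysis window with a preset howling threshold;
if the comparison results of a preset number of the at least one analysis window meet a preset condition, determine that howling occurs.
In some examples, the determining module 433 may:
when the comparison results of a preset number of consecutive analysis windows meet the preset condition, determine that howling occurs.
The determining module 433 may calculate the howling indication value using the methods of the embodiments; details are not repeated here.
In some examples, the determining module 433 may:
set a first count value and a second count value, each starting at its own preset initial value;
if the comparison between the howling indication value of the analysis window and the howling threshold meets the preset condition, add a preset first step value to the first count value;
if the comparison between the howling indication value of the analysis window and the howling threshold does not meet the preset condition, add a preset second step value to the second count value;
where determining that howling occurs includes:
when the first count value equals a preset first value, determining that howling occurs;
when the second count value reaches a preset second value, restoring the first count value to its initial value.
For the specific functions of the calculation module 432 and the determining module 433, refer to the corresponding method steps above; details are not repeated here.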
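It should be noted that not all steps and modules in the above flows and structural diagrams are required; some steps or modules may be omitted as needed. The execution order of the steps is not fixed and may be adjusted as needed. The division into modules is merely a functional division adopted for ease of description. In an actual implementation, one module may be implemented by multiple modules, and the functions of multiple modules may be implemented by a single module; these modules may be located in the same device or in different devices. In addition, "first" and "second" in the above description are used only to distinguish two objects with the same meaning and do not indicate any substantive difference.
The hardware modules in the embodiments may be implemented in hardware or by a hardware platform plus software. The software includes machine-readable instructions stored in a non-volatile storage medium. The embodiments may therefore also be embodied as a software product.
In the examples, the hardware may be implemented by dedicated hardware or by hardware executing machine-readable instructions. For example, the hardware may be a specially designed permanent circuit or logic device (such as a special-purpose processor, e.g., an FPGA or ASIC) for performing specific operations. The hardware may also include a programmable logic device or circuit temporarily configured by software (for example, including a general-purpose processor or another programmable processor) for performing specific operations.
The machine-readable instructions corresponding to the modules in the figures may cause an operating system or the like running on a computer to perform some or all of the operations described here. The non-volatile computer-readable storage medium may be a memory provided in an expansion board inserted into the computer, or a memory provided in an expansion unit connected to the computer. A CPU or the like mounted on the expansion board or expansion unit may perform some or all of the actual operations according to the instructions.
Non-volatile computer-readable storage media include floppy disks, hard disks, magneto-optical disks, optical discs (such as CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM, DVD-RW, and DVD+RW), magnetic tapes, non-volatile memory cards, and ROM. Alternatively, the program code may be downloaded from a server computer over a communication network.
In summary, the scope of the claims should not be limited to the implementations in the examples described above; rather, the specification should be taken as a whole and given the broadest interpretation.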

Claims (20)

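  1. A howling detection method, comprising:
    performing windowing on an audio signal to obtain multiple analysis windows, and performing the following processing on at least one of the analysis windows:
    obtaining a signal energy indication value at each preset frequency point in the analysis window; and
    computing on the signal energy indication value of each frequency point using a preset perceptual coefficient corresponding to that frequency point, to obtain a perceptual energy indication value of each frequency point, wherein the perceptual coefficient corresponding to each frequency point represents the sensitivity of the human ear to sound at that frequency point;
    determining whether howling occurs according to the perceptual energy indication values of the frequency points in the at least one analysis window.
  2. The method according to claim 1, wherein,
    within the frequency range to which the human ear is sensitive, for any pair of a first frequency point and a second frequency point, when the first frequency point is higher than the second frequency point, the perceptual coefficient corresponding to the first frequency point is greater than the perceptual coefficient corresponding to the second frequency point.
  3. The method according to claim 1, wherein the relationship between the perceptual coefficients and the frequency points conforms to the equal-loudness contours.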
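  4. The method according to claim 1, wherein determining whether howling occurs according to the perceptual energy indication values of the frequency points in the at least one analysis window comprises:
    for each of the at least one analysis window, determining a howling indication value of the analysis window according to the perceptual energy indication values of the frequency points in the analysis window, the howling indication value indicating the probability that howling occurs, and comparing the howling indication value of the analysis window with a preset howling threshold;
    if the comparison results of a preset number of the at least one analysis window meet a preset condition, determining that howling occurs.
  5. The method according to claim 4, wherein determining that howling occurs comprises:
    determining that howling occurs when the comparison results of a preset number of consecutive analysis windows meet the preset condition.
  6. The method according to claim 4, further comprising:
    setting a first count value and a second count value, each initialized to its own preset initial value;
    if the comparison result between the howling indication value of the analysis window and the howling threshold meets the preset condition, adding a preset first step value to the first count value;
    if the comparison result between the howling indication value of the analysis window and the howling threshold does not meet the preset condition, adding a preset second step value to the second count value;
    wherein determining that howling occurs comprises:
    determining that howling occurs when the first count value equals a preset first value;
    when the second count value reaches a preset second value, restoring the first count value to its initial value.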
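  7. The method according to claim 4, wherein determining the howling indication value of the analysis window according to the perceptual energy indication values of the frequency points in the analysis window comprises:
    determining the spectral entropy of the signal in the analysis window from the perceptual energy indication values of the frequency points in the analysis window according to the following formula:
    H = -\sum_{m=0}^{M-1} Pd(m) \log Pd(m)
    where
    Pd(m) = \frac{p'(m)}{\sum_{j=0}^{M-1} p'(j)}
    denotes the probability density function of the signal in the analysis window at the m-th frequency point; p'(m)=p(m)×cof(fc(m)) denotes the perceptual energy indication value of the signal in the analysis window at frequency point m; p(m) denotes the signal energy indication value at frequency point m in the analysis window; cof(fc(m)) denotes the perceptual coefficient corresponding to frequency point m; fc(m) denotes the center frequency of frequency point m; m=0,1,2,…,M-1; j=0,1,2,…,M-1; and M is the total number of preset frequency points;
    wherein determining that howling occurs if the comparison results between the howling indication values of a preset number of the at least one analysis window and the howling threshold meet the preset condition comprises:
    determining that howling occurs if the spectral entropy of the signal is less than the howling threshold in a preset number of the at least one analysis window.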
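  8. The method according to claim 4, wherein determining the howling indication value of the analysis window according to the perceptual energy indication values of the frequency points in the analysis window comprises:
    determining the peak-to-average ratio of the signal in the analysis window from the perceptual energy indication values of the frequency points in the analysis window:
    Rpm=Peak/PM,
    where Peak=Max(p'(m)) denotes the peak of the perceptual energy indication values of the frequency points in the analysis window; p'(m)=p(m)×cof(fc(m)) denotes the perceptual energy indication value of the signal in the analysis window at frequency point m; p(m) denotes the signal energy indication value at frequency point m in the analysis window; cof(fc(m)) denotes the perceptual coefficient corresponding to frequency point m; fc(m) denotes the center frequency of frequency point m;
    PM = \frac{1}{M} \sum_{j=0}^{M-1} p'(j)
    denotes the mean of the perceptual energy indication values of the frequency points in the analysis window; m=0,1,2,…,M-1; j=0,1,2,…,M-1; and M is the total number of preset frequency points;
    wherein determining that howling occurs if the comparison results between the howling indication values of a preset number of the at least one analysis window and the howling threshold meet the preset condition comprises:
    determining that howling occurs if the peak-to-average ratio of the signal is greater than the howling threshold in a preset number of the at least one analysis window.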
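  9. A howling detection apparatus, comprising a processor and a memory, the memory storing a series of computer-readable instructions that can cause the processor to perform the following operations:
    performing windowing on an audio signal to obtain multiple analysis windows;
    performing the following processing on at least one analysis window of the multiple analysis windows: obtaining a signal energy indication value at each preset frequency point in the analysis window; and computing on the signal energy indication value of each frequency point using a preset perceptual coefficient corresponding to that frequency point, to obtain a perceptual energy indication value of each frequency point, wherein the perceptual coefficient corresponding to each frequency point represents the sensitivity of the human ear to that frequency point;
    determining whether howling occurs according to the perceptual energy indication values of the frequency points in the at least one analysis window.
  10. The apparatus according to claim 9, wherein the computer-readable instructions can cause the processor to perform the following operations:
    for each of the at least one analysis window, determining a howling indication value of the analysis window according to the perceptual energy indication values of the frequency points in the analysis window, the howling indication value indicating the probability that howling occurs;
    comparing the howling indication value of the analysis window with a preset howling threshold;
    if the comparison results of a preset number of the at least one analysis window meet a preset condition, determining that howling occurs.
  11. The apparatus according to claim 10, wherein the computer-readable instructions can cause the processor to perform the following operation:
    determining that howling occurs when the comparison results of a preset number of consecutive analysis windows meet the preset condition.
  12. The apparatus according to claim 10, wherein the computer-readable instructions can cause the processor to perform the following operations:
    setting a first count value and a second count value, each initialized to its own preset initial value;
    if the comparison result between the howling indication value of the analysis window and the howling threshold meets the preset condition, adding a preset first step value to the first count value;
    if the comparison result between the howling indication value of the analysis window and the howling threshold does not meet the preset condition, adding a preset second step value to the second count value;
    wherein determining that howling occurs comprises:
    determining that howling occurs when the first count value equals a preset first value;
    when the second count value reaches a preset second value, restoring the first count value to its initial value.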
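  13. A computer-readable storage medium storing a series of computer-readable instructions, wherein the computer-readable instructions can cause one or more processors to perform the following operations:
    performing windowing on an audio signal to obtain multiple analysis windows, and performing the following processing on at least one of the analysis windows:
    obtaining a signal energy indication value at each preset frequency point in the analysis window; and
    computing on the signal energy indication value of each frequency point using a preset perceptual coefficient corresponding to that frequency point, to obtain a perceptual energy indication value of each frequency point, wherein the perceptual coefficient corresponding to each frequency point represents the sensitivity of the human ear to sound at that frequency point;
    determining whether howling occurs according to the perceptual energy indication values of the frequency points in the at least one analysis window.
  14. The storage medium according to claim 13, wherein,
    within the frequency range to which the human ear is sensitive, for any pair of a first frequency point and a second frequency point, when the first frequency point is higher than the second frequency point, the perceptual coefficient corresponding to the first frequency point is greater than the perceptual coefficient corresponding to the second frequency point.
  15. The storage medium according to claim 13, wherein the relationship between the perceptual coefficients and the frequency points conforms to the equal-loudness contours.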
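  16. The storage medium according to claim 13, wherein determining whether howling occurs according to the perceptual energy indication values of the frequency points in the at least one analysis window comprises:
    for each of the at least one analysis window, determining a howling indication value of the analysis window according to the perceptual energy indication values of the frequency points in the analysis window, the howling indication value indicating the probability that howling occurs, and comparing the howling indication value of the analysis window with a preset howling threshold;
    if the comparison results of a preset number of the at least one analysis window meet a preset condition, determining that howling occurs.
  17. The storage medium according to claim 16, wherein determining that howling occurs comprises:
    determining that howling occurs when the comparison results of a preset number of consecutive analysis windows meet the preset condition.
  18. The storage medium according to claim 16, wherein the computer-readable instructions can cause one or more processors to perform the following operations:
    setting a first count value and a second count value, each initialized to its own preset initial value;
    if the comparison result between the howling indication value of the analysis window and the howling threshold meets the preset condition, adding a preset first step value to the first count value;
    if the comparison result between the howling indication value of the analysis window and the howling threshold does not meet the preset condition, adding a preset second step value to the second count value;
    wherein determining that howling occurs comprises:
    determining that howling occurs when the first count value equals a preset first value;
    when the second count value reaches a preset second value, restoring the first count value to its initial value.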
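  19. The storage medium according to claim 16, wherein determining the howling indication value of the analysis window according to the perceptual energy indication values of the frequency points in the analysis window comprises:
    determining the spectral entropy of the signal in the analysis window from the perceptual energy indication values of the frequency points in the analysis window according to the following formula:
    H = -\sum_{m=0}^{M-1} Pd(m) \log Pd(m)
    where
    Pd(m) = \frac{p'(m)}{\sum_{j=0}^{M-1} p'(j)}
    denotes the probability density function of the signal in the analysis window at the m-th frequency point; p'(m)=p(m)×cof(fc(m)) denotes the perceptual energy indication value of the signal in the analysis window at frequency point m; p(m) denotes the signal energy indication value at frequency point m in the analysis window; cof(fc(m)) denotes the perceptual coefficient corresponding to frequency point m; fc(m) denotes the center frequency of frequency point m; m=0,1,2,…,M-1; j=0,1,2,…,M-1; and M is the total number of preset frequency points;
    wherein determining that howling occurs if the comparison results between the howling indication values of a preset number of the at least one analysis window and the howling threshold meet the preset condition comprises:
    determining that howling occurs if the spectral entropy of the signal is less than the howling threshold in a preset number of the at least one analysis window.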
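  20. The storage medium according to claim 16, wherein determining the howling indication value of the analysis window according to the perceptual energy indication values of the frequency points in the analysis window comprises:
    determining the peak-to-average ratio of the signal in the analysis window from the perceptual energy indication values of the frequency points in the analysis window:
    Rpm=Peak/PM,
    where Peak=Max(p'(m)) denotes the peak of the perceptual energy indication values of the frequency points in the analysis window; p'(m)=p(m)×cof(fc(m)) denotes the perceptual energy indication value of the signal in the analysis window at frequency point m; p(m) denotes the signal energy indication value at frequency point m in the analysis window; cof(fc(m)) denotes the perceptual coefficient corresponding to frequency point m; fc(m) denotes the center frequency of frequency point m;
    PM = \frac{1}{M} \sum_{j=0}^{M-1} p'(j)
    denotes the mean of the perceptual energy indication values of the frequency points in the analysis window; m=0,1,2,…,M-1; j=0,1,2,…,M-1; and M is the total number of preset frequency points;
    wherein determining that howling occurs if the comparison results between the howling indication values of a preset number of the at least one analysis window and the howling threshold meet the preset condition comprises:
    determining that howling occurs if the peak-to-average ratio of the signal is greater than the howling threshold in a preset number of the at least one analysis window.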
PCT/CN2017/087878 2016-07-20 2017-06-12 Howling detection method and apparatus WO2018014673A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP17830309.5A EP3451697B1 (en) 2016-07-20 2017-06-12 Method and device for howling detection
US16/043,837 US10339953B2 (en) 2016-07-20 2018-07-24 Howling detection method and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610576227.XA CN107645696B (zh) 2016-07-20 2016-07-20 Howling detection method and apparatus
CN201610576227.X 2016-07-20

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/043,837 Continuation US10339953B2 (en) 2016-07-20 2018-07-24 Howling detection method and apparatus

Publications (1)

Publication Number Publication Date
WO2018014673A1 (zh) 2018-01-25

Family

ID=60991883

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/087878 WO2018014673A1 (zh) 2016-07-20 2017-06-12 Howling detection method and apparatus

Country Status (4)

Country Link
US (1) US10339953B2 (zh)
EP (1) EP3451697B1 (zh)
CN (1) CN107645696B (zh)
WO (1) WO2018014673A1 (zh)

Also Published As

Publication number Publication date
EP3451697A4 (en) 2019-05-01
EP3451697B1 (en) 2021-03-03
EP3451697A1 (en) 2019-03-06
CN107645696A (zh) 2018-01-30
US20180330744A1 (en) 2018-11-15
CN107645696B (zh) 2019-04-19
US10339953B2 (en) 2019-07-02

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 17830309; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 2017830309; Country of ref document: EP; Effective date: 20181128)
NENP Non-entry into the national phase (Ref country code: DE)