WO2021248523A1 - Airflow noise elimination method and apparatus, computer device, and storage medium - Google Patents


Info

Publication number
WO2021248523A1
Authority
WO
WIPO (PCT)
Prior art keywords
signal
initial signal
audio
initial
airflow noise
Prior art date
Application number
PCT/CN2020/096686
Other languages
French (fr)
Chinese (zh)
Inventor
吴锐兴
田晓晖
叶利剑
Original Assignee
瑞声声学科技(深圳)有限公司
瑞声科技(新加坡)有限公司
Application filed by 瑞声声学科技(深圳)有限公司, 瑞声科技(新加坡)有限公司

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/24 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being the cepstrum
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/10 Machine learning using kernel methods, e.g. support vector machines [SVM]
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L21/0232 Processing in the frequency domain
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0264 Noise filtering characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination

Abstract

Disclosed in embodiments of the present application is an airflow noise elimination method, comprising: acquiring an original audio signal in a miniature loudspeaker, and preprocessing the original audio signal to obtain an initial signal; performing feature extraction on the initial signal to obtain an audio feature; classifying the initial signal according to the audio feature to determine the type of the initial signal; if the type of the initial signal is a signal containing excitation airflow noise, performing signal compression on the initial signal to obtain a target compressed signal, and analyzing and quantifying the audio feature to determine an audio signal of the airflow noise. Because only the airflow noise signal is filtered out, the original audio signal is retained to the maximum extent; moreover, without changing the structure of the miniature loudspeaker, the sound output quality of the miniature loudspeaker is improved by performing a series of processing on the audio feature, thereby improving the user experience. In addition, also provided are an airflow noise elimination apparatus, a computer device, and a storage medium.

Description

Airflow noise elimination method, apparatus, computer device, and storage medium
Technical field
This application relates to the field of computer technology, and in particular to an airflow noise elimination method, an apparatus, a computer device, and a storage medium.
Background
Airflow noise is one of the main sources of loudspeaker noise. A micro speaker has a small cavity and a precise structure, but its diaphragm vibrates with a large amplitude, so the airflow in the cavity easily becomes turbulent and generates flow-induced noise. This flow-induced noise is amplified by resonance in the small cavity and forms a broadband concentration of energy near the higher-frequency resonance peak, which listeners perceive as airflow noise, mainly a "hissing" or "rustling" sound. Airflow noise is a common problem in micro speakers and becomes more pronounced at high voltage and large amplitude. It varies from sample to sample and depends strongly on the cavity structure of the speaker and the way it produces sound.
Technical problem
However, existing techniques usually mitigate airflow noise in micro speakers by changing the physical structure of the speaker, for example the cavity structure, the sound outlet, or the duct design. This approach has a high process cost and a long cycle, its generality is limited, and the improvement is poor. A new airflow noise elimination method is therefore urgently needed.
Technical solution
In view of this, the present application provides an airflow noise elimination method, apparatus, computer device, and storage medium to solve the problem of poor airflow noise elimination in the prior art.
The specific technical solutions of the embodiments of this application are as follows:
In a first aspect, an embodiment of the present application provides an airflow noise elimination method, applied to a micro speaker, comprising:
acquiring an original audio signal in the micro speaker, and preprocessing the original audio signal to obtain an initial signal;
performing feature extraction on the initial signal to obtain an audio feature;
classifying the initial signal according to the audio feature to determine the category of the initial signal;
when the category of the initial signal is a signal containing excitation airflow noise, performing signal compression on the initial signal to obtain a target compressed signal.
Further, after performing signal compression on the initial signal to obtain the target compressed signal, the method further comprises:
editing the target compressed signal through a fade-in/fade-out mechanism to obtain a target audio signal.
Further, performing feature extraction on the initial signal to obtain the audio feature comprises:
extracting the high-frequency component envelope and the low-frequency component envelope of the initial signal and determining the audio feature according to the ratio of the high-frequency component envelope to the low-frequency component envelope, and/or extracting the Mel-frequency cepstral coefficients of the initial signal as the audio feature.
Further, classifying the initial signal according to the audio feature to determine the category of the initial signal comprises:
if the audio feature is determined according to the ratio of the high-frequency component envelope to the low-frequency component envelope, obtaining that ratio;
when the ratio is less than a preset ratio threshold, determining that the category of the initial signal is a signal containing excitation airflow noise;
when the ratio is greater than or equal to the preset ratio threshold, determining that the category of the initial signal is a signal not containing excitation airflow noise.
Further, classifying the initial signal according to the audio feature to determine the category of the initial signal comprises:
if the audio feature is the Mel-frequency cepstral coefficients of the initial signal, inputting the Mel-frequency cepstral coefficients into an audio signal classifier for classification to obtain the category of the initial signal.
Further, performing signal compression on the initial signal to obtain the target compressed signal comprises:
calculating the diaphragm velocity of the initial signal;
performing signal compression on the initial signal according to the diaphragm velocity and a preset velocity threshold to obtain the target compressed signal.
Further, the method further comprises:
acquiring a training sample set, the training sample set including Mel-frequency cepstral coefficients of initial signals and the corresponding audio categories;
training a preset classifier with the Mel-frequency cepstral coefficients as its input and the audio categories as its expected output, to obtain the trained audio signal classifier.
In a second aspect, an embodiment of the present application further provides an airflow noise elimination apparatus, comprising:
a signal acquisition module, configured to acquire an original audio signal in a micro speaker and preprocess the original audio signal to obtain an initial signal;
a feature extraction module, configured to perform feature extraction on the initial signal to obtain an audio feature;
a signal classification module, configured to classify the initial signal according to the audio feature and determine the category of the initial signal;
a signal compression module, configured to perform signal compression on the initial signal to obtain a target compressed signal when the category of the initial signal is a signal containing excitation airflow noise.
In a third aspect, an embodiment of the present application further provides a computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the airflow noise elimination method described above when executing the computer program.
In a fourth aspect, an embodiment of the present application further provides a computer-readable storage medium comprising computer instructions which, when run on a computer, cause the computer to perform the steps of the airflow noise elimination method described above.
Beneficial effects
Implementing the embodiments of this application has the following beneficial effects:
With the above airflow noise elimination method, apparatus, computer device, and storage medium, an original audio signal in a micro speaker is acquired and preprocessed to obtain an initial signal; feature extraction is performed on the initial signal to obtain an audio feature; the initial signal is classified according to the audio feature to determine its category; and when the category is a signal containing excitation airflow noise, signal compression is performed on the initial signal to obtain a target compressed signal. The audio feature is analyzed and quantified to identify the audio signal that carries airflow noise. Because only signals with airflow noise are filtered, the original audio signal is preserved to the greatest extent. Without changing the structure of the micro speaker, this series of processing steps on the audio feature improves the sound output quality of the micro speaker and the user experience.
Description of the drawings
To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for the description of the embodiments or the prior art are briefly introduced below. The drawings described below are only some embodiments of the present application; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
In the drawings:
Figure 1 is a flowchart of the airflow noise elimination method in one embodiment;
Figure 2 is a schematic waveform diagram of the target audio signal in one embodiment;
Figure 3 is a flowchart of the method for determining the category of the initial signal in one embodiment;
Figure 4 is a flowchart of the method for determining the target compressed signal in one embodiment;
Figure 5 is a flowchart of the airflow noise elimination method in another embodiment;
Figure 6 is a schematic structural diagram of the airflow noise elimination apparatus in one embodiment;
Figure 7 is a schematic diagram of the internal structure of a computer device that runs the above airflow noise elimination method in one embodiment.
Embodiments of the present invention
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings. The described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by a person of ordinary skill in the art from the embodiments in this application without creative effort fall within the scope of protection of this application.
Conventional techniques try to reduce airflow noise in micro speakers by changing the physical structure of the speaker (the cavity structure, the sound outlet, the duct design, and so on), and the improvement is poor.
To address this problem, this embodiment proposes an airflow noise elimination method. The method can be implemented by a computer program running on a computer system based on the von Neumann architecture.
As shown in Figure 1, the airflow noise elimination method provided in this embodiment is applied to a micro speaker and includes the following steps:
Step 102: Acquire the original audio signal in the micro speaker and preprocess the original audio signal to obtain an initial signal.
Here, the original audio signal is the unprocessed signal in the micro speaker, and the initial signal is the audio signal obtained by preprocessing the original audio signal. Because a micro speaker has a small cavity and a precise structure while its diaphragm vibrates with a large amplitude, the airflow in the cavity easily becomes turbulent and generates flow-induced noise. This embodiment therefore preprocesses the original audio signal to adjust its dynamic range and frequency-domain energy distribution and to compensate for the frequency response of the micro speaker. The preprocessing includes, but is not limited to, at least one of DRC (Dynamic Range Control), mbDRC (Multi-band Dynamic Range Control), and EQ (Equaliser). Preprocessing makes the initial signal suitable for playback on the micro speaker, so that clearer and more reliable audio features can be extracted afterwards.
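As an illustration of the DRC preprocessing named above, the following is a minimal sketch of a static feed-forward compressor curve. It is not the applicant's implementation: the function names, the threshold of -12 dB, and the 4:1 ratio are assumptions chosen only for the example, and attack/release smoothing is omitted.

```python
import math

def drc_gain_db(level_db, threshold_db=-12.0, ratio=4.0):
    """Static compressor curve: gain in dB to apply at a given input level in dB.

    threshold_db and ratio are hypothetical example values.
    """
    if level_db <= threshold_db:
        return 0.0  # below the threshold the signal is left untouched
    # above the threshold the output rises only 1/ratio dB per input dB
    return (threshold_db - level_db) * (1.0 - 1.0 / ratio)

def apply_drc(samples, threshold_db=-12.0, ratio=4.0):
    """Apply the static curve sample by sample (no attack/release smoothing)."""
    out = []
    for x in samples:
        level_db = 20.0 * math.log10(max(abs(x), 1e-12))  # avoid log10(0)
        gain = 10.0 ** (drc_gain_db(level_db, threshold_db, ratio) / 20.0)
        out.append(x * gain)
    return out
```

A multi-band variant (mbDRC) would apply the same curve separately to each frequency band after a band split; that step is left out here.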
Step 104: Perform feature extraction on the initial signal to obtain an audio feature.
Here, an audio feature is a feature that characterizes the audio signal. Feature extraction can be performed on the initial signal with toolkits for C/C++, Python, MATLAB, and the like, yielding audio features such as Mel-frequency cepstral coefficients (MFCC), linear prediction cepstral coefficients (LPCC), or component envelopes. Preferably, to distinguish airflow noise better, this embodiment extracts the MFCC or the component envelopes of the initial signal as the audio feature, because these features can be visualized and quantified and therefore support a more accurate analysis.
Step 106: Classify the initial signal according to the audio feature and determine the category of the initial signal.
The initial signal falls into one of two categories: signals containing excitation airflow noise and signals not containing excitation airflow noise. The category is determined by analyzing the audio feature. For example, the audio feature can be compared with a preset classification threshold and the category determined from the result of the comparison. Alternatively, a category classifier for the initial signal can be trained in advance; the audio feature is then fed to the classifier, and the category it outputs is the category of the initial signal. By analyzing the audio feature and comparing its quantified value, this embodiment determines the category of the initial signal and thereby identifies the airflow noise contained in it.
Step 108: When the category of the initial signal is a signal containing excitation airflow noise, perform signal compression on the initial signal to obtain a target compressed signal.
Here, the target compressed signal is the initial signal from which the airflow noise has been eliminated. Signal compression is a process that reduces the vibration velocity produced by the audio signal. An acoustic model (Speaker Model), such as a hidden Markov model (HMM), can be used to predict the intensity of the airflow noise that the initial signal would produce on the micro speaker, and the airflow noise of the initial signal is then reduced to within a preset threshold according to that intensity. The preset threshold may differ between micro speakers and can be determined by subjective listening. The intensity of the airflow noise of the initial signal can be compressed with DRC to obtain the target compressed signal, which eliminates the airflow noise. The audio feature is analyzed and quantified to identify the audio signal that carries airflow noise. Because only signals with airflow noise are filtered, the original audio signal is preserved to the greatest extent. Without changing the structure of the micro speaker, this series of processing steps on the audio feature improves the sound output quality of the micro speaker and the user experience.
In summary, the above airflow noise elimination method acquires the original audio signal in the micro speaker and preprocesses it to obtain an initial signal; performs feature extraction on the initial signal to obtain an audio feature; classifies the initial signal according to the audio feature to determine its category; and, when the category is a signal containing excitation airflow noise, performs signal compression on the initial signal to obtain a target compressed signal. Only signals with airflow noise are filtered, so the original audio signal is preserved to the greatest extent, and the sound output quality of the micro speaker and the user experience are improved without changing the structure of the micro speaker.
In one embodiment, after performing signal compression on the initial signal to obtain the target compressed signal, the method further includes:
editing the target compressed signal through a fade-in/fade-out mechanism to obtain a target audio signal.
Here, the fade-in/fade-out mechanism is a processing method that improves the continuity of the audio signal. It consists of a fade-in and a fade-out. The fade-in takes effect at the moment the category of the initial signal changes from a signal not containing excitation airflow noise to a signal containing excitation airflow noise; the fade-out takes effect at the moment the category changes from a signal containing excitation airflow noise to a signal not containing it. The target audio signal is the audio signal played through the micro speaker. Because step 106 classifies the initial signal and signal compression is then applied only to signals classified as containing excitation airflow noise, an abrupt change can occur between signals that contain excitation airflow noise and signals that do not, which degrades the sound output quality of the micro speaker. Editing the target compressed signal through the fade-in/fade-out mechanism therefore improves the continuity of the audio signal and, in turn, its quality. Figure 2 shows the waveform of the target audio signal, where w(t) is the waveform of the signal compression applied to signals classified as containing excitation airflow noise and f(t) is the waveform of the initial signal category. As Figure 2 shows, adding the fade-in/fade-out mechanism in the period during which the category of the initial signal changes keeps the target audio signal continuous and further improves its quality.
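The fade-in/fade-out behaviour can be sketched as a ramped mix between the uncompressed and the compressed signal. This is only an illustration under stated assumptions: the text does not specify the ramp shape or length, so the linear ramp, the `fade_len` parameter, and the function name `crossfade_mix` are hypothetical.

```python
def crossfade_mix(dry, wet, flags, fade_len=4):
    """Blend dry (uncompressed) and wet (compressed) samples.

    flags[i] is True when sample i belongs to a segment classified as
    containing excitation airflow noise. The mix weight moves toward its
    target by 1/fade_len per sample, so category switches are ramped
    instead of abrupt. fade_len is an assumed example value.
    """
    step = 1.0 / fade_len
    weight = 0.0  # 0.0 -> all dry, 1.0 -> all wet
    out = []
    for d, w, noisy in zip(dry, wet, flags):
        target = 1.0 if noisy else 0.0
        if weight < target:
            weight = min(target, weight + step)  # fade the compressed signal in
        elif weight > target:
            weight = max(target, weight - step)  # fade it back out
        out.append((1.0 - weight) * d + weight * w)
    return out
```

In practice the ramp would span a few milliseconds of samples rather than the four samples used as the default here.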
In one embodiment, performing feature extraction on the initial signal to obtain the audio feature includes:
extracting the high-frequency component envelope and the low-frequency component envelope of the initial signal and determining the audio feature according to the ratio of the high-frequency component envelope to the low-frequency component envelope, and/or extracting the Mel-frequency cepstral coefficients of the initial signal as the audio feature.
Specifically, the audio feature in this embodiment is the ratio of the high-frequency component envelope to the low-frequency component envelope of the initial signal, and/or the Mel-frequency cepstral coefficients of the initial signal. The high-frequency and low-frequency component envelopes can each be extracted by square-law detection, after which their ratio is calculated to obtain the audio feature. According to this embodiment, the Mel-frequency cepstral coefficients can be extracted with the envelope function or the hilbert function built into MATLAB. The envelope ratio and/or the Mel-frequency cepstral coefficients of the initial signal reflect the airflow noise information of the audio signal well, which improves the reliability of the audio feature.
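The envelope-ratio feature can be sketched as follows. Square-law detection is taken here in its usual sense (square the signal, then low-pass filter it). The band split with a one-pole low-pass filter, the smoothing coefficients, and all function names are assumptions made for the example; the text does not say how the high- and low-frequency components are separated.

```python
import math

def one_pole_lowpass(samples, alpha):
    """y[n] = y[n-1] + alpha * (x[n] - y[n-1]), with 0 < alpha <= 1."""
    y, out = 0.0, []
    for x in samples:
        y += alpha * (x - y)
        out.append(y)
    return out

def square_law_envelope(samples, alpha=0.01):
    """Square-law detector: square the signal, then smooth the result."""
    return one_pole_lowpass([x * x for x in samples], alpha)

def hf_lf_envelope_ratio(samples, split_alpha=0.1, env_alpha=0.01, eps=1e-12):
    """Per-sample ratio of the high-band envelope to the low-band envelope."""
    low = one_pole_lowpass(samples, split_alpha)       # crude low band (assumed split)
    high = [x - l for x, l in zip(samples, low)]        # complementary high band
    env_hi = square_law_envelope(high, env_alpha)
    env_lo = square_law_envelope(low, env_alpha)
    return [h / (l + eps) for h, l in zip(env_hi, env_lo)]

# Usage: a low tone yields a small ratio, a high tone a large one.
fs = 8000  # assumed sample rate in Hz
low_tone = [math.sin(2 * math.pi * 50 * n / fs) for n in range(2000)]
high_tone = [math.sin(2 * math.pi * 3000 * n / fs) for n in range(2000)]
low_ratio = hf_lf_envelope_ratio(low_tone)[-1]
high_ratio = hf_lf_envelope_ratio(high_tone)[-1]
```

Because both envelopes are squared quantities, the value computed here is a ratio of band powers; taking the square root of each envelope first would give an amplitude ratio instead.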
As shown in Figure 3, in one embodiment, classifying the initial signal according to the audio feature to determine the category of the initial signal includes:
Step 106A: If the audio feature is determined according to the ratio of the high-frequency component envelope to the low-frequency component envelope, obtain that ratio;
Step 106B: When the ratio is less than the preset ratio threshold, determine that the category of the initial signal is a signal containing excitation airflow noise;
Step 106C: When the ratio is greater than or equal to the preset ratio threshold, determine that the category of the initial signal is a signal not containing excitation airflow noise.
In this embodiment, when the audio feature is the ratio of the high-frequency component envelope to the low-frequency component envelope, the ratio is compared with a preset ratio threshold and the category of the initial signal is determined from the result. The preset ratio threshold is the critical value of the ratio that separates the two categories of the initial signal. For example, the preset ratio threshold is 10-5. When the ratio is less than the preset ratio threshold, the category of the initial signal is a signal containing excitation airflow noise; when the ratio is greater than or equal to the preset ratio threshold, the category is a signal not containing excitation airflow noise. Comparing the envelope ratio with the preset ratio threshold determines the category of the initial signal quickly and accurately.
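Steps 106A to 106C reduce to a single comparison, sketched below. The function names are hypothetical, and the threshold is passed in explicitly rather than hard-coded, since its value depends on the speaker.

```python
def classify_by_ratio(ratio, threshold):
    """Return True when the segment is classed as containing airflow noise.

    Follows the rule exactly as the text states it: a ratio below the
    preset threshold means "contains excitation airflow noise"; a ratio
    at or above it means "does not contain excitation airflow noise".
    """
    return ratio < threshold

def classify_frames(ratios, threshold):
    """Apply the rule to a sequence of per-frame envelope ratios."""
    return [classify_by_ratio(r, threshold) for r in ratios]
```

The boundary case matters: a ratio exactly equal to the threshold is classed as not containing airflow noise.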
In one embodiment, classifying the initial signal according to the audio feature and determining the category of the initial signal includes:
If the audio feature is the Mel-frequency cepstral coefficients (MFCCs) of the initial signal, inputting the MFCCs into an audio signal classifier for classification to obtain the category of the initial signal.
In this embodiment, when the audio feature is the MFCCs of the initial signal, the category of the initial signal is determined with a machine learning model; that is, a pre-trained audio signal classifier performs the classification and outputs the category of the initial signal. The machine learning model may be a support vector machine (SVM), or a classifier set formed by combining multiple weak classifiers with an ensemble learning method. It can be understood that classifying the initial signal with an audio signal classifier takes advantage of the relatively high accuracy of machine learning and therefore improves the accuracy with which the category of the initial signal is determined.
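To make the "set of weak classifiers" option concrete, the sketch below combines decision stumps by majority vote. It is only an illustration under assumptions: the source does not say which weak classifiers or which combination rule are used, and the `Stump` class, its thresholds, and `ensemble_predict` are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Stump:
    """Weak classifier that inspects a single MFCC coefficient."""
    index: int            # which coefficient to inspect
    threshold: float      # decision boundary (hypothetical value)
    noise_if_above: bool  # vote "airflow noise" when the coefficient is above the threshold?

    def predict(self, mfcc: Sequence[float]) -> bool:
        above = mfcc[self.index] > self.threshold
        return above == self.noise_if_above


def ensemble_predict(stumps: List[Stump], mfcc: Sequence[float]) -> bool:
    """Return True (contains excitation airflow noise) on a strict majority vote."""
    votes = sum(1 for stump in stumps if stump.predict(mfcc))
    return votes * 2 > len(stumps)
```

A trained SVM could replace the ensemble behind the same interface: a feature vector in, a category out.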
As shown in FIG. 4, in one embodiment, performing signal compression processing on the initial signal to obtain the target compressed signal includes:
Step 110: Calculate the diaphragm velocity of the initial signal;
Step 112: Perform signal compression processing on the initial signal according to the diaphragm velocity and a preset velocity threshold to obtain the target compressed signal.
Specifically, the diaphragm velocity is an indicator of the intensity of the airflow noise in the audio signal. The diaphragm velocity of the initial signal is calculated with an acoustic model and compared with the preset velocity threshold. When the diaphragm velocity is greater than the preset velocity threshold, dynamic range compression (DRC) or multi-band dynamic range compression (mbDRC) is applied to the initial signal so that the diaphragm velocity is brought within the preset velocity threshold. This filters out the airflow noise, which improves the sound quality of the micro speaker output and the user experience.
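Steps 110 and 112 can be illustrated with a deliberately simplified limiter. The sketch assumes the diaphragm velocity has already been estimated per sample by some acoustic model (not shown) and that velocity scales linearly with the signal, so a single broadband gain is enough. It stands in for DRC or mbDRC and omits attack and release smoothing and band splitting; the function name and parameters are hypothetical.

```python
def compress_to_velocity_limit(signal, velocity, velocity_threshold):
    """Scale one frame so its estimated peak diaphragm velocity stays within the threshold.

    signal: samples of the initial signal for one frame.
    velocity: estimated diaphragm velocity per sample (assumed to come from an acoustic model).
    velocity_threshold: preset velocity threshold, must be positive.
    """
    if velocity_threshold <= 0:
        raise ValueError("velocity_threshold must be positive")
    peak = max((abs(v) for v in velocity), default=0.0)
    if peak <= velocity_threshold:
        # Already within the limit: leave the frame untouched.
        return list(signal)
    # Assumption: velocity is linear in the signal, so one gain scales both.
    gain = velocity_threshold / peak
    return [s * gain for s in signal]
```

Frames whose velocity is already below the threshold pass through unchanged, which matches the stated aim of altering the signal only where airflow noise is expected.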
As shown in FIG. 5, in one embodiment, the airflow noise elimination method further includes:
Step 114: Obtain a training sample set, the training sample set including the MFCCs of initial signals and the corresponding audio categories;
Step 116: Use the MFCCs as the input of a preset classifier and the audio categories as the expected output, and train the preset classifier to obtain the trained audio signal classifier.
Specifically, samples already determined to be signals containing excitation airflow noise and samples already determined to be signals not containing excitation airflow noise are obtained. The MFCCs of each initial signal are used as the input of the preset classifier and the audio category as the expected output. The preset classifier is trained so that, for the MFCCs in the training sample set, it produces the corresponding audio categories, which yields the trained audio signal classifier.
In this embodiment, the training sample set includes the MFCCs of signals containing excitation airflow noise as well as the MFCCs of signals not containing excitation airflow noise, which makes the training sample set comprehensive. An audio signal classifier trained on such a set can learn more complete and accurate rules for classifying audio categories. This improves the efficiency of training the preset classifier and, in turn, the efficiency of distinguishing the category of the initial signal.
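The training flow of steps 114 and 116 can be sketched with a nearest-centroid classifier. This is a swapped-in technique chosen for brevity, not the SVM or ensemble model named in the description, and `train_centroid_classifier` is a hypothetical name. The check that both categories are present mirrors the requirement that the training sample set cover signals with and without excitation airflow noise.

```python
def train_centroid_classifier(samples):
    """Train on (mfcc_vector, label) pairs; label True means "contains airflow noise".

    Returns a function that maps an MFCC vector to a predicted label.
    """
    sums, counts = {}, {}
    for mfcc, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(mfcc)
            counts[label] = 0
        sums[label] = [a + b for a, b in zip(sums[label], mfcc)]
        counts[label] += 1
    if set(sums) != {True, False}:
        raise ValueError("training set must contain both categories")
    # One centroid (mean MFCC vector) per category.
    centroids = {
        label: [value / counts[label] for value in total]
        for label, total in sums.items()
    }

    def classify(mfcc):
        def squared_distance(centroid):
            return sum((a - b) ** 2 for a, b in zip(mfcc, centroid))
        # Predict the category whose centroid is closer.
        return squared_distance(centroids[True]) < squared_distance(centroids[False])

    return classify
```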
Based on the same inventive concept, an embodiment of the present application provides an airflow noise elimination device 600, as shown in FIG. 6, including: a signal acquisition module 602, configured to collect the original audio signal in the micro speaker and preprocess the original audio signal to obtain an initial signal; a feature extraction module 604, configured to perform feature extraction on the initial signal to obtain an audio feature; a signal classification module 606, configured to classify the initial signal according to the audio feature and determine the category of the initial signal; and a signal compression module 608, configured to perform signal compression processing on the initial signal to obtain a target compressed signal when the category of the initial signal is a signal containing excitation airflow noise.
Specifically, the airflow noise elimination device 600 of this embodiment, as shown in FIG. 6, comprises the signal acquisition module 602, the feature extraction module 604, the signal classification module 606, and the signal compression module 608, each configured as described above. By analyzing and quantifying the audio feature, the device identifies the audio signals that contain airflow noise. Because only signals with airflow noise are processed, the original audio signal is preserved to the greatest extent. The sound output quality of the micro speaker is improved by processing the audio signal alone, without changing the structure of the micro speaker, which improves the user experience.
It should be noted that the airflow noise elimination device in this embodiment is implemented on the same idea as the airflow noise elimination method described above. Its implementation principle is not repeated here; refer to the corresponding content of the method.
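One way to picture how the four modules of device 600 fit together is as a small pipeline object. The class below is only an architectural sketch: the class name, field names, and type aliases are hypothetical, and each module is passed in as a plain callable so that any feature extractor, classifier, or compressor can be plugged in.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Signal = List[float]


@dataclass
class AirflowNoiseCanceller:
    acquire: Callable[[Sequence[float]], Signal]  # role of signal acquisition module 602
    extract: Callable[[Signal], object]           # role of feature extraction module 604
    classify: Callable[[object], bool]            # role of signal classification module 606
    compress: Callable[[Signal], Signal]          # role of signal compression module 608

    def process(self, raw: Sequence[float]) -> Signal:
        initial = self.acquire(raw)       # preprocess the raw audio into the initial signal
        feature = self.extract(initial)   # obtain the audio feature
        if self.classify(feature):        # True: contains excitation airflow noise
            return self.compress(initial)
        return initial                    # otherwise pass the signal through unchanged
```

For example, with `acquire=list`, a peak-level feature, a fixed level threshold, and a gain of 0.5 as the compressor, only frames whose peak exceeds the threshold are attenuated.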
FIG. 7 shows the internal structure of a computer device in one embodiment. The computer device 700 may be a server or a terminal. As shown in FIG. 7, the computer device includes a processor 710, a memory 720, and a network interface 730 connected through a system bus. The memory 720 includes a non-volatile storage medium and an internal memory. The non-volatile storage medium of the computer device stores an operating system and may also store a computer program which, when executed by the processor, causes the processor to implement the airflow noise elimination method. The internal memory may also store a computer program which, when executed by the processor, causes the processor to perform the airflow noise elimination method. Those skilled in the art will understand that the structure shown in FIG. 7 is only a block diagram of the part of the structure related to the solution of the present application and does not limit the computer device to which the solution is applied. A specific computer device may include more or fewer components than shown in FIG. 7, combine certain components, or have a different arrangement of components.
In one embodiment, the airflow noise elimination method provided by the present application may be implemented as a computer program that runs on the computer device shown in FIG. 7. The memory of the computer device may store the program modules that make up the airflow noise elimination device, for example the signal acquisition module 602, the feature extraction module 604, the signal classification module 606, and the signal compression module 608.
In one embodiment, a computer device is provided, including a memory and a processor. The memory stores a computer program which, when executed by the processor, causes the processor to perform the following steps: collecting the original audio signal in the micro speaker and preprocessing the original audio signal to obtain an initial signal; performing feature extraction on the initial signal to obtain an audio feature; classifying the initial signal according to the audio feature and determining the category of the initial signal; and, when the category of the initial signal is a signal containing excitation airflow noise, performing signal compression processing on the initial signal to obtain a target compressed signal.
In one embodiment, a computer-readable storage medium is provided. The computer-readable storage medium stores a computer program which, when executed by a processor, implements the following steps: collecting the original audio signal in the micro speaker and preprocessing the original audio signal to obtain an initial signal; performing feature extraction on the initial signal to obtain an audio feature; classifying the initial signal according to the audio feature and determining the category of the initial signal; and, when the category of the initial signal is a signal containing excitation airflow noise, performing signal compression processing on the initial signal to obtain a target compressed signal.
A person of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments may be carried out by a computer program instructing the relevant hardware. The program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. Any reference to memory, storage, a database, or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The above discloses only preferred embodiments of the present application and does not limit the scope of its rights. Equivalent changes made in accordance with the claims of the present application therefore still fall within the scope covered by the present application.

Claims (10)

  1. An airflow noise elimination method, characterized in that it is applied to a micro speaker, the method comprising:
    collecting the original audio signal in the micro speaker, and preprocessing the original audio signal to obtain an initial signal;
    performing feature extraction on the initial signal to obtain an audio feature;
    classifying the initial signal according to the audio feature, and determining the category of the initial signal;
    when the category of the initial signal is a signal containing excitation airflow noise, performing signal compression processing on the initial signal to obtain a target compressed signal.
  2. The airflow noise elimination method according to claim 1, characterized in that, after performing signal compression processing on the initial signal to obtain the target compressed signal, the method further comprises:
    editing the target compressed signal through a fade-in and fade-out mechanism to obtain a target audio signal.
  3. The airflow noise elimination method according to claim 1, characterized in that performing feature extraction on the initial signal to obtain the audio feature comprises:
    extracting the high-frequency component envelope and the low-frequency component envelope of the initial signal and determining the audio feature according to the ratio of the high-frequency component envelope to the low-frequency component envelope, and/or extracting the Mel-frequency cepstral coefficients of the initial signal as the audio feature.
  4. The airflow noise elimination method according to claim 3, characterized in that classifying the initial signal according to the audio feature and determining the category of the initial signal comprises:
    if the audio feature is determined according to the ratio of the high-frequency component envelope to the low-frequency component envelope, obtaining the ratio of the high-frequency component envelope to the low-frequency component envelope;
    when the ratio is less than a preset ratio threshold, determining that the category of the initial signal is a signal containing excitation airflow noise;
    when the ratio is greater than or equal to the preset ratio threshold, determining that the category of the initial signal is a signal not containing excitation airflow noise.
  5. The airflow noise elimination method according to claim 3, characterized in that classifying the initial signal according to the audio feature and determining the category of the initial signal comprises:
    if the audio feature is the Mel-frequency cepstral coefficients of the initial signal, inputting the Mel-frequency cepstral coefficients into an audio signal classifier for classification to obtain the category of the initial signal.
  6. The airflow noise elimination method according to claim 1, characterized in that performing signal compression processing on the initial signal to obtain the target compressed signal comprises:
    calculating the diaphragm velocity of the initial signal;
    performing signal compression processing on the initial signal according to the diaphragm velocity and a preset velocity threshold to obtain the target compressed signal.
  7. The airflow noise elimination method according to claim 5, characterized in that the method further comprises:
    obtaining a training sample set, the training sample set including the Mel-frequency cepstral coefficients of initial signals and the corresponding audio categories;
    using the Mel-frequency cepstral coefficients as the input of a preset classifier and the audio categories as the expected output, and training the preset classifier to obtain the trained audio signal classifier.
  8. An airflow noise elimination device, characterized in that the device comprises:
    a signal acquisition module, configured to collect the original audio signal in a micro speaker and preprocess the original audio signal to obtain an initial signal;
    a feature extraction module, configured to perform feature extraction on the initial signal to obtain an audio feature;
    a signal classification module, configured to classify the initial signal according to the audio feature and determine the category of the initial signal;
    a signal compression module, configured to perform signal compression processing on the initial signal to obtain a target compressed signal when the category of the initial signal is a signal containing excitation airflow noise.
  9. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the airflow noise elimination method according to any one of claims 1 to 7.
  10. A computer-readable storage medium, characterized in that it comprises computer instructions which, when run on a computer, cause the computer to perform the steps of the airflow noise elimination method according to any one of claims 1 to 7.
PCT/CN2020/096686 2020-06-12 2020-06-18 Airflow noise elimination method and apparatus, computer device, and storage medium WO2021248523A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010537956.0A CN111768801A (en) 2020-06-12 2020-06-12 Airflow noise eliminating method and device, computer equipment and storage medium
CN202010537956.0 2020-06-12

Publications (1)

Publication Number Publication Date
WO2021248523A1 true WO2021248523A1 (en) 2021-12-16

Family

ID=72720895

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/096686 WO2021248523A1 (en) 2020-06-12 2020-06-18 Airflow noise elimination method and apparatus, computer device, and storage medium

Country Status (2)

Country Link
CN (1) CN111768801A (en)
WO (1) WO2021248523A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101404160A (en) * 2008-11-21 2009-04-08 北京科技大学 Voice denoising method based on audio recognition
CN104599677A (en) * 2014-12-29 2015-05-06 中国科学院上海高等研究院 Speech reconstruction-based instantaneous noise suppressing method
US20190281389A1 (en) * 2018-03-08 2019-09-12 Bose Corporation Prioritizing delivery of location-based personal audio
CN110634497A (en) * 2019-10-28 2019-12-31 普联技术有限公司 Noise reduction method and device, terminal equipment and storage medium

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3354252B2 (en) * 1993-12-27 2002-12-09 株式会社リコー Voice recognition device
US7457745B2 (en) * 2002-12-03 2008-11-25 Hrl Laboratories, Llc Method and apparatus for fast on-line automatic speaker/environment adaptation for speech/speaker recognition in the presence of changing environments
EP1542206A1 (en) * 2003-12-11 2005-06-15 Sony International (Europe) GmbH Apparatus and method for automatic classification of audio signals
CN101197130B (en) * 2006-12-07 2011-05-18 华为技术有限公司 Sound activity detecting method and detector thereof
CN106202952A (en) * 2016-07-19 2016-12-07 南京邮电大学 A kind of Parkinson disease diagnostic method based on machine learning
CN107610707B (en) * 2016-12-15 2018-08-31 平安科技(深圳)有限公司 A kind of method for recognizing sound-groove and device
CN108322868B (en) * 2018-01-19 2020-07-07 瑞声科技(南京)有限公司 Method for improving sound quality of piano played by loudspeaker
CN109308913A (en) * 2018-08-02 2019-02-05 平安科技(深圳)有限公司 Sound quality evaluation method, device, computer equipment and storage medium
CN110473566A (en) * 2019-07-25 2019-11-19 深圳壹账通智能科技有限公司 Audio separation method, device, electronic equipment and computer readable storage medium

Also Published As

Publication number Publication date
CN111768801A (en) 2020-10-13

Similar Documents

Publication Publication Date Title
El-Moneim et al. Text-independent speaker recognition using LSTM-RNN and speech enhancement
KR101269296B1 (en) Neural network classifier for separating audio sources from a monophonic audio signal
US8036884B2 (en) Identification of the presence of speech in digital audio data
Logan Mel frequency cepstral coefficients for music modeling.
US9613640B1 (en) Speech/music discrimination
JP4797342B2 (en) Method and apparatus for automatically recognizing audio data
Schröder et al. Spectro-temporal Gabor filterbank features for acoustic event detection
KR20060021299A (en) Parameterized temporal feature analysis
WO2021248522A1 (en) Current noise detection method and apparatus, terminal, and storage medium
WO2018095167A1 (en) Voiceprint identification method and voiceprint identification system
WO2023001128A1 (en) Audio data processing method, apparatus and device
CN109584888A (en) Whistle recognition methods based on machine learning
Bäckström et al. Voice activity detection
WO2021248523A1 (en) Airflow noise elimination method and apparatus, computer device, and storage medium
CN114302301B (en) Frequency response correction method and related product
Krijnders et al. Tone-fit and MFCC scene classification compared to human recognition
KR20220053498A (en) Audio signal processing apparatus including plurality of signal component using machine learning model
CN111061909B (en) Accompaniment classification method and accompaniment classification device
CN114694689A (en) Sound signal processing and evaluating method and device
Astapov et al. Acoustic event mixing to multichannel AMI data for distant speech recognition and acoustic event classification benchmarking
Desai et al. 2-D psychoacoustic modeling for automatic speech recognition in noisy environment
EP4278350A1 (en) Detection and enhancement of speech in binaural recordings
Raju et al. LANGUAGE DETECTION IN SPEECH
CN116580703A (en) Audio segmentation and classification method based on multi-granularity slices
Min et al. Wavelet Packet Sub-band Cepstral Coefficient for Speaker Verification

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20939556

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20939556

Country of ref document: EP

Kind code of ref document: A1