CN106548782A - Sound signal processing method and mobile terminal

Info

Publication number: CN106548782A
Application number: CN201610940699.9A
Authority: CN
Inventors: 申厚拯
Original assignee: Vivo Mobile Communication Co Ltd
Current assignee: Vivo Mobile Communication Co Ltd
Priority date: 2016-10-31
Filing date: 2016-10-31
Publication date: 2017-03-29

Abstract

An embodiment of the present invention discloses a sound signal processing method, comprising: acquiring a first sound signal collected by a microphone of a mobile terminal; calculating a short-time zero-crossing rate of the first sound signal, the short-time zero-crossing rate including the zero-crossing rate of the sound within 5 ms to 15 ms; removing the low-frequency part and the high-frequency part of the first sound signal to obtain a second sound signal; acquiring energy data of the second sound signal; removing the non-speech part of the second sound signal according to the short-time zero-crossing rate and the energy data to obtain a third sound signal; acquiring a howling signal of the third sound signal; removing the single-frequency signal of the howling signal from the third sound signal to obtain a fourth sound signal; and amplifying the fourth sound signal. The invention also discloses a corresponding mobile terminal. By combining the processing with human voice recognition, the disclosed method suppresses or eliminates howling while preserving the playback quality of the human voice, which gives a better user experience.

Description

Sound signal processing method and mobile terminal

Technical field

The present invention relates to the technical field of mobile communication, and in particular to a sound signal processing method and a mobile terminal.

Background

As users rely more and more on portable mobile terminals, especially mobile phones, these terminals are used more and more often. A growing number of users connect a mobile terminal to external audio equipment for entertainment such as real-time singing and recording synthesis. For example, an application such as mobile KTV can be installed on the mobile terminal so that the user can sing for entertainment outside a KTV venue. Mobile KTV has thus made users' leisure and entertainment more fun and more convenient.

In the prior art, when a mobile terminal and audio equipment are used for singing, the audio equipment is generally loud, so its output mixes with the user's singing voice and is picked up by the microphone of the mobile terminal. During sound effect processing the mobile terminal amplifies the recorded sound. When the amplified recording is louder than the sound played by the audio equipment, the signal keeps superimposing on itself and forms self-excitation (a feedback loop), and the final playback contains a loud howling sound. As a result, the audio equipment and the microphone of the mobile terminal must be kept a certain distance apart, and the quality of the audio equipment also has to meet fairly high requirements. Echo cancellation can also be used at present to eliminate howling. That scheme needs a second microphone to collect the signal next to the audio equipment, but it affects the human voice considerably and the playback result is poor.

Summary of the invention

Embodiments of the present invention provide a sound signal processing method and a mobile terminal, to solve the prior-art problem that it is difficult to suppress or eliminate howling while preserving the quality of the human voice.

In one aspect, an embodiment of the present invention provides a sound signal processing method applied to a mobile terminal, the method comprising:

acquiring a first sound signal collected by a microphone of the mobile terminal;

calculating a short-time zero-crossing rate of the first sound signal, the short-time zero-crossing rate including the zero-crossing rate of the sound within 5 ms to 15 ms;

removing the low-frequency part and the high-frequency part of the first sound signal to obtain a second sound signal;

acquiring energy data of the second sound signal;

removing the non-speech part of the second sound signal according to the short-time zero-crossing rate and the energy data to obtain a third sound signal;

acquiring a howling signal of the third sound signal;

removing the single-frequency signal of the howling signal from the third sound signal to obtain a fourth sound signal; and

amplifying the fourth sound signal.

In another aspect, an embodiment of the present invention further provides a mobile terminal, comprising:

a first acquisition module, configured to acquire a first sound signal collected by a microphone of the mobile terminal;

a calculation module, configured to calculate a short-time zero-crossing rate of the first sound signal, the short-time zero-crossing rate including the zero-crossing rate of the sound within 5 ms to 15 ms;

a first filtering module, configured to remove the low-frequency part and the high-frequency part of the first sound signal to obtain a second sound signal;

a second acquisition module, configured to acquire energy data of the second sound signal;

a mute module, configured to remove the non-speech part of the second sound signal according to the short-time zero-crossing rate and the energy data to obtain a third sound signal;

a third acquisition module, configured to acquire a howling signal of the third sound signal;

a second filtering module, configured to remove the single-frequency signal of the howling signal from the third sound signal to obtain a fourth sound signal; and

an amplification module, configured to amplify the fourth sound signal.

In the sound signal processing method provided by the embodiment of the present invention, a first sound signal collected by the microphone of the mobile terminal is acquired; the short-time zero-crossing rate of the first sound signal is calculated, the short-time zero-crossing rate including the zero-crossing rate of the sound within 5 ms to 15 ms; the low-frequency part and the high-frequency part of the first sound signal are removed to obtain a second sound signal; energy data of the second sound signal is acquired; the non-speech part of the second sound signal is removed according to the short-time zero-crossing rate and the energy data to obtain a third sound signal; a howling signal of the third sound signal is acquired; the single-frequency signal of the howling signal is removed from the third sound signal to obtain a fourth sound signal; and the fourth sound signal is amplified. By combining the processing with human voice recognition, the method suppresses or eliminates howling while preserving the playback quality of the human voice, which gives a better user experience.

Brief description of the drawings

To explain the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a flowchart of the first embodiment of the sound signal processing method of the present invention;

Fig. 2 is a flowchart of the second embodiment of the sound signal processing method of the present invention;

Fig. 3 is a structural block diagram of the first embodiment of the mobile terminal of the present invention;

Fig. 4 is a structural block diagram of the second embodiment of the mobile terminal of the present invention;

Fig. 5 is a structural block diagram of the third embodiment of the mobile terminal of the present invention.

Detailed description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments that a person of ordinary skill in the art obtains from these embodiments without creative effort fall within the scope of protection of the present invention.

First embodiment

Fig. 1 is a flowchart of the first embodiment of the sound signal processing method of the present invention. The method includes:

Step 101: acquire a first sound signal collected by a microphone of a mobile terminal.

In this embodiment, the first sound signal is collected by the microphone of the mobile terminal and includes every sound the microphone picks up in real time, such as the human voice and the music played by the audio equipment. The first sound signal is acquired so that it can be processed further.

Step 102: calculate the short-time zero-crossing rate of the first sound signal.

In this embodiment, the short-time zero-crossing rate of the first sound signal is calculated. As is known, the zero-crossing rate (ZCR) is the rate at which the sign of a signal changes, for example from positive to negative or the reverse. This feature is widely used in speech comparison, speech recognition and music information retrieval, and is a key feature for classifying percussive sounds.
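The short-time zero-crossing rate of step 102 can be illustrated with a minimal Python sketch. It is not taken from the patent: the function name `short_time_zcr`, the non-overlapping framing and the 10 ms default frame (chosen from within the 5 ms to 15 ms window named in the text) are assumptions made for illustration.

```python
def short_time_zcr(samples, sample_rate, frame_ms=10.0):
    """Return one zero-crossing rate per non-overlapping frame.

    The rate is the fraction of adjacent sample pairs whose signs differ.
    frame_ms is an assumed value inside the 5 ms to 15 ms range of the text.
    """
    frame_len = max(2, int(sample_rate * frame_ms / 1000.0))
    rates = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # Count sign changes between neighbouring samples (zero counts as positive).
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        rates.append(crossings / (frame_len - 1))
    return rates
```

A signal that flips sign at every sample gives a rate of 1.0 per frame, and a constant signal gives 0.0; voiced speech usually sits toward the low end of that range.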

Step 103: remove the low-frequency part and the high-frequency part of the first sound signal to obtain a second sound signal.

In this embodiment, the frequency of an actual human voice generally lies between 80 Hz and 1.1 kHz. Components below or above this range are treated as non-speech frequencies and are removed in a preliminary pass. For example, a band-pass filter can remove the low-frequency part and the high-frequency part of the first sound signal.
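The band-pass stage of step 103 can be sketched as follows. The patent does not specify a filter design, so this sketch assumes the simplest option: a first-order high-pass followed by a first-order low-pass, with the 80 Hz and 1.1 kHz corners taken from the text. The function name `band_pass` is hypothetical, and a production design would likely use a steeper filter.

```python
import math


def band_pass(samples, sample_rate, low_hz=80.0, high_hz=1100.0):
    """Cascade a first-order high-pass (low_hz) and low-pass (high_hz).

    Assumed illustrative design; the corner frequencies follow the
    80 Hz to 1.1 kHz voice range given in the text.
    """
    dt = 1.0 / sample_rate
    rc_hp = 1.0 / (2.0 * math.pi * low_hz)
    rc_lp = 1.0 / (2.0 * math.pi * high_hz)
    a_hp = rc_hp / (rc_hp + dt)   # high-pass smoothing coefficient
    a_lp = dt / (rc_lp + dt)      # low-pass smoothing coefficient
    out = []
    hp_prev = x_prev = lp_prev = 0.0
    for x in samples:
        hp = a_hp * (hp_prev + x - x_prev)    # removes DC and low rumble
        lp = lp_prev + a_lp * (hp - lp_prev)  # attenuates high frequencies
        out.append(lp)
        hp_prev, x_prev, lp_prev = hp, x, lp
    return out
```

With this design a constant (DC) input decays to zero, while a tone inside the pass band is attenuated far less than one well below 80 Hz.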

Step 104: acquire energy data of the second sound signal.

In this embodiment, the energy data is important data for speech recognition. It includes the energy value corresponding to each frequency point of the second sound signal.

Step 105: remove the non-speech part of the second sound signal according to the short-time zero-crossing rate and the energy data to obtain a third sound signal.

In this embodiment, the short-time zero-crossing rate and the energy data are analyzed together to separate the second sound signal into a human voice part and a non-speech part. The non-speech part is removed, which leaves the human voice part without further processing, namely the third sound signal.

Step 106: acquire a howling signal of the third sound signal.

In this embodiment, the human voice in the third sound signal contains both the sound the user produces in real time and the human voice played back by the external audio equipment. When these two parts form self-excitation, howling signals are produced. These signals are heard as howling during playback, so they need further processing.

Step 107: remove the single-frequency signal of the howling signal from the third sound signal to obtain a fourth sound signal.

In this embodiment, the third sound signal is processed further to eliminate the single-frequency signal in the howling signal and restore the human voice, thereby eliminating the howling sound.

Step 108: amplify the fourth sound signal.

In this embodiment, the signal is amplified before playback to ensure the playback quality of the external audio equipment.

Second embodiment

Fig. 2 is a flowchart of the second embodiment of the sound signal processing method of the present invention. The method includes:

Step 201: acquire a first sound signal collected by a microphone of a mobile terminal.

Step 202: calculate the short-time zero-crossing rate of the first sound signal.

Step 203: remove the low-frequency part and the high-frequency part of the first sound signal to obtain a second sound signal.

Steps 201 to 203 are the same as the corresponding steps of the first embodiment of the sound signal processing method and are not described again here.

Step 204: perform a fast Fourier transform on the second sound signal to obtain spectrum data of the second sound signal.

In this embodiment, because the Fourier transform represents signals of all forms as sinusoids, applying it yields the spectrum data of the second sound signal. The fast Fourier transform (FFT) is the general term for efficient, fast computer algorithms that compute the discrete Fourier transform (DFT), so using it yields the spectrum data of the second sound signal quickly.
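As an illustration of step 204, the following is a minimal recursive radix-2 FFT in pure Python. The patent does not say which FFT variant is used; this Cooley-Tukey form, the function name `fft` and the power-of-two length restriction are assumptions of the sketch.

```python
import cmath


def fft(x):
    """Radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])  # DFT of even-indexed samples
    odd = fft(x[1::2])   # DFT of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out
```

For a frame of `n` samples at sample rate `fs`, bin `k` of the result corresponds to the frequency `k * fs / n`, and `abs(bin) ** 2` can serve as the energy at that frequency.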

Step 205: acquire the energy data from the spectrum data, the energy data including the energy peak signals of the second sound signal at low, middle and high frequencies.

In this embodiment, the energy data can be read from the waveform of the spectrum data. Here, an energy peak signal is taken from each of the low-frequency, middle-frequency and high-frequency parts of the spectrum. These energy peak signals reflect sound characteristics of the second sound signal such as amplitude. Specifically, the low-frequency range is 20 Hz to 1600 Hz, the middle-frequency range is 1600 Hz to 3000 Hz, and the high-frequency range is 3000 Hz and above.
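The per-band peak search of step 205 might look like the sketch below. The band limits come from the text; the function name `band_peaks`, the use of squared magnitude as the energy measure and the half-open band boundaries are assumptions.

```python
# Band limits in Hz as given in the text; the dictionary layout is illustrative.
BANDS = {
    "low": (20.0, 1600.0),
    "mid": (1600.0, 3000.0),
    "high": (3000.0, float("inf")),
}


def band_peaks(spectrum, sample_rate):
    """Return {band: (peak_frequency_hz, peak_energy)} for one FFT frame.

    spectrum is the full complex FFT output of a real frame; only bins up
    to the Nyquist frequency are inspected. A band with no energy keeps
    the placeholder (None, 0.0).
    """
    n = len(spectrum)
    peaks = {name: (None, 0.0) for name in BANDS}
    for k in range(n // 2 + 1):
        freq = k * sample_rate / n
        energy = abs(spectrum[k]) ** 2  # squared magnitude as energy
        for name, (lo, hi) in BANDS.items():
            if lo <= freq < hi and energy > peaks[name][1]:
                peaks[name] = (freq, energy)
    return peaks
```

For example, with a 16-point spectrum at 8000 Hz the bins are 500 Hz apart, so a component in bin 4 is reported as the middle-band peak at 2000 Hz.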

Step 206: analyze the short-time zero-crossing rate and the energy data to obtain frequency data and amplitude data of the second sound signal.

In this embodiment, the frequency data and amplitude data of the second sound signal can be obtained from the short-time zero-crossing rate and the energy data. The short-time zero-crossing rate includes the zero-crossing rate of the sound within 5 ms to 15 ms.

Step 207: determine, according to the frequency data and the amplitude data, whether the second sound signal contains the non-speech part.

In this embodiment, the components of the second sound signal can be identified from its frequency data and amplitude data.

Step 208: if so, mute the non-speech part to obtain the third sound signal.

In this embodiment, when the second sound signal is found to contain a non-speech part, that part is muted to obtain the third sound signal. As is known, muting sets the corresponding audio signal to zero so that no sound is output.
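Steps 206 to 208 can be sketched as a frame-wise decision followed by zeroing. The patent gives no decision rule or thresholds, so the rule used here (a frame counts as speech when its mean energy is high enough and its zero-crossing rate is low enough), the function name `mute_non_speech` and the values `energy_min` and `zcr_max` are all assumptions made for illustration.

```python
def mute_non_speech(samples, frame_len, energy_min=0.01, zcr_max=0.35):
    """Zero every frame that the assumed rule classifies as non-speech.

    energy_min and zcr_max are placeholder thresholds, not values from
    the patent; real values would have to be tuned on recordings.
    """
    out = list(samples)
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / len(frame)
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        zcr = crossings / max(1, len(frame) - 1)
        is_speech = energy >= energy_min and zcr <= zcr_max
        if not is_speech:
            # Muting sets the corresponding samples to zero.
            out[start:start + len(frame)] = [0.0] * len(frame)
    return out
```

Under these assumed thresholds a loud, slowly varying frame is kept, while a frame that is very quiet or that changes sign at almost every sample is replaced by zeros.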

Step 209: acquire, according to the energy data, the maximum energy signals of the third sound signal at low, middle and high frequencies.

In this embodiment, because a howling signal produced by self-excitation has a relatively large energy value, the maximum energy signals of the third sound signal at low, middle and high frequencies are acquired first.

Step 210: determine whether the maximum energy signal is a sustained signal.

In this embodiment, a howling signal not only has a large energy value but also lasts a relatively long time, and these two conditions together indicate whether a signal is a howling signal. A maximum energy signal that lasts 30 ms to 40 ms can be judged to be a sustained signal.

Step 211: if so, determine that the maximum energy signal is a howling signal.

In this embodiment, when the maximum energy signal is a sustained signal, it can be determined to be a howling signal.
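The persistence test of steps 209 to 211 can be sketched as a run-length check over per-frame peaks. The 30 ms minimum duration follows the text; the function name `find_howling`, the energy floor `energy_min` and the frequency tolerance `tol_hz` used to decide that two frames show the same peak are assumptions.

```python
def find_howling(frame_peaks, frame_ms, min_duration_ms=30.0,
                 energy_min=1.0, tol_hz=50.0):
    """Return the frequency of a sustained high-energy peak, or None.

    frame_peaks holds one (frequency_hz, energy) pair per consecutive
    frame. energy_min and tol_hz are assumed placeholder values.
    """
    run_freq, run_len = None, 0
    for freq, energy in frame_peaks:
        if energy < energy_min:
            run_freq, run_len = None, 0  # peak too weak, restart the run
            continue
        if run_freq is not None and abs(freq - run_freq) <= tol_hz:
            run_len += 1                 # same peak persists
        else:
            run_freq, run_len = freq, 1  # a new candidate peak starts
        if run_len * frame_ms >= min_duration_ms:
            return run_freq
    return None
```

With 10 ms frames, three consecutive frames whose strongest peak stays near 1000 Hz are reported as howling at 1000 Hz, whereas a peak that jumps between frequencies is not.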

Step 212: process the third sound signal with an adaptive notch filter to remove the single-frequency signal of the howling signal and obtain the fourth sound signal.

In this embodiment, the adaptive notch filter controls one or more of its own parameters according to its output, so that certain frequency components are filtered out automatically. The adaptive notch filter removes unknown interference contained in the primary signal in a way that is optimal in some sense. The primary signal serves as the desired response of the adaptive filter, and the reference signal serves as the filter input. The reference signal comes from a sensor or a set of sensors positioned relative to the source of the primary signal so that the information-bearing signal component in it is weak or essentially unpredictable.

Specifically, the initial frequency of the adaptive notch filter is set first so that the single-frequency signal of the howling signal is cancelled. A more accurate frequency is then calculated from the output of the notch filter, the frequency of the notch filter is updated, and the cancellation is performed again.
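One classic form of adaptive notch filter is the two-weight LMS canceller with sine and cosine references, sketched below. It is named plainly as a substitute illustration: it adapts the amplitude and phase of the cancelled tone at a fixed, already estimated frequency, and it does not implement the frequency refinement described above. The function name `lms_notch` and the step size `mu` are assumptions.

```python
import math


def lms_notch(samples, sample_rate, notch_hz, mu=0.01):
    """Cancel a single tone at notch_hz with a two-weight LMS filter.

    The primary input is the sample stream; the reference inputs are a
    cosine and a sine at notch_hz. The error signal is the filter output.
    mu is an assumed step size that sets the notch width.
    """
    w_cos = w_sin = 0.0
    out = []
    for n, x in enumerate(samples):
        phase = 2.0 * math.pi * notch_hz * n / sample_rate
        c, s = math.cos(phase), math.sin(phase)
        estimate = w_cos * c + w_sin * s  # current estimate of the tone
        err = x - estimate                # tone removed from the input
        w_cos += 2.0 * mu * err * c       # LMS weight updates
        w_sin += 2.0 * mu * err * s
        out.append(err)
    return out
```

After convergence a pure tone at `notch_hz` is almost entirely removed, while components far from that frequency pass with little change. A larger `mu` converges faster but widens the notch, which removes more of the neighbouring voice content.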

Step 213: amplify the fourth sound signal.

Step 213 is the same as the corresponding step of the first embodiment of the sound signal processing method and is not described again here.

In the sound signal processing method provided by this embodiment of the present invention, a first sound signal collected by the microphone of the mobile terminal is acquired; the short-time zero-crossing rate of the first sound signal is calculated, the short-time zero-crossing rate including the zero-crossing rate of the sound within 5 ms to 15 ms; the low-frequency part and the high-frequency part of the first sound signal are removed to obtain a second sound signal; a fast Fourier transform is performed on the second sound signal to obtain its spectrum data; the energy data is acquired from the spectrum data, the energy data including the energy peak signals of the second sound signal at low, middle and high frequencies; the short-time zero-crossing rate and the energy data are analyzed to obtain frequency data and amplitude data of the second sound signal; whether the second sound signal contains the non-speech part is determined according to the frequency data and the amplitude data; if so, the non-speech part is muted to obtain the third sound signal; the maximum energy signals of the third sound signal at low, middle and high frequencies are acquired according to the energy data; whether the maximum energy signal is a sustained signal is determined; if so, the maximum energy signal is determined to be a howling signal; the third sound signal is processed with an adaptive notch filter to eliminate the single-frequency signal of the howling signal and obtain the fourth sound signal; and the fourth sound signal is amplified. In this way non-speech content is removed more effectively and the howling signal within the human voice is handled, which improves the user experience.

The embodiments of the sound signal processing method of the present invention have been described in detail above. The apparatus corresponding to the method, namely the mobile terminal, is described further below. The mobile terminal may be a mobile phone, a tablet computer, an MP3 or MP4 player, or the like.

Third embodiment

Fig. 3 is a structural block diagram of the first embodiment of the mobile terminal of the present invention. The mobile terminal 300 can implement the steps of the first embodiment of the sound signal processing method of the present invention. The mobile terminal 300 includes a first acquisition module 301, a calculation module 302, a first filtering module 303, a second acquisition module 304, a mute module 305, a third acquisition module 306, a second filtering module 307 and an amplification module 308.

The first acquisition module 301 is connected to the calculation module 302 and is configured to acquire a first sound signal collected by the microphone of the mobile terminal.

In this embodiment, the first sound signal is collected by the microphone of the mobile terminal and includes every sound the microphone picks up in real time, such as the human voice and the music played by the audio equipment. The first acquisition module 301 acquires the first sound signal so that it can be processed further.

The calculation module 302 is connected to the first filtering module 303 and is configured to calculate the short-time zero-crossing rate of the first sound signal.

In this embodiment, the calculation module 302 calculates the short-time zero-crossing rate of the first sound signal. As is known, the zero-crossing rate (ZCR) is the rate at which the sign of a signal changes, for example from positive to negative or the reverse. This feature is widely used in speech comparison, speech recognition and music information retrieval, and is a key feature for classifying percussive sounds.

The first filtering module 303 is connected to the second acquisition module 304 and is configured to remove the low-frequency part and the high-frequency part of the first sound signal to obtain a second sound signal.

In this embodiment, the frequency of an actual human voice generally lies between 80 Hz and 1.1 kHz. Components below or above this range are treated as non-speech signals and are removed in a preliminary pass. For example, the first filtering module 303 can use a band-pass filter to remove the low-frequency part and the high-frequency part of the first sound signal.

The second acquisition module 304 is connected to the mute module 305 and is configured to acquire energy data of the second sound signal.

The mute module 305 is connected to the third acquisition module 306 and is configured to remove the non-speech part of the second sound signal according to the short-time zero-crossing rate and the energy data to obtain a third sound signal.

The third acquisition module 306 is connected to the second filtering module 307 and is configured to acquire a howling signal of the third sound signal.

In this embodiment, the human voice in the third sound signal contains both the sound the user produces in real time and the human voice played back by the external audio equipment. When these two parts form self-excitation, howling signals are produced. These signals are heard as howling during playback, so they need further processing.

The second filtering module 307 is connected to the amplification module 308 and is configured to remove the single-frequency signal of the howling signal from the third sound signal to obtain a fourth sound signal.

The amplification module 308 is configured to amplify the fourth sound signal.

The mobile terminal provided by this embodiment of the present invention acquires a first sound signal collected by its microphone; calculates the short-time zero-crossing rate of the first sound signal, the short-time zero-crossing rate including the zero-crossing rate of the sound within 5 ms to 15 ms; removes the low-frequency part and the high-frequency part of the first sound signal to obtain a second sound signal; acquires energy data of the second sound signal; removes the non-speech part of the second sound signal according to the short-time zero-crossing rate and the energy data to obtain a third sound signal; acquires a howling signal of the third sound signal; removes the single-frequency signal of the howling signal from the third sound signal to obtain a fourth sound signal; and amplifies the fourth sound signal. By combining the processing with human voice recognition, the mobile terminal suppresses or eliminates howling while preserving the playback quality of the human voice, which gives a better user experience.

第四实施例Fourth embodiment

如图4所示，为本发明移动终端的第二实施例的结构框图。该移动终端400能实现本发明的声音信号的处理方法的第二实施例的各步骤，其中，移动终端400包括第一获取模块401、计算模块402、第一滤波模块403、第二获取模块404、静音模块405、第三获取模块406、第二滤波模块407和放大模块408。As shown in FIG. 4 , it is a structural block diagram of the second embodiment of the mobile terminal of the present invention. The mobile terminal 400 can implement the steps of the second embodiment of the sound signal processing method of the present invention, wherein the mobile terminal 400 includes a first acquisition module 401, a calculation module 402, a first filter module 403, and a second acquisition module 404 , a mute module 405 , a third acquisition module 406 , a second filter module 407 and an amplification module 408 .

第一获取模块401，与计算模块402相连接，用于获取移动终端的麦克风采集的第一声音信号。The first acquisition module 401 is connected with the calculation module 402 and configured to acquire the first sound signal collected by the microphone of the mobile terminal.

计算模块402，与第一滤波模块403相连接，用于计算得到所述第一声音信号的短时过零率。The calculating module 402 is connected with the first filtering module 403 and is used for calculating the short-term zero-crossing rate of the first sound signal.

The first filtering module 403 is connected to the second acquisition module 404 and is configured to remove the low-frequency part and the high-frequency part of the first sound signal to obtain the second sound signal.
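Removing the low-frequency and high-frequency parts amounts to band-pass filtering. The sketch below cascades a first-order high-pass and a first-order low-pass filter in plain Python. The cutoff values of 300 Hz and 3400 Hz are hypothetical placeholders (this section does not state the cutoffs), and the name `band_pass` and the first-order filter design are likewise assumptions made for illustration.

```python
import math


def band_pass(samples, sample_rate, low_cut_hz=300.0, high_cut_hz=3400.0):
    """Cascade a one-pole high-pass and a one-pole low-pass filter.

    The cutoff defaults are illustrative assumptions, not values
    taken from the patent text.
    """
    dt = 1.0 / sample_rate
    rc_high_pass = 1.0 / (2.0 * math.pi * low_cut_hz)
    rc_low_pass = 1.0 / (2.0 * math.pi * high_cut_hz)
    alpha = rc_high_pass / (rc_high_pass + dt)  # high-pass coefficient
    beta = dt / (rc_low_pass + dt)              # low-pass coefficient

    out = []
    prev_x = 0.0
    hp = 0.0
    lp = 0.0
    for x in samples:
        hp = alpha * (hp + x - prev_x)  # removes slow drift and low rumble
        prev_x = x
        lp = lp + beta * (hp - lp)      # attenuates content above the cutoff
        out.append(lp)
    return out
```

A constant (DC) input decays to zero at the output, while a 1 kHz tone sampled at 16 kHz passes with only moderate attenuation. A real implementation would likely use a steeper filter; the first-order stages are kept here only for brevity.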

The first acquisition module 401, the calculation module 402, and the first filtering module 403 are the same as the corresponding modules in the first embodiment of the mobile terminal of the present invention and are not described again here.

The second acquisition module 404 is connected to the mute module 405 and is configured to acquire the energy data of the second sound signal.

The second acquisition module 404 includes:

A Fourier transform unit 4041, connected to a second acquisition unit 4042 and configured to perform a fast Fourier transform on the second sound signal to obtain spectrum data of the second sound signal.

In this embodiment of the present invention, the Fourier transform represents a signal of any form as a sum of sinusoidal signals, so the spectrum data of the second sound signal can be obtained after the transform. The fast Fourier transform (FFT) is the general term for efficient, fast computer algorithms for calculating the discrete Fourier transform (DFT), so using it yields the spectrum data of the second sound signal quickly.

The second acquisition unit 4042, configured to acquire the energy data from the spectrum data, where the energy data includes the energy peak signals of the second sound signal at low, medium, and high frequencies.
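The two units above can be sketched as a radix-2 FFT followed by a search for the strongest bin in each band. This is a minimal illustration, not the patented implementation: the band edges of 500 Hz and 2000 Hz, the names `fft` and `band_energy_peaks`, and the use of squared magnitude as "energy" are all assumptions, since the text does not define where the low, medium, and high bands begin.

```python
import cmath
import math


def fft(values):
    """Recursive radix-2 FFT; the input length must be a power of two."""
    n = len(values)
    if n == 0:
        raise ValueError("input must not be empty")
    if n == 1:
        return [complex(values[0])]
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft(values[0::2])
    odd = fft(values[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def band_energy_peaks(frame, sample_rate, band_edges=(500.0, 2000.0)):
    """Return {band: (peak_frequency_hz, peak_energy)} for one frame.

    band_edges are hypothetical boundaries between low/mid and mid/high.
    """
    spectrum = fft(frame)
    n = len(frame)
    peaks = {"low": (0.0, 0.0), "mid": (0.0, 0.0), "high": (0.0, 0.0)}
    for k in range(1, n // 2):  # skip DC, use the positive-frequency half
        freq = k * sample_rate / n
        energy = abs(spectrum[k]) ** 2
        if freq < band_edges[0]:
            band = "low"
        elif freq < band_edges[1]:
            band = "mid"
        else:
            band = "high"
        if energy > peaks[band][1]:
            peaks[band] = (freq, energy)
    return peaks
```

For a 256-sample frame of a 1 kHz tone sampled at 8 kHz, the tone falls exactly on bin 32, so the "mid" peak is reported at 1000 Hz and dominates the other two bands.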

The mute module 405 is connected to the third acquisition module 406 and is configured to remove the non-speech part of the second sound signal according to the short-time zero-crossing rate and the energy data to obtain the third sound signal.

The mute module 405 includes:

An analysis unit 4051, connected to a first judging unit 4052 and configured to analyze the short-time zero-crossing rate and the energy data to obtain frequency data and amplitude data of the second sound signal.

The first judging unit 4052, connected to a mute unit 4053 and configured to judge, according to the frequency data and the amplitude data, whether the non-speech part exists in the second sound signal.

The mute unit 4053, configured to mute the non-speech part to obtain the third sound signal.
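One simple way to picture the analyze, judge, and mute sequence is a per-frame rule on energy and zero-crossing rate. The sketch below is a simplified stand-in rather than the patent's decision logic: it uses time-domain frame energy instead of the spectral energy data, and the thresholds `energy_floor` and `zcr_ceiling`, like the function name `mute_non_speech`, are hypothetical.

```python
def mute_non_speech(samples, frame_len, energy_floor=1e-4, zcr_ceiling=0.5):
    """Zero out frames that look like silence or broadband noise.

    A frame is treated as non-speech when its mean energy is below
    energy_floor or its zero-crossing rate is above zcr_ceiling.
    Both thresholds are illustrative assumptions.
    """
    out = list(samples)
    for start in range(0, len(out), frame_len):
        frame = out[start:start + frame_len]
        energy = sum(v * v for v in frame) / len(frame)
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        zcr = crossings / max(1, len(frame) - 1)
        if energy < energy_floor or zcr > zcr_ceiling:
            out[start:start + len(frame)] = [0.0] * len(frame)
    return out
```

A loud, slowly oscillating frame is kept unchanged, whereas a nearly silent frame and a frame that flips sign on every sample are both replaced with zeros.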

The third acquisition module 406 is connected to the second filtering module 407 and is configured to acquire the howling signal of the third sound signal.

The third acquisition module 406 includes:

A first acquisition unit 4061, connected to a second judging unit 4062 and configured to acquire, according to the energy data, the maximum energy signals of the third sound signal at low, medium, and high frequencies.

The second judging unit 4062, connected to a determining unit 4063 and configured to judge whether the maximum energy signal is a continuous signal.

The determining unit 4063, configured to determine that the maximum energy signal is a howling signal.
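The "continuous signal" test can be sketched as a persistence check on the per-frame energy peak: if the strongest component stays at nearly the same frequency for several consecutive frames, it is treated as howling. The frame count, frequency tolerance, energy threshold, and the name `detect_howling` are all assumptions; the text does not say how long a signal must persist to count as continuous.

```python
def detect_howling(frame_peaks, min_frames=5, freq_tolerance_hz=20.0,
                   energy_threshold=1.0):
    """Return the howling frequency, or None if no peak persists.

    frame_peaks is a sequence of (peak_frequency_hz, peak_energy) pairs,
    one per analysis frame. All default parameters are hypothetical.
    """
    run_freq = None
    run_length = 0
    for freq, energy in frame_peaks:
        if energy < energy_threshold:
            # A weak frame breaks the run of consecutive strong peaks.
            run_freq, run_length = None, 0
            continue
        if run_freq is not None and abs(freq - run_freq) <= freq_tolerance_hz:
            run_length += 1
        else:
            run_freq, run_length = freq, 1
        if run_length >= min_frames:
            return run_freq
    return None
```

Speech and singing move their dominant frequency from frame to frame, so they rarely satisfy the persistence condition, while acoustic feedback tends to lock onto one frequency and does.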

The second filtering module 407 is connected to the amplification module 408 and is configured to remove, from the third sound signal, the single-frequency signal of the howling signal to obtain the fourth sound signal.

Specifically, the second filtering module 407 includes:

A filtering unit 4071, configured to process the third sound signal with an adaptive notch filter to remove the single-frequency signal of the howling signal and obtain the fourth sound signal.
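The text names an adaptive notch filter but does not specify its structure. One classic form, shown here purely as an assumed example, is the two-weight LMS canceller: sine and cosine references at the howling frequency are weighted to match the tone in the input, and the residual error is the filtered output. The function name `adaptive_notch` and the step size are hypothetical.

```python
import math


def adaptive_notch(samples, sample_rate, howl_freq, step_size=0.05):
    """Cancel a single tone with a two-weight LMS adaptive notch filter.

    The LMS structure and step_size are assumptions; the patent text only
    says that an adaptive notch filter is used.
    """
    w_sin = 0.0
    w_cos = 0.0
    omega = 2.0 * math.pi * howl_freq / sample_rate
    out = []
    for n, x in enumerate(samples):
        ref_sin = math.sin(omega * n)
        ref_cos = math.cos(omega * n)
        estimate = w_sin * ref_sin + w_cos * ref_cos  # current tone estimate
        error = x - estimate                          # tone-free output
        # LMS update: move the weights so the estimate tracks the tone.
        w_sin += 2.0 * step_size * error * ref_sin
        w_cos += 2.0 * step_size * error * ref_cos
        out.append(error)
    return out
```

With a pure 1 kHz tone as input, the output decays to nearly zero once the weights converge, while a 300 Hz tone passed through the same 1 kHz notch keeps most of its amplitude. A larger step size widens the notch and speeds up convergence at the cost of removing more of the neighbouring voice content.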

The amplification module 408 is configured to amplify the fourth sound signal.

The amplification module 408 is the same as the corresponding module in the first embodiment of the mobile terminal of the present invention and is not described again here.
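Amplification itself can be as simple as a fixed gain. The sketch below adds hard clipping to keep samples inside a normalized range; the clipping, the default gain of 2.0, and the name `amplify` are assumptions added for illustration and are not described in this section.

```python
def amplify(samples, gain=2.0, limit=1.0):
    """Scale samples by gain and clip them to [-limit, limit].

    The clipping step is an added safeguard assumed for this sketch.
    """
    return [max(-limit, min(limit, v * gain)) for v in samples]
```

Because the howling tone has already been notched out at this stage, raising the gain is less likely to drive the loop back into feedback than amplifying the raw microphone signal would be.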

The mobile terminal provided by this embodiment of the present invention acquires a first sound signal collected by a microphone of the mobile terminal; calculates a short-time zero-crossing rate of the first sound signal, the short-time zero-crossing rate including the zero-crossing rate of the sound within 5 ms to 15 ms; removes the low-frequency part and the high-frequency part of the first sound signal to obtain a second sound signal; performs a fast Fourier transform on the second sound signal to obtain spectrum data of the second sound signal; acquires the energy data from the spectrum data, the energy data including the energy peak signals of the second sound signal at low, medium, and high frequencies; analyzes the short-time zero-crossing rate and the energy data to obtain frequency data and amplitude data of the second sound signal; judges, according to the frequency data and the amplitude data, whether the non-speech part exists in the second sound signal; if so, mutes the non-speech part to obtain the third sound signal; acquires, according to the energy data, the maximum energy signals of the third sound signal at low, medium, and high frequencies; judges whether the maximum energy signal is a continuous signal; if so, determines that the maximum energy signal is a howling signal; processes the third sound signal with an adaptive notch filter to eliminate the single-frequency signal and obtain the fourth sound signal; and amplifies the fourth sound signal. Non-speech content is thereby removed more effectively and the howling signal within the human voice is handled, which improves the user experience.

Fifth embodiment

FIG. 5 is a structural block diagram of the third embodiment of the mobile terminal of the present invention. The mobile terminal 800 shown in FIG. 5 includes at least one processor 801, a memory 802, at least one network interface 804, a user interface 803, and other components 806, where the other components 806 include an eye-tracking sensor and a front camera. The components of the mobile terminal 800 are coupled together through a bus system 805. It can be understood that the bus system 805 is used to implement connection and communication between these components. In addition to a data bus, the bus system 805 includes a power bus, a control bus, and a status signal bus. For clarity of illustration, however, all of these buses are labeled as the bus system 805 in FIG. 5.

The user interface 803 may include a display, a keyboard, or a pointing device (for example, a mouse, a trackball, a touch pad, or a touch screen).

It can be understood that the memory 802 in this embodiment of the present invention may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memory. The non-volatile memory may be a read-only memory (ROM), a programmable read-only memory (Programmable ROM, PROM), an erasable programmable read-only memory (Erasable PROM, EPROM), an electrically erasable programmable read-only memory (Electrically EPROM, EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static random access memory (Static RAM, SRAM), dynamic random access memory (Dynamic RAM, DRAM), synchronous dynamic random access memory (Synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (Double Data Rate SDRAM, DDR SDRAM), enhanced synchronous dynamic random access memory (Enhanced SDRAM, ESDRAM), synchronous link dynamic random access memory (Synchlink DRAM, SLDRAM), and direct Rambus random access memory (Direct Rambus RAM, DRRAM). The memory 802 of the systems and methods described in the embodiments of the present invention is intended to include, without being limited to, these and any other suitable types of memory.

In some implementations, the memory 802 stores the following elements, executable modules or data structures, or a subset or an extended set of them: an operating system 8021 and application programs 8022.

The operating system 8021 contains various system programs, such as a framework layer, a core library layer, and a driver layer, which are used to implement various basic services and to process hardware-based tasks. The application programs 8022 contain various application programs, such as a media player and a browser, which are used to implement various application services. A program implementing the method of the embodiments of the present invention may be included in the application programs 8022.

In this embodiment of the present invention, by calling a program or instructions stored in the memory 802, specifically a program or instructions stored in the application programs 8022, the processor 801 is configured to: acquire a first sound signal collected by a microphone of the mobile terminal; calculate a short-time zero-crossing rate of the first sound signal, the short-time zero-crossing rate including the zero-crossing rate of the sound within 5 ms to 15 ms; remove the low-frequency part and the high-frequency part of the first sound signal to obtain a second sound signal; acquire energy data of the second sound signal; remove the non-speech part of the second sound signal according to the short-time zero-crossing rate and the energy data to obtain a third sound signal; acquire a howling signal of the third sound signal; remove, from the third sound signal, the single-frequency signal of the howling signal to obtain a fourth sound signal; and amplify the fourth sound signal.

The methods disclosed in the foregoing embodiments of the present invention may be applied to, or implemented by, the processor 801. The processor 801 may be an integrated circuit chip with signal processing capability. During implementation, each step of the above method may be completed by an integrated logic circuit of hardware in the processor 801 or by instructions in the form of software. The processor 801 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and can implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of the present invention. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the methods disclosed in the embodiments of the present invention may be executed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may reside in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 802; the processor 801 reads the information in the memory 802 and completes the steps of the above method in combination with its hardware.

It can be understood that the embodiments described herein may be implemented in hardware, software, firmware, middleware, microcode, or a combination thereof. For a hardware implementation, the processing unit may be implemented in one or more application-specific integrated circuits (ASIC), digital signal processors (DSP), digital signal processing devices (DSP Device, DSPD), programmable logic devices (PLD), field-programmable gate arrays (FPGA), general-purpose processors, controllers, microcontrollers, microprocessors, other electronic units for performing the functions described in this application, or a combination thereof.

For a software implementation, the techniques described in the embodiments of the present invention may be implemented through modules (for example, procedures and functions) that perform the functions described in the embodiments of the present invention. The software code may be stored in a memory and executed by a processor. The memory may be implemented within the processor or external to the processor.

Optionally, the processor 801 is further configured to: analyze the short-time zero-crossing rate and the energy data to obtain frequency data and amplitude data of the second sound signal; judge, according to the frequency data and the amplitude data, whether the non-speech part exists in the second sound signal; and if so, mute the non-speech part to obtain the third sound signal.

Optionally, the processor 801 is further configured to: acquire, according to the energy data, the maximum energy signals of the third sound signal at low, medium, and high frequencies; judge whether the maximum energy signal is a continuous signal; and if so, determine that the maximum energy signal is a howling signal.

Optionally, the processor 801 is further configured to: process the third sound signal with an adaptive notch filter to remove the single-frequency signal of the howling signal and obtain the fourth sound signal.

Optionally, the processor 801 is further configured to: perform a fast Fourier transform on the second sound signal to obtain spectrum data of the second sound signal; and acquire the energy data from the spectrum data, the energy data including the energy peak signals of the second sound signal at low, medium, and high frequencies.

The mobile terminal 800 can implement each process implemented by the mobile terminal in the foregoing embodiments; to avoid repetition, the details are not described again here.

The mobile terminal 800 provided by this embodiment of the present invention acquires a first sound signal collected by a microphone of the mobile terminal; calculates a short-time zero-crossing rate of the first sound signal, the short-time zero-crossing rate including the zero-crossing rate of the sound within 5 ms to 15 ms; removes the low-frequency part and the high-frequency part of the first sound signal to obtain a second sound signal; acquires energy data of the second sound signal; removes the non-speech part of the second sound signal according to the short-time zero-crossing rate and the energy data to obtain a third sound signal; acquires a howling signal of the third sound signal; removes, from the third sound signal, the single-frequency signal of the howling signal to obtain a fourth sound signal; and amplifies the fourth sound signal. Howling suppression is thereby combined with human voice recognition, so that howling is suppressed or eliminated while the playback quality of the human voice is preserved, giving a better user experience.

Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or in software depends on the specific application and the design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each specific application, but such implementations should not be regarded as going beyond the scope of the present invention.

Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, devices, and units described above may be found in the corresponding processes of the foregoing method embodiments and are not described again here.

In the embodiments provided in this application, it should be understood that the disclosed devices and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.

The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.

If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disc.

The above are only specific embodiments of the present invention, but the scope of protection of the present invention is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be determined by the scope of protection of the claims.

Claims

1. A sound signal processing method, characterized by comprising:

acquiring a first sound signal collected by a microphone of a mobile terminal;

calculating a short-time zero-crossing rate of the first sound signal, the short-time zero-crossing rate including the zero-crossing rate of the sound within 5 ms to 15 ms;

removing the low-frequency part and the high-frequency part of the first sound signal to obtain a second sound signal;

acquiring energy data of the second sound signal;

removing the non-speech part of the second sound signal according to the short-time zero-crossing rate and the energy data to obtain a third sound signal;

acquiring a howling signal of the third sound signal;

removing, from the third sound signal, the single-frequency signal of the howling signal to obtain a fourth sound signal;

amplifying the fourth sound signal.

2. The method according to claim 1, characterized in that the step of removing the non-speech part of the second sound signal according to the zero-crossing rate and the energy data to obtain a third sound signal comprises:

analyzing the short-time zero-crossing rate and the energy data to obtain frequency data and amplitude data of the second sound signal;

judging, according to the frequency data and the amplitude data, whether the non-speech part exists in the second sound signal;

if so, muting the non-speech part to obtain the third sound signal.

3. The method according to claim 1, characterized in that the step of acquiring the howling signal of the third sound signal comprises:

acquiring, according to the energy data, the maximum energy signals of the third sound signal at low, medium, and high frequencies;

judging whether the maximum energy signal is a continuous signal;

if so, determining that the maximum energy signal is a howling signal.

4. The method according to claim 1, characterized in that the step of removing, from the third sound signal, the single-frequency signal of the howling signal to obtain a fourth sound signal comprises:

processing the third sound signal with an adaptive notch filter to remove the single-frequency signal of the howling signal and obtain the fourth sound signal.

5. The method according to claim 1, characterized in that the step of acquiring the energy data of the second sound signal comprises:

performing a fast Fourier transform on the second sound signal to obtain spectrum data of the second sound signal;

acquiring the energy data from the spectrum data, the energy data including the energy peak signals of the second sound signal at low, medium, and high frequencies.

6. A mobile terminal, characterized by comprising:

a first acquisition module, configured to acquire a first sound signal collected by a microphone of the mobile terminal;

a calculation module, configured to calculate a short-time zero-crossing rate of the first sound signal, the short-time zero-crossing rate including the zero-crossing rate of the sound within 5 ms to 15 ms;

a first filtering module, configured to remove the low-frequency part and the high-frequency part of the first sound signal to obtain a second sound signal;

a second acquisition module, configured to acquire energy data of the second sound signal;

a mute module, configured to remove the non-speech part of the second sound signal according to the short-time zero-crossing rate and the energy data to obtain a third sound signal;

a third acquisition module, configured to acquire a howling signal of the third sound signal;

a second filtering module, configured to remove, from the third sound signal, the single-frequency signal of the howling signal to obtain a fourth sound signal;

an amplification module, configured to amplify the fourth sound signal.

7. The mobile terminal according to claim 6, characterized in that the mute module comprises:

an analysis unit, configured to analyze the short-time zero-crossing rate and the energy data to obtain frequency data and amplitude data of the second sound signal;

a first judging unit, configured to judge, according to the frequency data and the amplitude data, whether the non-speech part exists in the second sound signal;

a mute unit, configured to mute the non-speech part to obtain the third sound signal.

8. The mobile terminal according to claim 6, characterized in that the third acquisition module comprises:

a first acquisition unit, configured to acquire, according to the energy data, the maximum energy signals of the third sound signal at low, medium, and high frequencies;

a second judging unit, configured to judge whether the maximum energy signal is a continuous signal;

a determining unit, configured to determine that the maximum energy signal is a howling signal.

9. The mobile terminal according to claim 6, characterized in that the second filtering module comprises:

a filtering unit, configured to process the third sound signal with an adaptive notch filter to remove the single-frequency signal of the howling signal and obtain the fourth sound signal.

10. The mobile terminal according to claim 6, characterized in that the second acquisition module comprises:

a Fourier transform unit, configured to perform a fast Fourier transform on the second sound signal to obtain spectrum data of the second sound signal;

a second acquisition unit, configured to acquire the energy data from the spectrum data, the energy data including the energy peak signals of the second sound signal at low, medium, and high frequencies.