CN104867498A

CN104867498A - Mobile communication terminal and voice enhancement method and module thereof

Info

Publication number: CN104867498A
Application number: CN201510164111.0A
Authority: CN
Inventors: 王丹; 舒畅; 张国新; 王雪祥
Original assignee: Shenzhen Micro & Nano Integrated Circuit And System Application Institute
Current assignee: Shenzhen Micro & Nano Integrated Circuit And System Application Institute
Priority date: 2014-12-26
Filing date: 2015-04-09
Publication date: 2015-08-26

Abstract

The invention provides a voice enhancement method comprising the following steps: S101, transforming each frame of input voice to the frequency domain and dividing it into multiple subbands; S103, estimating the signal-to-noise ratio of each subband; S105, detecting voice and noise; and S107, modifying the gain of each subband according to the magnitude of its signal-to-noise ratio. The invention also provides a voice enhancement module and a mobile communication terminal. The voice enhancement method and module suppress noise markedly and can eliminate it, while greatly reducing the computational load of the whole system. A mobile communication terminal using the voice enhancement module achieves better voice communication quality and adapts well to different environments.

Description

A communication terminal, and a voice enhancement method and module thereof

Technical field

The present invention relates to communication technology, and in particular to a communication terminal and a voice enhancement method and module thereof.

Background art

In voice communication, ambient noise (from air conditioners, fans, computers, noisy surroundings, and the like) easily interferes with the speaker's voice, degrading voice quality and the performance of the whole communication system. Voice enhancement (also called noise suppression) technology is usually adopted to solve this problem.

Since the late 1970s, various voice enhancement techniques have been studied in China and abroad, including spectral subtraction, Wiener filtering, and Kalman filtering. These techniques have been applied in practical communication systems and can considerably improve voice quality in communication.

Fig. 1 shows the flowchart of a noise estimation method based on minimum tracking that is used in the prior art. The method first filters the power spectrum of the noisy speech with an optimal smoothing filter to obtain a rough estimate of the noise, then finds the minimum of this rough estimate within a certain time window, and finally applies a bias correction to that minimum to obtain the estimated noise variance.
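
The three stages of this prior-art method can be illustrated with a minimal Python sketch for a single frequency bin. It is a simplification under stated assumptions: a fixed smoothing constant stands in for the optimal smoothing filter, and the function name `track_noise_minimum` and the values of `alpha`, `window`, and `bias` are hypothetical rather than taken from the prior art.

```python
from collections import deque


def track_noise_minimum(power_frames, alpha=0.9, window=8, bias=1.5):
    """Return one noise-power estimate per frame for a single frequency bin.

    power_frames: noisy-speech power values, one per frame.
    alpha, window, bias: illustrative constants (assumptions).
    """
    smoothed = None
    recent = deque(maxlen=window)  # sliding time window of smoothed power
    estimates = []
    for p in power_frames:
        # Stage 1: recursive smoothing gives a rough noise estimate.
        smoothed = p if smoothed is None else alpha * smoothed + (1 - alpha) * p
        recent.append(smoothed)
        # Stage 2: minimum inside the time window.
        # Stage 3: bias correction of that minimum.
        estimates.append(bias * min(recent))
    return estimates
```

In this sketch, a sudden rise in the noise level shows up in the estimate only once the older, lower values have left the window.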

However, this method has the following shortcomings: first, the computational load of the system is large; second, it requires the background noise to remain stationary and the signal-to-noise ratio to be relatively high, which rarely matches actual conditions.

Summary of the invention

In view of the above problems in the prior art, it is necessary to provide a communication terminal and a voice enhancement method and module thereof that address them.

A voice enhancement method comprises the following steps: S101, transforming each frame of input voice to the frequency domain and dividing it into multiple subbands; S103, estimating the signal-to-noise ratio of each subband; S105, detecting voice and noise; S107, modifying the gain of each subband according to the magnitude of its signal-to-noise ratio.

In a preferred embodiment of the invention, in step S101, each frame of input voice is transformed to the frequency domain and divided into 16 subbands.

In a preferred embodiment of the invention, in step S105, a voice metric mechanism is used to detect voice and noise.

In a preferred embodiment of the invention, update_flag is used as the voice detection flag: voice enhancement is performed first, then voice detection, and automatic level control is then performed according to the voice detection result and the characteristics of the voice signal.

The invention further provides a voice enhancement module comprising a voice input unit, an audio processing unit, and a voice output unit connected in sequence. Voice enters the audio processing unit from the voice input unit. The audio processing unit transforms each frame of input voice to the frequency domain and divides it into multiple subbands, then estimates the signal-to-noise ratio of each subband, then detects voice and noise, and finally modifies the gain of each subband according to the magnitude of its signal-to-noise ratio. The result is output via the voice output unit.

In a preferred embodiment of the invention, the audio processing unit transforms each frame of input voice to the frequency domain and divides it into 16 subbands.

In a preferred embodiment of the invention, the audio processing unit uses a voice metric mechanism to detect voice and noise.

The invention also provides a communication terminal comprising the above voice enhancement module.

Compared with the prior art, the voice enhancement method and module provided by the invention have the following advantages: first, the noise suppression effect is significant, and noise can be eliminated; second, system performance degrades little at low signal-to-noise ratios; third, they suppress not only common noise but also narrowband noise and sudden loud noise, and are able to suppress non-stationary noise; fourth, the voice enhancement algorithm, voice activity detection (VAD), and automatic level control (ALC) are organically combined, greatly reducing the computational load of the whole system. A communication terminal using the voice enhancement module achieves better voice communication quality and adapts well to different environments.

Brief description of the drawings

Fig. 1 is a flowchart of the noise estimation method based on minimum tracking used in the prior art;

Fig. 2 is a flowchart of the voice enhancement method provided by the invention;

Fig. 3 is a workflow diagram of the voice enhancement method shown in Fig. 2;

Fig. 4 is a schematic diagram of the voice enhancement module provided by the invention.

Detailed description

To facilitate understanding of the invention, it is described more fully below with reference to the accompanying drawings, which show preferred embodiments of the invention. The invention may, however, be implemented in many different forms and is not limited to the embodiments described herein. Rather, these embodiments are provided so that the disclosure of the invention will be understood more thoroughly and completely.

It should be noted that when an element is referred to as being "fixed to" another element, it may be directly on the other element, or an intervening element may be present. When an element is considered to be "connected to" another element, it may be directly connected to the other element, or an intervening element may be present. The terms "vertical", "horizontal", "left", "right", and similar expressions are used herein for illustration only and do not denote the only embodiment.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those skilled in the art to which the invention belongs. The terms used in the description of the invention are intended only to describe specific embodiments and not to limit the invention. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.

Referring to Fig. 2, the invention provides a voice enhancement method comprising the following steps: S101, transforming each frame of input voice to the frequency domain and dividing it into multiple subbands; S103, estimating the signal-to-noise ratio of each subband; S105, detecting voice and noise; S107, modifying the gain of each subband according to the magnitude of its signal-to-noise ratio.

In this embodiment, each frame of input voice is transformed to the frequency domain and divided into 16 subbands. The signal-to-noise ratio of each subband is then estimated. Next, a voice metric (Voice Metric) mechanism is used to detect voice and noise, which allows the background noise to be estimated accurately. Finally, the gain of each subband is modified according to the magnitude of its signal-to-noise ratio, thereby achieving noise suppression.
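
Steps S101 and S103 can be sketched in Python as follows. This is an illustration under assumptions the text does not state: the positive-frequency bins are grouped into 16 equal-width subbands, a naive DFT stands in for a fast Fourier transform, and the per-subband noise energy is supplied by the caller. The names `dft`, `subband_energies`, and `subband_snr_db` are hypothetical.

```python
import cmath
import math

NUM_SUBBANDS = 16  # the embodiment divides each frame into 16 subbands


def dft(frame):
    """Naive discrete Fourier transform (a real system would use an FFT)."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n)
    ]


def subband_energies(spectrum, num_subbands=NUM_SUBBANDS):
    """Sum |X[k]|^2 over equal-width groups of bins 0 .. N/2 - 1 (assumed layout)."""
    half = len(spectrum) // 2
    width = half // num_subbands
    if width == 0:
        raise ValueError("frame too short for the requested number of subbands")
    return [
        sum(abs(spectrum[k]) ** 2 for k in range(b * width, (b + 1) * width))
        for b in range(num_subbands)
    ]


def subband_snr_db(band_energy, noise_energy, floor=1e-12):
    """Per-subband SNR in dB; `floor` guards against division by zero."""
    return [
        10.0 * math.log10(max(e, floor) / max(n, floor))
        for e, n in zip(band_energy, noise_energy)
    ]


# Usage: a 64-sample frame holding a tone at DFT bin 5 falls into subband 2.
frame = [math.cos(2 * math.pi * 5 * t / 64) for t in range(64)]
energies = subband_energies(dft(frame))
snr = subband_snr_db(energies, [1.0] * NUM_SUBBANDS)
```

Working on 16 subband values per frame instead of every frequency bin keeps the per-frame arithmetic small.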

Further, referring to Fig. 3, the invention performs its computation on the subbands after a Fourier transform and then obtains the time-domain output signal through an inverse Fourier transform. Specifically, to make full use of the resources of the voice enhancement algorithm, the invention uses the update_flag in the voice enhancement algorithm as the flag for VAD (Voice Activity Detection, also called voice endpoint detection). Voice enhancement is performed first and VAD afterwards, which improves the VAD result. Effective automatic level control (ALC) can then be performed according to the VAD result and the characteristics of the voice signal.
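
One way to reuse update_flag as the VAD flag is sketched below in Python. The text does not say how update_flag is computed, so the sketch assumes it is set when a frame is judged to be noise and the background-noise estimate is therefore updated. The voice metric used here (a sum of per-subband SNRs in dB), the threshold, the smoothing constant, and the name `classify_frame` are likewise hypothetical.

```python
import math


def classify_frame(band_energy, noise_energy, metric_threshold=30.0, smooth=0.9):
    """Return (update_flag, vad_active, noise_energy) for one frame.

    band_energy, noise_energy: per-subband energies of the current frame and
    of the running background-noise estimate.
    """
    # Hypothetical voice metric: per-subband SNR in dB, clamped at 0 and summed.
    metric = sum(
        max(0.0, 10.0 * math.log10(max(e, 1e-12) / max(n, 1e-12)))
        for e, n in zip(band_energy, noise_energy)
    )
    # Assumed meaning of update_flag: the frame looks like noise, so the
    # noise estimate may be refreshed from it.
    update_flag = metric < metric_threshold
    if update_flag:
        noise_energy = [
            smooth * n + (1.0 - smooth) * e
            for e, n in zip(band_energy, noise_energy)
        ]
    # The VAD decision costs nothing extra: it is the complement of the flag.
    vad_active = not update_flag
    return update_flag, vad_active, noise_energy
```

Under these assumptions the VAD needs no detector of its own, because the decision falls out of a quantity the enhancement stage already computes.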

The basic principle of automatic level control is to compare the current energy with the long-term average energy to decide whether to amplify or attenuate the voice, while updating the long-term average energy according to the current energy.
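
This principle can be sketched in Python as follows. The gain rule (square root of the energy ratio), its limits, the update rate, and the choice to leave non-voice frames untouched are illustrative assumptions; the text states only that the current energy is compared with the long-term average, which is then updated. The name `alc_step` is hypothetical.

```python
import math


def alc_step(frame, long_term_energy, vad_active,
             rate=0.05, min_gain=0.5, max_gain=2.0):
    """Apply one automatic-level-control step; return (output, long_term_energy)."""
    current = sum(x * x for x in frame) / len(frame)  # mean energy of this frame
    if not vad_active or current <= 0.0:
        # Assumption: frames without voice are passed through and do not
        # influence the long-term average.
        return list(frame), long_term_energy
    # Quieter than the long-term average -> gain above 1 (amplify);
    # louder than the long-term average -> gain below 1 (attenuate).
    gain = math.sqrt(long_term_energy / current)
    gain = min(max_gain, max(min_gain, gain))
    # Update the long-term average from the current energy.
    long_term_energy = (1.0 - rate) * long_term_energy + rate * current
    return [gain * x for x in frame], long_term_energy
```

For example, with a long-term average energy of 1.0, a frame of constant amplitude 0.5 is amplified by a factor of 2 and a frame of constant amplitude 2.0 is attenuated by a factor of 0.5, so both leave at amplitude 1.0.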

The voice enhancement method uses an improved voice enhancement algorithm that combines spectral subtraction with subband techniques, and the voice detection method and the automatic level control method are both based on parameters from the voice enhancement algorithm. Voice enhancement, VAD, and ALC are thus organically combined, which greatly reduces the computational load of the whole system. The method suppresses not only common noise but also narrowband noise and sudden loud noise, and is able to suppress non-stationary noise.
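
As an illustration of how spectral subtraction can be expressed as a per-subband gain, the Python sketch below uses the textbook power-subtraction rule with a gain floor. It is not the improved algorithm of the invention, whose details the text does not give; `spectral_subtraction_gain`, `apply_subband_gains`, and `floor_gain` are hypothetical names.

```python
import math


def spectral_subtraction_gain(noisy_energy, noise_energy, floor_gain=0.1):
    """Amplitude gain for one subband from power spectral subtraction.

    The clean-power estimate is (noisy - noise); dividing by the noisy power
    and taking the square root turns it into a gain on the subband amplitude.
    """
    if noisy_energy <= 0.0:
        return floor_gain
    remaining = 1.0 - noise_energy / noisy_energy
    # Negative results (noise estimate above the observed power) clamp to the floor.
    return max(floor_gain, math.sqrt(max(0.0, remaining)))


def apply_subband_gains(bands, gains):
    """Scale every spectrum bin of each subband by that subband's gain.

    bands: list of subbands, each a list of complex spectrum bins.
    """
    return [[g * x for x in band] for band, g in zip(bands, gains)]
```

The floor keeps a subband from being zeroed out completely, a common way to limit musical-noise artifacts in spectral subtraction.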

Referring to Fig. 4, the invention further provides a voice enhancement module 100 comprising a voice input unit 10, an audio processing unit 20, and a voice output unit 30 connected in sequence. Voice enters the audio processing unit 20 from the voice input unit 10. The audio processing unit 20 transforms each frame of input voice to the frequency domain and divides it into multiple subbands, then estimates the signal-to-noise ratio of each subband, then detects voice and noise, and finally modifies the gain of each subband according to the magnitude of its signal-to-noise ratio. The result is output via the voice output unit 30.
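
The chain of three units can be pictured as a simple pipeline. In the Python sketch below, the class name, the callables, and the end-of-stream convention (`None`) are hypothetical; the reference numerals in the comments map each callable to the unit it stands for.

```python
class VoiceEnhancementModule:
    """Input unit -> processing unit -> output unit, connected in sequence."""

    def __init__(self, read_frame, process_frame, write_frame):
        self.read_frame = read_frame        # stands for voice input unit 10
        self.process_frame = process_frame  # stands for audio processing unit 20
        self.write_frame = write_frame      # stands for voice output unit 30

    def run(self):
        """Pull frames from the input, enhance them, and push them to the output."""
        while True:
            frame = self.read_frame()
            if frame is None:  # assumed end-of-stream marker
                break
            self.write_frame(self.process_frame(frame))


# Usage with stand-in units: the "processing" here merely halves each sample.
source = iter([[1.0, 2.0], [3.0]])
sink = []
module = VoiceEnhancementModule(
    lambda: next(source, None),
    lambda frame: [0.5 * x for x in frame],
    sink.append,
)
module.run()
```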

In this embodiment, the audio processing unit 20 transforms each frame of input voice to the frequency domain and divides it into 16 subbands. It then estimates the signal-to-noise ratio of each subband. Next, it uses a voice metric mechanism to detect voice and noise, which allows the background noise to be estimated accurately. Finally, it modifies the gain of each subband according to the magnitude of its signal-to-noise ratio, thereby achieving noise suppression.

Further, referring also to Fig. 3, the invention performs its computation on the subbands after a Fourier transform and then obtains the time-domain output signal through an inverse Fourier transform. Specifically, to make full use of the resources of the voice enhancement algorithm, the audio processing unit 20 uses the update_flag in the voice enhancement algorithm as the VAD flag. Voice enhancement is performed first and VAD afterwards, which improves the VAD result. Effective automatic level control can then be performed according to the VAD result and the characteristics of the voice signal.

The voice enhancement module 100 uses an improved voice enhancement algorithm that combines spectral subtraction with subband techniques, and the voice detection method and the automatic level control method are both based on parameters from the voice enhancement algorithm. Voice enhancement, VAD, and ALC are thus organically combined, which greatly reduces the computational load of the whole system. The module suppresses not only common noise but also narrowband noise and sudden loud noise, and is able to suppress non-stationary noise.

The invention also provides a communication terminal comprising the above voice enhancement module 100. A communication terminal using the voice enhancement module 100 achieves better voice communication quality.

Compared with the prior art, the voice enhancement method and module provided by the invention have the following advantages: first, the noise suppression effect is significant, and noise can be eliminated; second, system performance degrades little at low signal-to-noise ratios; third, they suppress not only common noise but also narrowband noise and sudden loud noise, and are able to suppress non-stationary noise; fourth, the voice enhancement algorithm, VAD, and ALC are organically combined, greatly reducing the computational load of the whole system and giving good adaptability to different environments.

The embodiments above represent only several implementations of the invention. Their description is relatively specific and detailed, but it should not therefore be construed as limiting the scope of the patent. It should be noted that those of ordinary skill in the art can make a number of variations and improvements without departing from the concept of the invention, and these all fall within the protection scope of the invention. The protection scope of this patent shall therefore be determined by the appended claims.

Claims

1. A voice enhancement method, characterized by comprising the following steps:

S101, transforming each frame of input voice to the frequency domain and dividing it into multiple subbands;

S103, estimating the signal-to-noise ratio of each subband;

S105, detecting voice and noise;

S107, modifying the gain of each subband according to the magnitude of its signal-to-noise ratio.

2. The voice enhancement method of claim 1, characterized in that, in step S101, each frame of input voice is transformed to the frequency domain and divided into 16 subbands.

3. The voice enhancement method of claim 1, characterized in that, in step S105, a voice metric mechanism is used to detect voice and noise.

4. The voice enhancement method of claim 1, characterized in that update_flag is used as the voice detection flag, voice enhancement is performed first, then voice detection, and automatic level control is then performed according to the voice detection result and the characteristics of the voice signal.

5. A voice enhancement module, characterized by comprising a voice input unit, an audio processing unit, and a voice output unit connected in sequence, wherein voice enters the audio processing unit from the voice input unit, and the audio processing unit transforms each frame of input voice to the frequency domain and divides it into multiple subbands, then estimates the signal-to-noise ratio of each subband, then detects voice and noise, and finally modifies the gain of each subband according to the magnitude of its signal-to-noise ratio, the result being output via the voice output unit.

6. The voice enhancement module of claim 5, characterized in that the audio processing unit transforms each frame of input voice to the frequency domain and divides it into 16 subbands.

7. The voice enhancement module of claim 5, characterized in that the audio processing unit uses a voice metric mechanism to detect voice and noise.

8. A communication terminal, characterized in that the communication terminal comprises the voice enhancement module of any one of claims 5 to 7.