CN104575515A

CN104575515A - Method and device for improving voice quality

Info

Publication number: CN104575515A
Application number: CN201310503510.6A
Authority: CN
Inventors: 孙焘; 梁超
Original assignee: ZTE Corp
Current assignee: ZTE Corp
Priority date: 2013-10-23
Filing date: 2013-10-23
Publication date: 2015-04-29
Also published as: WO2014161388A1

Abstract

The invention discloses a method and a device for improving voice quality. The method comprises the following steps: extracting a characteristic voice signal from voice signals to be processed (such as voice signals input into a loudspeaker); adjusting the amplitude of the extracted characteristic voice signal according to a preset rule to enable the amplitude to be within a preset amplitude range to ensure relatively good voice quality; reconstructing the adjusted characteristic voice signal and other voice signals in the voice signals to obtain a processed voice signal which has relatively good voice quality after being processed. By adopting the method and the device for improving the voice quality, the quality of the voice signals can be improved without increasing the size of the loudspeaker or increasing the electric power input into the loudspeaker, various problems caused by the size increase of the loudspeaker and input electric power increase can be avoided, and relatively good experience can be brought to users.

Description

Method and device for improving voice quality

Technical field

The present invention relates to the field of voice signal processing, and in particular to a method and a device for improving voice quality.

Background art

At present, because of the size and power constraints of mobile terminals such as mobile phones, the loudspeakers they use are small, and the acoustic cavity reserved for the loudspeaker is also very small. In addition, the overwhelming majority of mobile phone calls are still carried in the CS (circuit-switched) domain. Limited by the bandwidth of the circuit-switched core network, speech coding algorithms often cover only the 300 Hz-3400 Hz voice band, and even wideband speech extends only to about 6000 Hz. With loudspeakers manufactured according to the prior art and their matching acoustic cavity designs, raising the loudness often distorts the voice signal when it reaches the loudspeaker sound system, and the clarity is insufficient.

The approach most commonly adopted at present to improve the received voice quality and loudness of a terminal loudspeaker is to improve them in hardware. One way is to increase the physical size of the loudspeaker. As the size of a loudspeaker increases, its effective acoustic radiation power also increases, which compensates for the low acoustic radiation power of a small loudspeaker and the large attenuation along the transmission path, so that more of the voice signal reaches the human ear and the clarity and intelligibility of the call improve. The other way is to raise the electrical power that the circuit feeds to the loudspeaker, so that the loudspeaker works at a higher power. This also compensates for the attenuation of sound along the transmission path, so that more of the voice signal reaches the human ear and the clarity and intelligibility of the call improve.

However, both approaches have serious drawbacks. A larger loudspeaker not only takes up more space inside the terminal, but also requires a correspondingly larger acoustic cavity; otherwise sound quality and loudness still suffer. For existing mobile terminals such as mobile phones, which are trending toward ultra-thin designs, such an increase in size cannot be accommodated. Therefore, when the loudspeaker size is limited, loudness and sound quality can only be improved by raising the electrical power input to the loudspeaker, but this easily leads to loudspeaker power overload, distortion, or even damage to the loudspeaker.

Summary of the invention

The main technical problem to be solved by the present invention is to provide a method and a device for improving voice quality, solving the problem that existing ways of improving voice quality require increasing the size of the loudspeaker or raising the electrical power input to the loudspeaker.

To solve the above technical problem, the present invention provides a method for improving voice quality, comprising:

extracting a characteristic voice signal from a voice signal to be processed;

adjusting the amplitude of the extracted characteristic voice signal according to a preset rule;

reconstructing the adjusted characteristic voice signal together with the other voice signals contained in the voice signal to be processed, to obtain a processed voice signal.

In an embodiment of the present invention, the characteristic voice signal comprises a voice fundamental (pitch) signal and/or a voice unvoiced signal.

In an embodiment of the present invention, when the characteristic voice signal comprises a voice fundamental signal, the preset rule comprises:

when the amplitude of the voice fundamental signal is less than a lowest fundamental signal amplitude threshold, adjusting it to be equal to or greater than the lowest fundamental signal amplitude threshold; when the amplitude of the voice fundamental signal is greater than a highest fundamental signal amplitude threshold, adjusting it to be less than or equal to the highest fundamental signal amplitude threshold;

when the characteristic voice signal comprises a voice unvoiced signal, the preset rule comprises:

when the amplitude of the voice unvoiced signal is less than a lowest unvoiced signal amplitude threshold, adjusting it to be equal to or greater than the lowest unvoiced signal amplitude threshold; when the amplitude of the voice unvoiced signal is greater than a highest unvoiced signal amplitude threshold, adjusting it to be less than or equal to the highest unvoiced signal amplitude threshold.

In an embodiment of the present invention, after the amplitude of the characteristic voice signal is adjusted and before reconstruction is performed based on the adjusted characteristic voice signal, the method further comprises: judging whether the consistency between the adjusted characteristic voice signal and the originally extracted characteristic voice signal meets a preset requirement; if not, readjusting the amplitude of the characteristic voice signal.

In an embodiment of the present invention, after the adjusted characteristic voice signal and the other voice signals contained in the voice signal to be processed are reconstructed to obtain the processed voice signal, the method further comprises:

performing expansion processing on the processed voice signal according to the amplitude of the adjusted characteristic voice signal.

To solve the above problem, the present invention also provides a device for improving voice quality, comprising:

a voice extraction module, configured to extract a characteristic voice signal from a voice signal to be processed;

a voice processing module, configured to adjust the amplitude of the extracted characteristic voice signal according to a preset rule;

a voice reconstruction module, configured to reconstruct the adjusted characteristic voice signal together with the other voice signals contained in the voice signal to be processed, to obtain a processed voice signal.

In an embodiment of the present invention, the device further comprises a judging module. After the voice processing module adjusts the amplitude of the characteristic voice signal and before the voice reconstruction module performs reconstruction based on the adjusted characteristic voice signal, the judging module judges whether the consistency between the adjusted characteristic voice signal and the characteristic voice signal originally extracted by the voice extraction module meets a preset requirement; if not, it notifies the voice processing module to readjust the amplitude of the characteristic voice signal.

In an embodiment of the present invention, the device further comprises a voice expansion module. After the voice reconstruction module reconstructs the adjusted characteristic voice signal and the other voice signals contained in the voice signal to be processed to obtain the processed voice signal, the voice expansion module performs expansion processing on the processed voice signal according to the amplitude of the characteristic voice signal as adjusted by the voice processing module.

The beneficial effects of the present invention are as follows:

The method and device for improving voice quality provided by the present invention extract a characteristic voice signal from a voice signal to be processed (for example, the voice signal input to a loudspeaker); then adjust the amplitude of the extracted characteristic voice signal according to a preset rule so that it falls within a preset amplitude range that ensures better voice quality; and then reconstruct the adjusted characteristic voice signal together with the other voice signals in the voice signal to be processed, obtaining a processed voice signal of better quality. The method and device can therefore improve the quality of the voice signal without increasing the size of the loudspeaker and without raising the electrical power input to the loudspeaker, avoid the various problems caused by a larger loudspeaker or a higher input power, and give users a better experience.

Brief description of the drawings

Fig. 1 is a schematic flowchart of the method for improving voice quality in Embodiment one of the present invention;

Fig. 2 is a first schematic structural diagram of the device for improving voice quality in Embodiment two of the present invention;

Fig. 3 is a second schematic structural diagram of the device for improving voice quality in Embodiment two of the present invention;

Fig. 4 is a third schematic structural diagram of the device for improving voice quality in Embodiment two of the present invention;

Fig. 5 is a schematic flowchart of the method for improving voice quality in Embodiment three of the present invention.

Detailed description

The present invention is described in further detail below through embodiments, with reference to the accompanying drawings.

The present invention extracts a characteristic voice signal from a voice signal to be processed; then adjusts the amplitude of the extracted characteristic voice signal according to a preset rule so that it falls within a preset amplitude range; and then reconstructs the adjusted characteristic voice signal together with the other voice signals in the voice signal to be processed, obtaining a processed voice signal of better quality. For a better understanding, the present invention is further described below with reference to the accompanying drawings and the individual embodiments.

Embodiment one:

Referring to Fig. 1, the method for improving voice quality provided by this embodiment comprises:

Step 101: extract a characteristic voice signal from the voice signal to be processed;

In this step, which kind of characteristic voice signal to extract can be chosen according to the actual input voice signal and the specific application scenario, provided that the characteristic voice signal is reasonably representative and meets the requirements of the subsequent voice reconstruction;

Step 102: adjust the amplitude of the extracted characteristic voice signal according to a preset rule;

This step adjusts the amplitude of the extracted characteristic voice signal according to a preset rule so that it falls within an optimal amplitude range. The optimal amplitude range is set according to the specific application scenario and the amplitude distribution of the voice signal currently being processed;

Step 103: reconstruct the adjusted characteristic voice signal together with the other voice signals contained in the voice signal to be processed, to obtain the processed voice signal.

Compared with the unprocessed voice signal, at least one of the characteristic voice signals contained in the voice signal produced by this step has had its amplitude adjusted, so the quality of the reconstructed voice signal is better than that of the voice signal before processing. Moreover, this processing requires neither a larger loudspeaker nor a higher electrical input power to the loudspeaker, and therefore does not cause loudspeaker power overload, distortion, or damage to the loudspeaker.
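Steps 101 to 103 can be sketched as a three-stage pipeline. The patent does not specify how the extraction, adjustment, or reconstruction are implemented, so the sketch below takes them as caller-supplied functions; the names `improve_voice_quality`, `toy_extract`, and `toy_reconstruct` are hypothetical, and the moving-average split is only a placeholder, not the patent's extraction method.

```python
from typing import Callable, List, Tuple

Signal = List[float]


def improve_voice_quality(
    signal: Signal,
    extract: Callable[[Signal], Tuple[Signal, Signal]],
    adjust: Callable[[Signal], Signal],
    reconstruct: Callable[[Signal, Signal], Signal],
) -> Signal:
    """Run the three steps of Embodiment one on a list of samples."""
    # Step 101: split the input into the characteristic part and the rest.
    feature, rest = extract(signal)
    # Step 102: adjust the amplitude of the characteristic part only.
    adjusted = adjust(feature)
    # Step 103: recombine the adjusted part with the untouched remainder.
    return reconstruct(adjusted, rest)


def toy_extract(signal: Signal) -> Tuple[Signal, Signal]:
    """Placeholder split (assumption): a 3-point moving average stands in
    for the characteristic voice signal; the residual is 'the rest'."""
    feature = []
    for i in range(len(signal)):
        window = signal[max(0, i - 1): i + 2]
        feature.append(sum(window) / len(window))
    rest = [s - f for s, f in zip(signal, feature)]
    return feature, rest


def toy_reconstruct(feature: Signal, rest: Signal) -> Signal:
    """Placeholder reconstruction (assumption): sample-wise sum."""
    return [f + r for f, r in zip(feature, rest)]
```

With an identity adjustment the pipeline returns the input (up to rounding), which is a convenient sanity check that the extraction and reconstruction placeholders are inverses of each other.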

In this embodiment, the extracted characteristic voice signal may be the voice fundamental signal, the voice unvoiced signal, or both. Which one to extract can be chosen according to the circumstances. For example, if the voice signal to be processed contains little voice unvoiced signal, or the amplitude of the voice unvoiced signal it contains is very low, while it contains a large amount of voice fundamental signal, then only the voice fundamental signal need be extracted and processed as described above; this alone improves voice quality to some extent. Conversely, if the proportion of voice fundamental signal is small and the proportion of voice unvoiced signal is much larger, then only the voice unvoiced signal need be extracted and processed, which also improves voice quality to some extent. When the proportions of voice fundamental signal and voice unvoiced signal are similar, both can be extracted and processed. The basis for choosing the characteristic signal is of course not limited to these cases, which are given here only as an illustration.
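The selection logic described above can be illustrated with a small decision function. This is a minimal sketch under stated assumptions: the patent gives no numeric criterion, so the `dominance` factor and the way the two proportions are measured are hypothetical.

```python
from typing import Set


def select_features(
    fundamental_share: float,
    unvoiced_share: float,
    dominance: float = 2.0,  # hypothetical: how much larger counts as "much more"
) -> Set[str]:
    """Decide which characteristic voice signals to extract.

    The shares are the proportions of the input occupied by each signal
    type; how they are measured is left to the caller (assumption).
    """
    if fundamental_share <= 0.0 and unvoiced_share <= 0.0:
        return set()  # nothing to extract
    if fundamental_share >= dominance * unvoiced_share:
        return {"fundamental"}  # unvoiced part is negligible
    if unvoiced_share >= dominance * fundamental_share:
        return {"unvoiced"}  # fundamental part is negligible
    return {"fundamental", "unvoiced"}  # comparable proportions
```

For instance, `select_features(0.8, 0.1)` selects only the fundamental signal, while `select_features(0.45, 0.40)` selects both.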

In this embodiment, when the extracted characteristic voice signal comprises the voice fundamental signal, the preset rule comprises:

when the amplitude of the voice fundamental signal is less than the lowest fundamental signal amplitude threshold, adjusting it to be equal to or greater than the lowest fundamental signal amplitude threshold; when the amplitude of the voice fundamental signal is greater than the highest fundamental signal amplitude threshold, adjusting it to be less than or equal to the highest fundamental signal amplitude threshold.

In this embodiment, when the extracted characteristic voice signal comprises the voice unvoiced signal, the preset rule comprises:

when the amplitude of the voice unvoiced signal is less than the lowest unvoiced signal amplitude threshold, adjusting it to be equal to or greater than the lowest unvoiced signal amplitude threshold; when the amplitude of the voice unvoiced signal is greater than the highest unvoiced signal amplitude threshold, adjusting it to be less than or equal to the highest unvoiced signal amplitude threshold.
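Both preset rules have the same shape: bring an amplitude that lies outside a [lowest, highest] range back inside it. The sketch below shows one admissible choice, clamping exactly to the nearest threshold, and applies it to the peak amplitude of a frame by uniform scaling. Treating "amplitude" as the frame peak, and the function names, are assumptions; the patent only requires that the result end up on the permitted side of each threshold.

```python
from typing import List


def clamp_amplitude(amplitude: float, lowest: float, highest: float) -> float:
    """Return the amplitude moved to the nearest threshold if out of range."""
    if lowest > highest:
        raise ValueError("lowest threshold must not exceed highest threshold")
    if amplitude < lowest:
        return lowest   # raise to the lowest threshold
    if amplitude > highest:
        return highest  # lower to the highest threshold
    return amplitude    # already within the preset range


def scale_frame_to_range(frame: List[float], lowest: float, highest: float) -> List[float]:
    """Scale a frame uniformly so its peak amplitude lies in [lowest, highest]."""
    peak = max((abs(x) for x in frame), default=0.0)
    if peak == 0.0:
        return list(frame)  # a silent frame cannot be scaled into range
    gain = clamp_amplitude(peak, lowest, highest) / peak
    return [x * gain for x in frame]
```

The same two functions serve the fundamental and the unvoiced signal; only the threshold pair passed in differs, and the concrete threshold values would be chosen per application.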

In this embodiment, to further ensure voice quality and to prevent the amplitude adjustment from distorting the characteristic voice signal, after the amplitude of the characteristic voice signal is adjusted and before the voice signal is reconstructed based on the adjusted characteristic voice signal, the method further comprises: judging whether the consistency between the adjusted characteristic voice signal and the originally extracted characteristic voice signal meets a preset requirement; if not, readjusting the amplitude of the characteristic voice signal.
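The adjust-check-readjust loop can be sketched as follows. The patent does not define the consistency measure or how the readjustment differs from the first attempt, so everything concrete here is an assumption: consistency is taken as the normalized correlation between the original and adjusted waveforms, the adjustment is a hard limiter at `ceiling`, and each readjustment halves the strength of the limiting.

```python
import math
from typing import List


def consistency(original: List[float], adjusted: List[float]) -> float:
    """Normalized correlation; 1.0 means the waveform shape is unchanged."""
    dot = sum(a * b for a, b in zip(original, adjusted))
    norm = math.sqrt(sum(a * a for a in original) * sum(b * b for b in adjusted))
    return dot / norm if norm > 0.0 else 1.0


def hard_limit(frame: List[float], ceiling: float) -> List[float]:
    """Clip every sample into [-ceiling, ceiling]."""
    return [max(-ceiling, min(ceiling, x)) for x in frame]


def adjust_with_check(
    frame: List[float],
    ceiling: float,
    required: float = 0.98,  # hypothetical preset requirement
    max_rounds: int = 8,     # hypothetical bound on readjustments
) -> List[float]:
    """Limit the frame, weakening the limiting until consistency is met."""
    limited = hard_limit(frame, ceiling)
    strength = 1.0
    adjusted = limited
    for _ in range(max_rounds):
        # Blend between the original (strength 0) and the fully limited frame.
        adjusted = [x + strength * (y - x) for x, y in zip(frame, limited)]
        if consistency(frame, adjusted) >= required:
            break
        strength *= 0.5  # readjust: apply a gentler version of the change
    # If max_rounds is exhausted, the last (gentlest) attempt is returned.
    return adjusted
```

Halving the strength guarantees the loop moves toward the original signal, whose consistency with itself is 1, so the requirement is eventually met for any threshold below 1 given enough rounds. The trade-off is that the final amplitude may remain above `ceiling`.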

In this embodiment, after step 103, that is, after the adjusted characteristic voice signal and the other voice signals contained in the voice signal to be processed have been reconstructed to obtain the processed voice signal, the following step may also be included in order to further ensure and improve the fullness of the voice signal:

performing expansion processing on the processed voice signal according to the amplitude of the adjusted characteristic voice signal. For example, the frequency range of the original voice signal is 200 Hz-3400 Hz; after expansion processing is performed on the processed voice signal according to the amplitude of the adjusted characteristic voice signal, the resulting frequency range may be 50 Hz-5000 Hz, which improves the fullness of the voice signal.

Embodiment two:

Referring to Fig. 2, the device for improving voice quality provided by this embodiment comprises:

a voice extraction module, configured to extract a characteristic voice signal from the voice signal to be processed. Which kind of characteristic voice signal to extract can be chosen according to the actual input voice signal and the specific application scenario, provided that the characteristic voice signal is reasonably representative and meets the requirements of the subsequent voice reconstruction;

a voice processing module, configured to adjust the amplitude of the extracted characteristic voice signal according to a preset rule. The purpose of the adjustment is to bring the amplitude of the characteristic voice signal within an optimal amplitude range, which is set according to the specific application scenario and the amplitude distribution of the voice signal currently being processed;

a voice reconstruction module, configured to reconstruct the adjusted characteristic voice signal together with the other voice signals contained in the voice signal to be processed, to obtain the processed voice signal.

In this embodiment, to further ensure voice quality and to prevent the amplitude adjustment from distorting the characteristic voice signal, the device may also comprise a judging module, as shown in Fig. 3. After the voice processing module adjusts the amplitude of the characteristic voice signal and before the voice reconstruction module performs reconstruction based on the adjusted characteristic voice signal, the judging module judges whether the consistency between the adjusted characteristic voice signal and the characteristic voice signal originally extracted by the voice extraction module meets a preset requirement; if not, it notifies the voice processing module to readjust the amplitude of the characteristic voice signal.

In this embodiment, to further ensure and improve the fullness of the voice signal, the device may also comprise a voice expansion module, as shown in Fig. 4. After the voice reconstruction module reconstructs the adjusted characteristic voice signal and the other voice signals contained in the voice signal to be processed to obtain the processed voice signal, the voice expansion module performs expansion processing on the processed voice signal according to the amplitude of the characteristic voice signal as adjusted by the voice processing module. For example, the frequency range of the original voice signal is 200 Hz-3400 Hz; after expansion processing is performed on the processed voice signal according to the amplitude of the adjusted characteristic voice signal, the resulting frequency range may be 50 Hz-5000 Hz, which improves the fullness of the voice signal.
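The module structure of Figs. 2 to 4 can be pictured as a small composition of pluggable parts, with the judging and expansion modules optional. This is only an architectural sketch: the class name, the callable signatures, the `attempt` argument passed to the processing module, and the `max_rounds` bound are all hypothetical, and the patent does not describe the internals of any module.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Signal = List[float]


@dataclass
class VoiceQualityDevice:
    extraction: Callable[[Signal], Tuple[Signal, Signal]]  # -> (feature, rest)
    processing: Callable[[Signal, int], Signal]             # (feature, attempt)
    reconstruction: Callable[[Signal, Signal], Signal]      # (adjusted, rest)
    judging: Optional[Callable[[Signal, Signal], bool]] = None      # Fig. 3
    expansion: Optional[Callable[[Signal, Signal], Signal]] = None  # Fig. 4
    max_rounds: int = 3  # hypothetical cap on readjustment attempts

    def run(self, signal: Signal) -> Signal:
        feature, rest = self.extraction(signal)
        adjusted = self.processing(feature, 0)
        if self.judging is not None:
            attempt = 1
            # The judging module asks for readjustment until satisfied.
            while attempt < self.max_rounds and not self.judging(feature, adjusted):
                adjusted = self.processing(feature, attempt)
                attempt += 1
        output = self.reconstruction(adjusted, rest)
        if self.expansion is not None:
            # Expansion uses the amplitude-adjusted feature as its reference.
            output = self.expansion(output, adjusted)
        return output
```

Leaving `judging` and `expansion` unset gives the basic device of Fig. 2; supplying them adds the behaviour of Fig. 3 and Fig. 4 respectively.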

Embodiment three:

For a better understanding of the present invention, it is described below using a specific application scenario as an example.

This embodiment takes a mobile phone as an example, in which the phone's PCM data module obtains the phone's downlink voice signal in PCM data format from the phone's standard PCM interface, and this signal serves as the voice signal to be processed. In this embodiment, the extracted characteristic voice signals are the voice unvoiced signal and the voice fundamental signal. It should be noted that when both are extracted, the amplitudes of the voice unvoiced signal and the voice fundamental signal may be adjusted at the same time; alternatively, the voice fundamental signal amplitude may be adjusted first and the voice unvoiced signal amplitude afterwards, or the voice unvoiced signal amplitude first and the voice fundamental signal amplitude afterwards. During voice reconstruction, the adjusted voice fundamental signal and voice unvoiced signal may first be synthesized, and the result then reconstructed together with the other voice signals in the original voice signal.

As shown in Fig. 5, the processing procedure comprises:

Step 501: obtain a voice signal in PCM data format as the voice signal to be processed;

Step 502: obtain the spectral features of the voice signal to be processed;

Step 503: extract the voice fundamental signal and the voice unvoiced signal from the voice signal spectrum obtained in step 502;

Step 504: adjust and control the amplitude of the extracted voice fundamental signal according to the set rule; the specific adjustment value can be determined empirically;

Step 505: judge whether the consistency between the adjusted voice fundamental signal and the originally extracted voice fundamental signal meets the requirement; if so, go to step 508; otherwise, go to step 504;

Step 506: adjust and control the amplitude of the extracted voice unvoiced signal according to the set rule; the specific adjustment value can likewise be determined empirically;

Step 507: judge whether the consistency between the adjusted voice unvoiced signal and the originally extracted voice unvoiced signal meets the requirement; if so, go to step 508; otherwise, go to step 506;

Step 508: when the consistency of both the adjusted voice fundamental signal and the adjusted voice unvoiced signal meets the requirement, synthesize the adjusted voice fundamental signal and the adjusted voice unvoiced signal;

Step 509: perform reconstruction and expansion processing based on the synthesized voice fundamental and voice unvoiced signals and on the other voice signals in the original voice signal, that is, those other than the voice fundamental signal and the voice unvoiced signal;

Step 510: output the resulting voice signal in PCM data format.
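The control flow of steps 501 to 510 can be sketched as two independent adjust-and-check branches that join at step 508. As before, the patent leaves the signal processing itself unspecified, so the extraction, adjustment, and consistency functions are caller-supplied; the function names, the `attempt` argument, the `max_rounds` bound, and the use of sample-wise addition for synthesis and reconstruction are assumptions, and the expansion part of step 509 is omitted.

```python
from typing import Callable, List, Tuple

Signal = List[float]
Adjust = Callable[[Signal, int], Signal]       # (extracted signal, attempt number)
Consistent = Callable[[Signal, Signal], bool]  # (original, adjusted)


def adjust_until_consistent(
    original: Signal, adjust: Adjust, is_consistent: Consistent, max_rounds: int = 5
) -> Signal:
    """Steps 504/505 (or 506/507): adjust, check, and readjust on failure."""
    adjusted = adjust(original, 0)
    for attempt in range(1, max_rounds):
        if is_consistent(original, adjusted):
            break
        adjusted = adjust(original, attempt)
    return adjusted


def process_pcm(
    pcm: Signal,
    extract: Callable[[Signal], Tuple[Signal, Signal, Signal]],
    adjust_fundamental: Adjust,
    adjust_unvoiced: Adjust,
    is_consistent: Consistent,
) -> Signal:
    # Steps 502-503: obtain the fundamental part, the unvoiced part, and the rest.
    fundamental, unvoiced, rest = extract(pcm)
    # Steps 504-505 and 506-507: the two branches do not depend on each other.
    fundamental = adjust_until_consistent(fundamental, adjust_fundamental, is_consistent)
    unvoiced = adjust_until_consistent(unvoiced, adjust_unvoiced, is_consistent)
    # Step 508: synthesize the two adjusted signals (assumed: sample-wise sum).
    synthesized = [f + u for f, u in zip(fundamental, unvoiced)]
    # Step 509: reconstruct with the remaining signal (expansion omitted here).
    return [s + r for s, r in zip(synthesized, rest)]
```

Because the two branches are independent, they can run in either order or in parallel, which matches the note in this embodiment that the two amplitudes may be adjusted simultaneously or one after the other.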

As can be seen, the present invention extracts a characteristic voice signal from the voice signal to be processed and adjusts its amplitude according to a preset rule so that it falls within a preset amplitude range; it then reconstructs the adjusted signal together with the other voice signals in the original voice signal to be processed, and may further apply expansion processing, thereby obtaining a voice signal of better quality.

The above is a further detailed description of the present invention in connection with specific embodiments, and the specific implementation of the present invention shall not be regarded as limited to these descriptions. A person of ordinary skill in the art to which the present invention belongs may make simple deductions or substitutions without departing from the concept of the present invention, and all of these shall be regarded as falling within the protection scope of the present invention.

Claims

1. A method for improving voice quality, characterized by comprising:

extracting a characteristic voice signal from a voice signal to be processed;

2. The method for improving voice quality according to claim 1, characterized in that the characteristic voice signal comprises a voice fundamental signal and/or a voice unvoiced signal.

3. The method for improving voice quality according to claim 2, characterized in that, when the characteristic voice signal comprises a voice fundamental signal, the preset rule comprises:

4. The method for improving voice quality according to any one of claims 1-3, characterized in that, after the amplitude of the characteristic voice signal is adjusted and before reconstruction is performed based on the adjusted characteristic voice signal, the method further comprises: judging whether the consistency between the adjusted characteristic voice signal and the originally extracted characteristic voice signal meets a preset requirement; if not, readjusting the amplitude of the characteristic voice signal.

5. The method for improving voice quality according to any one of claims 1-3, characterized in that, after the adjusted characteristic voice signal and the other voice signals contained in the voice signal to be processed are reconstructed to obtain the processed voice signal, the method further comprises:

6. A device for improving voice quality, characterized by comprising:

7. The device for improving voice quality according to claim 6, characterized in that the characteristic voice signal comprises a voice fundamental signal and/or a voice unvoiced signal.

8. The device for improving voice quality according to claim 7, characterized in that, when the characteristic voice signal comprises a voice fundamental signal, the preset rule comprises:

9. The device for improving voice quality according to any one of claims 6-8, characterized in that the device further comprises a judging module, the judging module being configured to, after the voice processing module adjusts the amplitude of the characteristic voice signal and before the voice reconstruction module performs reconstruction based on the adjusted characteristic voice signal, judge whether the consistency between the adjusted characteristic voice signal and the characteristic voice signal originally extracted by the voice extraction module meets a preset requirement; and if not, notify the voice processing module to readjust the amplitude of the characteristic voice signal.

10. The device for improving voice quality according to any one of claims 6-8, characterized in that the device further comprises a voice expansion module, configured to, after the voice reconstruction module reconstructs the adjusted characteristic voice signal and the other voice signals contained in the voice signal to be processed to obtain the processed voice signal, perform expansion processing on the processed voice signal according to the amplitude of the characteristic voice signal as adjusted by the voice processing module.