CN103812462A - Loudness control method and device - Google Patents

Publication number
CN103812462A
Authority
CN
China
Prior art keywords
gain
voice signal
loudness
adjustment
signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201210460201.0A
Other languages
Chinese (zh)
Other versions
CN103812462B (en)
Inventor
王田
吴文海
张德军
王凤玲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN201210460201.0A
Publication of CN103812462A
Application granted
Publication of CN103812462B
Legal status: Active

Abstract

An embodiment of the invention provides a loudness control method and device. The method comprises: performing voice detection on at least one channel of sound signal; for each channel, determining a voice loudness gain of the sound signal according to the detected voice signal, and determining a maximum tolerable gain of the mute signal according to the detected mute signal; generating an adjustment gain according to the voice loudness gain and the maximum tolerable gain; and performing gain adjustment on the sound signal according to the adjustment gain. The method and device enable loudness control of audio in noisy application scenarios or during real-time communication, and improve the effect of loudness control.

Description

Loudness control method and device
Technical field
Embodiments of the present invention relate to audio signal processing technology, and in particular to a loudness control method and device.
Background
Loudness is a measure of the energy of the sound produced when an electrical signal is converted into acoustic vibration; it is the human auditory system's subjective perception of sound intensity. Loudness control (Loudness Control) amplifies or attenuates a signal, mainly according to the loudness of the signal and the human ear's differing sensitivity to different frequency bands, so that playback stays at the same perceived level, that is, the same loudness.
Prior-art loudness control is mainly used in audio players, where the quality of the audio being played is usually relatively good. However, in noisy application scenarios or during real-time communication, when gain adjustment is performed with prior-art loudness control, the noise is adjusted by the same adjustment gain as the voice, and the result is poor.
Summary of the invention
Embodiments of the present invention provide a loudness control method and device that enable loudness control of audio in noisy application scenarios or during real-time communication, and improve the effect of loudness control.
In a first aspect, an embodiment of the present invention provides a loudness control method, comprising:
Performing voice detection separately on at least one channel of sound signal;
For each channel of sound signal, determining a voice loudness gain of the sound signal according to the detected voice signal, and determining a maximum tolerable gain of the mute signal according to the detected mute signal;
Generating an adjustment gain according to the voice loudness gain and the maximum tolerable gain;
Performing gain adjustment on the sound signal according to the adjustment gain.
In a first possible implementation, performing voice detection separately on the at least one channel of sound signal comprises:
For each channel of sound signal, calculating the root mean square of the sound signal;
Generating a signal envelope and a noise envelope according to the root mean square;
Calculating the ratio of the signal envelope to the noise envelope; if the ratio is greater than a first preset threshold, the voice signal is detected; otherwise, the mute signal is detected.
With reference to the first possible implementation of the first aspect, in a second possible implementation, determining the maximum tolerable gain of the mute signal according to the detected mute signal is specifically:
Calculating the level of the mute signal according to the root mean square, and determining the maximum tolerable gain according to the level of the mute signal.
With reference to the second possible implementation of the first aspect, in a third possible implementation, calculating the level of the mute signal according to the root mean square is specifically:
Calculating the level Noise_Level of the mute signal by applying the following formula:
Noise_Level=0.99×Noise_Level+0.01×Ecur;
Where Ecur is the root mean square.
In a fourth possible implementation, determining the voice loudness gain of the sound signal according to the detected voice signal is specifically:
Performing loudness filtering on the voice signal, collecting signal level statistics on the loudness-filtered voice signal, determining the level of the voice signal according to the statistics, and determining the voice loudness gain according to the level of the voice signal.
In a fifth possible implementation, generating the adjustment gain according to the voice loudness gain and the maximum tolerable gain is specifically:
If the absolute value of the voice loudness gain is greater than a second preset threshold, generating the adjustment gain Gain by applying the following formula:
Gain=LGain×(1.0-(LGain+NGain)/(LGain×2));
Where LGain is the voice loudness gain and NGain is the maximum tolerable gain.
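As a reading aid, the formula above can be expanded algebraically. The step below uses only the symbols defined in the text and assumes LGain is non-zero, which the threshold condition guarantees for any non-negative second preset threshold:

```latex
\begin{aligned}
\mathrm{Gain} &= \mathrm{LGain}\times\left(1.0-\frac{\mathrm{LGain}+\mathrm{NGain}}{\mathrm{LGain}\times 2}\right) \\
              &= \mathrm{LGain}-\frac{\mathrm{LGain}+\mathrm{NGain}}{2} \\
              &= \frac{\mathrm{LGain}-\mathrm{NGain}}{2}, \qquad \mathrm{LGain}\neq 0 .
\end{aligned}
```

For example, LGain=16 and NGain=10 give Gain=3 under this formula.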
In a sixth possible implementation, performing gain adjustment on the sound signal according to the adjustment gain is specifically:
Determining an adjustment duration according to the sound signal, determining an adjustment step according to the adjustment gain and the adjustment duration, and performing gain adjustment on the sound signal according to the adjustment gain and the adjustment step.
In a seventh possible implementation, if there are at least two channels of sound signal, then after generating the adjustment gain according to the voice loudness gain and the maximum tolerable gain, and before performing gain adjustment on the sound signal according to the adjustment gain, the method further comprises:
Calculating, for each channel, the difference between the voice loudness gain and the adjustment gain, determining the maximum difference among the channels, and adjusting the adjustment gain Gain of each channel by applying the following formula:
Gain=2×Gain-LDiffMax-LGain;
Where LDiffMax is the maximum difference and LGain is the voice loudness gain.
In a second aspect, an embodiment of the present invention provides a loudness control device, comprising:
A detecting unit, configured to perform voice detection separately on at least one channel of sound signal;
A first processing unit, connected to the detecting unit and configured to, for each channel of sound signal, determine a voice loudness gain of the sound signal according to the detected voice signal, and determine a maximum tolerable gain of the mute signal according to the detected mute signal;
A second processing unit, connected to the first processing unit and configured to generate an adjustment gain according to the voice loudness gain and the maximum tolerable gain;
A first adjustment unit, connected to the second processing unit and configured to perform gain adjustment on the sound signal according to the adjustment gain.
In a first possible implementation, the detecting unit comprises:
A first processing subunit, configured to calculate, for each channel of sound signal, the root mean square of the sound signal;
A second processing subunit, connected to the first processing subunit and configured to generate a signal envelope and a noise envelope according to the root mean square;
A judgment subunit, connected to the second processing subunit and configured to calculate the ratio of the signal envelope to the noise envelope; if the ratio is greater than a first preset threshold, the voice signal is detected; otherwise, the mute signal is detected.
With reference to the first possible implementation of the second aspect, in a second possible implementation, the first processing unit is specifically configured to calculate the level of the mute signal according to the root mean square, and to determine the maximum tolerable gain according to the level of the mute signal.
With reference to the second possible implementation of the second aspect, in a third possible implementation, the first processing unit is specifically configured to calculate the level Noise_Level of the mute signal by applying the following formula:
Noise_Level=0.99×Noise_Level+0.01×Ecur;
Where Ecur is the root mean square.
In a fourth possible implementation, the first processing unit is specifically configured to perform loudness filtering on the voice signal, collect signal level statistics on the loudness-filtered voice signal, determine the level of the voice signal according to the statistics, and determine the voice loudness gain according to the level of the voice signal.
In a fifth possible implementation, the second processing unit is specifically configured to, if the absolute value of the voice loudness gain is greater than a second preset threshold, generate the adjustment gain Gain by applying the following formula:
Gain=LGain×(1.0-(LGain+NGain)/(LGain×2));
Where LGain is the voice loudness gain and NGain is the maximum tolerable gain.
In a sixth possible implementation, the first adjustment unit is specifically configured to determine an adjustment duration according to the sound signal, determine an adjustment step according to the adjustment gain and the adjustment duration, and perform gain adjustment on the sound signal according to the adjustment gain and the adjustment step.
In a seventh possible implementation, if there are at least two channels of sound signal, the loudness control device further comprises:
A second adjustment unit, connected to the first adjustment unit and configured to calculate, for each channel, the difference between the voice loudness gain and the adjustment gain, determine the maximum difference among the channels, and adjust the adjustment gain Gain of each channel by applying the following formula:
Gain=2×Gain-LDiffMax-LGain;
Where LDiffMax is the maximum difference and LGain is the voice loudness gain.
As can be seen from the above technical solutions, in the loudness control method and device provided by the embodiments of the present invention, the loudness control device performs voice detection separately on at least one channel of sound signal; for each channel, it determines the voice loudness gain of the sound signal according to the detected voice signal, determines the maximum tolerable gain of the mute signal according to the detected mute signal, generates an adjustment gain according to the voice loudness gain and the maximum tolerable gain, and performs gain adjustment on the sound signal according to the adjustment gain. Because the maximum tolerable gain is determined from the detected mute signal, and the adjustment gain is generated from both the voice loudness gain and the maximum tolerable gain, the sound signal adjusted by this gain better matches the perception level of the human ear, which greatly improves the effect of loudness control.
Brief description of the drawings
To describe the technical solutions of the embodiments of the present invention or of the prior art more clearly, the accompanying drawings needed for describing the embodiments or the prior art are briefly introduced below. Apparently, the drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flow chart of the first loudness control method provided by an embodiment of the present invention;
Fig. 2 is a flow chart of the second loudness control method provided by an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of the first loudness control device provided by an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of the second loudness control device provided by an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of the third loudness control device provided by an embodiment of the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are some, rather than all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Fig. 1 is a flow chart of the first loudness control method provided by an embodiment of the present invention. As shown in Fig. 1, the loudness control method provided by this embodiment can be applied to, but is not limited to, real-time control of the loudness of sound signals in a virtual conference scenario. The method can be performed by a loudness control device, which can be integrated in an audio processing device or provided separately. The audio processing device can be used in a conference system.
The loudness control method provided by this embodiment specifically comprises:
Step 10: perform voice detection separately on at least one channel of sound signal;
Step 20: for each channel of sound signal, determine the voice loudness gain of the sound signal according to the detected voice signal, and determine the maximum tolerable gain of the mute signal according to the detected mute signal;
Step 30: generate an adjustment gain according to the voice loudness gain and the maximum tolerable gain;
Step 40: perform gain adjustment on the sound signal according to the adjustment gain.
Specifically, the at least one channel of sound signal is the signal that requires loudness control; in a virtual conference scenario there are generally multiple channels. The audio processing device receives a mono code stream from each terminal, decodes each mono code stream to obtain one channel of sound signal, and then performs loudness control on each channel separately.
A received sound signal may be either a voice signal or a mute signal: when the user is speaking, the sound signal is a voice signal; when the user is not speaking, it is a mute signal. Voice detection is performed on the sound signal to determine whether it is a voice signal or a mute signal. The detection can be implemented with a voice activity detection (Voice Activity Detection, VAD for short) method.
When a voice signal is detected, the voice loudness gain of the sound signal is determined. When a mute signal is detected, the information it carries is generally noise, and the maximum tolerable gain of the mute signal is determined. The voice loudness gain and the maximum tolerable gain can be calculated according to the Replay Gain standard. An adjustment gain is generated from the voice loudness gain and the maximum tolerable gain, and the gain of that channel of sound signal is adjusted by the adjustment gain to achieve loudness control.
In the loudness control method provided by this embodiment, the loudness control device performs voice detection separately on at least one channel of sound signal; for each channel, it determines the voice loudness gain of the sound signal according to the detected voice signal, determines the maximum tolerable gain of the mute signal according to the detected mute signal, generates an adjustment gain according to the voice loudness gain and the maximum tolerable gain, and performs gain adjustment on the sound signal according to the adjustment gain. Because the maximum tolerable gain is determined from the detected mute signal, and the adjustment gain is generated from both the voice loudness gain and the maximum tolerable gain, the sound signal adjusted by this gain better matches the perception level of the human ear, which greatly improves the effect of loudness control.
Fig. 2 is a flow chart of the second loudness control method provided by an embodiment of the present invention. As shown in Fig. 2, in this embodiment, step 10, performing voice detection separately on at least one channel of sound signal, can specifically comprise the following steps:
Step 101: for each channel of sound signal, calculate the root mean square of the sound signal;
Step 102: generate a signal envelope and a noise envelope according to the root mean square;
Step 103: calculate the ratio of the signal envelope to the noise envelope; if the ratio is greater than the first preset threshold, the voice signal is detected; otherwise, the mute signal is detected.
Specifically, the sound signal is a digital signal, and its energy can be determined by calculating its root mean square. The sound signal comprises a sequence of audio frames, each audio frame comprises multiple sampling points, and detecting the sound signal means detecting its audio frames. For example, if the audio frame is s(n), n=0, 1, ..., N-1, where N is the number of sampling points, and the frame length of s(n) is 20 ms, the root mean square Ecur of s(n) can be calculated by the following formula:
$$\mathrm{Ecur} = 20 \times \log_{10} \sqrt{\sum_{n=0}^{N-1} s(n) \times s(n) / N};$$
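The frame-level calculation can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `frame_rms_db` and the `floor` parameter (used to avoid taking the logarithm of zero on an all-zero frame) are not part of the source, and the square root reflects the reading of Ecur as a root mean square expressed in dB.

```python
import math


def frame_rms_db(frame, floor=1e-12):
    """Return 20*log10 of the root mean square of one audio frame.

    `frame` is a sequence of samples s(0)..s(N-1). `floor` is an
    assumption of this sketch: it keeps log10 defined for digital silence.
    """
    if not frame:
        raise ValueError("frame must contain at least one sample")
    mean_square = sum(x * x for x in frame) / len(frame)
    rms = math.sqrt(mean_square)
    return 20.0 * math.log10(max(rms, floor))
```

A 20 ms frame at a given sample rate would simply be passed in as a list of its samples.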
The signal envelope Senv can be generated from the root mean square Ecur as follows:
If Ecur is greater than thread1, Senv=0.9×Senv+0.1×Ecur;
If Ecur is not greater than thread1, Senv=0.998×Senv+0.002×Ecur;
The noise envelope Sno can be generated from the root mean square Ecur as follows:
If Ecur is greater than thread2, Sno=0.998×Sno+0.002×Ecur;
If Ecur is not greater than thread2, Sno=0.9×Sno+0.1×Ecur;
The initial values of thread1 and thread2 can be set empirically, with thread1>=thread2; during processing, thread1 and thread2 are then adjusted according to Senv and Sno. For example, thread2=(Senv+Sno)×0.5, and thread1 can be set equal to thread2 or slightly larger than thread2.
The ratio Senv/Sno of the signal envelope to the noise envelope is calculated and compared with thread3. If Senv/Sno is greater than thread3, the sound signal is a voice signal; otherwise, the sound signal is a noise (mute) signal.
It should be noted that the parameters in the above formulas can all be set and adjusted according to actual processing conditions.
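The envelope tracking and threshold decision described above can be sketched as a small Python class. The class name `EnvelopeVad`, the initial envelope values and the default `thread3=1.2` are hypothetical choices made for illustration; the smoothing coefficients and the thread1/thread2 update come from the text. The sketch assumes positive level values so that the ratio Senv/Sno is meaningful.

```python
class EnvelopeVad:
    """Per-frame voice/mute decision from signal and noise envelopes."""

    def __init__(self, senv=20.0, sno=20.0, thread3=1.2):
        # Initial envelopes and thread3 are illustrative assumptions.
        self.senv = senv
        self.sno = sno
        self.thread2 = (senv + sno) * 0.5
        self.thread1 = self.thread2  # text allows equal or slightly larger
        self.thread3 = thread3

    def update(self, ecur):
        """Feed one frame level Ecur; return True for voice, False for mute."""
        # Signal envelope: rises quickly on loud frames, decays slowly.
        if ecur > self.thread1:
            self.senv = 0.9 * self.senv + 0.1 * ecur
        else:
            self.senv = 0.998 * self.senv + 0.002 * ecur
        # Noise envelope: rises slowly on loud frames, follows quiet ones quickly.
        if ecur > self.thread2:
            self.sno = 0.998 * self.sno + 0.002 * ecur
        else:
            self.sno = 0.9 * self.sno + 0.1 * ecur
        # Re-derive the thresholds from the envelopes, as in the example.
        self.thread2 = (self.senv + self.sno) * 0.5
        self.thread1 = self.thread2
        if self.sno <= 0.0:
            return False  # ratio is not meaningful in this sketch
        return self.senv / self.sno > self.thread3
```

Because the decision compares the two envelopes rather than the current frame, it reacts to a level change over several frames instead of instantly.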
In this embodiment, in step 20, determining the maximum tolerable gain of the mute signal according to the detected mute signal can specifically be:
Calculating the level of the mute signal according to the root mean square, and determining the maximum tolerable gain according to the level of the mute signal.
Specifically, the maximum noise level that the human ear can tolerate is determined first, for example 30 dB. The maximum tolerable gain is then determined from the difference between this maximum noise level and the level of the mute signal, so that the mute signal is adjusted within the range the human ear can tolerate.
In this embodiment, calculating the level of the mute signal according to the root mean square can specifically be:
Calculating the level Noise_Level of the mute signal by applying the following formula:
Noise_Level=0.99×Noise_Level+0.01×Ecur;
Where Ecur is the root mean square.
Assuming the maximum noise level that the human ear can tolerate is NoiseThread, the maximum tolerable gain is NGain=NoiseThread-Noise_Level.
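The two formulas above translate directly into code. The following Python sketch uses hypothetical function names; the coefficients 0.99 and 0.01 and the relation NGain=NoiseThread-Noise_Level are taken from the text, and the 30 dB value in the usage note is the example figure mentioned earlier.

```python
def update_noise_level(noise_level, ecur):
    """Smooth the mute-signal level with the current frame level Ecur."""
    return 0.99 * noise_level + 0.01 * ecur


def max_tolerable_gain(noise_thread, noise_level):
    """NGain: headroom between the tolerable noise level and the measured one."""
    return noise_thread - noise_level
```

For instance, starting from a noise level of 20 and a mute frame with Ecur=30, the smoothed level moves to about 20.1, and with NoiseThread=30 the maximum tolerable gain is about 9.9.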
In this embodiment, in step 20, determining the voice loudness gain of the sound signal according to the detected voice signal can specifically be:
Performing loudness filtering on the voice signal, collecting signal level statistics on the loudness-filtered voice signal, determining the level of the voice signal according to the statistics, and determining the voice loudness gain according to the level of the voice signal.
Specifically, a loudness filter can be used to perform the loudness filtering on the voice signal. The loudness filter can be approximated by cascading a 10th-order high-pass IIR yulewalk filter with a 2nd-order Butterworth high-pass filter with a frequency of 150 Hz. The parameters of the IIR yulewalk filter and the Butterworth high-pass filter can be set according to actual processing needs and are not limited to this embodiment.
Collecting signal level statistics on the loudness-filtered voice signal can specifically be: calculating the root mean square Level of each audio frame in the voice signal. If the audio frame is s(n), n=0, 1, ..., N-1, where N is the number of sampling points, the frame length of s(n) is 20 ms and the sample rate is 16 kHz, the root mean square Level can be calculated by the following formula:
$$\mathrm{Level} = 20 \times \log_{10} \sqrt{\sum_{n=0}^{N-1} \left( s(n) \times s(n) / N / 2^{15} \right)};$$
Level statistics are collected over the root mean square values of multiple audio frames. If the levels of most audio frames are distributed around a certain level, the level Level1 of the voice signal can be determined from that level. The voice loudness gain LGain can then be calculated by the following formula:
LGain=Lref-Level1; where Lref can be an empirical value, for example -14 dB.
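One possible way to pick the level around which most frames cluster is a coarse histogram. The Python sketch below is an assumption, not the method prescribed by the text: the function names, the 1 dB bin width and the use of the most populated bin are all illustrative. Only LGain=Lref-Level1 and the example value Lref=-14 dB come from the source.

```python
from collections import Counter


def dominant_level(frame_levels_db, bin_width=1.0):
    """Return the centre of the most populated level bin (Level1 candidate)."""
    if not frame_levels_db:
        raise ValueError("at least one frame level is required")
    # Quantise each frame level to the nearest bin and count occupancy.
    bins = Counter(round(level / bin_width) for level in frame_levels_db)
    best_bin, _count = bins.most_common(1)[0]
    return best_bin * bin_width


def voice_loudness_gain(level1, lref=-14.0):
    """LGain = Lref - Level1, with Lref defaulting to the example value."""
    return lref - level1
```

With frame levels of -30.2, -29.8, -30.1, -20.0 and -41.0, the dominant level is -30 and the resulting voice loudness gain is 16.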
In this embodiment, step 30, generating the adjustment gain according to the voice loudness gain and the maximum tolerable gain, can specifically be:
If the absolute value of the voice loudness gain is greater than the second preset threshold, generating the adjustment gain Gain by applying the following formula:
Gain=LGain×(1.0-(LGain+NGain)/(LGain×2));
Where LGain is the voice loudness gain and NGain is the maximum tolerable gain.
Specifically, the absolute value of the voice loudness gain is first compared with the second preset threshold, which can be an empirical value, for example 3 dB. When the absolute value of the voice loudness gain is greater than the second preset threshold, the adjustment gain is generated by the above formula. The parameters in the formula can also be set according to actual needs; this embodiment only provides one preferred implementation, and the present invention is not limited to it.
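A minimal Python sketch of this step follows. The function name is hypothetical, and returning `None` when the threshold is not exceeded is an assumption of the sketch, since the text does not say what happens in that case; the formula and the 3 dB example threshold are from the source.

```python
def adjustment_gain(lgain, ngain, threshold=3.0):
    """Combine voice loudness gain and maximum tolerable gain.

    Returns None when |LGain| does not exceed the threshold; the source
    does not specify this case, so the caller must decide what to do.
    """
    if abs(lgain) > threshold:
        return lgain * (1.0 - (lgain + ngain) / (lgain * 2))
    return None
```

For example, `adjustment_gain(16.0, 10.0)` evaluates to 3.0, while `adjustment_gain(2.0, 10.0)` returns `None` because the voice loudness gain is within the 3 dB threshold.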
In this embodiment, step 40, performing gain adjustment on the sound signal according to the adjustment gain, is specifically:
Determining an adjustment duration according to the sound signal, determining an adjustment step according to the adjustment gain and the adjustment duration, and performing gain adjustment on the sound signal according to the adjustment gain and the adjustment step.
Gain adjustment of the sound signal can use an automatic gain control (Automatic Gain Control, AGC for short) method. The specific process is:
The adjustment step is calculated according to the signal characteristics of the sound signal. First, the adjustment duration decay can be calculated according to the signal type of the sound signal, for example: decay=Ratio×MaxFrameNum×FrameLen+FrameLen, where Ratio is the similarity of the sound signal to the speaking state, MaxFrameNum is the maximum number of frames, and FrameLen is the length of each frame.
The adjustment step delt of each sampling point is calculated as delt=(curGain-m_oldGain)/decay, where curGain is the gain of the current sampling point and m_oldGain is the gain of the previous sampling point; for the first sampling point, the gain can be set to 1.
Gain adjustment is performed on each audio frame of the sound signal, S'(n)=S(n)×(m_oldGain+delt), and m_oldGain is updated. The adjustment process can specifically be:
The gain of the previous sampling point is m_oldGain, curGain is the gain of the current sampling point after several frames, and decay is the duration of those frames.
In the first step, the formula delt=(curGain-m_oldGain)/decay is applied to calculate the per-sample update step delt.
In the second step, each sampling point is updated:
m_curGain=m_oldGain+delt;
S'(n)=S(n)×m_curGain;
m_oldGain=m_curGain;
Each iteration adds one step to the gain of the current sampling point, and the iteration continues until the current frame has been processed. When the next frame arrives, the above process is repeated according to the characteristics of the new signal.
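The per-sample ramp can be sketched in Python as below. The function names are hypothetical, and the sketch assumes that FrameLen and decay are expressed in samples and that the gains are linear multipliers (consistent with the initial gain of 1 mentioned above); the formulas themselves are taken from the text.

```python
def adjustment_duration(ratio, max_frame_num, frame_len):
    """decay = Ratio x MaxFrameNum x FrameLen + FrameLen (assumed in samples)."""
    return ratio * max_frame_num * frame_len + frame_len


def ramp_frame(frame, old_gain, cur_gain, decay):
    """Apply a linearly stepped gain to one frame.

    Returns the adjusted samples and the gain reached at the end of the
    frame, which becomes m_oldGain for the next frame.
    """
    delt = (cur_gain - old_gain) / decay  # per-sample step
    adjusted = []
    gain = old_gain
    for sample in frame:
        gain = gain + delt             # m_curGain = m_oldGain + delt
        adjusted.append(sample * gain)  # S'(n) = S(n) x m_curGain
    return adjusted, gain
```

With `decay = adjustment_duration(0.0, 10, 4)` and a four-sample frame of ones, ramping from gain 1.0 to 2.0 yields 1.25, 1.5, 1.75 and 2.0, so the target gain is reached exactly at the end of the adjustment duration.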
In this embodiment, if there are at least two channels of sound signal, then after step 30, generating the adjustment gain according to the voice loudness gain and the maximum tolerable gain, and before step 40, performing gain adjustment on the sound signal according to the adjustment gain, the method can further comprise:
Step 50: calculate, for each channel, the difference between the voice loudness gain and the adjustment gain, determine the maximum difference among the channels, and adjust the adjustment gain Gain of each channel by applying the following formula:
Gain=2×Gain-LDiffMax-LGain;
Where LDiffMax is the maximum difference and LGain is the voice loudness gain.
Specifically, in a virtual conference scenario there are usually multiple channels of sound signal, for example M channels, and the adjustment gain of the m-th channel is Gain(m), m=1..M. The voice loudness gain of every channel is calculated against the same reference Lref; that is, if every channel were adjusted by LGain(m), the energy after adjustment would be Lref. When the noise levels of the channels differ, however, the Gain(m) finally calculated differ, and the energies of the channels after adjustment by Gain(m) differ. The adjustment gains are therefore corrected automatically in the following way to align the adjusted energy of every channel, so that the output levels are the same.
First, the difference of each channel relative to Lref is calculated: LDiff(m)=LGain(m)-Gain(m). The maximum is then selected, LDiffMax=Max(LDiff(m)), and the adjustment gain Gain of each channel is adjusted by the formula Gain=2×Gain-LDiffMax-LGain.
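The multi-channel correction can be sketched in Python as follows. The function name and the list-based interface are assumptions made for illustration; the computation of LDiff(m), LDiffMax and the corrected gain follows the formulas in the text without further interpretation.

```python
def align_channel_gains(lgains, gains):
    """Correct per-channel adjustment gains using the largest LDiff.

    lgains[m] is LGain(m) and gains[m] is Gain(m) for channel m.
    """
    if not gains or len(lgains) != len(gains):
        raise ValueError("lgains and gains must be non-empty and equal length")
    # LDiff(m) = LGain(m) - Gain(m)
    ldiffs = [lg - g for lg, g in zip(lgains, gains)]
    ldiff_max = max(ldiffs)
    # Gain = 2 x Gain - LDiffMax - LGain, applied to every channel
    return [2 * g - ldiff_max - lg for lg, g in zip(lgains, gains)]
```

For two channels with LGain values 16 and 10 and adjustment gains 3 and 4, the differences are 13 and 6, LDiffMax is 13, and the corrected gains are -23 and -15.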
In practical applications, azimuth information can also be added to each gain-adjusted channel of sound signal according to a predefined scene, to synthesize 3D audio.
Fig. 3 is a schematic structural diagram of the first loudness control device provided by an embodiment of the present invention. As shown in Fig. 3, the loudness control device provided by this embodiment can implement each step of the loudness control method provided by any embodiment of the present invention; the specific implementation process is not repeated here. The loudness control device can be integrated in an audio processing device or provided separately. The audio processing device can be used in a conference system.
The loudness control device provided by this embodiment specifically comprises a detecting unit 11, a first processing unit 12, a second processing unit 13 and a first adjustment unit 14. The detecting unit 11 is configured to perform voice detection separately on at least one channel of sound signal. The first processing unit 12 is connected to the detecting unit 11 and is configured to, for each channel of sound signal, determine the voice loudness gain of the sound signal according to the detected voice signal, and determine the maximum tolerable gain of the mute signal according to the detected mute signal. The second processing unit 13 is connected to the first processing unit 12 and is configured to generate an adjustment gain according to the voice loudness gain and the maximum tolerable gain. The first adjustment unit 14 is connected to the second processing unit 13 and is configured to perform gain adjustment on the sound signal according to the adjustment gain.
In the loudness control device provided by this embodiment, the detecting unit 11 performs voice detection separately on at least one channel of sound signal; the first processing unit 12, for each channel, determines the voice loudness gain of the sound signal according to the detected voice signal and determines the maximum tolerable gain of the mute signal according to the detected mute signal; the second processing unit 13 generates an adjustment gain according to the voice loudness gain and the maximum tolerable gain; and the first adjustment unit 14 performs gain adjustment on the sound signal according to the adjustment gain. Because the maximum tolerable gain is determined from the detected mute signal, and the adjustment gain is generated from both the voice loudness gain and the maximum tolerable gain, the sound signal adjusted by this gain better matches the perception level of the human ear, which greatly improves the effect of loudness control.
Fig. 4 is a schematic structural diagram of a second loudness control device provided by an embodiment of the present invention. As shown in Fig. 4, in this embodiment the detecting unit 11 may comprise a first processing subunit 21, a second processing subunit 22 and a judgment subunit 23. The first processing subunit 21 is configured to calculate, for each channel of sound signal, the root mean square of the sound signal. The second processing subunit 22 is connected to the first processing subunit 21 and is configured to generate a signal envelope and a noise envelope from the root mean square. The judgment subunit 23 is connected to the second processing subunit 22 and is configured to calculate the ratio of the signal envelope to the noise envelope; if the ratio is greater than a first preset threshold, the speech signal is detected; otherwise, the mute signal is detected.
Specifically, the sound signal is a digital signal, and its energy can be determined by calculating its root mean square. The sound signal comprises a sequence of audio frames, each audio frame comprises multiple sampling points, and detection of the sound signal is performed frame by frame. For example, let an audio frame be s(n), n = 0, 1, ..., N-1, where N is the number of sampling points. When the frame length of s(n) is 20 ms, the root mean square Ecur of s(n) can be calculated by the following formula.
$$\mathrm{Ecur} = 20 \times \log_{10} \sqrt{\frac{1}{N} \sum_{n=0}^{N-1} s(n) \times s(n)};$$
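The frame-level root-mean-square computation described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `frame_rms_db`, the small constant `eps` guarding against a logarithm of zero, and the 8 kHz sampling rate used in the example are all assumptions that do not appear in the source.

```python
import math


def frame_rms_db(frame, eps=1e-12):
    """Return Ecur, the root mean square of one audio frame expressed in dB."""
    n = len(frame)
    if n == 0:
        raise ValueError("frame must contain at least one sample")
    # Mean of s(n) * s(n) over the N samples of the frame.
    mean_square = sum(s * s for s in frame) / n
    # eps is an assumption (not from the source): it avoids log10(0)
    # when the frame is pure digital silence.
    return 20.0 * math.log10(math.sqrt(mean_square) + eps)


# A 20 ms frame at an assumed 8 kHz sampling rate holds N = 160 samples.
full_scale_frame = [1.0] * 160
tenth_scale_frame = [0.1] * 160
```

With these inputs, a frame of constant amplitude 1.0 yields a value close to 0 dB, and a frame of constant amplitude 0.1 yields a value close to -20 dB.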
The signal envelope Senv may be generated from the root mean square Ecur as follows:
If Ecur is greater than thread1, Senv=0.9 × Senv+0.1 × Ecur;
If Ecur is not greater than thread1, Senv=0.998 × Senv+0.002 × Ecur;
The noise envelope Sno may be generated from the root mean square Ecur as follows:
If Ecur is greater than thread2, Sno=0.998 × Sno+0.002 × Ecur;
If Ecur is not greater than thread2, Sno=0.9 × Sno+0.1 × Ecur;
The initial values of thread1 and thread2 may be set empirically, with thread1>=thread2; during processing, thread1 and thread2 are then adjusted according to Senv and Sno. For example, thread2=(Senv+Sno) × 0.5, and thread1 may be set equal to thread2 or slightly larger than thread2.
The ratio Senv/Sno of the signal envelope to the noise envelope is calculated, and it is judged whether Senv/Sno is greater than thread3, which serves as the first preset threshold. If so, the sound signal is a speech signal; otherwise, the sound signal is a noise signal, that is, the mute signal.
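The envelope tracking and the ratio test described above can be sketched as a small per-frame detector. This is a hedged illustration only: the class name `EnvelopeDetector`, the choice to set thread1 equal to thread2 after every frame, the guard against a zero noise envelope, and all numeric values in the example are assumptions, not values given in the source.

```python
from dataclasses import dataclass


@dataclass
class EnvelopeDetector:
    senv: float     # signal envelope Senv
    sno: float      # noise envelope Sno
    thread1: float  # switching threshold for the signal envelope
    thread2: float  # switching threshold for the noise envelope
    thread3: float  # decision threshold applied to Senv / Sno

    def update(self, ecur: float) -> bool:
        """Process one frame energy Ecur; return True if speech is detected."""
        # Signal envelope: follows loud frames quickly, quiet frames slowly.
        if ecur > self.thread1:
            self.senv = 0.9 * self.senv + 0.1 * ecur
        else:
            self.senv = 0.998 * self.senv + 0.002 * ecur
        # Noise envelope: follows quiet frames quickly, loud frames slowly.
        if ecur > self.thread2:
            self.sno = 0.998 * self.sno + 0.002 * ecur
        else:
            self.sno = 0.9 * self.sno + 0.1 * ecur
        # Threshold adaptation from the text; thread1 == thread2 is one
        # of the two options the text allows (assumed here for simplicity).
        self.thread2 = (self.senv + self.sno) * 0.5
        self.thread1 = self.thread2
        if self.sno == 0.0:
            # Assumption: treat an undefined ratio as "no speech".
            return False
        return self.senv / self.sno > self.thread3


# Hypothetical starting values; the source only says they are set empirically.
detector = EnvelopeDetector(senv=30.0, sno=30.0, thread1=30.0,
                            thread2=30.0, thread3=1.2)
quiet_decision = detector.update(30.0)
loud_decisions = [detector.update(70.0) for _ in range(20)]
```

In this example a frame at the background level is classified as mute, and a sustained run of much louder frames drives the signal envelope well above the noise envelope, so the later frames are classified as speech.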
It should be noted that the parameters in the above formulas can all be set and adjusted according to actual processing conditions.
In this embodiment, the first processing unit 12 may be configured to calculate the level of the mute signal from the root mean square, and to determine the maximum tolerable gain according to the level of the mute signal.
In this embodiment, the first processing unit 12 may be configured to calculate the level Noise_Level of the mute signal by applying the following formula:
Noise_Level=0.99×Noise_Level+0.01×Ecur;
where Ecur is the root mean square.
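The recursive smoothing of the mute-signal level can be sketched in a few lines. The function name `update_noise_level` and the parameter `alpha` are hypothetical, and the example assumes the update is run only on frames that the detector has classified as mute; the source does not state when the update is triggered.

```python
def update_noise_level(noise_level, ecur, alpha=0.01):
    """One smoothing step: Noise_Level = 0.99 * Noise_Level + 0.01 * Ecur."""
    return (1.0 - alpha) * noise_level + alpha * ecur


# Assumed example: the estimate starts at -60 and the mute frames sit at -40.
one_step = update_noise_level(-60.0, -40.0)

level = -60.0
for _ in range(2000):
    level = update_noise_level(level, -40.0)
```

A single step moves the estimate only 1% of the way toward the new frame energy, so the estimate changes slowly and converges to the background level over many mute frames.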
In this embodiment, the first processing unit 12 may be configured to perform loudness filtering on the sound signal, to collect signal level statistics on the loudness-filtered sound signal, to determine the level of the sound signal according to the statistics, and to determine the speech loudness gain according to the level of the sound signal.
Specifically, a loudness filter may be provided in the first processing unit 12, and the sound signal is loudness-filtered by this loudness filter. The loudness filter may be approximated by cascading a 10th-order high-pass IIR yulewalk filter with a 2nd-order Butterworth high-pass filter having a frequency of 150 Hz. The parameters of the high-pass IIR yulewalk filter and the Butterworth high-pass filter may be set according to actual processing needs and are not limited to those of this embodiment.
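As a partial illustration of the filter cascade described above, the following sketch builds only the 2nd-order Butterworth high-pass stage, using the standard bilinear-transform biquad design with Q = 1/sqrt(2). The 10th-order yulewalk stage is omitted because its coefficients are not given in the source. The function names and the 8 kHz sampling rate are assumptions, and 150 Hz is taken to be the cutoff frequency.

```python
import math


def butterworth_highpass_biquad(cutoff_hz, sample_rate_hz):
    """Return normalized (b, a) coefficients of a 2nd-order Butterworth high-pass."""
    q = 1.0 / math.sqrt(2.0)  # Q of a 2nd-order Butterworth section
    w0 = 2.0 * math.pi * cutoff_hz / sample_rate_hz
    cos_w0 = math.cos(w0)
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = [(1.0 + cos_w0) / (2.0 * a0),
         -(1.0 + cos_w0) / a0,
         (1.0 + cos_w0) / (2.0 * a0)]
    a = [1.0, -2.0 * cos_w0 / a0, (1.0 - alpha) / a0]
    return b, a


def biquad_filter(samples, b, a):
    """Filter a sequence with a direct-form I biquad (a[0] must be 1)."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out


# Assumed sampling rate of 8 kHz; 150 Hz is taken as the cutoff frequency.
b, a = butterworth_highpass_biquad(150.0, 8000.0)
dc_response = biquad_filter([1.0] * 2000, b, a)
```

A constant (DC) input decays to zero at the output, while components near the Nyquist frequency pass with unity gain, which is the expected behaviour of a high-pass stage.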
In this embodiment, the second processing unit 13 is specifically configured to generate the adjustment gain Gain by applying the following formula if the absolute value of the speech loudness gain is greater than a second preset threshold:
Gain=LGain×(1.0-(LGain+NGain)/(LGain×2));
where LGain is the speech loudness gain and NGain is the maximum tolerable gain.
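The adjustment-gain formula can be sketched as below. Expanding the product shows that, for a non-zero LGain, the expression is algebraically equal to (LGain - NGain) / 2. The function name `adjustment_gain` is hypothetical, and returning `None` when the threshold condition is not met is an assumption made here because this passage does not specify that case.

```python
def adjustment_gain(lgain, ngain, second_threshold):
    """Apply Gain = LGain * (1.0 - (LGain + NGain) / (LGain * 2))."""
    if abs(lgain) > second_threshold:
        # With a non-negative threshold, lgain cannot be zero here,
        # so the division is safe.
        return lgain * (1.0 - (lgain + ngain) / (lgain * 2.0))
    # Assumption: this passage does not say what happens otherwise,
    # so the sketch reports "formula not applied".
    return None


# Hypothetical values chosen only to exercise the formula.
example_gain = adjustment_gain(10.0, 4.0, second_threshold=1.0)
negative_gain = adjustment_gain(-6.0, 2.0, second_threshold=1.0)
```

For the first example, 10 × (1 - 14/20) gives 3, which matches (10 - 4) / 2; for the second, the result is -4, which matches (-6 - 2) / 2.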
In this embodiment, the first adjustment unit 14 is specifically configured to determine an adjustment duration according to the sound signal, to determine an adjustment step according to the adjustment gain and the adjustment duration, and to perform gain adjustment on the sound signal according to the adjustment gain and the adjustment step.
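One plausible reading of the stepwise adjustment described above is a linear ramp toward the adjustment gain. The sketch below assumes that gains are expressed in dB, that the adjustment duration is counted in frames, and that the step is simply the adjustment gain divided by the number of frames. None of these details are stated in the source, and both function names are hypothetical.

```python
def gain_ramp(adjust_gain_db, num_frames):
    """Return the per-frame gain (in dB) that reaches adjust_gain_db in equal steps."""
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    step = adjust_gain_db / num_frames  # adjustment step
    return [step * (i + 1) for i in range(num_frames)]


def apply_gain_db(frame, gain_db):
    """Scale the samples of one frame by a gain given in dB."""
    factor = 10.0 ** (gain_db / 20.0)
    return [s * factor for s in frame]


# Assumed example: reach +6 dB over an adjustment duration of 3 frames.
ramp = gain_ramp(6.0, 3)
boosted = apply_gain_db([1.0, -0.5], 20.0)
```

Spreading the change over several frames avoids an abrupt jump in level; the ramp in the example applies 2 dB, 4 dB and finally the full 6 dB.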
In this embodiment, further, if there are at least two channels of sound signal, the loudness control device may also comprise a second adjustment unit 15. The second adjustment unit 15 is connected to the first adjustment unit 14 and is configured to calculate, for each channel, the difference between the speech loudness gain and the adjustment gain, to determine the maximum of these differences over all channels, and to adjust the adjustment gain Gain of each channel by applying the following formula:
Gain=2×Gain-LDiffMax-LGain;
where LDiffMax is the maximum difference and LGain is the speech loudness gain.
In a virtual conference scenario there are usually multiple channels of sound signal, for example M channels, and the adjustment gain of the m-th channel is Gain(m), m=1..M. When the speech loudness gain of each channel is calculated, the same reference level Lref is used; that is, if each channel were adjusted by LGain(m), the energy after adjustment would be Lref. When the noise levels of the channels differ, the energy of each channel after adjustment by the finally calculated Gain(m) also differs. The adjusted energy of each channel is therefore aligned automatically in the following manner, so that the output levels are the same.
First, the difference of each channel's sound signal from Lref is calculated: LDiff(m)=LGain(m)-Gain(m). The maximum LDiffMax is then selected, that is, LDiffMax=Max(LDiff(m)), and the adjustment gain Gain of each channel is adjusted by the formula Gain=2 × Gain-LDiffMax-LGain.
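The per-channel steps above translate directly into code. The sketch below applies the formulas exactly as stated in the text, without asserting anything further about the resulting output levels. The function name `align_channel_gains` and the example values are hypothetical.

```python
def align_channel_gains(lgains, gains):
    """Apply Gain(m) = 2 * Gain(m) - LDiffMax - LGain(m) to every channel."""
    if len(lgains) != len(gains) or not gains:
        raise ValueError("need one LGain and one Gain per channel")
    # LDiff(m) = LGain(m) - Gain(m) for each channel m.
    ldiffs = [lgain - gain for lgain, gain in zip(lgains, gains)]
    # LDiffMax = Max(LDiff(m)) over all channels.
    ldiff_max = max(ldiffs)
    return [2.0 * gain - ldiff_max - lgain
            for lgain, gain in zip(lgains, gains)]


# Hypothetical two-channel example.
speech_loudness_gains = [10.0, 8.0]   # LGain(1), LGain(2)
adjustment_gains = [7.0, 7.0]         # Gain(1), Gain(2)
aligned = align_channel_gains(speech_loudness_gains, adjustment_gains)
```

In this example LDiff is [3, 1], so LDiffMax is 3 and the adjusted gains become 2×7-3-10 = 1 and 2×7-3-8 = 3.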
In practical applications, azimuth information may also be added to each gain-adjusted channel of sound signal according to a predefined scene, so as to synthesize 3D audio.
Fig. 5 is a schematic structural diagram of a third loudness control device provided by an embodiment of the present invention. As shown in Fig. 5, the loudness control device provided by this embodiment can implement each step of the loudness control method provided by any embodiment of the present invention; the specific implementation process is not repeated here. The loudness control device provided by this embodiment comprises a processor 31 and a memory 32, and the memory 32 is configured to store instructions. The processor 31 is coupled to the memory 32 and is configured to execute the instructions stored in the memory 32. The processor 31 is configured to perform speech detection on each of at least one channel of sound signal; for each channel of sound signal, to determine the speech loudness gain of the sound signal according to the detected speech signal and to determine the maximum tolerable gain of the mute signal according to the detected mute signal; to generate an adjustment gain according to the speech loudness gain and the maximum tolerable gain; and to perform gain adjustment on the sound signal according to the adjustment gain.
A person of ordinary skill in the art will understand that all or part of the steps of the above method embodiments may be completed by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes various media capable of storing program code, such as ROM, RAM, magnetic disks or optical discs.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (16)

1. A loudness control method, characterized by comprising:
performing speech detection on each of at least one channel of sound signal;
for each channel of sound signal, determining the speech loudness gain of the sound signal according to the detected speech signal, and determining the maximum tolerable gain of the mute signal according to the detected mute signal;
generating an adjustment gain according to the speech loudness gain and the maximum tolerable gain; and
performing gain adjustment on the sound signal according to the adjustment gain.
2. The loudness control method according to claim 1, characterized in that the performing of speech detection on each of at least one channel of sound signal comprises:
for each channel of sound signal, calculating the root mean square of the sound signal;
generating a signal envelope and a noise envelope from the root mean square; and
calculating the ratio of the signal envelope to the noise envelope, wherein if the ratio is greater than a first preset threshold, the speech signal is detected, and otherwise the mute signal is detected.
3. The loudness control method according to claim 2, characterized in that the determining of the maximum tolerable gain of the mute signal according to the detected mute signal is specifically:
calculating the level of the mute signal from the root mean square, and determining the maximum tolerable gain according to the level of the mute signal.
4. The loudness control method according to claim 3, characterized in that the calculating of the level of the mute signal from the root mean square is specifically:
calculating the level Noise_Level of the mute signal by applying the following formula:
Noise_Level=0.99×Noise_Level+0.01×Ecur;
where Ecur is the root mean square.
5. The loudness control method according to claim 1, characterized in that the determining of the speech loudness gain of the sound signal according to the detected speech signal is specifically:
performing loudness filtering on the sound signal, collecting signal level statistics on the loudness-filtered sound signal, determining the level of the sound signal according to the statistics, and determining the speech loudness gain according to the level of the sound signal.
6. The loudness control method according to claim 1, characterized in that the generating of an adjustment gain according to the speech loudness gain and the maximum tolerable gain is specifically:
if the absolute value of the speech loudness gain is greater than a second preset threshold, generating the adjustment gain Gain by applying the following formula:
Gain=LGain×(1.0-(LGain+NGain)/(LGain×2));
where LGain is the speech loudness gain and NGain is the maximum tolerable gain.
7. The loudness control method according to claim 1, characterized in that the performing of gain adjustment on the sound signal according to the adjustment gain is specifically:
determining an adjustment duration according to the sound signal, determining an adjustment step according to the adjustment gain and the adjustment duration, and performing gain adjustment on the sound signal according to the adjustment gain and the adjustment step.
8. The loudness control method according to claim 1, characterized in that, if there are at least two channels of sound signal, after the generating of an adjustment gain according to the speech loudness gain and the maximum tolerable gain and before the performing of gain adjustment on the sound signal according to the adjustment gain, the method further comprises:
calculating, for each channel, the difference between the speech loudness gain and the adjustment gain, determining the maximum of these differences over all channels, and adjusting the adjustment gain Gain of each channel by applying the following formula:
Gain=2×Gain-LDiffMax-LGain;
where LDiffMax is the maximum difference and LGain is the speech loudness gain.
9. A loudness control device, characterized by comprising:
a detecting unit, configured to perform speech detection on each of at least one channel of sound signal;
a first processing unit, connected to the detecting unit and configured, for each channel of sound signal, to determine the speech loudness gain of the sound signal according to the detected speech signal and to determine the maximum tolerable gain of the mute signal according to the detected mute signal;
a second processing unit, connected to the first processing unit and configured to generate an adjustment gain according to the speech loudness gain and the maximum tolerable gain; and
a first adjustment unit, connected to the second processing unit and configured to perform gain adjustment on the sound signal according to the adjustment gain.
10. The loudness control device according to claim 9, characterized in that the detecting unit comprises:
a first processing subunit, configured to calculate, for each channel of sound signal, the root mean square of the sound signal;
a second processing subunit, connected to the first processing subunit and configured to generate a signal envelope and a noise envelope from the root mean square; and
a judgment subunit, connected to the second processing subunit and configured to calculate the ratio of the signal envelope to the noise envelope, wherein if the ratio is greater than a first preset threshold, the speech signal is detected, and otherwise the mute signal is detected.
11. The loudness control device according to claim 10, characterized in that the first processing unit is specifically configured to calculate the level of the mute signal from the root mean square, and to determine the maximum tolerable gain according to the level of the mute signal.
12. The loudness control device according to claim 11, characterized in that the first processing unit is specifically configured to calculate the level Noise_Level of the mute signal by applying the following formula:
Noise_Level=0.99×Noise_Level+0.01×Ecur;
where Ecur is the root mean square.
13. The loudness control device according to claim 9, characterized in that the first processing unit is specifically configured to perform loudness filtering on the sound signal, to collect signal level statistics on the loudness-filtered sound signal, to determine the level of the sound signal according to the statistics, and to determine the speech loudness gain according to the level of the sound signal.
14. The loudness control device according to claim 9, characterized in that the second processing unit is specifically configured to generate the adjustment gain Gain by applying the following formula if the absolute value of the speech loudness gain is greater than a second preset threshold:
Gain=LGain×(1.0-(LGain+NGain)/(LGain×2));
where LGain is the speech loudness gain and NGain is the maximum tolerable gain.
15. The loudness control device according to claim 9, characterized in that the first adjustment unit is specifically configured to determine an adjustment duration according to the sound signal, to determine an adjustment step according to the adjustment gain and the adjustment duration, and to perform gain adjustment on the sound signal according to the adjustment gain and the adjustment step.
16. The loudness control device according to claim 9, characterized in that, if there are at least two channels of sound signal, the loudness control device further comprises:
a second adjustment unit, connected to the first adjustment unit and configured to calculate, for each channel, the difference between the speech loudness gain and the adjustment gain, to determine the maximum of these differences over all channels, and to adjust the adjustment gain Gain of each channel by applying the following formula:
Gain=2×Gain-LDiffMax-LGain;
where LDiffMax is the maximum difference and LGain is the speech loudness gain.
CN201210460201.0A 2012-11-15 2012-11-15 Volume control method and device Active CN103812462B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210460201.0A CN103812462B (en) 2012-11-15 2012-11-15 Volume control method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210460201.0A CN103812462B (en) 2012-11-15 2012-11-15 Volume control method and device

Publications (2)

Publication Number Publication Date
CN103812462A true CN103812462A (en) 2014-05-21
CN103812462B CN103812462B (en) 2016-12-07

Family

ID=50708755

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210460201.0A Active CN103812462B (en) 2012-11-15 2012-11-15 Volume control method and device

Country Status (1)

Country Link
CN (1) CN103812462B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105450193A (en) * 2014-08-28 2016-03-30 深圳Tcl新技术有限公司 Volume adjusting method and volume adjusting device
CN106992003A (en) * 2017-03-24 2017-07-28 深圳北斗卫星信息科技有限公司 Voice signal auto gain control method
CN107994879A (en) * 2017-12-04 2018-05-04 北京小米移动软件有限公司 Volume control method and device
CN108806710A (en) * 2018-06-15 2018-11-13 会听声学科技(北京)有限公司 A kind of speech enhancement gain method of adjustment, system and earphone
CN108882115A (en) * 2017-05-12 2018-11-23 华为技术有限公司 loudness adjusting method, device and terminal
CN116168719A (en) * 2022-12-26 2023-05-26 杭州爱听科技有限公司 Sound gain adjusting method and system based on context analysis

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060126856A1 (en) * 2004-12-10 2006-06-15 Quanta Computer Inc. Volume control method and audio device
US20090103751A1 (en) * 2007-10-22 2009-04-23 Stephen Gordon Lenk Sound volume leveler for speed sensitive volume
CN101783656A (en) * 2010-03-17 2010-07-21 北京爱德发科技有限公司 Loudness control method, module and device of stereo system
CN102436821A (en) * 2011-12-02 2012-05-02 海能达通信股份有限公司 Method for adaptively adjusting sound effect and equipment thereof

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105450193A (en) * 2014-08-28 2016-03-30 深圳Tcl新技术有限公司 Volume adjusting method and volume adjusting device
CN106992003A (en) * 2017-03-24 2017-07-28 深圳北斗卫星信息科技有限公司 Voice signal auto gain control method
CN108882115A (en) * 2017-05-12 2018-11-23 华为技术有限公司 loudness adjusting method, device and terminal
CN108882115B (en) * 2017-05-12 2020-08-25 华为技术有限公司 Loudness adjustment method and device and terminal
CN107994879A (en) * 2017-12-04 2018-05-04 北京小米移动软件有限公司 Volume control method and device
CN107994879B (en) * 2017-12-04 2022-07-08 北京小米移动软件有限公司 Loudness control method and device
CN108806710A (en) * 2018-06-15 2018-11-13 会听声学科技(北京)有限公司 A kind of speech enhancement gain method of adjustment, system and earphone
CN116168719A (en) * 2022-12-26 2023-05-26 杭州爱听科技有限公司 Sound gain adjusting method and system based on context analysis

Also Published As

Publication number Publication date
CN103812462B (en) 2016-12-07

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant