CN111243631A - Automatic gain control method and electronic equipment - Google Patents

Automatic gain control method and electronic equipment

Info

Publication number
CN111243631A
Authority
CN
China
Prior art keywords
value
target frame
gain value
voice
voice signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010037394.3A
Other languages
Chinese (zh)
Other versions
CN111243631B (en)
Inventor
郝斌
冯大航
常乐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing SoundAI Technology Co Ltd
Original Assignee
Beijing SoundAI Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing SoundAI Technology Co Ltd
Priority to CN202010037394.3A
Publication of CN111243631A
Application granted
Publication of CN111243631B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B20/00 Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B20/10 Digital recording or reproducing
    • G11B20/10527 Audio or video recording; Data buffering arrangements
    • G11B2020/10537 Audio or video recording
    • G11B2020/10546 Audio or video recording specifically adapted for audio data
    • G11B2020/10555 Audio or video recording specifically adapted for audio data wherein the frequency, the amplitude, or other characteristics of the audio signal is taken into account
    • G11B2020/10574 Audio or video recording specifically adapted for audio data wherein the frequency, the amplitude, or other characteristics of the audio signal is taken into account volume or amplitude

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Telephone Function (AREA)
  • Control Of Amplification And Gain Control (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

The invention provides an automatic gain control method and an electronic device, and relates to the field of speech technology. The method comprises: acquiring an envelope value of a target frame speech signal in a received speech signal; acquiring an envelope value of at least one frame of speech signal before the target frame speech signal; and adjusting the gain value of the target frame speech signal based on the envelope value of the target frame speech signal and the envelope value of the at least one frame of speech signal before the target frame speech signal. Embodiments of the invention can improve the recording effect.

Description

Automatic gain control method and electronic equipment
Technical Field
The present invention relates to the field of speech technologies, and in particular, to an automatic gain control method and an electronic device.
Background
With the spread of electronic devices, their functions have become increasingly complete, and they have become an almost indispensable communication tool in daily life and work. People often need to use the recording function of an electronic device.
During recording, when the sound source is far from the electronic device, the signal picked up by the device's microphone is weak; when the sound source is close, the signal is strong. After processing such as encoding and enhancement, the speech amplitude can therefore vary widely, which leads to a poor recording effect.
Disclosure of Invention
The embodiment of the invention provides an automatic gain control method and electronic equipment, and aims to solve the problem that in the prior art, the recording effect is poor due to large voice amplitude change.
In order to solve the technical problem, the invention is realized as follows:
in a first aspect, an embodiment of the present invention provides an automatic gain control method applied to an electronic device, where the method includes:
acquiring an envelope value of a target frame voice signal in a received voice signal;
acquiring an envelope value of at least one frame of voice signal before the target frame of voice signal;
and adjusting the gain value of the target frame voice signal based on the envelope value of the target frame voice signal and the envelope value of at least one frame voice signal before the target frame voice signal.
In a second aspect, an embodiment of the present invention provides an electronic device, where the electronic device includes:
the first acquisition module is used for acquiring an envelope value of a target frame voice signal in the received voice signal;
the second acquisition module is used for acquiring an envelope value of at least one frame of voice signal before the target frame of voice signal;
and the adjusting module is used for adjusting the gain value of the target frame voice signal based on the envelope value of the target frame voice signal and the envelope value of at least one frame of voice signal before the target frame voice signal.
In a third aspect, an embodiment of the present invention provides an electronic device, including: a memory, a processor and a program stored on the memory and executable on the processor, which program, when executed by the processor, performs the steps in the automatic gain control method according to the first aspect.
In a fourth aspect, the present invention provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps in the automatic gain control method according to the first aspect.
In the embodiment of the invention, the envelope value of a target frame voice signal in a received voice signal is obtained; acquiring an envelope value of at least one frame of voice signal before the target frame of voice signal; and adjusting the gain value of the target frame voice signal based on the envelope value of the target frame voice signal and the envelope value of at least one frame voice signal before the target frame voice signal. Therefore, the gain value can be adjusted in real time based on the received voice signal, and the recording effect is improved.
Drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a flowchart of an automatic gain control method according to an embodiment of the present invention;
FIG. 2 is a gain comparison diagram provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram illustrating comparison of speech amplitudes according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of another electronic device provided in an embodiment of the invention;
FIG. 6 is a schematic structural diagram of another electronic device provided in an embodiment of the invention;
FIG. 7 is a schematic structural diagram of another electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the embodiment of the present invention, the electronic device includes, but is not limited to, a mobile phone, a tablet computer, a notebook computer, a palm computer, a vehicle-mounted mobile terminal, a wearable device, a pedometer, and the like.
Referring to fig. 1, fig. 1 is a flowchart of an automatic gain control method according to an embodiment of the present invention, where the method is applied to an electronic device, and as shown in fig. 1, the method includes the following steps:
Step 101, obtaining an envelope value of a target frame voice signal in the received voice signal.
Wherein the received voice signal may be divided into a plurality of frames of voice signals. The target frame speech signal may be any one of the multi-frame speech signals. The target frame speech signal may include a plurality of speech samples, the envelope value of the target frame speech signal may be an average value of energy envelope values of the plurality of speech samples, and the energy envelope value of each of the plurality of speech samples may be obtained by calculating the energy value of the speech sample. For example, the energy value of the nth speech sample may be:
ef(n) = α1 * ef(n-1) + (1 - α1) * |x(n)|
where α1 is a coefficient, n is the index of the speech sample, ef(n-1) is the energy value of the (n-1)th speech sample, and x(n) is the amplitude value of the nth speech sample.
The energy envelope value of the nth speech sample may be:
ep(n) = α2 * ep(n-1) + (1 - α2) * ef(n)
where α2 is a coefficient, ep(n-1) is the energy envelope value of the (n-1)th speech sample, and ef(n) is the energy value of the nth speech sample.
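The two recursions above, followed by the per-frame averaging described in step 101, can be sketched in Python as follows. This is only an illustrative sketch: the function name `frame_envelope`, the default coefficients of 0.9, the zero initial state, and the choice to carry ef and ep from one frame into the next are assumptions, not values given in the source.

```python
def frame_envelope(samples, alpha1=0.9, alpha2=0.9, ef_prev=0.0, ep_prev=0.0):
    """Return (frame envelope, last ef, last ep) for one frame of samples.

    alpha1/alpha2 defaults and the zero initial state are hypothetical.
    """
    if not samples:
        raise ValueError("a frame must contain at least one sample")
    ef, ep = ef_prev, ep_prev
    ep_sum = 0.0
    for x in samples:
        # ef(n) = alpha1 * ef(n-1) + (1 - alpha1) * |x(n)|
        ef = alpha1 * ef + (1.0 - alpha1) * abs(x)
        # ep(n) = alpha2 * ep(n-1) + (1 - alpha2) * ef(n)
        ep = alpha2 * ep + (1.0 - alpha2) * ef
        ep_sum += ep
    # Envelope value of the frame: average of the per-sample envelope values.
    return ep_sum / len(samples), ef, ep
```

The returned ef and ep can be passed back in as `ef_prev` and `ep_prev` when processing the next frame, so that the recursion continues across frame boundaries.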
Step 102, acquiring an envelope value of at least one frame of voice signal before the target frame of voice signal.
Wherein the envelope value of at least one frame of speech signal preceding the target frame of speech signal may include the envelope value of one or more frames of speech signal preceding the target frame of speech signal.
Step 103, adjusting the gain value of the target frame speech signal based on the envelope value of the target frame speech signal and the envelope value of at least one frame speech signal before the target frame speech signal.
Adjusting the gain value of the target frame speech signal based on the envelope value of the target frame speech signal and the envelope value of at least one frame of speech signal before the target frame speech signal may include: calculating the average of the envelope value of the target frame speech signal and the envelope value of at least one frame of speech signal before the target frame speech signal to obtain a mean value corresponding to the target frame speech signal; acquiring a first gain value of the target frame speech signal; and, when the mean value corresponding to the target frame speech signal is greater than a first preset value and/or an envelope value change rate is greater than a second preset value, adjusting the gain value of the target frame speech signal to a target gain value, where the target gain value is smaller than the first gain value, and the envelope value change rate is obtained based on the mean value corresponding to the target frame speech signal and the mean value corresponding to at least one frame of speech signal before the target frame speech signal.
Alternatively, adjusting the gain value of the target frame speech signal based on the envelope value of the target frame speech signal and the envelope value of at least one frame of speech signal before the target frame speech signal may include: calculating the difference between the envelope value of the target frame speech signal and the envelope value of the frame of speech signal before the target frame speech signal to obtain a difference value corresponding to the target frame speech signal; acquiring a first gain value of the target frame speech signal; and, when the difference value corresponding to the target frame speech signal is greater than a preset difference value, adjusting the gain value of the target frame speech signal to a target gain value, where the target gain value is smaller than the first gain value.
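The difference-based alternative can be sketched as below. The function name and arguments are hypothetical, and the sketch assumes the difference is signed (current frame envelope minus previous frame envelope), which the source does not state explicitly. The gain is left unchanged if the supplied target gain is not smaller than the first gain.

```python
def adjust_gain_by_difference(env_curr, env_prev, first_gain,
                              target_gain, preset_diff):
    """Return the gain to use for the target frame.

    Assumes a signed difference (env_curr - env_prev); all names are
    illustrative and not taken from the source.
    """
    difference = env_curr - env_prev
    # Lower the gain only when the envelope jumped by more than the preset
    # difference and the target gain really is smaller than the first gain.
    if difference > preset_diff and target_gain < first_gain:
        return target_gain
    return first_gain
```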
In addition, a DRC (dynamic range compression) algorithm may be used to smooth the gain value of the target frame speech signal. After smoothing, the gain value as a whole rises slowly and falls quickly with little jitter, which gives a good recording effect.
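The source does not specify how the smoothing is carried out. As one possible illustration of a gain that rises slowly and falls quickly, the sketch below uses a one-pole smoother with different coefficients for the two directions. This is a generic technique substituted for illustration, not the algorithm of the source; the function name, coefficients, and initial value are all assumptions.

```python
def smooth_gains(gains, rise_coeff=0.999, fall_coeff=0.9, initial=1.0):
    """Smooth a sequence of gain values asymmetrically.

    A coefficient close to 1 means slow tracking. rise_coeff is used when
    the requested gain is above the current smoothed gain, fall_coeff when
    it is at or below it. All defaults are hypothetical.
    """
    smoothed = []
    g = initial
    for requested in gains:
        coeff = rise_coeff if requested > g else fall_coeff
        g = coeff * g + (1.0 - coeff) * requested
        smoothed.append(g)
    return smoothed
```

With `rise_coeff` larger than `fall_coeff`, increases in the requested gain are followed gradually while decreases are followed quickly.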
In the embodiment of the invention, the envelope value of a target frame voice signal in a received voice signal is obtained; acquiring an envelope value of at least one frame of voice signal before the target frame of voice signal; and adjusting the gain value of the target frame voice signal based on the envelope value of the target frame voice signal and the envelope value of at least one frame voice signal before the target frame voice signal. Therefore, the gain value can be adjusted in real time based on the received voice signal, and the recording effect is improved.
Optionally, the adjusting the gain value of the target frame speech signal based on the envelope value of the target frame speech signal and the envelope value of at least one frame speech signal before the target frame speech signal includes:
calculating the average value of the envelope value of the target frame voice signal and the envelope value of at least one frame of voice signal before the target frame voice signal to obtain the average value corresponding to the target frame voice signal;
acquiring a first gain value of the target frame voice signal;
and under the condition that the average value corresponding to the target frame voice signal is greater than a first preset value and/or the envelope value change rate is greater than a second preset value, adjusting the gain value of the target frame voice signal to be a target gain value, wherein the target gain value is smaller than the first gain value, and the envelope value change rate is obtained based on the average value corresponding to the target frame voice signal and the average value corresponding to at least one frame of voice signal before the target frame voice signal.
The target frame speech signal may include a plurality of speech samples, and the first gain value of the target frame speech signal may include the gain values of those speech samples. Adjusting the gain value of the target frame speech signal to the target gain value may mean adjusting the gain values of all of the speech samples to the target gain value. The target gain value may be a preset fixed value, for example, 3; or the target gain value may be calculated from the maximum value and the minimum value of the gain value of the target frame speech signal; or the target gain value may be obtained empirically.
In addition, the first preset value may be half of an average value of energy values of all the current voice samples from the beginning of recording; or, the first preset value may be obtained through experiments, multiple values may be selected as the first preset value for gain control, and a value with a better gain control effect may be used as the first preset value; alternatively, the first preset value may also be obtained empirically. The second preset value may be 0.2, or 0.3, etc.; or, the second preset value may be obtained through experiments, multiple values may be selected as the second preset value for gain control, and a value with a better gain control effect may be used as the second preset value; alternatively, the second preset value may also be obtained empirically.
Further, the mean value corresponding to the target frame speech signal may be the average of the envelope value of the target frame speech signal and the envelope values of the 50, 75, or 100 frames of speech signal before the target frame speech signal. Denote the mean value corresponding to the target frame speech signal as ep_frame(r), where r is the target frame. The envelope value change rate may be obtained from the mean value corresponding to the target frame speech signal and the mean value corresponding to the frame of speech signal before the target frame speech signal; for example, the envelope value change rate may be (ep_frame(r) - ep_frame(r-1)) / ep_frame(r-1), or (ep_frame(r) - ep_frame(r-1)) / ep_frame(r), where ep_frame(r-1) is the mean value corresponding to the frame of speech signal before the target frame speech signal.
In practical applications, for example, if the target frame speech signal is the 150th frame, the frame before it is the 149th frame, and the mean value corresponding to the 149th frame is the average of the envelope value of the 149th frame and the envelope value of at least one frame before the 149th frame. The envelope value change rate may also be obtained from the mean value corresponding to the target frame speech signal and the mean values corresponding to multiple frames before the target frame speech signal; for example, the envelope value change rate may be (2 * ep_frame(r) - ep_frame(r-1) - ep_frame(r-2)) / ep_frame(r), where ep_frame(r-2) is the mean value corresponding to the frame two frames before the target frame speech signal.
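A minimal sketch of the running mean and the first change-rate formula, (ep_frame(r) - ep_frame(r-1)) / ep_frame(r-1), is given below. The class name `EnvelopeTracker` is hypothetical, the default history of 50 frames is one of the example values above, and returning a rate of 0 for the first frame or when the previous mean is 0 is an assumption made only to avoid a division by zero.

```python
from collections import deque


class EnvelopeTracker:
    """Track ep_frame(r) and its change rate over successive frames."""

    def __init__(self, history=50):
        # Holds the current frame envelope plus up to `history` earlier ones.
        self.envelopes = deque(maxlen=history + 1)
        self.prev_mean = None

    def update(self, envelope):
        """Add one frame envelope and return (ep_frame(r), change rate)."""
        self.envelopes.append(envelope)
        mean = sum(self.envelopes) / len(self.envelopes)
        if self.prev_mean is None or self.prev_mean == 0:
            rate = 0.0  # assumption: no rate is defined in these cases
        else:
            rate = (mean - self.prev_mean) / self.prev_mean
        self.prev_mean = mean
        return mean, rate
```

Calling `update` once per frame with that frame's envelope value yields the two quantities that are compared against the first and second preset values.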
FIG. 2 is a schematic diagram illustrating gain comparison provided by an embodiment of the present invention. Curve a in FIG. 2 is the amplitude curve of the original audio, curve b in FIG. 2 is the gain curve without the target processing, and curve c in FIG. 2 is the gain curve with the target processing, where the target processing is to adjust the gain value of the target frame speech signal to the target gain value when the mean value corresponding to the target frame speech signal is greater than the first preset value and/or the envelope value change rate is greater than the second preset value. FIG. 3 is a schematic diagram illustrating comparison of speech amplitudes according to an embodiment of the present invention. Curve d in FIG. 3 is the speech amplitude curve obtained by performing automatic gain control with the gain values of curve b, and curve e in FIG. 3 is the speech amplitude curve obtained by performing automatic gain control with the gain values of curve c. As can be seen from FIG. 3, the automatic gain control of this embodiment stabilizes the speech amplitude.
In this embodiment, the average of the envelope value of the target frame speech signal and the envelope value of at least one frame of speech signal before the target frame speech signal is calculated to obtain the mean value corresponding to the target frame speech signal; a first gain value of the target frame speech signal is acquired; and, when the mean value corresponding to the target frame speech signal is greater than a first preset value and/or the envelope value change rate is greater than a second preset value, the gain value of the target frame speech signal is adjusted to a target gain value that is smaller than the first gain value. In this way, when the target frame speech signal is likely to be the first frame of speech following silence, its gain value is lowered, which avoids excessive volume caused by the gain value not being attenuated in time and further improves the recording effect.
Optionally, the adjusting the gain value of the target frame speech signal to a target gain value when the mean value corresponding to the target frame speech signal is greater than a first preset value and/or the envelope value variation rate is greater than a second preset value includes:
and under the condition that the corresponding mean value of the target frame voice signal is greater than a first preset value and/or the change rate of the envelope value is greater than a second preset value, if the energy value of the target frame voice signal is greater than a third preset value, adjusting the gain value of the target frame voice signal to be a target gain value.
The energy value of the target frame speech signal may be an average value of energy values of a plurality of speech samples of the target frame speech signal. The third preset value can be half of the average value of the energy values of all the current voice sampling points from the beginning of recording; or, the third preset value may be obtained through experiments, multiple values may be selected as the third preset value for gain control, and a value with a better gain control effect may be used as the third preset value; alternatively, the third preset value may also be obtained empirically.
In this embodiment, when the mean value corresponding to the target frame speech signal is greater than a first preset value and/or the envelope value change rate is greater than a second preset value, the gain value of the target frame speech signal is adjusted to the target gain value if the energy value of the target frame speech signal is greater than a third preset value. In this way, it can be determined more accurately that the target frame speech signal is the first frame of speech following silence, which further improves the recording effect.
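The combined condition can be written as a small predicate. The function and parameter names below are hypothetical; "and/or" is implemented as a logical or, which is true when either or both conditions hold.

```python
def should_lower_gain(mean, change_rate, frame_energy,
                      first_preset, second_preset, third_preset):
    """Decide whether the target frame's gain is set to the target gain.

    mean         : mean value corresponding to the target frame
    change_rate  : envelope value change rate
    frame_energy : average energy value of the frame's samples
    """
    triggered = mean > first_preset or change_rate > second_preset
    # The extra energy check guards against lowering the gain on a frame
    # that does not actually carry enough energy to be speech onset.
    return triggered and frame_energy > third_preset
```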
Optionally, the target gain value is g, where g = g_min + k * (g_max - g_min) * g_1 / g_max, g_min is the minimum value of the gain value of the target frame speech signal, g_max is the maximum value of the gain value of the target frame speech signal, g_1 is the first gain value, and k is a fourth preset value.
The fourth preset value may be 0.1, 0.4, or 0.6, and may be preset, and preferably, may be set to 0.5. The maximum value and the minimum value of the gain value of the target frame voice signal can be preset, and when the gain value of the target frame voice signal is detected to be larger than the maximum value of the gain value of the target frame voice signal, the maximum value of the gain value of the target frame voice signal can be used as the gain value of the target frame voice signal; when it is detected that the gain value of the target frame speech signal is smaller than the minimum value of the gain values of the target frame speech signal, the minimum value of the gain values of the target frame speech signal may be used as the gain value of the target frame speech signal. The maximum value of the gain value of the target frame speech signal may be set to 6, or may be 8, or may be 10, or the like. The minimum value of the gain value of the target frame speech signal may be set to the reciprocal of the maximum value of the gain value of the target frame speech signal, for example, the maximum value of the gain value of the target frame speech signal may be 8, and the minimum value of the gain value of the target frame speech signal may be set to 1/8.
In this embodiment, the target gain value can be calculated from the maximum value and the minimum value of the gain value of the target frame speech signal, so it can be obtained without a large number of experiments, which improves the efficiency of automatic gain control.
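The target gain calculation and the limiting of the gain to its preset range can be sketched as follows. The function names are hypothetical; the defaults use example values from the text (maximum 8, minimum equal to its reciprocal, fourth preset value 0.5). Limiting the first gain before applying the formula is an assumption.

```python
def clamp_gain(gain, g_max=8.0):
    """Limit a gain to [1/g_max, g_max], with the minimum as the reciprocal."""
    g_min = 1.0 / g_max
    return min(max(gain, g_min), g_max)


def target_gain(first_gain, k=0.5, g_max=8.0):
    """Compute g = g_min + k * (g_max - g_min) * g_1 / g_max."""
    g_min = 1.0 / g_max
    g_1 = clamp_gain(first_gain, g_max)  # assumption: clamp before use
    return g_min + k * (g_max - g_min) * g_1 / g_max
```

For example, with a maximum of 8 and k = 0.5, a first gain of 8 gives a target gain of 4.0625.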
Optionally, the obtaining the first gain value of the target frame speech signal includes:
acquiring a gain value of the first voice sampling point;
acquiring a gain value of the second voice sample point based on the gain value of the first voice sample point, wherein the first voice sample point is a previous voice sample point of the second voice sample point;
the adjusting the gain value of the target frame speech signal to a target gain value comprises:
and adjusting the gain value of the first voice sampling point and the gain value of the second voice sampling point to be the target gain value.
The first speech sample may be any one speech sample in the target frame speech signal. When the target frame speech signal is an initial frame and the first speech sample is a first speech sample of the target frame speech signal, the gain value of the first speech sample may be a predetermined value, for example, 1.0001. When the target frame speech signal is a non-start frame and the first speech sample is a first speech sample of the target frame speech signal, the gain value of the first speech sample may be obtained by calculating a gain value of a last speech sample of a previous frame speech signal of the target frame speech signal. The gain value of each speech sample can be obtained from the gain value of the previous speech sample.
In addition, the obtaining the gain value of the second voice sample based on the gain value of the first voice sample may include: under the condition that the energy value of the second voice sampling point is smaller than a fifth preset value, the gain value of the second voice sampling point is the gain value of the first voice sampling point; and under the condition that the energy value of the second voice sample point is greater than or equal to the fifth preset value, the gain value of the second voice sample point is the product of the gain value of the first voice sample point and a preset coefficient. Or, the obtaining the gain value of the second voice sample based on the gain value of the first voice sample may include: when the envelope value of the second voice sample is greater than or equal to a seventh preset value, the gain value of the first voice sample may be decreased by an eighth preset value to be used as the gain value of the second voice sample; when the envelope value of the second voice sample is smaller than the seventh preset value, the gain value of the first voice sample may be increased by the eighth preset value to be used as the gain value of the second voice sample.
In this embodiment, the gain value of the first voice sample is obtained, the gain value of the second voice sample is obtained based on the gain value of the first voice sample, and both the gain value of the first voice sample and the gain value of the second voice sample are adjusted to the target gain value. Therefore, the gain value of the voice sampling point of the target frame voice signal is integrally adjusted, the situation that the gain value of the target frame voice signal cannot be attenuated in time is avoided, and the recording effect can be further improved.
Optionally, the obtaining the gain value of the second voice sample point based on the gain value of the first voice sample point includes:
under the condition that the energy value of the second voice sampling point is smaller than a fifth preset value, the gain value of the second voice sampling point is the gain value of the first voice sampling point;
and under the condition that the energy value of the second voice sample point is greater than or equal to the fifth preset value, the gain value of the second voice sample point is the product of the gain value of the first voice sample point and a preset coefficient.
The fifth preset value may be half of an average of energy values of all the current voice samples from the beginning of recording; or, the fifth preset value may be obtained through experiments, multiple values may be selected as the fifth preset value for gain control, and a value with a better gain control effect may be used as the fifth preset value; alternatively, the fifth preset value may also be obtained empirically.
In addition, when the product of the envelope value of the second speech sample and the gain value of the first speech sample is greater than or equal to a sixth preset value, the preset coefficient may be less than 1; when the product of the envelope value of the second speech sample and the gain value of the first speech sample is smaller than the sixth preset value, the preset coefficient may be greater than 1. Or, when the envelope value of the second speech sample point is greater than or equal to a seventh preset value, the preset coefficient may be less than 1; when the envelope value of the second speech sample is smaller than the seventh preset value, the preset coefficient may be greater than 1.
Further, when the gain value of the target frame speech signal is adjusted to the target gain value, the gain value of the last speech sample point of the target frame speech signal is also adjusted to the target gain value, and the gain value of the first speech sample point of the next frame speech signal of the target frame speech signal may be a product of the gain value of the last speech sample point of the target frame speech signal before adjustment and a preset coefficient, or may be a product of the gain value of the last speech sample point of the target frame speech signal after adjustment and a preset coefficient.
In this embodiment, when the energy value of the second voice sample is smaller than a fifth preset value, the gain value of the second voice sample is the gain value of the first voice sample; and under the condition that the energy value of the second voice sample point is greater than or equal to the fifth preset value, the gain value of the second voice sample point is the product of the gain value of the first voice sample point and a preset coefficient. In this way, it is determined whether the gain value needs to be updated based on the energy value of the second speech sample, avoiding updating the gain value during silent periods.
Optionally, when a product of the envelope value of the second speech sample and the gain value of the first speech sample is greater than or equal to a sixth preset value, the preset coefficient is less than 1;
and when the product of the envelope value of the second voice sample and the gain value of the first voice sample is smaller than the sixth preset value, the preset coefficient is larger than 1.
The sixth preset value may be a product of an envelope value of the target frame speech signal and a target gain value; or, the sixth preset value may be obtained through experiments, multiple values may be selected as the sixth preset value for gain control, and a value with a better gain control effect may be used as the sixth preset value; alternatively, the sixth preset value may also be obtained empirically. When the product of the envelope value of the second voice sample and the gain value of the first voice sample is greater than or equal to a sixth preset value, the preset coefficient may be a; when the product of the envelope value of the second speech sample and the gain value of the first speech sample is smaller than the sixth preset value, the preset coefficient may be 1/a, where a is smaller than 1. For example, a may be 0.999784.
In this embodiment, when the product of the envelope value of the second speech sample and the gain value of the first speech sample is greater than or equal to a sixth preset value, the preset coefficient is less than 1; and when the product of the envelope value of the second voice sample and the gain value of the first voice sample is smaller than the sixth preset value, the preset coefficient is larger than 1. In this way, the gain value can be slowly increased or decreased based on the product of the envelope value of the second voice sample and the gain value of the first voice sample, thereby realizing automatic control of the gain value.
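The per-sample update described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name and the parameter names `energy_floor` (standing in for the fifth preset value) and `level_threshold` (standing in for the sixth preset value) are assumptions made for readability.

```python
def next_sample_gain(
    prev_gain: float,
    envelope: float,
    energy: float,
    energy_floor: float,
    level_threshold: float,
    a: float = 0.999784,
) -> float:
    """Gain of the second sample derived from the gain of the first sample."""
    # Silent sample: keep the previous gain unchanged.
    if energy < energy_floor:
        return prev_gain
    # Output level at or above the threshold: coefficient a < 1, gain decays slowly.
    if envelope * prev_gain >= level_threshold:
        return prev_gain * a
    # Output level below the threshold: coefficient 1/a > 1, gain grows slowly.
    return prev_gain / a
```

Because the two coefficients are reciprocals, one decrease followed by one increase returns the gain to its starting value up to floating-point rounding.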
Referring to fig. 4, fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention, and as shown in fig. 4, the electronic device 200 includes:
a first obtaining module 201, configured to obtain an envelope value of a target frame speech signal in a received speech signal;
a second obtaining module 202, configured to obtain an envelope value of at least one frame of speech signal before the target frame of speech signal;
an adjusting module 203, configured to adjust a gain value of the target frame speech signal based on the envelope value of the target frame speech signal and an envelope value of at least one frame speech signal preceding the target frame speech signal.
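As a rough illustration of what the two obtaining modules supply to the adjusting module, the sketch below keeps the envelope values of the target frame and a few earlier frames and returns their mean. The class name `EnvelopeTracker`, the window length, and the choice of peak absolute amplitude as the envelope value are all assumptions; this passage does not specify how the envelope value is computed.

```python
from collections import deque
from typing import Deque, Sequence


class EnvelopeTracker:
    """Stores envelope values for the target frame and earlier frames."""

    def __init__(self, window: int = 4) -> None:
        if window < 2:
            raise ValueError("window must cover the target frame and at least one earlier frame")
        self._envelopes: Deque[float] = deque(maxlen=window)

    def push(self, frame: Sequence[float]) -> float:
        """Record the envelope of a new target frame and return the running mean."""
        # Assumed envelope definition: peak absolute amplitude of the frame.
        envelope = max((abs(sample) for sample in frame), default=0.0)
        self._envelopes.append(envelope)
        # Mean over the target frame and the earlier frames still in the window.
        return sum(self._envelopes) / len(self._envelopes)
```

A bounded deque discards the oldest envelope automatically, so the mean always covers the target frame plus at most `window - 1` earlier frames.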
Optionally, as shown in fig. 5, the adjusting module 203 includes:
a calculating unit 2031, configured to calculate an average value between the envelope value of the target frame speech signal and an envelope value of at least one frame of speech signal before the target frame speech signal, to obtain an average value corresponding to the target frame speech signal;
an obtaining unit 2032, configured to obtain a first gain value of the target frame speech signal;
an adjusting unit 2033, configured to adjust a gain value of the target frame speech signal to a target gain value when the mean value corresponding to the target frame speech signal is greater than a first preset value and/or an envelope value change rate is greater than a second preset value, where the target gain value is smaller than the first gain value, and the envelope value change rate is obtained based on the mean value corresponding to the target frame speech signal and the mean value corresponding to at least one frame of speech signal before the target frame speech signal.
Optionally, the adjusting unit 2033 is specifically configured to:
when the mean value corresponding to the target frame speech signal is greater than a first preset value and/or the envelope value change rate is greater than a second preset value, adjust the gain value of the target frame speech signal to the target gain value if the energy value of the target frame speech signal is greater than a third preset value.
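The frame-level decision made by the adjusting unit can be sketched as a single predicate. The names `mean_limit`, `rate_limit`, and `energy_limit` stand in for the first, second, and third preset values and are hypothetical. The change rate is computed here as a relative difference between consecutive mean values, which is an assumption: the text only says the change rate is obtained from the mean values of the target frame and of earlier frames. The `require_both` flag covers the "and/or" wording.

```python
def should_reduce_gain(
    mean_cur: float,
    mean_prev: float,
    frame_energy: float,
    mean_limit: float,
    rate_limit: float,
    energy_limit: float,
    require_both: bool = False,
) -> bool:
    """Decide whether the target frame's gain should be set to the target gain."""
    # Assumed change-rate definition: relative growth of the mean envelope value.
    change_rate = (mean_cur - mean_prev) / max(mean_prev, 1e-12)
    loud = mean_cur > mean_limit
    rising = change_rate > rate_limit
    triggered = (loud and rising) if require_both else (loud or rising)
    # Energy gate: only frames with enough energy are adjusted.
    return triggered and frame_energy > energy_limit
```

Keeping the energy gate as the final condition means a low-energy frame is never adjusted, regardless of how the envelope statistics look.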
Optionally, the target gain value is g, where $g = g_{\min} + k \cdot (g_{\max} - g_{\min}) \cdot g_1 / g_{\max}$, $g_{\min}$ is the minimum value of the gain value of the target frame speech signal, $g_{\max}$ is the maximum value of the gain value of the target frame speech signal, $g_1$ is the first gain value, and $k$ is a fourth preset value.
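The target gain formula can be evaluated directly, as in the sketch below. The function name and the sample numbers are illustrative only; the patent text here does not give concrete values for the gain range or for k.

```python
def target_gain(g1: float, g_min: float, g_max: float, k: float) -> float:
    """Target gain g = g_min + k * (g_max - g_min) * g1 / g_max."""
    if g_max <= 0.0:
        raise ValueError("g_max must be positive")
    if g_min > g_max:
        raise ValueError("g_min must not exceed g_max")
    return g_min + k * (g_max - g_min) * g1 / g_max


# Illustrative values: gain range [1, 10], k = 0.5, first gain value 8.
example = target_gain(g1=8.0, g_min=1.0, g_max=10.0, k=0.5)
```

With these illustrative values the target gain is 4.6, which is below the first gain value of 8. Whether the result is smaller than the first gain value in general depends on k and the gain range, and this sketch does not enforce it.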
Optionally, the target frame speech signal includes a plurality of speech samples, the plurality of speech samples include a first speech sample and a second speech sample, and the first gain value of the target frame speech signal includes a gain value of the first speech sample and a gain value of the second speech sample, as shown in fig. 6, the obtaining unit 2032 includes:
a first obtaining subunit 20321, configured to obtain a gain value of the first voice sample;
a second obtaining subunit 20322, configured to obtain a gain value of the second voice sample based on a gain value of the first voice sample, where the first voice sample is a previous voice sample of the second voice sample;
the adjusting unit 2033 is specifically configured to:
and under the condition that the average value corresponding to the target frame voice signal is greater than a first preset value and/or the envelope value change rate is greater than a second preset value, adjusting the gain value of the first voice sample point and the gain value of the second voice sample point to be the target gain value.
Optionally, the second obtaining subunit 20322 is specifically configured to:
under the condition that the energy value of the second voice sampling point is smaller than a fifth preset value, the gain value of the second voice sampling point is the gain value of the first voice sampling point;
and under the condition that the energy value of the second voice sample point is greater than or equal to the fifth preset value, the gain value of the second voice sample point is the product of the gain value of the first voice sample point and a preset coefficient.
Optionally, when a product of the envelope value of the second speech sample and the gain value of the first speech sample is greater than or equal to a sixth preset value, the preset coefficient is less than 1;
and when the product of the envelope value of the second voice sample and the gain value of the first voice sample is smaller than the sixth preset value, the preset coefficient is larger than 1.
The electronic device can implement each process of the method embodiment of fig. 1; to avoid repetition, details are not described here again.
Referring to fig. 7, fig. 7 is a schematic structural diagram of another electronic device according to an embodiment of the present invention, and as shown in fig. 7, the electronic device 300 includes: a memory 302, a processor 301, and a program stored on the memory 302 and executable on the processor 301, wherein:
the processor 301 reads the program in the memory 302 for executing:
acquiring an envelope value of a target frame voice signal in a received voice signal;
acquiring an envelope value of at least one frame of voice signal before the target frame of voice signal;
and adjusting the gain value of the target frame voice signal based on the envelope value of the target frame voice signal and the envelope value of at least one frame voice signal before the target frame voice signal.
Optionally, the adjusting, performed by the processor 301, of the gain value of the target frame speech signal based on the envelope value of the target frame speech signal and the envelope value of at least one frame of speech signal before the target frame speech signal includes:
calculating the average value of the envelope value of the target frame voice signal and the envelope value of at least one frame of voice signal before the target frame voice signal to obtain the average value corresponding to the target frame voice signal;
acquiring a first gain value of the target frame voice signal;
and under the condition that the average value corresponding to the target frame voice signal is greater than a first preset value and/or the envelope value change rate is greater than a second preset value, adjusting the gain value of the target frame voice signal to be a target gain value, wherein the target gain value is smaller than the first gain value, and the envelope value change rate is obtained based on the average value corresponding to the target frame voice signal and the average value corresponding to at least one frame of voice signal before the target frame voice signal.
Optionally, the adjusting, performed by the processor 301, of the gain value of the target frame speech signal to a target gain value when the mean value corresponding to the target frame speech signal is greater than a first preset value and/or the envelope value change rate is greater than a second preset value includes:
when the mean value corresponding to the target frame speech signal is greater than a first preset value and/or the envelope value change rate is greater than a second preset value, adjusting the gain value of the target frame speech signal to the target gain value if the energy value of the target frame speech signal is greater than a third preset value.
Optionally, the target gain value is g, where $g = g_{\min} + k \cdot (g_{\max} - g_{\min}) \cdot g_1 / g_{\max}$, $g_{\min}$ is the minimum value of the gain value of the target frame speech signal, $g_{\max}$ is the maximum value of the gain value of the target frame speech signal, $g_1$ is the first gain value, and $k$ is a fourth preset value.
Optionally, the obtaining the first gain value of the target frame speech signal by the processor 301 includes:
acquiring a gain value of the first voice sampling point;
acquiring a gain value of the second voice sample point based on the gain value of the first voice sample point, wherein the first voice sample point is a previous voice sample point of the second voice sample point;
the adjusting the gain value of the target frame speech signal to a target gain value comprises:
and adjusting the gain value of the first voice sampling point and the gain value of the second voice sampling point to be the target gain value.
Optionally, the obtaining, by the processor 301, a gain value of the second voice sample based on the gain value of the first voice sample includes:
under the condition that the energy value of the second voice sampling point is smaller than a fifth preset value, the gain value of the second voice sampling point is the gain value of the first voice sampling point;
and under the condition that the energy value of the second voice sample point is greater than or equal to the fifth preset value, the gain value of the second voice sample point is the product of the gain value of the first voice sample point and a preset coefficient.
Optionally, when a product of the envelope value of the second speech sample and the gain value of the first speech sample is greater than or equal to a sixth preset value, the preset coefficient is less than 1;
and when the product of the envelope value of the second voice sample and the gain value of the first voice sample is smaller than the sixth preset value, the preset coefficient is larger than 1.
In fig. 7, the bus architecture may include any number of interconnected buses and bridges, with one or more processors represented by processor 301 and various circuits of memory represented by memory 302 being linked together. The bus architecture may also link together various other circuits such as peripherals, voltage regulators, power management circuits, and the like, which are well known in the art, and therefore, will not be described any further herein. The bus interface provides an interface.
The processor 301 is responsible for managing the bus architecture and general processing, and the memory 302 may store data used by the processor 301 in performing operations.
It should be noted that any implementation manner in the method embodiment of the present invention may be implemented by the electronic device in this embodiment, and achieve the same beneficial effects, and details are not described here.
An embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program implements each process of the above-mentioned embodiment of the automatic gain control method, and can achieve the same technical effect, and in order to avoid repetition, details are not repeated here. The computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
While the present invention has been described with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, which are illustrative and not restrictive, and it will be apparent to those skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (10)

1. An automatic gain control method applied to an electronic device, the method comprising:
acquiring an envelope value of a target frame voice signal in a received voice signal;
acquiring an envelope value of at least one frame of voice signal before the target frame of voice signal;
and adjusting the gain value of the target frame voice signal based on the envelope value of the target frame voice signal and the envelope value of at least one frame voice signal before the target frame voice signal.
2. The method according to claim 1, wherein said adjusting the gain value of the target frame speech signal based on the envelope value of the target frame speech signal and the envelope value of at least one frame speech signal preceding the target frame speech signal comprises:
calculating the average value of the envelope value of the target frame voice signal and the envelope value of at least one frame of voice signal before the target frame voice signal to obtain the average value corresponding to the target frame voice signal;
acquiring a first gain value of the target frame voice signal;
and under the condition that the average value corresponding to the target frame voice signal is greater than a first preset value and/or the envelope value change rate is greater than a second preset value, adjusting the gain value of the target frame voice signal to be a target gain value, wherein the target gain value is smaller than the first gain value, and the envelope value change rate is obtained based on the average value corresponding to the target frame voice signal and the average value corresponding to at least one frame of voice signal before the target frame voice signal.
3. The method according to claim 2, wherein the adjusting the gain value of the target frame speech signal to a target gain value when the corresponding mean value of the target frame speech signal is greater than a first preset value and/or the change rate of the envelope value is greater than a second preset value comprises:
when the mean value corresponding to the target frame speech signal is greater than a first preset value and/or the envelope value change rate is greater than a second preset value, adjusting the gain value of the target frame speech signal to the target gain value if the energy value of the target frame speech signal is greater than a third preset value.
4. A method according to claim 2 or 3, wherein the target gain value is g, where $g = g_{\min} + k \cdot (g_{\max} - g_{\min}) \cdot g_1 / g_{\max}$, $g_{\min}$ is the minimum value of the gain value of the target frame speech signal, $g_{\max}$ is the maximum value of the gain value of the target frame speech signal, $g_1$ is the first gain value, and $k$ is a fourth preset value.
5. The method of claim 2, wherein the target frame speech signal comprises a plurality of speech samples, the plurality of speech samples comprises a first speech sample and a second speech sample, the first gain value of the target frame speech signal comprises a gain value of the first speech sample and a gain value of the second speech sample, and the obtaining the first gain value of the target frame speech signal comprises:
acquiring a gain value of the first voice sampling point;
acquiring a gain value of the second voice sample point based on the gain value of the first voice sample point, wherein the first voice sample point is a previous voice sample point of the second voice sample point;
the adjusting the gain value of the target frame speech signal to a target gain value comprises:
and adjusting the gain value of the first voice sampling point and the gain value of the second voice sampling point to be the target gain value.
6. The method of claim 5, wherein obtaining the gain value for the second speech sample based on the gain value for the first speech sample comprises:
under the condition that the energy value of the second voice sampling point is smaller than a fifth preset value, the gain value of the second voice sampling point is the gain value of the first voice sampling point;
and under the condition that the energy value of the second voice sample point is greater than or equal to the fifth preset value, the gain value of the second voice sample point is the product of the gain value of the first voice sample point and a preset coefficient.
7. The method according to claim 6, wherein the predetermined coefficient is smaller than 1 when the product of the envelope value of the second speech sample and the gain value of the first speech sample is greater than or equal to a sixth predetermined value;
and when the product of the envelope value of the second voice sample and the gain value of the first voice sample is smaller than the sixth preset value, the preset coefficient is larger than 1.
8. An electronic device, characterized in that the electronic device comprises:
the first acquisition module is used for acquiring an envelope value of a target frame voice signal in the received voice signal;
the second acquisition module is used for acquiring an envelope value of at least one frame of voice signal before the target frame of voice signal;
and the adjusting module is used for adjusting the gain value of the target frame voice signal based on the envelope value of the target frame voice signal and the envelope value of at least one frame of voice signal before the target frame voice signal.
9. An electronic device, comprising: memory, processor and program stored on the memory and executable on the processor, which when executed by the processor implements the steps in the automatic gain control method according to any of claims 1 to 7.
10. A computer-readable storage medium, having stored thereon a computer program which, when being executed by a processor, carries out the steps of the automatic gain control method according to any one of claims 1 to 7.
CN202010037394.3A 2020-01-14 2020-01-14 Automatic gain control method and electronic equipment Active CN111243631B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010037394.3A CN111243631B (en) 2020-01-14 2020-01-14 Automatic gain control method and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010037394.3A CN111243631B (en) 2020-01-14 2020-01-14 Automatic gain control method and electronic equipment

Publications (2)

Publication Number Publication Date
CN111243631A true CN111243631A (en) 2020-06-05
CN111243631B CN111243631B (en) 2021-12-14

Family

ID=70876474

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010037394.3A Active CN111243631B (en) 2020-01-14 2020-01-14 Automatic gain control method and electronic equipment

Country Status (1)

Country Link
CN (1) CN111243631B (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101009099A (en) * 2007-01-26 2007-08-01 北京中星微电子有限公司 Digital auto gain control method and device
CN100589183C (en) * 2007-01-26 2010-02-10 北京中星微电子有限公司 Digital auto gain control method and device
CN101567190A (en) * 2009-05-21 2009-10-28 深圳市科莱特斯科技有限公司 Speech gain control method and device
CN106448712A (en) * 2016-10-20 2017-02-22 广州视源电子科技股份有限公司 Automatic gain control method and device for audio signals
CN108573709A (en) * 2017-03-09 2018-09-25 中移(杭州)信息技术有限公司 A kind of auto gain control method and device
CN109716432A (en) * 2018-11-30 2019-05-03 深圳市汇顶科技股份有限公司 Gain process method and device thereof, electronic equipment, signal acquisition method and its system
CN110111805A (en) * 2019-04-29 2019-08-09 北京声智科技有限公司 Auto gain control method, device and readable storage medium storing program for executing in the interactive voice of far field
CN110349595A (en) * 2019-07-22 2019-10-18 浙江大华技术股份有限公司 A kind of audio signal auto gain control method, control equipment and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112669878A (en) * 2020-12-23 2021-04-16 北京声智科技有限公司 Method and device for calculating sound gain value and electronic equipment
CN112669878B (en) * 2020-12-23 2024-04-19 北京声智科技有限公司 Sound gain value calculation method and device and electronic equipment

Also Published As

Publication number Publication date
CN111243631B (en) 2021-12-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant