CN110349595B

CN110349595B - Audio signal automatic gain control method, control equipment and storage medium

Info

Publication number: CN110349595B
Application number: CN201910663246.XA
Authority: CN
Inventors: 陈烈; 黄景标
Original assignee: Zhejiang Dahua Technology Co Ltd
Current assignee: Zhejiang Dahua Technology Co Ltd
Priority date: 2019-07-22
Filing date: 2019-07-22
Publication date: 2021-08-31
Anticipated expiration: 2039-07-22
Also published as: CN110349595A

Abstract

The application discloses an audio signal automatic gain control method, a control device, and a storage medium. The method comprises: computing the energy of an input audio signal, and performing a first correction on an initial gain coefficient corresponding to the input audio signal according to that energy, to obtain an intermediate gain coefficient of the input audio signal; performing a second correction on the intermediate gain coefficient to obtain a final gain coefficient; and obtaining an output signal according to the input audio signal and the final gain coefficient. In this way, the gain coefficient can be adjusted dynamically, fluctuations of the audio signal can be tracked quickly, and the quality of the output signal is improved.

Description

Audio signal automatic gain control method, control equipment and storage medium

Technical Field

The present application relates to the field of signal processing technologies, and in particular, to an audio signal automatic gain control method, control device, and storage medium.

Background

In network voice chat, factors such as low-precision sampling equipment, poor microphone quality, and a speaker's distance from the microphone varying between near and far during a call can cause the voice transmitted over the network to sound alternately loud and quiet; in serious cases this interferes with communication between the calling and called parties.

In practical applications, silence detection is commonly used to decide when the volume should be increased or decreased: once silence or noise is detected, no processing is performed; otherwise, a certain strategy is applied. Processing the audio signal with an Automatic Gain Control (AGC) algorithm keeps the level at the output or receiving end within a certain range, which addresses the unbalanced levels of voice signals transmitted over the network and yields higher-quality voice. However, to keep the statistics reliable and to prevent abnormal burst signals from skewing them, a certain number of samples must be accumulated, usually over a period of several tens of milliseconds or even more than one hundred milliseconds, so the statistics take a long time and the response is slow.

Disclosure of Invention

The main technical problem addressed by the present application is to provide an audio signal automatic gain control method, control device, and storage medium that can dynamically adjust a gain coefficient, quickly track fluctuations of an audio signal, and improve the quality of an output signal.

To solve the above technical problem, one technical solution adopted by the present application is to provide an audio signal automatic gain control method, the method comprising: computing the energy of an input audio signal, and performing a first correction on the initial gain coefficient corresponding to the input audio signal according to that energy, to obtain an intermediate gain coefficient of the input audio signal; performing a second correction on the intermediate gain coefficient to obtain a final gain coefficient; and obtaining an output signal according to the input audio signal and the final gain coefficient.

Wherein, before the step of computing the energy of the input audio signal and performing the first correction on the initial gain coefficient corresponding to the input audio signal according to that energy to obtain the intermediate gain coefficient, the method comprises: obtaining a loudness gain coefficient table according to a preset loudness range and an equal loudness curve, and storing the loudness gain coefficient table, wherein the loudness gain coefficient table comprises a mapping between loudness level, frequency, and gain coefficient; selecting a loudness level from the preset loudness range as an initial loudness factor; and matching the initial loudness factor against the loudness gain coefficient table to obtain the initial gain coefficient.

Wherein the step of computing the energy of the input audio signal comprises: framing the input audio signal to obtain a plurality of first audio frames; detecting each first audio frame using a voice activity detection method to determine whether it is a valid frame; and, if the first audio frame is a valid frame, obtaining the energy root mean square of each valid frame and averaging the energy root mean square of all valid frames in the input audio signal to obtain an average energy root mean square.

Wherein the step of performing the first correction on the initial gain coefficient corresponding to the input audio signal according to the energy of the input audio signal to obtain the intermediate gain coefficient comprises: determining whether the average energy root mean square is greater than a first preset energy threshold; if the average energy root mean square is greater than the first preset energy threshold, reducing the initial loudness factor to a first intermediate loudness factor; if the average energy root mean square is less than the first preset energy threshold, increasing the initial loudness factor to a second intermediate loudness factor; and matching the first intermediate loudness factor or the second intermediate loudness factor against the loudness gain coefficient table to obtain the intermediate gain coefficient.

Wherein, in another implementation, the step of performing the first correction on the initial gain coefficient corresponding to the input audio signal according to the energy of the input audio signal to obtain the intermediate gain coefficient comprises: determining whether the average energy root mean square of the input audio signal corresponding to the time period preceding the current time period is greater than a second preset energy threshold; if the average energy root mean square of the input audio signal corresponding to the previous time period is greater than the second preset energy threshold, reducing the initial loudness factor to a third intermediate loudness factor; if it is less than the second preset energy threshold, increasing the initial loudness factor to a fourth intermediate loudness factor; and matching the third intermediate loudness factor or the fourth intermediate loudness factor against the loudness gain coefficient table to obtain the intermediate gain coefficient.

Wherein the step of performing the second correction on the intermediate gain coefficient to obtain the final gain coefficient comprises: framing the input audio signal to obtain a plurality of second audio frames; analyzing the second audio frames using a voice endpoint detection method to obtain an audio probability value corresponding to each second audio frame, the audio probability value indicating the probability that audio is present in the second audio frame; and performing the second correction on the intermediate gain coefficient corresponding to each second audio frame according to the audio probability value to obtain a gain coefficient adjustment table, the gain coefficient adjustment table comprising the plurality of second audio frames and the final gain coefficients corresponding to them.

Wherein the step of performing the second correction on the intermediate gain coefficient corresponding to each second audio frame according to the audio probability value to obtain the gain coefficient adjustment table comprises: determining whether the audio probability value is greater than a preset probability threshold; if the audio probability value is greater than the preset probability threshold, reducing the intermediate gain coefficient corresponding to the second audio frame to a first final gain coefficient; and if the audio probability value is less than the preset probability threshold, increasing the intermediate gain coefficient corresponding to the second audio frame to a second final gain coefficient.

Wherein the step of obtaining an output signal according to the input audio signal and the final gain coefficient comprises: multiplying each second audio frame by its final gain coefficient in the gain coefficient adjustment table to obtain the output signal.

To solve the above technical problem, another technical solution adopted by the present application is to provide an automatic gain control apparatus comprising a memory and a processor connected to each other, wherein the memory is configured to store a computer program which, when executed by the processor, implements the above audio signal automatic gain control method.

To solve the above technical problem, a further technical solution adopted by the present application is to provide a computer storage medium configured to store a computer program which, when executed by a processor, implements the above audio signal automatic gain control method.

With the above scheme, the beneficial effects of the present application are as follows: the initial gain coefficient is corrected twice. In the first correction, the initial gain coefficient is adjusted according to the energy of the input audio signal to obtain an intermediate gain coefficient; in the second correction, the final gain coefficient is obtained using the intermediate gain coefficient and the input audio signal; the output signal is then generated from the input audio signal and the final gain coefficient. The gain coefficient can thus be adjusted dynamically and adaptively to track fluctuations of the input audio signal quickly, so that the amplitude of the output signal stays within a certain range, the quality of the output signal is improved, and the listener hears stable audio.

Drawings

To illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort. Wherein:

FIG. 1 is a schematic flowchart illustrating an embodiment of an automatic gain control method for audio signals according to the present application;

FIG. 2 is a schematic flowchart illustrating an audio signal AGC method according to another embodiment of the present application;

FIG. 3 is a schematic diagram of an equal loudness curve in another embodiment of an automatic gain control method for audio signals according to the present application;

FIG. 4 is a flowchart illustrating step 208 in another embodiment of the method for automatic gain control of audio signals provided in the present application;

FIG. 5 is another flowchart illustrating step 208 in another embodiment of the method for automatic gain control of audio signals provided in the present application;

FIG. 6 is a schematic structural diagram of an embodiment of an automatic gain control apparatus provided in the present application;

FIG. 7 is a schematic structural diagram of an embodiment of a computer storage medium provided in the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

Currently, there are several methods for adjusting the gain of a speech signal. In a first method, high and low thresholds are set for the short-time energy and for the zero-crossing rate of speech; the short-time energy and zero-crossing rate of the input signal are calculated and compared with the respective preset thresholds, and on that basis the gain is increased, decreased, or held at its current value. This method uses speech feature parameters such as the zero-crossing rate and short-time energy as the criteria for endpoint detection, and is prone to misjudgment in low signal-to-noise-ratio environments and under certain specific noises, so that the adjusted sound becomes alternately loud and quiet.

In a second method, the speech to be detected is preprocessed and then subjected to Voice Activity Detection (VAD). When a speech signal is detected, the speech signal parameters of the current operation window frame are compared with those of the previous operation window frame, and corresponding gain adjustment data is output to an operational amplifier according to the comparison result; at startup, or when the signal is judged not to be speech, the gain of the operational amplifier is set to an initial value. This method relies entirely on voice activity detection, and if voice activity detection fails, the output of the entire system becomes abnormal.

In a third method, an input speech signal is acquired, an accumulated statistical maximum is calculated, an expected Pulse Code Modulation (PCM) adjustment factor is calculated, a fast AGC gain is calculated, AGC is applied to the input signal, and the processed speech signal is output. This method performs no voice activity detection; it only calculates the accumulated statistical maximum and compares the energy of the input signal with that maximum to adjust the gain, and may adjust incorrectly under certain sustained loud noise.

In the above methods, automatic gain adjustment either performs no voice activity detection or performs endpoint detection on a single feature only, so misadjustment occurs easily at low signal-to-noise ratios; the algorithms are not robust, and the sound easily becomes alternately loud and quiet; and because the calculation requires long-term statistics, the processing frame cannot be too short, and linear filtering needs time to converge, the effective response of the conventional methods is slow.

Referring to FIG. 1, which is a schematic flowchart of an embodiment of the audio signal automatic gain control method provided in the present application, the method includes:

Step 11: Compute the energy of the input audio signal, and perform a first correction on the initial gain coefficient corresponding to the input audio signal according to that energy, to obtain an intermediate gain coefficient of the input audio signal.

The input audio signal may be preprocessed. For example, it may be windowed to smooth it and reduce spectral leakage, or a Fast Fourier Transform (FFT) may be applied to transform it from the time domain to the frequency domain so that processing is performed in the frequency domain.
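By way of non-limiting illustration, the preprocessing described above can be sketched with NumPy. The Hann window, the 16 kHz sample rate, the 512-sample frame, and the function name `preprocess_frame` are assumptions made for this example; the application does not specify them.

```python
import numpy as np


def preprocess_frame(frame: np.ndarray) -> np.ndarray:
    """Window one frame (Hann, an assumed choice) and move it to the frequency domain."""
    window = np.hanning(len(frame))      # tapers the frame edges to limit spectral leakage
    return np.fft.rfft(frame * window)   # one-sided spectrum of the real-valued frame


# Illustrative input: a 1 kHz tone sampled at an assumed 16 kHz.
sample_rate = 16000
t = np.arange(512) / sample_rate
frame = np.sin(2 * np.pi * 1000 * t)

spectrum = preprocess_frame(frame)
peak_bin = int(np.argmax(np.abs(spectrum)))
peak_hz = peak_bin * sample_rate / len(frame)
```

The strongest bin of `spectrum` falls at the tone frequency, which is a quick way to check that the window and transform are wired up correctly.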

The input audio signal corresponds to an initial gain coefficient, which may be a system default or set by the user. To process the input audio signal so that the amplitude of the resulting output signal stays within a certain range, the initial gain coefficient is corrected twice. The first correction is based on the energy of the input audio signal: the initial gain coefficient is corrected according to that energy into the intermediate gain coefficient.

Step 12: Perform a second correction on the intermediate gain coefficient to obtain a final gain coefficient.

After the intermediate gain coefficient is obtained, a correction based on the intermediate gain coefficient and the input audio signal yields the final gain coefficient.

Step 13: Obtain an output signal according to the input audio signal and the final gain coefficient.

After the final gain coefficient is acquired, the input audio signal is multiplied by the final gain coefficient, thereby obtaining an output signal.

Unlike the prior art, the present embodiment provides an audio signal automatic gain control method in which the energy of the input audio signal is computed and the initial gain coefficient is adjusted according to that energy to obtain an intermediate gain coefficient; a final gain coefficient is then obtained using the intermediate gain coefficient and the input audio signal, and an output signal is generated. The gain coefficient can thus be adjusted dynamically and adaptively to track fluctuations of the input audio signal quickly, so that the amplitude of the output signal stays within a certain range, the quality of the output signal is improved, and the listener hears stable audio.

Referring to FIG. 2, which is a schematic flowchart of another embodiment of the audio signal automatic gain control method provided in the present application, the method includes:

Step 201: Obtain a loudness gain coefficient table according to a preset loudness range and an equal loudness curve, and store the loudness gain coefficient table.

Human hearing does not perceive all frequencies equally; it follows equal loudness curves. For frequencies across the whole range to sound equally loud, the gain must be adjusted on a loudness scale rather than directly in the frequency domain, with the frequencies of the audio signal weighted according to the equal loudness curve; a single fixed loudness level therefore cannot be used for the weighting. Instead, a loudness level is determined and mapped onto the equal loudness curve to determine the gain coefficient for each frequency. Each loudness level corresponds to a set of gain coefficients, each entry pairing a frequency with its gain coefficient.

For example, as shown in FIG. 3, the equal loudness curve shows the sound pressure level required for sounds of different frequencies to be perceived as equally loud. At a loudness level of 110 phon, for instance, a sound at 5000 Hz needs a sound pressure level of 100 dB to be heard at the same volume as a sound at 200 Hz with a sound pressure level of 110 dB.

The preset loudness range may be [0, Lmax], where Lmax is the maximum adjustable loudness threshold, that is, the maximum loudness level acceptable to the human ear. The frequency and sound pressure level corresponding to each loudness level are obtained from the equal loudness curve. To reduce the processing time of each gain calculation and improve the effective response speed, a mapping between loudness level, frequency, and gain coefficient can be established in advance from the preset loudness range and the equal loudness curve. The result is the loudness gain coefficient table, which contains this mapping and can be stored in a memory.

Step 202: a loudness level is selected from a preset loudness range as the initial loudness factor.

A loudness level between 0 and a maximum adjustable loudness threshold Lmax may be taken as the initial loudness factor.

Step 203: Match the initial loudness factor against the loudness gain coefficient table to obtain an initial gain coefficient.

The loudness gain coefficient table stores the corresponding relation between the loudness level and the gain coefficient, and the initial loudness factor is used as an index and is searched in the loudness gain coefficient table, so that the initial gain coefficient corresponding to the initial loudness factor is obtained.
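Steps 201 to 203 can be sketched as a precomputed lookup table. The application does not state how a gain coefficient is derived from the equal loudness curve, so the rule used here (gain in dB equals the contour sound pressure level minus the loudness level) is purely an assumption for illustration, as are all function and variable names. The only contour data used are the two points quoted in the text for the 110-phon curve.

```python
# Sound pressure level (dB) needed at each frequency (Hz) to reach a given
# loudness level (phon). Only the two points quoted in the description are
# included; a real table would sample a full set of equal loudness contours.
EQUAL_LOUDNESS_SPL = {
    110: {200: 110.0, 5000: 100.0},
}


def build_loudness_gain_table(contours):
    """Map loudness level -> {frequency: linear gain coefficient}.

    Assumed rule (not from the application): gain_dB = contour SPL - loudness level.
    """
    table = {}
    for level, spl_by_freq in contours.items():
        table[level] = {
            freq: 10.0 ** ((spl - level) / 20.0)
            for freq, spl in spl_by_freq.items()
        }
    return table


def lookup_gain(table, loudness_factor):
    """Use the loudness factor as an index into the stored table."""
    return table[loudness_factor]


GAIN_TABLE = build_loudness_gain_table(EQUAL_LOUDNESS_SPL)  # built once, then stored
initial_loudness_factor = 110                               # chosen from the preset range
initial_gain = lookup_gain(GAIN_TABLE, initial_loudness_factor)
```

Building the table once and indexing it afterwards is what removes the per-frame gain computation from the processing path.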

Step 204: Frame the input audio signal to obtain a plurality of first audio frames.

To compute the energy of the input audio signal, the signal is divided into short segments for processing; each short segment is a first audio frame.
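A minimal framing sketch follows. The application does not fix the frame length or say whether frames overlap; non-overlapping frames with the incomplete tail dropped are assumed here, and `split_into_frames` is a hypothetical helper name.

```python
import numpy as np


def split_into_frames(signal: np.ndarray, frame_len: int) -> np.ndarray:
    """Split a 1-D signal into non-overlapping frames, dropping any incomplete tail."""
    n_frames = len(signal) // frame_len
    return signal[: n_frames * frame_len].reshape(n_frames, frame_len)


signal = np.arange(10, dtype=np.float64)
frames = split_into_frames(signal, frame_len=4)  # two frames of four samples each
```

Zero-padding the last frame instead of dropping it would be an equally valid choice when no samples may be lost.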

Step 205: Detect each first audio frame using a voice activity detection method to determine whether it is a valid frame.

Step 206: If the first audio frame is a valid frame, obtain the energy root mean square of each valid frame, and average the energy root mean square of all valid frames in the input audio signal to obtain an average energy root mean square.

Frames with active voice are detected using the voice activity detection method, and the energy Root Mean Square (RMS) of the valid frames is computed; if a first audio frame is judged not to be a valid frame, it is not processed.

Specifically, all first audio frames are run through the voice activity detection method, and those containing active audio are recorded as valid frames; the energy root mean square of each valid frame is then computed, and the values for all valid frames in the input audio signal are summed and averaged to obtain the average energy root mean square of the input audio signal.
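Steps 205 and 206 can be sketched as follows. The application does not name a particular voice activity detection method, so a plain RMS threshold stands in for it here; that stand-in, the threshold value, and the function names are assumptions for illustration only.

```python
import numpy as np


def frame_rms(frame) -> float:
    """Energy root mean square of one frame."""
    return float(np.sqrt(np.mean(np.square(frame))))


def is_valid_frame(frame, vad_threshold=0.01) -> bool:
    """Stand-in VAD (assumption): a frame is valid if its RMS exceeds a threshold."""
    return frame_rms(frame) > vad_threshold


def average_rms_of_valid_frames(frames, vad_threshold=0.01):
    """Average the per-frame RMS over valid frames; None if no frame is valid."""
    values = [frame_rms(f) for f in frames if is_valid_frame(f, vad_threshold)]
    if not values:
        return None
    return sum(values) / len(values)


frames = np.array([
    [0.0, 0.0, 0.0, 0.0],    # silence: not a valid frame, left out of the average
    [0.5, -0.5, 0.5, -0.5],  # RMS 0.5
    [0.1, -0.1, 0.1, -0.1],  # RMS 0.1
])
avg_rms = average_rms_of_valid_frames(frames)
```

Returning `None` when no frame is valid lets the caller leave the gain untouched for a segment that contains no speech.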

Step 207: Perform a first correction on the initial gain coefficient corresponding to the input audio signal according to the energy of the input audio signal, to obtain an intermediate gain coefficient of the input audio signal.

In a specific embodiment, the average energy root mean square is compared with a first preset energy threshold, and the initial gain coefficient is corrected according to the comparison result, as follows:

Step 2071: Determine whether the average energy root mean square is greater than a first preset energy threshold.

To adjust the gain coefficient, the average of the energy root mean square of the valid frames in the input audio signal is compared with the first preset energy threshold.

Step 2072: If the average energy root mean square is greater than the first preset energy threshold, reduce the initial loudness factor to a first intermediate loudness factor.

Step 2073: If the average energy root mean square is less than the first preset energy threshold, increase the initial loudness factor to a second intermediate loudness factor.

The first intermediate loudness factor is less than the second intermediate loudness factor. If the average energy root mean square equals the first preset energy threshold, the initial loudness factor is left unchanged. In other words, the initial loudness factor is adjusted according to how the average energy root mean square compares with the first preset energy threshold: it is lowered when the average energy root mean square is above the threshold and raised when it is below.

Step 2074: Match the first intermediate loudness factor or the second intermediate loudness factor against the loudness gain coefficient table to obtain an intermediate gain coefficient.

The first or second intermediate loudness factor is used as an index into the loudness gain coefficient table to obtain the corresponding intermediate gain coefficient.
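Steps 2071 to 2074 can be sketched as a threshold comparison followed by a table lookup. The step size, the upper bound of 120, the threshold, and the single-band table values below are placeholders invented for this example; none of them comes from the application.

```python
def correct_loudness_factor(factor, avg_rms, energy_threshold, step=1, l_max=120):
    """Lower the loudness factor for loud input, raise it for quiet input.

    step and l_max are hypothetical; the result is kept inside [0, l_max].
    """
    if avg_rms > energy_threshold:
        factor -= step          # first intermediate loudness factor
    elif avg_rms < energy_threshold:
        factor += step          # second intermediate loudness factor
    # equal to the threshold: factor unchanged
    return max(0, min(l_max, factor))


# Placeholder single-band table (loudness level -> gain coefficient).
LOUDNESS_GAIN_TABLE = {59: 0.8, 60: 1.0, 61: 1.25}

initial_factor = 60
intermediate_factor = correct_loudness_factor(initial_factor, avg_rms=0.4,
                                              energy_threshold=0.2)
intermediate_gain = LOUDNESS_GAIN_TABLE[intermediate_factor]
```

Here the input is louder than the threshold, so the factor drops from 60 to 59 and the looked-up gain falls with it.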

In another specific embodiment, the initial gain coefficient may instead be corrected using the average energy root mean square of the time period preceding the current time period, as follows:

Step 2075: Determine whether the average energy root mean square of the input audio signal corresponding to the time period preceding the current time period is greater than a second preset energy threshold.

Each input audio signal corresponds to a time period, and the input audio signal of the time period preceding the one currently being processed can serve as the basis for adjusting the gain coefficient.

Step 2076: If the average energy root mean square of the input audio signal corresponding to the previous time period is greater than the second preset energy threshold, reduce the initial loudness factor to a third intermediate loudness factor.

If the average energy root mean square of the input audio signal corresponding to the previous time period is greater than the second preset energy threshold, the input audio signal of the previous time period was relatively loud. Given the continuity of audio signals, the initial loudness factor can be reduced to a third intermediate loudness factor, which is used as the loudness factor for the input audio signal currently being processed.

Step 2077: If the average energy root mean square of the input audio signal corresponding to the previous time period is less than the second preset energy threshold, increase the initial loudness factor to a fourth intermediate loudness factor.

If the average energy root mean square of the input audio signal corresponding to the previous time period is less than the second preset energy threshold, the input audio signal of the previous time period was relatively quiet. The initial loudness factor can be increased to a fourth intermediate loudness factor, which is used as the loudness factor for the input audio signal currently being processed.

Step 2078: Match the third intermediate loudness factor or the fourth intermediate loudness factor against the loudness gain coefficient table to obtain an intermediate gain coefficient.

The third or fourth intermediate loudness factor is used as an index into the loudness gain coefficient table to obtain the corresponding intermediate gain coefficient.
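The previous-period variant of steps 2075 to 2078 can be sketched as a small stateful helper that remembers the last period's average RMS. The class name, step size, bound, threshold, and the decision to keep the initial factor for the very first period (when no previous period exists) are assumptions for this example.

```python
from typing import Optional


class PreviousPeriodCorrector:
    """Chooses the current period's loudness factor from the previous period's RMS."""

    def __init__(self, initial_factor, energy_threshold, step=1, l_max=120):
        self.initial_factor = initial_factor
        self.energy_threshold = energy_threshold   # second preset energy threshold
        self.step = step                           # hypothetical step size
        self.l_max = l_max                         # hypothetical upper bound
        self.previous_avg_rms: Optional[float] = None

    def factor_for_current_period(self, current_avg_rms: float) -> int:
        factor = self.initial_factor
        prev = self.previous_avg_rms
        if prev is not None:
            if prev > self.energy_threshold:
                factor -= self.step   # third intermediate loudness factor
            elif prev < self.energy_threshold:
                factor += self.step   # fourth intermediate loudness factor
        # Remember this period's RMS for the next call.
        self.previous_avg_rms = current_avg_rms
        return max(0, min(self.l_max, factor))


corrector = PreviousPeriodCorrector(initial_factor=60, energy_threshold=0.2)
factors = [corrector.factor_for_current_period(r) for r in [0.5, 0.1, 0.2, 0.3]]
```

Because the decision depends only on a value that is already available, the factor for the current period can be chosen before its own statistics have been accumulated.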

In this way, the gain coefficient is not adjusted for first audio frames in which voice is inactive, and is coarsely adjusted for first audio frames in which voice is active. The loudness factor is no longer fixed but dynamically adjustable, so the method adapts quickly to fluctuations in the audio while avoiding the amplification of noise as far as possible.

Step 208: Frame the input audio signal to obtain a plurality of second audio frames.

After the intermediate gain coefficient is obtained, the input audio signal is divided into a plurality of second audio frames.

Step 209: Analyze the second audio frames using a voice endpoint detection method to obtain an audio probability value corresponding to each second audio frame.

The audio probability value indicates the probability that audio is present in a second audio frame. A voice endpoint detection (EPD) method is used to determine the probability that each second audio frame contains voice, so that the intermediate gain coefficient can be corrected according to the audio probability value.
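The application does not say how the endpoint detector produces its probability. Purely as a stand-in to make the data flow concrete, the sketch below maps a frame's level in dB through a logistic curve; the mapping itself, the midpoint, the slope, and the function name are all assumptions and are not the detection method of the application.

```python
import math

import numpy as np


def audio_probability(frame, midpoint_db=-40.0, slope=0.5) -> float:
    """Stand-in score in [0, 1]: logistic function of the frame level in dB."""
    rms = float(np.sqrt(np.mean(np.square(frame))))
    level_db = 20.0 * math.log10(max(rms, 1e-10))   # floor avoids log10(0) on silence
    return 1.0 / (1.0 + math.exp(-slope * (level_db - midpoint_db)))


p_silence = audio_probability(np.zeros(160))
p_speech_like = audio_probability(0.5 * np.ones(160))
```

Any detector that returns one value per second audio frame in the range 0 to 1 could replace this function without changing the steps that follow.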

Step 210: Perform a second correction on the intermediate gain coefficient corresponding to each second audio frame according to the audio probability value, to obtain a gain coefficient adjustment table.

The gain coefficient adjustment table includes the plurality of second audio frames and the final gain coefficient corresponding to each of them.

Step 211: Determine whether the audio probability value is greater than a preset probability threshold.

Step 212: If the audio probability value is greater than the preset probability threshold, reduce the intermediate gain coefficient corresponding to the second audio frame to the first final gain coefficient.

Step 213: If the audio probability value is less than the preset probability threshold, increase the intermediate gain coefficient corresponding to the second audio frame to a second final gain coefficient.

The first final gain coefficient and the second final gain coefficient are final gain coefficients in the gain coefficient adjustment table. The gain coefficient is fine-tuned as follows: if the audio probability value is less than the preset probability threshold, the gain coefficient is reduced, that is, the intermediate gain coefficient is reduced to the first final gain coefficient; if the audio probability value is greater than the preset probability threshold, the intermediate gain coefficient is increased to the second final gain coefficient; the first final gain coefficient is less than the second final gain coefficient. If the audio probability value equals the preset probability threshold, no adjustment is made. The adjustment step size of the gain coefficient may be user-defined or a system default. In addition, for amplitude-limiting protection, the maximum gain coefficient must be smaller than the gain coefficient limit set by the system, to avoid over-amplifying the input audio signal.
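The fine adjustment can be sketched as one comparison per frame. The direction used here follows the explanation in the preceding paragraph (a probability below the threshold lowers the gain, one above it raises the gain). The threshold, step size, and gain ceiling are hypothetical values chosen for the example, and the table is reduced to a list with one final gain coefficient per second audio frame.

```python
def second_correction(intermediate_gain, probabilities,
                      prob_threshold=0.5, step=0.05, max_gain=4.0):
    """Return one final gain coefficient per second audio frame.

    prob_threshold, step and max_gain are placeholder values (assumptions).
    """
    adjustment_table = []
    for p in probabilities:
        gain = intermediate_gain
        if p > prob_threshold:
            gain += step        # likely speech: raise the gain slightly
        elif p < prob_threshold:
            gain -= step        # likely noise or silence: lower the gain slightly
        # Amplitude-limiting protection: keep the gain within the system ceiling.
        adjustment_table.append(min(max(gain, 0.0), max_gain))
    return adjustment_table


final_gains = second_correction(1.0, [0.9, 0.1, 0.5])
```

Applying the step per frame is what lets the second correction react within a single frame, while the first correction sets the coarse level for the whole segment.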

Step 214: Multiply each second audio frame by its final gain coefficient in the gain coefficient adjustment table to obtain an output signal.

Each of the second audio frames in the input audio signal is multiplied by its corresponding final gain coefficient in the gain coefficient adjustment table, and the results form the output signal.
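Step 214 amounts to a per-frame multiplication. In the minimal sketch below, the frames are assumed to be non-overlapping rows of a 2-D array and the adjustment table is assumed to be a list with one gain per row; `apply_gain_table` is a hypothetical helper name.

```python
import numpy as np


def apply_gain_table(frames: np.ndarray, final_gains) -> np.ndarray:
    """Scale each frame by its own final gain and join the frames into one signal."""
    gains = np.asarray(final_gains, dtype=np.float64)[:, np.newaxis]  # one gain per row
    return (frames * gains).reshape(-1)


frames = np.array([[0.1, -0.1],
                   [0.2, -0.2]])
output = apply_gain_table(frames, [2.0, 0.5])
```

With overlapping frames, the scaled frames would need overlap-add reconstruction instead of a plain reshape.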

This embodiment combines two preliminary detections on the audio signal: energy root mean square statistics and voice activity detection. Valid frames are obtained through voice activity detection, energy root mean square statistics are computed over them to obtain an intermediate loudness factor, and the loudness gain coefficient table is queried to obtain an intermediate gain coefficient; this constitutes the first correction of the gain coefficient, and because the loudness factor is dynamically adjustable, voice fluctuations can be tracked quickly. Voice endpoint detection is then used to fine-tune the intermediate gain coefficient into the final gain coefficient, which ensures the reliability of the gain coefficient statistics and improves the adjustment. Because the loudness gain coefficient table is obtained in advance and stored in the memory as a lookup table, the processing time and overall computation needed to calculate the gain coefficient are reduced, and the effective response speed is improved.

Referring to fig. 6, fig. 6 is a schematic structural diagram of an embodiment of an automatic gain control apparatus 60 provided in the present application, where the automatic gain control apparatus includes: a memory 61 and a processor 62 connected to each other, wherein the memory 61 is used for storing a computer program, and the computer program, when executed by the processor 62, is used for implementing the audio signal automatic gain control method in the above-described embodiment.

The processor 62 can dynamically adjust the gain coefficient according to the energy of the input audio signal and adapt the gain to fluctuations in the audio signal, thereby keeping the amplitude of the output signal within a certain range and improving the quality of the output signal.

Referring to fig. 7, fig. 7 is a schematic structural diagram of an embodiment of a computer storage medium 70 provided in the present application, where the computer storage medium 70 is used to store a computer program 71, and the computer program 71, when executed by a processor, is used to implement the audio signal automatic gain control method in the foregoing embodiment.

The storage medium 70 may be any of various media capable of storing program code, such as a server, a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.

In the several embodiments provided in the present application, it should be understood that the disclosed method and apparatus may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative: the division into modules or units is only a logical division, and an actual implementation may use a different division; a plurality of units or components may be combined or integrated into another system, and some features may be omitted or not executed.

Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the embodiment.

In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.

The above embodiments are merely examples and are not intended to limit the scope of the present application. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present application, whether applied directly or indirectly in other related technical fields, is likewise included in the scope of protection of the present application.

Claims

1. A method for automatic gain control of an audio signal, comprising:

performing statistics on the energy of an input audio signal, and performing a first correction on an initial gain coefficient corresponding to the input audio signal according to the energy of the input audio signal to obtain an intermediate gain coefficient of the input audio signal;

performing framing processing on the input audio signal to obtain a plurality of second audio frames;

analyzing the second audio frames by using a voice endpoint detection method to obtain an audio probability value corresponding to each second audio frame, wherein the audio probability value is used for indicating the probability that the second audio frames have audio;

performing secondary correction on the intermediate gain coefficient corresponding to each second audio frame according to the audio probability value to obtain a gain coefficient adjustment table; wherein the gain coefficient adjustment table includes a plurality of the second audio frames and final gain coefficients corresponding to the plurality of the second audio frames, respectively;

and obtaining an output signal according to the input audio signal and the final gain coefficient.

2. The method as claimed in claim 1, wherein the step of performing statistics on the energy of the input audio signal and performing a first correction on the initial gain coefficient corresponding to the input audio signal according to the energy of the input audio signal to obtain the intermediate gain coefficient of the input audio signal comprises:

obtaining a loudness gain coefficient table according to a preset loudness range and an equal loudness curve, and storing the loudness gain coefficient table, wherein the loudness gain coefficient table comprises a mapping relation of loudness level, frequency and gain coefficient;

selecting a loudness level from the preset loudness range as an initial loudness factor;

and matching the initial loudness factor with the loudness gain coefficient table to obtain the initial gain coefficient.

3. The method of claim 2, wherein the step of performing statistics on the energy of the input audio signal comprises:

performing framing processing on the input audio signal to obtain a plurality of first audio frames;

detecting each first audio frame by using a voice activity detection method so as to judge whether each first audio frame is an effective frame;

if so, acquiring the energy root mean square of each effective frame, and averaging the energy root mean square of all the effective frames in the input audio signal to obtain an average energy root mean square.

4. The method as claimed in claim 3, wherein the step of performing a first correction on the initial gain coefficient corresponding to the input audio signal according to the energy of the input audio signal to obtain the intermediate gain coefficient of the input audio signal comprises:

judging whether the average energy root mean square is larger than a first preset energy threshold value or not;

if so, reducing the initial loudness factor to a first intermediate loudness factor; if not, increasing the initial loudness factor to a second intermediate loudness factor;

and matching the first intermediate loudness factor or the second intermediate loudness factor with the loudness gain coefficient table to obtain the intermediate gain coefficient.

5. The method as claimed in claim 3, wherein the step of performing a first correction on the initial gain coefficient corresponding to the input audio signal according to the energy of the input audio signal to obtain the intermediate gain coefficient of the input audio signal comprises:

judging whether the average energy root mean square of the input audio signal corresponding to the time period preceding the current time period is greater than a second preset energy threshold value or not;

if so, reducing the initial loudness factor to a third intermediate loudness factor; if not, increasing the initial loudness factor to a fourth intermediate loudness factor;

and matching the third intermediate loudness factor or the fourth intermediate loudness factor with the loudness gain coefficient table to obtain the intermediate gain coefficient.

6. The method according to claim 1, wherein the step of performing secondary correction on the intermediate gain coefficient corresponding to each of the second audio frames according to the audio probability value to obtain a gain coefficient adjustment table comprises:

judging whether the audio probability value is larger than a preset probability threshold value or not;

if so, reducing the intermediate gain coefficient corresponding to the second audio frame to a first final gain coefficient; if not, increasing the intermediate gain coefficient corresponding to the second audio frame to a second final gain coefficient.

7. The method of claim 1, wherein the step of obtaining an output signal according to the input audio signal and the final gain coefficient comprises:

and multiplying the second audio frame by the final gain coefficient in the gain coefficient adjustment table to obtain the output signal.

8. An automatic gain control device, comprising a memory and a processor connected to each other, wherein the memory is configured to store a computer program, which when executed by the processor is configured to implement the audio signal automatic gain control method of any one of claims 1-7.

9. A computer storage medium storing a computer program, characterized in that the computer program, when executed by a processor, is adapted to carry out the audio signal automatic gain control method of any one of claims 1-7.