CN113113042A

CN113113042A - Audio signal processing method, device, equipment and storage medium

Info

Publication number: CN113113042A
Application number: CN202110383925.9A
Authority: CN
Inventors: 陈敬进
Original assignee: Guangzhou Huiruisitong Technology Co Ltd
Current assignee: Guangzhou Huiruisitong Technology Co Ltd
Priority date: 2021-04-09
Filing date: 2021-04-09
Publication date: 2021-07-13

Abstract

The present disclosure relates to the field of signal processing, and in particular to an audio signal processing method, apparatus, device, and storage medium. The method comprises: acquiring an audio signal to be processed; segmenting the audio signal to obtain at least two stage audio signals; and processing each stage audio signal as follows: when the maximum amplitude value of the stage audio signal is smaller than a first preset amplitude threshold, amplifying the amplitude of the stage audio signal so that the amplified maximum amplitude value is greater than or equal to the first preset amplitude threshold and less than or equal to the transmission amplitude upper limit value of the transmission channel of the audio signal; and when the maximum amplitude value of the stage audio signal is greater than a second preset amplitude threshold, adjusting the amplitude of the stage audio signal so that the adjusted maximum amplitude value is less than or equal to the transmission amplitude upper limit value of the transmission channel of the audio signal. The method addresses the problem that the prior art cannot adjust the amplitude of an audio signal in a timely manner while also avoiding distortion of the audio signal.

Description

Audio signal processing method, device, equipment and storage medium

Technical Field

The present disclosure relates to the field of signal processing, and in particular, to an audio signal processing method, apparatus, device, and storage medium.

Background

In intercom and similar applications, the amplitude of an audio signal needs to be adjusted before transmission so that the signal reaches a suitable level for subsequent processing or use; for example, an amplitude-adjusted audio signal is easier to modulate.

In the prior art, two methods are generally used to adjust the amplitude of an audio signal. One adjusts the signal through a negative feedback loop, which adjusts the amplitude stably. The other compares signal energy: the energy of the audio signal represents its overall amplitude, so the energy is compared with a preset energy threshold, a gain multiple is derived from the comparison result, and the amplitude of the audio signal is then adjusted by that gain multiple.

Disclosure of Invention

Adjusting a signal through negative feedback is stable, but the adjustment takes a long time, which makes it particularly unsuitable for environments with strict timeliness requirements, such as an interphone, and the resulting amplitude adjustment is poor. Adjusting by signal energy comparison preserves timeliness, but the gain multiple acts on the overall amplitude of the audio signal. When the amplitude fluctuates widely within the same section of audio, the large-amplitude components, once multiplied by the gain multiple, can exceed the preset signal amplitude threshold. The part exceeding the threshold is then peak-clipped, which distorts the audio signal, degrades the audio it presents, and again yields a poor amplitude adjustment.

The present disclosure provides an audio signal processing method, apparatus, device, and storage medium to solve the problem that the prior art cannot adjust the amplitude of an audio signal in a timely manner while also avoiding audio signal distortion.

In a first aspect, an embodiment of the present disclosure provides an audio signal processing method, including: acquiring an audio signal to be processed; segmenting the audio signal to obtain at least two stage audio signals; and processing each stage audio signal as follows: when the maximum amplitude value of the stage audio signal is smaller than a first preset amplitude threshold, amplifying the amplitude of the stage audio signal so that the amplified maximum amplitude value is greater than or equal to the first preset amplitude threshold and less than or equal to the transmission amplitude upper limit value of a transmission channel of the audio signal; when the maximum amplitude value of the stage audio signal is greater than a second preset amplitude threshold, adjusting the amplitude of the stage audio signal so that the adjusted maximum amplitude value is less than or equal to the transmission amplitude upper limit value of the transmission channel of the audio signal; wherein the first preset amplitude threshold is smaller than the second preset amplitude threshold.

Optionally, amplifying the amplitude of the stage audio signal when the maximum amplitude value of the stage audio signal is smaller than the first preset amplitude threshold includes: when the maximum amplitude value of the stage audio signal is smaller than the first preset amplitude threshold, taking the ratio of the first preset amplitude threshold to the maximum amplitude value as a first amplitude gain value; and amplifying the amplitude of the stage audio signal by the first amplitude gain value.

Optionally, adjusting the amplitude of the stage audio signal when the maximum amplitude value of the stage audio signal is greater than the second preset amplitude threshold includes: when the maximum amplitude value of the stage audio signal is greater than the second preset amplitude threshold, taking the ratio of the transmission amplitude upper limit value to the maximum amplitude value as a second amplitude gain value; and adjusting the amplitude of the stage audio signal by the second amplitude gain value.

Optionally, after the amplitude of the stage audio signal is adjusted, when its maximum amplitude value is greater than the second preset amplitude threshold, so that the adjusted maximum amplitude value is less than or equal to the transmission amplitude upper limit value of the transmission channel of the audio signal, the method further includes: acquiring boundary sampling points of two adjacent stage audio signals; acquiring N continuous sampling points including the boundary sampling points from the two adjacent stage audio signals, wherein N is an integer greater than 1, and the N continuous sampling points include at least two sampling points belonging to different stage audio signals; and performing interpolation on the audio signal corresponding to the N continuous sampling points.

Optionally, the acquiring the audio signal to be processed includes: receiving a signal to be analyzed; acquiring a first signal energy value of the signal to be analyzed in a preset audio frequency band, and acquiring a second signal energy value of the signal to be analyzed in a preset noise frequency band; and when the ratio of the first signal energy value to the second signal energy value is greater than a preset energy threshold value, taking the signal of the signal to be analyzed in the preset audio signal frequency band as the audio signal to be processed.

Optionally, segmenting the audio signal to obtain at least two stage audio signals includes: grouping the continuous sampling points contained in the audio signal according to a preset number of sampling points to obtain at least two grouping results, wherein each grouping result comprises at least two continuous sampling points; and using the audio signal corresponding to each grouping result as a stage audio signal, wherein the number of sampling points included in a stage audio signal is less than or equal to the preset number of sampling points.

Optionally, the method further comprises: when the maximum amplitude value is greater than the first preset amplitude threshold and smaller than a third preset amplitude threshold, amplifying the amplitude of the stage audio signal according to the ratio of the amplitude value of each sampling point in the stage audio signal to the third preset amplitude threshold, wherein the third preset amplitude threshold is greater than the first preset amplitude threshold and smaller than the second preset amplitude threshold; and when the maximum amplitude value is smaller than the second preset amplitude threshold and greater than a fourth preset amplitude threshold, reducing the amplitude of the stage audio signal according to the ratio of the amplitude value of each sampling point in the stage audio signal to the fourth preset amplitude threshold, wherein the fourth preset amplitude threshold is smaller than the second preset amplitude threshold and greater than the third preset amplitude threshold.

In a second aspect, an embodiment of the present disclosure provides an audio signal processing apparatus, including: an acquisition unit, configured to acquire an audio signal to be processed; a segmenting unit, configured to segment the audio signal to obtain at least two stage audio signals; and a processing unit, configured to process each stage audio signal as follows: when the maximum amplitude value of the stage audio signal is smaller than a first preset amplitude threshold, amplifying the amplitude of the stage audio signal so that the amplified maximum amplitude value is greater than or equal to the first preset amplitude threshold and less than or equal to the transmission amplitude upper limit value of a transmission channel of the audio signal; when the maximum amplitude value of the stage audio signal is greater than a second preset amplitude threshold, adjusting the amplitude of the stage audio signal so that the adjusted maximum amplitude value is less than or equal to the transmission amplitude upper limit value of the transmission channel of the audio signal; wherein the first preset amplitude threshold is smaller than the second preset amplitude threshold.

In a third aspect, an embodiment of the present disclosure provides an electronic device, including a processor, a memory, and a communication bus, wherein the processor and the memory communicate with each other through the communication bus; the memory is configured to store a computer program; and the processor is configured to execute the program stored in the memory to implement the audio signal processing method according to the first aspect.

In a fourth aspect, the present disclosure provides a computer-readable storage medium storing a computer program, which when executed by a processor implements the audio signal processing method according to the first aspect.

Compared with the prior art, the technical scheme provided by the embodiment of the disclosure has the following advantages: the method provided by the embodiment of the disclosure segments the audio signal to be processed to obtain at least two stage audio signals, and processes each stage audio signal respectively. When the maximum amplitude value of the stage audio signal is smaller than a first preset amplitude threshold value, amplifying the amplitude of the stage audio signal so that the amplified maximum amplitude value is larger than or equal to the first preset amplitude threshold value and smaller than or equal to the transmission amplitude upper limit value of a transmission channel of the audio signal; when the maximum amplitude value of the stage audio signal is greater than a second preset amplitude threshold value, adjusting the amplitude of the stage audio signal so that the adjusted maximum amplitude value is less than or equal to the transmission amplitude upper limit value of the transmission channel of the audio signal; wherein the first preset amplitude threshold is smaller than the second preset amplitude threshold.

Compared with the prior art, the method adjusts the overall amplitude of each stage audio signal according to the relationship between the maximum amplitude value of that stage audio signal and the first or second preset amplitude threshold. When the maximum amplitude value of a stage audio signal is small, the amplitude of the stage audio signal is amplified; when it is large, the amplitude of the stage audio signal is reduced. In both cases the maximum amplitude value of each stage audio signal is guaranteed not to exceed the transmission amplitude upper limit value of the transmission channel.

Because the amplitude of each stage audio signal is adjusted directly, without a negative feedback loop, the timeliness of the audio signal processing is ensured. Because the amplitude is adjusted separately for each segment, rather than amplified or reduced by one common gain factor, the method avoids the situation in which amplifying a small-amplitude signal also amplifies a large-amplitude signal, which is then peak-clipped and distorted. Because the maximum amplitude value of the stage audio signal is compared with the first or second preset amplitude threshold, the amplitude is adjusted according to an amplitude comparison rather than an energy comparison. This gives a better amplitude adjustment, improves the audio presented by the audio signal, and further improves the user's listening experience.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.

In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. Those skilled in the art can obtain other drawings from these drawings without inventive effort.

Fig. 1 is a schematic diagram illustrating steps of an implementation process of an audio signal processing method provided in an embodiment of the present disclosure;

Fig. 2 is a schematic diagram of the steps of a process for acquiring an audio signal to be processed according to an embodiment of the present disclosure;

Fig. 3 is a schematic diagram of the flow steps for segmenting an audio signal provided in an embodiment of the present disclosure;

Fig. 4 is a schematic flow chart illustrating a smoothing process provided in an embodiment of the present disclosure;

Fig. 5 is a first schematic structural connection diagram of an audio signal processing apparatus according to an embodiment of the present disclosure;

Fig. 6 is a schematic structural connection diagram of an audio signal processing apparatus according to an embodiment of the present disclosure;

Fig. 7 is a schematic diagram of a structural connection of an electronic device provided in an embodiment of the present disclosure;

Fig. 8 is a first schematic connection diagram of an intercom provided in the embodiment of the present disclosure;

Fig. 9 is a second schematic structural connection diagram of an intercom provided in the embodiment of the present disclosure;

Fig. 10 is a third schematic connection diagram of an intercom provided in the embodiment of the present disclosure.

Detailed Description

To make the objects, technical solutions, and advantages of the embodiments of the present disclosure clearer, the technical solutions of the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, but not all, embodiments of the present disclosure. All other embodiments that a person skilled in the art can derive from the embodiments disclosed herein without creative effort fall within the protection scope of the present disclosure.

The audio processing method provided in the embodiment of the present disclosure is implemented in a device for processing an audio signal, where the device may be a dedicated device for performing audio processing, such as an intercom, or may also be an intelligent device integrating an audio processing function, such as a smart phone, and the scope of protection of the present disclosure is not limited by the specific type of the device implementing the method.

In one embodiment, as shown in Fig. 1, the implementation flow of the audio signal processing method mainly includes the following steps:

Step 101, an audio signal to be processed is acquired.

In this embodiment, the audio signal to be processed is a signal that actually carries audio content, as opposed to a signal that carries none, such as a noise signal. For example, when an intercom is used, the voice signal that the intercom needs to transmit is the audio signal to be processed.

In one embodiment, the signal to be analyzed that the device receives may contain both the audio signal to be processed and a noise signal generated by the environment. Environmental noise is present at all times, whereas the audio signal to be processed is generated only when needed, so it is necessary to determine whether the received signal to be analyzed contains the audio signal to be processed. The audio signal to be processed lies in a specific frequency band, while the noise signal is present across the full frequency band. Whether the received signal contains the required audio signal can therefore be determined by examining different frequency bands, after which the audio signal to be processed is obtained.

As shown in Fig. 2, the process of acquiring the audio signal to be processed includes the following steps:

Step 201, a signal to be analyzed is received.

The signal to be analyzed is the signal received by the device. It may contain only a noise signal, or it may contain both the audio signal to be processed and a noise signal.

Step 202, a first signal energy value of the signal to be analyzed in a preset audio frequency band is obtained, and a second signal energy value of the signal to be analyzed in a preset noise frequency band is obtained.

A preset audio frequency band and a preset noise frequency band can be determined according to the frequency band of the audio signal to be processed, the sampling rate of the audio signal, and similar factors. The signal energy in the preset audio frequency band is calculated to obtain the first signal energy value, and the signal energy in the preset noise frequency band is calculated to obtain the second signal energy value.

For example, the frequency band in which a voice signal can be perceived by humans is taken as 300 Hz to 3000 Hz. When the sampling rate of the voice signal is 8 kHz (so the highest representable frequency is 4000 Hz), the preset audio frequency band is set to 300 Hz to 3000 Hz and the preset noise frequency band is set to 3000 Hz to 4000 Hz. The same signal is separated into the two frequency bands by a first filter and a second filter: the signal to be analyzed passes through the two filters in parallel, the first filter extracts the component between 300 Hz and 3000 Hz, and the second filter extracts the component between 3000 Hz and 4000 Hz. The energy in the 300 Hz to 3000 Hz band is calculated as the first signal energy value, and the energy in the 3000 Hz to 4000 Hz band is calculated as the second signal energy value.

Step 203, when the ratio of the first signal energy value to the second signal energy value is greater than the preset energy threshold, taking the signal of the signal to be analyzed in the preset audio signal frequency band as the audio signal to be processed.

The preset energy threshold may be a value set according to actual conditions and needs, or a value obtained through experiments or empirical calculation, and the protection range of the present disclosure is not limited by the setting manner of the preset energy threshold.

After the ratio of the first signal energy value to the second signal energy value is obtained, the ratio can be directly compared with a preset energy threshold value, or the ratio can be compared with the preset energy threshold value after the ratio is processed according to a preset rule. For example, the ratio of the first signal energy value and the second signal energy value is processed by the following preset formula:

processing result = 10 × log10(first signal energy value / second signal energy value);

The processing result obtained by applying the formula to the first signal energy value and the second signal energy value is compared with the preset energy threshold. If the processing result is not greater than the preset energy threshold, the signal to be analyzed contains no required audio signal, and no subsequent signal processing is performed. If the processing result is greater than the preset energy threshold, the signal to be analyzed contains the required audio signal, and the portion of the signal to be analyzed in the preset audio frequency band is used as the audio signal to be processed.
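Steps 201 to 203 can be sketched as follows. This is a minimal illustration, not the implementation described in the disclosure: it replaces the first and second filters with a naive DFT and sums the squared bin magnitudes in each band. The function names `band_energy` and `contains_audio` and the default threshold of 10 dB are assumptions made for the example.

```python
import cmath
import math


def band_energy(samples, sample_rate, f_low, f_high):
    """Sum |X[k]|^2 over DFT bins whose frequency lies in [f_low, f_high)."""
    n = len(samples)
    energy = 0.0
    for k in range(n // 2 + 1):
        freq = k * sample_rate / n
        if f_low <= freq < f_high:
            # Naive DFT bin; adequate for short illustrative frames only.
            x_k = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                      for i, s in enumerate(samples))
            energy += abs(x_k) ** 2
    return energy


def contains_audio(samples, sample_rate=8000, threshold_db=10.0):
    """True when 10*log10(E_audio / E_noise) exceeds threshold_db.

    The 300-3000 Hz and 3000-4000 Hz bands follow the example in the text;
    threshold_db is a hypothetical value.
    """
    e_audio = band_energy(samples, sample_rate, 300, 3000)
    e_noise = band_energy(samples, sample_rate, 3000, 4000)
    if e_audio == 0.0:
        return False  # nothing in the audio band
    if e_noise == 0.0:
        return True   # ratio is unbounded; treat as audio present
    return 10.0 * math.log10(e_audio / e_noise) > threshold_db
```

A 1000 Hz tone sampled at 8 kHz falls in the audio band and is accepted, whereas a 3600 Hz tone falls in the noise band and is rejected.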

In one embodiment, when the audio signal to be processed is generated, a sub-audio signal is added to the audio signal, where the sub-audio signal refers to a signal with a frequency less than 300Hz and is a sinusoidal signal, and the specific frequency is set by the upper control layer. And after receiving the audio signal, separating the sub-audio signal, comparing the sub-audio signal with the preset component corresponding to the pre-stored sub-audio signal, and if the comparison result is consistent, determining that the signal is the audio signal to be processed.
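One possible way to check an already separated sub-audio signal against an expected tone frequency is sketched below. The disclosure does not say how the comparison is performed; the Goertzel recurrence used here, the function names, and the `min_fraction` parameter are all assumptions chosen for illustration.

```python
import math


def goertzel_power(samples, sample_rate, target_hz):
    """Squared magnitude of the component at target_hz (Goertzel recurrence)."""
    coeff = 2.0 * math.cos(2.0 * math.pi * target_hz / sample_rate)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def matches_sub_audio_tone(sub_audio, sample_rate, expected_hz, min_fraction=0.5):
    """True when most of the sub-audio energy sits at expected_hz.

    min_fraction is a hypothetical acceptance level, not a value from the text.
    """
    n = len(sub_audio)
    energy = sum(x * x for x in sub_audio)
    if n == 0 or energy == 0.0:
        return False
    # A pure sinusoid at expected_hz gives power close to energy * n / 2.
    fraction = goertzel_power(sub_audio, sample_rate, expected_hz) / (energy * n / 2.0)
    return fraction >= min_fraction
```

With one second of samples at 8 kHz, a 67 Hz sinusoid matches an expected frequency of 67 Hz, while a 100 Hz sinusoid does not.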

In one embodiment, the audio signal to be processed needs to be transmitted from the transmitting-end interphone to the receiving-end interphone. At this time, the audio signal to be processed needs to be pre-adjusted according to the characteristics and the transmission requirements of the intercom transmission channel. The pre-adjustment process comprises pre-emphasis, interpolation up-sampling and anti-image frequency filtering.

Pre-emphasis refers to amplifying the high-frequency components of the signal to compensate for the high-frequency attenuation caused by the channel. The pre-emphasis is implemented using a filter described by the following difference equation:

y(n) = x(n) - 0.92x(n-1), n = 0, 1, 2, …, N-1;

where y(n) represents the filter output, x(n) represents the filter input, N is the frame length, and n is the index of each sampling point within the frame.
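The pre-emphasis equation above can be sketched directly in code. The function name `pre_emphasis` is hypothetical, and treating the sample before the start of the frame as zero is an assumption, since the text does not say how x(-1) is handled.

```python
def pre_emphasis(frame, alpha=0.92):
    """Apply y(n) = x(n) - alpha * x(n-1) to one frame.

    Assumes x(-1) = 0 at the start of the frame (not specified in the text).
    """
    out = []
    prev = 0.0
    for x in frame:
        out.append(x - alpha * prev)
        prev = x
    return out
```

A constant input is strongly attenuated after the first sample (each later output is 0.08 times the input), while a sample-to-sample alternating input is boosted, which is the high-frequency emphasis the filter is meant to provide.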

Interpolation upsampling refers to upsampling a signal and then interpolating it, and is used to increase the sampling rate of the signal. Specifically, (up-sampling multiple - 1) additional sampling points are inserted between every two adjacent sampling points.
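A minimal sketch of interpolation upsampling follows. The text does not state which interpolation is used, so linear interpolation is assumed here, and the function name `upsample_linear` is hypothetical. The anti-image low-pass filter described next is not included.

```python
def upsample_linear(samples, factor):
    """Insert (factor - 1) linearly interpolated points between adjacent samples."""
    if factor < 1:
        raise ValueError("factor must be >= 1")
    if not samples:
        return []
    out = []
    for a, b in zip(samples, samples[1:]):
        for j in range(factor):
            # j = 0 reproduces the original sample a.
            out.append(a + (b - a) * j / factor)
    out.append(samples[-1])
    return out
```

For an input of M samples the output has (M - 1) × factor + 1 samples, so the three samples 0, 3, 0 upsampled by a factor of 3 become 0, 1, 2, 3, 2, 1, 0.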

Anti-image frequency filtering refers to filtering out image frequency interference, that is, harmonic components that interfere with the main frequency. A low-pass filter is used as the anti-image frequency filter to remove the image frequency interference.

In this embodiment, through the pre-adjustment process, the audio signal to be processed is subjected to the processes of pre-emphasis, interpolation up-sampling, anti-image-frequency filtering, and the like, so that the signal characteristics of the audio signal are more obvious, the subsequent continuous processing of the signal is facilitated, and the final audio presentation effect of the audio signal is improved.

Step 102, segmenting the audio signal to obtain at least two stage audio signals.

Segmenting the audio signal means dividing a longer audio signal into two or more shorter stages for processing. The amplitude fluctuation within a stage audio signal is smaller than the amplitude fluctuation of the whole audio signal, which makes amplitude adjustment easier.

In one embodiment, as shown in Fig. 3, the audio signal is segmented to obtain at least two stage audio signals, and the specific process includes the following steps:

Step 301, grouping the continuous sampling points included in the audio signal according to a preset number of sampling points to obtain at least two grouping results, wherein each grouping result comprises at least two continuous sampling points;

Step 302, using the audio signal corresponding to each grouping result as a stage audio signal, wherein the number of sampling points included in a stage audio signal is less than or equal to the preset number of sampling points.

For example, suppose the preset number of sampling points is set to 160. If the audio signal includes 1910 sampling points, the 1st to the 160th sampling points form the first group, the 161st to the 320th sampling points form the second group, and so on, so that the audio signal is divided into 12 groups. The 12th group contains the 1761st to the 1910th sampling points, a total of 150 sampling points.
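The grouping in steps 301 and 302 can be sketched as a simple fixed-length split. The function name `split_into_stages` is hypothetical; note that this sketch does not enforce the requirement that every group contain at least two sampling points, so a trailing group of one sample would need separate handling.

```python
def split_into_stages(samples, stage_len=160):
    """Split samples into consecutive groups of at most stage_len points."""
    if stage_len < 2:
        raise ValueError("stage_len must be at least 2")
    return [samples[i:i + stage_len] for i in range(0, len(samples), stage_len)]


# Usage with the numbers from the example above: 1910 points, 160 per stage.
stages = split_into_stages(list(range(1, 1911)), 160)
```

With these inputs `stages` holds 12 groups; the last one starts at sampling point 1761 and contains 150 points.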

In this embodiment, the number of the preset sampling points may be set according to actual conditions and needs, and the number of the preset sampling points may be a value obtained according to an experiment or a value set according to experience. The protection scope of the present disclosure is not limited to the specific number of the preset sampling points.

Step 103, respectively processing the audio signal of each stage as follows:

when the maximum amplitude value of the stage audio signal is smaller than a first preset amplitude threshold value, amplifying the amplitude of the stage audio signal so that the amplified maximum amplitude value is larger than or equal to the first preset amplitude threshold value and smaller than or equal to the transmission amplitude upper limit value of a transmission channel of the audio signal;

when the maximum amplitude value of the stage audio signal is greater than a second preset amplitude threshold value, adjusting the amplitude of the stage audio signal so that the adjusted maximum amplitude value is less than or equal to the transmission amplitude upper limit value of the transmission channel of the audio signal; wherein the first preset amplitude threshold is smaller than the second preset amplitude threshold.

In this embodiment, the second preset amplitude threshold is less than or equal to the transmission amplitude upper limit value of the transmission channel of the audio signal.

The maximum amplitude value of the stage audio signal is compared with the first preset amplitude threshold and the second preset amplitude threshold to determine whether the amplitude of the stage audio signal needs to be adjusted and, if so, how.

When the maximum amplitude value of the stage audio signal is smaller than the first preset amplitude threshold, the overall amplitude of the stage audio signal is small, and the amplitude of the stage audio signal is amplified. When the maximum amplitude value of the stage audio signal is greater than the second preset amplitude threshold, the overall amplitude of the stage audio signal is large and may exceed the transmission amplitude upper limit value of the transmission channel of the audio signal, so the stage audio signal is adjusted to keep its amplitude within the transmission amplitude upper limit value and avoid peak clipping. When the maximum amplitude value of the stage audio signal is greater than the first preset amplitude threshold and smaller than the second preset amplitude threshold, the overall amplitude of the stage audio signal is already appropriate, and no amplitude adjustment is performed.

In this embodiment, when the maximum amplitude value of the staged audio signal is smaller than the first preset amplitude threshold, the maximum amplitude value of the staged audio signal may be adjusted to the first preset amplitude threshold, or the maximum amplitude value of the staged audio signal may be adjusted to any amplitude value from the first preset amplitude threshold to the second preset amplitude threshold. Similarly, when the maximum amplitude value of the staged audio signal is greater than the second preset amplitude threshold, the maximum amplitude value of the staged audio signal may be adjusted to the second preset amplitude threshold, the maximum amplitude value of the staged audio signal may also be adjusted to the transmission amplitude upper limit, and the maximum amplitude value of the staged audio signal may also be adjusted to any one of the amplitude values from the first preset amplitude threshold to the second preset amplitude threshold.

In this embodiment, the first preset amplitude threshold and the second preset amplitude threshold may be set according to actual conditions and needs, or may be calculated and obtained according to preset rules. For example, the first preset amplitude threshold may be obtained by the following formula:

first preset amplitude threshold = 0.1 × transmission amplitude upper limit value;

the second preset amplitude threshold may be obtained by the following formula:

second preset amplitude threshold = 0.9 × transmission amplitude upper limit value.
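The threshold formulas and the three-way decision described above can be sketched together. The function name `classify_stage`, the returned labels, and the reading of "maximum amplitude value" as the largest absolute sample value are assumptions made for this example.

```python
def classify_stage(stage, upper_limit, low_ratio=0.1, high_ratio=0.9):
    """Decide how a stage audio signal should be handled.

    Thresholds follow the example formulas: 0.1 and 0.9 times the
    transmission amplitude upper limit value.
    """
    first_threshold = low_ratio * upper_limit
    second_threshold = high_ratio * upper_limit
    # Assumption: "maximum amplitude value" is the peak absolute sample value.
    peak = max(abs(s) for s in stage)
    if peak < first_threshold:
        return "amplify"
    if peak > second_threshold:
        return "limit"
    return "keep"
```

Assuming, for illustration only, an upper limit of 32767 (a 16-bit sample range), the thresholds are about 3276.7 and 29490.3, so a stage peaking at 200 is amplified, a stage peaking at 30000 is limited, and a stage peaking at 10000 is left unchanged.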

In one embodiment, when the maximum amplitude value of the staged audio signal is smaller than the first preset amplitude threshold, the amplitude of the staged audio signal is amplified, which is implemented as follows: when the maximum amplitude value of the phase audio signal is smaller than a first preset amplitude threshold value, taking the ratio of the first preset amplitude threshold value to the maximum amplitude value as a first amplitude gain value; the amplitude of the audio signal is amplified in the stage by a first amplitude gain value.

In this embodiment, when the maximum amplitude value of the staged audio signal is smaller than the first preset amplitude threshold, the maximum amplitude value of the staged audio signal may be amplified to the first preset amplitude threshold. And the adjustment of the audio signal in the stage with smaller amplitude is realized.

In one embodiment, when the maximum amplitude value of the stage audio signal is greater than the second preset amplitude threshold, the amplitude of the stage audio signal is adjusted as follows: the ratio of the transmission amplitude upper limit value to the maximum amplitude value is taken as a second amplitude gain value, and the amplitude of the stage audio signal is adjusted by the second amplitude gain value.

In this embodiment, when the maximum amplitude value of the stage audio signal is greater than the second preset amplitude threshold, the maximum amplitude value of the stage audio signal may be adjusted to the transmission amplitude upper limit value. This realizes the adjustment of stage audio signals whose amplitude is large.
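The second amplitude gain value can be sketched in the same way. The function name `limit_loud_stage` and the numeric values are hypothetical. Note that, with this ratio, a stage whose maximum lies between the second preset amplitude threshold and the upper limit is slightly amplified (gain above 1), while a stage whose maximum exceeds the upper limit is attenuated (gain below 1).

```python
def limit_loud_stage(stage, second_threshold, upper_limit):
    """Scale a loud stage so that its peak equals upper_limit.

    Returns (adjusted_samples, gain). Stages whose peak does not exceed
    second_threshold are returned unchanged with a gain of 1.0.
    """
    peak = max((abs(s) for s in stage), default=0)
    if peak <= second_threshold:
        return list(stage), 1.0
    gain = upper_limit / peak  # second amplitude gain value
    return [s * gain for s in stage], gain


# Peak 2000 exceeds the upper limit of 1000, so the gain is 0.5.
adjusted, gain = limit_loud_stage([2000, -500], 900.0, 1000.0)
```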

In an embodiment, after the amplitude of a stage audio signal whose maximum amplitude value is greater than the second preset amplitude threshold has been adjusted so that the adjusted maximum amplitude value is less than or equal to the transmission amplitude upper limit value of the transmission channel of the audio signal, two adjacent stage audio signals may have been adjusted in different ways. The amplitude difference at the boundary between the two stage audio signals may then be large, which makes the audio signal discontinuous and in turn makes the sound discontinuous. The boundary therefore needs to be smoothed. As shown in fig. 4, the specific processing procedure is as follows:

step 401, acquiring boundary sampling points of two adjacent stage audio signals;

step 402, acquiring N continuous sampling points including the boundary sampling points from the two adjacent stage audio signals, wherein N is an integer greater than 1, and the N continuous sampling points include at least two sampling points belonging to different stage audio signals;

step 403, performing interpolation on the audio signals corresponding to the N consecutive sampling points.

In this embodiment, the value of N may be set according to actual conditions and needs. It may be a fixed value, or it may be set by a preset rule according to the amplitude gain values of the two adjacent stage audio signals: the larger the difference between the two amplitude gain values, the larger the value of N. When the difference between the two amplitude gain values is smaller than a preset difference threshold, the smoothing process may be omitted.
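One possible preset rule of this kind is sketched below. The disclosure does not specify the rule, so everything here is an assumption made for illustration: the function name `choose_window`, the proportional mapping from gain difference to N, and the constants `skip_below`, `samples_per_unit_gain` and `max_n`.

```python
def choose_window(g1, g2, skip_below=0.05, samples_per_unit_gain=40, max_n=80):
    """Pick the smoothing window size N from two adjacent stage gains.

    Returns 0 when the gain difference is below skip_below (smoothing is
    omitted); otherwise returns an integer N with 2 <= N <= max_n that
    grows with the gain difference. All constants are hypothetical.
    """
    diff = abs(g1 - g2)
    if diff < skip_below:
        return 0
    n = int(round(diff * samples_per_unit_gain))
    return max(2, min(n, max_n))
```

With these illustrative constants, gains of 2.0 and 1.0 give N = 40, while gains of 1.0 and 1.01 are treated as close enough that no smoothing is performed.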

In a specific embodiment, each of the two stage audio signals includes 160 sampling points, the value of N is fixed at 40, and the amplitude gain values of the two stage audio signals are G1 and G2, respectively. Interpolation is then performed on the 20 sampling points of the first stage audio signal and the 20 sampling points of the second stage audio signal that adjoin the boundary and include the boundary sampling points. The interpolation unit is calculated as follows:

interpolation unit = (G1 - G2) / 40.
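The disclosure gives the interpolation unit but not the exact way it is applied. One plausible reading, sketched below under that assumption, is that the gain is stepped by one interpolation unit per sampling point across the N sampling points, so that it moves linearly from G1 to G2 over the boundary. The function name `smooth_boundary` and the choice to operate on the unadjusted stages are hypothetical.

```python
def smooth_boundary(stage_a, stage_b, g1, g2, n=40):
    """Apply gains g1 and g2 to two adjacent raw stages, ramping the gain
    linearly over n samples centred on the boundary (assumed scheme)."""
    if n < 2:
        raise ValueError("n must be an integer greater than 1")
    half = n // 2
    if half > len(stage_a) or n - half > len(stage_b):
        raise ValueError("stages are too short for the requested window")
    unit = (g1 - g2) / n                 # interpolation unit
    out_a = [s * g1 for s in stage_a]    # fixed gain outside the window
    out_b = [s * g2 for s in stage_b]
    start = len(stage_a) - half          # first smoothed index in stage_a
    for k in range(n):
        gain = g1 - unit * (k + 1)       # reaches g2 at the last window sample
        if k < half:
            out_a[start + k] = stage_a[start + k] * gain
        else:
            out_b[k - half] = stage_b[k - half] * gain
    return out_a, out_b


# Two 160-sample stages of constant amplitude 1.0 with gains 2.0 and 1.0.
out_a, out_b = smooth_boundary([1.0] * 160, [1.0] * 160, 2.0, 1.0, n=40)
```

In this example the output falls from 2.0 to 1.0 in 40 equal steps of 0.025 instead of jumping at the boundary.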

In the embodiments described above, the amplitude gain value used to adjust the amplitude of a stage audio signal is a fixed value for the whole stage, that is, the adjustment is linear. In one embodiment, more amplitude thresholds may additionally be set, and nonlinear adjustment may be performed under the different amplitude thresholds. The specific adjustment process is as follows:

when the maximum amplitude value is larger than a first preset amplitude threshold value and is smaller than a third preset amplitude threshold value, amplifying the amplitude of the stage audio signal according to the ratio of the amplitude value of each sampling point in the stage audio signal to the third preset amplitude threshold value, wherein the third preset amplitude threshold value is larger than the first preset amplitude threshold value and smaller than the second preset amplitude threshold value;

and when the maximum amplitude value is smaller than a second preset amplitude threshold value and the maximum amplitude value is larger than a fourth preset amplitude threshold value, reducing the amplitude of the stage audio signal according to the ratio of the amplitude value of each sampling point in the stage audio signal to the fourth preset amplitude threshold value, wherein the fourth preset amplitude threshold value is smaller than the second preset amplitude threshold value and larger than the third preset amplitude threshold value.

In this embodiment, the third preset amplitude threshold and the fourth preset amplitude threshold may be set according to actual conditions and needs, or may be calculated and obtained according to preset rules. For example, the third preset amplitude threshold may be obtained by the following formula:

third preset amplitude threshold = 0.15 × transmission amplitude upper limit value;

the fourth preset amplitude threshold may be obtained by the following formula:

fourth preset amplitude threshold = 0.85 × transmission amplitude upper limit value.

In a specific embodiment, when the amplitude of the stage audio signal is amplified according to the ratio of the amplitude value of each sampling point in the stage audio signal to the third preset amplitude threshold, the amplitude gain value applied to each sampling point may be obtained by the following formula:

amplitude gain value = log2(x + 1), where x represents the ratio of the third preset amplitude threshold to the amplitude value of the sampling point being adjusted.

Similarly, when the amplitude of the stage audio signal is reduced according to the ratio of the amplitude value of each sampling point in the stage audio signal to the fourth preset amplitude threshold, the amplitude gain value applied to each sampling point may be obtained by the following formula:

amplitude gain value = log2(y + 1), where y represents the ratio of the fourth preset amplitude threshold to the amplitude value of the sampling point being adjusted.
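The two formulas share one form, which can be sketched as a single helper. The function names `log_gain` and `apply_log_gain` are hypothetical, and returning a gain of 1.0 for a zero-valued sampling point is an added assumption, since the ratio is undefined there. When the amplitude of a sampling point equals the threshold the ratio is 1 and the gain is log2(2) = 1; a smaller amplitude gives a gain above 1 (amplification) and a larger amplitude gives a gain below 1 (reduction).

```python
import math


def log_gain(sample, threshold):
    """Per-sample gain log2(r + 1), where r = threshold / |sample|."""
    if sample == 0:
        return 1.0  # assumption: a zero sample stays zero whatever the gain
    return math.log2(threshold / abs(sample) + 1.0)


def apply_log_gain(stage, threshold):
    """Apply the per-sample logarithmic gain to every sample of a stage."""
    return [s * log_gain(s, threshold) for s in stage]


# Hypothetical third preset amplitude threshold of 150.
adjusted = apply_log_gain([50, -150, 0], 150.0)
```

Here the sampling point of amplitude 50 receives a gain of log2(4) = 2, while the sampling point already at the threshold is left unchanged.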

In this embodiment, the nonlinear gain can be understood as acting like a speed bump: the gain is flattened, but the signal is still amplified. When the signal is far from the set amplitude, it may be amplified quickly by the linear gain; when it is close to the set amplitude, it may be amplified slowly by the nonlinear gain. Because the nonlinear gain is computed directly from a gain function, this process retains a processing-speed advantage over adjustment by a negative feedback loop. At the same time, it reduces the gain difference between stages and the number of sampling points that need to be smoothed by interpolation, which further prevents distortion of the audio signal.

The audio signal processing method provided by the present disclosure segments an audio signal to be processed to obtain at least two stage audio signals, and processes each stage audio signal respectively. When the maximum amplitude value of the stage audio signal is smaller than a first preset amplitude threshold value, amplifying the amplitude of the stage audio signal so that the amplified maximum amplitude value is larger than or equal to the first preset amplitude threshold value and smaller than or equal to the transmission amplitude upper limit value of a transmission channel of the audio signal; when the maximum amplitude value of the stage audio signal is greater than a second preset amplitude threshold value, adjusting the amplitude of the stage audio signal so that the adjusted maximum amplitude value is less than or equal to the transmission amplitude upper limit value of the transmission channel of the audio signal; wherein the first preset amplitude threshold is smaller than the second preset amplitude threshold.

Through the pre-adjustment process, the audio signal to be processed undergoes pre-emphasis, interpolation up-sampling, anti-image filtering and similar processing, so that the signal characteristics of the audio signal are more pronounced, which facilitates subsequent processing of the signal and improves the final audio presentation effect of the audio signal.

Smoothing two adjacent stage audio signals solves the problem that a large amplitude difference at the boundary between the two stage audio signals makes the audio signal, and therefore the sound, discontinuous, and improves the audio presentation effect of the audio signal.

Because the nonlinear gain is computed through a gain function, the method retains a processing-speed advantage over adjustment by a negative feedback loop, while reducing the gain difference between stages and the number of sampling points that need to be smoothed by interpolation, which further prevents distortion of the audio signal.

Based on the same concept, an embodiment of the present disclosure provides an audio signal processing apparatus. For the specific implementation of the apparatus, reference may be made to the description of the method embodiment, and repeated details are not described again. As shown in fig. 5, the apparatus mainly includes:

an obtaining unit 501, configured to obtain an audio signal to be processed;

a segmenting unit 502, configured to segment the audio signal to obtain at least two stage audio signals;

a processing unit 503, configured to perform the following processing on each stage audio signal: when the maximum amplitude value of the stage audio signal is smaller than a first preset amplitude threshold value, amplifying the amplitude of the stage audio signal so that the amplified maximum amplitude value is larger than or equal to the first preset amplitude threshold value and smaller than or equal to the transmission amplitude upper limit value of a transmission channel of the audio signal; when the maximum amplitude value of the stage audio signal is greater than a second preset amplitude threshold value, adjusting the amplitude of the stage audio signal so that the adjusted maximum amplitude value is less than or equal to the transmission amplitude upper limit value of the transmission channel of the audio signal; wherein the first preset amplitude threshold is smaller than the second preset amplitude threshold.

In an embodiment, the processing unit 503 is specifically configured to, when the maximum amplitude value of the stage audio signal is smaller than the first preset amplitude threshold, take the ratio of the first preset amplitude threshold to the maximum amplitude value as a first amplitude gain value, and amplify the amplitude of the stage audio signal by the first amplitude gain value.

In an embodiment, the processing unit 503 is specifically configured to, when the maximum amplitude value of the stage audio signal is greater than the second preset amplitude threshold, take the ratio of the transmission amplitude upper limit value to the maximum amplitude value as a second amplitude gain value, and adjust the amplitude of the stage audio signal by the second amplitude gain value.

In one embodiment, as shown in fig. 6, the audio signal processing apparatus further includes an interpolation unit 504.

The interpolation unit 504 is configured to, after the amplitude of a stage audio signal whose maximum amplitude value is greater than the second preset amplitude threshold has been adjusted so that the adjusted maximum amplitude value is less than or equal to the transmission amplitude upper limit value of the transmission channel of the audio signal, acquire boundary sampling points of two adjacent stage audio signals; acquire N continuous sampling points including the boundary sampling points from the two adjacent stage audio signals, wherein N is an integer greater than 1, and the N continuous sampling points include at least two sampling points belonging to different stage audio signals; and perform interpolation on the audio signals corresponding to the N continuous sampling points.

In one embodiment, the obtaining unit 501 is specifically configured to receive a signal to be analyzed; acquiring a first signal energy value of a signal to be analyzed in a preset audio frequency band and acquiring a second signal energy value of the signal to be analyzed in a preset noise frequency band; and when the ratio of the first signal energy value to the second signal energy value is greater than a preset energy threshold value, taking the signal of the signal to be analyzed in a preset audio signal frequency band as the audio signal to be processed.
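The energy-ratio decision described here can be sketched with a plain discrete Fourier transform. This is an illustration under stated assumptions rather than the method of the disclosure: the function names `band_energy` and `is_audio`, the use of squared DFT magnitudes as the signal energy value, the band edges, the threshold of 10.0, and the handling of a zero noise energy are all hypothetical. The sketch only makes the decision; extracting the signal in the preset audio frequency band is not shown.

```python
import cmath
import math


def band_energy(signal, sample_rate, f_lo, f_hi):
    """Sum of squared DFT magnitudes over bins with f_lo <= freq <= f_hi."""
    n = len(signal)
    energy = 0.0
    for k in range(n // 2 + 1):
        freq = k * sample_rate / n
        if f_lo <= freq <= f_hi:
            coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                        for i, x in enumerate(signal))
            energy += abs(coeff) ** 2
    return energy


def is_audio(signal, sample_rate, audio_band, noise_band, energy_threshold):
    """True when audio-band energy / noise-band energy exceeds the threshold."""
    e_audio = band_energy(signal, sample_rate, *audio_band)
    e_noise = band_energy(signal, sample_rate, *noise_band)
    if e_noise == 0:
        return e_audio > 0  # assumption: no measurable noise at all
    return e_audio / e_noise > energy_threshold


# Hypothetical test tones sampled at 8000 Hz: 1000 Hz and 3800 Hz.
SAMPLE_RATE = 8000
speech_like = [math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE) for i in range(80)]
hiss_like = [math.sin(2 * math.pi * 3800 * i / SAMPLE_RATE) for i in range(80)]
```

With an assumed audio band of 300 Hz to 3400 Hz and an assumed noise band of 3500 Hz to 4000 Hz, the 1000 Hz tone is accepted as audio and the 3800 Hz tone is rejected.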

In an embodiment, the segmenting unit 502 is specifically configured to group consecutive sampling points included in the audio signal according to a preset number of the sampling points to obtain at least two grouping results, where one grouping result includes at least two consecutive sampling points; and respectively taking the audio signal corresponding to each grouping result as a stage audio signal, wherein the number of sampling points included in the stage audio signal is equal to or less than the number of preset sampling points.
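The grouping described here amounts to cutting the sequence of sampling points into consecutive fixed-size groups, as in the minimal sketch below. The function name `segment` is hypothetical, and the value 160 is borrowed from the specific embodiment of the smoothing example. The last group may contain fewer sampling points than the preset number; this sketch does not treat the case where the last group would contain a single sampling point.

```python
def segment(signal, stage_len):
    """Split signal into consecutive stages of at most stage_len samples."""
    if stage_len < 2:
        raise ValueError("a stage must hold at least two sampling points")
    return [signal[i:i + stage_len] for i in range(0, len(signal), stage_len)]


# 400 sampling points with a preset number of 160 give stages of 160, 160, 80.
stages = segment(list(range(400)), 160)
```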

In one embodiment, the processing unit 503 is further configured to amplify the amplitude of the staged audio signal according to a ratio of the amplitude value of each sampling point in the staged audio signal to a third preset amplitude threshold value when the maximum amplitude value is greater than the first preset amplitude threshold value and the maximum amplitude value is less than the third preset amplitude threshold value, where the third preset amplitude threshold value is greater than the first preset amplitude threshold value and less than the second preset amplitude threshold value; and when the maximum amplitude value is smaller than a second preset amplitude threshold value and the maximum amplitude value is larger than a fourth preset amplitude threshold value, reducing the amplitude of the stage audio signal according to the ratio of the amplitude value of each sampling point in the stage audio signal to the fourth preset amplitude threshold value, wherein the fourth preset amplitude threshold value is smaller than the second preset amplitude threshold value and larger than the third preset amplitude threshold value.

Based on the same concept, an embodiment of the present disclosure further provides an electronic device. As shown in fig. 7, the electronic device mainly includes: a processor 701, a memory 702, and a communication bus 703, wherein the processor 701 and the memory 702 communicate with each other via the communication bus 703. The memory 702 stores a program executable by the processor 701, and the processor 701 executes the program stored in the memory 702 to implement the following steps: acquiring an audio signal to be processed; segmenting the audio signal to obtain at least two stage audio signals; and performing the following processing on each stage audio signal: when the maximum amplitude value of the stage audio signal is smaller than a first preset amplitude threshold value, amplifying the amplitude of the stage audio signal so that the amplified maximum amplitude value is larger than or equal to the first preset amplitude threshold value and smaller than or equal to the transmission amplitude upper limit value of a transmission channel of the audio signal; when the maximum amplitude value of the stage audio signal is greater than a second preset amplitude threshold value, adjusting the amplitude of the stage audio signal so that the adjusted maximum amplitude value is less than or equal to the transmission amplitude upper limit value of the transmission channel of the audio signal; wherein the first preset amplitude threshold is smaller than the second preset amplitude threshold.

The communication bus 703 mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus 703 may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in FIG. 7, but this is not intended to represent only one bus or type of bus.

The Memory 702 may include a Random Access Memory (RAM) or a non-volatile Memory (non-volatile Memory), such as at least one disk Memory. Alternatively, the memory may be at least one memory device located remotely from the processor 701.

The Processor 701 may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like, or may be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic devices, discrete gates or transistor logic devices, and discrete hardware components.

The electronic device may be an intercom as mentioned in the present disclosure, and may also be other devices capable of implementing the audio signal processing method, for example, a smart phone, a tablet computer, and the like. The scope of protection of the present application is not limited to a particular type of electronic device.

Based on the same concept, the embodiment of the present disclosure provides an intercom, and specific implementation of the intercom may refer to the description of the method embodiment section, and repeated details are not described again. As shown in fig. 8, the intercom includes a signal processing module 801 and a signal transmission module 802;

a signal processing module 801, configured to obtain an audio signal to be processed; segment the audio signal to obtain at least two stage audio signals; and perform the following processing on each stage audio signal: when the maximum amplitude value of the stage audio signal is smaller than a first preset amplitude threshold value, amplifying the amplitude of the stage audio signal so that the amplified maximum amplitude value is larger than or equal to the first preset amplitude threshold value and smaller than or equal to the transmission amplitude upper limit value of a transmission channel of the audio signal; when the maximum amplitude value of the stage audio signal is greater than a second preset amplitude threshold value, adjusting the amplitude of the stage audio signal so that the adjusted maximum amplitude value is less than or equal to the transmission amplitude upper limit value of the transmission channel of the audio signal; wherein the first preset amplitude threshold value is smaller than the second preset amplitude threshold value;

the signal processing module 801 is further configured to transmit the audio signal after the audio signal at each stage is processed to the signal transmission module 802;

the signal transmission module 802 is configured to transmit the audio signal after the audio signal at each stage is processed to other devices through a transmission channel.

In one embodiment, the signal processing module 801 is specifically configured to, when the maximum amplitude value of the stage audio signal is smaller than the first preset amplitude threshold, take the ratio of the first preset amplitude threshold to the maximum amplitude value as a first amplitude gain value, and amplify the amplitude of the stage audio signal by the first amplitude gain value.

In one embodiment, the signal processing module 801 is specifically configured to, when the maximum amplitude value of the stage audio signal is greater than the second preset amplitude threshold, take the ratio of the transmission amplitude upper limit value to the maximum amplitude value as a second amplitude gain value, and adjust the amplitude of the stage audio signal by the second amplitude gain value.

In one embodiment, as shown in fig. 9, the intercom further includes a signal interpolation module 803;

the signal interpolation module 803 is configured to, after receiving the processed audio signal transmitted by the signal processing module 801, acquire boundary sampling points of two adjacent stage audio signals; acquire N continuous sampling points including the boundary sampling points from the two adjacent stage audio signals, wherein N is an integer greater than 1, and the N continuous sampling points include at least two sampling points belonging to different stage audio signals; and perform interpolation on the audio signals corresponding to the N continuous sampling points.

In one embodiment, as shown in fig. 10, the intercom further includes a signal receiving module 804 and a signal detecting module 805;

a signal receiving module 804, configured to receive a signal to be analyzed, and transmit the signal to a signal detecting module 805;

the signal detection module 805 is configured to obtain a first signal energy value of the signal to be analyzed in a preset audio frequency band, and obtain a second signal energy value of the signal to be analyzed in a preset noise frequency band; and when the ratio of the first signal energy value to the second signal energy value is greater than a preset energy threshold, take the signal of the signal to be analyzed in the preset audio frequency band as the audio signal to be processed, and transmit the audio signal to be processed to the signal processing module 801.

In one embodiment, the signal processing module 801 is specifically configured to group consecutive sampling points included in an audio signal according to a preset number of sampling points, and obtain at least two grouping results, where one grouping result includes at least two consecutive sampling points; and respectively taking the audio signal corresponding to each grouping result as a stage audio signal, wherein the number of sampling points included in the stage audio signal is equal to or less than the number of preset sampling points.

In one embodiment, the signal processing module 801 is further configured to, when the maximum amplitude value is greater than the first preset amplitude threshold and the maximum amplitude value is less than a third preset amplitude threshold, amplify the amplitude of the staged audio signal according to a ratio of the amplitude value of each sampling point in the staged audio signal to the third preset amplitude threshold, where the third preset amplitude threshold is greater than the first preset amplitude threshold and less than the second preset amplitude threshold; and when the maximum amplitude value is smaller than a second preset amplitude threshold value and the maximum amplitude value is larger than a fourth preset amplitude threshold value, reducing the amplitude of the stage audio signal according to the ratio of the amplitude value of each sampling point in the stage audio signal to the fourth preset amplitude threshold value, wherein the fourth preset amplitude threshold value is smaller than the second preset amplitude threshold value and larger than the third preset amplitude threshold value.

In still another embodiment of the present disclosure, there is also provided a computer-readable storage medium having stored therein a computer program which, when run on a computer, causes the computer to execute the audio signal processing method described in the above-described embodiment.

In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, the embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. The processes or functions according to the embodiments of the present disclosure are produced in whole or in part when the computer instructions are loaded and executed on a computer. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored on a computer readable storage medium or transmitted from one computer readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center via wire (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wirelessly (e.g., infrared, microwave, etc.). The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available media may be magnetic media (e.g., floppy disks, hard disks, tapes, etc.), optical media (e.g., DVDs), or semiconductor media (e.g., solid state drives), among others.

It is noted that, in this document, relational terms such as "first" and "second," and the like, may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

The foregoing are merely exemplary embodiments of the present invention, which enable those skilled in the art to understand or practice the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. An audio signal processing method, comprising:

acquiring an audio signal to be processed;

segmenting the audio signal to obtain at least two stage audio signals;

the audio signal of each stage is processed as follows:

amplifying the amplitude of the staged audio signal when the maximum amplitude value of the staged audio signal is smaller than a first preset amplitude threshold value, so that the amplified maximum amplitude value is larger than or equal to the first preset amplitude threshold value and smaller than or equal to the transmission amplitude upper limit value of a transmission channel of the audio signal;

when the maximum amplitude value of the stage audio signal is greater than a second preset amplitude threshold value, adjusting the amplitude of the stage audio signal so that the adjusted maximum amplitude value is less than or equal to the transmission amplitude upper limit value of a transmission channel of the audio signal;

wherein the first preset amplitude threshold is smaller than the second preset amplitude threshold.

2. The audio signal processing method of claim 1, wherein amplifying the amplitude of the staged audio signal when the maximum amplitude value of the staged audio signal is less than a first preset amplitude threshold comprises:

when the maximum amplitude value of the stage audio signal is smaller than the first preset amplitude threshold value, taking the ratio of the first preset amplitude threshold value to the maximum amplitude value as a first amplitude gain value;

amplifying the amplitude of the stage audio signal by the first amplitude gain value.

3. The audio signal processing method of claim 1, wherein adjusting the amplitude of the staged audio signal when the maximum amplitude value of the staged audio signal is greater than a second preset amplitude threshold comprises:

when the maximum amplitude value of the stage audio signal is larger than a second preset amplitude threshold value, taking the ratio of the transmission amplitude upper limit value to the maximum amplitude value as a second amplitude gain value;

and adjusting the amplitude of the audio signal of the stage through the second amplitude gain value.

4. The audio signal processing method according to claim 1, wherein, when the maximum amplitude value of the staged audio signal is greater than a second preset amplitude threshold, the method further comprises, after adjusting the amplitude of the staged audio signal so that the adjusted maximum amplitude value is less than or equal to a transmission amplitude upper limit value of a transmission channel of the audio signal:

acquiring boundary sampling points of two adjacent stage audio signals;

acquiring N continuous sampling points including the boundary sampling points from the two adjacent stage audio signals, wherein N is an integer greater than 1, and the N continuous sampling points include at least two sampling points belonging to different stage audio signals;

and carrying out interpolation on the audio signals corresponding to the N continuous sampling points.

5. The audio signal processing method according to claim 1, wherein the obtaining the audio signal to be processed comprises:

receiving a signal to be analyzed;

acquiring a first signal energy value of the signal to be analyzed in a preset audio frequency band, and acquiring a second signal energy value of the signal to be analyzed in a preset noise frequency band;

and when the ratio of the first signal energy value to the second signal energy value is greater than a preset energy threshold value, taking the signal of the signal to be analyzed in the preset audio signal frequency band as the audio signal to be processed.

6. The audio signal processing method of claim 1, wherein the segmenting the audio signal to obtain at least two stage audio signals comprises:

grouping continuous sampling points contained in the audio signal according to the number of preset sampling points to obtain at least two grouping results, wherein one grouping result comprises at least two continuous sampling points;

and respectively using the audio signal corresponding to each grouping result as the stage audio signal, wherein the number of sampling points included in the stage audio signal is equal to or less than the number of preset sampling points.

7. The audio signal processing method of claim 1, further comprising:

when the maximum amplitude value is larger than the first preset amplitude threshold value and the maximum amplitude value is smaller than a third preset amplitude threshold value, amplifying the amplitude of the staged audio signal according to the ratio of the amplitude value of each sampling point in the staged audio signal to the third preset amplitude threshold value, wherein the third preset amplitude threshold value is larger than the first preset amplitude threshold value and smaller than the second preset amplitude threshold value;

8. An audio signal processing apparatus, comprising:

an acquisition unit, configured to acquire an audio signal to be processed;

a segmenting unit, configured to segment the audio signal to obtain at least two stage audio signals;

the processing unit is used for respectively carrying out the following processing on the audio signal of each stage: amplifying the amplitude of the staged audio signal when the maximum amplitude value of the staged audio signal is smaller than a first preset amplitude threshold value, so that the amplified maximum amplitude value is larger than or equal to the first preset amplitude threshold value and smaller than or equal to the transmission amplitude upper limit value of a transmission channel of the audio signal; when the maximum amplitude value of the stage audio signal is greater than a second preset amplitude threshold value, adjusting the amplitude of the stage audio signal so that the adjusted maximum amplitude value is less than or equal to the transmission amplitude upper limit value of a transmission channel of the audio signal; wherein the first preset amplitude threshold is smaller than the second preset amplitude threshold.

9. An electronic device, comprising: a processor, a memory and a communication bus, wherein the processor and the memory communicate with each other through the communication bus;

the memory for storing a computer program;

the processor is configured to execute the program stored in the memory to implement the audio signal processing method of any one of claims 1 to 7.

10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the audio signal processing method of any one of claims 1 to 7.