WO2022199288A1 - Audio signal processing method and apparatus, terminal, and storage medium - Google Patents


Info

Publication number
WO2022199288A1
WO2022199288A1 PCT/CN2022/076756 CN2022076756W WO2022199288A1 WO 2022199288 A1 WO2022199288 A1 WO 2022199288A1 CN 2022076756 W CN2022076756 W CN 2022076756W WO 2022199288 A1 WO2022199288 A1 WO 2022199288A1
Authority
WO
WIPO (PCT)
Prior art keywords
audio signal
signal
fusion
proportion
dynamic
Prior art date
Application number
PCT/CN2022/076756
Other languages
French (fr)
Chinese (zh)
Inventor
许逸君
郭华
Original Assignee
Oppo广东移动通信有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Oppo广东移动通信有限公司
Publication of WO2022199288A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04R: LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00: Circuits for transducers, loudspeakers or microphones
    • G: PHYSICS
    • G11: INFORMATION STORAGE
    • G11C: STATIC STORES
    • G11C7/00: Arrangements for writing information into, or reading information out from, a digital store
    • G11C7/16: Storage of analogue signals in digital stores using an arrangement comprising analogue/digital [A/D] converters, digital memories and digital/analogue [D/A] converters

Definitions

  • the embodiments of the present application relate to the field of audio technology, and in particular, to a method, device, terminal, and storage medium for processing audio signals.
  • Recording is the process of converting an analog audio signal captured by a microphone into a digital audio signal and storing it.
  • Analog-to-Digital Converter (ADC)
  • Digital Signal Processor (DSP)
  • Embodiments of the present application provide an audio signal processing method, device, terminal, and storage medium.
  • the technical solution is as follows:
  • an embodiment of the present application provides a method for processing an audio signal, the method comprising:
  • the first audio signal is a signal obtained by performing analog-to-digital conversion on the original audio signal with the first analog gain
  • the second audio signal is a signal obtained by performing analog-to-digital conversion on the original audio signal with a second analog gain
  • the first analog gain is greater than the second analog gain
  • an embodiment of the present application provides an apparatus for processing an audio signal, and the apparatus includes:
  • a signal acquisition module configured to acquire the first audio signal output by the first channel and the second audio signal output by the second channel, where the first audio signal is a signal obtained by performing analog-to-digital conversion on the original audio signal that has undergone the first analog gain, the second audio signal is a signal obtained by performing analog-to-digital conversion on the original audio signal that has undergone the second analog gain, and the first analog gain is greater than the second analog gain;
  • a signal fusion module configured to perform signal fusion based on the first audio signal and the second audio signal to obtain a target audio signal, wherein the dynamic range of the target audio signal is the superposition of the dynamic ranges of the first audio signal and the second audio signal.
  • an embodiment of the present application provides a computer-readable storage medium, where at least one piece of program code is stored in the computer-readable storage medium, and the program code is loaded and executed by a processor to implement the audio signal processing method described in the above aspects.
  • an embodiment of the present application provides a computer program product or computer program, where the computer program product or computer program includes computer instructions, and the computer instructions are stored in a computer-readable storage medium.
  • the processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device performs the audio signal processing methods provided in various optional implementations of the above aspects.
  • FIG. 1 is a schematic diagram of an audio recording process in the related art
  • FIG. 2 is a schematic diagram of an audio recording process in an embodiment of the present application.
  • FIG. 3 is a flowchart of an audio signal processing method provided by an exemplary embodiment of the present application.
  • FIG. 4 is a flowchart of an audio signal processing method provided by another exemplary embodiment of the present application.
  • FIG. 5 is a schematic diagram of the principle of an audio recording process shown in an exemplary embodiment of the present application.
  • FIG. 6 is a schematic diagram of a signal compensation process shown in an exemplary embodiment of the present application.
  • FIG. 7 is a flowchart of an audio signal processing method provided by another exemplary embodiment of the present application.
  • FIG. 8 is a schematic diagram of the implementation of an audio signal processing process shown in an exemplary embodiment of the present application.
  • FIG. 9 shows a structural block diagram of an apparatus for processing an audio signal provided by an embodiment of the present application.
  • FIG. 10 shows a structural block diagram of a computer device provided by an exemplary embodiment of the present application.
  • the audio recording process is shown in FIG. 1 .
  • The original audio signal (analog signal) output by the microphone 101 is first input to the programmable gain amplifier (PGA) 102, which applies analog gain to the original audio signal, thereby reducing the equivalent input noise of the ADC 103 (the ADC itself has quantization noise and electrical noise, which raise the signal noise floor).
  • the analog-gained original audio signal is input to the ADC 103, the ADC 103 converts the original audio signal from an analog audio signal to a digital audio signal, and the digital audio signal is input to the DSP 104.
  • the DSP 104 further processes the digital audio signal, and finally outputs the processed digital audio signal for generating an audio file.
  • Although analog gain reduces the equivalent input noise of the ADC, it also increases the amplitude of the original audio signal. If the amplitude after analog gain is too large (especially when picking up high sound pressure level signals), the signal clips at the ADC, resulting in audio distortion. With the above recording method it is therefore impossible to both reduce the equivalent input noise of the ADC and pick up the audio signal (especially a high sound pressure level signal) without distortion.
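  • The trade-off described above can be put into numbers with a minimal sketch. It assumes an idealized PGA and ADC: noise referred to the input is the ADC noise divided by the linear gain, and the largest input that avoids clipping is the ADC full scale divided by the same gain. The function names and the numeric values are illustrative assumptions, not taken from the application.

```python
def db_to_linear(gain_db: float) -> float:
    """Convert an amplitude gain in dB to a linear multiplier."""
    return 10 ** (gain_db / 20)


def equivalent_input_noise(adc_noise_rms: float, gain_db: float) -> float:
    """ADC noise referred back to the PGA input; it shrinks as the gain grows."""
    return adc_noise_rms / db_to_linear(gain_db)


def max_input_before_clipping(adc_full_scale: float, gain_db: float) -> float:
    """Largest input amplitude that still fits the ADC range after the gain."""
    return adc_full_scale / db_to_linear(gain_db)


# Hypothetical values in arbitrary linear units (assumption for illustration).
for gain_db in (0.0, 20.0, 40.0):
    noise = equivalent_input_noise(1e-3, gain_db)
    headroom = max_input_before_clipping(1.0, gain_db)
    print(f"{gain_db:4.0f} dB gain: input noise {noise:.1e}, clips above {headroom:.1e}")
```

  • Raising the gain lowers both numbers together, which is why a single channel cannot serve quiet and loud signals equally well.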
  • the original audio signal output by the microphone 201 is respectively input to the first channel and the second channel.
  • the first ADC 203 performs analog-to-digital conversion on the gained original audio signal to obtain the first audio signal
  • the second ADC 205 performs analog-to-digital conversion on the gained original audio signal to obtain a second audio signal.
  • the DSP 206 fuses the first audio signal and the second audio signal through an algorithm, and finally outputs a high dynamic range target audio signal.
  • The audio signal processing method provided in the embodiments of the present application can be applied to a computer device with audio signal processing capability. The computer device may be a smart phone, a tablet computer, a wearable device, a personal computer, a vehicle-mounted terminal, etc., which is not limited in the embodiments of the present application.
  • the computer device may collect signals through a built-in microphone, or may collect signals through an external microphone, which is not limited in this embodiment of the present application.
  • The solutions provided by the embodiments of the present application may be executed by a processor in the computer device. The processor may be a DSP or an application processor (AP), which is not limited in the embodiments of the present application.
  • the following embodiments are described by using a method for processing an audio signal to be performed by a computer device.
  • FIG. 3 shows a flowchart of an audio signal processing method provided by an exemplary embodiment of the present application, and the method includes:
  • Step 302: Obtain the first audio signal output by the first channel and the second audio signal output by the second channel, where the first audio signal is a signal obtained by performing analog-to-digital conversion on the original audio signal with the first analog gain, the second audio signal is a signal obtained by performing analog-to-digital conversion on the original audio signal with the second analog gain, and the first analog gain is greater than the second analog gain.
  • the original audio signal passes through two channels of analog gains of different degrees. Wherein, under the first channel, the original audio signal passes through the first analog gain, and under the second channel, the original audio signal passes through the second analog gain.
  • Since the first analog gain is greater than the second analog gain, the first channel has lower equivalent input noise and is suitable for picking up non-high sound pressure level signals (after high analog gain, a non-high sound pressure level signal has a low probability of clipping due to excessive amplitude), while the second channel is suitable for picking up high sound pressure level signals (after low analog gain, a high sound pressure level signal has a low probability of clipping due to excessive amplitude).
  • the computer device performs analog gain on the original audio signal through the PGA set on each channel, and the first analog gain and the second analog gain are fixed gains, or can be adjusted as required.
  • For example, the first analog gain is 30 dB and the second analog gain is 10 dB.
  • The ADC in the first channel performs analog-to-digital conversion on the original audio signal (analog signal) that has undergone the first analog gain to obtain the first audio signal (digital signal); the ADC in the second channel performs analog-to-digital conversion on the original audio signal that has undergone the second analog gain to obtain the second audio signal.
  • When the original audio signal is a non-high sound pressure level signal, the probability of signal clipping in both channels is small, and the equivalent input noise floor in the first channel is smaller than that in the second channel; when the original audio signal is a high sound pressure level signal, the probability of signal clipping in the second channel is lower than that in the first channel.
  • In this way, the pickup quality of both non-high sound pressure level signals and high sound pressure level signals is taken into account.
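  • The behaviour of the two channels can be illustrated with a small sketch. It assumes an ADC that simply clamps at a full-scale value of 1.0 and uses the example gains of 30 dB and 10 dB; the helper names `channel` and `capture` and the input amplitudes are hypothetical.

```python
FIRST_GAIN_DB = 30.0   # example first analog gain
SECOND_GAIN_DB = 10.0  # example second analog gain


def db_to_linear(gain_db: float) -> float:
    """Convert an amplitude gain in dB to a linear multiplier."""
    return 10 ** (gain_db / 20)


def channel(sample: float, gain_db: float, full_scale: float = 1.0):
    """Apply analog gain, then clamp at the ADC full scale (assumed ideal ADC).

    Returns (output sample, whether the sample was clipped).
    """
    amplified = sample * db_to_linear(gain_db)
    clipped = abs(amplified) > full_scale
    output = max(-full_scale, min(full_scale, amplified))
    return output, clipped


def capture(sample: float):
    """Feed the same original sample into both channels."""
    return channel(sample, FIRST_GAIN_DB), channel(sample, SECOND_GAIN_DB)


quiet_first, quiet_second = capture(0.001)  # hypothetical quiet input
loud_first, loud_second = capture(0.2)      # hypothetical loud input
print("quiet:", quiet_first, quiet_second)
print("loud: ", loud_first, loud_second)
```

  • With these assumed values the quiet input passes through both channels unclipped, while the loud input clips only in the first channel.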
  • Step 304 Perform signal fusion based on the first audio signal and the second audio signal to obtain a target audio signal, wherein the dynamic range of the target audio signal is the superposition of the dynamic ranges of the first audio signal and the second audio signal.
  • the computer equipment needs to further fuse the two audio signals to obtain the target audio signal.
  • the computer device performs signal fusion on the first audio signal and the second audio signal based on the respective signal pickup advantages of the first channel and the second channel.
  • The first channel has a signal pickup advantage for non-high sound pressure level signals, and the second channel has a signal pickup advantage for high sound pressure level signals. Therefore, the first audio signal has a greater influence on the non-high sound pressure level part of the target audio signal, and the second audio signal has a greater influence on the high sound pressure level part of the target audio signal.
  • the computer device performs sampling point level fusion or sampling frame level fusion on the two audio signals, which is not limited in this embodiment.
  • For example, with sampling-point-level fusion, the computer device fuses the sampling point signals of the first audio signal and the second audio signal at the same sampling time every 1/48000 second; with sampling-frame-level fusion, the computer device fuses the sampling point signals (480 per frame) of the first audio signal and the second audio signal within the same sampling frame.
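  • The difference between the two granularities can be sketched as follows. The weighting rule `weight_of` and the use of the frame peak for the frame-level decision are assumptions made only for illustration; the application does not prescribe them here.

```python
from typing import Callable, List, Sequence

FRAME_LEN = 480  # sampling points per frame, as in the example above


def fuse_sample_level(x1: Sequence[float], x2: Sequence[float],
                      weight_of: Callable[[float], float]) -> List[float]:
    """Make one weighting decision per sampling point."""
    out = []
    for a, b in zip(x1, x2):
        w = weight_of(abs(b))
        out.append(w * a + (1 - w) * b)
    return out


def fuse_frame_level(x1: Sequence[float], x2: Sequence[float],
                     weight_of: Callable[[float], float],
                     frame_len: int = FRAME_LEN) -> List[float]:
    """Make one weighting decision per frame, based on the frame peak (assumption)."""
    out = []
    for start in range(0, len(x1), frame_len):
        f1 = x1[start:start + frame_len]
        f2 = x2[start:start + frame_len]
        w = weight_of(max(abs(v) for v in f2))
        out.extend(w * a + (1 - w) * b for a, b in zip(f1, f2))
    return out


def hard_switch(amplitude: float) -> float:
    """Hypothetical rule: use the first signal only while the amplitude is small."""
    return 1.0 if amplitude < 0.5 else 0.0
```

  • Sample-level fusion reacts faster to amplitude changes, while frame-level fusion makes fewer decisions per second.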
  • the computer device performs signal fusion by executing a fusion algorithm through a DSP or an AP, and the specific process of the signal fusion will be described in detail in the following embodiments.
  • The dynamic range of the target audio signal (that is, the range within which the signal is picked up with good quality) combines the respective dynamic ranges of the first audio signal and the second audio signal, so the target audio signal has a larger dynamic range than the audio signal output by a single channel.
  • For example, if the dynamic range of the first audio signal is 30 dB to 90 dB and the dynamic range of the second audio signal is 60 dB to 120 dB, the dynamic range of the target audio signal is 30 dB to 120 dB.
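  • The superposition in this example amounts to taking the union of two overlapping intervals. A minimal sketch, assuming each dynamic range is given as a (low, high) pair in dB; the function name is hypothetical.

```python
from typing import Tuple

Range = Tuple[float, float]


def fused_dynamic_range(first: Range, second: Range) -> Range:
    """Union of two dynamic ranges given as (low_db, high_db) pairs.

    The ranges must overlap or touch; otherwise there would be a gap in which
    neither channel picks up the signal well.
    """
    lo1, hi1 = first
    lo2, hi2 = second
    if lo1 > hi2 or lo2 > hi1:
        raise ValueError("dynamic ranges do not overlap")
    return min(lo1, lo2), max(hi1, hi2)


print(fused_dynamic_range((30.0, 90.0), (60.0, 120.0)))
```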
  • the computer device may further perform noise reduction processing, dynamic range processing, amplitude limiting processing, and spectral equalization processing on the target audio signal, and finally save the processed audio signal as a recording file, which will not be repeated in this embodiment.
  • using the solutions provided by the embodiments of the present application can realize recording with a larger dynamic range. For example, when recording drum performances or concerts (high sound pressure level scenes), it can reduce the problem of broken sound during recording; when recording in a quiet environment (non-high sound pressure level scenes), it can retain more sound detail.
  • The first audio signal and the second audio signal are obtained by inputting the same original audio signal into the first channel and the second channel, which have different analog gains, so that the pickup of both non-high sound pressure level signals and high sound pressure level signals is taken into account. Signal fusion is then performed based on the first audio signal and the second audio signal to output a target audio signal. Because the dynamic range of the target audio signal is the superposition of the dynamic ranges of the first audio signal and the second audio signal, the dynamic range of the audio signal is expanded, which helps to improve the recording quality of non-high sound pressure level signals and high sound pressure level signals and realizes high dynamic range recording.
  • signal fusion is performed based on the first audio signal and the second audio signal to obtain a target audio signal, including:
  • signal fusion is performed on the first audio signal and the third audio signal to obtain the target audio signal, including:
  • Signal fusion is performed on the first audio signal and the third audio signal based on the fusion proportion to obtain a target audio signal.
  • determining the respective corresponding fusion proportions of the first audio signal and the third audio signal including:
  • when the signal amplitude of the third audio signal is smaller than the first amplitude threshold and the first clipping flag is set, it is determined that the fusion proportion of the first audio signal is 1 and the fusion proportion of the third audio signal is 0, where the first clipping flag is used to indicate that the first channel is not clipped;
  • when the signal amplitude of the third audio signal is greater than the second amplitude threshold, it is determined that the fusion proportion of the first audio signal is 0 and the fusion proportion of the third audio signal is 1, where the second amplitude threshold is greater than the first amplitude threshold;
  • when the signal amplitude of the third audio signal is smaller than the first amplitude threshold and the second clipping flag is set, or the signal amplitude of the third audio signal is larger than the first amplitude threshold and smaller than the second amplitude threshold, it is determined that the fusion proportion of the first audio signal is the first dynamic proportion and the fusion proportion of the third audio signal is the second dynamic proportion, where the second clipping flag is used to indicate that the first channel is clipped, and the sum of the first dynamic proportion and the second dynamic proportion is 1.
  • the method further includes:
  • the first dynamic proportion and the second dynamic proportion are updated based on the first update step size, wherein the updated first dynamic proportion is greater than the first dynamic proportion before the update, and the updated second dynamic proportion is smaller than the second dynamic proportion before the update;
  • the first dynamic proportion and the second dynamic proportion are updated based on the second update step size, wherein the updated first dynamic proportion is smaller than the first dynamic proportion before the update, and the updated second dynamic proportion is greater than the second dynamic proportion before the update.
  • the method further includes:
  • the second clipping identifier is replaced with the first clipping identifier.
  • the method further includes:
  • the first clipping flag is replaced with the second clipping flag.
  • performing signal compensation on the second audio signal to obtain a third audio signal including:
  • Gain compensation is performed on the second audio signal based on the analog gain difference to obtain a third audio signal.
  • performing signal compensation on the second audio signal to obtain a third audio signal including:
  • Amplitude compensation is performed on the fourth audio signal based on the signal amplitude ratio to obtain a third audio signal.
  • The computer device first needs to perform signal compensation on the second audio signal, so as to reduce the signal difference between the first audio signal and the second audio signal. Exemplary embodiments are used for description below.
  • FIG. 4 shows a flowchart of an audio signal processing method provided by another exemplary embodiment of the present application, the method includes:
  • Step 401 Obtain the first audio signal output by the first channel and the second audio signal output by the second channel.
  • the first audio signal is a signal obtained by performing analog-to-digital conversion on the original audio signal that has undergone the first analog gain.
  • the second audio signal is a signal obtained by performing analog-to-digital conversion on the original audio signal with the second analog gain, and the first analog gain is greater than the second analog gain.
  • the original audio signal is outputted after the first analog gain and analog-to-digital conversion, and the original audio signal is outputted after the second analog gain and analog-to-digital conversion.
  • Step 402 Perform signal compensation on the second audio signal to obtain a third audio signal, where the signal compensation is used to compensate for the signal difference caused by the first analog gain and the second analog gain.
  • After the same original audio signal has undergone different degrees of analog gain, its signal amplitude differs between the channels. The computer device therefore first compensates the second audio signal based on the first audio signal, so that the compensated second audio signal is as close as possible to the first audio signal and signal fusion can subsequently be performed on two audio signals with similar amplitudes.
  • Since the difference between the first audio signal and the second audio signal is mainly caused by the analog gain, the computer device performs gain compensation on the second audio signal (unlike the analog gain of the PGA, this is a digital gain). Moreover, since the first analog gain is higher than the second analog gain, the gain of the second audio signal needs to be increased during gain compensation to obtain the third audio signal.
  • The computer device determines the analog gain difference between the first analog gain and the second analog gain, and performs gain compensation on the second audio signal based on the analog gain difference to obtain the third audio signal.
  • the computer device determines the gain multiplier based on the analog gain difference between the first analog gain and the second analog gain, so as to perform gain compensation on the second audio signal based on the gain multiplier to obtain the third audio signal.
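  • For an analog gain difference of $\Delta G$ dB, the computer device determines the gain multiple to be $10^{\Delta G/20}$. Correspondingly, the third audio signal is the second audio signal multiplied by the gain multiple.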
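  • A minimal sketch of fixed gain compensation, assuming the analog gains are expressed in dB and that the usual amplitude convention (20 dB per factor of 10) applies. The function names are hypothetical, and the example gains of 30 dB and 10 dB are reused from the earlier example.

```python
from typing import List, Sequence


def gain_multiple(first_gain_db: float, second_gain_db: float) -> float:
    """Linear multiplier that makes up the analog gain difference (in dB)."""
    return 10 ** ((first_gain_db - second_gain_db) / 20)


def compensate_gain(second_signal: Sequence[float],
                    first_gain_db: float,
                    second_gain_db: float) -> List[float]:
    """Apply the digital make-up gain to every sample of the second audio signal."""
    multiple = gain_multiple(first_gain_db, second_gain_db)
    return [multiple * sample for sample in second_signal]


# With gains of 30 dB and 10 dB the difference is 20 dB, a factor of 10.
print(gain_multiple(30.0, 10.0))
print(compensate_gain([0.01, -0.02], 30.0, 10.0))
```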
  • the second audio signal is first compensated by a fixed gain to obtain the fourth audio signal, and then the third audio signal is obtained by adaptive amplitude compensation (fine adjustment).
  • The computer device calculates the respective signal amplitudes of the first audio signal and the fourth audio signal to determine the signal amplitude difference between the two, and then performs amplitude compensation on the fourth audio signal based on the signal amplitude difference to obtain the third audio signal.
  • this step may include the following steps:
  • Step 402A Determine the analog gain difference between the first analog gain and the second analog gain.
  • Step 402B Perform gain compensation on the second audio signal based on the analog gain difference to obtain a fourth audio signal.
  • For the implementation of steps 402A and 402B, reference may be made to the foregoing embodiments; details are not repeated here.
  • Step 402C determining the signal amplitude ratio of the first audio signal and the fourth audio signal.
  • the computer device calculates the signal amplitude of the first audio signal and the signal amplitude of the fourth audio signal, and determines the signal amplitude difference by obtaining the signal amplitude ratio between the two.
  • the signal amplitude of the audio signal is represented by an energy envelope
  • the signal amplitude of the audio signal may also be represented by other parameters than the energy envelope, which is not limited in this embodiment.
  • Step 402D performing amplitude compensation on the fourth audio signal based on the signal amplitude ratio to obtain a third audio signal.
  • the computer device performs amplitude compensation on the fourth audio signal based on the determined signal amplitude ratio to obtain the third audio signal.
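  • Steps 402C and 402D can be sketched as follows. The energy envelope is computed here with a one-pole smoother, and the amplitude ratio is the square root of the envelope ratio; the smoother, the coefficient `alpha`, and the guard value `eps` are assumptions for illustration, since the application only states that the amplitude may be represented by an energy envelope.

```python
import math
from typing import List, Sequence


def energy_envelope(signal: Sequence[float], alpha: float = 0.99) -> List[float]:
    """Running energy estimate: env[n] = alpha * env[n-1] + (1 - alpha) * x[n]^2."""
    env = 0.0
    out = []
    for sample in signal:
        env = alpha * env + (1 - alpha) * sample * sample
        out.append(env)
    return out


def amplitude_compensate(first: Sequence[float], fourth: Sequence[float],
                         alpha: float = 0.99, eps: float = 1e-12) -> List[float]:
    """Scale the fourth audio signal so its envelope tracks the first audio signal."""
    env1 = energy_envelope(first, alpha)
    env4 = energy_envelope(fourth, alpha)
    third = []
    for sample, e1, e4 in zip(fourth, env1, env4):
        # Energies are squared amplitudes, so the amplitude ratio is a square root.
        ratio = math.sqrt((e1 + eps) / (e4 + eps))
        third.append(ratio * sample)
    return third
```

  • If the fourth audio signal is a uniformly scaled copy of the first, this sketch recovers the first audio signal almost exactly, which corresponds to the fine adjustment described above.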
  • Step 403 Perform signal fusion on the first audio signal and the third audio signal to obtain a target audio signal.
  • The computer device needs to perform dynamic fusion based on the real-time sound pressure level of the signal.
  • When the signal pickup quality of the first channel is higher than that of the second channel (non-high sound pressure level signal), the computer device increases the proportion of the first audio signal and reduces the proportion of the third audio signal when performing signal fusion, which improves the recording quality of non-high sound pressure level signals; when the signal pickup quality of the second channel is higher than that of the first channel (high sound pressure level signal), the computer device increases the proportion of the third audio signal and reduces the proportion of the first audio signal when performing signal fusion, which improves the recording quality of high sound pressure level signals.
  • this step may include the following steps.
  • the sum of the corresponding fusion proportions of the first audio signal and the third audio signal is 1.
  • the fusion proportion of the first audio signal is a
  • the fusion proportion of the third audio signal is 1-a, and 0 ≤ a ≤ 1.
  • Since the first analog gain of the first channel is relatively large, clipping may occur when the first channel performs analog-to-digital conversion on the original audio signal that has passed through the first analog gain, whereas the second analog gain of the second channel is relatively small, so the second channel is less prone to clipping. If the first audio signal is analyzed directly to determine whether clipping occurs in the first channel (for example, clipping is assumed when the first audio signal reaches the highest amplitude or exceeds a preset amplitude), the probability of misjudgment is high.
  • The computer device therefore determines the clipping situation of the first channel based on the third audio signal, and then dynamically determines the respective fusion proportions of the first audio signal and the third audio signal based on the clipping situation of the first channel and the real-time signal amplitude of the third audio signal.
  • the respective fusion ratios corresponding to the first audio signal and the third audio signal include the following situations.
  • two levels of amplitude thresholds are set in the computer device, which are a first amplitude threshold and a second amplitude threshold, wherein the second amplitude threshold is greater than the first amplitude threshold.
  • When the signal amplitude of the third audio signal is less than the first amplitude threshold, the computer device determines that no clipping occurs in the first channel; when the signal amplitude of the third audio signal is greater than or equal to the first amplitude threshold but less than the second amplitude threshold, the computer device determines that clipping may occur in the first channel; when the signal amplitude of the third audio signal is greater than or equal to the second amplitude threshold, the computer device determines that clipping occurs in the first channel.
  • the computer device sets the first amplitude threshold and the second amplitude threshold based on the analog gain conditions of the two channels, which is not limited in this embodiment.
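  • The two-threshold decision can be written as a small classifier. This is a sketch only: the name `thrd1` is assumed by analogy with `thrd2`, and the threshold values used in the example call are hypothetical.

```python
from enum import Enum


class ClipState(Enum):
    NONE = "no clipping in the first channel"
    POSSIBLE = "clipping may occur in the first channel"
    CLIPPED = "clipping occurs in the first channel"


def classify_first_channel(mag: float, thrd1: float, thrd2: float) -> ClipState:
    """Judge the first channel from the amplitude of the third audio signal."""
    if not 0 <= thrd1 < thrd2:
        raise ValueError("expected 0 <= thrd1 < thrd2")
    if mag < thrd1:
        return ClipState.NONE
    if mag < thrd2:
        return ClipState.POSSIBLE
    return ClipState.CLIPPED


# Hypothetical thresholds on a signal normalized to full scale 1.0.
for mag in (0.2, 0.6, 0.9):
    print(mag, classify_first_channel(mag, thrd1=0.5, thrd2=0.8).value)
```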
  • The initial value of the clipping flag bit is the first clipping flag.
  • When the signal amplitude of the third audio signal is greater than the first amplitude threshold, the computer device determines that clipping may occur in the first channel and changes the first clipping flag to the second clipping flag. To improve the smoothness of the fused target audio signal, when the second clipping flag is currently set and the signal amplitude of the third audio signal falls below the first amplitude threshold, the computer device does not immediately change the second clipping flag back to the first clipping flag; it does so only after the signal amplitude of the third audio signal has stayed below the first amplitude threshold for a certain duration.
  • Case 1: When the signal amplitude of the third audio signal is less than the first amplitude threshold and the first clipping flag is set, the first channel has not clipped in the recent period. Without clipping, the signal pickup quality of the first channel is better than that of the second channel (because the equivalent input noise of the first channel is smaller), so the computer device determines that the fusion proportion of the first audio signal is 1 and the fusion proportion of the third audio signal is 0, that is, the third audio signal does not take part in the signal fusion.
  • Case 2: When the signal amplitude of the third audio signal is greater than or equal to the second amplitude threshold, the computer device determines that the fusion proportion of the first audio signal is 0 and the fusion proportion of the third audio signal is 1, that is, the first audio signal does not take part in the signal fusion.
  • For example, the signal amplitude of the third audio signal x2_c is mag = abs(x2_c); when mag ≥ thrd2, the computer device determines that the fusion proportion of the first audio signal x1 is 0 and the fusion proportion of the third audio signal x2_c is 1.
  • Case 3: When the signal amplitude of the third audio signal is less than the first amplitude threshold and the second clipping flag is set, or the signal amplitude of the third audio signal is greater than the first amplitude threshold and less than the second amplitude threshold, it is determined that the fusion proportion of the first audio signal is the first dynamic proportion and the fusion proportion of the third audio signal is the second dynamic proportion, where the second clipping flag indicates that the first channel is clipped, and the sum of the first dynamic proportion and the second dynamic proportion is 1.
  • In this case the computer device uses dynamic proportions as the fusion proportions of the first audio signal and the third audio signal, that is, as the signal amplitude changes, the fusion proportions of the two signals change accordingly.
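  • The three cases can be summarized in one selection function. This is a sketch under assumptions: `clip_flag_set` is True when the second clipping flag is set, `g` is the current first dynamic proportion, and amplitudes exactly equal to a threshold are assigned to the higher band; the function and parameter names are hypothetical.

```python
from typing import Tuple


def fusion_proportions(mag: float, clip_flag_set: bool, g: float,
                       thrd1: float, thrd2: float) -> Tuple[float, float]:
    """Return (proportion of the first audio signal, proportion of the third)."""
    if mag >= thrd2:
        # Case 2: the first channel is considered clipped.
        return 0.0, 1.0
    if mag < thrd1 and not clip_flag_set:
        # Case 1: quiet signal and no recent clipping.
        return 1.0, 0.0
    # Case 3: transition region, or quiet again after recent clipping.
    return g, 1.0 - g


print(fusion_proportions(0.1, False, 0.25, thrd1=0.5, thrd2=0.8))
print(fusion_proportions(0.9, True, 0.25, thrd1=0.5, thrd2=0.8))
print(fusion_proportions(0.6, True, 0.25, thrd1=0.5, thrd2=0.8))
```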
  • The computer device maintains a first dynamic proportion and a second dynamic proportion, where the first dynamic proportion is g (g is initially 1) and the second dynamic proportion is 1-g.
  • the computer device performs signal fusion based on the first dynamic proportion and the second dynamic proportion.
  • the computer device will update the first dynamic proportion and the second dynamic proportion based on the magnitude relationship between the signal amplitude of the third audio signal and the first amplitude threshold and the second amplitude threshold.
  • When it is determined that the clipping probability of the first channel has decreased, the computer device increases the first dynamic proportion and reduces the second dynamic proportion; when it is determined that the clipping probability of the first channel has increased, the computer device reduces the first dynamic proportion and increases the second dynamic proportion.
  • When the signal amplitude of the third audio signal is less than the first amplitude threshold and the second clipping flag is set, the computer device updates the first dynamic proportion and the second dynamic proportion based on the first update step size, where the updated first dynamic proportion is greater than the first dynamic proportion before the update, and the updated second dynamic proportion is smaller than the second dynamic proportion before the update.
  • the first update step size may be determined based on the sampling rate of the audio signal.
  • the first update step size is negatively correlated with the sampling rate, that is, the higher the sampling rate, the smaller the first update step size, and the lower the sampling rate, the larger the first update step size.
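  • One simple way to obtain a step size that is negatively correlated with the sampling rate is to fix the time the proportion should take to travel from 0 to 1 and divide by the number of samples in that time. This rule and the parameter `ramp_seconds` are assumptions for illustration; the application only states the negative correlation.

```python
def update_step(sample_rate_hz: float, ramp_seconds: float) -> float:
    """Per-sample step so that the proportion covers 0..1 in ramp_seconds."""
    if sample_rate_hz <= 0 or ramp_seconds <= 0:
        raise ValueError("sample rate and ramp time must be positive")
    return 1.0 / (sample_rate_hz * ramp_seconds)


# A higher sampling rate gives a smaller step for the same ramp time.
print(update_step(48000, 0.01))
print(update_step(96000, 0.01))
```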
  • For example, the first dynamic proportion is g, the second dynamic proportion is 1-g, and the first update step size is delta_rel (0 < delta_rel < 1). When the signal amplitude of the third audio signal is less than the first amplitude threshold and the second clipping flag is set, g is updated to g+delta_rel, which raises the fusion proportion of the first audio signal and lowers the fusion proportion of the third audio signal in subsequent signal fusion.
  • The computer device detects whether the first dynamic proportion is greater than or equal to 1. If it is, the signal amplitude of the third audio signal has stayed below the first amplitude threshold for a period of time and the first channel has not clipped during that period, so the second clipping flag is replaced with the first clipping flag.
  • In this case the computer device sets the first dynamic proportion to 1.
  • the computer device updates the first dynamic proportion and the second dynamic proportion based on the second update step size weight, wherein the updated first dynamic weight is smaller than the pre-update first dynamic weight, and the updated second dynamic weight is greater than the before-update second dynamic weight.
  • the computer device when the signal amplitude of the third audio signal is greater than the first amplitude threshold and the first clipping flag is set, the computer device replaces the first clipping flag with the second clipping flag, indicating that the first channel may be clipped.
  • the second update step size may be determined based on the sampling rate of the audio signal.
  • the second update step size is negatively correlated with the sampling rate, that is, the higher the sampling rate is, the smaller the second update step size is, and the lower the sampling rate is, the larger the second update step size is.
  • in an illustrative example, the first dynamic proportion is g, the second dynamic proportion is 1-g, and the second update step size is delta_att (0 < delta_att < 1)
  • the computer device sets the first dynamic proportion to 0.
  • the target audio signal obtained by the computer device through fusion is y = g*x1 + (1-g)*x2_c, and g is updated to g + delta_rel.
  • the computer device performs fixed gain compensation and adaptive amplitude compensation on the second audio signal, so that the third audio signal obtained after compensation is as close to the first audio signal as possible, which helps to improve the quality of subsequent signal fusion.
  • the computer device determines the clipping situation of the first channel based on the signal amplitude of the third audio signal, and then dynamically determines the fusion proportions of the first audio signal and the third audio signal during signal fusion based on that clipping situation and the signal amplitude. This avoids jumps in the target audio signal and improves the fusion quality and smoothness of the fused target audio signal.
  • when the original audio signal gradually increases from a non-high sound pressure level to a high sound pressure level and then gradually decreases back to a non-high sound pressure level, the audio signal processing performed by the computer device is as shown in FIG. 8.
  • Step 801: obtain the first audio signal output by the first channel and the second audio signal output by the second channel.
  • Step 802: perform fixed gain compensation and adaptive amplitude compensation on the second audio signal to obtain a third audio signal.
  • steps 801 to 802 are the signal compensation process.
  • Step 803: when the signal amplitude of the third audio signal is smaller than the first amplitude threshold (the first clipping flag is set initially), output the first audio signal.
  • the audio signal of the first channel is directly output.
  • Step 804: when the signal amplitude of the third audio signal is greater than or equal to the first amplitude threshold and less than the second amplitude threshold, fuse the first audio signal and the third audio signal based on the first dynamic proportion and the second dynamic proportion, and output the result.
  • Step 805: based on the second update step size, reduce the first dynamic proportion (increase the second dynamic proportion), and set the second clipping flag.
  • in steps 804 to 805, as the non-high sound pressure level signal gradually becomes a high sound pressure level signal, the fusion proportion of the second channel is dynamically increased and the fusion proportion of the first channel is decreased during signal fusion.
  • Step 806: when the signal amplitude of the third audio signal is greater than or equal to the second amplitude threshold, output the third audio signal.
  • the audio signal of the second channel is directly output.
  • Step 807: when the signal amplitude of the third audio signal is less than the first amplitude threshold (the second clipping flag is set at this point), fuse the first audio signal and the third audio signal based on the first dynamic proportion and the second dynamic proportion, and output the result.
  • Step 808: based on the first update step size, increase the first dynamic proportion (decrease the second dynamic proportion).
  • Step 809: when the first dynamic proportion is greater than or equal to 1, set the first clipping flag.
  • in steps 807 to 809, as the high sound pressure level signal gradually becomes a non-high sound pressure level signal, the fusion proportion of the first channel is dynamically increased and the fusion proportion of the second channel is decreased during signal fusion.
  • FIG. 9 shows a structural block diagram of an audio signal processing apparatus provided by an embodiment of the present application.
  • the apparatus implements the functions performed by the computer device in the above method embodiment; these functions may be realized by hardware, or by hardware executing corresponding software.
  • the apparatus may include:
  • a signal acquisition module 901 configured to acquire the first audio signal output by the first channel and the second audio signal output by the second channel, where the first audio signal is a signal obtained by performing analog-to-digital conversion on the original audio signal that has undergone the first analog gain, the second audio signal is a signal obtained by performing analog-to-digital conversion on the original audio signal that has undergone the second analog gain, and the first analog gain is greater than the second analog gain;
  • a signal fusion module 902 configured to perform signal fusion based on the first audio signal and the second audio signal to obtain a target audio signal, wherein the dynamic range of the target audio signal is the superposition of the dynamic ranges of the first audio signal and the second audio signal.
  • the signal fusion module 902 includes:
  • a compensation unit configured to perform signal compensation on the second audio signal to obtain a third audio signal, wherein the signal compensation is used to compensate the signal difference caused by the first analog gain and the second analog gain;
  • a fusion unit configured to perform signal fusion on the first audio signal and the third audio signal to obtain the target audio signal.
  • fusion unit specifically for:
  • the fusion unit when determining the respective fusion proportions of the first audio signal and the third audio signal, is specifically used for:
  • when the signal amplitude of the third audio signal is smaller than the first amplitude threshold and the first clipping flag is set, it is determined that the fusion proportion of the first audio signal is 1 and the fusion proportion of the third audio signal is 0, where the first clipping flag is used to represent that the first channel is not clipped;
  • when the signal amplitude of the third audio signal is greater than the second amplitude threshold, it is determined that the fusion proportion of the first audio signal is 0 and the fusion proportion of the third audio signal is 1, where the second amplitude threshold is greater than the first amplitude threshold;
  • the fusion proportion of the first audio signal is determined to be the first dynamic proportion and the fusion proportion of the third audio signal is determined to be the second dynamic proportion, where the second clipping flag is used to represent that the first channel is clipped, and the sum of the first dynamic proportion and the second dynamic proportion is 1.
  • the device further includes:
  • a first proportion updating module configured to update the first dynamic proportion and the second dynamic proportion based on the first update step size when the signal amplitude of the third audio signal is less than the first amplitude threshold and the second clipping flag is set, wherein the first dynamic proportion after the update is greater than the first dynamic proportion before the update, and the second dynamic proportion after the update is smaller than the second dynamic proportion before the update;
  • a second proportion updating module configured to update the first dynamic proportion based on a second update step size when the signal amplitude of the third audio signal is greater than the first amplitude threshold and less than the second amplitude threshold and the second dynamic proportion, wherein the first dynamic proportion after the update is smaller than the first dynamic proportion before the update, and the second dynamic proportion after the update is greater than the second dynamic proportion before the update.
  • the device further includes:
  • a first flag replacement module configured to replace the second clipping flag with the first clipping flag when the updated first dynamic proportion is greater than or equal to 1.
  • the device further includes:
  • a second flag replacement module configured to replace the first clipping flag with the second clipping flag when the signal amplitude of the third audio signal is greater than the first amplitude threshold.
  • compensation unit specifically for:
  • Gain compensation is performed on the second audio signal based on the analog gain difference to obtain the third audio signal.
  • compensation unit specifically for:
  • the first audio signal and the second audio signal are obtained by inputting the same original audio signal into the first channel and the second channel, which have different analog gains, so that the pickup of both non-high sound pressure level signals and high sound pressure level signals is taken into account; further, signal fusion is performed based on the first audio signal and the second audio signal to output a target audio signal. Because the dynamic range of the target audio signal is the superposition of the dynamic ranges of the first audio signal and the second audio signal, the dynamic range of the audio signal is expanded, which helps to improve the recording quality of non-high sound pressure level signals and high sound pressure level signals and realizes high dynamic range recording.
  • the computer device performs fixed gain compensation and adaptive amplitude compensation on the second audio signal, so that the third audio signal obtained after compensation is as close to the first audio signal as possible, which helps to improve the quality of subsequent signal fusion.
  • the computer device determines the clipping situation of the first channel based on the signal amplitude of the third audio signal, and then dynamically determines the fusion proportions of the first audio signal and the third audio signal during signal fusion based on that clipping situation and the signal amplitude. This avoids jumps in the target audio signal and improves the fusion quality and smoothness of the fused target audio signal.
  • FIG. 10 shows a structural block diagram of a computer device provided by an exemplary embodiment of the present application.
  • the computer device 1000 may be a smartphone, a tablet computer, a wearable device, or the like.
  • the computer device 1000 in this application may include one or more of the following components: a processor 1010 and a memory 1020 .
  • Processor 1010 may include one or more processing cores.
  • the processor 1010 uses various interfaces and lines to connect various parts of the entire computer device 1000, and executes by running or executing the instructions, programs, code sets or instruction sets stored in the memory 1020, and calling the data stored in the memory 1020.
  • the processor 1010 may adopt at least one of digital signal processing (Digital Signal Processing, DSP), field-programmable gate array (Field-Programmable Gate Array, FPGA), and programmable logic array (Programmable Logic Array, PLA).
  • the processor 1010 may integrate one or more of a central processing unit (Central Processing Unit, CPU), a graphics processor (Graphics Processing Unit, GPU), a neural network processor (Neural-network Processing Unit, NPU), and a modem, etc.
  • the CPU mainly processes the operating system, user interface and application programs, etc.
  • the GPU is used for rendering and drawing the content that needs to be displayed on the touch display screen 1030
  • the NPU is used for implementing artificial intelligence (Artificial Intelligence, AI) functions
  • the modem is used for handling wireless communications. It can be understood that the above-mentioned modem may not be integrated into the processor 1010 and may instead be implemented by a separate chip.
  • the memory 1020 may include random access memory (Random Access Memory, RAM), or may include read-only memory (Read-Only Memory, ROM).
  • the memory 1020 includes a non-transitory computer-readable storage medium.
  • Memory 1020 may be used to store instructions, programs, codes, sets of codes, or sets of instructions.
  • the memory 1020 may include a stored program area and a stored data area, wherein the stored program area may store instructions for implementing an operating system, instructions for at least one function (such as a touch function, a sound playback function, an image playback function, etc.), instructions for implementing the various method embodiments, and the like; the stored data area may store data (such as audio data and a phone book) created through use of the computer device 1000.
  • the computer device 1000 is provided with a microphone 1030, and the microphone 1030 may be a built-in microphone of the computer device 1000 or an external microphone connected to the computer device 1000 through a microphone interface.
  • the computer device 1000 is further provided with a first channel and a second channel (audio circuits), and the microphone 1030 is connected to both the first channel and the second channel, wherein the first channel is provided with a high-gain PGA and a first ADC, and the second channel is provided with a low-gain PGA and a second ADC.
  • the first channel and the second channel each apply analog gain and analog-to-digital conversion to the original audio signal output by the microphone, and input the two converted audio signals to the processor 1010, which performs the audio signal processing.
  • the structure of the computer device 1000 shown in the above drawings does not constitute a limitation on the computer device, and the computer device may include more or fewer components than those shown in the drawings, combine certain components, or use a different component arrangement.
  • the computer device 1000 also includes components such as a display screen, a sensor, a speaker, and a power supply, which will not be repeated here.
  • Embodiments of the present application further provide a computer-readable storage medium, where at least one piece of program code is stored in the computer-readable storage medium, and the program code is loaded and executed by a processor to implement the audio signal processing method according to the above embodiments.
  • a computer program product or computer program comprising computer instructions stored in a computer readable storage medium.
  • the processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device performs the audio signal processing methods provided in various optional implementations of the above aspects.
  • references herein to "a plurality" mean two or more.
  • "And/or", which describes the association relationship of the associated objects, means that three kinds of relationships are possible; for example, A and/or B can mean that A exists alone, that A and B exist at the same time, or that B exists alone.
  • the character "/" generally indicates that the associated objects are in an "or" relationship.
  • the numbering of the steps described in this document only exemplarily shows one possible execution order of the steps. In some other embodiments, the steps may be executed in a different order; for example, two steps with different numbers may be performed at the same time, or in the reverse of the order shown in the figure, which is not limited in the embodiments of the present application.
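
The fusion procedure of steps 803 to 809 listed above can be sketched per sample as a small state machine. This is only an illustrative reading of the text, not the applicant's implementation: the names `FusionState`, `step_size`, and `fuse_sample` are hypothetical, `x1` is taken to be the first audio signal and `x2_c` the third (compensated) audio signal, and the rule `1 / (sample_rate * ramp_seconds)` is merely one assumed way to make the update step size negatively correlated with the sampling rate. The text does not say what happens to g when the amplitude reaches the second threshold, so the sketch leaves g unchanged in that branch.

```python
from dataclasses import dataclass


@dataclass
class FusionState:
    g: float = 1.0         # first dynamic proportion (weight of the first audio signal)
    clipped: bool = False  # False: first clipping flag, True: second clipping flag


def step_size(sample_rate_hz: float, ramp_seconds: float) -> float:
    """Hypothetical rule: the step shrinks as the sampling rate grows."""
    return 1.0 / (sample_rate_hz * ramp_seconds)


def fuse_sample(x1, x2_c, state, thr1, thr2, delta_att, delta_rel):
    """Return one fused output sample and update the state in place."""
    amplitude = abs(x2_c)
    if amplitude >= thr2:
        # Step 806: output the third audio signal directly.
        # g is left untouched because the text does not specify otherwise.
        state.clipped = True
        return x2_c
    if amplitude >= thr1:
        # Steps 804-805: fuse, then lower g by the second update step size.
        y = state.g * x1 + (1.0 - state.g) * x2_c
        state.g = max(state.g - delta_att, 0.0)
        state.clipped = True
        return y
    if not state.clipped:
        # Step 803: the first channel is not clipped, so output it directly.
        return x1
    # Steps 807-809: fuse, then raise g by the first update step size.
    y = state.g * x1 + (1.0 - state.g) * x2_c
    state.g += delta_rel
    if state.g >= 1.0:
        state.g = 1.0
        state.clipped = False  # back to the first clipping flag
    return y
```

Because g only moves by one step per sample, the output cross-fades between the two channels instead of switching abruptly, which is the smoothness property the text attributes to the dynamic proportions.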

Abstract

An audio signal processing method and apparatus, a terminal, and a storage medium, relating to the technical field of audio. The method comprises: obtaining a first audio signal outputted by a first channel and a second audio signal outputted by a second channel, the first audio signal being a signal obtained by performing analog-to-digital conversion on an original audio signal subjected to a first analog gain, the second audio signal being a signal obtained by performing analog-to-digital conversion on the original audio signal subjected to a second analog gain, and the first analog gain being greater than the second analog gain (302); and performing signal fusion on the basis of the first audio signal and the second audio signal to obtain a target audio signal, a dynamic range of the target audio signal being a superposition of dynamic ranges of the first audio signal and the second audio signal (304). By adoption of the solution provided by embodiments of the present application, signal pickup of both a non-high sound pressure level signal and a high sound pressure level signal is considered, the dynamic range of the audio signal is expanded, and high dynamic range recording is realized.

Description

Audio signal processing method, apparatus, terminal, and storage medium
This application claims priority to Chinese Patent Application No. 202110301715.0, filed on March 22, 2021 and entitled "Audio signal processing method, apparatus, terminal, and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
The embodiments of the present application relate to the field of audio technology, and in particular to an audio signal processing method, apparatus, terminal, and storage medium.
Background
Recording is the process of converting an analog audio signal captured by a microphone into a digital audio signal and storing it.
In the related art, the analog audio signal collected by the microphone is amplified by a gain stage and then input to an analog-to-digital converter (ADC). The ADC converts the analog audio signal into a digital audio signal, which is then processed by a digital signal processor (DSP) and output for storage.
Summary
The embodiments of the present application provide an audio signal processing method, apparatus, terminal, and storage medium. The technical solution is as follows:
In one aspect, an embodiment of the present application provides an audio signal processing method, the method comprising:
obtaining a first audio signal output by a first channel and a second audio signal output by a second channel, where the first audio signal is a signal obtained by performing analog-to-digital conversion on an original audio signal that has undergone a first analog gain, the second audio signal is a signal obtained by performing analog-to-digital conversion on the original audio signal that has undergone a second analog gain, and the first analog gain is greater than the second analog gain;
performing signal fusion based on the first audio signal and the second audio signal to obtain a target audio signal, where the dynamic range of the target audio signal is the superposition of the dynamic ranges of the first audio signal and the second audio signal.
In another aspect, an embodiment of the present application provides an audio signal processing apparatus, the apparatus comprising:
a signal acquisition module, configured to obtain a first audio signal output by a first channel and a second audio signal output by a second channel, where the first audio signal is a signal obtained by performing analog-to-digital conversion on an original audio signal that has undergone a first analog gain, the second audio signal is a signal obtained by performing analog-to-digital conversion on the original audio signal that has undergone a second analog gain, and the first analog gain is greater than the second analog gain;
a signal fusion module, configured to perform signal fusion based on the first audio signal and the second audio signal to obtain a target audio signal, where the dynamic range of the target audio signal is the superposition of the dynamic ranges of the first audio signal and the second audio signal.
In another aspect, an embodiment of the present application provides a computer-readable storage medium storing at least one piece of program code, where the program code is loaded and executed by a processor to implement the audio signal processing method described in the above aspect.
In another aspect, an embodiment of the present application provides a computer program product or computer program that includes computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, so that the computer device performs the audio signal processing method provided in the various optional implementations of the above aspects.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of an audio recording process in the related art;
FIG. 2 is a schematic diagram of an audio recording process in an embodiment of the present application;
FIG. 3 is a flowchart of an audio signal processing method provided by an exemplary embodiment of the present application;
FIG. 4 is a flowchart of an audio signal processing method provided by another exemplary embodiment of the present application;
FIG. 5 is a schematic diagram of the principle of an audio recording process shown in an exemplary embodiment of the present application;
FIG. 6 is a schematic diagram of a signal compensation process shown in an exemplary embodiment of the present application;
FIG. 7 is a flowchart of an audio signal processing method provided by another exemplary embodiment of the present application;
FIG. 8 is a schematic diagram of the implementation of an audio signal processing process shown in an exemplary embodiment of the present application;
FIG. 9 shows a structural block diagram of an audio signal processing apparatus provided by an embodiment of the present application;
FIG. 10 shows a structural block diagram of a computer device provided by an exemplary embodiment of the present application.
Detailed Description
To make the objectives, technical solutions, and advantages of the present application clearer, the embodiments of the present application are described in further detail below with reference to the accompanying drawings.
In the related art, the audio recording process is shown in FIG. 1. The original audio signal (an analog signal) output by the microphone 101 is first input to a programmable gain amplifier (PGA) 102, which applies analog gain to the original audio signal in order to reduce the equivalent input noise of the ADC 103 (the ADC itself introduces quantization noise and electrical noise, which raise the signal noise floor). The gained original audio signal is input to the ADC 103, which converts it from an analog audio signal into a digital audio signal and inputs the digital audio signal to the DSP 104. The DSP 104 further processes the digital audio signal and finally outputs the processed digital audio signal, which is used to generate an audio file.
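The effect of analog gain on the ADC's equivalent input noise can be illustrated with a small simulation. The sketch below is an assumption-laden illustration rather than anything specified in the application: it models only quantization noise with an ideal 16-bit quantizer, uses a quiet 997 Hz tone sampled at 48 kHz, and the function names are hypothetical. It measures the error of the digitized signal after dividing the gain back out, that is, the error referred to the input.

```python
import math


def quantize(x, bits=16):
    """Ideal ADC model: clip to full scale, then round to a uniform grid."""
    step = 2.0 / (2 ** bits)
    x = max(-1.0, min(1.0 - step, x))
    return round(x / step) * step


def input_referred_rms_error(gain_db, amplitude=0.001, n=4800, bits=16):
    """RMS error of a quiet tone after gain, quantization, and gain removal."""
    gain = 10.0 ** (gain_db / 20.0)
    err_sq = 0.0
    for i in range(n):
        s = amplitude * math.sin(2.0 * math.pi * 997.0 * i / 48000.0)
        digital = quantize(gain * s, bits)
        err_sq += (digital / gain - s) ** 2
    return math.sqrt(err_sq / n)
```

Comparing `input_referred_rms_error(30)` with `input_referred_rms_error(10)` shows a smaller input-referred error for the higher gain, which is why a PGA is placed in front of the ADC.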
Although analog gain can reduce the equivalent input noise of the ADC, it also raises the amplitude of the original audio signal. If the amplitude after analog gain is too large (especially when picking up a high sound pressure level signal), the signal is clipped at the ADC and the audio is distorted. Audio recording in this manner therefore cannot both reduce the equivalent input noise of the ADC and pick up audio signals (especially high sound pressure level signals) without distortion.
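The clipping side of this trade-off can be sketched in the same spirit. The example below assumes an ADC with a full-scale range of ±1.0 and counts how many samples of one sine period exceed it after gain; the function names and the amplitudes used are hypothetical and serve only as an illustration.

```python
import math


def adc_clip(x, full_scale=1.0):
    """Saturate a sample at the ADC full-scale range."""
    return max(-full_scale, min(full_scale, x))


def clipped_fraction(amplitude, gain_db, n=1000):
    """Fraction of samples of one sine period that are clipped after gain."""
    gain = 10.0 ** (gain_db / 20.0)
    clipped = 0
    for i in range(n):
        s = gain * amplitude * math.sin(2.0 * math.pi * i / n)
        if adc_clip(s) != s:
            clipped += 1
    return clipped / n
```

With these assumptions, a loud input of amplitude 0.1 is clipped at 30 dB of gain but not at 10 dB, while a quiet input of amplitude 0.001 passes through 30 dB of gain unclipped.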
In the solution provided by the embodiments of the present application, two channels are set up that apply different degrees of analog gain, followed by analog-to-digital conversion, to the original audio signal, so that the pickup of both non-high sound pressure level signals and high sound pressure level signals is taken into account. Further, the audio signals output by the two channels are fused, which expands the dynamic range of the final output audio signal (the superposition of the dynamic ranges of the two audio signals) and realizes high dynamic range recording.
As shown in FIG. 2, the original audio signal output by the microphone 201 is input to the first channel and the second channel respectively. In the first channel, the high-gain PGA 202 applies analog gain to the original audio signal, and the first ADC 203 performs analog-to-digital conversion on the gained signal to obtain the first audio signal. In the second channel, the low-gain PGA 204 applies analog gain to the original audio signal, and the second ADC 205 performs analog-to-digital conversion on the gained signal to obtain the second audio signal. Further, the DSP 206 fuses the first audio signal and the second audio signal through an algorithm and finally outputs a high dynamic range target audio signal.
The audio signal processing method provided by the embodiments of the present application can be applied to a computer device with audio signal processing capability. The computer device may be a smartphone, a tablet computer, a wearable device, a personal computer, a vehicle-mounted terminal, or the like, which is not limited in this embodiment.
In addition, the computer device may collect signals through a built-in microphone or through an external microphone, which is not limited in the embodiments of the present application.
Furthermore, the solution provided by the embodiments of the present application may be executed by a processor in the computer device. The processor may be a DSP or an application processor (AP), which is not limited in the embodiments of the present application. For convenience of description, the following embodiments are described with the audio signal processing method being performed by a computer device.
Please refer to FIG. 3, which shows a flowchart of an audio signal processing method provided by an exemplary embodiment of the present application. The method includes:
Step 302: obtain a first audio signal output by a first channel and a second audio signal output by a second channel, where the first audio signal is a signal obtained by performing analog-to-digital conversion on the original audio signal that has undergone a first analog gain, the second audio signal is a signal obtained by performing analog-to-digital conversion on the original audio signal that has undergone a second analog gain, and the first analog gain is greater than the second analog gain.
To reduce the clipping probability of high sound pressure level signals while still reducing the equivalent input noise, in the embodiments of the present application the original audio signal passes through two paths with different degrees of analog gain. In the first channel, the original audio signal undergoes the first analog gain; in the second channel, it undergoes the second analog gain.
Because the first analog gain is greater than the second analog gain, the first channel has lower equivalent input noise and is suited to picking up non-high sound pressure level signals (after a high analog gain, a non-high sound pressure level signal has a low probability of clipping due to excessive amplitude), whereas the second channel is suited to picking up high sound pressure level signals (after a low analog gain, a high sound pressure level signal likewise has a low probability of clipping due to excessive amplitude).
In a possible implementation, the computer device applies analog gain to the original audio signal through the PGA provided on each channel, and the first analog gain and the second analog gain are either fixed gains or can be adjusted as required. Illustratively, the first analog gain is 30 dB and the second analog gain is 10 dB.
进一步的,第一通道中的ADC对经过第一模拟增益的原始音频信号(模拟信号)进行模数转换,得到第一音频信号(数字信号);第二通道中的ADC对经过第二模拟增益的原始音频信号进行模数转换,得到第二音频信号。其中,当原始音频信号为非高声压级信号时,两条通道发生信号削波的概率均较小,且第一通道中的等效输入底噪小于第二通道;当原始音频信号为高声压级信号时,第二通道发生信号削波的概率低于第一通道,在大动态范围信号采集场景下,兼顾非高声压级信号和高声压级信号的拾取质量。Further, the ADC in the first channel performs analog-to-digital conversion on the original audio signal (analog signal) that has undergone the first analog gain to obtain the first audio signal (digital signal); the ADC in the second channel is subjected to the second analog gain. The original audio signal is converted from analog to digital to obtain the second audio signal. Among them, when the original audio signal is a non-high sound pressure level signal, the probability of signal clipping in both channels is small, and the equivalent input noise floor in the first channel is smaller than that in the second channel; when the original audio signal is high In the case of sound pressure level signals, the probability of signal clipping in the second channel is lower than that in the first channel. In the large dynamic range signal acquisition scenario, the pickup quality of non-high sound pressure level signals and high sound pressure level signals is considered.
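The trade-off between the two channels can be illustrated with a small simulation. This is only a sketch under stated assumptions: the ADC is modeled as a hard clipper at a hypothetical full scale of 1.0, quantization is ignored, and the function names and test amplitudes are made up for illustration. Only the 30 dB and 10 dB gains come from the text.

```python
import math

def db_to_linear(gain_db):
    """Convert an amplitude gain in dB to a linear factor."""
    return 10 ** (gain_db / 20)

def channel(samples, gain_db, full_scale=1.0):
    """Hypothetical model of one channel: analog gain followed by an ADC
    that hard-clips at +/- full_scale (quantization is ignored)."""
    g = db_to_linear(gain_db)
    return [max(-full_scale, min(full_scale, s * g)) for s in samples]

def clipped(x, full_scale=1.0):
    """True if any sample sits at the clipping limit."""
    return any(abs(s) >= full_scale for s in x)

# One cycle of a quiet and a loud sine wave (amplitudes are illustrative).
quiet = [0.01 * math.sin(2 * math.pi * n / 48) for n in range(48)]
loud = [0.2 * math.sin(2 * math.pi * n / 48) for n in range(48)]

# Illustrative gains from the text: 30 dB (first channel), 10 dB (second channel).
x1_quiet, x2_quiet = channel(quiet, 30), channel(quiet, 10)
x1_loud, x2_loud = channel(loud, 30), channel(loud, 10)
```

In this toy setup the quiet input survives both channels unclipped, while the loud input clips only in the high-gain first channel, which is the situation the fusion step is designed to handle.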
Step 304: Perform signal fusion based on the first audio signal and the second audio signal to obtain a target audio signal, where the dynamic range of the target audio signal is the superposition of the dynamic ranges of the first audio signal and the second audio signal.
Because there are two audio signals but a single audio signal must ultimately be output, the computer device needs to fuse the two audio signals to obtain the target audio signal. During signal fusion, the computer device fuses the first audio signal and the second audio signal based on the respective signal pickup advantages of the first channel and the second channel. The first channel has the advantage for non-high sound pressure level signals and the second channel has the advantage for high sound pressure level signals, so the first audio signal has the greater influence on the non-high sound pressure level portions of the target audio signal, and the second audio signal has the greater influence on the high sound pressure level portions.
In a possible implementation, the computer device fuses the two audio signals at the sample level or at the frame level during signal fusion, which is not limited in this embodiment.
For example, at a sampling rate of 48 kHz, the computer device fuses, every 1/48000 second, the samples of the first audio signal and the second audio signal at the same sampling instant; alternatively, the computer device fuses, every 10 ms, the samples of the first audio signal and the second audio signal within the same frame (480 samples).
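The difference between the two granularities is only the block length used when walking the two signals in lockstep. A minimal sketch, assuming hypothetical helper names and a caller-supplied fusion rule (the actual fusion rule is described in later embodiments):

```python
def frame_length(sample_rate_hz, frame_ms):
    """Number of samples in one frame of frame_ms milliseconds."""
    return sample_rate_hz * frame_ms // 1000

def fuse_in_blocks(x1, x2, block_len, fuse_block):
    """Walk both signals in lockstep, block_len samples at a time.

    block_len=1 corresponds to sample-level fusion; block_len=480
    corresponds to 10 ms frames at 48 kHz. fuse_block is a hypothetical
    caller-supplied rule that maps two aligned blocks to one output block.
    """
    out = []
    for start in range(0, min(len(x1), len(x2)), block_len):
        out.extend(fuse_block(x1[start:start + block_len],
                              x2[start:start + block_len]))
    return out
```

For the figures in the text, `frame_length(48000, 10)` evaluates to 480.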
Optionally, the computer device performs signal fusion by executing a fusion algorithm on a DSP or an AP; the specific fusion process is described in detail in the following embodiments.
Because the signal pickup advantages of the two channels are combined, the dynamic range of the target audio signal (that is, the range within which the pickup quality is good) combines the dynamic ranges of the first audio signal and the second audio signal. In other words, the target audio signal has a larger dynamic range than the audio signal output by a single channel. Illustratively, the dynamic range of the first audio signal is 30 dB to 90 dB, that of the second audio signal is 60 dB to 120 dB, and that of the target audio signal is 30 dB to 120 dB.
Optionally, the computer device may further perform noise reduction, dynamic range processing, limiting, and spectral equalization on the target audio signal, and finally save the processed audio signal as a recording file; details are not repeated here.
Compared with solutions in the related art, the solution provided by the embodiments of the present application can record with a larger dynamic range. For example, when recording a drum kit performance or a concert (a high sound pressure level scene), clipping distortion during recording is reduced; when recording in a quiet environment (a non-high sound pressure level scene), more sound detail is retained.
To sum up, in the embodiments of the present application, the same original audio signal is input into a first channel and a second channel with different analog gains to obtain a first audio signal and a second audio signal, so that both non-high and high sound pressure level signals are picked up well. Further, signal fusion is performed based on the first audio signal and the second audio signal to output a single target audio signal. Because the dynamic range of the target audio signal is the superposition of the dynamic ranges of the first audio signal and the second audio signal, the dynamic range of the audio signal is expanded, which helps improve the recording quality of both non-high and high sound pressure level signals and achieves high dynamic range recording.
In a possible implementation, performing signal fusion based on the first audio signal and the second audio signal to obtain the target audio signal includes:
performing signal compensation on the second audio signal to obtain a third audio signal, where the signal compensation is used to compensate for the signal difference caused by the first analog gain and the second analog gain;
performing signal fusion on the first audio signal and the third audio signal to obtain the target audio signal.
In a possible implementation, performing signal fusion on the first audio signal and the third audio signal to obtain the target audio signal includes:
determining the fusion proportions respectively corresponding to the first audio signal and the third audio signal;
performing signal fusion on the first audio signal and the third audio signal based on the fusion proportions to obtain the target audio signal.
In a possible implementation, determining the fusion proportions respectively corresponding to the first audio signal and the third audio signal includes:
when the signal amplitude of the third audio signal is less than a first amplitude threshold and a first clipping flag is set, determining that the fusion proportion of the first audio signal is 1 and the fusion proportion of the third audio signal is 0, where the first clipping flag indicates that no clipping has occurred in the first channel;
when the signal amplitude of the third audio signal is greater than a second amplitude threshold, determining that the fusion proportion of the first audio signal is 0 and the fusion proportion of the third audio signal is 1, where the second amplitude threshold is greater than the first amplitude threshold;
when the signal amplitude of the third audio signal is less than the first amplitude threshold and a second clipping flag is set, or when the signal amplitude of the third audio signal is greater than the first amplitude threshold and less than the second amplitude threshold, determining that the fusion proportion of the first audio signal is a first dynamic proportion and the fusion proportion of the second audio signal is a second dynamic proportion, where the second clipping flag indicates that clipping has occurred in the first channel, and the first dynamic proportion and the second dynamic proportion sum to 1.
In a possible implementation, after determining that the fusion proportion of the first audio signal is the first dynamic proportion and the fusion proportion of the second audio signal is the second dynamic proportion, the method further includes:
when the signal amplitude of the third audio signal is less than the first amplitude threshold and the second clipping flag is set, updating the first dynamic proportion and the second dynamic proportion based on a first update step size, where the updated first dynamic proportion is greater than the first dynamic proportion before the update, and the updated second dynamic proportion is less than the second dynamic proportion before the update;
when the signal amplitude of the third audio signal is greater than the first amplitude threshold and less than the second amplitude threshold, updating the first dynamic proportion and the second dynamic proportion based on a second update step size, where the updated first dynamic proportion is less than the first dynamic proportion before the update, and the updated second dynamic proportion is greater than the second dynamic proportion before the update.
In a possible implementation, after updating the first dynamic proportion and the second dynamic proportion based on the first update step size, the method further includes:
when the updated first dynamic proportion is greater than or equal to 1, replacing the second clipping flag with the first clipping flag.
Optionally, the method further includes:
when the signal amplitude of the third audio signal is greater than the first amplitude threshold and the first clipping flag is set, replacing the first clipping flag with the second clipping flag.
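The proportion and flag rules above can be read as a small per-sample state machine. The following is a minimal sketch of one such reading, not the applicant's implementation. The class name, the parameter names `thrd1`, `thrd2`, `step_up`, and `step_down`, the treatment of amplitudes exactly equal to a threshold, and the clamping of the dynamic proportion to the range 0 to 1 are all assumptions made for illustration. The rules do not say whether the dynamic proportion changes while the amplitude is above the second threshold, so this sketch leaves it untouched in that case.

```python
class FusionWeights:
    """Tracks the clipping flag and the first dynamic proportion g."""

    def __init__(self, thrd1, thrd2, step_up, step_down):
        assert 0 < thrd1 < thrd2
        self.thrd1, self.thrd2 = thrd1, thrd2
        self.step_up = step_up      # hypothetical first update step size
        self.step_down = step_down  # hypothetical second update step size
        self.flag = 0               # 0: first clipping flag, 1: second clipping flag
        self.g = 1.0                # first dynamic proportion; the second is 1 - g

    def update(self, mag):
        """Return (first-signal weight, third-signal weight) for one amplitude
        value of the third audio signal, then update g and the flag."""
        if mag >= self.thrd1 and self.flag == 0:
            self.flag = 1           # the first channel may be clipping
        if mag >= self.thrd2:
            return 0.0, 1.0         # rely on the compensated signal only
        if mag < self.thrd1 and self.flag == 0:
            return 1.0, 0.0         # rely on the first signal only
        weights = (self.g, 1.0 - self.g)
        if mag < self.thrd1:
            # Flag still set: ramp back toward the first signal.
            self.g = min(1.0, self.g + self.step_up)
            if self.g >= 1.0:
                self.flag = 0       # restore the first clipping flag
        else:
            # Between the thresholds: ramp toward the compensated signal.
            self.g = max(0.0, self.g - self.step_down)
        return weights
```

Calling `update` once per sample (or once per frame, with a frame-level amplitude) yields weights that move gradually between the two signals instead of switching abruptly.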
In a possible implementation, performing signal compensation on the second audio signal to obtain the third audio signal includes:
determining the analog gain difference between the first analog gain and the second analog gain;
performing gain compensation on the second audio signal based on the analog gain difference to obtain the third audio signal.
In a possible implementation, performing signal compensation on the second audio signal to obtain the third audio signal includes:
determining the analog gain difference between the first analog gain and the second analog gain;
performing gain compensation on the second audio signal based on the analog gain difference to obtain a fourth audio signal;
determining the signal amplitude ratio of the first audio signal to the fourth audio signal;
performing amplitude compensation on the fourth audio signal based on the signal amplitude ratio to obtain the third audio signal.
Because the same original audio signal has undergone different analog gains, the first audio signal and the second audio signal differ considerably; if they were fused directly, the fused target audio signal would exhibit abrupt signal changes. To improve fusion quality, in a possible implementation the computer device first performs signal compensation on the second audio signal before fusion, thereby reducing the signal difference between the first audio signal and the second audio signal. An exemplary embodiment is described below.
Please refer to FIG. 4, which shows a flowchart of an audio signal processing method provided by another exemplary embodiment of the present application. The method includes:
Step 401: Obtain a first audio signal output by a first channel and a second audio signal output by a second channel, where the first audio signal is obtained by performing analog-to-digital conversion on the original audio signal after a first analog gain has been applied, the second audio signal is obtained by performing analog-to-digital conversion on the original audio signal after a second analog gain has been applied, and the first analog gain is greater than the second analog gain.
For the implementation of this step, refer to step 302 above; details are not repeated here.
As shown in FIG. 5, the original audio signal undergoes the first analog gain and analog-to-digital conversion to output the first audio signal, and undergoes the second analog gain and analog-to-digital conversion to output the second audio signal.
Step 402: Perform signal compensation on the second audio signal to obtain a third audio signal, where the signal compensation is used to compensate for the signal difference caused by the first analog gain and the second analog gain.
After the same original audio signal has undergone different analog gains, the resulting signal amplitudes differ. The computer device therefore first compensates the second audio signal based on the first audio signal, so that the compensated second audio signal is as close as possible to the first audio signal and the subsequent fusion operates on two audio signals of similar amplitude.
Because the difference between the first audio signal and the second audio signal is mainly caused by the analog gains, the computer device performs gain compensation on the second audio signal (unlike the analog gain of the PGA, this is a digital gain). Moreover, because the first analog gain is higher than the second analog gain, gain compensation must increase the gain of the second audio signal to obtain the third audio signal.
Because the first analog gain and the second analog gain of the two channels are known, in a possible implementation the computer device determines the analog gain difference between the first analog gain and the second analog gain, and performs gain compensation on the second audio signal based on that difference to obtain the third audio signal.
Optionally, the computer device determines a gain multiple based on the analog gain difference between the first analog gain and the second analog gain, and performs gain compensation on the second audio signal based on the gain multiple to obtain the third audio signal.
In an illustrative example, when the first audio signal is x1, the second audio signal is x2, the first analog gain is g1 dB, and the second analog gain is g2 dB, the computer device determines the gain multiple to be
Figure PCTCN2022076756-appb-000001
Correspondingly, the third audio signal is
Figure PCTCN2022076756-appb-000002
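The two formula images are not reproduced in this text. Under the standard decibel convention for amplitude gain, a gain difference of g1 − g2 dB corresponds to a linear factor of 10 raised to (g1 − g2)/20. A sketch under that assumption, where G is a label introduced here for the gain multiple and x2_g denotes the gain-compensated signal, following the notation used in the later steps:

```latex
G = 10^{(g_1 - g_2)/20}, \qquad x_{2\_g} = G \cdot x_2
```

With the illustrative gains of 30 dB and 10 dB mentioned earlier, this gives G = 10^{(30 − 10)/20} = 10.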
However, in practice the actual analog gain may differ from the preset analog gain (possibly because of the PGA), so a certain amplitude difference may remain between the second audio signal after fixed gain compensation and the first audio signal. To further reduce the difference between the audio signals, as shown in FIG. 5, the second audio signal first undergoes fixed gain compensation to obtain a fourth audio signal and then undergoes adaptive amplitude compensation (fine-tuning) to obtain the third audio signal.
Specifically, during adaptive amplitude compensation, as shown in FIG. 6, the computer device calculates the signal amplitudes of the first audio signal and the fourth audio signal, determines the signal amplitude difference between them, and performs amplitude compensation on the fourth audio signal based on that difference to obtain the third audio signal.
In a possible implementation, as shown in FIG. 7, this step may include the following steps:
Step 402A: Determine the analog gain difference between the first analog gain and the second analog gain.
Step 402B: Perform gain compensation on the second audio signal based on the analog gain difference to obtain a fourth audio signal.
For the implementation of steps 402A and 402B, refer to the foregoing embodiment; details are not repeated here.
Step 402C: Determine the signal amplitude ratio of the first audio signal to the fourth audio signal.
Optionally, the computer device calculates the signal amplitude of the first audio signal and the signal amplitude of the fourth audio signal, and determines the signal amplitude difference from the ratio between the two.
In a possible implementation, the signal amplitude of an audio signal is represented by its energy envelope; correspondingly, the signal amplitude ratio of the first audio signal to the fourth audio signal can be represented by the ratio of their energy envelopes, that is, ratio = energy envelope of the first audio signal / energy envelope of the fourth audio signal.
Of course, the signal amplitude of an audio signal may also be represented by parameters other than the energy envelope, which is not limited in this embodiment.
Step 402D: Perform amplitude compensation on the fourth audio signal based on the signal amplitude ratio to obtain the third audio signal.
Further, the computer device performs amplitude compensation on the fourth audio signal based on the determined signal amplitude ratio to obtain the third audio signal.
Continuing the example from the steps above, the third audio signal obtained through amplitude compensation can be expressed as x2_c = x2_g × ratio, where x2_g is the gain-compensated fourth audio signal.
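Steps 402A to 402D can be sketched end to end as follows. This is an illustration under stated assumptions rather than the method as filed: the text does not define how the energy envelope is computed, so the sketch substitutes the root-mean-square over a block, which makes the ratio an amplitude ratio that can scale samples directly. The function names, the `eps` guard, and the 9.5 dB "actual" gain used to simulate PGA error are hypothetical.

```python
import math

def fixed_gain_compensation(x2, g1_db, g2_db):
    """Steps 402A/402B: scale x2 by the nominal analog gain difference."""
    factor = 10 ** ((g1_db - g2_db) / 20)
    return [s * factor for s in x2]

def rms(x):
    """Stand-in for the energy envelope (an assumption of this sketch)."""
    return math.sqrt(sum(s * s for s in x) / len(x))

def amplitude_compensation(x1, x2_g, eps=1e-12):
    """Steps 402C/402D: x2_c = x2_g * ratio, with ratio taken from an
    unclipped stretch of x1 and the matching stretch of x2_g."""
    ratio = rms(x1) / max(rms(x2_g), eps)
    return [s * ratio for s in x2_g], ratio

# Simulated input: the second channel's real gain is 9.5 dB, not the nominal 10 dB.
source = [0.01 * math.sin(2 * math.pi * n / 48) for n in range(480)]
x1 = [s * 10 ** (30 / 20) for s in source]
x2 = [s * 10 ** (9.5 / 20) for s in source]

x2_g = fixed_gain_compensation(x2, 30, 10)      # fourth audio signal
x2_c, ratio = amplitude_compensation(x1, x2_g)  # third audio signal
```

In this simulation the fixed compensation leaves a residual 0.5 dB mismatch, and the ratio step removes it so that `x2_c` lines up with `x1`. The ratio is only meaningful when the first channel is not clipping over the measured block.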
Step 403: Perform signal fusion on the first audio signal and the third audio signal to obtain the target audio signal.
Because the signal captured by the microphone may change at any moment, that is, high and non-high sound pressure level signals may alternate, the computer device needs to fuse the first audio signal and the third audio signal dynamically based on the real-time sound pressure level of the signal.
To improve the overall recording quality across a high dynamic range, in a possible implementation, when the signal pickup quality of the first channel is higher than that of the second channel (a non-high sound pressure level signal), the computer device increases the proportion of the first audio signal and decreases the proportion of the third audio signal during fusion, improving the recording quality of non-high sound pressure level signals; when the signal pickup quality of the second channel is higher than that of the first channel (a high sound pressure level signal), the computer device increases the proportion of the third audio signal and decreases the proportion of the first audio signal during fusion, improving the recording quality of high sound pressure level signals. Optionally, this step may include the following steps.
1. Determine the fusion proportions respectively corresponding to the first audio signal and the third audio signal.
The fusion proportions of the first audio signal and the third audio signal sum to 1. For example, if the fusion proportion of the first audio signal is a, the fusion proportion of the third audio signal is 1-a, with 0≤a≤1.
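One natural reading of proportions that sum to 1 is a per-sample weighted sum. The expression below is a sketch under that assumption; the symbol y for the target audio signal and the sample index n are introduced here for illustration, while x1 and x2_c follow the notation of the surrounding examples.

```latex
y[n] = a \, x_1[n] + (1 - a) \, x_{2\_c}[n], \qquad 0 \le a \le 1
```

With a = 1 the output is the first audio signal alone, and with a = 0 it is the compensated third audio signal alone.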
Because the first analog gain at the first channel is relatively large, clipping may occur when the original audio signal that has undergone the first analog gain is converted from analog to digital in the first channel, whereas the second analog gain at the second channel is small, so clipping is unlikely there. If the first audio signal were analyzed directly to determine whether clipping has occurred in the first channel (for example, by treating the first audio signal as clipped wherever it is at its maximum amplitude or above a preset amplitude), the probability of misjudgment would be high. To identify clipping in the first channel more accurately, in the embodiments of the present application the computer device determines the clipping state of the first channel based on the third audio signal. Further, the computer device dynamically determines the fusion proportions of the first audio signal and the third audio signal based on the clipping state of the first channel and the real-time signal amplitude of the third audio signal.
In a possible implementation, the fusion proportions respectively corresponding to the first audio signal and the third audio signal fall into the following cases.
Case 1: When the signal amplitude of the third audio signal is less than the first amplitude threshold and the first clipping flag is set, the fusion proportion of the first audio signal is determined to be 1 and that of the third audio signal to be 0, where the first clipping flag indicates that no clipping has occurred in the first channel.
In this embodiment, two amplitude thresholds are set in the computer device: a first amplitude threshold and a second amplitude threshold, where the second is greater than the first. Optionally, when the signal amplitude of the third audio signal is less than the first amplitude threshold, the computer device determines that no clipping has occurred in the first channel; when it is greater than or equal to the first amplitude threshold but less than the second amplitude threshold, the computer device determines that clipping may have occurred in the first channel; when it is greater than or equal to the second amplitude threshold, the computer device determines that clipping has occurred in the first channel.
In some embodiments, the computer device sets the first amplitude threshold and the second amplitude threshold based on the analog gains of the two channels, which is not limited in this embodiment.
Optionally, the computer device maintains a clipping flag bit (flag) that represents the clipping state of the first channel. When the flag bit holds the first clipping flag (flag=0), no clipping has occurred in the first channel; when it holds the second clipping flag (flag=1), clipping has occurred in the first channel.
Regarding how the clipping flag is set, in a possible implementation the flag bit initially holds the first clipping flag. When the signal amplitude of the third audio signal is greater than the first amplitude threshold, the computer device determines that clipping may have occurred in the first channel and changes the first clipping flag to the second clipping flag. To keep the fused target audio signal smooth, when the second clipping flag is currently set and the signal amplitude of the third audio signal falls below the first amplitude threshold, the computer device does not immediately change the second clipping flag back to the first clipping flag; it does so only after the signal amplitude of the third audio signal has remained below the first amplitude threshold for a certain duration.
Optionally, when the signal amplitude of the third audio signal is less than the first amplitude threshold and the first clipping flag is set, the first channel has not clipped in the recent period. Without clipping, the signal pickup quality of the first channel is better than that of the second channel (because the equivalent input noise of the first channel is lower), so the computer device determines that the fusion proportion of the first audio signal is 1 and that of the third audio signal is 0; that is, the third audio signal is not blended in during signal fusion.
Illustratively, the signal amplitude of the third audio signal x2_c is mag=abs(x2_c); when mag<thrd1 and flag=0, the computer device determines that the fusion proportion of the first audio signal x1 is 1 and that of the third audio signal x2_c is 0.
情况二、在第三音频信号的信号幅度大于第二幅度阈值的情况下,确定第一音频信号的融合比重为0,第三音频信号的融合比重为1,第二幅度阈值大于第一幅度阈值。Situation 2: When the signal amplitude of the third audio signal is greater than the second amplitude threshold, it is determined that the fusion proportion of the first audio signal is 0, the fusion proportion of the third audio signal is 1, and the second amplitude threshold is greater than the first amplitude threshold. .
由于在第三音频信号的信号幅度大于第二幅度阈值的情况下(因为大于第二幅度阈值前先达到第一幅度阈值,因此当削波标识位设置有第二削波标识),第一通道必然发生削波,而在发生削波的情况下,第一音频信号将会出现失真的问题,因此为了避免最终输出的音频信号失真,计算机设备确定第一音频信号的融合比重为0,第三音频信号的融合比重为1,即在信号融合时不融入第一音频信号。Since the signal amplitude of the third audio signal is greater than the second amplitude threshold (because the first amplitude threshold is reached before being greater than the second amplitude threshold, when the clipping flag is set with the second clipping flag), the first channel Clipping will inevitably occur, and in the case of clipping, the first audio signal will be distorted. Therefore, in order to avoid distortion of the final output audio signal, the computer equipment determines that the fusion ratio of the first audio signal is 0, and the third audio signal is 0. The fusion ratio of the audio signal is 1, that is, the first audio signal is not integrated into the signal fusion.
示意性的,第三音频信号x2_c的信号幅度mag=abs(x2_c),当mag≥thrd2时,计算机设备确定第一音频信号x1的融合比重为0,第三音频信号x2_c的融合比重为1。Illustratively, the signal amplitude mag=abs(x2_c) of the third audio signal x2_c, when mag≥thrd2, the computer device determines that the fusion weight of the first audio signal x1 is 0, and the fusion weight of the third audio signal x2_c is 1.
Case 3: When the signal amplitude of the third audio signal is less than the first amplitude threshold and the second clipping flag is set, or when the signal amplitude of the third audio signal is greater than the first amplitude threshold and less than the second amplitude threshold, the fusion proportion of the first audio signal is determined to be the first dynamic proportion and the fusion proportion of the third audio signal is determined to be the second dynamic proportion. The second clipping flag indicates that the first channel has clipped, and the first dynamic proportion and the second dynamic proportion sum to 1.
When the signal amplitude of the third audio signal is less than the first amplitude threshold and the second clipping flag is set, or when the signal amplitude of the third audio signal is greater than the first amplitude threshold and less than the second amplitude threshold, this indicates that the first channel may have clipped in the recent period. Because the signal amplitude does not change abruptly, and in order to improve the smoothness of the fused target audio signal, the computer device sets the fusion proportions of the first audio signal and the third audio signal to dynamic proportions, that is, the fusion proportions of the two signals change as the signal amplitude changes.
In a possible implementation, the computer device maintains a first dynamic proportion and a second dynamic proportion, where the first dynamic proportion is g (g is initially 1) and the second dynamic proportion is 1-g. In Case 3 above, the computer device performs signal fusion based on the first dynamic proportion and the second dynamic proportion.
Moreover, after completing signal fusion in the above case, the computer device updates the first dynamic proportion and the second dynamic proportion based on how the signal amplitude of the third audio signal compares with the first amplitude threshold and the second amplitude threshold. When this comparison indicates that the clipping probability of the first channel has decreased, the computer device increases the first dynamic proportion and decreases the second dynamic proportion; when it indicates that the clipping probability of the first channel has increased, the computer device decreases the first dynamic proportion and increases the second dynamic proportion.
In a possible implementation, when the signal amplitude of the third audio signal is less than the first amplitude threshold and the second clipping flag is set, the computer device updates the first dynamic proportion and the second dynamic proportion based on a first update step size, where the updated first dynamic proportion is greater than the first dynamic proportion before the update, and the updated second dynamic proportion is less than the second dynamic proportion before the update.
The first update step size may be determined based on the sampling rate of the audio signal. Optionally, the first update step size is negatively correlated with the sampling rate: the higher the sampling rate, the smaller the first update step size, and the lower the sampling rate, the larger the first update step size.
In an illustrative example, when the first dynamic proportion is g, the second dynamic proportion is 1-g, and the first update step size is delta_rel (0<delta_rel<1), if the signal amplitude of the third audio signal is less than the first amplitude threshold and the second clipping flag is set, the computer device completes signal fusion based on g and 1-g and then updates g to g+delta_rel. This increases the fusion proportion of the first audio signal and decreases the fusion proportion of the third audio signal in subsequent signal fusion.
Optionally, after each update of the dynamic proportions, the computer device checks whether the first dynamic proportion is greater than or equal to 1. If so, this indicates that the signal amplitude of the third audio signal has stayed below the first amplitude threshold for a period of time and that the first channel has not clipped during that period, so the computer device replaces the second clipping flag with the first clipping flag.
In addition, since the maximum value of the first dynamic proportion is 1, when the updated first dynamic proportion is greater than or equal to 1, the computer device sets the first dynamic proportion to 1.
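As a minimal sketch of the release-side update described above, the following Python function raises g by delta_rel, caps it at 1, and restores the first clipping flag once the cap is reached. The function name release_update and the constants FLAG_NO_CLIP and FLAG_CLIP are illustrative assumptions; the text only specifies that flag=0 corresponds to the first clipping flag and flag=1 to the second.

```python
FLAG_NO_CLIP = 0  # first clipping flag: channel 1 has not clipped recently
FLAG_CLIP = 1     # second clipping flag: channel 1 may have clipped


def release_update(g, flag, delta_rel):
    """Raise the first dynamic proportion g by delta_rel after a fusion step.

    Returns the updated (g, flag). g is capped at 1; once it reaches 1 the
    second clipping flag is replaced by the first clipping flag.
    """
    g = g + delta_rel
    if g >= 1.0:
        g = 1.0
        flag = FLAG_NO_CLIP
    return g, flag
```

Calling release_update repeatedly while the amplitude stays below the first threshold moves the output gradually back to the first channel instead of switching in a single sample.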
In another possible implementation, when the signal amplitude of the third audio signal is greater than the first amplitude threshold and less than the second amplitude threshold, the computer device updates the first dynamic proportion and the second dynamic proportion based on a second update step size, where the updated first dynamic proportion is less than the first dynamic proportion before the update, and the updated second dynamic proportion is greater than the second dynamic proportion before the update.
Moreover, when the signal amplitude of the third audio signal is greater than the first amplitude threshold and the first clipping flag is set, the computer device replaces the first clipping flag with the second clipping flag, indicating that the first channel may have clipped.
The second update step size may be determined based on the sampling rate of the audio signal. Optionally, the second update step size is negatively correlated with the sampling rate: the higher the sampling rate, the smaller the second update step size, and the lower the sampling rate, the larger the second update step size.
In an illustrative example, when the first dynamic proportion is g, the second dynamic proportion is 1-g, and the second update step size is delta_att (0<delta_att<1), if the signal amplitude of the third audio signal is greater than or equal to the first amplitude threshold and less than the second amplitude threshold, the computer device completes signal fusion based on g and 1-g and then updates g to g-delta_att. This decreases the fusion proportion of the first audio signal and increases the fusion proportion of the third audio signal in subsequent signal fusion.
In addition, since the minimum value of the first dynamic proportion is 0, when the updated first dynamic proportion is less than or equal to 0, the computer device sets the first dynamic proportion to 0.
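The attack-side update and the negative correlation between step size and sampling rate can be sketched as follows. The text does not give a formula for the step size; deriving it from a fixed ramp time (ramp_seconds) is an assumption made here for illustration, as are the function names.

```python
def step_from_sample_rate(ramp_seconds, sample_rate):
    """Hypothetical step-size rule: g crosses the 0..1 range in ramp_seconds.

    A higher sample rate yields a smaller per-sample step, which matches the
    negative correlation described in the text.
    """
    return 1.0 / (ramp_seconds * sample_rate)


def attack_update(g, delta_att):
    """Lower g by delta_att after a fusion step, with a floor of 0.

    Returns (g, flag), where flag is 1 (the second clipping flag), because
    the amplitude has reached the first amplitude threshold.
    """
    g = max(0.0, g - delta_att)
    return g, 1
```

With this rule, the same ramp time gives a step three times smaller at 48 kHz than at 16 kHz, so the crossfade duration stays roughly constant across sampling rates.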
2. Perform signal fusion on the first audio signal and the third audio signal based on the fusion proportions to obtain the target audio signal.
Combining the three cases above, let the first audio signal be x1, the third audio signal be x2_c, the first dynamic proportion be g, and the second dynamic proportion be 1-g. In Case 1, since the fusion proportion of the first audio signal is 1 and the fusion proportion of the third audio signal is 0, the target audio signal is the first audio signal, that is, y=x1.
In Case 2, since the fusion proportion of the first audio signal is 0 and the fusion proportion of the third audio signal is 1, the target audio signal is the third audio signal, that is, y=x2_c.
In Case 3, when the signal amplitude of the third audio signal is less than the first amplitude threshold and the second clipping flag is set, the computer device obtains the target audio signal y=g*x1+(1-g)*x2_c by fusion and updates g to g+delta_rel.
When the signal amplitude of the third audio signal is greater than the first amplitude threshold and less than the second amplitude threshold, the computer device obtains the target audio signal y=g*x1+(1-g)*x2_c by fusion, updates g to g-delta_att, and sets flag=1.
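The three cases can be combined into a single per-sample routine. The sketch below is one possible reading of the description, not the applicant's implementation: the names FusionState and fuse_sample are hypothetical, the symbols x1, x2_c, g, flag, thrd1, thrd2, delta_rel and delta_att follow the text, and the boundary mag=thrd1 is treated as belonging to the blending range, as in the illustrative examples.

```python
from dataclasses import dataclass


@dataclass
class FusionState:
    g: float = 1.0  # first dynamic proportion (weight of x1), initially 1
    flag: int = 0   # 0 = first clipping flag, 1 = second clipping flag


def fuse_sample(x1, x2_c, state, thrd1, thrd2, delta_rel, delta_att):
    """Fuse one sample of x1 and x2_c and update the state in place."""
    mag = abs(x2_c)
    if mag >= thrd2:
        # Case 2: channel 1 is clipped, output the compensated channel 2.
        # The text describes no update of g or flag in this case.
        return x2_c
    if mag < thrd1 and state.flag == 0:
        # Case 1: no recent clipping, output channel 1 directly.
        return x1
    # Case 3: blend with the current dynamic proportions, then update g.
    y = state.g * x1 + (1.0 - state.g) * x2_c
    if mag < thrd1:
        # Amplitude has fallen back: shift weight toward channel 1.
        state.g += delta_rel
        if state.g >= 1.0:
            state.g = 1.0
            state.flag = 0
    else:
        # Amplitude is between the thresholds: shift weight toward x2_c.
        state.g = max(0.0, state.g - delta_att)
        state.flag = 1
    return y
```

A caller would keep one FusionState per recording and invoke fuse_sample once per sample pair, so g and flag carry over from sample to sample.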
In this embodiment, the computer device performs fixed gain compensation and adaptive amplitude compensation on the second audio signal so that the resulting third audio signal is as close as possible to the first audio signal, which helps improve the quality of subsequent signal fusion.
In addition, in this embodiment, the computer device determines the clipping state of the first channel based on the signal amplitude of the third audio signal, and then dynamically determines the fusion proportions of the first audio signal and the third audio signal during signal fusion based on that clipping state and the signal amplitude. This avoids abrupt jumps in the target audio signal and improves the fusion quality and smoothness of the fused target audio signal.
With reference to the above embodiments, in an illustrative example, when the original audio signal rises gradually from a non-high sound pressure level to a high sound pressure level and then falls gradually back to a non-high sound pressure level, the audio signal processing performed by the computer device is as shown in FIG. 8.
Step 801: Obtain the first audio signal output by the first channel and the second audio signal output by the second channel.
Step 802: Perform fixed gain compensation and adaptive amplitude compensation on the second audio signal to obtain the third audio signal.
Steps 801 to 802 constitute the signal compensation process.
Step 803: The signal amplitude of the third audio signal is less than the first amplitude threshold (the clipping flag is initially the first clipping flag); output the first audio signal.
When a non-high sound pressure level signal is captured, the audio signal of the first channel is output directly.
Step 804: The signal amplitude of the third audio signal is greater than or equal to the first amplitude threshold and less than the second amplitude threshold; perform signal fusion on the first audio signal and the third audio signal based on the first dynamic proportion and the second dynamic proportion, and output the result.
Step 805: Based on the second update step size, decrease the first dynamic proportion (increase the second dynamic proportion) and set the second clipping flag.
In steps 804 to 805, as the non-high sound pressure level signal gradually becomes a high sound pressure level signal, the fusion proportion of the second channel is dynamically increased, the fusion proportion of the first channel is decreased, and signal fusion is performed.
Step 806: The signal amplitude of the third audio signal is greater than or equal to the second amplitude threshold; output the third audio signal.
When a high sound pressure level signal is captured, the audio signal of the second channel is output directly.
Step 807: The signal amplitude of the third audio signal is less than the first amplitude threshold (the clipping flag is now the second clipping flag); perform signal fusion on the first audio signal and the third audio signal based on the first dynamic proportion and the second dynamic proportion, and output the result.
Step 808: Based on the first update step size, increase the first dynamic proportion (decrease the second dynamic proportion).
Step 809: When the first dynamic proportion is greater than or equal to 1, set the first clipping flag.
In steps 807 to 809, as the high sound pressure level signal gradually becomes a non-high sound pressure level signal, the fusion proportion of the first channel is dynamically increased, the fusion proportion of the second channel is decreased, and signal fusion is performed.
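To make the sequence of FIG. 8 easier to follow, the sketch below labels each amplitude value of a rising and then falling envelope with the step that would handle it. The function trace_steps and the label strings are hypothetical; the thresholds and step sizes in any call are arbitrary example values rather than values taken from the application.

```python
def trace_steps(mags, thrd1, thrd2, delta_rel, delta_att):
    """Label each amplitude in mags with the FIG. 8 step that handles it."""
    g, flag = 1.0, 0  # g starts at 1, flag starts as the first clipping flag
    labels = []
    for mag in mags:
        if mag >= thrd2:
            labels.append("806")        # output the third audio signal
        elif mag < thrd1 and flag == 0:
            labels.append("803")        # output the first audio signal
        elif mag >= thrd1:
            labels.append("804-805")    # blend, then lower g and set flag
            g, flag = max(0.0, g - delta_att), 1
        else:
            labels.append("807-809")    # blend, then raise g
            g = min(1.0, g + delta_rel)
            if g >= 1.0:
                flag = 0                # step 809: restore the first flag
    return labels
```

Note that on the falling side of the envelope, amplitudes between the two thresholds are still handled by steps 804 to 805 under the rules as stated; steps 807 to 809 apply only once the amplitude drops below the first amplitude threshold.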
The following are apparatus embodiments of the present application, which can be used to carry out the method embodiments of the present application. For details not disclosed in the apparatus embodiments, refer to the method embodiments of the present application.
Refer to FIG. 9, which shows a structural block diagram of an audio signal processing apparatus provided by an embodiment of the present application. The apparatus has the functions performed by the computer device in the above method embodiments; these functions may be implemented by hardware, or by hardware executing corresponding software. As shown in FIG. 9, the apparatus may include:
a signal acquisition module 901, configured to acquire the first audio signal output by the first channel and the second audio signal output by the second channel, where the first audio signal is a signal obtained by performing analog-to-digital conversion on the original audio signal after the first analog gain, the second audio signal is a signal obtained by performing analog-to-digital conversion on the original audio signal after the second analog gain, and the first analog gain is greater than the second analog gain;
a signal fusion module 902, configured to perform signal fusion based on the first audio signal and the second audio signal to obtain a target audio signal, where the dynamic range of the target audio signal is the superposition of the dynamic ranges of the first audio signal and the second audio signal.
Optionally, the signal fusion module 902 includes:
a compensation unit, configured to perform signal compensation on the second audio signal to obtain a third audio signal, where the signal compensation is used to compensate for the signal difference caused by the first analog gain and the second analog gain;
a fusion unit, configured to perform signal fusion on the first audio signal and the third audio signal to obtain the target audio signal.
Optionally, the fusion unit is specifically configured to:
determine the fusion proportions respectively corresponding to the first audio signal and the third audio signal;
perform signal fusion on the first audio signal and the third audio signal based on the fusion proportions to obtain the target audio signal.
Optionally, when determining the fusion proportions respectively corresponding to the first audio signal and the third audio signal, the fusion unit is specifically configured to:
when the signal amplitude of the third audio signal is less than the first amplitude threshold and the first clipping flag is set, determine that the fusion proportion of the first audio signal is 1 and the fusion proportion of the third audio signal is 0, where the first clipping flag indicates that the first channel has not clipped;
when the signal amplitude of the third audio signal is greater than the second amplitude threshold, determine that the fusion proportion of the first audio signal is 0 and the fusion proportion of the third audio signal is 1, where the second amplitude threshold is greater than the first amplitude threshold;
when the signal amplitude of the third audio signal is less than the first amplitude threshold and the second clipping flag is set, or when the signal amplitude of the third audio signal is greater than the first amplitude threshold and less than the second amplitude threshold, determine that the fusion proportion of the first audio signal is the first dynamic proportion and that the fusion proportion of the third audio signal is the second dynamic proportion, where the second clipping flag indicates that the first channel has clipped, and the first dynamic proportion and the second dynamic proportion sum to 1.
Optionally, the apparatus further includes:
a first proportion update module, configured to, when the signal amplitude of the third audio signal is less than the first amplitude threshold and the second clipping flag is set, update the first dynamic proportion and the second dynamic proportion based on the first update step size, where the updated first dynamic proportion is greater than the first dynamic proportion before the update, and the updated second dynamic proportion is less than the second dynamic proportion before the update;
a second proportion update module, configured to, when the signal amplitude of the third audio signal is greater than the first amplitude threshold and less than the second amplitude threshold, update the first dynamic proportion and the second dynamic proportion based on the second update step size, where the updated first dynamic proportion is less than the first dynamic proportion before the update, and the updated second dynamic proportion is greater than the second dynamic proportion before the update.
Optionally, the apparatus further includes:
a first flag replacement module, configured to replace the second clipping flag with the first clipping flag when the updated first dynamic proportion is greater than or equal to 1.
Optionally, the apparatus further includes:
a second flag replacement module, configured to replace the first clipping flag with the second clipping flag when the signal amplitude of the third audio signal is greater than the first amplitude threshold and the first clipping flag is set.
Optionally, the compensation unit is specifically configured to:
determine the analog gain difference between the first analog gain and the second analog gain;
perform gain compensation on the second audio signal based on the analog gain difference to obtain the third audio signal.
Optionally, the compensation unit is specifically configured to:
determine the analog gain difference between the first analog gain and the second analog gain;
perform gain compensation on the second audio signal based on the analog gain difference to obtain a fourth audio signal;
determine the signal amplitude ratio of the first audio signal to the fourth audio signal;
perform amplitude compensation on the fourth audio signal based on the signal amplitude ratio to obtain the third audio signal.
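The two-stage compensation performed by the compensation unit can be illustrated as follows. This is a sketch under stated assumptions: the analog gains are taken to be expressed in decibels, the signal amplitude ratio is taken to be a ratio of frame RMS values, and the names fixed_gain_compensation and amplitude_compensation are hypothetical. The application itself does not fix these details in this passage.

```python
import math


def fixed_gain_compensation(x2, gain1_db, gain2_db):
    """Scale x2 by the analog gain difference (assumed to be in dB)."""
    factor = 10.0 ** ((gain1_db - gain2_db) / 20.0)
    return [s * factor for s in x2]


def amplitude_compensation(x1, x4, eps=1e-12):
    """Scale x4 by the amplitude ratio of x1 to x4 (assumed: frame RMS)."""
    rms1 = math.sqrt(sum(s * s for s in x1) / len(x1))
    rms4 = math.sqrt(sum(s * s for s in x4) / len(x4))
    ratio = rms1 / max(rms4, eps)  # eps guards against a silent frame
    return [s * ratio for s in x4]


# Example: channel 1 has 30 dB of gain, channel 2 has 10 dB.
x1 = [0.11, -0.22]
x2 = [0.01, -0.02]
x4 = fixed_gain_compensation(x2, 30.0, 10.0)  # fourth audio signal
x3 = amplitude_compensation(x1, x4)           # third audio signal
```

The fixed stage removes the nominal gain difference, and the adaptive stage absorbs the residual mismatch (10% in this example) so that the third audio signal lines up with the first. The ratio is only meaningful when it is estimated from frames in which the first channel is not clipped.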
To sum up, in the embodiments of the present application, the same original audio signal is fed into a first channel and a second channel with different analog gains to obtain a first audio signal and a second audio signal, which covers the pickup of both non-high sound pressure level signals and high sound pressure level signals. Further, when signal fusion is performed based on the first audio signal and the second audio signal to output a single target audio signal, the dynamic range of the target audio signal is the superposition of the dynamic ranges of the first audio signal and the second audio signal. This expands the dynamic range of the audio signal, helps improve the recording quality of both non-high and high sound pressure level signals, and achieves high dynamic range recording.
In this embodiment, the computer device performs fixed gain compensation and adaptive amplitude compensation on the second audio signal so that the resulting third audio signal is as close as possible to the first audio signal, which helps improve the quality of subsequent signal fusion.
In addition, in this embodiment, the computer device determines the clipping state of the first channel based on the signal amplitude of the third audio signal, and then dynamically determines the fusion proportions of the first audio signal and the third audio signal during signal fusion based on that clipping state and the signal amplitude. This avoids abrupt jumps in the target audio signal and improves the fusion quality and smoothness of the fused target audio signal.
Refer to FIG. 10, which shows a structural block diagram of a computer device provided by an exemplary embodiment of the present application. The computer device 1000 may be a smartphone, a tablet computer, a wearable device, or the like. The computer device 1000 in the present application may include one or more of the following components: a processor 1010 and a memory 1020.
The processor 1010 may include one or more processing cores. The processor 1010 connects the various parts of the computer device 1000 through various interfaces and lines, and performs the functions of the computer device 1000 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 1020 and by calling data stored in the memory 1020. Optionally, the processor 1010 may be implemented in at least one hardware form among digital signal processing (DSP), field-programmable gate array (FPGA), and programmable logic array (PLA). The processor 1010 may integrate one or a combination of a central processing unit (CPU), a graphics processing unit (GPU), a neural-network processing unit (NPU), a modem, and the like. The CPU mainly handles the operating system, user interface, application programs, and so on; the GPU is responsible for rendering and drawing the content to be displayed on the touch display screen 1030; the NPU implements artificial intelligence (AI) functions; and the modem handles wireless communication. It can be understood that the modem may also not be integrated into the processor 1010 and may instead be implemented as a separate chip.
The memory 1020 may include random access memory (RAM) and may also include read-only memory (ROM). Optionally, the memory 1020 includes a non-transitory computer-readable storage medium. The memory 1020 may be used to store instructions, programs, code, code sets, or instruction sets. The memory 1020 may include a program storage area and a data storage area, where the program storage area may store instructions for implementing an operating system, instructions for at least one function (such as a touch function, a sound playback function, or an image playback function), instructions for implementing the method embodiments of the present application, and so on; the data storage area may store data created through use of the computer device 1000 (such as audio data and a phone book).
Optionally, the computer device 1000 is provided with a microphone 1030, which may be a built-in microphone of the computer device 1000 or an external microphone connected to the computer device 1000 through a microphone interface.
In the embodiments of the present application, the computer device 1000 is further provided with a first path and a second path (audio circuits), and the microphone 1030 is connected to both. The first path is provided with a high-gain PGA and a first ADC, and the second path is provided with a low-gain PGA and a second ADC. The first path and the second path each apply analog gain and analog-to-digital conversion to the original audio signal output by the microphone, and the two resulting audio signals are input to the processor 1010, which performs the audio signal processing.
In addition, those skilled in the art will understand that the structure of the computer device 1000 shown in the above figure does not limit the computer device; the computer device may include more or fewer components than shown, combine certain components, or use a different arrangement of components. For example, the computer device 1000 also includes components such as a display screen, sensors, a speaker, and a power supply, which are not described further here.
An embodiment of the present application further provides a computer-readable storage medium storing at least one piece of program code, where the program code is loaded and executed by a processor to implement the audio signal processing method described in the above embodiments.
According to one aspect of the present application, a computer program product or computer program is provided, which includes computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, so that the computer device performs the audio signal processing method provided in the various optional implementations of the above aspects.
It should be understood that "a plurality of" as used herein means two or more. "And/or" describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" may mean that A exists alone, that A and B both exist, or that B exists alone. The character "/" generally indicates an "or" relationship between the objects before and after it. In addition, the step numbers used herein show only one possible order of execution. In some other embodiments, the steps may be performed out of numerical order; for example, two differently numbered steps may be performed simultaneously, or in the reverse of the order shown in the figures, which is not limited in the embodiments of the present application.
The above are only optional embodiments of the present application and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present application shall fall within the scope of protection of the present application.

Claims (21)

  1. A method for processing an audio signal, the method comprising:
    acquiring a first audio signal output by a first channel and a second audio signal output by a second channel, wherein the first audio signal is obtained by performing analog-to-digital conversion on an original audio signal to which a first analog gain has been applied, the second audio signal is obtained by performing analog-to-digital conversion on the original audio signal to which a second analog gain has been applied, and the first analog gain is greater than the second analog gain; and
    performing signal fusion based on the first audio signal and the second audio signal to obtain a target audio signal, wherein the dynamic range of the target audio signal is a superposition of the dynamic ranges of the first audio signal and the second audio signal.
  2. The method according to claim 1, wherein performing signal fusion based on the first audio signal and the second audio signal to obtain the target audio signal comprises:
    performing signal compensation on the second audio signal to obtain a third audio signal, wherein the signal compensation compensates for the signal difference caused by the first analog gain and the second analog gain; and
    performing signal fusion on the first audio signal and the third audio signal to obtain the target audio signal.
  3. The method according to claim 2, wherein performing signal fusion on the first audio signal and the third audio signal to obtain the target audio signal comprises:
    determining a fusion weight for each of the first audio signal and the third audio signal; and
    performing signal fusion on the first audio signal and the third audio signal based on the fusion weights to obtain the target audio signal.
  4. The method according to claim 3, wherein determining the fusion weight for each of the first audio signal and the third audio signal comprises:
    when the signal amplitude of the third audio signal is less than a first amplitude threshold and a first clipping flag is set, determining that the fusion weight of the first audio signal is 1 and the fusion weight of the third audio signal is 0, wherein the first clipping flag indicates that the first channel is not clipping;
    when the signal amplitude of the third audio signal is greater than a second amplitude threshold, determining that the fusion weight of the first audio signal is 0 and the fusion weight of the third audio signal is 1, wherein the second amplitude threshold is greater than the first amplitude threshold; and
    when the signal amplitude of the third audio signal is less than the first amplitude threshold and a second clipping flag is set, or when the signal amplitude of the third audio signal is greater than the first amplitude threshold and less than the second amplitude threshold, determining that the fusion weight of the first audio signal is a first dynamic weight and that the fusion weight of the second audio signal is a second dynamic weight, wherein the second clipping flag indicates that the first channel is clipping, and the sum of the first dynamic weight and the second dynamic weight is 1.
  5. The method according to claim 4, wherein after determining that the fusion weight of the first audio signal is the first dynamic weight and that the fusion weight of the second audio signal is the second dynamic weight, the method further comprises:
    when the signal amplitude of the third audio signal is less than the first amplitude threshold and the second clipping flag is set, updating the first dynamic weight and the second dynamic weight based on a first update step size, wherein the updated first dynamic weight is greater than the first dynamic weight before the update, and the updated second dynamic weight is less than the second dynamic weight before the update; and
    when the signal amplitude of the third audio signal is greater than the first amplitude threshold and less than the second amplitude threshold, updating the first dynamic weight and the second dynamic weight based on a second update step size, wherein the updated first dynamic weight is less than the first dynamic weight before the update, and the updated second dynamic weight is greater than the second dynamic weight before the update.
  6. The method according to claim 5, wherein after updating the first dynamic weight and the second dynamic weight based on the first update step size, the method further comprises:
    when the updated first dynamic weight is greater than or equal to 1, replacing the second clipping flag with the first clipping flag.
  7. The method according to claim 4, wherein the method further comprises:
    when the signal amplitude of the third audio signal is greater than the first amplitude threshold and the first clipping flag is set, replacing the first clipping flag with the second clipping flag.
  8. The method according to claim 2, wherein performing signal compensation on the second audio signal to obtain the third audio signal comprises:
    determining an analog gain difference between the first analog gain and the second analog gain; and
    performing gain compensation on the second audio signal based on the analog gain difference to obtain the third audio signal.
  9. The method according to claim 2, wherein performing signal compensation on the second audio signal to obtain the third audio signal comprises:
    determining an analog gain difference between the first analog gain and the second analog gain;
    performing gain compensation on the second audio signal based on the analog gain difference to obtain a fourth audio signal;
    determining a signal amplitude ratio of the first audio signal to the fourth audio signal; and
    performing amplitude compensation on the fourth audio signal based on the signal amplitude ratio to obtain the third audio signal.
  10. An apparatus for processing an audio signal, the apparatus comprising:
    a signal acquisition module, configured to acquire a first audio signal output by a first channel and a second audio signal output by a second channel, wherein the first audio signal is obtained by performing analog-to-digital conversion on an original audio signal to which a first analog gain has been applied, the second audio signal is obtained by performing analog-to-digital conversion on the original audio signal to which a second analog gain has been applied, and the first analog gain is greater than the second analog gain; and
    a signal fusion module, configured to perform signal fusion based on the first audio signal and the second audio signal to obtain a target audio signal, wherein the dynamic range of the target audio signal is a superposition of the dynamic ranges of the first audio signal and the second audio signal.
  11. The apparatus according to claim 10, wherein the signal fusion module comprises:
    a compensation unit, configured to perform signal compensation on the second audio signal to obtain a third audio signal, wherein the signal compensation compensates for the signal difference caused by the first analog gain and the second analog gain; and
    a fusion unit, configured to perform signal fusion on the first audio signal and the third audio signal to obtain the target audio signal.
  12. The apparatus according to claim 11, wherein the fusion unit is configured to:
    determine a fusion weight for each of the first audio signal and the third audio signal; and
    perform signal fusion on the first audio signal and the third audio signal based on the fusion weights to obtain the target audio signal.
  13. The apparatus according to claim 12, wherein, in determining the fusion weight for each of the first audio signal and the third audio signal, the fusion unit is configured to:
    when the signal amplitude of the third audio signal is less than a first amplitude threshold and a first clipping flag is set, determine that the fusion weight of the first audio signal is 1 and the fusion weight of the third audio signal is 0, wherein the first clipping flag indicates that the first channel is not clipping;
    when the signal amplitude of the third audio signal is greater than a second amplitude threshold, determine that the fusion weight of the first audio signal is 0 and the fusion weight of the third audio signal is 1, wherein the second amplitude threshold is greater than the first amplitude threshold; and
    when the signal amplitude of the third audio signal is less than the first amplitude threshold and a second clipping flag is set, or when the signal amplitude of the third audio signal is greater than the first amplitude threshold and less than the second amplitude threshold, determine that the fusion weight of the first audio signal is a first dynamic weight and that the fusion weight of the second audio signal is a second dynamic weight, wherein the second clipping flag indicates that the first channel is clipping, and the sum of the first dynamic weight and the second dynamic weight is 1.
  14. The apparatus according to claim 13, wherein the apparatus further comprises:
    a first weight update module, configured to, when the signal amplitude of the third audio signal is less than the first amplitude threshold and the second clipping flag is set, update the first dynamic weight and the second dynamic weight based on a first update step size, wherein the updated first dynamic weight is greater than the first dynamic weight before the update, and the updated second dynamic weight is less than the second dynamic weight before the update; and
    a second weight update module, configured to, when the signal amplitude of the third audio signal is greater than the first amplitude threshold and less than the second amplitude threshold, update the first dynamic weight and the second dynamic weight based on a second update step size, wherein the updated first dynamic weight is less than the first dynamic weight before the update, and the updated second dynamic weight is greater than the second dynamic weight before the update.
  15. The apparatus according to claim 14, wherein the apparatus further comprises:
    a first flag replacement module, configured to replace the second clipping flag with the first clipping flag when the updated first dynamic weight is greater than or equal to 1.
  16. The apparatus according to claim 13, wherein the apparatus further comprises:
    a second flag replacement module, configured to replace the first clipping flag with the second clipping flag when the signal amplitude of the third audio signal is greater than the first amplitude threshold and the first clipping flag is set.
  17. The apparatus according to claim 11, wherein the compensation unit is configured to:
    determine an analog gain difference between the first analog gain and the second analog gain; and
    perform gain compensation on the second audio signal based on the analog gain difference to obtain the third audio signal.
  18. The apparatus according to claim 11, wherein the compensation unit is configured to:
    determine an analog gain difference between the first analog gain and the second analog gain;
    perform gain compensation on the second audio signal based on the analog gain difference to obtain a fourth audio signal;
    determine a signal amplitude ratio of the first audio signal to the fourth audio signal; and
    perform amplitude compensation on the fourth audio signal based on the signal amplitude ratio to obtain the third audio signal.
  19. A computer device, comprising a processor and a memory, wherein the memory stores at least one instruction, and the at least one instruction is loaded and executed by the processor to implement the method for processing an audio signal according to any one of claims 1 to 9.
  20. A computer-readable storage medium storing at least one piece of program code, wherein the program code is loaded and executed by a processor to implement the method for processing an audio signal according to any one of claims 1 to 9.
  21. A computer program product, comprising computer instructions stored in a computer-readable storage medium, wherein a processor of a computer device reads the computer instructions from the computer-readable storage medium and executes the computer instructions to implement the method for processing an audio signal according to any one of claims 1 to 9.
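
As a non-limiting illustration, the gain compensation of claim 8 and the fusion weight logic of claims 3 to 7 might be sketched in Python as follows. The names `compensate`, `fuse`, and `FusionState` are hypothetical. The sketch assumes that the analog gains are expressed in dB, that the signals are processed sample by sample, that the dynamic weight is carried over from the previous sample, and that an amplitude exactly equal to a threshold keeps the previous weight; none of these details is specified in the claims. It also applies the second dynamic weight to the compensated third audio signal, which is an interpretation of claim 4.

```python
from dataclasses import dataclass

NOT_CLIPPED = "first_clipping_flag"   # first (high-gain) channel is not clipping
CLIPPED = "second_clipping_flag"      # first (high-gain) channel is clipping


@dataclass
class FusionState:
    """State carried between samples (hypothetical container)."""
    flag: str = NOT_CLIPPED
    w1: float = 1.0  # weight of the first signal; the other weight is 1 - w1


def compensate(second, gain1_db, gain2_db):
    """Claim 8 sketch: scale the low-gain signal by the analog gain difference.

    Assumption: both gains are given in dB, so the linear factor is
    10 ** (difference / 20).
    """
    scale = 10.0 ** ((gain1_db - gain2_db) / 20.0)
    return [s * scale for s in second]


def fuse(first, third, t1, t2, step1, step2, state=None):
    """Claims 3 to 7 sketch: blend `first` and `third` sample by sample.

    t1 < t2 are the first and second amplitude thresholds; step1 and step2
    are the first and second update step sizes.
    """
    if state is None:
        state = FusionState()
    out = []
    for x1, x3 in zip(first, third):
        amp = abs(x3)
        # Claim 7: amplitude above t1 while "not clipping" is set.
        if amp > t1 and state.flag == NOT_CLIPPED:
            state.flag = CLIPPED
        # Claim 4: fixed weights in the two unambiguous regions.
        if amp > t2:
            state.w1 = 0.0
        elif amp < t1 and state.flag == NOT_CLIPPED:
            state.w1 = 1.0
        # Otherwise the dynamic weight from the previous sample is used.
        out.append(state.w1 * x1 + (1.0 - state.w1) * x3)
        # Claims 5 and 6: update the dynamic weights after they were used.
        if amp < t1 and state.flag == CLIPPED:
            state.w1 = min(1.0, state.w1 + step1)   # fade back to channel 1
            if state.w1 >= 1.0:
                state.flag = NOT_CLIPPED
        elif t1 < amp < t2:
            state.w1 = max(0.0, state.w1 - step2)   # fade toward channel 2
    return out
```

Storing a single weight `w1` and deriving the other as `1 - w1` keeps the sum of the two dynamic weights equal to 1 by construction. Passing the same `FusionState` object to successive calls lets the clipping flag and weight persist across audio frames.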
PCT/CN2022/076756 2021-03-22 2022-02-18 Audio signal processing method and apparatus, terminal, and storage medium WO2022199288A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110301715.0A CN113079440B (en) 2021-03-22 2021-03-22 Audio signal processing method and device, terminal and storage medium
CN202110301715.0 2021-03-22

Publications (1)

Publication Number Publication Date
WO2022199288A1 (en) 2022-09-29

Family

ID=76613145

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/076756 WO2022199288A1 (en) 2021-03-22 2022-02-18 Audio signal processing method and apparatus, terminal, and storage medium

Country Status (2)

Country Link
CN (1) CN113079440B (en)
WO (1) WO2022199288A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113079440B (en) * 2021-03-22 2022-12-06 Oppo广东移动通信有限公司 Audio signal processing method and device, terminal and storage medium
CN113709626A (en) * 2021-08-04 2021-11-26 Oppo广东移动通信有限公司 Audio recording method, device, storage medium and computer equipment
CN114584902B (en) * 2022-03-17 2023-05-16 睿云联(厦门)网络通讯技术有限公司 Method and device for eliminating nonlinear echo of intercom equipment based on volume control

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102024456A (en) * 2009-09-21 2011-04-20 联发科技股份有限公司 Audio processing apparatus and audio processing method
CN109115245A (en) * 2014-03-28 2019-01-01 意法半导体股份有限公司 Multichannel transducer device and its operating method
CN111345047A (en) * 2019-04-17 2020-06-26 深圳市大疆创新科技有限公司 Audio signal processing method, apparatus and storage medium
CN111699700A (en) * 2019-04-17 2020-09-22 深圳市大疆创新科技有限公司 Audio signal processing method, apparatus and storage medium
WO2020221028A1 (en) * 2019-04-29 2020-11-05 深圳锐越微技术有限公司 Analog-to-digital conversion-based two-stage audio gain circuit and audio terminal
CN112199070A (en) * 2020-10-14 2021-01-08 展讯通信(上海)有限公司 Audio processing method and device, storage medium and computer equipment
CN113079440A (en) * 2021-03-22 2021-07-06 Oppo广东移动通信有限公司 Audio signal processing method and device, terminal and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10284217B1 (en) * 2014-03-05 2019-05-07 Cirrus Logic, Inc. Multi-path analog front end and analog-to-digital converter for a signal processing system
US10727798B2 (en) * 2018-08-17 2020-07-28 Invensense, Inc. Method for improving die area and power efficiency in high dynamic range digital microphones

Also Published As

Publication number Publication date
CN113079440A (en) 2021-07-06
CN113079440B (en) 2022-12-06

Similar Documents

Publication Publication Date Title
WO2022199288A1 (en) Audio signal processing method and apparatus, terminal, and storage medium
US10015591B2 (en) Pickup apparatus and pickup method
JP6123503B2 (en) Audio correction apparatus, audio correction program, and audio correction method
KR20110039560A (en) A method and an apparatus for processing an audio signal
US10020003B2 (en) Voice signal processing apparatus and voice signal processing method
CN111508510B (en) Audio processing method and device, storage medium and electronic equipment
JP6669176B2 (en) Signal processing device and signal processing method
WO2022140928A1 (en) Audio signal processing method and system for suppressing echo
CN116346061B (en) Dynamic range control method and circuit based on peak value and effective value double-value detection
TWI501657B (en) Electronic audio device
CN115835094A (en) Audio signal processing method, system, device, product and medium
CN211792016U (en) Noise reduction voice device and electronic device
US20220141583A1 (en) Hearing assisting device and method for adjusting output sound thereof
WO2022041485A1 (en) Method for processing audio signal, electronic device and storage medium
US9514765B2 (en) Method for reducing noise and computer program thereof and electronic device
US9313582B2 (en) Hearing aid and method of enhancing speech output in real time
WO2021120247A1 (en) Hearing compensation method and device, and computer readable storage medium
JP2001324989A (en) Device for shaping signals, especially audio signals
CN114094966A (en) Dynamic range control circuit, audio processing chip and audio processing method thereof
TW201737715A (en) Signal receiving end of digital TV and signal processing method thereof
CN114554346B (en) Adaptive adjustment method and device of ANC parameters and storage medium
WO2024098958A1 (en) Audio processing method and apparatus, and electronic device and storage medium
JP2002299975A (en) Digital agc device
CN113470692B (en) Audio processing method and device, readable medium and electronic equipment
CN113194388B (en) Signal processing method, device, equipment, medium and chip system

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 22773950; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 22773950; Country of ref document: EP; Kind code of ref document: A1)