WO2019128630A1 - Audio signal processing method, apparatus, terminal and storage medium - Google Patents


Info

Publication number
WO2019128630A1
WO2019128630A1 (PCT/CN2018/118766)
Authority
WO
WIPO (PCT)
Prior art keywords
signal
channel
audio signal
hrtf
hrtf data
Application number
PCT/CN2018/118766
Other languages
English (en)
French (fr)
Inventor
刘佳泽 (Liu Jiaze)
Original Assignee
广州酷狗计算机科技有限公司 (Guangzhou Kugou Computer Technology Co., Ltd.)
Application filed by 广州酷狗计算机科技有限公司 filed Critical 广州酷狗计算机科技有限公司
Priority to US16/617,986 priority Critical patent/US10924877B2/en
Priority to EP18895910.0A priority patent/EP3624463A4/en
Publication of WO2019128630A1 publication Critical patent/WO2019128630A1/zh

Classifications

    • H04S 7/30 Control circuits for electronic adaptation of the sound field
    • H04S 7/302 Electronic adaptation of stereophonic sound system to listener position or orientation
    • H04S 7/303 Tracking of listener position or orientation
    • H04S 7/304 For headphones
    • H04S 3/002 Non-adaptive circuits, e.g. manually adjustable or static, for enhancing the sound image or the spatial distribution
    • H04S 3/004 For headphones
    • H04S 3/008 Systems employing more than two channels in which the audio signals are in digital form, i.e. employing more than two discrete digital channels
    • H04R 5/02 Spatial or constructional arrangements of loudspeakers
    • H04R 5/04 Circuit arrangements, e.g. for selective connection of amplifier inputs/outputs to loudspeakers, for loudspeaker detection, or for adaptation of settings to personal preferences or hearing impairments
    • H04R 2205/026 Single (sub)woofer with two or more satellite loudspeakers for mid- and high-frequency band reproduction driven via the (sub)woofer
    • H04S 2400/01 Multi-channel, i.e. more than two input channels, sound reproduction with two speakers wherein the multi-channel information is substantially preserved
    • H04S 2400/11 Positioning of individual sound objects, e.g. moving airplane, within a sound field
    • H04S 2420/01 Enhancing the perception of the sound image or of the spatial distribution using head related transfer functions [HRTF's] or equivalents thereof, e.g. interaural time difference [ITD] or interaural level difference [ILD]

Definitions

  • the present application relates to the field of audio processing technologies, and in particular, to a method, an apparatus, a terminal, and a storage medium for processing an audio signal.
  • 5.1 channels include five full-range channels (front left channel, front right channel, front center channel, rear left channel, and rear right channel) plus a 0.1 channel.
  • the 0.1 channel is also called the low frequency channel or the subwoofer channel.
  • when the user does not have a speaker device supporting 5.1 channels, the 5.1 channel audio signal cannot be played.
  • the embodiment of the present invention provides a method, a device, a terminal, and a storage medium for processing an audio signal, which can solve the problem that the stereo effect is poor when playing the left channel audio signal and the right channel audio signal through the audio playing unit.
  • the technical solution is as follows:
  • the embodiment of the present application provides a method, a device, and a terminal for processing an audio signal, which can solve the problem that a 5.1 channel audio signal cannot be played when the user does not have a speaker device supporting 5.1 channels.
  • the technical solution is as follows:
  • the embodiment of the present application provides a method for processing an audio signal, where the method is performed by a terminal, and the method includes:
  • the processed 5.1 channel audio signal is synthesized into a stereo audio signal.
  • the embodiment of the present application provides an audio signal processing apparatus, where the apparatus is applied to a terminal, where the apparatus includes:
  • a first acquiring module configured to acquire a 5.1 channel audio signal
  • a second acquiring module configured to acquire, according to coordinates of each virtual speaker of the 5.1 virtual speakers in the virtual environment, head related transfer function (HRTF) data corresponding to each virtual speaker;
  • a processing module configured to process, according to the HRTF data corresponding to each of the virtual speakers, a corresponding channel audio signal in the 5.1 channel audio signal to obtain a processed 5.1 channel audio signal;
  • a synthesizing module configured to synthesize the processed 5.1 channel audio signal into a stereo audio signal.
  • an embodiment of the present application provides a computer readable storage medium, where the storage medium stores at least one instruction that is loaded by a processor and executed to implement the above-described processing method of the audio signal.
  • an embodiment of the present application provides a terminal, where the terminal includes a processor and a memory, where the memory stores at least one instruction, and the instruction is loaded by the processor and executed to implement the above-described method for processing an audio signal.
  • the stereo audio signal is synthesized, so that the user can play the 5.1 channel audio signal using only an ordinary stereo earphone or a 2.0 speaker and obtain better playback sound quality.
  • FIG. 1 is a flow chart showing a method of processing an audio signal provided by an exemplary embodiment of the present application
  • FIG. 2 is a flow chart showing a method of processing an audio signal provided by an exemplary embodiment of the present application
  • FIG. 3 is a flow chart showing a method of processing an audio signal provided by an exemplary embodiment of the present application
  • FIG. 4 is a flow chart showing a method of processing an audio signal provided by an exemplary embodiment of the present application
  • FIG. 5 is a flowchart showing a method of processing an audio signal provided by an exemplary embodiment of the present application
  • FIG. 6 is a flowchart showing a method of processing an audio signal provided by an exemplary embodiment of the present application
  • FIG. 7 is a schematic diagram showing the arrangement of a 5.1 channel virtual speaker provided by an exemplary embodiment of the present application.
  • FIG. 8 is a flowchart showing a method of processing an audio signal provided by an exemplary embodiment of the present application.
  • FIG. 9 is a schematic diagram showing the collection of HRTF data provided by an exemplary embodiment of the present application.
  • FIG. 10 is a block diagram showing a processing apparatus of an audio signal provided by an exemplary embodiment of the present application.
  • FIG. 11 is a block diagram showing a processing apparatus of an audio signal provided by an exemplary embodiment of the present application.
  • FIG. 12 is a block diagram showing a processing apparatus of an audio signal provided by an exemplary embodiment of the present application.
  • FIG. 1 is a flowchart of a method for processing an audio signal provided by an exemplary embodiment of the present application, which may be performed by a terminal having an audio signal processing function, the method including:
  • Step 101 Acquire a first stereo audio signal.
  • the terminal reads the first stereo audio signal stored locally, or obtains the first stereo audio signal from a server through a wired or wireless network.
  • the first stereo audio signal is obtained by recording sound through a stereo recording device.
  • the stereo recording device usually includes a first microphone on the left side and a second microphone on the right side; it records the left-side sound and the right-side sound through the first microphone and the second microphone respectively to obtain the left channel audio signal and the right channel audio signal, and superimposes the two to obtain the first stereo audio signal.
  • the terminal stores the received first stereo audio signal in a buffer of the terminal, and the first stereo audio signal is recorded as X_PCM.
  • the terminal stores the received first stereo audio signal in a built-in buffer area in the form of sample pairs of the left channel audio signal and the corresponding right channel audio signal, and acquires the first stereo audio signal from the buffer area when it is used.
  • step 102 the first stereo audio signal is split into 5.1 channel audio signals.
  • the terminal splits the first stereo audio signal into a 5.1 channel audio signal through a preset algorithm.
  • the 5.1 channel audio signal includes a front left channel signal, a front right channel signal, a front center channel signal, a low frequency channel signal, a rear left channel signal, and a rear right channel signal.
  • step 103 the 5.1 channel audio signal is processed according to the speaker parameters of the 5.1 virtual speaker of the three-dimensional surround, and the processed 5.1 channel audio signal is obtained.
  • the terminal performs signal processing on the 5.1 channel audio signal according to the speaker parameters of the 5.1 virtual speaker of the three-dimensional surround, and obtains the processed 5.1 channel audio signal.
  • the processed 5.1 channel audio signal includes a processed front left channel signal, a processed front right channel signal, a processed front center channel signal, and a processed rear left channel signal. And the processed rear right channel signal.
  • the three-dimensional surround 5.1 virtual speaker is a terminal-preset audio model that simulates the playback of a 5.1-channel speaker that surrounds the user in a real-life scene.
  • in the real scene, the user is at the center with the face oriented forward, and the 5.1-channel speakers include a front left speaker located at the user's left front, a front right speaker located at the user's right front, a front center speaker located directly in front of the user, and rear left and rear right speakers located behind the user.
  • Step 104 synthesize the processed 5.1 channel audio signal into a second stereo audio signal.
  • the terminal synthesizes the processed 5.1 channel audio signal into a second stereo audio signal.
  • the second stereo audio signal can be played through a normal stereo earphone or a 2.0 speaker, and the user perceives a 5.1 channel stereo effect after hearing the second stereo audio signal through the normal stereo earphone or 2.0 speaker.
  • the method provided in this embodiment splits the first stereo audio signal into a 5.1 channel audio signal, processes it, and synthesizes the processed signal into a two-channel second stereo audio signal.
  • the audio playback unit plays the second stereo audio signal so that the user obtains the stereo effect of 5.1 channel audio, which solves the problem in the related art that the stereo effect is poor when only a two-channel audio signal is played, and improves the stereo effect of audio playback.
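The FIG. 1 pipeline (split, per-speaker gain, downmix) can be sketched as follows. This is a minimal illustration with placeholder split and downmix coefficients and hypothetical helper names; the patent's actual splitting algorithm is detailed in the later embodiments.

```python
# Hypothetical sketch of the FIG. 1 pipeline. The split/downmix coefficients
# below are illustrative placeholders, NOT the patent's algorithm.

def split_to_5_1(left, right):
    """Step 102 (placeholder): derive six channels from a stereo pair."""
    center = [(l + r) / 2 for l, r in zip(left, right)]
    return {"FL": list(left), "FR": list(right), "FC": center,
            "LFE": list(center), "RL": list(left), "RR": list(right)}

def apply_gains(channels, gains):
    """Step 103: scalar-multiply each channel by its virtual-speaker gain."""
    return {name: [s * gains[name] for s in sig] for name, sig in channels.items()}

def downmix_to_stereo(ch):
    """Step 104 (placeholder): fold the six processed channels back to stereo."""
    n = len(ch["FL"])
    left = [ch["FL"][i] + ch["FC"][i] + ch["RL"][i] + ch["LFE"][i] for i in range(n)]
    right = [ch["FR"][i] + ch["FC"][i] + ch["RR"][i] + ch["LFE"][i] for i in range(n)]
    return left, right

# Example gains V1..V5 (plus an LFE gain) are made-up values.
gains = {"FL": 1.0, "FR": 1.0, "FC": 0.7, "LFE": 0.5, "RL": 0.8, "RR": 0.8}
out_l, out_r = downmix_to_stereo(apply_gains(split_to_5_1([1.0, 0.0], [0.0, 1.0]), gains))
```

The result is a two-sample stereo pair that any ordinary stereo playback unit can reproduce.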
  • splitting the first stereo audio signal into a 5.1 channel audio signal is divided into stages. The first stage acquires the 5.0 channel audio signals in the 5.1 channel audio signal; the embodiments of FIG. 2, FIG. 3, and FIG. 4 below explain splitting the 5.0 channel audio signals from the first stereo audio signal. The second stage obtains the 0.1 channel audio signal in the 5.1 channel audio signal; the embodiment of FIG. 5 below explains splitting the 0.1 channel audio signal from the first stereo audio signal. The third stage synthesizes the 5.0 channel audio signal and the 0.1 channel audio signal into the second stereo audio signal; the embodiments of FIG. 6 and FIG. 8 below provide a method of processing and synthesizing the 5.1 channel audio signal to obtain the second stereo audio signal.
  • the following may be an optional implementation of step 102 and step 103 in the embodiment of FIG. 1; the method includes:
  • Step 201 Input the first stereo audio signal into a high-pass filter for filtering to obtain a first high frequency signal.
  • the terminal inputs the first stereo audio signal into a high-pass filter for filtering to obtain the first high frequency signal.
  • the first high frequency signal is a superposed signal of the first left channel high frequency signal and the first right channel high frequency signal.
  • the terminal filters the first stereo audio signal through a 4th-order IIR (infinite impulse response) high-pass filter to obtain the first high frequency signal.
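A 4th-order IIR high-pass filter of the kind step 201 describes can be built by cascading two second-order (biquad) sections. The sketch below uses the standard RBJ audio-EQ-cookbook high-pass coefficients; the cutoff frequency and sample rate are assumed example values, since the patent does not specify them.

```python
import math

def highpass_biquad(fc, fs, q=0.7071):
    """RBJ audio-EQ-cookbook high-pass biquad coefficients (normalized by a0)."""
    w0 = 2 * math.pi * fc / fs
    alpha = math.sin(w0) / (2 * q)
    cosw = math.cos(w0)
    b = [(1 + cosw) / 2, -(1 + cosw), (1 + cosw) / 2]
    a = [1 + alpha, -2 * cosw, 1 - alpha]
    return [bi / a[0] for bi in b], [1.0, a[1] / a[0], a[2] / a[0]]

def biquad_filter(x, b, a):
    """Direct-form-I second-order section."""
    y, x1, x2, y1, y2 = [], 0.0, 0.0, 0.0, 0.0
    for s in x:
        out = b[0] * s + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1, y2, y1 = x1, s, y1, out
        y.append(out)
    return y

# Two cascaded biquads give a 4th-order IIR high-pass as in step 201.
# fc = 120 Hz and fs = 44.1 kHz are assumed example values.
b, a = highpass_biquad(fc=120.0, fs=44100.0)
dc = [1.0] * 2048                    # a 0 Hz (DC) input should be blocked
filtered = biquad_filter(biquad_filter(dc, b, a), b, a)
```

After the initial transient, the constant (0 Hz) input is attenuated to essentially zero, confirming high-pass behavior.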
  • Step 202 Calculate a left channel high frequency signal, a center channel high frequency signal, and a right channel high frequency signal according to the first high frequency signal.
  • the terminal splits the first high frequency signal into a left channel high frequency signal, a center channel high frequency signal, and a right channel high frequency signal.
  • the left channel high frequency signal comprises a front left channel signal and a rear left channel signal
  • the center channel high frequency signal comprises a front center channel signal
  • the right channel high frequency signal comprises a front right channel signal and a rear right channel signal.
  • the terminal calculates the center channel high frequency signal according to the first high frequency signal, subtracts the center channel high frequency signal from the first left channel high frequency signal to obtain the left channel high frequency signal, and subtracts the center channel high frequency signal from the first right channel high frequency signal to obtain the right channel high frequency signal.
  • Step 203 Calculate a front left channel signal, a front right channel signal, a front center channel signal, a rear left channel signal, and a rear right channel signal in the 5.1 channel audio signal according to the left channel high frequency signal, the center channel high frequency signal, and the right channel high frequency signal.
  • the terminal calculates the front left channel signal and the rear left channel signal according to the left channel high frequency signal, calculates the front right channel signal and the rear right channel signal according to the right channel high frequency signal, and calculates the front center channel signal according to the center channel high frequency signal.
  • the terminal extracts first rear/reverberation signal data in the left channel high frequency signal, second rear/reverberation signal data in the center channel high frequency signal, and third rear/reverberation signal data in the right channel high frequency signal, and calculates the front left channel signal, rear left channel signal, front right channel signal, rear right channel signal, and front center channel signal based on the first, second, and third rear/reverberation signal data.
  • Step 204 Scalar-multiply the front left channel signal, the front right channel signal, the front center channel signal, the rear left channel signal, and the rear right channel signal by the corresponding speaker parameters to obtain the processed front left channel signal, the processed front right channel signal, the processed front center channel signal, the processed rear left channel signal, and the processed rear right channel signal.
  • the terminal multiplies the front left channel signal by the volume V1 of the virtual front left channel speaker to obtain the processed front left channel signal X_FL; multiplies the front right channel signal by the volume V2 of the virtual front right channel speaker to obtain the processed front right channel signal X_FR; multiplies the front center channel signal by the volume V3 of the virtual front center channel speaker to obtain the processed front center channel signal X_FC; multiplies the rear left channel signal by the volume V4 of the virtual rear left channel speaker to obtain the processed rear left channel signal X_RL; and multiplies the rear right channel signal by the volume V5 of the virtual rear right channel speaker to obtain the processed rear right channel signal X_RR.
  • the method provided in this embodiment filters the first stereo audio signal to obtain a first high frequency signal, calculates a left channel high frequency signal, a center channel high frequency signal, and a right channel high frequency signal according to the first high frequency signal, and calculates the 5.0 channel audio signals from them, thereby obtaining the processed 5.0 channel audio signals. This realizes extracting a high frequency signal from the first stereo audio signal, splitting it into the 5.0 channel audio signals in the 5.1 channel audio signal, and further obtaining the processed 5.0 channel audio signals.
  • FIG. 3 is a flowchart of a method for processing an audio signal provided by an exemplary embodiment of the present application. The method is applied to a terminal having an audio signal processing function, and may be an optional implementation of step 202 in the embodiment of FIG. 2.
  • Step 301 Perform Fast Fourier Transform (FFT) on the first high frequency signal to obtain a high frequency real number signal and a high frequency imaginary number signal.
  • the fast Fourier transform is an algorithm that converts signals in the time domain into frequency domain signals.
  • the first high frequency signal obtains a high frequency real number signal and a high frequency imaginary number signal by fast Fourier transform.
  • the high frequency real number signal includes a left channel high frequency real number signal and a right channel high frequency real number signal
  • the high frequency imaginary number signal includes a left channel high frequency imaginary number signal and a right channel high frequency imaginary number signal.
  • Step 302 Calculate a vector projection based on the high frequency real number signal and the high frequency imaginary number signal.
  • the terminal adds the left channel high frequency real number signal and the right channel high frequency real number signal to obtain a high frequency real sum signal.
  • the high frequency real sum signal is calculated by the following formula: sumRE = X_HIPASS_RE_L + X_HIPASS_RE_R
  • X_HIPASS_RE_L is the left channel high frequency real number signal
  • X_HIPASS_RE_R is the right channel high frequency real number signal
  • sumRE is the high frequency real sum signal.
  • the terminal adds the left channel high frequency imaginary number signal and the right channel high frequency imaginary number signal to obtain a high frequency imaginary sum signal.
  • the high frequency imaginary sum signal is calculated by the following formula: sumIM = X_HIPASS_IM_L + X_HIPASS_IM_R
  • X_HIPASS_IM_L is the left channel high frequency imaginary number signal
  • X_HIPASS_IM_R is the right channel high frequency imaginary number signal
  • sumIM is the high frequency imaginary sum signal.
  • the terminal subtracts the right channel high frequency real number signal from the left channel high frequency real number signal to obtain a high frequency real difference signal.
  • the high frequency real difference signal is calculated by the following formula: diffRE = X_HIPASS_RE_L - X_HIPASS_RE_R
  • diffRE is the high frequency real difference signal.
  • the terminal subtracts the right channel high frequency imaginary number signal from the left channel high frequency imaginary number signal to obtain a high frequency imaginary difference signal.
  • the high frequency imaginary difference signal is calculated by the following formula: diffIM = X_HIPASS_IM_L - X_HIPASS_IM_R
  • diffIM is the high frequency imaginary difference signal.
  • the terminal calculates the squared magnitude of the sum signal from the high frequency real sum signal and the high frequency imaginary sum signal: sumSq = sumRE*sumRE + sumIM*sumIM
  • sumSq is the squared magnitude of the sum signal.
  • the terminal calculates the squared magnitude of the difference signal from the high frequency real difference signal and the high frequency imaginary difference signal: diffSq = diffRE*diffRE + diffIM*diffIM
  • diffSq is the squared magnitude of the difference signal.
  • the terminal performs a vector projection calculation based on the squared magnitudes of the sum signal and the difference signal to obtain a vector projection; the vector projection characterizes the distance from each virtual speaker in the three-dimensional-surround 5.1 virtual speakers to the user.
  • the vector projection is calculated from sumSq and diffSq, where:
  • alpha is the vector projection
  • SQRT represents the square root
  • * represents the scalar product
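The sum and difference spectra described in steps 301 and 302 can be sketched as below. Here sumSq and diffSq are computed as the squared magnitudes of the sum and difference spectra, consistent with the surrounding prose; the final projection formula for alpha is not shown, as the source does not reproduce it.

```python
import numpy as np

def sum_diff_spectra(x_left, x_right):
    """Steps 301-302 sketch: FFT each channel, then form the sum/difference
    spectra (sumRE/sumIM, diffRE/diffIM) and their squared magnitudes."""
    L, R = np.fft.rfft(x_left), np.fft.rfft(x_right)
    sumRE, sumIM = (L + R).real, (L + R).imag
    diffRE, diffIM = (L - R).real, (L - R).imag
    sumSq = sumRE ** 2 + sumIM ** 2      # squared magnitude of the sum spectrum
    diffSq = diffRE ** 2 + diffIM ** 2   # squared magnitude of the difference spectrum
    return sumSq, diffSq

# With identical left and right channels (a fully "centered" signal),
# the difference spectrum vanishes at every frequency line.
t = np.arange(256)
s = np.sin(2 * np.pi * 8 * t / 256)
sumSq, diffSq = sum_diff_spectra(s, s)
```

A large sumSq together with a near-zero diffSq indicates energy common to both channels, which is what the center-channel extraction exploits.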
  • Step 303 Perform inverse fast Fourier transform (IFFT) and overlap-add on the product of the left channel high frequency real number signal and the vector projection to obtain the center channel high frequency signal.
  • Inverse Fast Fourier Transform is an algorithm for converting a frequency domain signal into a time domain signal.
  • the terminal performs inverse fast Fourier transform and overlap-add on the product of the left channel high frequency real number signal and the vector projection to obtain the center channel high frequency signal.
  • overlap-add is a standard signal-processing algorithm; see https://en.wikipedia.org/wiki/Overlap-add_method.
  • the center channel high frequency signal can be calculated from either the left channel high frequency real number signal or the right channel high frequency real number signal; but when the first stereo signal effectively contains only one channel of audio, the audio signal is mostly concentrated in the left channel, so calculating the center high frequency signal from the left channel high frequency real number signal is more accurate.
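The IFFT-plus-overlap-add reconstruction of step 303 can be illustrated with the standard overlap-add identity: windowed blocks that are FFT'd, inverted, and summed at hop-spaced offsets reconstruct the interior of the signal exactly when the window satisfies the constant-overlap-add property. The periodic Hann window at 50% overlap used below is an assumption for illustration; the patent does not specify a window.

```python
import numpy as np

def overlap_add_identity(x, n=128):
    """FFT/IFFT each windowed block and overlap-add the results.
    With a periodic Hann window and hop n/2, overlapping window weights
    sum to 1, so the interior of x is reconstructed exactly."""
    hop = n // 2
    win = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(n) / n)  # periodic Hann
    y = np.zeros(len(x))
    for start in range(0, len(x) - n + 1, hop):
        block = np.fft.irfft(np.fft.rfft(x[start:start + n] * win), n)
        y[start:start + n] += block  # overlap-add at hop-spaced offsets
    return y

x = np.sin(2 * np.pi * np.arange(1024) / 64)
y = overlap_add_identity(x)
```

In step 303 a per-bin scaling (the vector projection) would be applied between the FFT and the IFFT; here the spectrum is left untouched so the identity property is visible.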
  • Step 304 the difference between the left channel high frequency signal and the center channel signal in the first high frequency signal is taken as the left channel high frequency signal.
  • the terminal uses the difference between the left channel high frequency signal and the center channel signal in the first high frequency signal as the left channel high frequency signal.
  • the left channel high frequency signal is calculated by the following formula:
  • X_PRE_L = X_HIPASS_L - X_PRE_C
  • X_HIPASS_L is a left channel high frequency signal in the first high frequency signal
  • X_PRE_C is a center channel signal
  • X_PRE_L is a left channel high frequency signal.
  • Step 305 the difference between the right channel high frequency signal and the center channel signal in the first high frequency signal is used as the right channel high frequency signal.
  • the terminal uses the difference between the right channel high frequency signal and the center channel signal in the first high frequency signal as the right channel high frequency signal.
  • the right channel high frequency signal is calculated by the following formula:
  • X_PRE_R = X_HIPASS_R - X_PRE_C
  • X_HIPASS_R is a right channel high frequency signal in the first high frequency signal
  • X_PRE_C is a center channel signal
  • X_PRE_R is a right channel high frequency signal.
  • The execution sequence of step 304 and step 305 is not limited.
  • the terminal may perform step 304 and then perform step 305, or perform step 305 before performing step 304.
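Steps 304 and 305 are plain element-wise subtractions and are order-independent; a tiny sketch (the signal values are illustrative):

```python
# Illustrative three-sample signals; X_PRE_C is the centre-channel
# signal produced by step 303.
X_HIPASS_L = [0.9, 0.5, 0.1]
X_HIPASS_R = [0.7, 0.5, 0.3]
X_PRE_C    = [0.6, 0.4, 0.1]

X_PRE_L = [l - c for l, c in zip(X_HIPASS_L, X_PRE_C)]   # step 304
X_PRE_R = [r - c for r, c in zip(X_HIPASS_R, X_PRE_C)]   # step 305
```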
  • the method provided in this embodiment performs fast Fourier transform on the first high frequency signal to obtain a high frequency real number signal and a high frequency imaginary number signal, obtains the center high frequency signal through a series of calculations on them, and then calculates the left channel high frequency signal and the right channel high frequency signal from the center high frequency signal, thereby realizing the calculation of the left channel high frequency signal, the center channel high frequency signal, and the right channel high frequency signal from the first high frequency signal.
  • FIG. 4 is a flowchart of a method for processing an audio signal provided by an exemplary embodiment of the present application, which may be performed by a terminal having an audio signal processing function and may be an optional implementation of step 203 in the embodiment of FIG. 2; the method includes:
  • Step 401 For at least one of the left channel high frequency signal, the center channel high frequency signal, and the right channel high frequency signal, obtain at least one moving window according to the sampling points in the channel high frequency signal, where each moving window includes n sampling points and adjacent moving windows overlap by n/2 sampling points.
  • the terminal applies the moving window algorithm to any one of the left channel high frequency signal, the center channel high frequency signal, and the right channel high frequency signal, obtaining at least one moving window according to the sampling points in the channel high frequency signal, where each moving window has n sampling points, adjacent moving windows overlap by n/2 sampling points, and n ≥ 1.
  • the moving window is similar to overlap-add, but only overlaps and does not add. For example, data A contains 1024 sampling points; if the window length is 128 and the overlap length is 64, the moving window outputs A[0-128] first, then A[64-192], then A[128-256], and so on, where A is the data and the numbers in brackets are sampling-point indices.
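The moving-window slicing in the example above (window length 128, overlap 64, i.e. hop 64) can be sketched as:

```python
def moving_windows(signal, n):
    """Window length n, hop n/2: adjacent windows share n/2 samples
    (overlap without the "add" step of overlap-add)."""
    hop = n // 2
    return [signal[i:i + n] for i in range(0, len(signal) - n + 1, hop)]

A = list(range(1024))            # 1024 sampling points, as in the example
wins = moving_windows(A, 128)
# wins[0] is A[0:128], wins[1] is A[64:192], wins[2] is A[128:256], ...
```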
  • Step 402 Calculate a low correlation signal in the moving window and the starting time point of the low correlation signal, where the low correlation signal is a signal whose first attenuation envelope sequence (of the amplitude spectrum) and second attenuation envelope sequence (of the phase spectrum) are not equal.
  • the terminal performs fast Fourier transform on the sampling point signal in the i-th moving window to obtain a fast-Fourier-transformed sampling point signal, where i ≥ 1.
  • the terminal performs moving window and fast Fourier transform on the left channel high frequency signal, the right channel high frequency signal, and the center channel signal according to the preset moving step size and overlap length, obtaining in turn the left channel high frequency real number signal and left channel high frequency imaginary number signal (denoted FFT_L), the right channel high frequency real number signal and right channel high frequency imaginary number signal (denoted FFT_R), and the center channel real number signal and center channel imaginary number signal (denoted FFT_C).
  • the terminal calculates the amplitude spectrum and the phase spectrum of the sample point signal after the fast Fourier transform.
  • the terminal calculates the amplitude spectrum AMP_L and phase spectrum PH_L of the left channel high frequency signal according to FFT_L; calculates the amplitude spectrum AMP_R and phase spectrum PH_R of the right channel high frequency signal according to FFT_R; and calculates the amplitude spectrum AMP_C and phase spectrum PH_C of the center channel signal according to FFT_C.
  • AMP_L, AMP_R, and AMP_C are collectively referred to as AMP_L/R/C;
  • PH_L, PH_R, and PH_C are collectively referred to as PH_L/R/C.
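As a concrete illustration of how the AMP_* and PH_* spectra can be obtained from an FFT's real and imaginary parts, here is a small NumPy sketch (the helper name is ours, not from the source):

```python
import numpy as np

def amp_phase(x):
    """Amplitude and phase spectra of a real windowed signal x.
    rfft returns the n/2 + 1 non-redundant frequency lines as
    complex numbers (real part + imaginary part)."""
    spec = np.fft.rfft(x)
    return np.abs(spec), np.angle(spec)   # e.g. AMP_L, PH_L

n = 64
x = np.sin(2 * np.pi * 4 * np.arange(n) / n)   # 4 cycles in 64 samples
amp, ph = amp_phase(x)
# 64 samples give 33 frequency lines; the energy sits on line 4
```

This matches the statement below that n sample points yield n/2+1 frequency lines.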
  • the terminal calculates the first attenuation envelope sequence of the m frequency lines in the i-th moving window from the amplitude spectrum of the fast-Fourier-transformed sampling point signal, and the second attenuation envelope sequence of the m frequency lines in the i-th moving window from the phase spectrum of that signal. When the first attenuation envelope sequence and the second attenuation envelope sequence of the j-th frequency line among the m frequency lines differ, the j-th frequency line is determined to be a low correlation signal, and the starting time point of the low correlation signal is determined from the window number of the i-th moving window and the frequency line number of the j-th frequency line, where m ≥ 1, 1 ≤ j ≤ m.
  • the terminal calculates the attenuation envelope sequences and the correlations of all frequency lines from AMP_L/R/C and PH_L/R/C of all moving windows. The attenuation envelope sequences are calculated between moving windows; the amplitude spectrum and the phase spectrum corresponding to the same moving window are the valid comparison conditions.
  • for example, if the attenuation envelope sequence of the amplitude spectrum of the 0th frequency line over moving window 1, moving window 2, and moving window 3 is 1.0, 0.8, 0.6, and the attenuation envelope sequence of the phase spectrum of the 0th frequency line over the same windows is 1.0, 0.8, 1.0, then frequency line 0 of moving window 1 and frequency line 0 of moving window 2 are considered to have a high correlation, while frequency line 0 of moving window 2 and frequency line 0 of moving window 3 have a low correlation.
  • n sample points yield n/2+1 frequency lines. The window number and frequency line of each moving window corresponding to a low correlation signal are extracted, and from the window number the starting time point of the signal in X_PRE_L, X_PRE_R and X_PRE_C can be calculated.
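The comparison in steps 401-402 can be sketched as follows. The envelope values come from the text's example; the per-window start-time mapping (window i starts at i times the hop length) is our assumption for illustration:

```python
def low_correlation_hits(amp_env, ph_env, hop=64, tol=1e-6):
    """amp_env[i][j] / ph_env[i][j]: attenuation envelope values of
    frequency line j in moving window i. A line is a low correlation
    signal where the two envelope sequences disagree; the start sample
    is derived from the window number (assumed: window i starts at i*hop)."""
    hits = []
    for i, (arow, prow) in enumerate(zip(amp_env, ph_env)):
        for j, (a, p) in enumerate(zip(arow, prow)):
            if abs(a - p) > tol:
                hits.append((i, j, i * hop))
    return hits

# the text's example: frequency line 0 across moving windows 1..3
amp_env = [[1.0], [0.8], [0.6]]
ph_env  = [[1.0], [0.8], [1.0]]
hits = low_correlation_hits(amp_env, ph_env)
# only the third window disagrees (0.6 vs 1.0)
```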
  • Step 403 determining a target low correlation signal that meets the rear/reverb characteristics.
  • the method by which the terminal determines the target low correlation signal that meets the rear/reverberation characteristics includes but is not limited to: the terminal determines the low correlation signal that is consistent with the rear/reverberation characteristics as the target low correlation signal.
  • Step 404 calculating an end time point of the target low correlation signal.
  • the terminal calculates the end time point of the low correlation signal by:
  • the terminal acquires a time point at which the energy of the frequency line corresponding to the amplitude spectrum of the target low correlation signal is less than the fourth threshold as the end time point.
  • the terminal calculates the end time point of the low correlation signal by:
  • the terminal determines the starting time point of the next low correlation signal as the end time point of the target low correlation signal.
  • Step 405 Extract the target low correlation signal according to the start time point and the end time point as the rear/reverberation signal data in the channel high frequency signal.
  • the terminal extracts the channel signal segment located between the start time point and the end time point; performs a fast Fourier transform on the channel signal segment to obtain a fast-Fourier-transformed signal segment; extracts the frequency lines corresponding to the target low correlation signal from the transformed signal segment to obtain a first partial signal; and performs an inverse fast Fourier transform and overlap-add on the first partial signal to obtain the rear/reverberation signal data in the channel high frequency signal.
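A minimal sketch of extracting the frequency lines of the target signal from one transformed segment and returning to the time domain (one block of the overlap-add; the function name and test signal are illustrative):

```python
import numpy as np

def extract_lines(segment, lines):
    """FFT a windowed segment, keep only the chosen frequency lines,
    and inverse-FFT back to a time-domain partial signal."""
    spec = np.fft.rfft(segment)
    kept = np.zeros_like(spec)
    idx = list(lines)
    kept[idx] = spec[idx]
    return np.fft.irfft(kept, n=len(segment))

n = 64
t = np.arange(n)
seg = np.sin(2 * np.pi * 3 * t / n) + np.sin(2 * np.pi * 9 * t / n)
part = extract_lines(seg, {3})
# only the line-3 component survives; the line-9 component is removed
```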
  • the terminal thus obtains the first rear/reverberation signal data in the left channel high frequency signal, the second rear/reverberation signal data in the center channel high frequency signal, and the third rear/reverberation signal data in the right channel high frequency signal.
  • Step 406 Calculate the front left channel signal, rear left channel signal, front right channel signal, rear right channel signal and front center channel signal according to the first rear/reverberation signal data, the second rear/reverberation signal data, and the third rear/reverberation signal data.
  • the terminal determines the difference between the left channel high frequency signal and the first rear/reverberation signal data obtained in the above step as the front left channel signal.
  • the first rear/reverberation signal data is audio data contained in the left channel high frequency signal and is also audio data contained in the rear left channel signal of the three-dimensional-surround 5.1 virtual speaker; the left channel high frequency signal includes the front left channel signal and part of the rear left channel signal.
  • the terminal determines the sum of the first rear/reverberation signal data and the second rear/reverberation signal data obtained in the above steps as the rear left channel signal.
  • the terminal determines the difference between the right channel high frequency signal and the third rear/reverberation signal data obtained in the above step as the front right channel signal.
  • the third rear/reverberation signal data is audio data contained in the right channel high frequency signal and is also audio data contained in the rear right channel signal of the three-dimensional-surround 5.1 virtual speaker. The right channel high frequency signal includes the front right channel signal and part of the rear right channel signal, so subtracting that part of the rear right channel signal, that is, the third rear/reverberation signal data, from the right channel high frequency signal yields the front right channel signal.
  • the terminal determines the sum of the third rear/reverberation signal data and the second rear/reverberation signal data obtained in the above steps as the rear right channel signal.
  • the terminal determines the difference between the center channel high frequency signal and the second rear/reverberation signal data obtained in the above step as the front center channel signal.
  • the second rear/reverberation signal data is audio data contained in both the rear left channel signal and the rear right channel signal of the three-dimensional-surround 5.1 virtual speaker. The center channel high frequency signal includes the front center channel signal and the second rear/reverberation signal data, so the second rear/reverberation signal data is subtracted from the center channel high frequency signal to obtain the front center channel signal.
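The five derivations of Step 406 reduce to per-sample sums and differences. A sketch where r1, r2, r3 stand for the first, second and third rear/reverberation signal data:

```python
def split_channels(l_hf, c_hf, r_hf, r1, r2, r3):
    """Step 406 arithmetic: l_hf/c_hf/r_hf are the left, center and right
    channel high frequency signals; r1/r2/r3 are the extracted
    rear/reverberation signal data."""
    x_fl = [a - b for a, b in zip(l_hf, r1)]   # front left   = L_hf - R1
    x_rl = [a + b for a, b in zip(r1, r2)]     # rear left    = R1 + R2
    x_fr = [a - b for a, b in zip(r_hf, r3)]   # front right  = R_hf - R3
    x_rr = [a + b for a, b in zip(r3, r2)]     # rear right   = R3 + R2
    x_fc = [a - b for a, b in zip(c_hf, r2)]   # front center = C_hf - R2
    return x_fl, x_rl, x_fr, x_rr, x_fc

out = split_channels([1.0], [0.5], [0.75], [0.25], [0.25], [0.25])
```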
  • the method provided in this embodiment extracts the rear/reverberation signal data in the high frequency signal of each channel by calculating the start time and end time of that data. Based on the rear/reverberation signal data in each channel's high frequency signal, the front left channel signal, rear left channel signal, front right channel signal, rear right channel signal and front center channel signal are calculated, improving the accuracy of the 5.1 channel audio signal computed from the left channel high frequency signal, the center channel high frequency signal, and the right channel high frequency signal.
  • FIG. 5 is a flowchart of a method for processing an audio signal provided by an exemplary embodiment of the present application. The method may be performed by a terminal having an audio signal processing function and may be an optional embodiment of step 102 in the embodiment of FIG. 1. The method comprises:
  • Step 501 Input the first stereo audio signal into a low pass filter for filtering to obtain a first low frequency signal.
  • the terminal inputs the first stereo audio signal into a low pass filter and filters it to obtain the first low frequency signal.
  • the first low frequency signal is a superposed signal of the first left channel low frequency signal and the first right channel low frequency signal.
  • the terminal filters the first stereo audio signal through a 4th-order IIR low-pass filter to obtain the first low-frequency signal.
  • Step 502 Scalar-multiply the first low frequency signal by the volume parameter of the low frequency channel speaker in the 5.1 virtual speaker to obtain a second low frequency signal.
  • the terminal multiplies the first low frequency signal and the volume parameter of the low frequency channel speaker in the 5.1 virtual speaker by a scalar quantity to obtain a second low frequency signal.
  • the terminal calculates the second low frequency signal by the following formula:
  • X_LFE_S = X_LFE * V6
  • where X_LFE is the first low frequency signal, V6 is the volume parameter of the low frequency channel speaker in the 5.1 virtual speaker, X_LFE_S is the second low frequency signal (the superposed signal of the first left channel low frequency signal X_LFE_S_L and the first right channel low frequency signal X_LFE_S_R), and * represents scalar multiplication.
  • Step 503 Perform mono conversion on the second low frequency signal to obtain a processed low frequency channel signal.
  • the terminal performs mono conversion on the second low frequency signal to obtain a processed low frequency channel signal.
  • the terminal calculates the processed low frequency channel signal by the following formula:
  • X_LFE_M = (X_LFE_S_L + X_LFE_S_R) / 2
  • X_LFE_M is the processed low frequency channel signal.
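Steps 502-503 amount to a scalar gain followed by a mono fold. A sketch (the low-pass stage of step 501 is omitted here, and the helper name is ours):

```python
def lfe_channel(x_lfe_l, x_lfe_r, v6):
    """Scale the first low frequency signal (left/right parts) by the
    LFE speaker volume V6, then fold to mono:
    X_LFE_M = (X_LFE_S_L + X_LFE_S_R) / 2."""
    s_l = [v6 * v for v in x_lfe_l]            # X_LFE_S_L
    s_r = [v6 * v for v in x_lfe_r]            # X_LFE_S_R
    return [(a + b) / 2 for a, b in zip(s_l, s_r)]

x_lfe_m = lfe_channel([0.5, 1.0], [0.25, 0.5], v6=0.5)
```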
  • the method provided in this embodiment obtains a first low frequency signal by filtering the first stereo audio signal, and performs mono conversion on the first low frequency signal to obtain a low frequency channel signal in the 5.1 channel audio signal.
  • the first low frequency signal is extracted from the first stereo signal and split into 0.1 channel audio signals in the 5.1 channel audio signal.
  • the above method embodiments obtain a 5.1 channel audio signal, namely a front left channel signal, a front right channel signal, a front center channel signal, a low frequency channel signal, a rear left channel signal, and a rear right channel signal. The embodiments of FIG. 6 and FIG. 8 below provide methods of processing and synthesizing the 5.1 channel audio signal to obtain a second stereo audio signal. Each method may be an optional embodiment of step 104 in the embodiment of FIG. 1, or may be a separate embodiment.
  • the stereo signal obtained in the embodiments of FIG. 6 and FIG. 8 may be the second stereo signal in the above method embodiments.
  • the HRTF (Head Related Transfer Function) processing technique is a processing technique for generating stereo surround sound effects.
  • the technician can pre-build an HRTF database in which the correspondence between the HRTF data, the HRTF data collection point, and the position coordinates of the HRTF data collection point with respect to the reference human head is recorded.
  • the HRTF data is a set of parameters used to process the left channel audio signal and the right channel audio signal.
  • FIG. 6 is a flowchart of a method for processing an audio signal provided by an exemplary embodiment of the present application. The method may be performed by a terminal having an audio signal processing function and may be an optional embodiment of step 104 in the embodiment of FIG. 1. The method comprises:
  • Step 601 obtaining a 5.1 channel audio signal
  • the 5.1 channel audio signal is the processed 5.1 channel audio signal obtained by splitting and processing the first stereo audio signal in the embodiment of FIG. 1 to FIG. 5 described above.
  • the 5.1 channel audio signal is a 5.1 channel audio signal that is downloaded or read from a storage medium.
  • the 5.1 channel audio signal includes: a front left channel signal, a front right channel signal, a front center channel signal, a low frequency channel signal, a rear left channel signal, and a rear right channel signal.
  • Step 602 Acquire HRTF data corresponding to each virtual speaker in the 5.1 virtual speaker according to coordinates of the 5.1 virtual speaker in the virtual environment;
  • the 5.1 virtual speaker includes: a front left channel virtual speaker FL, a front right channel virtual speaker FR, a front center channel virtual speaker FC, a subwoofer virtual speaker LFE, a rear left channel virtual speaker RL And rear right channel virtual speaker RR.
  • the 5.1 virtual speaker has its own coordinates in the virtual environment.
  • the virtual environment may be a two-dimensional planar virtual environment or a three-dimensional virtual environment.
  • FIG. 7 is a schematic diagram of the 5.1 channel virtual speakers in a two-dimensional planar virtual environment, assuming that the reference human head is at the center point 70 in FIG. 7 and faces the front center channel virtual speaker FC. Each channel speaker is at the same distance from the center point 70 where the reference head is located and lies in the same plane.
  • the front center channel virtual speaker FC is directly in front of the facing direction of the reference head.
  • the front left channel virtual speaker FL and the front right channel virtual speaker FR are located on the two sides of the front center channel virtual speaker FC, each at an angle of 30 degrees to the facing direction of the reference head, and are symmetrically arranged.
  • the rear left channel virtual speaker RL and the rear right channel virtual speaker RR are located behind the reference head on opposite sides, each at an angle of 100-120 degrees to the facing direction of the reference head, and are symmetrically arranged.
  • the placement position of the subwoofer virtual speaker LFE is not strictly required. The direction directly behind the reference head is used as an example, but this application does not limit the angle between the subwoofer virtual speaker LFE and the facing direction of the reference head.
  • the distance between each of the above 5.1 channel virtual speakers and the reference head is only exemplary; the distance between each virtual speaker and the reference head may differ, and the height of each virtual speaker may also differ. Differences in the positions of the virtual speakers cause differences in the sound signals, which is not limited in the present disclosure.
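The layout described above can be written down as coordinates. In the sketch below, the front angles (0°, ±30°) follow the text, 110° is one representative value from the text's 100-120° rear range, and placing LFE directly behind the head is only the text's example, not a requirement:

```python
import math

def speaker_coords(radius=1.0, rear_deg=110.0):
    """2D coordinates of the 5.1 virtual speakers around a reference
    head at the origin facing the +y direction."""
    def at(deg):
        rad = math.radians(deg)
        return (radius * math.sin(rad), radius * math.cos(rad))
    return {"FC": at(0.0), "FL": at(-30.0), "FR": at(30.0),
            "RL": at(-rear_deg), "RR": at(rear_deg), "LFE": at(180.0)}

pos = speaker_coords()
# FC sits straight ahead; RL/RR end up behind the head (negative y)
```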
  • the coordinates of each virtual speaker in the virtual environment can be obtained.
  • the HRTF database is stored in the terminal, and the HRTF database includes: a correspondence between at least one HRTF data collection point and HRTF data, and each HRTF data collection point has a respective coordinate.
  • the terminal queries the HRTF data collection point closest to the i-th coordinate in the HRTF database according to the i-th coordinate of the i-th virtual speaker in the 5.1 virtual speaker, and determines the HRTF data of the HRTF data collection point closest to the i-th coordinate. For the HRTF data of the i-th virtual speaker, i ⁇ 1.
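The nearest-point query can be sketched as a minimum-distance search. The dictionary layout below is illustrative; the patent only requires that the correspondence between collection points, coordinates and HRTF data be queryable:

```python
import math

def nearest_hrtf(speaker_coord, hrtf_db):
    """hrtf_db maps a collection point id to (coordinates, hrtf_data).
    Returns the HRTF data of the collection point whose coordinates
    are closest to the virtual speaker's coordinates."""
    best = min(hrtf_db.values(),
               key=lambda e: math.dist(e[0], speaker_coord))
    return best[1]

db = {"front": ((0.0, 1.0), "hrtf_front"),
      "right": ((1.0, 0.0), "hrtf_right")}
data = nearest_hrtf((0.1, 0.9), db)
```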
  • Step 603 according to the HRTF data corresponding to each virtual speaker, processing the corresponding channel audio signal in the 5.1 channel audio signal to obtain the processed 5.1 channel audio signal;
  • each HRTF data includes a left channel HRTF coefficient and a right channel HRTF coefficient.
  • the terminal processes the i-th channel audio signal in the 5.1-channel audio signal according to the left channel HRTF coefficient in the HRTF data corresponding to the i-th virtual speaker, and obtains the processed i-th channel audio signal corresponding to Left channel component;
  • the terminal processes the ith channel audio signal in the 5.1 channel audio signal according to the right channel HRTF coefficient in the HRTF data corresponding to the i-th virtual speaker, and obtains the processed ith channel audio signal corresponding to Right channel component.
  • Step 604 synthesize the processed 5.1 channel audio signal into a stereo audio signal.
  • the 5.1 channel audio signal in this implementation of the present application is the processed 5.1 channel audio signal obtained by splitting and processing the first stereo audio signal in the embodiments of FIG. 1 to FIG. 5 described above.
  • the stereo audio signal in this step is the second stereo audio signal in the embodiment of Fig. 1.
  • the method provided by this embodiment processes the 5.1 channel audio signal according to the HRTF data of each virtual speaker in the 5.1 virtual speaker and synthesizes a stereo audio signal, so that the user needs only an ordinary stereo earphone or a 2.0 speaker to play the 5.1 channel audio signal and obtain better playback quality.
  • FIG. 8 is a flowchart of a method for processing an audio signal provided by an exemplary embodiment of the present application. The method may be performed by a terminal having an audio signal processing function and may be an optional embodiment of step 104 in the embodiment of FIG. 1. The method comprises:
  • Step 801 Collect a series of at least one HRTF data with reference to the human head as the center of the sphere in the acoustic room, and record position coordinates of the HRTF data collection points corresponding to the reference human heads of the respective HRTF data;
  • the developer pre-places the reference human head 92 (a dummy head) in the acoustic room 91 (sound-absorbing sponge is arranged around the room to reduce echo interference), and sets miniature omnidirectional microphones in the left and right ear canals of the reference head 92.
  • after completing the setup of the reference head 92, the developer sets an HRTF data collection point every predetermined distance on the surface of a sphere with the reference head 92 as the center, and uses the speaker 93 to play predetermined audio at each HRTF data collection point.
  • the HRTF data at the HRTF data collection point includes a left channel HRTF coefficient corresponding to the left channel and a right channel HRTF coefficient corresponding to the right channel.
  • Step 802 Generate an HRTF database according to the HRTF data, the identifier of the HRTF data collection point, and the location coordinates of the HRTF data collection point.
  • a coordinate system is established with reference to the head 92 as a center point.
  • the coordinate system is established in the same way as the coordinate system of the 5.1-channel virtual speaker.
  • when the virtual environment corresponding to the 5.1 channel virtual speaker is a two-dimensional virtual environment, it is also possible to establish a coordinate system only for the horizontal plane in which the reference human head 92 is located and to collect only the HRTF data belonging to that horizontal plane. For example, on a ring centered on the reference head 92, a point is taken every 5° as an HRTF data sampling point. In this case, the amount of HRTF data that the terminal needs to store can be reduced.
  • when the virtual environment corresponding to the 5.1 channel virtual speaker is a three-dimensional virtual environment, a coordinate system can be established for the three-dimensional environment in which the reference head 92 is located, and the HRTF data on the surface of the sphere with the reference head 92 as the center is collected. For example, on the surface of that sphere, one point is taken every 5 degrees in the longitude direction and the latitude direction as an HRTF data sampling point.
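For the two-dimensional case just described, the sampling grid is simply one azimuth angle every 5°, which keeps the stored database small:

```python
def ring_points(step_deg=5):
    """Azimuth angles of HRTF sampling points on the horizontal ring
    around the reference head, one every step_deg degrees."""
    return list(range(0, 360, step_deg))

ring = ring_points()
# 360 / 5 = 72 sampling points on the ring
```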
  • the terminal generates an HRTF database according to the identifier of each HRTF data sampling point, the HRTF data of each HRTF data sampling point, and the position coordinates of each HRTF data collection point.
  • step 801 and step 802 can also be performed by another device; after the HRTF database is generated, it is transmitted to the current terminal through a network or a storage medium.
  • Step 803 acquiring a 5.1 channel audio signal
  • the terminal acquires a 5.1 channel audio signal.
  • the 5.1 channel audio signal is the processed 5.1 channel audio signal obtained by separating and processing the first stereo audio signal in the above-described embodiments of FIGS. 1 to 5.
  • the 5.1 channel audio signal is a 5.1 channel audio signal that is downloaded or read from a storage medium.
  • the 5.1 channel audio signal includes: a front left channel signal X_FL, a front right channel signal X_FR, a front center channel signal X_FC, a low frequency channel signal X_LFE_M, a rear left channel signal X_RL, and a rear right channel signal X_RR.
  • Step 804 Acquire an HRTF database, where the HRTF database includes: a correspondence between at least one HRTF data collection point and HRTF data, and each HRTF data collection point has a respective coordinate;
  • the terminal can read the HRTF database stored locally or access the HRTF library stored on the network.
  • Step 805 According to the i-th coordinate of the i-th virtual speaker in the 5.1 virtual speaker, query the HRTF database for the HRTF data collection point closest to the i-th coordinate, and determine the HRTF data of the collection point closest to the i-th coordinate as the HRTF data of the i-th virtual speaker;
  • the terminal pre-stores the coordinates of each virtual speaker in the 5.1 virtual speaker. Among them, i ⁇ 1.
  • the terminal queries the HRTF database for the HRTF data collection point closest to the first coordinate according to the first coordinate of the front left channel virtual speaker, and determines the HRTF data of the HRTF data collection point closest to the first coordinate as the front. HRTF data for the left channel virtual speaker.
  • the terminal queries the HRTF database for the HRTF data collection point closest to the second coordinate according to the second coordinate of the front right channel virtual speaker, and determines the HRTF data of the HRTF data collection point closest to the second coordinate as the front. HRTF data for the right channel virtual speaker.
  • the terminal queries the HRTF database to select the HRTF data collection point closest to the third coordinate according to the third coordinate of the front center channel virtual speaker, and determines the HRTF data of the HRTF data collection point closest to the third coordinate as the front. HRTF data for the center channel virtual speaker.
  • the terminal queries the HRTF data acquisition point closest to the fourth coordinate in the HRTF database according to the fourth coordinate of the rear left channel virtual speaker, and determines the HRTF data of the HRTF data collection point closest to the fourth coordinate as the rear position. HRTF data for the left channel virtual speaker.
  • the terminal queries the HRTF data acquisition point closest to the fifth coordinate in the HRTF database according to the fifth coordinate of the rear right channel virtual speaker, and determines the HRTF data of the HRTF data collection point closest to the fifth coordinate as the post-position. HRTF data for the right channel virtual speaker.
  • the terminal queries the HRTF data acquisition point closest to the sixth coordinate in the HRTF database according to the sixth coordinate of the low frequency virtual speaker, and determines the HRTF data of the HRTF data collection point closest to the sixth coordinate as the HRTF data of the low frequency virtual speaker. .
  • “closest” means that the coordinates of the virtual speaker and the coordinates of the HRTF data sampling point are the same or the distance between the coordinates is the shortest.
  • Step 806 For the audio signal of the i-th channel in the 5.1 channel audio signal, perform a first convolution using the left channel HRTF coefficient in the HRTF data corresponding to the i-th virtual speaker to obtain the first-convolved audio signal of the i-th channel, that is, the convolution of the i-th channel audio signal with H_L_i, where * indicates convolution and H_L_i indicates the left channel HRTF coefficient in the HRTF data corresponding to the i-th virtual speaker.
  • Step 807 superimposing the audio signals of the first convolved channels to obtain a left channel signal in the stereo audio signal
  • Step 808 for the audio signal of the i-th channel in the 5.1-channel audio signal, performing a second convolution using the right channel HRTF coefficient in the HRTF data corresponding to the i-th virtual speaker, to obtain the second convolution The audio signal of the i-th channel;
  • Step 809 superimposing the audio signals of the respective channels after the second convolution to obtain a right channel signal in the stereo audio signal
  • Step 810 synthesizing the left channel signal and the right channel signal into a stereo audio signal.
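Steps 806-810 can be sketched as convolve-and-superimpose. This is a minimal NumPy illustration with toy one-tap "HRTF" coefficients; real HRTF filters are much longer:

```python
import numpy as np

def synthesize_stereo(channels, hrtf_pairs):
    """channels: the 5.1 channel signals; hrtf_pairs: one (h_left, h_right)
    coefficient pair per virtual speaker. Each channel is convolved with
    both coefficients and the results are superimposed into the left and
    right signals of the output stereo audio signal."""
    n = max(len(x) + len(hl) - 1 for x, (hl, _) in zip(channels, hrtf_pairs))
    left, right = np.zeros(n), np.zeros(n)
    for x, (hl, hr) in zip(channels, hrtf_pairs):
        yl, yr = np.convolve(x, hl), np.convolve(x, hr)
        left[:len(yl)] += yl
        right[:len(yr)] += yr
    return left, right

# toy example: two channels, trivial one-tap coefficient pairs
L, R = synthesize_stereo(
    [np.array([1.0, 0.0]), np.array([0.0, 1.0])],
    [(np.array([0.5]), np.array([0.5])), (np.array([1.0]), np.array([0.0]))])
```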
  • the synthesized stereo audio signal can be stored as an audio file or played on a playback device.
  • the 5.1 channel audio signal in this implementation of the present application is the processed 5.1 channel audio signal obtained by splitting and processing the first stereo audio signal in the embodiments of FIG. 1 to FIG. 5 described above.
  • the stereo audio signal in this step is the second stereo audio signal in the embodiment of Fig. 1.
  • the method provided by this embodiment processes the 5.1 channel audio signal according to the HRTF data of each virtual speaker in the 5.1 virtual speaker and synthesizes a stereo audio signal, so that the user needs only an ordinary stereo earphone or a 2.0 speaker to play the 5.1 channel audio signal and obtain better playback quality.
  • the method provided in this embodiment obtains a stereo audio signal with better three-dimensional surround sound by convolving and superimposing the 5.1 channel audio signal according to the HRTF data of each virtual speaker in the 5.1 virtual speaker; the stereo audio signal has a good three-dimensional surround effect during playback.
  • FIG. 10 is a structural block diagram of a processing apparatus for an audio signal provided by an exemplary embodiment of the present application, which may be implemented as part of a terminal or a terminal.
  • the device includes:
  • the obtaining module 1010 is configured to acquire a first stereo audio signal
  • the processing module 1020 is configured to split the first stereo audio signal into a 5.1 channel audio signal, and perform signal processing on the 5.1 channel audio signal according to the speaker parameters of the 5.1 virtual speaker of the three-dimensional surround to obtain the processed 5.1 channel audio signal. ;
  • the synthesizing module 1030 is configured to synthesize the processed 5.1 channel audio signal into a stereo audio signal.
  • the apparatus further includes a computing module 1040;
  • the processing module 1020 is configured to input the first stereo audio signal into a high-pass filter for filtering to obtain a first high-frequency signal;
  • the calculating module 1040 is configured to calculate a left channel high frequency signal, a center channel high frequency signal, and a right channel high frequency signal according to the first high frequency signal; and to calculate the front left channel signal, front right channel signal, front center channel signal, low frequency channel signal, rear left channel signal and rear right channel signal in the 5.1 channel audio signal according to the left channel high frequency signal, the center channel high frequency signal and the right channel high frequency signal.
  • the calculating module 1040 is further configured to perform a fast Fourier transform on the first high frequency signal to obtain a high frequency real signal and a high frequency imaginary signal; calculate a vector projection according to the high frequency real signal and the high frequency imaginary signal; perform an inverse fast Fourier transform on the product of the left channel high frequency real signal in the high frequency real signal and the vector projection to obtain the center channel high frequency signal; use the difference between the left channel high frequency signal in the first high frequency signal and the center channel signal as the left channel high frequency signal; and use the difference between the right channel high frequency signal in the first high frequency signal and the center channel signal as the right channel high frequency signal.
  • the calculating module 1040 is further configured to add the left channel high frequency real signal and the right channel high frequency real signal in the high frequency real signal to obtain a high frequency real sum signal; add the left channel high frequency imaginary signal and the right channel high frequency imaginary signal in the high frequency imaginary signal to obtain a high frequency imaginary sum signal; subtract the left channel high frequency real signal and the right channel high frequency real signal in the high frequency real signal to obtain a high frequency real difference signal; subtract the left channel high frequency imaginary signal and the right channel high frequency imaginary signal in the high frequency imaginary signal to obtain a high frequency imaginary difference signal; calculate a real sum signal according to the high frequency real sum signal and the high frequency imaginary sum signal; calculate a real difference signal according to the high frequency real difference signal and the high frequency imaginary difference signal; and perform a vector projection calculation according to the real sum signal and the real difference signal to obtain the vector projection.
  • the calculation module 1040 is further configured to calculate the vector projection according to the following formula when the real sum signal is a valid number, where:
  • alpha is the vector projection;
  • diffSq is the real difference signal;
  • sumSq is the real sum signal;
  • SQRT represents the square root;
  • * represents scalar multiplication.
  • the processing module 1020 is further configured to extract first rear/reverberation signal data in the left channel high frequency signal, second rear/reverberation signal data in the center channel high frequency signal, and third rear/reverberation signal data in the right channel high frequency signal.
  • the calculating module 1040 is further configured to: determine the difference between the left channel high frequency signal and the first rear/reverberation signal data as the front left channel signal; determine the sum of the first rear/reverberation signal data and the second rear/reverberation signal data as the rear left channel signal; determine the difference between the right channel high frequency signal and the third rear/reverberation signal data as the front right channel signal; determine the sum of the third rear/reverberation signal data and the second rear/reverberation signal data as the rear right channel signal; and determine the difference between the center channel high frequency signal and the second rear/reverberation signal data as the front center channel signal.
  • the obtaining module 1010 is further configured to: for any one of the left channel high frequency signal, the center channel high frequency signal and the right channel high frequency signal, obtain at least one moving window from the sampling points in that channel high frequency signal, each moving window including n sampling points, with n/2 sampling points of two adjacent moving windows overlapping, n ≥ 1.
  • the calculating module 1040 is further configured to: calculate a low correlation signal in the moving window and the starting time point of the low correlation signal, where the low correlation signal is a signal whose first attenuation envelope sequence of the amplitude spectrum and second attenuation envelope sequence of the phase spectrum are not equal; determine a target low correlation signal that conforms to the rear/reverberation characteristics; calculate the end time point of the target low correlation signal; and extract the target low correlation signal according to the start time point and the end time point, as the rear/reverberation signal data in the channel high frequency signal.
  • the calculating module 1040 is further configured to: perform a fast Fourier transform on the sampling point signal in the i-th moving window to obtain a fast Fourier transformed sampling point signal; calculate the amplitude spectrum and the phase spectrum of the transformed sampling point signal; calculate a first attenuation envelope sequence of m frequency lines in the i-th moving window according to the amplitude spectrum; calculate a second attenuation envelope sequence of the m frequency lines in the i-th moving window according to the phase spectrum; when the first attenuation envelope sequence and the second attenuation envelope sequence of the j-th frequency line in the m frequency lines are not equal, determine that the j-th frequency line is a low correlation signal; and determine the starting time point of the low correlation signal according to the window number of the i-th moving window and the frequency line number of the j-th frequency line, i ≥ 1, m ≥ 1, 1 ≤ j ≤ m.
  • the calculation module 1040 is further configured to: determine that the low correlation signal is a target low correlation signal conforming to the rear/reverberation characteristics when the amplitude spectrum energy of the very high frequency line of the low correlation signal is less than a first threshold and the attenuation envelope slope in the windows adjacent to the window containing the very high frequency line is greater than a second threshold; or, determine that the low correlation signal is a target low correlation signal conforming to the rear/reverberation characteristics when the amplitude spectrum energy of the very high frequency line of the low correlation signal is less than the first threshold and the phase attenuation speed in the windows adjacent to the window containing the very high frequency line is greater than a third threshold.
  • the calculating module 1040 is further configured to: acquire the time point at which the energy of the frequency line corresponding to the amplitude spectrum of the target low correlation signal falls below a fourth threshold, as the end time point; or, when the energy of the target low correlation signal is less than 1/n of the energy of the next low correlation signal, determine the starting time point of the next low correlation signal as the end time point of the target low correlation signal.
  • the acquisition module 1010 is further configured to extract the channel signal segment located between the start time point and the end time point.
  • the calculating module 1040 is further configured to: perform a fast Fourier transform on the channel signal segment to obtain a fast Fourier transformed signal segment; extract the frequency lines corresponding to the target low correlation signal from the fast Fourier transformed signal segment to obtain a first partial signal; and perform an inverse fast Fourier transform and overlap-add on the first partial signal to obtain the rear/reverberation signal data in the channel high frequency signal.
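The FFT, frequency-line extraction and IFFT sequence just described can be sketched as follows. The frequency-line indices (`target_bins`), the FFT length and the function name are illustrative assumptions, and the overlap-add across successive segments is omitted for brevity:

```python
import numpy as np

def extract_frequency_lines(segment, target_bins, n_fft=1024):
    """FFT the channel signal segment, keep only the frequency lines of
    the target low correlation signal, and IFFT back to the time domain.
    `target_bins` is a hypothetical list of frequency-line indices."""
    spectrum = np.fft.rfft(segment, n=n_fft)
    kept = np.zeros_like(spectrum)
    kept[target_bins] = spectrum[target_bins]   # first partial signal
    return np.fft.irfft(kept, n=n_fft)[: len(segment)]
```

A pure tone whose frequency line is in `target_bins` passes through unchanged, while all other frequency content is removed.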
  • the calculating module 1040 is further configured to: scalar-multiply the front left channel signal by the volume of the virtual front left channel speaker to obtain the processed front left channel signal; scalar-multiply the front right channel signal by the volume of the virtual front right channel speaker to obtain the processed front right channel signal; scalar-multiply the front center channel signal by the volume of the virtual front center channel speaker to obtain the processed front center channel signal; scalar-multiply the rear left channel signal by the volume of the virtual rear left channel speaker to obtain the processed rear left channel signal; and scalar-multiply the rear right channel signal by the volume of the virtual rear right channel speaker to obtain the processed rear right channel signal.
  • the 5.1 channel audio signal comprises a low frequency channel signal
  • the processing module 1020 is further configured to input the first stereo audio signal into a low pass filter for filtering, to obtain a first low frequency signal.
  • the calculating module 1040 is further configured to scalar-multiply the first low frequency signal by the volume parameter of the low frequency channel speaker in the 5.1 virtual speaker set to obtain a second low frequency signal, and to convert the second low frequency signal to mono to obtain the processed low frequency channel signal.
  • the second low frequency signal comprises: a left channel low frequency signal and a right channel low frequency signal;
  • the calculation module 1040 is further configured to superimpose the left channel low frequency signal and the right channel low frequency signal and then average them, using the averaged audio signal as the processed low frequency channel signal.
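A minimal sketch of this low frequency path, combining the volume scaling with the mono downmix by averaging; the function name and the use of NumPy are illustrative choices, not from the patent:

```python
import numpy as np

def process_lfe(low_left, low_right, lfe_volume):
    """Scalar-multiply each low frequency channel by the LFE speaker
    volume, then downmix to mono by superimposing and averaging."""
    scaled_left = np.asarray(low_left) * lfe_volume
    scaled_right = np.asarray(low_right) * lfe_volume
    return (scaled_left + scaled_right) / 2.0
```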
  • FIG. 11 is a block diagram showing the structure of a processing apparatus for an audio signal provided by an exemplary embodiment of the present application.
  • the device can be implemented as a terminal or as a part of a terminal.
  • the device includes:
  • a first obtaining module 1120 configured to acquire a 5.1 channel audio signal
  • a second obtaining module 1140 configured to acquire, according to coordinates of the 5.1 virtual speaker in the virtual environment, a head related transformation function HRTF data corresponding to each virtual speaker in the 5.1 virtual speaker;
  • the processing module 1160 is configured to process the corresponding channel audio signal in the 5.1 channel audio signal according to the HRTF data corresponding to each virtual speaker, to obtain the processed 5.1 channel audio signal;
  • the synthesizing module 1180 is configured to synthesize the processed 5.1 channel audio signal into a stereo audio signal.
  • the second obtaining module 1140 is configured to: acquire an HRTF database, where the HRTF database includes a correspondence between at least one HRTF data collection point and HRTF data, each HRTF data collection point having its own coordinates; search the HRTF database for the HRTF data collection point closest to the i-th coordinate according to the i-th coordinate of the i-th virtual speaker in the 5.1 virtual speaker set; and determine the HRTF data of the HRTF data collection point closest to the i-th coordinate as the HRTF data of the i-th virtual speaker, i ≥ 1.
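The nearest-point lookup can be sketched as below. The database layout (`{point_id: ((x, y, z), hrtf_data)}`) is a hypothetical structure chosen for the example, not the one defined by the patent:

```python
import math

def nearest_hrtf(hrtf_db, speaker_xyz):
    """Return the HRTF data of the collection point closest to the
    virtual speaker coordinate. `hrtf_db` is a hypothetical layout:
    {point_id: ((x, y, z), hrtf_data)}."""
    closest = min(hrtf_db, key=lambda pid: math.dist(hrtf_db[pid][0], speaker_xyz))
    return hrtf_db[closest][1]
```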
  • the device further includes:
  • the collecting module 1112 is configured to collect at least one set of HRTF data in an acoustic room with a reference human head as the sphere center, and to record the position coordinates, relative to the reference human head, of the HRTF data collection point corresponding to each set of HRTF data;
  • the generating module 1114 is configured to generate an HRTF database according to the HRTF data, the identifier of the HRTF data collection point, and the location coordinates of the HRTF data collection point.
  • the HRTF data includes: a left channel HRTF coefficient
  • the processing module 1160 includes:
  • the left channel convolution unit is configured to perform, for the audio signal of the i-th channel in the 5.1 channel audio signal, a first convolution using the left channel HRTF coefficient in the HRTF data corresponding to the i-th virtual speaker, to obtain the audio signal of the i-th channel after the first convolution;
  • the left channel synthesizing unit is configured to superimpose the audio signals of the respective channels after the first convolution to obtain a left channel signal in the stereo audio signal.
  • the HRTF data includes: a right channel HRTF coefficient
  • the processing module 1160 includes:
  • the right channel convolution unit is configured to perform a second convolution on the audio signal of the i-th channel of the 5.1 channel audio signal by using the right channel HRTF coefficient in the HRTF data corresponding to the i-th virtual speaker, to obtain the audio signal of the i-th channel after the second convolution;
  • the right channel synthesizing unit is configured to superimpose the audio signals of the respective channels after the second convolution to obtain a right channel signal in the stereo audio signal.
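Taken together with the left channel units above, the per-speaker convolution and superposition can be sketched as follows. The list-based layout of channel signals and HRTF coefficient sequences is illustrative:

```python
import numpy as np

def binaural_downmix(channel_signals, left_hrtfs, right_hrtfs):
    """Convolve the audio signal of each channel with the left/right
    channel HRTF coefficients of its virtual speaker, then superimpose
    the convolved signals into the left and right stereo signals."""
    left = sum(np.convolve(sig, h) for sig, h in zip(channel_signals, left_hrtfs))
    right = sum(np.convolve(sig, h) for sig, h in zip(channel_signals, right_hrtfs))
    return left, right
```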
  • FIG. 12 is a structural block diagram of a terminal 1200 provided by an exemplary embodiment of the present application.
  • the terminal 1200 can be: a smart phone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a laptop or a desktop computer.
  • Terminal 1200 may also be referred to as a user device, a portable terminal, a laptop terminal, a desktop terminal, and the like.
  • the terminal 1200 includes a processor 1201 and a memory 1202.
  • Processor 1201 may include one or more processing cores, such as a 4-core processor, an 8-core processor, and the like.
  • the processor 1201 may be implemented in at least one hardware form of a DSP (Digital Signal Processor), an FPGA (Field-Programmable Gate Array) and a PLA (Programmable Logic Array).
  • the processor 1201 may also include a main processor and a coprocessor.
  • the main processor is a processor for processing data in the awake state, also called a CPU (Central Processing Unit); the coprocessor is a low-power processor for processing data in the standby state.
  • the processor 1201 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and rendering of the content that the display needs to display.
  • the processor 1201 may further include an AI (Artificial Intelligence) processor for processing computational operations related to machine learning.
  • Memory 1202 can include one or more computer readable storage media, which can be non-transitory.
  • the memory 1202 may also include high speed random access memory, as well as non-volatile memory such as one or more magnetic disk storage devices, flash memory storage devices.
  • the non-transitory computer readable storage medium in memory 1202 is for storing at least one instruction for execution by processor 1201 to implement the audio provided by various method embodiments of the present application. Signal processing method.
  • the terminal 1200 optionally further includes: a peripheral device interface 1203 and at least one peripheral device.
  • the processor 1201, the memory 1202, and the peripheral device interface 1203 may be connected by a bus or a signal line.
  • Each peripheral device can be connected to the peripheral device interface 1203 via a bus, signal line or circuit board.
  • the peripheral device includes at least one of a radio frequency circuit 1204, a touch display screen 1205, a camera 1206, an audio circuit 1207, a positioning component 1208, and a power source 1209.
  • the peripheral device interface 1203 can be used to connect at least one I/O (Input/Output) related peripheral device to the processor 1201 and the memory 1202.
  • in some embodiments, the processor 1201, the memory 1202 and the peripheral device interface 1203 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 1201, the memory 1202 and the peripheral device interface 1203 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
  • the RF circuit 1204 is configured to receive and transmit an RF (Radio Frequency) signal, also referred to as an electromagnetic signal.
  • Radio frequency circuit 1204 communicates with the communication network and other communication devices via electromagnetic signals.
  • the RF circuit 1204 converts the electrical signal into an electromagnetic signal for transmission, or converts the received electromagnetic signal into an electrical signal.
  • the radio frequency circuit 1204 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and the like.
  • the radio frequency circuit 1204 can communicate with other terminals via at least one wireless communication protocol.
  • the wireless communication protocols include, but are not limited to, the World Wide Web, a metropolitan area network, an intranet, generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks.
  • the radio frequency circuit 1204 may further include an NFC (Near Field Communication) related circuit, which is not limited in this application.
  • the display screen 1205 is used to display a UI (User Interface).
  • the UI can include graphics, text, icons, video, and any combination thereof.
  • when the display screen 1205 is a touch display screen, the display screen 1205 also has the ability to capture touch signals on or above the surface of the display screen 1205.
  • the touch signal can be input to the processor 1201 as a control signal for processing.
  • the display screen 1205 can also be used to provide virtual buttons and/or virtual keyboards, also referred to as soft buttons and/or soft keyboards.
  • in some embodiments, there may be one display screen 1205, disposed on the front panel of the terminal 1200; in other embodiments, there may be at least two display screens 1205, respectively disposed on different surfaces of the terminal 1200 or in a folded design; in still other embodiments, the display screen 1205 may be a flexible display screen disposed on a curved surface or a folded surface of the terminal 1200. The display screen 1205 can even be set as a non-rectangular irregular pattern, that is, a shaped screen.
  • the display screen 1205 can be prepared by using an LCD (Liquid Crystal Display) or an OLED (Organic Light-Emitting Diode).
  • Camera component 1206 is used to capture images or video.
  • camera assembly 1206 includes a front camera and a rear camera.
  • the front camera is placed on the front panel of the terminal, and the rear camera is placed on the back of the terminal.
  • in some embodiments, there are at least two rear cameras, each being one of a main camera, a depth camera, a wide-angle camera and a telephoto camera, so as to realize a background blur function through the fusion of the main camera and the depth camera, panoramic shooting and VR (Virtual Reality) shooting through the fusion of the main camera and the wide-angle camera, or other fused shooting functions.
  • camera assembly 1206 can also include a flash.
  • the flash can be a monochrome temperature flash or a two-color temperature flash.
  • the two-color temperature flash is a combination of a warm flash and a cool flash that can be used for light compensation at different color temperatures.
  • the audio circuit 1207 can include a microphone and a speaker.
  • the microphone is used to collect sound waves of the user and the environment, convert the sound waves into electrical signals, and input them to the processor 1201 for processing, or input them to the radio frequency circuit 1204 for voice communication.
  • the microphones may be multiple, and are respectively disposed at different parts of the terminal 1200.
  • the microphone can also be an array microphone or an omnidirectional acquisition microphone.
  • the speaker is then used to convert electrical signals from the processor 1201 or the RF circuit 1204 into sound waves.
  • the speaker can be a conventional film speaker or a piezoelectric ceramic speaker.
  • the audio circuit 1207 can also include a headphone jack.
  • the positioning component 1208 is configured to locate the current geographic location of the terminal 1200 to implement navigation or LBS (Location Based Service).
  • the positioning component 1208 can be a positioning component based on a US-based GPS (Global Positioning System), a Chinese Beidou system, or a Russian Galileo system.
  • a power supply 1209 is used to power various components in the terminal 1200.
  • the power source 1209 can be an alternating current, a direct current, a disposable battery, or a rechargeable battery.
  • the rechargeable battery may be a wired rechargeable battery or a wireless rechargeable battery.
  • a wired rechargeable battery is a battery that is charged by a wired line
  • a wireless rechargeable battery is a battery that is charged by a wireless coil.
  • the rechargeable battery can also be used to support fast charging technology.
  • terminal 1200 also includes one or more sensors 1210.
  • the one or more sensors 1210 include, but are not limited to, an acceleration sensor 1211, a gyro sensor 1212, a pressure sensor 1213, a fingerprint sensor 1214, an optical sensor 1215, and a proximity sensor 1216.
  • the acceleration sensor 1211 can detect the magnitude of the acceleration on the three coordinate axes of the coordinate system established by the terminal 1200.
  • the acceleration sensor 1211 can be used to detect components of gravity acceleration on three coordinate axes.
  • the processor 1201 can control the touch display 1205 to display the user interface in a landscape view or a portrait view according to the gravity acceleration signal collected by the acceleration sensor 1211.
  • the acceleration sensor 1211 can also be used for the acquisition of game or user motion data.
  • the gyro sensor 1212 can detect the body direction and the rotation angle of the terminal 1200, and the gyro sensor 1212 can cooperate with the acceleration sensor 1211 to collect the 3D motion of the user to the terminal 1200. Based on the data collected by the gyro sensor 1212, the processor 1201 can implement functions such as motion sensing (such as changing the UI according to the user's tilting operation), image stabilization at the time of shooting, game control, and inertial navigation.
  • the pressure sensor 1213 may be disposed at a side border of the terminal 1200 and/or a lower layer of the touch display screen 1205.
  • when the pressure sensor 1213 is disposed on the side frame of the terminal 1200, the user's holding signal on the terminal 1200 can be detected, and the processor 1201 performs left/right hand recognition or a shortcut operation according to the holding signal collected by the pressure sensor 1213.
  • when the pressure sensor 1213 is disposed at the lower layer of the touch display screen 1205, the processor 1201 controls the operability controls on the UI interface according to the user's pressure operation on the touch display screen 1205.
  • the operability control includes at least one of a button control, a scroll bar control, an icon control, and a menu control.
  • the fingerprint sensor 1214 is configured to collect the user's fingerprint, and the processor 1201 identifies the user's identity according to the fingerprint collected by the fingerprint sensor 1214, or the fingerprint sensor 1214 itself identifies the user's identity according to the collected fingerprint. Upon identifying that the user's identity is a trusted identity, the processor 1201 authorizes the user to perform related sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, paying, changing settings, and the like.
  • the fingerprint sensor 1214 can be disposed on the front, back, or side of the terminal 1200. When the physical button or vendor logo is set on the terminal 1200, the fingerprint sensor 1214 can be integrated with the physical button or the manufacturer logo.
  • Optical sensor 1215 is used to collect ambient light intensity.
  • the processor 1201 can control the display brightness of the touch display screen 1205 based on the ambient light intensity acquired by the optical sensor 1215. Specifically, when the ambient light intensity is high, the display brightness of the touch display screen 1205 is raised; when the ambient light intensity is low, the display brightness of the touch display screen 1205 is lowered.
  • the processor 1201 can also dynamically adjust the shooting parameters of the camera assembly 1206 based on the ambient light intensity acquired by the optical sensor 1215.
  • the proximity sensor 1216, also referred to as a distance sensor, is typically disposed on the front panel of the terminal 1200. The proximity sensor 1216 is used to capture the distance between the user and the front of the terminal 1200. In one embodiment, when the proximity sensor 1216 detects that the distance between the user and the front of the terminal 1200 is gradually decreasing, the processor 1201 controls the touch display screen 1205 to switch from the bright-screen state to the off-screen state; when the proximity sensor 1216 detects that the distance between the user and the front of the terminal 1200 is gradually increasing, the processor 1201 controls the touch display screen 1205 to switch from the off-screen state to the bright-screen state.
  • a person skilled in the art can understand that the structure shown in FIG. 12 does not constitute a limitation on the terminal 1200, which may include more or fewer components than illustrated, combine some components, or adopt a different component arrangement.
  • the application further provides a computer readable storage medium, where the storage medium stores at least one instruction, at least one program, a code set or a set of instructions, the at least one instruction, the at least one program, the code set or The instruction set is loaded by the processor and executed to implement the processing method of the audio signal provided by the above method embodiment.
  • the present application also provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the processing of the audio signals described in the various aspects above.
  • a plurality as referred to herein means two or more.
  • "and/or” describing the association relationship of the associated objects, indicating that there may be three relationships, for example, A and/or B, which may indicate that there are three cases where A exists separately, A and B exist at the same time, and B exists separately.
  • the character "/" generally indicates that the contextual object is an "or" relationship.
  • a person skilled in the art may understand that all or part of the steps of implementing the above embodiments may be completed by hardware, or by a program instructing the relevant hardware, and the program may be stored in a computer readable storage medium.
  • the storage medium mentioned may be a read only memory, a magnetic disk or an optical disk or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Stereophonic System (AREA)

Abstract

The present application discloses an audio signal processing method, apparatus, terminal and storage medium, belonging to the field of audio processing technology. The method includes: acquiring a first stereo audio signal; splitting the first stereo audio signal into a 5.1 channel audio signal; performing signal processing on the 5.1 channel audio signal according to the speaker parameters of a three-dimensional surround 5.1 virtual speaker set, to obtain a processed 5.1 channel audio signal; and synthesizing the processed 5.1 channel audio signal into a second stereo audio signal. By splitting the first stereo audio signal into a 5.1 channel audio signal, then processing and synthesizing the 5.1 channel audio signal into a second stereo audio signal, and playing the second stereo audio signal through a two-channel audio playback unit, the user obtains the stereo effect of 5.1 channel audio. This solves the problem in the related art of the poor stereo effect produced by playing only a two-channel audio signal, and improves the stereo effect of audio playback.

Description

Audio signal processing method, apparatus, terminal and storage medium
This application claims priority to Chinese Patent Application No. 201711436811.6, entitled "Audio Signal Processing Method, Apparatus and Terminal", filed on December 26, 2017, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of audio processing technology, and in particular to an audio signal processing method, apparatus, terminal and storage medium.
Background
The 5.1 channel configuration includes five channels, namely the front left channel, the front right channel, the front center channel, the rear left channel and the rear right channel, plus a 0.1 channel. The 0.1 channel is also called the low frequency channel or subwoofer channel.
Many movies use 5.1 channel audio signals for audio recording and playback. In the related art, the user needs to purchase speaker equipment supporting 5.1 channels, input the 5.1 channel audio signal to an audio playback device and a power amplifier device, and the power amplifier device then outputs the audio signal of each channel to the corresponding speaker of the 5.1 channel speaker equipment for playback.
However, when the user does not have speaker equipment supporting 5.1 channels, the 5.1 channel audio signal cannot be played.
Summary
Embodiments of the present application provide an audio signal processing method, apparatus, terminal and storage medium, which can solve the problem of a poor stereo effect when a left channel audio signal and a right channel audio signal are played through an audio playback unit. The technical solutions are as follows:
Embodiments of the present application provide an audio signal processing method, apparatus and terminal, which can solve the problem that a 5.1 channel audio signal cannot be played when the user does not have speaker equipment supporting 5.1 channels. The technical solutions are as follows:
In one aspect, an embodiment of the present application provides an audio signal processing method, the method being performed by a terminal and comprising:
acquiring a 5.1 channel audio signal;
acquiring, according to the coordinates of a 5.1 virtual speaker set in a virtual environment, head related transfer function (HRTF) data corresponding to each virtual speaker in the 5.1 virtual speaker set;
processing the corresponding channel audio signal in the 5.1 channel audio signal according to the HRTF data corresponding to each virtual speaker, to obtain a processed 5.1 channel audio signal;
synthesizing the processed 5.1 channel audio signal into a stereo audio signal.
In one aspect, an embodiment of the present application provides an audio signal processing apparatus, the apparatus being applied in a terminal and comprising:
a first acquisition module configured to acquire a 5.1 channel audio signal;
a second acquisition module configured to acquire, according to the coordinates of a 5.1 virtual speaker set in a virtual environment, HRTF data corresponding to each virtual speaker in the 5.1 virtual speaker set;
a processing module configured to process the corresponding channel audio signal in the 5.1 channel audio signal according to the HRTF data corresponding to each virtual speaker, to obtain a processed 5.1 channel audio signal;
a synthesis module configured to synthesize the processed 5.1 channel audio signal into a stereo audio signal.
In one aspect, an embodiment of the present application provides a computer readable storage medium storing at least one instruction, the instruction being loaded and executed by a processor to implement the above audio signal processing method.
In one aspect, an embodiment of the present application provides a terminal comprising a processor and a memory, the memory storing at least one instruction, the instruction being loaded and executed by the processor to implement the above audio signal processing method.
The beneficial effects of the technical solutions provided by the embodiments of the present application are:
By processing the 5.1 channel audio signal according to the HRTF data of each 5.1 virtual speaker and then synthesizing a stereo audio signal, the user can play the 5.1 channel audio signal with only ordinary stereo earphones or 2.0 speakers, and still obtain good playback quality.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present application more clearly, the drawings required in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application, and a person of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a flowchart of an audio signal processing method provided by an exemplary embodiment of the present application;
FIG. 2 is a flowchart of an audio signal processing method provided by an exemplary embodiment of the present application;
FIG. 3 is a flowchart of an audio signal processing method provided by an exemplary embodiment of the present application;
FIG. 4 is a flowchart of an audio signal processing method provided by an exemplary embodiment of the present application;
FIG. 5 is a flowchart of an audio signal processing method provided by an exemplary embodiment of the present application;
FIG. 6 is a flowchart of an audio signal processing method provided by an exemplary embodiment of the present application;
FIG. 7 is a schematic diagram of the placement of 5.1 channel virtual speakers provided by an exemplary embodiment of the present application;
FIG. 8 is a flowchart of an audio signal processing method provided by an exemplary embodiment of the present application;
FIG. 9 is a schematic diagram of the collection principle of HRTF data provided by an exemplary embodiment of the present application;
FIG. 10 is a block diagram of an audio signal processing apparatus provided by an exemplary embodiment of the present application;
FIG. 11 is a block diagram of an audio signal processing apparatus provided by an exemplary embodiment of the present application;
FIG. 12 is a structural block diagram of a terminal provided by an exemplary embodiment of the present application.
Detailed Description
To make the objectives, technical solutions and advantages of the present application clearer, the embodiments of the present application are further described in detail below with reference to the accompanying drawings.
FIG. 1 shows a flowchart of an audio signal processing method provided by an exemplary embodiment of the present application. The method may be performed by a terminal with an audio signal processing function, and includes:
Step 101: acquire a first stereo audio signal.
The terminal reads a locally stored first stereo audio signal, or obtains the first stereo audio signal from a server through a wired or wireless network.
The first stereo audio signal is obtained by recording sound with a stereo recording device. A stereo recording device typically includes a first microphone on the left and a second microphone on the right. The device records the sound on the left and the sound on the right through the first microphone and the second microphone respectively to obtain a left channel audio signal and a right channel audio signal, and superimposes the left channel audio signal and the right channel audio signal to obtain the first stereo signal.
Optionally, the terminal stores the received first stereo audio signal in its buffer; the first stereo audio signal is denoted X_PCM.
The terminal stores the received first stereo audio signal in a built-in buffer area as sample pairs of the left channel audio signal and the corresponding right channel audio signal, and retrieves the first stereo audio signal from this buffer area when needed.
Step 102: split the first stereo audio signal into a 5.1 channel audio signal.
The terminal splits the first stereo audio signal into a 5.1 channel audio signal through a preset algorithm. The 5.1 channel audio signal includes a front left channel signal, a front right channel signal, a front center channel signal, a low frequency channel signal, a rear left channel signal and a rear right channel signal.
Step 103: perform signal processing on the 5.1 channel audio signal according to the speaker parameters of the three-dimensional surround 5.1 virtual speaker set, to obtain a processed 5.1 channel audio signal.
The processed 5.1 channel audio signal includes a processed front left channel signal, a processed front right channel signal, a processed front center channel signal, a processed rear left channel signal and a processed rear right channel signal.
The three-dimensional surround 5.1 virtual speaker set is an audio model preset by the terminal, which simulates the playback effect of 5.1 channel speakers surrounding the user in a real scene.
In a real scene, with the user as the center and the direction the user's face points as straight ahead, the 5.1 channel speakers include a front left speaker at the user's front left, a front right speaker at the user's front right, a front center speaker directly in front of the user, a low frequency speaker (position not limited), a rear left speaker at the user's rear left, and a rear right speaker at the user's rear right.
Step 104: synthesize the processed 5.1 channel audio signal into a second stereo audio signal.
The second stereo audio signal can be played through ordinary stereo earphones, 2.0 speakers or the like; on hearing the second stereo audio signal through them, the user perceives a 5.1 channel stereo effect.
In summary, in the method provided by this embodiment, the first stereo audio signal is split into a 5.1 channel audio signal, and the 5.1 channel audio signal is processed and synthesized into a second stereo audio signal; playing the second stereo audio signal through a two-channel audio playback unit gives the user the stereo effect of 5.1 channel audio, solving the problem in the related art of the poor stereo effect produced by playing only a two-channel audio signal, and improving the stereo effect of audio playback.
In the embodiment of FIG. 1, splitting the first stereo audio signal into a 5.1 channel audio signal proceeds in stages. The first stage obtains the 5.0 channel audio signal of the 5.1 channel audio signal; the embodiments of FIG. 2, FIG. 3 and FIG. 4 below describe splitting the 5.0 channel audio signal out of the first stereo audio signal. The second stage obtains the 0.1 channel audio signal of the 5.1 channel audio signal; the embodiment of FIG. 5 below describes splitting the 0.1 channel audio signal out of the first stereo audio signal. The third stage synthesizes the 5.0 channel audio signal and the 0.1 channel audio signal into the second stereo audio signal; the embodiments of FIG. 6 and FIG. 8 below provide methods for processing and synthesizing the 5.1 channel audio signal to obtain the second stereo audio signal.
FIG. 2 shows a flowchart of an audio signal processing method provided by an exemplary embodiment of the present application. The method may be performed by a terminal with an audio signal processing function, and is an optional implementation of step 102 and step 103 of the embodiment of FIG. 1. The method includes:
Step 201: input the first stereo audio signal into a high pass filter for filtering, to obtain a first high frequency signal.
The terminal inputs the first stereo audio signal into a high pass filter for filtering to obtain the first high frequency signal, which is the superposition of a first left channel high frequency signal and a first right channel high frequency signal.
Optionally, the terminal filters the first stereo signal through a 4th-order IIR high pass filter to obtain the first high frequency signal.
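A minimal sketch of the optional high pass filtering step. The patent does not specify the 4th-order IIR filter design, so the cascade of four first-order sections below is only an illustrative approximation with an assumed coefficient:

```python
def one_pole_highpass(x, alpha):
    """First-order IIR high pass: y[n] = alpha * (y[n-1] + x[n] - x[n-1])."""
    y, prev_x, prev_y = [], 0.0, 0.0
    for sample in x:
        prev_y = alpha * (prev_y + sample - prev_x)
        prev_x = sample
        y.append(prev_y)
    return y

def fourth_order_highpass(x, alpha=0.95):
    """Cascade four first-order sections as a stand-in for the
    4th-order IIR high pass filter (exact design unspecified)."""
    for _ in range(4):
        x = one_pole_highpass(x, alpha)
    return x
```

Applied to a constant (DC) input, the output decays toward zero, while high frequency content passes largely unchanged.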
Step 202: calculate the left channel high frequency signal, the center channel high frequency signal and the right channel high frequency signal from the first high frequency signal.
The terminal splits the first high frequency signal into the left channel high frequency signal, the center channel high frequency signal and the right channel high frequency signal. The left channel high frequency signal contains the front left channel signal and the rear left channel signal, the center channel high frequency signal contains the front center channel signal, and the right channel high frequency signal contains the front right channel signal and the rear right channel signal.
Optionally, the terminal calculates the center channel high frequency signal from the first high frequency signal, subtracts the center channel high frequency signal from the first left channel high frequency signal to obtain the left channel high frequency signal, and subtracts the center channel high frequency signal from the first right channel high frequency signal to obtain the right channel high frequency signal.
Step 203: calculate the front left channel signal, the front right channel signal, the front center channel signal, the rear left channel signal and the rear right channel signal of the 5.1 channel audio signal from the left channel high frequency signal, the center channel high frequency signal and the right channel high frequency signal.
The terminal calculates the front left channel signal and the rear left channel signal from the left channel high frequency signal, the front right channel signal and the rear right channel signal from the right channel high frequency signal, and the front center channel signal from the center channel high frequency signal.
Optionally, the terminal extracts the first rear/reverberation signal data in the left channel high frequency signal, the second rear/reverberation signal data in the center channel high frequency signal and the third rear/reverberation signal data in the right channel high frequency signal, and calculates the front left channel signal, the rear left channel signal, the front right channel signal, the rear right channel signal and the front center channel signal from the first, second and third rear/reverberation signal data.
Step 204: scalar-multiply the front left channel signal, the front right channel signal, the front center channel signal, the rear left channel signal and the rear right channel signal by the corresponding speaker parameters, to obtain the processed front left channel signal, the processed front right channel signal, the processed front center channel signal, the processed rear left channel signal and the processed rear right channel signal.
Optionally, the terminal scalar-multiplies the front left channel signal by the volume V1 of the virtual front left channel speaker to obtain the processed front left channel signal X_FL; scalar-multiplies the front right channel signal by the volume V2 of the virtual front right channel speaker to obtain the processed front right channel signal X_FR; scalar-multiplies the front center channel signal by the volume V3 of the virtual front center channel speaker to obtain the processed front center channel signal X_FC; scalar-multiplies the rear left channel signal by the volume V4 of the virtual rear left channel speaker to obtain the processed rear left channel signal X_RL; and scalar-multiplies the rear right channel signal by the volume V5 of the virtual rear right channel speaker to obtain the processed rear right channel signal X_RR.
In summary, in the method provided by this embodiment, the first stereo audio signal is filtered to obtain the first high frequency signal; the left channel high frequency signal, the center channel high frequency signal and the right channel high frequency signal are calculated from the first high frequency signal; and the 5.0 channel audio signal is calculated from these three signals and then processed. The first high frequency signal is thereby extracted from the first stereo audio signal and split into the 5.0 channel audio signal of the 5.1 channel audio signal, and the processed 5.0 channel audio signal is further obtained.
FIG. 3 shows a flowchart of an audio signal processing method provided by an exemplary embodiment of the present application. The method is applied in a terminal with an audio signal processing function, and is an optional implementation of step 202 of the embodiment of FIG. 2. The method includes:
Step 301: perform a fast Fourier transform (FFT) on the first high frequency signal to obtain a high frequency real signal and a high frequency imaginary signal.
The terminal performs a fast Fourier transform on the first high frequency signal to obtain the high frequency real signal and the high frequency imaginary signal.
The fast Fourier transform is an algorithm that converts a time domain signal into a frequency domain signal. In this embodiment, the first high frequency signal is fast Fourier transformed to obtain the high frequency real signal and the high frequency imaginary signal, where the high frequency real signal includes a left channel high frequency real signal and a right channel high frequency real signal, and the high frequency imaginary signal includes a left channel high frequency imaginary signal and a right channel high frequency imaginary signal.
步骤302,根据高频实数信号和高频虚数信号计算向量投影。
终端将高频实数信号中的左声道高频实数信号和右声道高频实数信号相加,得到高频实数和信号。
示例性的,高频实数和信号通过以下公式计算:
sumRE=X_HIPASS_RE_L+X_HIPASS_RE_R
其中,X_HIPASS_RE_L为左声道高频实数信号,X_HIPASS_RE_R为右声道高频实数信号,sumRE为高频实数和信号。
终端将高频虚数信号中的左声道高频虚数信号和右声道高频虚数信号相加,得到高频虚数和信号。
示例性的,高频虚数和信号通过以下公式计算:
sumIM=X_HIPASS_IM_L+X_HIPASS_IM_R
其中,X_HIPASS_IM_L为左声道高频虚数信号,X_HIPASS_IM_R为右声道高频虚数信号,sumIM为高频虚数和信号。
终端将高频实数信号中的左声道高频实数信号和右声道高频实数信号相减,得到高频实数差信号。
示例性的,高频实数差信号通过以下公式计算:
diffRE=X_HIPASS_RE_L-X_HIPASS_RE_R
其中,diffRE为高频实数差信号。
终端将高频虚数信号中的左声道高频虚数信号和右声道高频虚数信号相减,得到高频虚数差信号。
示例性的,高频虚数差信号通过以下公式计算:
diffIM=X_HIPASS_IM_L-X_HIPASS_IM_R
其中,diffIM为高频虚数差信号。
终端根据高频实数和信号和所述高频虚数和信号,计算得到实数和信号。
示例性的,实数和信号通过以下公式计算:
sumSq=sumRE*sumRE+sumIM*sumIM
其中,sumSq为实数和信号。
终端根据高频实数差信号和所述高频虚数差信号,计算得到实数差信号。
示例性的,实数差信号通过以下公式计算:
diffSq=diffRE*diffRE+diffIM*diffIM
其中,diffSq为实数差信号。
终端根据实数和信号和实数差信号,进行向量投影计算,得到向量投影,向量投影代表了三维环绕的5.1虚拟音箱中每个虚拟音箱到用户的距离。
可选的,当实数和信号为有效数字时,即当实数和信号不是无穷小或0时,向量投影通过以下公式计算:
alpha=0.5-SQRT(diffSq/sumSq)*0.5
其中,alpha为向量投影,SQRT代表开平方,*代表标量乘积。
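上述从sumRE、sumIM、diffRE、diffIM到sumSq、diffSq再到alpha的计算过程，可用如下Python草图逐频点实现（函数名为示例假设；"实数和信号为无穷小或0"的判断阈值1e-12也是假设值）：

```python
import math

def vector_projection(re_l, im_l, re_r, im_r):
    # 输入为某一频点上左/右声道FFT的实部与虚部
    sumRE = re_l + re_r          # 高频实数和信号
    sumIM = im_l + im_r          # 高频虚数和信号
    diffRE = re_l - re_r         # 高频实数差信号
    diffIM = im_l - im_r         # 高频虚数差信号
    sumSq = sumRE * sumRE + sumIM * sumIM    # 实数和信号
    diffSq = diffRE * diffRE + diffIM * diffIM  # 实数差信号
    if sumSq < 1e-12:            # 实数和信号为无穷小或0时不计算
        return None
    return 0.5 - math.sqrt(diffSq / sumSq) * 0.5  # 向量投影alpha
```

左右声道完全相同时diffSq为0，alpha取最大值0.5；左右声道反相时sumSq为0，不产生有效投影。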
步骤303,对高频实数信号中的左声道高频实数信号和向量投影的乘积进行快速傅里叶逆变换(Inverse fast Fourier transform,IFFT)和交迭相加(Overlap-Add)后,得到中央声道高频信号。
快速傅里叶逆变换是将频域信号转换为时域信号的算法,本申请中,终端对高频实数信号中的左声道高频实数信号和向量投影的乘积进行快速傅里叶逆变换和交迭相加后,得到中央声道高频信号,其中,交迭相加是一种数学算法,具体可参考https://en.wikipedia.org/wiki/Overlap–add_method。中央声道高频信号可通过左声道高频实数信号或右声道高频实数信号计算,但是由于第一立体声信号中若只包含一个声道的音频信号,则音频信号大部分集中在左声道,因此中央高频信号通过左声道高频实数计算会更加准确。
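交迭相加本身可以用几行Python示意：把逐帧（帧长n、帧间步进hop）的时域帧按各自的起始位置累加回一条信号（纯示意实现，帧长与步进取值为示例假设）。

```python
def overlap_add(frames, hop):
    # frames: 等长时域帧的列表；hop: 相邻帧起点之间的步进（采样点数）
    n = len(frames[0])
    out = [0.0] * (hop * (len(frames) - 1) + n)
    for k, frame in enumerate(frames):
        base = k * hop           # 第k帧的起始位置
        for i, v in enumerate(frame):
            out[base + i] += v   # 重叠区域逐点相加
    return out
```

两个全1的4点帧以步进2交迭相加时，中间重叠的两个采样点被加为2。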
步骤304,将第一高频信号中的左声道高频信号和中央声道信号的差,作为左声道高频信号。
终端将第一高频信号中的左声道高频信号和中央声道信号的差,作为左声道高频信号。
示例性的,左声道高频信号通过以下公式计算:
X_PRE_L=X_HIPASS_L-X_PRE_C
其中,X_HIPASS_L为第一高频信号中的左声道高频信号,X_PRE_C为中央声道信号,X_PRE_L为左声道高频信号。
步骤305,将第一高频信号中的右声道高频信号和中央声道信号的差,作为右声道高频信号。
终端将第一高频信号中的右声道高频信号和中央声道信号的差,作为右声道高频信号。
示例性的,右声道高频信号通过以下公式计算:
X_PRE_R=X_HIPASS_R-X_PRE_C
其中,X_HIPASS_R为第一高频信号中的右声道高频信号,X_PRE_C为中央声道信号,X_PRE_R为右声道高频信号。
步骤304和步骤305的执行顺序不加限定,终端可先执行步骤304再执行步骤305,或先执行步骤305再执行步骤304。
综上所述，本实施例提供的方法，通过将第一高频信号进行快速傅里叶变换得到高频实数信号和高频虚数信号，根据高频实数信号和高频虚数信号通过一系列计算得到中央声道高频信号，进而根据中央声道高频信号计算得到左声道高频信号和右声道高频信号，从而实现了根据第一高频信号计算得到左声道高频信号、中央声道高频信号和右声道高频信号。
图4,示出了本申请一个示例性的实施例提供的音频信号的处理方法的方法流程图,该方法可由具有音频信号处理功能的终端执行,该方法可以是图2实施例中的步骤203的一种可选的实施方式,该方法包括:
在步骤401中,对于左声道高频信号、中央声道高频信号和右声道高频信号中的任意一个声道高频信号,根据声道高频信号中的采样点得到至少一个移动窗,每个移动窗包括n个采样点,相邻的两个移动窗存在n/2个采样点是重叠的。
终端通过移动窗(Moving window)算法对左声道高频信号、中央声道高频信号和右声道高频信号中的任意一个声道高频信号,根据声道高频信号中的采样点得到至少一个移动窗。其中,若每个移动窗的采样点为n个,则相邻的两个移动窗之间n/2个采样点为重叠的,n≥1。
移动窗是一种类似交迭相加的算法，但只做交迭，不做相加。例如，数据A包含1024个采样点，若窗口长度为128、重叠长度为64（即移动步长为64），那么移动窗每次输出的信号为：第一次输出A[0-127]，第二次输出A[64-191]，第三次输出A[128-255]，……，其中，方括号内为A中采样点的编号。
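按照步骤401的定义（窗长n、相邻窗口重叠n/2个采样点），移动窗可以示意如下，只交迭、不相加（函数名为示例假设）：

```python
def moving_windows(samples, n):
    # 窗长n，步进n//2，相邻两个窗口重叠n//2个采样点
    hop = n // 2
    return [samples[i:i + n] for i in range(0, len(samples) - n + 1, hop)]
```

对1024个采样点、窗长128的数据，依次输出起点为0、64、128……的窗口，与正文示例一致。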
步骤402,计算移动窗中的低相关信号以及低相关信号的起始时间点,低相关信号包括幅度谱的第一衰减包络序列和相位谱的第二衰减包络序列不相等的信号。
终端对第i个移动窗中的采样点信号进行快速傅里叶变换,得到快速傅里叶变换后的采样点信号,i≥1。
终端根据预设的移动步长和重叠长度,对左声道高频信号、右声道高频信号和中央声道信号分别进行移动窗和快速傅里叶变换,依次得到左声道高频实数信号和左声道高频虚数信号(记为FFT_L)、右声道高频实数信号以及右声道高频虚数信号(记为FFT_R)、中央声道实数信号和中央声道虚数信号(记为FFT_C)。
终端计算快速傅里叶变换后的采样点信号的幅度谱和相位谱。
终端根据FFT_L计算左声道高频信号的幅度谱AMP_L以及左声道高频信号的相位谱PH_L；根据FFT_R计算右声道高频信号的幅度谱AMP_R以及右声道高频信号的相位谱PH_R；根据FFT_C计算中央声道信号的幅度谱AMP_C以及中央声道信号的相位谱PH_C。
以下将AMP_L、AMP_R以及AMP_C统一记为AMP_L/R/C;将PH_L、PH_R、PH_C统一记为PH_L/R/C。
终端根据快速傅里叶变换后的采样点信号的幅度谱，计算第i个移动窗中的m条频率线的第一衰减包络序列；根据快速傅里叶变换后的采样点信号的相位谱，计算第i个移动窗中的m条频率线的第二衰减包络序列；当m条频率线中的第j条频率线的第一衰减包络序列和第二衰减包络序列不同时，确定第j条频率线为低相关信号；根据第i个移动窗的窗口号和第j条频率线的频率线号，确定低相关信号的起始时间点，其中，m≥1，1≤j≤m。
终端对所有移动窗的AMP_L/R/C和PH_L/R/C分别计算其所有频率线的衰减包络序列和相关度。其中，衰减包络序列在移动窗之间计算，且只有对应同一个移动窗的幅度谱和相位谱才进行比较。
例如，移动窗1、移动窗2、移动窗3对应的0号频率线的幅度谱的衰减包络序列分别为1.0、0.8、0.6，移动窗1、移动窗2、移动窗3对应的0号频率线的相位谱的衰减包络序列分别为1.0、0.8、1.0，则认为移动窗1的0号频率线和移动窗2的0号频率线具有高度相关性，移动窗2的0号频率线和移动窗3的0号频率线具有低度相关性。
n个采样点经过快速傅里叶变换后会得到n/2+1条频率线,取出低相关度的信号对应的移动窗的窗口号及频率线,通过窗口号可计算出该信号在X_PRE_L、X_PRE_R和X_PRE_C中的起始时间点。
步骤403,确定符合后方/混响特征的目标低相关信号。
可选的,终端通过以下方式确定符合后方/混响特征的目标低相关信号:
当低相关信号的甚高频率线的幅度谱能量小于第一阈值且甚高频率线所在窗口的相邻窗口的衰减包络斜率大于第二阈值时，终端确定低相关信号是符合后方/混响特征的目标低相关信号，其中，甚高频(Very high frequency,VHF)频率线是指频带由30MHz到300MHz的频率线。
可选的,终端确定符合后方/混响特征的目标低相关信号的方法包括但不限于:
当低相关信号的甚高频率线的幅度谱能量小于第一阈值且甚高频率线所在窗口的相邻窗口的衰减速度大于第三阈值时,终端确定低相关信号是符合后方/混响特征的目标低相关信号。
步骤404,计算目标低相关信号的结束时间点。
可选的,终端通过以下方式计算低相关信号的结束时间点:
终端获取目标低相关信号的幅度谱对应的频率线的能量小于第四阈值的时间点,作为结束时间点。
可选的,终端通过以下方式计算低相关信号的结束时间点:
当目标低相关信号的能量小于下一个低相关信号的能量的1/n时,终端确定下一个低相关信号的起始时间点作为目标低相关信号的结束时间点。
步骤405,根据起始时间点和结束时间点提取目标低相关信号,作为声道高频信号中的后方/混响信号数据。
可选的，终端提取位于起始时间点和结束时间点中的声道信号片段；对声道信号片段进行快速傅里叶变换，得到快速傅里叶变换后的信号片段；从快速傅里叶变换后的信号片段中提取目标低相关信号对应的频率线，得到第一部分信号；对第一部分信号进行快速傅里叶逆变换和交迭相加后，得到声道高频信号中的后方/混响信号数据。
通过上述步骤,终端得到左声道高频信号中的第一后方/混响信号数据、中央声道高频信号中的第二后方/混响信号数据、右声道高频信号中的第三后方/混响信号数据。
步骤406,根据第一后方/混响信号数据、第二后方/混响信号数据、第三后方/混响信号数据计算前置左声道信号、后置左声道信号、前置右声道信号、后置右声道信号和前置中央声道信号。
终端将左声道高频信号和上述步骤中获得的第一后方/混响信号数据的差,确定为前置左声道信号。
第一后方/混响信号数据是左声道高频信号中包含的音频数据，也是三维环绕的5.1虚拟音箱的后置左声道信号包含的音频数据。左声道高频信号包含前置左声道信号和部分后置左声道信号，因此将左声道高频信号减去其中的后置左声道信号部分（即第一后方/混响信号数据），即可得到前置左声道信号。
终端将上述步骤中获得的第一后方/混响信号数据和第二后方/混响信号数据的和,确定为后置左声道信号。
终端将右声道高频信号和上述步骤中获得第三后方/混响信号数据的差,确定为前置右声道信号。
第三后方/混响信号数据是右声道高频信号中包含的音频数据，也是三维环绕的5.1虚拟音箱的后置右声道信号包含的音频数据。右声道高频信号包含前置右声道信号和部分后置右声道信号，因此将右声道高频信号减去其中的后置右声道信号部分（即第三后方/混响信号数据），即可得到前置右声道信号。
终端将上述步骤中获得的第三后方/混响信号数据和第二后方/混响信号数据的和,确定为后置右声道信号。
终端将中央声道高频信号和上述步骤中获得的第二后方/混响信号数据的差,确定为前置中央声道信号。
第二后方/混响信号数据为三维环绕的5.1虚拟音箱的后置左声道信号包含的音频数据和后置右声道信号包含的音频数据，中央声道高频信号包括前置中央声道信号和第二后方/混响信号数据，因此将中央声道高频信号减去第二后方/混响信号数据，即可得到前置中央声道信号。
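步骤406中由三路后方/混响信号数据合成5.0声道的加减关系，可用如下Python草图表示（逐采样点运算；函数名与变量名均为示例假设）：

```python
def split_5_0(x_l, x_c, x_r, rear1, rear2, rear3):
    # x_l/x_c/x_r: 左/中央/右声道高频信号
    # rear1/rear2/rear3: 第一/第二/第三后方/混响信号数据
    fl = [a - b for a, b in zip(x_l, rear1)]    # 前置左 = 左 - 第一
    rl = [a + b for a, b in zip(rear1, rear2)]  # 后置左 = 第一 + 第二
    fr = [a - b for a, b in zip(x_r, rear3)]    # 前置右 = 右 - 第三
    rr = [a + b for a, b in zip(rear3, rear2)]  # 后置右 = 第三 + 第二
    fc = [a - b for a, b in zip(x_c, rear2)]    # 前置中央 = 中央 - 第二
    return fl, fc, fr, rl, rr
```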
综上所述，本实施例提供的方法，通过计算每个声道高频信号中的后方/混响信号数据的起始时间和结束时间提取每个声道高频信号中的后方/混响信号数据，根据每个声道高频信号中的后方/混响信号数据计算得到前置左声道信号、后置左声道信号、前置右声道信号、后置右声道信号以及前置中央声道信号，提高了根据左声道高频信号、中央声道高频信号和右声道高频信号计算得到5.1声道音频信号的准确度。
图5,示出了本申请一个示例性的实施例提供的音频信号的处理方法的方法流程图,该方法可由具有音频信号处理功能的终端执行,该方法可以是图1实施例中步骤102的一个可选的实施例,该方法包括:
步骤501，将第一立体声音频信号输入低通滤波器进行滤波，得到第一低频信号。
终端将第一立体声音频信号输入低通滤波器进行滤波，得到第一低频信号。其中，第一低频信号为第一左声道低频信号和第一右声道低频信号的叠加信号。
可选的，终端通过4阶的IIR低通滤波器对第一立体声音频信号滤波，得到第一低频信号。
步骤502，将第一低频信号和5.1虚拟音箱中的低频声道音箱的音量参数进行标量相乘，得到第二低频信号。
终端将第一低频信号和5.1虚拟音箱中的低频声道音箱的音量参数进行标量相乘,得到第二低频信号。
示例性的,终端通过以下公式计算第二低频信号:
X_LFE_S=X_LFE*V6
其中，X_LFE为第一低频信号，V6为5.1虚拟音箱中的低频声道音箱的音量参数，X_LFE_S为第二低频信号，是左声道低频信号X_LFE_S_L和右声道低频信号X_LFE_S_R的叠加信号，*代表标量乘法。
步骤503,对第二低频信号进行单声道转换,得到处理后的低频声道信号。
终端对第二低频信号进行单声道转换,得到处理后的低频声道信号。
示例性的,终端通过以下公式计算处理后的低频声道信号:
X_LFE_M=(X_LFE_S_L+X_LFE_S_R)/2
其中,X_LFE_M为处理后的低频声道信号。
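0.1声道（低频声道）的"标量相乘+单声道转换"两步，可用如下Python草图表示（函数名为示例假设）：

```python
def lfe_channel(left_low, right_low, v6):
    # 先与低频声道音箱音量V6标量相乘，得到第二低频信号的左右两路
    sl = [x * v6 for x in left_low]
    sr = [x * v6 for x in right_low]
    # 再做单声道转换：X_LFE_M = (X_LFE_S_L + X_LFE_S_R) / 2
    return [(a + b) / 2 for a, b in zip(sl, sr)]
```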
综上所述，本实施例提供的方法，通过将第一立体声音频信号滤波得到第一低频信号，将第一低频信号进行单声道转换，得到5.1声道音频信号中的低频声道信号，从而实现了将第一低频信号从第一立体声音频信号中提取并拆分为5.1声道音频信号中的0.1声道音频信号。
上述方法实施例将第一立体声音频信号拆分并处理后,得到了5.1声道音频信号,分别为前置左声道信号、前置右声道信号、前置中央声道信号、低频声道信号、后置左声道信号和后置右声道信号,下述图6和图8的实施例提供了对该5.1声道音频信号进行处理和合成,得到第二立体声音频信号的方法,该方法可以是图1实施例中步骤104的可选的实施例,也可作为单独的实施例,图6和图8实施例中得到的立体声信号可以是上述方法实施例中的第二立体声信号。
HRTF(Head Related Transfer Function,头相关变换函数)处理技术是一种产生立体声环绕音效的处理技术。技术人员可以预先构建HRTF数据库,该HRTF数据库中记录有HRTF数据、HRTF数据采集点、HRTF数据采集点相对于参考人头的位置坐标之间的对应关系。HRTF数据是用于对左声道音频信号和右声道音频信号进行处理的一组参数。
图6,示出了本申请一个示例性实施例提供的音频信号的处理方法的流程图,该方法可由具有音频信号处理功能的终端执行,该方法可以是图1实施例中的步骤104的一种可选的实施方式,该方法包括:
步骤601,获取得到5.1声道音频信号;
可选地,该5.1声道音频信号是上述图1至图5实施例中从第一立体声音频信号中拆分出并处理后得到的处理后的5.1声道音频信号。或者,该5.1声道音频信号是下载或从存储介质上读取到的5.1声道音频信号。
该5.1声道音频信号包括:前置左声道信号、前置右声道信号、前置中央声道信号、低频声道信号、后置左声道信号和后置右声道信号。
步骤602,根据5.1虚拟音箱在虚拟环境中的坐标,获取5.1虚拟音箱中每个虚拟音箱对应的HRTF数据;
可选地,5.1虚拟音箱包括:前置左声道虚拟音箱FL、前置右声道虚拟音箱FR、前置中央声道虚拟音箱FC、重低音虚拟音箱LFE、后置左声道虚拟音箱RL和后置右声道虚拟音箱RR。
可选地，该5.1虚拟音箱在虚拟环境中具有各自的坐标。该虚拟环境可以是二维平面虚拟环境，也可以是三维虚拟环境。
示意性的参考图7,其示出了一种5.1声道虚拟音箱在二维平面虚拟环境中的示意图,假设参考人头处于图7中的中心点70并朝向中央声道虚拟音箱FC所在位置,每个声道与参考人头所在的中心点70的距离相等且处于同一平面。
前置中央声道虚拟音箱FC处于参考人头的面对方向的正前方。
前置左声道虚拟音箱FL和前置右声道虚拟音箱FR分别处于前置中央声道FC的两侧,分别与参考人头的面对方向呈30度夹角,呈对称设置。
后置左声道虚拟音箱RL和后置右声道虚拟音箱RR分别处于参考人头的面对方向的两侧靠后,分别与参考人头的面对方向呈100-120度夹角,呈对称设置。
由于重低音虚拟音箱LFE的方向感较弱,重低音虚拟音箱LFE的摆放位置没有严格要求,本文中以参考人头的背对方向来举例说明,但本申请不对重低音虚拟音箱LFE与参考人头的面对方向的夹角作出限定。
需要说明的一点是，上述5.1声道虚拟音箱中的每个虚拟音箱与参考人头的面对方向的夹角仅是示例性的，另外，每个虚拟音箱与参考人头之间的距离可以不同。当虚拟环境为三维虚拟环境时，每个虚拟音箱所在的高度也可以不同，每个虚拟音箱的摆放位置的不同都会引起声音信号的不同，本公开对此不作限定。
可选地,以参考人头为原点为二维虚拟环境或三维虚拟环境建立坐标系后,能够得到每个虚拟音箱在虚拟环境中的坐标。
终端内存储有HRTF数据库,该HRTF数据库包括:至少一个HRTF数据采集点和HRTF数据之间的对应关系,每个HRTF数据采集点具有各自的坐标。
终端根据5.1虚拟音箱中的第i个虚拟音箱的第i坐标,在HRTF数据库中查询与第i坐标最接近的HRTF数据采集点,将与第i坐标最接近的HRTF数据采集点的HRTF数据确定为第i个虚拟音箱的HRTF数据,i≥1。
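"按坐标在HRTF数据库中查询最接近的采集点"实质上是一次最近邻检索，可以用如下Python草图示意（以欧氏距离为例；数据库用"坐标→HRTF数据"的字典表示，均为示例假设）：

```python
def nearest_hrtf(speaker_xyz, hrtf_db):
    # hrtf_db: {采集点坐标: HRTF数据}；返回与虚拟音箱坐标最接近的采集点的HRTF数据
    def dist2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    best = min(hrtf_db, key=lambda p: dist2(p, speaker_xyz))
    return hrtf_db[best]
```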
步骤603,根据每个虚拟音箱对应的HRTF数据,对5.1声道音频信号中的相应声道音频信号进行处理,得到处理后的5.1声道音频信号;
可选地,每个HRTF数据包括左声道HRTF系数和右声道HRTF系数。
终端根据第i个虚拟音箱对应的HRTF数据中的左声道HRTF系数，对5.1声道音频信号中的第i个声道音频信号进行处理，得到处理后的第i个声道音频信号对应的左声道分量；
终端根据第i个虚拟音箱对应的HRTF数据中的右声道HRTF系数,对5.1声道音频信号中的第i个声道音频信号进行处理,得到处理后的第i个声道音频信号对应的右声道分量。
步骤604,将处理后的5.1声道音频信号,合成为立体声音频信号。
需要说明的是，当本申请实施例中的5.1声道音频信号是上述图1至图5实施例中从第一立体声音频信号中拆分并处理后得到的处理后的5.1声道音频信号时，该步骤中的立体声音频信号是图1实施例中的第二立体声音频信号。
综上所述,本实施例提供的方法,通过将5.1声道音频信号按照各个5.1虚拟音箱的HRTF数据进行处理后,合成得到立体声音频信号,使得用户只需要普通的立体声耳机或2.0音箱也能够播放5.1声道音频信号,且获得较好的播放音质。
图8,示出了本申请一个示例性实施例提供的音频信号的处理方法的流程图,该方法可由具有音频信号处理功能的终端执行,该方法可以是图1实施例中的步骤104的一种可选的实施方式,该方法包括:
步骤801，在声学房间中以参考人头为球心采集至少一条HRTF数据，并记录各条HRTF数据对应的HRTF数据采集点相对于参考人头的位置坐标；
参考图9,开发人员预先在声学房间91(房间四周设置有吸音海绵以减小回声干扰)中央放置参考人头92(模仿真人头部制成),并将微型全指向性麦克风分别设置在参考人头92的左右耳道内。
完成参考人头92设置后,开发人员在以参考人头92为球心的球体表面上,每隔预定距离设置HRTF数据采集点,并在HRTF数据采集点处使用扬声器93播放预定音频。
由于左右耳道到扬声器93的距离不同，且声波在传输过程中受到折射、绕射和衍射等因素影响，同一音频到达左右耳道时音频特征不同。因此，通过分析麦克风采集到的音频与原始音频的差异，即可得到HRTF数据采集点处的HRTF数据。其中，同一HRTF数据采集点对应的HRTF数据中，包含左声道对应的左声道HRTF系数以及右声道对应的右声道HRTF系数。
步骤802,根据HRTF数据、HRTF数据采集点的标识和HRTF数据采集点的位置坐标,生成HRTF数据库;
可选地,以参考人头92为中心点建立坐标系。该坐标系的建立方式与5.1声道虚拟音箱的坐标系建立方式是相同的。
当5.1声道虚拟音箱对应的虚拟环境是二维虚拟环境时,在采集HRTF数据时也可以仅对参考人头92所在的水平面建立坐标系,仅采集属于该水平面的HRTF数据。比如,在以参考人头92为圆心的圆环上,每隔5°取一个点作为HRTF数据采样点。此时,可以减少终端所需要存储的HRTF数据量。
当5.1声道虚拟音箱对应的虚拟环境是三维虚拟环境时，在采集HRTF数据时可以对参考人头92所在的三维环境建立坐标系，采集以该参考人头92为球心的球体表面上的HRTF数据。比如，在以参考人头92为球心的球体表面上，按照经度方向和纬度方向每隔5°取一个点作为HRTF数据采样点。
然后,终端根据每个HRTF数据采样点的标识、每个HRTF数据采样点的HRTF数据和每个HRTF数据采集点的位置坐标,生成HRTF数据库。
需要说明的是,步骤801和步骤802也可以由其它设备执行和实现。在生成HRTF数据库后,再通过网络或存储介质传输到当前终端上。
步骤803,获取5.1声道音频信号;
可选地,终端获取5.1声道音频信号。
该5.1声道音频信号是上述图1至图5实施例中从第一立体声音频信号中拆分出并处理后得到的处理后的5.1声道音频信号。或者,该5.1声道音频信号是下载或从存储介质上读取到的5.1声道音频信号。
该5.1声道音频信号包括：前置左声道信号X_FL、前置右声道信号X_FR、前置中央声道信号X_FC、低频声道信号X_LFE_M、后置左声道信号X_RL和后置右声道信号X_RR。
步骤804,获取HRTF数据库,HRTF数据库包括:至少一个HRTF数据采集点和HRTF数据之间的对应关系,每个HRTF数据采集点具有各自的坐标;
终端可以读取存储在本地的HRTF数据库，或者，访问存储在网络上的HRTF数据库。
步骤805，根据5.1虚拟音箱中的第i个虚拟音箱的第i坐标，在HRTF数据库中查询与第i坐标最接近的HRTF数据采集点，将与第i坐标最接近的HRTF数据采集点的HRTF数据确定为第i个虚拟音箱的HRTF数据；
可选地,终端预先存储有5.1虚拟音箱中的各个虚拟音箱的坐标。其中,i≥1。
终端根据前置左声道虚拟音箱的第一坐标,在HRTF数据库中查询与第一坐标最接近的HRTF数据采集点,将与第一坐标最接近的HRTF数据采集点的HRTF数据确定为前置左声道虚拟音箱的HRTF数据。
终端根据前置右声道虚拟音箱的第二坐标,在HRTF数据库中查询与第二坐标最接近的HRTF数据采集点,将与第二坐标最接近的HRTF数据采集点的HRTF数据确定为前置右声道虚拟音箱的HRTF数据。
终端根据前置中央声道虚拟音箱的第三坐标,在HRTF数据库中查询与第三坐标最接近的HRTF数据采集点,将与第三坐标最接近的HRTF数据采集点的HRTF数据确定为前置中央声道虚拟音箱的HRTF数据。
终端根据后置左声道虚拟音箱的第四坐标,在HRTF数据库中查询与第四坐标最接近的HRTF数据采集点,将与第四坐标最接近的HRTF数据采集点的HRTF数据确定为后置左声道虚拟音箱的HRTF数据。
终端根据后置右声道虚拟音箱的第五坐标,在HRTF数据库中查询与第五坐标最接近的HRTF数据采集点,将与第五坐标最接近的HRTF数据采集点的HRTF数据确定为后置右声道虚拟音箱的HRTF数据。
终端根据低频虚拟音箱的第六坐标,在HRTF数据库中查询与第六坐标最接近的HRTF数据采集点,将与第六坐标最接近的HRTF数据采集点的HRTF数据确定为低频虚拟音箱的HRTF数据。
其中,“最接近”是指虚拟音箱的坐标和HRTF数据采样点的坐标相同或坐标间的距离最短。
步骤806,对于5.1声道音频信号中的第i个声道的音频信号,采用第i个虚拟音箱对应的HRTF数据中的左声道HRTF系数进行第一卷积,得到第一卷积后的第i个声道的音频信号;
设5.1声道音频信号中的第i个声道的音频信号为X_i,计算Li=X_i*H_L_i。其中,*表示卷积,H_L_i表示第i个虚拟音箱对应的HRTF数据中的左声道HRTF系数。
步骤807,将第一卷积后的各个声道的音频信号进行叠加,得到立体声音频信号中的左声道信号;
终端将第一卷积后的6个声道的音频信号Li进行叠加,得到立体声音频信号中的左声道信号L=L1+L2+L3+L4+L5+L6。
步骤808,对于5.1声道音频信号中的第i个声道的音频信号,采用第i个虚拟音箱对应的HRTF数据中的右声道HRTF系数进行第二卷积,得到第二卷积后的第i个声道的音频信号;
设5.1声道音频信号中的第i个声道的音频信号为X_i,计算Ri=X_i*H_R_i。其中,*表示卷积,H_R_i表示第i个虚拟音箱对应的HRTF数据中的右声道HRTF系数。
步骤809,将第二卷积后的各个声道的音频信号进行叠加,得到立体声音频信号中的右声道信号;
终端将第二卷积后的6个声道的音频信号Ri进行叠加,得到立体声音频信号中的右声道信号R=R1+R2+R3+R4+R5+R6。
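步骤806至步骤809的"逐声道卷积后叠加"，可用如下Python草图表示（直接按定义实现离散卷积，未做FFT加速；函数名为示例假设）：

```python
def convolve(x, h):
    # 离散卷积：y[n] = sum_k x[k] * h[n-k]
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def downmix(channels, hrtf_coeffs):
    # 每个声道与其虚拟音箱的（左或右声道）HRTF系数卷积，再逐采样点叠加
    outs = [convolve(x, h) for x, h in zip(channels, hrtf_coeffs)]
    n = max(len(o) for o in outs)
    return [sum(o[i] for o in outs if i < len(o)) for i in range(n)]
```

对左声道用各音箱的左声道HRTF系数调用一次downmix、对右声道用右声道HRTF系数再调用一次，即得到立体声的左右两路。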
步骤810,将左声道信号和右声道信号,合成为立体声音频信号。
该合成的立体声音频信号可以存储为音频文件,或者输入播放设备中进行播放。
需要说明的是，当本申请实施例中的5.1声道音频信号是上述图1至图5实施例中从第一立体声音频信号中拆分并处理后得到的处理后的5.1声道音频信号时，该步骤中的立体声音频信号是图1实施例中的第二立体声音频信号。
综上所述,本实施例提供的方法,通过将5.1声道音频信号按照各个5.1虚拟音箱的HRTF数据进行处理后,合成得到立体声音频信号,使得用户只需要普通的立体声耳机或2.0音箱也能够播放5.1声道音频信号,且获得较好的播放音质。
本实施例提供的方法,通过将5.1声道音频信号按照各个5.1虚拟音箱的HRTF数据分别进行卷积和叠加,能够获得具有较好的三维环绕音效的立体声音频信号,该立体声音频信号在播放时具有较好的三维环绕效果。
图10,示出了本申请一个示例性实施例提供的音频信号的处理装置的结构框图,该装置可以实现成为终端或终端中的一部分。该装置包括:
获取模块1010,用于获取第一立体声音频信号;
处理模块1020，用于将第一立体声音频信号拆分为5.1声道音频信号；对5.1声道音频信号按照三维环绕的5.1虚拟音箱的音箱参数进行信号处理，得到处理后的5.1声道音频信号；
合成模块1030,用于将处理后的5.1声道音频信号,合成为立体声音频信号。
在一个可选的实施例中,该装置还包括计算模块1040;
处理模块1020，用于将第一立体声音频信号输入高通滤波器进行滤波，得到第一高频信号；
计算模块1040，用于根据第一高频信号，计算得到左声道高频信号、中央声道高频信号和右声道高频信号；根据左声道高频信号、中央声道高频信号和右声道高频信号，计算得到5.1声道音频信号中的前置左声道信号、前置右声道信号、前置中央声道信号、后置左声道信号和后置右声道信号。
在一个可选的实施例中，计算模块1040，还用于对第一高频信号进行快速傅里叶变换，得到高频实数信号和高频虚数信号；根据高频实数信号和高频虚数信号计算向量投影；对高频实数信号中的左声道高频实数信号和向量投影的乘积进行快速傅里叶逆变换和交迭相加，得到中央声道高频信号；将第一高频信号中的左声道高频信号和中央声道信号的差，作为左声道高频信号；将第一高频信号中的右声道高频信号和中央声道信号的差，作为右声道高频信号。
计算模块1040,还用于将高频实数信号中的左声道高频实数信号和右声道高频实数信号相加,得到高频实数和信号;将高频虚数信号中的左声道高频虚数信号和右声道高频虚数信号相加,得到高频虚数和信号;将高频实数信号中的左声道高频实数信号和右声道高频实数信号相减,得到高频实数差信号;将高频虚数信号中的左声道高频虚数信号和右声道高频虚数信号相减,得到高频虚数差信号;根据高频实数和信号和高频虚数和信号,计算得到实数和信号;根据高频实数差信号和高频虚数差信号,计算得到实数差信号;根据实数和信号和实数差信号,进行向量投影计算,得到向量投影。
在一个可选的实施例中,
计算模块1040,还用于当实数和信号为有效数字时,按照如下公式计算向量投影:
alpha=0.5-SQRT(diffSq/sumSq)*0.5
其中，alpha为向量投影，diffSq为实数差信号，sumSq为实数和信号，SQRT代表开平方，*代表标量乘法。
在一个可选的实施例中,
处理模块1020,还用于提取左声道高频信号中的第一后方/混响信号数据、中央声道高频信号中的第二后方/混响信号数据、右声道高频信号中的第三后方/混响信号数据;
计算模块1040,还用于将左声道高频信号和第一后方/混响信号数据的差,确定为前置左声道信号;将第一后方/混响信号数据和第二后方/混响信号数据的和,确定为后置左声道信号;将右声道高频信号和第三后方/混响信号数据的差,确定为前置右声道信号;将第三后方/混响信号数据和第二后方/混响信号数据的和,确定为后置右声道信号;将中央声道高频信号和第二后方/混响信号数据的差,确定为前置中央声道信号。
在一个可选的实施例中,获取模块1010,还用于对于左声道高频信号、中央声道高频信号和右声道高频信号中的任意一个声道高频信号,根据声道高频信号中的采样点得到至少一个移动窗,每个移动窗包括n个采样点,相邻的两个移动窗存在n/2个采样点是重叠的,n≥1。
计算模块1040,还用于计算移动窗中的低相关信号以及低相关信号的起始时间点,低相关信号包括幅度谱的第一衰减包络序列和相位谱的第二衰减包络序列不相等的信号;确定符合后方/混响特征的目标低相关信号;计算目标低相关信号的结束时间点;根据起始时间点和结束时间点提取目标低相关信号,作为声道高频信号中的后方/混响信号数据。
计算模块1040，还用于对第i个移动窗中的采样点信号进行快速傅里叶变换，得到快速傅里叶变换后的采样点信号；计算快速傅里叶变换后的采样点信号的幅度谱和相位谱；根据快速傅里叶变换后的采样点信号的幅度谱，计算第i个移动窗中的m条频率线的第一衰减包络序列；根据快速傅里叶变换后的采样点信号的相位谱，计算第i个移动窗中的m条频率线的第二衰减包络序列；当m条频率线中的第j条频率线的第一衰减包络序列和第二衰减包络序列不同时，确定第j条频率线为低相关信号；根据第i个移动窗的窗口号和第j条频率线的频率线号，确定低相关信号的起始时间点，i≥1，m≥1，1≤j≤m。
在一个可选的实施例中,计算模块1040,还用于当低相关信号的甚高频率线的幅度谱能量小于第一阈值且甚高频率线所在窗口的相邻窗口的衰减包络斜率大于第二阈值时,确定低相关信号是符合后方/混响特征的目标低相关信号;或,当低相关信号的甚高频率线的幅度谱能量小于第一阈值且甚高频率线所在窗口的相邻窗口的衰减速度大于第三阈值时,确定低相关信号是符合后方/混响特征的目标低相关信号。
在一个可选的实施例中,计算模块1040,还用于获取目标低相关信号的幅度谱对应的频率线的能量小于第四阈值的时间点,作为结束时间点;或,当目标低相关信号的能量小于下一个低相关信号的能量的1/n时,确定下一个低相关信号的起始时间点作为目标低相关信号的结束时间点。
在一个可选的实施例中,获取模块1010,还用于提取位于起始时间点和结束时间点中的声道信号片段。
计算模块1040,还用于对声道信号片段进行快速傅里叶变换,得到快速傅里叶变换后的信号片段;从快速傅里叶变换后的信号片段中提取目标低相关信号对应的频率线,得到第一部分信号;对第一部分信号进行快速傅里叶逆变换和交迭相加后,得到声道高频信号中的后方/混响信号数据。
在一个可选的实施例中,计算模块1040,还用于将前置左声道信号与虚拟前置左声道音箱的音量进行标量相乘,得到处理后的前置左声道信号;将前置右声道信号与虚拟前置右声道音箱的音量进行标量相乘,得到处理后的前置右声道信号;将前置中央声道信号与虚拟前置中央声道音箱的音量进行标量相乘,得到处理后的前置中央声道信号;将后置左声道信号与虚拟后置左声道音箱的音量进行标量相乘,得到处理后的后置左声道信号;将后置右声道信号与虚拟后置右声道音箱的音量进行标量相乘,得到处理后的后置右声道信号。
在一个可选的实施例中,5.1声道音频信号包括低频声道信号;
处理模块1020，还用于将第一立体声音频信号输入低通滤波器进行滤波，得到第一低频信号。
计算模块1040,还用于对第一低频信号和5.1虚拟音箱中的低频声道音箱的音量参数进行标量相乘,得到第二低频信号;将第二低频信号进行单声道转换,得到处理后的低频声道信号。
在一个可选的实施例中,第二低频信号包括:左声道低频信号和右声道低频信号;
计算模块1040,还用于将左声道低频信号和右声道低频信号叠加后求平均,将平均后的音频信号,作为处理后的低频声道信号。
图11,示出了本申请一个示例性实施例提供的音频信号的处理装置的结构框图。该装置可以实现成为终端或终端中的一部分。该装置包括:
第一获取模块1120,用于获取5.1声道音频信号;
第二获取模块1140,用于根据5.1虚拟音箱在虚拟环境中的坐标,获取5.1虚拟音箱中每个虚拟音箱对应的头相关变换函数HRTF数据;
处理模块1160,用于根据每个虚拟音箱对应的HRTF数据,对5.1声道音频信号中的相应声道音频信号进行处理,得到处理后的5.1声道音频信号;
合成模块1180,用于将处理后的5.1声道音频信号,合成为立体声音频信号。
在一个可选的实施例中,第二获取模块1140,用于获取HRTF数据库,HRTF数据库包括:至少一个HRTF数据采集点和HRTF数据之间的对应关系,每个HRTF数据采集点具有各自的坐标;根据5.1虚拟音箱中的第i个虚拟音箱的第i坐标,在HRTF数据库中查询与第i坐标最接近的HRTF数据采集点,将与第i坐标最接近的HRTF数据采集点的HRTF数据确定为第i个虚拟音箱的HRTF数据,i≥1。
在一个可选的实施例中,该装置,还包括:
采集模块1112，用于在声学房间中以参考人头为球心采集至少一条HRTF数据，并记录各条HRTF数据对应的HRTF数据采集点相对于参考人头的位置坐标；
生成模块1114,用于根据HRTF数据、HRTF数据采集点的标识和HRTF数据采集点的位置坐标,生成HRTF数据库。
在一个可选的实施例中,HRTF数据包括:左声道HRTF系数;
处理模块1160,包括:
左声道卷积单元,用于对于5.1声道音频信号中的第i个声道的音频信号,采用第i个虚拟音箱对应的HRTF数据中的左声道HRTF系数进行第一卷积,得到第一卷积后的第i个声道的音频信号;
左声道合成单元,用于将第一卷积后的各个声道的音频信号进行叠加,得到立体声音频信号中的左声道信号。
在一个可选的实施例中,HRTF数据包括:右声道HRTF系数;
处理模块1160,包括:
右声道卷积单元,用于对于5.1声道音频信号中的第i个声道的音频信号,采用第i个虚拟音箱对应的HRTF数据中的右声道HRTF系数进行第二卷积,得到第二卷积后的第i个声道的音频信号;
右声道合成单元,用于将第二卷积后的各个声道的音频信号进行叠加,得到立体声音频信号中的右声道信号。
图12,示出了本申请一个示例性实施例提供的终端1200的结构框图。该终端1200可以是:智能手机、平板电脑、MP3播放器(Moving Picture Experts Group Audio Layer III,动态影像专家压缩标准音频层面3)、MP4(Moving Picture Experts Group Audio Layer IV,动态影像专家压缩标准音频层面4)播放器、笔记本电脑或台式电脑。终端1200还可能被称为用户设备、便携式终端、膝上型终端、台式终端等其他名称。
通常,终端1200包括有:处理器1201和存储器1202。
处理器1201可以包括一个或多个处理核心，比如4核心处理器、8核心处理器等。处理器1201可以采用DSP（Digital Signal Processing，数字信号处理）、FPGA（Field-Programmable Gate Array，现场可编程门阵列）、PLA（Programmable Logic Array，可编程逻辑阵列）中的至少一种硬件形式来实现。处理器1201也可以包括主处理器和协处理器，主处理器是用于对在唤醒状态下的数据进行处理的处理器，也称CPU（Central Processing Unit，中央处理器）；协处理器是用于对在待机状态下的数据进行处理的低功耗处理器。在一些实施例中，处理器1201可以集成有GPU（Graphics Processing Unit，图像处理器），GPU用于负责显示屏所需要显示的内容的渲染和绘制。在一些实施例中，处理器1201还可以包括AI（Artificial Intelligence，人工智能）处理器，该AI处理器用于处理有关机器学习的计算操作。
存储器1202可以包括一个或多个计算机可读存储介质，该计算机可读存储介质可以是非暂态的。存储器1202还可包括高速随机存取存储器，以及非易失性存储器，比如一个或多个磁盘存储设备、闪存存储设备。在一些实施例中，存储器1202中的非暂态的计算机可读存储介质用于存储至少一个指令，该至少一个指令用于被处理器1201所执行以实现本申请中各个方法实施例提供的音频信号的处理方法。
在一些实施例中,终端1200还可选包括有:外围设备接口1203和至少一个外围设备。处理器1201、存储器1202和外围设备接口1203之间可以通过总线或信号线相连。各个外围设备可以通过总线、信号线或电路板与外围设备接口1203相连。具体地,外围设备包括:射频电路1204、触摸显示屏1205、摄像头1206、音频电路1207、定位组件1208和电源1209中的至少一种。
外围设备接口1203可被用于将I/O(Input/Output,输入/输出)相关的至少一个外围设备连接到处理器1201和存储器1202。在一些实施例中,处理器1201、存储器1202和外围设备接口1203被集成在同一芯片或电路板上;在一些其他实施例中,处理器1201、存储器1202和外围设备接口1203中的任意一个或两个可以在单独的芯片或电路板上实现,本实施例对此不加以限定。
射频电路1204用于接收和发射RF(Radio Frequency,射频)信号,也称电磁信号。射频电路1204通过电磁信号与通信网络以及其他通信设备进行通信。射频电路1204将电信号转换为电磁信号进行发送,或者,将接收到的电磁信号转换为电信号。可选地,射频电路1204包括:天线系统、RF收发器、一个或多个放大器、调谐器、振荡器、数字信号处理器、编解码芯片组、用户身份模块卡等等。射频电路1204可以通过至少一种无线通信协议来与其它终端进行通信。该无线通信协议包括但不限于:万维网、城域网、内联网、各代移动通信网络(2G、3G、4G及5G)、无线局域网和/或WiFi(Wireless Fidelity,无线保真)网络。在一些实施例中,射频电路1204还可以包括NFC(Near Field Communication,近距离无线通信)有关的电路,本申请对此不加以限定。
显示屏1205用于显示UI（User Interface，用户界面）。该UI可以包括图形、文本、图标、视频及其它们的任意组合。当显示屏1205是触摸显示屏时，显示屏1205还具有采集在显示屏1205的表面或表面上方的触摸信号的能力。该触摸信号可以作为控制信号输入至处理器1201进行处理。此时，显示屏1205还可以用于提供虚拟按钮和/或虚拟键盘，也称软按钮和/或软键盘。在一些实施例中，显示屏1205可以为一个，设置在终端1200的前面板；在另一些实施例中，显示屏1205可以为至少两个，分别设置在终端1200的不同表面或呈折叠设计；在再一些实施例中，显示屏1205可以是柔性显示屏，设置在终端1200的弯曲表面上或折叠面上。甚至，显示屏1205还可以设置成非矩形的不规则图形，也即异形屏。显示屏1205可以采用LCD（Liquid Crystal Display，液晶显示屏）、OLED（Organic Light-Emitting Diode，有机发光二极管）等材质制备。
摄像头组件1206用于采集图像或视频。可选地,摄像头组件1206包括前置摄像头和后置摄像头。通常,前置摄像头设置在终端的前面板,后置摄像头设置在终端的背面。在一些实施例中,后置摄像头为至少两个,分别为主摄像头、景深摄像头、广角摄像头、长焦摄像头中的任意一种,以实现主摄像头和景深摄像头融合实现背景虚化功能、主摄像头和广角摄像头融合实现全景拍摄以及VR(Virtual Reality,虚拟现实)拍摄功能或者其它融合拍摄功能。在一些实施例中,摄像头组件1206还可以包括闪光灯。闪光灯可以是单色温闪光灯,也可以是双色温闪光灯。双色温闪光灯是指暖光闪光灯和冷光闪光灯的组合,可以用于不同色温下的光线补偿。
音频电路1207可以包括麦克风和扬声器。麦克风用于采集用户及环境的声波,并将声波转换为电信号输入至处理器1201进行处理,或者输入至射频电路1204以实现语音通信。出于立体声采集或降噪的目的,麦克风可以为多个,分别设置在终端1200的不同部位。麦克风还可以是阵列麦克风或全向采集型麦克风。扬声器则用于将来自处理器1201或射频电路1204的电信号转换为声波。扬声器可以是传统的薄膜扬声器,也可以是压电陶瓷扬声器。当扬声器是压电陶瓷扬声器时,不仅可以将电信号转换为人类可听见的声波,也可以将电信号转换为人类听不见的声波以进行测距等用途。在一些实施例中,音频电路1207还可以包括耳机插孔。
定位组件1208用于定位终端1200的当前地理位置,以实现导航或LBS(Location Based Service,基于位置的服务)。定位组件1208可以是基于美国的GPS(Global Positioning System,全球定位系统)、中国的北斗系统或俄罗斯的伽利略系统的定位组件。
电源1209用于为终端1200中的各个组件进行供电。电源1209可以是交流电、直流电、一次性电池或可充电电池。当电源1209包括可充电电池时,该可充电电池可以是有线充电电池或无线充电电池。有线充电电池是通过有线线路充电的电池,无线充电电池是通过无线线圈充电的电池。该可充电电池还可以用于支持快充技术。
在一些实施例中，终端1200还包括有一个或多个传感器1210。该一个或多个传感器1210包括但不限于：加速度传感器1211、陀螺仪传感器1212、压力传感器1213、指纹传感器1214、光学传感器1215以及接近传感器1216。
加速度传感器1211可以检测以终端1200建立的坐标系的三个坐标轴上的加速度大小。比如,加速度传感器1211可以用于检测重力加速度在三个坐标轴上的分量。处理器1201可以根据加速度传感器1211采集的重力加速度信号,控制触摸显示屏1205以横向视图或纵向视图进行用户界面的显示。加速度传感器1211还可以用于游戏或者用户的运动数据的采集。
陀螺仪传感器1212可以检测终端1200的机体方向及转动角度,陀螺仪传感器1212可以与加速度传感器1211协同采集用户对终端1200的3D动作。处理器1201根据陀螺仪传感器1212采集的数据,可以实现如下功能:动作感应(比如根据用户的倾斜操作来改变UI)、拍摄时的图像稳定、游戏控制以及惯性导航。
压力传感器1213可以设置在终端1200的侧边框和/或触摸显示屏1205的下层。当压力传感器1213设置在终端1200的侧边框时,可以检测用户对终端1200的握持信号,由处理器1201根据压力传感器1213采集的握持信号进行左右手识别或快捷操作。当压力传感器1213设置在触摸显示屏1205的下层时,由处理器1201根据用户对触摸显示屏1205的压力操作,实现对UI界面上的可操作性控件进行控制。可操作性控件包括按钮控件、滚动条控件、图标控件、菜单控件中的至少一种。
指纹传感器1214用于采集用户的指纹，由处理器1201根据指纹传感器1214采集到的指纹识别用户的身份，或者，由指纹传感器1214根据采集到的指纹识别用户的身份。在识别出用户的身份为可信身份时，由处理器1201授权该用户执行相关的敏感操作，该敏感操作包括解锁屏幕、查看加密信息、下载软件、支付及更改设置等。指纹传感器1214可以被设置在终端1200的正面、背面或侧面。当终端1200上设置有物理按键或厂商Logo时，指纹传感器1214可以与物理按键或厂商Logo集成在一起。
光学传感器1215用于采集环境光强度。在一个实施例中，处理器1201可以根据光学传感器1215采集的环境光强度，控制触摸显示屏1205的显示亮度。具体地，当环境光强度较高时，调高触摸显示屏1205的显示亮度；当环境光强度较低时，调低触摸显示屏1205的显示亮度。在另一个实施例中，处理器1201还可以根据光学传感器1215采集的环境光强度，动态调整摄像头组件1206的拍摄参数。
接近传感器1216,也称距离传感器,通常设置在终端1200的前面板。接近传感器1216用于采集用户与终端1200的正面之间的距离。在一个实施例中,当接近传感器1216检测到用户与终端1200的正面之间的距离逐渐变小时,由处理器1201控制触摸显示屏1205从亮屏状态切换为息屏状态;当接近传感器1216检测到用户与终端1200的正面之间的距离逐渐变大时,由处理器1201控制触摸显示屏1205从息屏状态切换为亮屏状态。
本领域技术人员可以理解,图12中示出的结构并不构成对终端1200的限定,可以包括比图示更多或更少的组件,或者组合某些组件,或者采用不同的组件布置。
本申请还提供一种计算机可读存储介质,所述存储介质中存储有至少一条指令、至少一段程序、代码集或指令集,所述至少一条指令、所述至少一段程序、所述代码集或指令集由所述处理器加载并执行以实现上述方法实施例提供的音频信号的处理方法。
可选地,本申请还提供了一种包含指令的计算机程序产品,当其在计算机上运行时,使得计算机执行上述各方面所述的音频信号的处理方法。
应当理解的是,在本文中提及的“多个”是指两个或两个以上。“和/或”,描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三种情况。字符“/”一般表示前后关联对象是一种“或”的关系。
上述本申请实施例序号仅仅为了描述,不代表实施例的优劣。
本领域普通技术人员可以理解实现上述实施例的全部或部分步骤可以通过硬件来完成,也可以通过程序来指令相关的硬件完成,所述的程序可以存储于一种计算机可读存储介质中,上述提到的存储介质可以是只读存储器,磁盘或光盘等。
以上所述仅为本申请的较佳实施例,并不用以限制本申请,凡在本申请的精神和原则之内,所作的任何修改、等同替换、改进等,均应包含在本申请的保护范围之内。

Claims (10)

  1. 一种音频信号的处理方法,其特征在于,所述方法由终端执行,所述方法包括:
    获取5.1声道音频信号;
    根据5.1虚拟音箱在虚拟环境中的坐标,获取所述5.1虚拟音箱中每个虚拟音箱对应的头相关变换函数HRTF数据;
    根据每个所述虚拟音箱对应的HRTF数据,对所述5.1声道音频信号中的相应声道音频信号进行处理,得到处理后的5.1声道音频信号;
    将所述处理后的5.1声道音频信号,合成为立体声音频信号。
  2. 根据权利要求1所述的方法,其特征在于,所述根据5.1虚拟音箱在虚拟环境中的坐标,获取所述5.1虚拟音箱中每个虚拟音箱对应的头相关变换函数HRTF数据,包括:
    获取HRTF数据库,所述HRTF数据库包括:至少一个HRTF数据采集点和HRTF数据之间的对应关系,每个所述HRTF数据采集点具有各自的坐标;
    根据所述5.1虚拟音箱中的第i个虚拟音箱的第i坐标,在所述HRTF数据库中查询与所述第i坐标最接近的HRTF数据采集点,将与所述第i坐标最接近的HRTF数据采集点的HRTF数据确定为所述第i个虚拟音箱的HRTF数据,i≥1。
  3. 根据权利要求2所述的方法,其特征在于,所述获取HRTF数据库之前,还包括:
    在声学房间中采集一系列以参考人头为球心的至少一条HRTF数据,并记录各条HRTF数据对应HRTF数据采集点相对于所述参考人头的位置坐标;
    根据所述HRTF数据、所述HRTF数据采集点的标识和所述HRTF数据采集点的位置坐标,生成所述HRTF数据库。
  4. 根据权利要求1至3任一所述的方法,其特征在于,所述HRTF数据包括:左声道HRTF系数;
    所述根据每个所述虚拟音箱对应的HRTF数据，对所述5.1声道音频信号中的相应声道音频信号进行处理，得到处理后的5.1声道音频信号，包括：
    对于所述5.1声道音频信号中的第i个声道的音频信号,采用第i个虚拟音箱对应的HRTF数据中的左声道HRTF系数进行第一卷积,得到所述第一卷积后的第i个声道的左声道分量;
    将所述第一卷积后的各个声道的左声道分量进行叠加,得到所述立体声音频信号中的左声道信号。
  5. 根据权利要求1至3任一所述的方法,其特征在于,所述HRTF数据包括:右声道HRTF系数;
    所述根据每个所述虚拟音箱对应的HRTF数据,对所述5.1声道音频信号中的相应声道音频信号进行处理,得到处理后的5.1声道音频信号,包括:
    对于所述5.1声道音频信号中的第i个声道的音频信号,采用第i个虚拟音箱对应的HRTF数据中的右声道HRTF系数进行第二卷积,得到所述第二卷积后的第i个声道的右声道分量;
    将所述第二卷积后的各个声道的右声道分量进行叠加,得到所述立体声音频信号中的右声道信号。
  6. 一种音频信号的处理装置,其特征在于,所述装置应用于终端中,所述装置包括:
    第一获取模块,用于获取5.1声道音频信号;
    第二获取模块,用于根据5.1虚拟音箱在虚拟环境中的坐标,获取所述5.1虚拟音箱中每个虚拟音箱对应的头相关变换函数HRTF数据;
    处理模块,用于根据每个所述虚拟音箱对应的HRTF数据,对所述5.1声道音频信号中的相应声道音频信号进行处理,得到处理后的5.1声道音频信号;
    合成模块,用于将所述处理后的5.1声道音频信号,合成为立体声音频信号。
  7. 根据权利要求6所述的装置,其特征在于,
    所述第二获取模块，用于获取HRTF数据库，所述HRTF数据库包括：至少一个HRTF数据采集点和HRTF数据之间的对应关系，每个所述HRTF数据采集点具有各自的坐标；根据所述5.1虚拟音箱中的第i个虚拟音箱的第i坐标，在所述HRTF数据库中查询与所述第i坐标最接近的HRTF数据采集点，将与所述第i坐标最接近的HRTF数据采集点的HRTF数据确定为所述第i个虚拟音箱的HRTF数据，i≥1。
  8. 根据权利要求7所述的装置,其特征在于,所述装置,还包括:
    采集模块,用于在声学房间中采集一系列以参考人头为球心的至少一条HRTF数据,并记录各条HRTF数据对应HRTF数据采集点相对于所述参考人头的位置坐标;
    生成模块,用于根据所述HRTF数据、所述HRTF数据采集点的标识和所述HRTF数据采集点的位置坐标,生成所述HRTF数据库。
  9. 一种终端,其特征在于,所述终端包括处理器和存储器,所述存储器中存储有至少一条指令,所述指令由所述处理器加载并执行以实现如权利要求1至5任一所述的音频信号处理方法。
  10. 一种计算机可读存储介质,其特征在于,所述存储介质中存储有至少一条指令,所述指令由处理器加载并执行以实现如权利要求1至5任一所述的音频信号处理方法。
PCT/CN2018/118766 2017-12-26 2018-11-30 音频信号的处理方法、装置、终端及存储介质 WO2019128630A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US16/617,986 US10924877B2 (en) 2017-12-26 2018-11-30 Audio signal processing method, terminal and storage medium thereof
EP18895910.0A EP3624463A4 (en) 2017-12-26 2018-11-30 AUDIO SIGNAL PROCESSING METHOD AND DEVICE, TERMINAL DEVICE AND STORAGE MEDIUM

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201711436811.6 2017-12-26
CN201711436811.6A CN108156561B (zh) 2017-12-26 2017-12-26 音频信号的处理方法、装置及终端

Publications (1)

Publication Number Publication Date
WO2019128630A1 true WO2019128630A1 (zh) 2019-07-04

Family

ID=62461968

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/118766 WO2019128630A1 (zh) 2017-12-26 2018-11-30 音频信号的处理方法、装置、终端及存储介质

Country Status (4)

Country Link
US (1) US10924877B2 (zh)
EP (1) EP3624463A4 (zh)
CN (1) CN108156561B (zh)
WO (1) WO2019128630A1 (zh)


Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1294782A (zh) * 1998-03-25 2001-05-09 雷克技术有限公司 音频信号处理方法和装置
CN1402592A (zh) * 2002-07-23 2003-03-12 华南理工大学 两扬声器虚拟5.1通路环绕声的信号处理方法
US20090185693A1 (en) * 2008-01-18 2009-07-23 Microsoft Corporation Multichannel sound rendering via virtualization in a stereo loudspeaker system
CN103237287A (zh) * 2013-03-29 2013-08-07 华南理工大学 具定制功能的5.1通路环绕声耳机重放信号处理方法
CN107040862A (zh) * 2016-02-03 2017-08-11 腾讯科技(深圳)有限公司 音频处理方法及处理系统
CN108156561A (zh) * 2017-12-26 2018-06-12 广州酷狗计算机科技有限公司 音频信号的处理方法、装置及终端
CN108156575A (zh) * 2017-12-26 2018-06-12 广州酷狗计算机科技有限公司 音频信号的处理方法、装置及终端



Also Published As

Publication number Publication date
EP3624463A1 (en) 2020-03-18
US20200112812A1 (en) 2020-04-09
CN108156561A (zh) 2018-06-12
CN108156561B (zh) 2020-08-04
EP3624463A4 (en) 2020-11-18
US10924877B2 (en) 2021-02-16

