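WO2023092505A1 - Stereo audio signal processing method, apparatus, encoding device, decoding device and storage medium - Google Patents

Stereo audio signal processing method, apparatus, encoding device, decoding device and storage medium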

Info

Publication number
WO2023092505A1
Authority
WO
WIPO (PCT)
Prior art keywords
channel signal
decorrelation
processing
correlation coefficient
cross
Prior art date
Application number
PCT/CN2021/133722
Other languages
English (en)
French (fr)
Inventor
高硕
Original Assignee
北京小米移动软件有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京小米移动软件有限公司
Priority to PCT/CN2021/133722 priority Critical patent/WO2023092505A1/zh
Priority to CN202180004116.0A priority patent/CN114258568A/zh
Publication of WO2023092505A1 publication Critical patent/WO2023092505A1/zh

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008 Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/0017 Lossless audio signal coding; Perfect reconstruction of coded audio signal by transmission of coding error

Definitions

  • the present disclosure relates to the field of communication technologies, and in particular to a stereo audio signal processing method, apparatus, encoding device, decoding device and storage medium.
  • because lossless coding can meet the demands of high-quality audio playback and lossless storage, it is widely used.
  • when lossless encoding is performed on a stereo audio signal, decorrelation processing must first be performed on the stereo audio signal so as to improve the encoding compression rate.
  • in the related art, the main method of decorrelation processing is to set a threshold between 0 and 1 and calculate the correlation coefficient between the left channel signal and the right channel signal of the current frame of the stereo audio signal. When the correlation coefficient is greater than the threshold, the left channel signal and the right channel signal of the current frame are correlated, that is, the current frame is a correlated signal, and decorrelation processing is performed on the left channel signal and the right channel signal of the current frame;
  • when the correlation coefficient is less than or equal to the threshold, the left and right channel signals of the current frame are considered uncorrelated and are processed as an uncorrelated signal, that is, the current frame of the stereo audio signal is used directly as the two-channel signals after decorrelation processing.
  • however, when the current frame is a correlated signal, the correlation takes one of two forms: a partial positive-phase signal or a partial anti-phase signal.
  • to improve the compression rate, different forms of correlation call for different decorrelation processing, and the decorrelation processing in the related art can improve the encoding compression rate only for partial positive-phase signals, not for partial anti-phase signals.
  • the present disclosure proposes a stereo audio signal processing method, apparatus, user equipment, network side equipment and storage medium, to solve the technical problem that the decorrelation processing method in the related art yields a low coding compression rate.
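  • the stereo audio signal processing method proposed in an embodiment of the present disclosure is applied to an encoding device and includes:
  • determining a first cross-correlation coefficient of the left channel signal and the right channel signal of the current frame of the stereo audio signal;
  • in response to the first cross-correlation coefficient being less than a first threshold, performing decorrelation processing on the current frame of the stereo audio signal in a first decorrelation processing manner to obtain two-channel signals after decorrelation processing, and calculating a second cross-correlation coefficient of the two-channel signals after decorrelation processing; in response to the first cross-correlation coefficient being smaller than the second cross-correlation coefficient, determining that the flag bit is a first value, obtaining an encoded code stream based on the two-channel signals after decorrelation processing, and writing the flag bit into the encoded code stream, where the value range of the first threshold is (-1, 0).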
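  • the stereo audio signal processing method proposed in another embodiment of the present disclosure is applied to a decoding device and includes:
  • acquiring the encoded code stream sent by the encoding device, and determining the two-channel signals after decorrelation processing and the flag bit based on the encoded code stream;
  • in response to the flag bit being the first value, performing decorrelation reconstruction on the two-channel signals after decorrelation processing in a first decorrelation reconstruction manner, and outputting a decorrelation-reconstructed audio signal.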
  • the stereo audio signal processing device proposed by the embodiment includes:
  • a determination module configured to determine the first cross-correlation coefficient of the left channel signal and the right channel signal of the current frame of the stereo audio signal;
  • a processing module configured to: in response to the first cross-correlation coefficient being smaller than a first threshold, perform decorrelation processing on the current frame of the stereo audio signal in a first decorrelation processing manner to obtain two-channel signals after decorrelation processing, and calculate the second cross-correlation coefficient of the two-channel signals after decorrelation processing; and, in response to the first cross-correlation coefficient being smaller than the second cross-correlation coefficient, determine that the flag bit is the first value, obtain the encoded code stream based on the two-channel signals after decorrelation processing, and write the flag bit into the encoded code stream, where the value range of the first threshold is (-1, 0).
  • the stereo audio signal processing device proposed by the embodiment includes:
  • an acquisition module configured to acquire the encoded code stream sent by the encoding device;
  • a determining module configured to determine the two-channel signals after decorrelation processing and the flag bit based on the encoded code stream;
  • a processing module configured to, in response to the flag bit being the first value, perform decorrelation reconstruction on the two-channel signals after decorrelation processing in a first decorrelation reconstruction manner, and output a decorrelation-reconstructed audio signal.
  • an embodiment provides a communication device that includes a processor and a memory; a computer program is stored in the memory, and the processor executes the computer program stored in the memory, so that the device executes the method provided in the embodiment of the foregoing aspect.
  • an embodiment provides a communication device that includes a processor and a memory; a computer program is stored in the memory, and the processor executes the computer program stored in the memory, so that the device executes the method provided in the above embodiment of another aspect.
  • a communication device provided by an embodiment of another aspect of the present disclosure includes: a processor and an interface circuit;
  • the interface circuit is used to receive code instructions and transmit them to the processor
  • the processor is configured to run the code instructions to execute the method provided in one embodiment.
  • a communication device provided by an embodiment of another aspect of the present disclosure includes: a processor and an interface circuit;
  • the interface circuit is used to receive code instructions and transmit them to the processor
  • the processor is configured to run the code instructions to execute the method provided in another embodiment.
  • a computer-readable storage medium provided by another embodiment of the present disclosure is used to store instructions, and when the instructions are executed, the method provided by the first embodiment is implemented.
  • a computer-readable storage medium provided by an embodiment of another aspect of the present disclosure is used to store instructions, and when the instructions are executed, the method provided by another embodiment is implemented.
  • in the embodiments of the present disclosure, the first cross-correlation coefficient of the left channel signal and the right channel signal of the current frame of the stereo audio signal is determined first; in response to the first cross-correlation coefficient being less than the first threshold, the first decorrelation processing manner is used to perform decorrelation processing on the current frame of the stereo audio signal to obtain the two-channel signals after decorrelation processing.
  • after that, the second cross-correlation coefficient of the two-channel signals after decorrelation processing is calculated; when the first cross-correlation coefficient is smaller than the second cross-correlation coefficient, the flag bit is determined to be the first value, the encoded code stream is obtained based on the two-channel signals after decorrelation processing, and the flag bit is written into the encoded code stream.
  • the value range of the first threshold is (-1, 0). A first cross-correlation coefficient smaller than the first threshold therefore indicates that the current frame of the stereo audio signal is a partial anti-phase signal, and the first decorrelation processing manner, which corresponds to partial anti-phase signals, is then applied so as to safeguard the subsequent compression rate. The embodiments of the present disclosure thus provide both a judgment method and a decorrelation processing method for partial anti-phase signals, which greatly improves the coding compression rate of partial anti-phase signals.
  • Fig. 1a is a schematic flowchart of a stereo audio signal processing method provided by an embodiment of the present disclosure;
  • Fig. 1b is a flow chart of obtaining an encoded code stream based on two-channel signals after decorrelation processing provided by an embodiment of the present disclosure;
  • Fig. 2 is a schematic flowchart of a stereo audio signal processing method provided by an embodiment of the present disclosure;
  • Fig. 3a is a schematic flowchart of a stereo audio signal processing method provided by an embodiment of the present disclosure;
  • Fig. 3b is a schematic flowchart of a stereo audio signal processing method provided by an embodiment of the present disclosure;
  • Fig. 3c is a schematic flowchart of a stereo audio signal processing method provided by an embodiment of the present disclosure;
  • Fig. 4a is a schematic flowchart of a stereo audio signal processing method provided by an embodiment of the present disclosure;
  • Fig. 4b is a flow chart of determining two-channel signals after decorrelation processing based on an encoded code stream provided by an embodiment of the present disclosure;
  • Fig. 5 is a schematic flowchart of a stereo audio signal processing method provided by an embodiment of the present disclosure;
  • Fig. 6 is a schematic flowchart of a stereo audio signal processing method provided by an embodiment of the present disclosure;
  • Fig. 7 is a schematic structural diagram of a stereo audio signal processing device provided by an embodiment of the present disclosure;
  • Fig. 8 is a schematic structural diagram of a stereo audio signal processing device provided by an embodiment of the present disclosure;
  • Fig. 9 is a block diagram of a user equipment provided by an embodiment of the present disclosure;
  • Fig. 10 is a block diagram of a network side device provided by an embodiment of the present disclosure.
  • although the embodiments of the present disclosure may use the terms first, second, third, etc. to describe various information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of the embodiments of the present disclosure, first information may also be called second information, and similarly, second information may also be called first information.
  • depending on the context, the word "if" as used herein may be interpreted as "at the time of", "when", or "in response to a determination".
  • Fig. 1a is a schematic flowchart of a method for processing a stereo audio signal provided by an embodiment of the present disclosure. The method is executed by an encoding device. As shown in Fig. 1a, the method for processing a stereo audio signal may include the following steps:
  • Step 101: Determine a first cross-correlation coefficient between the left channel signal and the right channel signal of the current frame of the stereo audio signal.
  • a cross-correlation analysis may be performed on the current frame of the stereo audio signal to obtain the first cross-correlation coefficient of the left channel signal and the right channel signal of the current frame.
  • the method for determining the first cross-correlation coefficient of the left channel signal and the right channel signal of the current frame of the stereo audio signal may include a formula whose terms are defined as follows:
  • $\eta(LR)$, as the coefficient is written here, is the cross-correlation coefficient of the left channel signal and the right channel signal of the current frame
  • $L(n)$ is the nth sampling point of the left channel signal of the current frame, and $\bar{L}$ is the average value of all samples of the left channel signal of the current frame
  • $R(n)$ is the nth sampling point of the right channel signal of the current frame, and $\bar{R}$ is the average value of all samples of the right channel signal of the current frame
  • $N$ is the total number of samples of the left channel signal or the right channel signal of the current frame, that is, the frame length of the current frame.
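The formula for the first cross-correlation coefficient is not reproduced in the text above. As a minimal sketch, assuming it has the standard normalized (Pearson) form that the listed terms suggest (per-channel means and a sum over the N samples of the frame), the coefficient could be computed as follows. The function name `cross_correlation` and the handling of a zero-variance channel are assumptions made for illustration only.

```python
import math
from typing import Sequence


def cross_correlation(left: Sequence[float], right: Sequence[float]) -> float:
    """Normalized cross-correlation of two equal-length frames.

    Assumed form (not taken from the source):
        sum((L(n) - mean_L) * (R(n) - mean_R))
        / sqrt(sum((L(n) - mean_L)^2) * sum((R(n) - mean_R)^2))
    """
    n = len(left)
    if n == 0 or n != len(right):
        raise ValueError("frames must be non-empty and of equal length")
    mean_l = sum(left) / n
    mean_r = sum(right) / n
    num = sum((l - mean_l) * (r - mean_r) for l, r in zip(left, right))
    den = math.sqrt(
        sum((l - mean_l) ** 2 for l in left)
        * sum((r - mean_r) ** 2 for r in right)
    )
    # A constant channel has zero variance; returning 0.0 here is a choice
    # made for this sketch, not something the source specifies.
    return num / den if den > 0 else 0.0
```

Under this assumed form the result lies in [-1, 1]: values near -1 correspond to what the text calls a partial anti-phase frame, and values near 1 to a partial positive-phase frame.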
  • Step 102: In response to the first cross-correlation coefficient being smaller than the first threshold, use the first decorrelation processing manner to perform decorrelation processing on the current frame of the stereo audio signal to obtain the two-channel signals after decorrelation processing, and calculate the second cross-correlation coefficient of the two-channel signals after decorrelation processing; in response to the first cross-correlation coefficient being smaller than the second cross-correlation coefficient, determine the flag bit as the first value, obtain the encoded code stream based on the two-channel signals after decorrelation processing, and write the flag bit into the encoded code stream.
  • the first threshold may be preset, and the value range of the first threshold is (-1, 0).
  • the first threshold may be between [-0.5, -0.1].
  • the first threshold may be -0.3.
  • when the first cross-correlation coefficient is smaller than the first threshold, the current frame is a partial anti-phase signal, and the first decorrelation processing manner may be used to perform decorrelation processing on the current frame of the stereo audio signal to obtain the two-channel signals after decorrelation processing.
  • the first decorrelation processing manner may be first sum-difference downmix processing.
  • the first sum-difference downmix processing may include: processing the left channel signal and the right channel signal based on Formula 1 to obtain a main channel signal and a secondary channel signal, where, in Formula 1:
  • Mid(n) is the main channel signal in the two-channel signals after decorrelation processing
  • Sid(n) is the secondary channel signal in the two-channel signals after decorrelation processing
  • L(n) is the left channel signal
  • R(n) is the right channel signal.
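Formula 1 itself is not reproduced in the text above. Purely as an illustration, the sketch below assumes a sum-difference form in which the main channel carries the half-difference and the secondary channel the half-sum, so that a frame with R(n) close to -L(n) yields a secondary channel close to zero. This assignment, the function names, and the use of floating-point samples are all assumptions; the actual Formula 1 may differ.

```python
from typing import List, Sequence, Tuple


def first_downmix(left: Sequence[float],
                  right: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Hypothetical first sum-difference downmix for anti-phase frames."""
    # Assumed form: Mid(n) = (L(n) - R(n)) / 2, Sid(n) = (L(n) + R(n)) / 2
    mid = [(l - r) / 2 for l, r in zip(left, right)]
    sid = [(l + r) / 2 for l, r in zip(left, right)]
    return mid, sid


def first_upmix(mid: Sequence[float],
                sid: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Inverse of the assumed downmix: L = Mid + Sid, R = Sid - Mid."""
    left = [m + s for m, s in zip(mid, sid)]
    right = [s - m for m, s in zip(mid, sid)]
    return left, right
```

A lossless codec would need an integer-exact variant of such a transform; the floating-point version is kept here only to show the roles of the two output channels.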
  • the embodiment of the present disclosure determines whether the current frame of the stereo audio signal is a partial anti-phase signal and, in response to determining that it is, performs decorrelation processing on the current frame of the stereo audio signal using the first decorrelation processing manner, which corresponds to partial anti-phase signals, to obtain the two-channel signals after decorrelation processing, thereby greatly improving the coding compression rate of partial anti-phase signals.
  • in some cases, the correlation of the two-channel signals obtained after decorrelation processing of the audio signal is greater than or equal to the correlation of the two-channel signals before decorrelation processing, that is, the decorrelation processing does not achieve the purpose of "decorrelation". Therefore, in one embodiment of the present disclosure, after the first decorrelation processing is performed on the current frame to obtain the two-channel signals after decorrelation processing, the second cross-correlation coefficient between the two-channel signals after decorrelation processing can be further calculated.
  • whether the first decorrelation processing achieves the purpose of "decorrelation" is then determined by comparing the second cross-correlation coefficient (that is, the coefficient after decorrelation processing) with the first cross-correlation coefficient (that is, the coefficient before decorrelation processing).
  • the second cross-correlation coefficient of the two-channel signals after decorrelation processing may be calculated based on Formula 4, in which:
  • $\eta(MS)$, as the coefficient is written here, is the second cross-correlation coefficient
  • $Mid(n)$ is the nth sampling point of the main channel signal in the two-channel signals after decorrelation processing, and $\overline{Mid}$ is the average value of all sampling points of that main channel signal
  • $Sid(n)$ is the nth sampling point of the secondary channel signal in the two-channel signals after decorrelation processing, and $\overline{Sid}$ is the average value of all sampling points of that secondary channel signal
  • $N$ is the total number of samples of the left channel signal or the right channel signal of the current frame, that is, the frame length of the current frame.
  • in response to the first cross-correlation coefficient being smaller than the second cross-correlation coefficient, the first decorrelation processing manner is considered to have achieved the purpose of "decorrelation".
  • the strength of the correlation between signals increases with the absolute value of the correlation coefficient, and, for a partial anti-phase signal, whose correlation coefficient is negative, the smaller the value of the coefficient, the larger its absolute value and the stronger the negative correlation.
  • therefore, a first cross-correlation coefficient smaller than the second cross-correlation coefficient means that the negative correlation of the two-channel signals before the first decorrelation processing is stronger than that of the two-channel signals after it, so it can be determined that the first decorrelation processing manner has achieved the purpose of "decorrelation".
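The two comparisons described for the partial anti-phase branch can be sketched as below. The function name, the default threshold of -0.3 (one of the example values given in the text), and the `None` return for frames that do not take this branch are illustrative assumptions; how such frames are handled is outside this sketch.

```python
from typing import Optional

FIRST_VALUE = 0  # example first value of the flag bit mentioned in the text


def anti_phase_flag(first_coeff: float,
                    second_coeff: float,
                    first_threshold: float = -0.3) -> Optional[int]:
    """Return the flag bit when the first decorrelation manner is accepted.

    first_coeff:  coefficient of (L, R) before decorrelation processing
    second_coeff: coefficient of (Mid, Sid) after decorrelation processing
    """
    if not -1.0 < first_threshold < 0.0:
        raise ValueError("first threshold must lie in (-1, 0)")
    is_anti_phase = first_coeff < first_threshold
    decorrelated = first_coeff < second_coeff
    if is_anti_phase and decorrelated:
        return FIRST_VALUE
    return None  # this branch does not apply to the frame
```

For example, a frame with a coefficient of -0.8 before processing and -0.1 after processing would be flagged with the first value, whereas -0.8 followed by -0.9 would not.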
  • Fig. 1b is a flow chart of obtaining an encoded code stream based on the two-channel signals after decorrelation processing provided by an embodiment of the present disclosure. As shown in Fig. 1b, the encoded code stream may be obtained from the two-channel signals after decorrelation processing as follows:
  • the two-channel signals after decorrelation processing are divided into sub-bands by integer lifting wavelet decomposition to obtain sub-band signals, and LPC (Linear Prediction Coefficient) parameters are calculated and quantized for the two-channel signals after decorrelation processing to obtain quantized LPC parameters. A linear predictor then predicts each sub-band signal based on the quantized LPC parameters and generates a prediction residual signal, and a preprocessor normalizes the prediction residual signal to generate a normalized output signal, an LSB (Least Significant Bit) signal and a signal sign bit.
  • an entropy encoder performs entropy encoding on the normalized output signal corresponding to each sub-band signal to generate an encoded bit stream, and the encoded bit stream, LSB signal, signal sign bit, quantized LPC parameters and wavelet side information are then multiplexed to obtain the encoded code stream.
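The text does not say which wavelet the integer lifting decomposition uses. To illustrate why a lifting scheme suits lossless coding, the sketch below uses a one-level integer Haar transform (the S-transform) as an assumed stand-in: every step is an integer operation that the inverse undoes exactly. The function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def lifting_forward(x: Sequence[int]) -> Tuple[List[int], List[int]]:
    """One level of integer Haar lifting: returns (low band, high band)."""
    if len(x) % 2:
        raise ValueError("expects an even number of samples")
    low, high = [], []
    for a, b in zip(x[0::2], x[1::2]):
        d = a - b         # predict step: detail (high-band) sample
        s = b + (d >> 1)  # update step: equals floor((a + b) / 2)
        low.append(s)
        high.append(d)
    return low, high


def lifting_inverse(low: Sequence[int], high: Sequence[int]) -> List[int]:
    """Undo the lifting steps in reverse order; reconstruction is exact."""
    out: List[int] = []
    for s, d in zip(low, high):
        b = s - (d >> 1)
        a = d + b
        out.extend((a, b))
    return out
```

Because each lifting step adds or subtracts a value that the decoder can recompute from data it already has, rounding inside the step never costs information, which is the property a lossless sub-band split relies on.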
  • in response to the encoded code stream being determined, the flag bit may be determined to be a first value (for example, 0). The first value indicates that the decorrelation processing adopted by the encoding device is the first decorrelation processing manner, and the flag bit can be written into the encoded code stream and sent to the decoding device, so that the decoding device can perform decorrelation reconstruction in the corresponding decorrelation reconstruction manner based on the flag bit.
  • the first thresholds corresponding to different frames of the stereo audio signal may be different.
  • the first cross-correlation coefficient of the left channel signal and the right channel signal of the current frame is compared with the first threshold corresponding to the current frame.
  • in the foregoing, the stereo audio signal is processed based on the cross-correlation between the left channel signal and the right channel signal of the current frame of the stereo audio signal.
  • the stereo audio signal can also be processed based on the phase between the left channel signal and the right channel signal of the current frame of the stereo audio signal. Specifically, the first phase between the left channel signal and the right channel signal of the current frame can be determined first.
  • when the first phase between the left channel signal and the right channel signal of the current frame is within the first interval, the current frame of the stereo audio signal is determined to be a partial anti-phase signal, and the first decorrelation processing manner is used to perform decorrelation processing on the current frame of the stereo audio signal to obtain the two-channel signals after decorrelation processing, after which the second phase of the two-channel signals after decorrelation processing is calculated. In response to the first phase being greater than the second phase, which indicates that the first decorrelation processing manner achieves the purpose of "decorrelation", the flag bit is determined as the first value, the encoded code stream is obtained based on the two-channel signals after decorrelation processing, and the flag bit is written into the encoded code stream and sent to the decoding device.
  • the first interval may be [135°, 180°].
  • in summary, the first cross-correlation coefficient of the left channel signal and the right channel signal of the current frame of the stereo audio signal is determined first, and when the first cross-correlation coefficient is less than the first threshold, the first decorrelation processing manner is used to perform decorrelation processing on the current frame of the stereo audio signal to obtain the two-channel signals after decorrelation processing.
  • the second cross-correlation coefficient of the two-channel signals after decorrelation processing is then calculated; when the first cross-correlation coefficient is less than the second cross-correlation coefficient, the flag bit is determined as the first value, the encoded code stream is obtained based on the two-channel signals after decorrelation processing, and the flag bit is written into the encoded code stream and sent to the decoding device.
  • the value range of the first threshold is (-1, 0). A first cross-correlation coefficient smaller than the first threshold therefore indicates that the current frame of the stereo audio signal is a partial anti-phase signal, and the first decorrelation processing manner, which corresponds to partial anti-phase signals, is then applied so as to safeguard the subsequent compression rate. The embodiments of the present disclosure thus provide both a judgment method and a decorrelation processing method for partial anti-phase signals, which greatly improves the encoding compression rate of partial anti-phase signals.
  • Fig. 2 is a schematic flowchart of a method for processing a stereo audio signal provided by an embodiment of the present disclosure. The method is executed by an encoding device. As shown in Fig. 2 , the method for processing a stereo audio signal may include the following steps:
  • Step 201: Determine a first cross-correlation coefficient between the left channel signal and the right channel signal of the current frame of the stereo audio signal.
  • for details of step 201, reference may be made to the description of the foregoing embodiments, which is not repeated here.
  • Step 202: In response to the first cross-correlation coefficient being greater than the second threshold, use the second decorrelation processing manner to perform decorrelation processing on the current frame of the stereo audio signal to obtain the two-channel signals after decorrelation processing, and calculate the third cross-correlation coefficient of the two-channel signals after decorrelation processing; in response to the first cross-correlation coefficient being greater than the third cross-correlation coefficient, determine that the flag bit is the second value, obtain the encoded code stream based on the two-channel signals after decorrelation processing, and write the flag bit into the encoded code stream.
  • the second threshold may be preset, and the value range of the second threshold may be (0, 1).
  • the second threshold may be between [0.1, 0.5].
  • the second threshold may be 0.3.
  • when the first cross-correlation coefficient is greater than the second threshold, there is a positive correlation between the left channel signal and the right channel signal of the current frame, that is, the current frame is a partial positive-phase signal, and the second decorrelation processing manner may be used to perform decorrelation processing on the current frame of the stereo audio signal to obtain the two-channel signals after decorrelation processing.
  • the second decorrelation processing manner may be second sum-difference downmix processing.
  • the second sum-difference downmix processing may include: processing the left channel signal and the right channel signal based on Formula 2 to obtain the main channel signal and the secondary channel signal, where, in Formula 2:
  • Mid(n) is the main channel signal in the two-channel signals after decorrelation processing
  • Sid(n) is the secondary channel signal in the two-channel signals after decorrelation processing
  • L(n) is the left channel signal
  • R(n) is the right channel signal.
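Formula 2 is likewise not reproduced in the text above. The sketch below assumes the conventional mid/side form, in which the main channel is the half-sum and the secondary channel the half-difference, so that a frame with R(n) close to L(n) yields a secondary channel close to zero. The form, the function names, and the floating-point samples are assumptions for illustration; the actual Formula 2 may differ.

```python
from typing import List, Sequence, Tuple


def second_downmix(left: Sequence[float],
                   right: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Hypothetical second sum-difference downmix for positive-phase frames."""
    # Assumed form: Mid(n) = (L(n) + R(n)) / 2, Sid(n) = (L(n) - R(n)) / 2
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    sid = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, sid


def second_upmix(mid: Sequence[float],
                 sid: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Inverse of the assumed downmix: L = Mid + Sid, R = Mid - Sid."""
    left = [m + s for m, s in zip(mid, sid)]
    right = [m - s for m, s in zip(mid, sid)]
    return left, right
```

Compared with a downmix aimed at anti-phase frames, the only change in this assumed form is which of the sum and the difference is treated as the main channel.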
  • the embodiment of the present disclosure determines whether the current frame of the stereo audio signal is a partial positive-phase signal and, in response to determining that it is, performs decorrelation processing on the current frame of the stereo audio signal using the second decorrelation processing manner, which corresponds to partial positive-phase signals, to obtain the two-channel signals after decorrelation processing, thereby greatly improving the coding compression rate of partial positive-phase signals.
  • in some cases, the correlation of the two-channel signals obtained after decorrelation processing of the audio signal is greater than or equal to the correlation before the decorrelation processing, that is, the decorrelation processing does not achieve the purpose of "decorrelation". Therefore, in an embodiment of the present disclosure, after the second decorrelation processing is performed on the current frame to obtain the two-channel signals after decorrelation processing, the third cross-correlation coefficient between the two-channel signals after decorrelation processing can be further calculated.
  • whether the second decorrelation processing achieves the purpose of "decorrelation" is then determined by comparing the third cross-correlation coefficient (that is, the coefficient after decorrelation processing) with the first cross-correlation coefficient (that is, the coefficient before decorrelation processing).
  • the third cross-correlation coefficient of the two-channel signals after decorrelation processing may be calculated based on Formula 4, in which:
  • $\eta(MS)$, as the coefficient is written here, is the third cross-correlation coefficient
  • $Mid(n)$ is the nth sampling point of the main channel signal in the two-channel signals after decorrelation processing, and $\overline{Mid}$ is the average value of all sampling points of that main channel signal
  • $Sid(n)$ is the nth sampling point of the secondary channel signal in the two-channel signals after decorrelation processing, and $\overline{Sid}$ is the average value of all sampling points of that secondary channel signal
  • $N$ is the total number of samples of the left channel signal or the right channel signal of the current frame, that is, the frame length of the current frame.
  • in response to the first cross-correlation coefficient being greater than the third cross-correlation coefficient, the second decorrelation processing manner is considered to have achieved the purpose of "decorrelation".
  • the strength of the correlation between signals increases with the absolute value of the correlation coefficient, and, for a partial positive-phase signal, whose correlation coefficient is positive, the smaller the value of the coefficient, the smaller its absolute value and the weaker the positive correlation.
  • since the current frame of the stereo audio signal is a partial positive-phase signal, a first cross-correlation coefficient greater than the third cross-correlation coefficient means that the positive correlation of the two-channel signals before the second decorrelation processing is stronger than that of the two-channel signals after it, so it can be determined that the second decorrelation processing manner achieves the purpose of "decorrelation".
  • In response to determining that the second decorrelation processing method achieves the purpose of "decorrelation", the encoded code stream may be obtained based on the two-channel signal after decorrelation processing, and the flag bit is determined as the second value (for example, 1). The second value indicates that the decorrelation processing adopted by the encoding end is the second decorrelation processing method. The flag bit can be written into the encoded code stream and sent to the decoding device, so that the decoding device can perform decorrelation reconstruction with the corresponding decorrelation reconstruction method based on the flag bit.
  • the second thresholds corresponding to different frames of the stereo audio signal may be different.
  • the first cross-correlation coefficient of the left channel signal and the right channel signal of the current frame may be compared with the second threshold corresponding to the current frame.
  • the stereo audio signal is processed based on the cross-correlation between the left channel signal and the right channel signal of the current frame of the stereo audio signal.
  • The stereo audio signal can also be processed based on the phase between the left channel signal and the right channel signal of the current frame of the stereo audio signal. Specifically, the first phase between the left channel signal and the right channel signal of the current frame can be determined first.
  • If the first phase is within the second interval, the current frame of the stereo audio signal is determined to be a partially positive-phase signal, and the second decorrelation processing method is used to perform decorrelation processing on the current frame to obtain the two-channel signal after decorrelation processing. The third phase of that two-channel signal is then calculated. In response to the first phase being greater than the third phase, which indicates that the second decorrelation processing method achieves the purpose of "decorrelation", the flag bit is determined as the second value, the encoded code stream is obtained based on the two-channel signal after decorrelation processing, and the flag bit is written into the encoded code stream and sent to the decoding device.
  • the second interval may be [0°, 45°].
  • The first cross-correlation coefficient between the left channel signal and the right channel signal of the current frame of the stereo audio signal is determined first. In response to the first cross-correlation coefficient being less than the first threshold, the first decorrelation processing method is used to perform decorrelation processing on the current frame to obtain the two-channel signal after decorrelation processing, and the second cross-correlation coefficient of that two-channel signal is then calculated. In response to the first cross-correlation coefficient being smaller than the second cross-correlation coefficient, the flag bit is determined as the first value, the encoded code stream is obtained based on the two-channel signal after decorrelation processing, and the flag bit is written into the encoded code stream and sent to the decoding device.
  • The value range of the first threshold is (-1, 0). A first cross-correlation coefficient less than the first threshold therefore indicates that the current frame of the stereo audio signal is a partially anti-phase signal, and the first decorrelation processing method is applied to it accordingly so as to ensure the subsequent compression rate. The embodiments of the present disclosure thus provide a judgment method and a decorrelation processing method for partially anti-phase signals, which improve the encoding compression rate of such signals.
  • Fig. 3a is a schematic flowchart of a stereo audio signal processing method provided by an embodiment of the present disclosure. The method is executed by an encoding device. As shown in Fig. 3a, the stereo audio signal processing method may include the following steps:
  • Step 301a Determine a first cross-correlation coefficient between the left channel signal and the right channel signal of the current frame of the stereo audio signal.
  • For details of step 301a, reference may be made to the description of the foregoing embodiments; they are not repeated here.
  • Step 302a In response to the first cross-correlation coefficient being greater than or equal to the first threshold and less than or equal to the second threshold, directly determine the current frame of the stereo audio signal as the two-channel signal after decorrelation processing, determine the flag bit as the third value, obtain the encoded code stream based on the two-channel signal after decorrelation processing, and write the flag bit into the encoded code stream.
  • the first threshold may be preset, and the value range of the first threshold is (-1, 0).
  • the first threshold may be between [-0.5, -0.1].
  • the first threshold may be -0.3.
  • the second threshold may be preset, and the value range of the second threshold may be (0, 1).
  • the second threshold may be between [0.1, 0.5].
  • the second threshold may be 0.3.
  • When the first cross-correlation coefficient is greater than or equal to the first threshold and less than or equal to the second threshold, it indicates that there is no correlation between the left channel signal and the right channel signal of the current frame.
  • In this case the flag bit is determined as the third value (for example, 2). The third value indicates that the encoding end does not use decorrelation processing. The flag bit can be written into the encoded code stream and sent to the decoding device, so that the decoding device can perform the corresponding decorrelation reconstruction based on the flag bit.
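  • The threshold comparison described above can be sketched as a three-way selection. This is a minimal illustration: the function name `select_processing` and the returned labels are hypothetical, and the default thresholds are the example values -0.3 and 0.3 given in the text.

```python
def select_processing(r1: float,
                      first_threshold: float = -0.3,
                      second_threshold: float = 0.3) -> str:
    """Pick a candidate decorrelation method from the first cross-correlation coefficient.

    Returns "first" for a partially anti-phase frame, "second" for a partially
    positive-phase frame, and "none" when the coefficient lies between the two
    thresholds (bounds included), where no decorrelation processing is applied.
    """
    if not (-1.0 < first_threshold < 0.0 < second_threshold < 1.0):
        raise ValueError("thresholds must lie in (-1, 0) and (0, 1) respectively")
    if r1 < first_threshold:
        return "first"
    if r1 > second_threshold:
        return "second"
    return "none"
```

  • The result is only a candidate: for "first" and "second", the encoder still checks whether the chosen processing actually reduced the correlation before it sets the flag bit.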
  • the stereo audio signal is processed based on the cross-correlation between the left channel signal and the right channel signal of the current frame of the stereo audio signal.
  • The stereo audio signal can also be processed based on the phase between the left channel signal and the right channel signal of the current frame of the stereo audio signal. Specifically, the first phase between the left channel signal and the right channel signal of the current frame can be determined first.
  • If the first phase is in the third interval, the current frame of the stereo audio signal is determined to be an uncorrelated signal. The current frame is then directly determined as the two-channel signal after decorrelation processing, the flag bit is determined as the third value, the encoded code stream is obtained based on the two-channel signal after decorrelation processing, and the flag bit is written into the encoded code stream and sent to the decoding device.
  • the third interval may be (45°, 135°).
  • The first cross-correlation coefficient between the left channel signal and the right channel signal of the current frame of the stereo audio signal is determined first. In response to the first cross-correlation coefficient being less than the first threshold, the first decorrelation processing method is used to perform decorrelation processing on the current frame to obtain the two-channel signal after decorrelation processing, and the second cross-correlation coefficient of that two-channel signal is then calculated. In response to the first cross-correlation coefficient being smaller than the second cross-correlation coefficient, the flag bit is determined as the first value, the encoded code stream is obtained based on the two-channel signal after decorrelation processing, and the flag bit is written into the encoded code stream and sent to the decoding device.
  • The value range of the first threshold is (-1, 0). A first cross-correlation coefficient less than the first threshold therefore indicates that the current frame of the stereo audio signal is a partially anti-phase signal, and the first decorrelation processing method is applied to it accordingly so as to ensure the subsequent compression rate. The embodiments of the present disclosure thus provide a judgment method and a decorrelation processing method for partially anti-phase signals, which improve the encoding compression rate of such signals.
  • Fig. 3b is a schematic flowchart of a method for processing a stereo audio signal provided by an embodiment of the present disclosure. The method is executed by an encoding device. As shown in Fig. 3b, the method for processing a stereo audio signal may include the following steps:
  • Step 301b Determine a first cross-correlation coefficient between the left channel signal and the right channel signal of the current frame of the stereo audio signal.
  • For details of step 301b, reference may be made to the description of the foregoing embodiments; they are not repeated here.
  • Step 302b In response to the first cross-correlation coefficient being less than the first threshold, use the first decorrelation processing method to perform decorrelation processing on the current frame of the stereo audio signal to obtain the two-channel signal after decorrelation processing, and calculate the second cross-correlation coefficient of that two-channel signal. In response to the first cross-correlation coefficient being greater than or equal to the second cross-correlation coefficient, directly determine the current frame of the stereo audio signal as the two-channel signal after decorrelation processing, determine the flag bit as the third value, obtain the encoded code stream based on the two-channel signal after decorrelation processing, and write the flag bit into the encoded code stream.
  • In response to the first cross-correlation coefficient being less than the first threshold and greater than or equal to the second cross-correlation coefficient, the first decorrelation processing method is considered not to have achieved the purpose of "decorrelation".
  • When the first cross-correlation coefficient is smaller than the first threshold, it indicates that the current frame of the stereo audio signal is a partially anti-phase signal (that is, the smaller the cross-correlation coefficient, the higher the negative correlation).
  • If the first cross-correlation coefficient (that is, the cross-correlation coefficient before decorrelation processing) is greater than or equal to the second cross-correlation coefficient (that is, the cross-correlation coefficient after decorrelation processing), the current frame of the stereo audio signal can be directly determined as the two-channel signal after decorrelation processing, and the flag bit can be determined as the third value (for example, 2). The encoded code stream is then obtained based on the two-channel signal after decorrelation processing, and the flag bit is written into the encoded code stream and sent to the decoding device, so as to ensure the subsequent coding compression rate.
  • the stereo audio signal is processed based on the cross-correlation between the left channel signal and the right channel signal of the current frame of the stereo audio signal.
  • The stereo audio signal can also be processed based on the phase between the left channel signal and the right channel signal of the current frame of the stereo audio signal. Specifically, the first phase between the left channel signal and the right channel signal of the current frame can be determined first.
  • If the first phase is within the first interval, the current frame of the stereo audio signal is determined to be a partially anti-phase signal, and the first decorrelation processing method is used to perform decorrelation processing on the current frame to obtain the two-channel signal after decorrelation processing. The second phase of that two-channel signal is then calculated. In response to the first phase being less than or equal to the second phase, which indicates that the first decorrelation processing method does not achieve the purpose of "decorrelation", the current frame of the stereo audio signal is directly determined as the two-channel signal after decorrelation processing, the flag bit is determined as the third value, the encoded code stream is obtained based on the two-channel signal after decorrelation processing, and the flag bit is written into the encoded code stream and sent to the decoding device.
  • the first interval may be [135°, 180°].
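  • The phase-based variant can be sketched with the three example intervals given in the text: [135°, 180°] for the first interval, [0°, 45°] for the second, and (45°, 135°) for the third. How the phase itself is measured is not specified here, so the sketch takes it as an input in degrees; the function name and labels are hypothetical.

```python
def classify_by_phase(phase_deg: float) -> str:
    """Classify a frame from the phase between its left and right channel signals.

    Uses the example intervals from the text:
      second interval [0, 45]    -> partially positive-phase signal
      third interval  (45, 135)  -> uncorrelated signal
      first interval  [135, 180] -> partially anti-phase signal
    """
    if not 0.0 <= phase_deg <= 180.0:
        raise ValueError("phase is expected in the range [0, 180] degrees")
    if phase_deg <= 45.0:
        return "positive_phase"
    if phase_deg < 135.0:
        return "uncorrelated"
    return "anti_phase"
```

  • The interval bounds are examples rather than fixed values, so an implementation would presumably expose them as parameters.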
  • The first cross-correlation coefficient between the left channel signal and the right channel signal of the current frame of the stereo audio signal is determined first. In response to the first cross-correlation coefficient being less than the first threshold, the first decorrelation processing method is used to perform decorrelation processing on the current frame to obtain the two-channel signal after decorrelation processing, and the second cross-correlation coefficient of that two-channel signal is then calculated. In response to the first cross-correlation coefficient being smaller than the second cross-correlation coefficient, the flag bit is determined as the first value, the encoded code stream is obtained based on the two-channel signal after decorrelation processing, and the flag bit is written into the encoded code stream and sent to the decoding device.
  • The value range of the first threshold is (-1, 0). A first cross-correlation coefficient less than the first threshold therefore indicates that the current frame of the stereo audio signal is a partially anti-phase signal, and the first decorrelation processing method is applied to it accordingly so as to ensure the subsequent compression rate. The embodiments of the present disclosure thus provide a judgment method and a decorrelation processing method for partially anti-phase signals, which improve the encoding compression rate of such signals.
  • Fig. 3c is a schematic flowchart of a method for processing a stereo audio signal provided by an embodiment of the present disclosure. The method is executed by an encoding device. As shown in Fig. 3c, the method for processing a stereo audio signal may include the following steps:
  • Step 301c Determine a first cross-correlation coefficient between the left channel signal and the right channel signal of the current frame of the stereo audio signal.
  • For details of step 301c, reference may be made to the description of the foregoing embodiments; they are not repeated here.
  • Step 302c In response to the first cross-correlation coefficient being greater than the second threshold, use the second decorrelation processing method to perform decorrelation processing on the current frame of the stereo audio signal to obtain the two-channel signal after decorrelation processing, and calculate the third cross-correlation coefficient of that two-channel signal. In response to the first cross-correlation coefficient being less than or equal to the third cross-correlation coefficient, directly determine the current frame of the stereo audio signal as the two-channel signal after decorrelation processing, determine the flag bit as the third value, obtain the encoded code stream based on the two-channel signal after decorrelation processing, and write the flag bit into the encoded code stream.
  • In response to the first cross-correlation coefficient being greater than the second threshold and less than or equal to the third cross-correlation coefficient, the second decorrelation processing method is considered not to have achieved the purpose of "decorrelation".
  • When the first cross-correlation coefficient is greater than the second threshold, it indicates that the current frame of the stereo audio signal is a positive-phase signal (that is, the larger the cross-correlation coefficient, the higher the positive correlation).
  • If the first cross-correlation coefficient (that is, the cross-correlation coefficient before decorrelation processing) is less than or equal to the third cross-correlation coefficient (that is, the cross-correlation coefficient after decorrelation processing), the current frame of the stereo audio signal can be directly determined as the two-channel signal after decorrelation processing, and the flag bit can be determined as the third value (for example, 2). The encoded code stream is then obtained based on the two-channel signal after decorrelation processing, and the flag bit is written into the encoded code stream and sent to the decoding device.
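  • Taken together, the encoder-side decision for one frame can be sketched as below. This is a sketch under stated assumptions: the cross-correlation coefficient is taken in normalized (Pearson) form, the two downmix routines are passed in as callables because their formulas are not reproduced here, the flag values 0, 1 and 2 and the thresholds -0.3 and 0.3 are the example values from the text, and every name is hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Channels = Tuple[List[float], List[float]]
Downmix = Callable[[Sequence[float], Sequence[float]], Channels]

FLAG_FIRST, FLAG_SECOND, FLAG_NONE = 0, 1, 2  # example flag values from the text


def correlation(a: Sequence[float], b: Sequence[float]) -> float:
    """Normalized cross-correlation (assumed Pearson form); 0.0 for a constant channel."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b))
    return 0.0 if den == 0.0 else num / den


def decide_frame(left: Sequence[float], right: Sequence[float],
                 first_downmix: Downmix, second_downmix: Downmix,
                 first_threshold: float = -0.3,
                 second_threshold: float = 0.3) -> Tuple[int, List[float], List[float]]:
    """Return (flag, channel_a, channel_b) to hand to the rest of the encoder."""
    r1 = correlation(left, right)
    if r1 < first_threshold:
        # Partially anti-phase frame: keep the result only if the coefficient rose.
        mid, sid = first_downmix(left, right)
        if r1 < correlation(mid, sid):
            return FLAG_FIRST, mid, sid
    elif r1 > second_threshold:
        # Partially positive-phase frame: keep the result only if the coefficient fell.
        mid, sid = second_downmix(left, right)
        if r1 > correlation(mid, sid):
            return FLAG_SECOND, mid, sid
    # Uncorrelated frame, or decorrelation did not help: pass the frame through.
    return FLAG_NONE, list(left), list(right)
```

  • Passing the downmix routines as parameters keeps the decision logic independent of the exact sum and difference formulas, which only need to be invertible at the decoding end.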
  • the stereo audio signal is processed based on the cross-correlation between the left channel signal and the right channel signal of the current frame of the stereo audio signal.
  • The stereo audio signal can also be processed based on the phase between the left channel signal and the right channel signal of the current frame of the stereo audio signal. Specifically, the first phase between the left channel signal and the right channel signal of the current frame can be determined first.
  • If the first phase is within the second interval, the current frame of the stereo audio signal is determined to be a partially positive-phase signal, and the second decorrelation processing method is used to perform decorrelation processing on the current frame to obtain the two-channel signal after decorrelation processing. The third phase of that two-channel signal is then calculated. In response to the first phase being less than or equal to the third phase, which indicates that the second decorrelation processing method does not achieve the purpose of "decorrelation", the current frame of the stereo audio signal is directly determined as the two-channel signal after decorrelation processing, the flag bit is determined as the third value, the encoded code stream is obtained based on the two-channel signal after decorrelation processing, and the flag bit is written into the encoded code stream and sent to the decoding device.
  • the second interval may be [0°, 45°].
  • The first cross-correlation coefficient between the left channel signal and the right channel signal of the current frame of the stereo audio signal is determined first. In response to the first cross-correlation coefficient being less than the first threshold, the first decorrelation processing method is used to perform decorrelation processing on the current frame to obtain the two-channel signal after decorrelation processing, and the second cross-correlation coefficient of that two-channel signal is then calculated. In response to the first cross-correlation coefficient being smaller than the second cross-correlation coefficient, the flag bit is determined as the first value, the encoded code stream is obtained based on the two-channel signal after decorrelation processing, and the flag bit is written into the encoded code stream and sent to the decoding device.
  • The value range of the first threshold is (-1, 0). A first cross-correlation coefficient less than the first threshold therefore indicates that the current frame of the stereo audio signal is a partially anti-phase signal, and the first decorrelation processing method is applied to it accordingly so as to ensure the subsequent compression rate. The embodiments of the present disclosure thus provide a judgment method and a decorrelation processing method for partially anti-phase signals, which improve the encoding compression rate of such signals.
  • Fig. 4a is a schematic flowchart of a method for processing a stereo audio signal provided by an embodiment of the present disclosure. The method is executed by a decoding device. As shown in Fig. 4a, the method for processing a stereo audio signal may include the following steps:
  • Step 401 Obtain an encoded code stream sent by an encoding device.
  • Step 402 Determine the two-channel signal and the flag bit after decorrelation processing based on the coded code stream.
  • FIG. 4b is a flowchart of determining the two-channel signal after decorrelation processing based on the encoded code stream, provided by an embodiment of the present disclosure. As shown in FIG. 4b, the two-channel signal after decorrelation processing may be determined based on the encoded code stream as follows:
  • After the encoded code stream is obtained, it is first parsed to obtain the coded bit stream, the flag bit, the LSB signal, the sign bit signal, the quantized LPC parameters, and the wavelet side information. An entropy decoder then entropy-decodes the coded bit stream to obtain a decoded signal, and a post-processor processes the decoded signal based on the LSB signal and the sign bit signal to generate the prediction residual. Next, a linear predictor reconstructs the sub-band signals from the prediction residual according to the quantized LPC parameters, and an integer lifting wavelet reconstructs the sub-band signals based on the wavelet side information to obtain the two-channel signal after decorrelation processing.
  • Step 403 In response to the flag bit being the first value, perform decorrelation reconstruction on the two-channel signal after decorrelation processing by using the first decorrelation reconstruction method, and output the decorrelation-reconstructed audio signal.
  • When the flag bit is the first value (for example, 0), the first decorrelation reconstruction method, which corresponds to the first decorrelation processing method, may be used to perform decorrelation reconstruction on the two-channel signal after decorrelation processing and to output the decorrelation-reconstructed audio signal.
  • the two-channel signal after decorrelation processing may include a main channel signal and a sub-channel signal.
  • the first de-correlation reconstruction method may include: performing de-correlation reconstruction on the two-channel signals after de-correlation processing based on Formula 5; Formula 5 is:
  • Mid(n) is the main channel signal in the two-channel signal after decorrelation processing
  • Sid(n) is the sub-channel signal in the two-channel signal after decorrelation processing
  • L(n) is the left channel signal
  • R(n) is the right channel signal.
  • The first cross-correlation coefficient between the left channel signal and the right channel signal of the current frame of the stereo audio signal is determined first. In response to the first cross-correlation coefficient being less than the first threshold, the first decorrelation processing method is used to perform decorrelation processing on the current frame to obtain the two-channel signal after decorrelation processing, and the second cross-correlation coefficient of that two-channel signal is then calculated. In response to the first cross-correlation coefficient being smaller than the second cross-correlation coefficient, the flag bit is determined as the first value, the encoded code stream is obtained based on the two-channel signal after decorrelation processing, and the flag bit is written into the encoded code stream and sent to the decoding device.
  • The value range of the first threshold is (-1, 0). A first cross-correlation coefficient less than the first threshold therefore indicates that the current frame of the stereo audio signal is a partially anti-phase signal, and the first decorrelation processing method is applied to it accordingly so as to ensure the subsequent compression rate. The embodiments of the present disclosure thus provide a judgment method and a decorrelation processing method for partially anti-phase signals, which improve the encoding compression rate of such signals.
  • Fig. 5 is a schematic flowchart of a stereo audio signal processing method provided by an embodiment of the present disclosure. The method is executed by a decoding device. As shown in Fig. 5, the stereo audio signal processing method may include the following steps:
  • Step 501 Obtain an encoded stream sent by an encoding device.
  • Step 502 Determine the two-channel signal and the flag bit after decorrelation processing based on the coded code stream.
  • Step 503 in response to the flag being the second value, perform decorrelation reconstruction on the two-channel signals after decorrelation processing by using a second decorrelation reconstruction manner, and output the decorrelation reconstructed audio signal.
  • When the flag bit is the second value (for example, 1), the second decorrelation reconstruction method, which corresponds to the second decorrelation processing method, may be used to perform decorrelation reconstruction on the two-channel signal after decorrelation processing and to output the decorrelation-reconstructed audio signal.
  • the two-channel signal after decorrelation processing may include a main channel signal and a sub-channel signal.
  • the second de-correlation reconstruction method may include: performing de-correlation reconstruction on the two-channel signals after the de-correlation processing based on Formula 6;
  • Formula 6 is:
  • Mid(n) is the main channel signal of the two-channel signal after decorrelation processing
  • Sid(n) is the sub-channel signal of the two-channel signal after decorrelation processing
  • L(n) is the left channel signal
  • R(n) is the right channel signal.
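  • Formulas 5 and 6 are the inverses of the two sum and difference downmixes. As a hedged illustration only, suppose the second downmix is the conventional Mid = (L + R) / 2, Sid = (L - R) / 2, and the first downmix swaps the roles for anti-phase frames, Mid = (L - R) / 2, Sid = (L + R) / 2. Both forms are assumptions rather than the formulas of the disclosure, and the function names are hypothetical. Under those assumptions the round trip looks like this:

```python
from typing import List, Sequence, Tuple

Channels = Tuple[List[float], List[float]]


def second_downmix(left: Sequence[float], right: Sequence[float]) -> Channels:
    """Assumed second downmix: Mid = (L + R) / 2, Sid = (L - R) / 2."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    sid = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, sid


def second_reconstruct(mid: Sequence[float], sid: Sequence[float]) -> Channels:
    """Inverse of the assumed second downmix: L = Mid + Sid, R = Mid - Sid."""
    return [m + s for m, s in zip(mid, sid)], [m - s for m, s in zip(mid, sid)]


def first_downmix(left: Sequence[float], right: Sequence[float]) -> Channels:
    """Assumed first downmix for anti-phase frames: Mid = (L - R) / 2, Sid = (L + R) / 2."""
    mid = [(l - r) / 2 for l, r in zip(left, right)]
    sid = [(l + r) / 2 for l, r in zip(left, right)]
    return mid, sid


def first_reconstruct(mid: Sequence[float], sid: Sequence[float]) -> Channels:
    """Inverse of the assumed first downmix: L = Sid + Mid, R = Sid - Mid."""
    return [s + m for m, s in zip(mid, sid)], [s - m for m, s in zip(mid, sid)]
```

  • The sketch uses floating-point halves for readability. A lossless codec would need an integer-exact variant of the same transform, which is outside the scope of this illustration.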
  • The first cross-correlation coefficient between the left channel signal and the right channel signal of the current frame of the stereo audio signal is determined first. In response to the first cross-correlation coefficient being less than the first threshold, the first decorrelation processing method is used to perform decorrelation processing on the current frame to obtain the two-channel signal after decorrelation processing, and the second cross-correlation coefficient of that two-channel signal is then calculated. In response to the first cross-correlation coefficient being smaller than the second cross-correlation coefficient, the flag bit is determined as the first value, the encoded code stream is obtained based on the two-channel signal after decorrelation processing, and the flag bit is written into the encoded code stream and sent to the decoding device.
  • The value range of the first threshold is (-1, 0). A first cross-correlation coefficient less than the first threshold therefore indicates that the current frame of the stereo audio signal is a partially anti-phase signal, and the first decorrelation processing method is applied to it accordingly so as to ensure the subsequent compression rate. The embodiments of the present disclosure thus provide a judgment method and a decorrelation processing method for partially anti-phase signals, which improve the encoding compression rate of such signals.
  • FIG. 6 is a schematic flowchart of a method for processing a stereo audio signal provided by an embodiment of the present disclosure. The method is executed by a decoding device. As shown in FIG. 6 , the method for processing a stereo audio signal may include the following steps:
  • Step 601. Obtain an encoded stream sent by an encoding device.
  • Step 602 Determine the two-channel signal and the flag bit after decorrelation processing based on the coded code stream.
  • Step 603 In response to the flag bit being the third value, directly determine the two-channel signal after decorrelation processing as the audio signal after decorrelation and reconstruction.
  • When the flag bit is the third value (for example, 2), it indicates that the encoding device has not performed decorrelation processing. Based on this, the two-channel signal after decorrelation processing can be directly determined as the audio signal after decorrelation reconstruction.
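  • The three decoder-side cases (steps 403, 503 and 603) amount to a dispatch on the flag bit. The following is a minimal sketch assuming the example flag values 0, 1 and 2; the two reconstruction routines are passed in as callables, and all names are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Channels = Tuple[List[float], List[float]]
Reconstruct = Callable[[Sequence[float], Sequence[float]], Channels]

FLAG_FIRST, FLAG_SECOND, FLAG_NONE = 0, 1, 2  # example flag values from the text


def reconstruct_frame(flag: int,
                      channel_a: Sequence[float],
                      channel_b: Sequence[float],
                      first_reconstruct: Reconstruct,
                      second_reconstruct: Reconstruct) -> Channels:
    """Undo the encoder-side decorrelation indicated by the flag bit."""
    if flag == FLAG_FIRST:
        return first_reconstruct(channel_a, channel_b)
    if flag == FLAG_SECOND:
        return second_reconstruct(channel_a, channel_b)
    if flag == FLAG_NONE:
        # The encoder passed the frame through, so the decoded channels are already L and R.
        return list(channel_a), list(channel_b)
    raise ValueError(f"unknown flag value: {flag}")
```

  • Rejecting unknown flag values is a design choice of this sketch; the text does not say how a decoder should treat a flag outside the three defined values.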
  • The first cross-correlation coefficient between the left channel signal and the right channel signal of the current frame of the stereo audio signal is determined first. In response to the first cross-correlation coefficient being less than the first threshold, the first decorrelation processing method is used to perform decorrelation processing on the current frame to obtain the two-channel signal after decorrelation processing, and the second cross-correlation coefficient of that two-channel signal is then calculated. In response to the first cross-correlation coefficient being smaller than the second cross-correlation coefficient, the flag bit is determined as the first value, the encoded code stream is obtained based on the two-channel signal after decorrelation processing, and the flag bit is written into the encoded code stream and sent to the decoding device.
  • The value range of the first threshold is (-1, 0). A first cross-correlation coefficient less than the first threshold therefore indicates that the current frame of the stereo audio signal is a partially anti-phase signal, and the first decorrelation processing method is applied to it accordingly so as to ensure the subsequent compression rate. The embodiments of the present disclosure thus provide a judgment method and a decorrelation processing method for partially anti-phase signals, which improve the encoding compression rate of such signals.
  • FIG. 7 is a schematic structural diagram of a stereo audio signal processing device provided by an embodiment of the present disclosure, which is applied to the encoding end. As shown in FIG. 7 , the device 700 may include:
  • a determining module 701, configured to determine the first cross-correlation coefficient between the left channel signal and the right channel signal of the current frame of the stereo audio signal; and
  • a processing module 702, configured to: in response to the first cross-correlation coefficient being smaller than the first threshold, perform decorrelation processing on the current frame of the stereo audio signal by using the first decorrelation processing method to obtain the two-channel signal after decorrelation processing, and calculate the second cross-correlation coefficient of that two-channel signal; and, in response to the first cross-correlation coefficient being smaller than the second cross-correlation coefficient, determine the flag bit as the first value, obtain the encoded code stream based on the two-channel signal after decorrelation processing, and write the flag bit into the encoded code stream and send it to the decoding device, where the value range of the first threshold is (-1, 0).
  • In the stereo audio signal processing device provided by the embodiments of the present disclosure, the first cross-correlation coefficient of the left channel signal and the right channel signal of the current frame of the stereo audio signal is determined first. In response to the first cross-correlation coefficient being smaller than the first threshold, the first decorrelation processing manner is used to perform decorrelation processing on the current frame to obtain the decorrelated two-channel signal, and the second cross-correlation coefficient of the decorrelated two-channel signal is then calculated. In response to the first cross-correlation coefficient being smaller than the second cross-correlation coefficient, the flag bit is determined to be the first value, the encoded code stream is obtained based on the decorrelated two-channel signal, and the flag bit is written into the encoded code stream sent to the decoding device.
  • The value range of the first threshold is (-1, 0). A first cross-correlation coefficient below the first threshold therefore indicates that the current frame of the stereo audio signal is a partial anti-phase signal, and the first decorrelation processing manner is applied to it so as to preserve the subsequent compression ratio. The embodiments of the present disclosure thus provide both a detection method and a decorrelation processing method for partial anti-phase signals, which greatly improves their encoding compression ratio.
  • the first decorrelation processing manner includes first sum difference downmix processing.
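  • the first sum and difference downmix processing includes: processing the left channel signal and the right channel signal based on Formula 1 to obtain the main channel signal and the secondary channel signal, where: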
  • Mid(n) is the main channel signal
  • Sid(n) is the secondary channel signal
  • L(n) is the left channel signal
  • R(n) is the right channel signal.
  • the device is also used for:
  • the second decorrelation processing manner includes second sum and difference downmix processing.
  • the second sum and difference downmix processing includes:
  • the left channel signal and the right channel signal are processed based on Formula 2 to obtain the main channel signal;
  • the Formula 2 is:
  • Mid(n) is the main channel signal
  • Sid(n) is the secondary channel signal
  • L(n) is the left channel signal
  • R(n) is the right channel signal.
  • the device is also used for:
  • [...] second cross-correlation coefficient, or the first cross-correlation coefficient is greater than the second threshold and the first cross-correlation coefficient is less than or equal to the third cross-correlation coefficient: directly take the current frame of the stereo audio signal as the two-channel signal after decorrelation processing, determine that the flag bit is a third value, obtain an encoded code stream based on the two-channel signal after decorrelation processing, and write the flag bit into the encoded code stream.
  • the determination module is further configured to:
  • η(LR) is the cross-correlation coefficient of the left channel signal and the right channel signal of the current frame
  • L(n) is the n-th sample of the left channel signal of the current frame, and $\bar{L}$ is the average value of all samples of the left channel signal of the current frame
  • R(n) is the n-th sample of the right channel signal of the current frame, and $\bar{R}$ is the average value of all samples of the right channel signal of the current frame
  • N is the total number of samples of the left channel signal or the right channel signal of the current frame, that is, the frame length of the current frame.
  • the two-channel signal after decorrelation processing includes a main channel signal and a secondary channel signal
  • the device is also used for:
  • η(MS) is the second cross-correlation coefficient or the third cross-correlation coefficient
  • Mid(n) is the n-th sample of the main channel signal in the two-channel signal after decorrelation processing, and $\overline{Mid}$ is the average value of all samples of the main channel signal in the two-channel signal after decorrelation processing
  • Sid(n) is the n-th sample of the secondary channel signal in the two-channel signal after decorrelation processing, and $\overline{Sid}$ is the average value of all samples of the secondary channel signal in the two-channel signal after decorrelation processing
  • N is the total number of samples of the left channel signal or the right channel signal of the current frame, that is, the frame length of the current frame.
  • In the stereo audio signal processing device provided by the embodiments of the present disclosure, the first cross-correlation coefficient of the left channel signal and the right channel signal of the current frame of the stereo audio signal is determined first. In response to the first cross-correlation coefficient being smaller than the first threshold, the first decorrelation processing manner is used to perform decorrelation processing on the current frame to obtain the decorrelated two-channel signal, and the second cross-correlation coefficient of the decorrelated two-channel signal is then calculated. In response to the first cross-correlation coefficient being smaller than the second cross-correlation coefficient, the flag bit is determined to be the first value, the encoded code stream is obtained based on the decorrelated two-channel signal, and the flag bit is written into the encoded code stream sent to the decoding device.
  • The value range of the first threshold is (-1, 0). A first cross-correlation coefficient below the first threshold therefore indicates that the current frame of the stereo audio signal is a partial anti-phase signal, and the first decorrelation processing manner is applied to it so as to preserve the subsequent compression ratio. The embodiments of the present disclosure thus provide both a detection method and a decorrelation processing method for partial anti-phase signals, which greatly improves their encoding compression ratio.
  • Fig. 8 is a schematic structural diagram of a stereo audio signal processing device provided by an embodiment of the present disclosure, which is applied to the decoding end. As shown in Fig. 8, the device 800 may include:
  • An acquisition module 801 configured to acquire an encoded stream sent by an encoding device
  • a determination module 802 configured to determine the two-channel signal and the flag bit after decorrelation processing based on the encoded code stream
  • the processing module 803 is configured to, in response to the flag bit being a first value, perform decorrelation reconstruction on the two-channel signal after decorrelation processing by using a first decorrelation reconstruction manner, and output a decorrelation reconstructed audio signal.
  • the first decorrelation reconstruction manner includes:
  • Mid(n) is the main channel signal in the two-channel signal after decorrelation processing
  • Sid(n) is the sub-channel signal in the two-channel signal after decorrelation processing
  • L(n) is the left channel signal
  • R(n) is the right channel signal.
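The reconstruction formula itself is an equation image that is not reproduced in this text. The following is only a minimal sketch of how a decoder might dispatch on the flag bit, under two loudly stated assumptions: the first value is encoded as 0, and the encoder's first downmix is Mid(n) = (L(n) - R(n)) / 2, Sid(n) = (L(n) + R(n)) / 2, so that the reconstruction is its algebraic inverse. The function names are hypothetical.

```python
FIRST_VALUE = 0  # ASSUMPTION: numeric encoding of the "first value" of the flag bit


def reconstruct_first_manner(mid, sid):
    """Invert the ASSUMED first downmix Mid=(L-R)/2, Sid=(L+R)/2."""
    left = [m + s for m, s in zip(mid, sid)]   # L(n) = Sid(n) + Mid(n)
    right = [s - m for m, s in zip(mid, sid)]  # R(n) = Sid(n) - Mid(n)
    return left, right


def decode_frame(flag, mid, sid):
    """Choose the reconstruction manner from the flag read from the code stream."""
    if flag == FIRST_VALUE:
        return reconstruct_first_manner(mid, sid)
    # Other flag values (second and third reconstruction paths) are not sketched here.
    raise NotImplementedError("only the first reconstruction manner is sketched")
```

Because the flag travels in the code stream, the decoder never has to recompute any correlation coefficient; it only needs the inverse of whichever downmix the encoder reported.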
  • the device is also used for:
  • decorrelation reconstruction is performed on the two-channel signals after decorrelation processing by using a second decorrelation reconstruction manner, and a decorrelation reconstructed audio signal is output.
  • the second decorrelation reconstruction manner includes:
  • Mid(n) is the main channel signal of the two-channel signal after decorrelation processing
  • Sid(n) is the sub-channel signal of the two-channel signal after decorrelation processing
  • L(n) is the left channel signal
  • R(n) is the right channel signal.
  • the device is also used for:
  • Fig. 9 is a block diagram of a user equipment UE900 provided by an embodiment of the present disclosure.
  • the UE 900 may be a mobile phone, a computer, a digital broadcast terminal device, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, and the like.
  • UE900 may include at least one of the following components: a processing component 902, a memory 904, a power supply component 906, a multimedia component 908, an audio component 910, an input/output (I/O) interface 912, a sensor component 913, and a communication component 916.
  • the processing component 902 generally controls the overall operations of the UE 900, such as those associated with display, phone calls, data communications, camera operations, and recording operations.
  • the processing component 902 may include at least one processor 920 to execute instructions to complete all or part of the steps of the above-mentioned method.
  • processing component 902 can include at least one module to facilitate interaction between processing component 902 and other components.
  • processing component 902 may include a multimedia module to facilitate interaction between multimedia component 908 and processing component 902 .
  • the memory 904 is configured to store various types of data to support operations at the UE 900 . Examples of such data include instructions for any application or method operating on UE900, contact data, phonebook data, messages, pictures, videos, etc.
  • the memory 904 can be implemented by any type of volatile or non-volatile storage device or their combination, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable Programmable Read Only Memory (EPROM), Programmable Read Only Memory (PROM), Read Only Memory (ROM), Magnetic Memory, Flash Memory, Magnetic or Optical Disk.
  • the power supply component 906 provides power to various components of the UE 900 .
  • Power component 906 may include a power management system, at least one power supply, and other components associated with generating, managing, and distributing power for UE 900 .
  • the multimedia component 908 includes a screen providing an output interface between the UE 900 and the user.
  • the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from a user.
  • the touch panel includes at least one touch sensor to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense a boundary of a touch or slide action, but also detect a wake-up time and pressure related to the touch or slide operation.
  • the multimedia component 908 includes a front camera and/or a rear camera. When the UE900 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera can be a fixed optical lens system or have focal length and optical zoom capability.
  • the audio component 910 is configured to output and/or input audio signals.
  • the audio component 910 includes a microphone (MIC), which is configured to receive an external audio signal when the UE 900 is in an operation mode, such as a call mode, a recording mode, and a voice recognition mode. Received audio signals may be further stored in memory 904 or sent via communication component 916 .
  • the audio component 910 also includes a speaker for outputting audio signals.
  • the I/O interface 912 provides an interface between the processing component 902 and a peripheral interface module.
  • the peripheral interface module may be a keyboard, a click wheel, a button, and the like. These buttons may include, but are not limited to: a home button, volume buttons, start button, and lock button.
  • the sensor component 913 includes at least one sensor for providing various aspects of state assessment for the UE 900 .
  • the sensor component 913 can detect the open/closed state of the UE 900 and the relative positioning of components, such as the display and the keypad of the UE 900. The sensor component 913 can also detect a change in position of the UE 900 or of one of its components, the presence or absence of user contact with the UE 900, the orientation or acceleration/deceleration of the UE 900, and temperature changes of the UE 900.
  • the sensor assembly 913 may include a proximity sensor configured to detect the presence of nearby objects in the absence of any physical contact.
  • the sensor assembly 913 may also include an optical sensor, such as a CMOS or CCD image sensor, for use in imaging applications.
  • the sensor component 913 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor or a temperature sensor.
  • Communication component 916 is configured to facilitate wired or wireless communications between UE 900 and other devices.
  • UE900 can access wireless networks based on communication standards, such as WiFi, 2G or 3G, or a combination thereof.
  • the communication component 916 receives broadcast signals or broadcast related information from an external broadcast management system via a broadcast channel.
  • the communication component 916 also includes a near field communication (NFC) module to facilitate short-range communication.
  • the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology and other technologies.
  • UE 900 may be implemented by at least one application specific integrated circuit (ASIC), digital signal processor (DSP), digital signal processing device (DSPD), programmable logic device (PLD), field programmable gate array (FPGA), controller, microcontroller, microprocessor, or other electronic component, for executing the above method.
  • Fig. 10 is a block diagram of a network side device 1000 provided by an embodiment of the present disclosure.
  • the network side device 1000 may be provided as a network side device.
  • the network side device 1000 includes a processing component 1022, which further includes at least one processor, and a memory resource represented by a memory 1032 for storing instructions executable by the processing component 1022, such as an application program.
  • the application program stored in memory 1032 may include one or more modules each corresponding to a set of instructions.
  • the processing component 1022 is configured to execute instructions, so as to execute any of the aforementioned methods applied to the network side device, for example, the method shown in FIG. 1.
  • the network side device 1000 may also include a power supply component 1026 configured to perform power management of the network side device 1000, a wired or wireless network interface 1050 configured to connect the network side device 1000 to a network, and an input/output (I/O) interface 1058.
  • the network side device 1000 can operate based on the operating system stored in the memory 1032, such as Windows ServerTM, Mac OS XTM, UnixTM, LinuxTM, Free BSDTM or similar.
  • the methods provided in the embodiments of the present disclosure are introduced from the perspectives of the network side device and the UE respectively.
  • the network side device and the UE may include a hardware structure and a software module, and implement the above functions in the form of a hardware structure, a software module, or a hardware structure plus a software module.
  • a certain function among the above-mentioned functions may be implemented in the form of a hardware structure, a software module, or a hardware structure plus a software module.
  • the communication device may include a transceiver module and a processing module.
  • the transceiver module may include a sending module and/or a receiving module, the sending module is used to realize the sending function, the receiving module is used to realize the receiving function, and the sending and receiving module can realize the sending function and/or the receiving function.
  • the communication device may be a terminal device (such as the terminal device in the foregoing method embodiments), may also be a device in the terminal device, and may also be a device that can be matched and used with the terminal device.
  • the communication device may be a network device, or a device in the network device, or a device that can be matched with the network device.
  • the communication device may be a network device, or a terminal device (such as the terminal device in the foregoing method embodiments), or a chip, a chip system, or a processor that supports the network device to implement the above method, or it may be a terminal device that supports A chip, a chip system, or a processor for realizing the above method.
  • the device can be used to implement the methods described in the above method embodiments, and for details, refer to the descriptions in the above method embodiments.
  • a communications device may include one or more processors.
  • the processor may be a general purpose processor or a special purpose processor or the like.
  • it can be a baseband processor or a central processing unit.
  • the baseband processor can be used to process communication protocols and communication data, and the central processing unit can be used to control the communication device (such as a network side device, baseband chip, terminal device, terminal device chip, DU, or CU), execute computer programs, and process data of the computer programs.
  • the communication device may further include one or more memories, on which computer programs may be stored, and the processor executes the computer programs, so that the communication device executes the methods described in the foregoing method embodiments.
  • data may also be stored in the memory.
  • the communication device and the memory can be set separately or integrated together.
  • the communication device may further include a transceiver and an antenna.
  • the transceiver may be referred to as a transceiver unit, a transceiver, or a transceiver circuit, etc., and is used to implement a transceiver function.
  • the transceiver may include a receiver and a transmitter, and the receiver may be called a receiver or a receiving circuit for realizing a receiving function; the transmitter may be called a transmitter or a sending circuit for realizing a sending function.
  • the communication device may further include one or more interface circuits.
  • the interface circuit is used to receive code instructions and transmit them to the processor.
  • the processor executes the code instructions to enable the communication device to execute the methods described in the foregoing method embodiments.
  • the communication device is a terminal device (such as the terminal device in the foregoing method embodiments): the processor is configured to execute any of the methods shown in FIG. 1-FIG. 4a.
  • the communication device is a network device: the transceiver is used to execute the method shown in any one of Fig. 5-Fig. 7 .
  • the processor may include a transceiver for implementing receiving and transmitting functions.
  • the transceiver may be a transceiver circuit, or an interface, or an interface circuit.
  • the transceiver circuits, interfaces or interface circuits for realizing the functions of receiving and sending can be separated or integrated together.
  • the above-mentioned transceiver circuit, interface or interface circuit may be used for reading and writing code/data, or the above-mentioned transceiver circuit, interface or interface circuit may be used for signal transmission or transfer.
  • the processor may store a computer program, and the computer program runs on the processor to enable the communication device to execute the methods described in the foregoing method embodiments.
  • a computer program may be embedded in a processor, in which case the processor may be implemented by hardware.
  • the communication device may include a circuit, and the circuit may implement the function of sending or receiving or communicating in the foregoing method embodiments.
  • the processors and transceivers described in this disclosure can be implemented on integrated circuits (ICs), analog ICs, radio frequency integrated circuits (RFICs), mixed-signal ICs, application specific integrated circuits (ASICs), printed circuit boards (PCBs), electronic equipment, etc.
  • the processor and transceiver can also be fabricated using various IC process technologies, such as complementary metal oxide semiconductor (CMOS), N-type metal oxide semiconductor (NMOS), P-type metal oxide semiconductor (PMOS), bipolar junction transistor (BJT), bipolar CMOS (BiCMOS), silicon germanium (SiGe), gallium arsenide (GaAs), etc.
  • the communication device described in the above embodiments may be a network device or a terminal device (such as the terminal device in the foregoing method embodiments), but the scope of the communication device described in this disclosure is not limited thereto, and the structure of the communication device need not be limited either.
  • a communication device may be a stand-alone device or may be part of a larger device.
  • the communication device may be:
  • a set of one or more ICs may also include storage components for storing data and computer programs;
  • ASIC such as modem (Modem);
  • the communications device may be a chip or system-on-a-chip
  • the chip includes a processor and an interface.
  • the number of processors may be one or more, and the number of interfaces may be more than one.
  • the chip also includes a memory, which is used to store necessary computer programs and data.
  • An embodiment of the present disclosure also provides a system for determining the duration of a side link, the system includes a communication device as a terminal device (such as the first terminal device in the method embodiment above) in the foregoing embodiments and a communication device as a network device, Alternatively, the system includes the communication device as the terminal device in the foregoing embodiments (such as the first terminal device in the foregoing method embodiment) and the communication device as a network device.
  • the present disclosure also provides a readable storage medium on which instructions are stored, and when the instructions are executed by a computer, the functions of any one of the above method embodiments are realized.
  • the present disclosure also provides a computer program product, which implements the functions of any one of the above method embodiments when executed by a computer.
  • all or part of them may be implemented by software, hardware, firmware or any combination thereof.
  • When implemented using software, it may be implemented in whole or in part in the form of a computer program product.
  • the computer program product comprises one or more computer programs. When the computer program is loaded and executed on the computer, all or part of the processes or functions according to the embodiments of the present disclosure will be generated.
  • the computer can be a general purpose computer, a special purpose computer, a computer network, or other programmable devices.
  • the computer program can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer program can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, radio, or microwave).
  • the computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or a data center integrated with one or more available media.
  • the available medium may be a magnetic medium (for example, a floppy disk, a hard disk, a magnetic tape), an optical medium (for example, a high-density digital video disc (digital video disc, DVD)), or a semiconductor medium (for example, a solid state disk (solid state disk, SSD)) etc.
  • At least one in the present disclosure can also be described as one or more, and a plurality can be two, three, four or more, and the present disclosure is not limited.
  • Technical features in the present disclosure that are distinguished by "first", "second", "third", "A", "B", "C", "D", and so on carry no sequence or order of magnitude among them.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Mathematical Physics (AREA)
  • Stereophonic System (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

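The present disclosure provides a stereo audio signal processing method and apparatus, an encoding device, a decoding device, and a storage medium, and belongs to the field of communication technologies. The method includes: determining a first cross-correlation coefficient between the left channel signal and the right channel signal of the current frame of a stereo audio signal; in response to the first cross-correlation coefficient being less than a first threshold, performing decorrelation processing on the current frame of the stereo audio signal in a first decorrelation processing manner to obtain a decorrelated two-channel signal, and calculating a second cross-correlation coefficient of the decorrelated two-channel signal; and, in response to the first cross-correlation coefficient being less than the second cross-correlation coefficient, determining a flag bit to be a first value, obtaining an encoded code stream based on the decorrelated two-channel signal, and writing the flag bit into the encoded code stream, where the value range of the first threshold is (-1, 0). The method provided by the present disclosure can greatly improve the encoding compression ratio of partial anti-phase signals.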

Description

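Stereo audio signal processing method and apparatus, encoding device, decoding device, and storage medium
Technical Field
The present disclosure relates to the field of communication technologies, and in particular to a stereo audio signal processing method and apparatus, an encoding device, a decoding device, and a storage medium.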
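Background
Lossless coding is widely used because it meets the requirements of high-quality audio playback and lossless storage. When a stereo audio signal is losslessly encoded, it usually first undergoes decorrelation processing to improve the encoding compression ratio.
In the related art, decorrelation processing mainly works as follows: a threshold between 0 and 1 is set, and the correlation coefficient between the left channel signal and the right channel signal of the current frame of the stereo audio signal is calculated. When the correlation coefficient is greater than the threshold, the left and right channel signals of the current frame are considered correlated, that is, the current frame is a correlated signal, and decorrelation processing is performed on the left and right channel signals of the current frame. When the correlation coefficient is less than or equal to the threshold, the system considers the left and right channel signals of the current frame uncorrelated and handles the frame as an uncorrelated signal, that is, the current frame of the stereo audio signal is directly taken as the decorrelated two-channel signal.
However, a correlated current frame can take one of two forms: a partial in-phase signal or a partial anti-phase signal (a frame whose two channels are predominantly in phase or predominantly out of phase, respectively). To improve the compression ratio, the two forms require different decorrelation processing, yet the decorrelation processing in the related art improves the encoding compression ratio only for partial in-phase signals and not for partial anti-phase signals.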
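Summary
The present disclosure provides a stereo audio signal processing method and apparatus, a user equipment, a network side device, and a storage medium, to solve the technical problem that the decorrelation processing methods in the related art yield a low encoding compression ratio.
A stereo audio signal processing method provided by an embodiment of one aspect of the present disclosure is applied to an encoding device and includes:
determining a first cross-correlation coefficient between the left channel signal and the right channel signal of the current frame of a stereo audio signal;
in response to the first cross-correlation coefficient being less than a first threshold, performing decorrelation processing on the current frame of the stereo audio signal in a first decorrelation processing manner to obtain a decorrelated two-channel signal, calculating a second cross-correlation coefficient of the decorrelated two-channel signal, and, in response to the first cross-correlation coefficient being less than the second cross-correlation coefficient, determining a flag bit to be a first value, obtaining an encoded code stream based on the decorrelated two-channel signal, and writing the flag bit into the encoded code stream, where the value range of the first threshold is (-1, 0).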
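A stereo audio signal processing method provided by an embodiment of another aspect of the present disclosure is applied to a decoding device and includes:
acquiring an encoded code stream sent by an encoding device;
determining a decorrelated two-channel signal and a flag bit based on the encoded code stream;
in response to the flag bit being a first value, performing decorrelation reconstruction on the decorrelated two-channel signal in a first decorrelation reconstruction manner, and outputting the reconstructed audio signal.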
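A stereo audio signal processing apparatus provided by an embodiment of yet another aspect of the present disclosure includes:
a determining module, configured to determine a first cross-correlation coefficient between the left channel signal and the right channel signal of the current frame of a stereo audio signal;
a processing module, configured to: in response to the first cross-correlation coefficient being less than a first threshold, perform decorrelation processing on the current frame of the stereo audio signal in a first decorrelation processing manner to obtain a decorrelated two-channel signal; calculate a second cross-correlation coefficient of the decorrelated two-channel signal; and, in response to the first cross-correlation coefficient being less than the second cross-correlation coefficient, determine a flag bit to be a first value, obtain an encoded code stream based on the decorrelated two-channel signal, and write the flag bit into the encoded code stream, where the value range of the first threshold is (-1, 0).
A stereo audio signal processing apparatus provided by an embodiment of yet another aspect of the present disclosure includes:
an acquiring module, configured to acquire an encoded code stream sent by an encoding device;
a determining module, configured to determine a decorrelated two-channel signal and a flag bit based on the encoded code stream;
a processing module, configured to, in response to the flag bit being a first value, perform decorrelation reconstruction on the decorrelated two-channel signal in a first decorrelation reconstruction manner, and output the reconstructed audio signal.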
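A communication apparatus provided by an embodiment of yet another aspect of the present disclosure includes a processor and a memory, where the memory stores a computer program, and the processor executes the computer program stored in the memory so that the apparatus performs the method provided by the embodiment of the one aspect above.
A communication apparatus provided by an embodiment of yet another aspect of the present disclosure includes a processor and a memory, where the memory stores a computer program, and the processor executes the computer program stored in the memory so that the apparatus performs the method provided by the embodiment of the other aspect above.
A communication apparatus provided by an embodiment of yet another aspect of the present disclosure includes a processor and an interface circuit;
the interface circuit is configured to receive code instructions and transmit them to the processor;
the processor is configured to run the code instructions to perform the method provided by the embodiment of the one aspect.
A communication apparatus provided by an embodiment of yet another aspect of the present disclosure includes a processor and an interface circuit;
the interface circuit is configured to receive code instructions and transmit them to the processor;
the processor is configured to run the code instructions to perform the method provided by the embodiment of the other aspect.
A computer-readable storage medium provided by an embodiment of yet another aspect of the present disclosure stores instructions that, when executed, cause the method provided by the embodiment of the one aspect to be implemented.
A computer-readable storage medium provided by an embodiment of yet another aspect of the present disclosure stores instructions that, when executed, cause the method provided by the embodiment of the other aspect to be implemented.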
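In summary, in the stereo audio signal processing method and apparatus, encoding device, decoding device, and storage medium provided by the embodiments of the present disclosure, the first cross-correlation coefficient between the left channel signal and the right channel signal of the current frame of the stereo audio signal is determined first. In response to the first cross-correlation coefficient being less than the first threshold, the current frame is decorrelated in the first decorrelation processing manner to obtain the decorrelated two-channel signal, and the second cross-correlation coefficient of the decorrelated two-channel signal is then calculated. In response to the first cross-correlation coefficient being less than the second cross-correlation coefficient, the flag bit is determined to be the first value, the encoded code stream is obtained based on the decorrelated two-channel signal, and the flag bit is written into the encoded code stream. In the embodiments of the present disclosure, the value range of the first threshold is (-1, 0). A first cross-correlation coefficient below the first threshold therefore indicates that the current frame of the stereo audio signal is a partial anti-phase signal, and the first decorrelation processing manner is applied to it so as to preserve the subsequent compression ratio. The embodiments of the present disclosure thus provide both a detection method and a decorrelation processing method for partial anti-phase signals, which greatly improves their encoding compression ratio.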
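Brief Description of the Drawings
The above and/or additional aspects and advantages of the present disclosure will become apparent and readily understood from the following description of the embodiments with reference to the accompanying drawings, in which:
FIG. 1a is a schematic flowchart of a stereo audio signal processing method provided by an embodiment of the present disclosure;
FIG. 1b is a flow block diagram of obtaining an encoded code stream based on the decorrelated two-channel signal, provided by an embodiment of the present disclosure;
FIG. 2 is a schematic flowchart of a stereo audio signal processing method provided by an embodiment of the present disclosure;
FIG. 3a is a schematic flowchart of a stereo audio signal processing method provided by an embodiment of the present disclosure;
FIG. 3b is a schematic flowchart of a stereo audio signal processing method provided by an embodiment of the present disclosure;
FIG. 3c is a schematic flowchart of a stereo audio signal processing method provided by an embodiment of the present disclosure;
FIG. 4a is a schematic flowchart of a stereo audio signal processing method provided by an embodiment of the present disclosure;
FIG. 4b is a flow block diagram of determining the decorrelated two-channel signal based on an encoded code stream, provided by an embodiment of the present disclosure;
FIG. 5 is a schematic flowchart of a stereo audio signal processing method provided by an embodiment of the present disclosure;
FIG. 6 is a schematic flowchart of a stereo audio signal processing method provided by an embodiment of the present disclosure;
FIG. 7 is a schematic structural diagram of a stereo audio signal processing apparatus provided by an embodiment of the present disclosure;
FIG. 8 is a schematic structural diagram of a stereo audio signal processing apparatus provided by an embodiment of the present disclosure;
FIG. 9 is a block diagram of a user equipment provided by an embodiment of the present disclosure;
FIG. 10 is a block diagram of a network side device provided by an embodiment of the present disclosure.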
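Detailed Description
Exemplary embodiments are described in detail here, with examples shown in the accompanying drawings. Where the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the embodiments of the present disclosure. Rather, they are merely examples of apparatuses and methods consistent with some aspects of the embodiments of the present disclosure as detailed in the appended claims.
The terms used in the embodiments of the present disclosure are for the purpose of describing particular embodiments only and are not intended to limit the embodiments of the present disclosure. The singular forms "a" and "the" used in the embodiments of the present disclosure and the appended claims are also intended to include plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and includes any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, and so on may be used in the embodiments of the present disclosure to describe various kinds of information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of the embodiments of the present disclosure, first information may also be referred to as second information, and similarly, second information may also be referred to as first information. Depending on the context, the words "if" and "in case" as used herein may be interpreted as "at the time of", "when", or "in response to determining".
The stereo audio signal processing method and apparatus, encoding device, decoding device, and storage medium provided by the embodiments of the present disclosure are described in detail below with reference to the accompanying drawings.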
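FIG. 1a is a schematic flowchart of a stereo audio signal processing method provided by an embodiment of the present disclosure. The method is performed by an encoding device. As shown in FIG. 1a, the stereo audio signal processing method may include the following steps:
Step 101: determine a first cross-correlation coefficient between the left channel signal and the right channel signal of the current frame of the stereo audio signal.
In an embodiment of the present disclosure, in response to the encoding device receiving an input stereo audio signal, cross-correlation analysis may be performed on the current frame of the stereo audio signal to obtain the first cross-correlation coefficient between the left channel signal and the right channel signal of the current frame.
In an embodiment of the present disclosure, the method for determining the first cross-correlation coefficient between the left channel signal and the right channel signal of the current frame of the stereo audio signal may include: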
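determining the first cross-correlation coefficient between the left channel signal and the right channel signal based on Formula 3, which is:
[Formula 3: equation image PCTCN2021133722-appb-000001, not reproduced in this text]
where η(LR) is the cross-correlation coefficient between the left channel signal and the right channel signal of the current frame, L(n) is the n-th sample of the left channel signal of the current frame, $\bar{L}$ is the average value of all samples of the left channel signal of the current frame, R(n) is the n-th sample of the right channel signal of the current frame, $\bar{R}$ is the average value of all samples of the right channel signal of the current frame, and N is the total number of samples of the left channel signal or the right channel signal of the current frame, that is, the frame length of the current frame.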
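Formula 3 itself is an equation image that this text does not reproduce. The symbols defined around it (per-sample values, per-channel means, and the frame length N) are consistent with the standard mean-removed, normalized (Pearson) correlation coefficient, so the sketch below assumes that form; it is an illustration under that assumption, not the patent's confirmed formula. The function name `cross_correlation` and the zero-energy convention are hypothetical.

```python
import math


def cross_correlation(x, y):
    """Mean-removed, normalized cross-correlation of two equal-length frames.

    ASSUMPTION: this is the Pearson form; the patent's Formula 3 is an image
    that is not available in the text.
    """
    if len(x) != len(y) or not x:
        raise ValueError("frames must be non-empty and of equal length")
    n = len(x)                      # N: frame length
    mean_x = sum(x) / n             # average of all samples of the first channel
    mean_y = sum(y) / n             # average of all samples of the second channel
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mean_x) ** 2 for a in x) *
                    sum((b - mean_y) ** 2 for b in y))
    # A constant (zero-variance) channel has no defined correlation;
    # returning 0.0 here is a convention chosen for this sketch.
    return num / den if den > 0 else 0.0
```

With this definition the coefficient lies in [-1, 1]: values near 1 describe channels that move together, and values near -1 describe channels that move in opposition, which is the case the first threshold is meant to catch.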
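Step 102: in response to the first cross-correlation coefficient being less than a first threshold, perform decorrelation processing on the current frame of the stereo audio signal in a first decorrelation processing manner to obtain a decorrelated two-channel signal, calculate a second cross-correlation coefficient of the decorrelated two-channel signal, and, in response to the first cross-correlation coefficient being less than the second cross-correlation coefficient, determine a flag bit to be a first value, obtain an encoded code stream based on the decorrelated two-channel signal, and write the flag bit into the encoded code stream.
In an embodiment of the present disclosure, the first threshold may be preset, and its value range is (-1, 0). For example, in an embodiment of the present disclosure, the first threshold may lie in [-0.5, -0.1]. Specifically, in an embodiment of the present disclosure, the first threshold may be -0.3.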
以及,在本公开的一个实施例之中,响应于第一互相关系数小于第一阈值时,则说明当前帧左声道信号和右声道信号之间呈负相关,即当前帧为偏反相信号,此时可以采用第一去相关处理方式对立体声音频信号当前帧进行去相关处理获得去相关处理后两声道信号。其中,在本公开的一个实施例之中,该第一去相关处理方式可以为第一和差下混处理。具体而言,该第一和差下混处理可以包括:基于公式一对所述左声道信号和右声道信号进行处理以得到主声道信号和次声道信号;公式一为:
Figure PCTCN2021133722-appb-000004
其中,Mid(n)为去相关处理后两声道信号中的主声道信号,Sid(n)为去相关处理后两声道信号中的次声道信号,L(n)为左声道信号,R(n)为右声道信号。
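Formula 1 is likewise an image that is not reproduced here. The sketch below is only a hypothetical form of a sum-difference downmix for a predominantly anti-phase frame: it assumes Mid(n) = (L(n) - R(n)) / 2 and Sid(n) = (L(n) + R(n)) / 2, so that the primary channel carries most of the energy when R(n) is close to -L(n). The actual Formula 1 may differ (for example in scaling or rounding), and the function name is illustrative.

```python
def first_sum_diff_downmix(left, right):
    """Hypothetical first sum-difference downmix (assumed form of Formula 1).

    For an anti-phase frame (right close to -left) the difference is large
    and becomes the primary channel; the sum is small and becomes the
    secondary channel.
    """
    if len(left) != len(right):
        raise ValueError("channels must have equal length")
    mid = [(l - r) / 2 for l, r in zip(left, right)]  # primary channel
    sid = [(l + r) / 2 for l, r in zip(left, right)]  # secondary channel
    return mid, sid
```

With a perfectly anti-phase frame the secondary channel is all zeros, which is what makes such a frame cheap to code after the downmix.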
由此可知,本公开实施例会确定立体声音频信号当前帧是否为偏反相信号,以及,响应于确定立体声音频信号当前帧为偏反相信号,采用偏反相信号对应的第一去相关处理方式对立体声音频信号当前帧进行去相关处理得到去相关处理后两声道信号,以此大大提高了偏反相信号的编码压缩率。
进一步地,需要说明的是,在通信系统中,可能会存在“对音频信号进行去相关处理后得到的去相关处理后两声道信号的相关性反而大于等于去相关处理前两声道信号的相关性”,也即是,去相关处理并未达到“去相关”的目的。由此,在本公开的一个实施例之中,在对当前帧进行第一去相关处理而获得去相关处理后两声道信号之后,可以进一步计算出去相关处理后两声道信号的第二互相关系数,并通过判定该第二互相关系数(即去相关处理之后的互相关系数)和第一互相关系数(即去相关处理之前的互相关系数)之间的大小关系,来确定第一去相关处理是否达到“去相关”的目的。
其中,在本公开的一个实施例之中,计算出去相关处理后两声道信号的第二互相关系数的方法可以包括:
基于公式四确定第二互相关系数;该公式四可以为:
Figure PCTCN2021133722-appb-000005
其中,η (MS)为第二互相关系数,Mid(n)为去相关处理后两声道信号中主声道信号第n个样点,
Figure PCTCN2021133722-appb-000006
为去相关处理后两声道信号中主声道信号所有样点的平均值,Sid(n)为去相关处理后两声道信号中次声道信号第n个样点,
Figure PCTCN2021133722-appb-000007
为去相关处理后两声道信号中次声道信号所有样点的平均值,N为当前帧左声道信号或者右声道信号样点总数,即为当前帧帧长。
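Further, in one embodiment of the present disclosure, in response to the first cross-correlation coefficient being less than the second cross-correlation coefficient, the first decorrelation processing mode is considered to have achieved the purpose of "decorrelation".
Specifically, it should be recognized that the strength of the correlation between signals increases with the absolute value of the correlation coefficient; for a predominantly anti-phase signal, whose correlation coefficient is negative, a smaller coefficient value means a larger absolute value and therefore a stronger negative correlation.
On this basis, in one embodiment of the present disclosure, because the current frame of the stereo audio signal is a predominantly anti-phase signal, a first cross-correlation coefficient that is less than the second cross-correlation coefficient indicates that the negative correlation of the two-channel signal before the first decorrelation processing is stronger than that of the two-channel signal after it, so it can be determined that the first decorrelation processing mode has achieved the purpose of "decorrelation".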
以及,在本公开的一个实施例之中,响应于确定第一去相关处理方式达到了“去相关”的目的,可以基于去相关处理后两声道信号得到编码码流。其中,在本公开的一个实施例之中,图1b为本公开实施例所提供的一种基于去相关处理后两声道信号得到编码码流的流程框图,如图1b所示,基于去相关处理后两声道信号得到编码码流的方法可以为:
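The decorrelated two-channel signal is split into subband signals by integer lifting wavelet decomposition, and LPC (Linear Prediction Coefficient) parameters are computed from the decorrelated two-channel signal and quantized to obtain quantized LPC parameters. A linear predictor then predicts each subband signal based on the quantized LPC parameters to generate a prediction residual signal, and a preprocessor normalizes the prediction residual signal to produce a normalized output signal, an LSB (Least Significant Bit) signal, and signal sign bits. An entropy encoder entropy-codes the normalized output signal corresponding to each subband signal to generate an entropy-coded bit stream; the entropy-coded bit stream, the LSB signal, the signal sign bits, the quantized LPC parameters, and the wavelet side information are then multiplexed to obtain the encoded bitstream.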
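The text does not say which wavelet the integer lifting decomposition uses. Purely as an illustration of the technique, the sketch below performs one level of an integer Haar lifting step (often called the S-transform), which maps integers to integers and is exactly invertible. The choice of Haar and both function names are assumptions.

```python
def lifting_haar_forward(x):
    """One level of integer Haar lifting: returns (approx, detail) subbands."""
    if len(x) % 2 != 0:
        raise ValueError("an even number of samples is expected")
    even, odd = x[0::2], x[1::2]
    # Predict step: the detail band is the odd sample minus its even neighbor.
    detail = [o - e for e, o in zip(even, odd)]
    # Update step: the approximation band is the floor of the pairwise mean.
    approx = [e + (d >> 1) for e, d in zip(even, detail)]
    return approx, detail


def lifting_haar_inverse(approx, detail):
    """Undo lifting_haar_forward exactly (integer-to-integer)."""
    even = [a - (d >> 1) for a, d in zip(approx, detail)]
    odd = [d + e for e, d in zip(even, detail)]
    x = [0] * (2 * len(even))
    x[0::2] = even
    x[1::2] = odd
    return x
```

Because every lifting step is undone by subtracting the same integer quantity that was added, reconstruction is bit-exact, which is the property a lossless coding chain relies on.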
进一步地,在本公开的一个实施例之中,响应于确定出编码码流,可以确定标志位为第一值(例如可以为0),该第一值可以用于指示编码设备所采用的去相关处理为第一去相关处理方式,以及可以将该标志位写入编码码流中发送至解码设备,以便解码设备可以基于该标志位采用对应的去相关重建方式来进行去相关重建。
此外,需要说明的是,在本公开的一个实施例之中,立体声音频信号的不同帧所对应的第一阈值可以不同。其中,响应于不同帧对应的第一阈值不同,具体是将当前帧左声道信号和右声道信号的第一互相关系数与当前帧对应的第一阈值进行比较。
还需要说明的是,本实施例上述内容描述的是基于立体声音频信号当前帧左声道信号和右声道信号之间的互相关性来对立体声音频信号进行处理。在本公开的另一个实施例之中,还可以基于立体声音频信号当前帧左声道信号和右声道信号之间的相位来对立体声音频信号进行处理,具体而言,可以先确定当前帧左声道信号和右声道信号之间的第一相位,若当前帧左声道信号和右声道信号之间的第一相位介于第一区间,确定立体声音频信号当前帧为偏反相信号,采用第一去相关处理方式对立体声音频信号当前帧进行去相关处理获得去相关处理后两声道信号,计算去相关处理后两声道信号的第二相位,响应于第一相位大于第二相位,说明第一去相关处理方式达到“去相关”的目的,确定标志位为第一值,以及,基于去相关处理后两声道信号得到编码码流并将标志位写入所述编码码流发送至解码设备。其中,该第一区间可以为[135°,180°]。
综上所述,在本公开实施例提供的立体声音频信号处理方法之中,会先确定立体声音频信号当前帧左声道信号和右声道信号的第一互相关系数,以及,当第一互相关系数小于第一阈值时,采用第一去相关处理方式对立体声音频信号当前帧进行去相关处理以获得去相关处理后两声道信号,之后,会计算去相关处理后两声道信号的第二互相关系数,当第一互相关系数小于第二互相关系数时,确定标志位为第一值,以及基于去相关处理后两声道信号得到编码码流并将标志位写入编码码流发送至解码设备。其中,本公开实施例之中,第一阈值的取值范围为(-1,0),由此,当第一互相关系数小于第一阈值时,说明立体声音频信号当前帧为偏反相信号,此时,针对该偏反相信号会对应采用第一去相关处理方式,以此来确保后续的压缩率,则本公开实施例提供了一种针对于偏反相信号的判定方式和去相关处理方式,大大提高了偏反相信号的编码压缩率。
图2为本公开实施例所提供的一种立体声音频信号处理方法的流程示意图,该方法由编码设备执行,如图2所示,该立体声音频信号处理方法可以包括以下步骤:
步骤201、确定立体声音频信号当前帧左声道信号和右声道信号的第一互相关系数。
关于步骤201的相关介绍可以参考上述实施例描述,本公开实施例在此不做赘述。
步骤202、响应于第一互相关系数大于第二阈值,采用第二去相关处理方式对立体声音频信号当前帧进行去相关处理获得去相关处理后两声道信号,计算去相关处理后两声道信号的第三互相关系数,响应于第一互相关系数大于第三互相关系数,确定标志位为第二值,以及,基于去相关处理后两声道信号得到编码码流并将标志位写入所述编码码流。
其中,在本公开的一个实施例之中,该第二阈值可以是预先设定的,该第二阈值的取值范围可以为(0,1)。示例的,在本公开的一个实施例之中,该第二阈值可以介于[0.1,0.5]之间。具体的,在本公开的一个实施例之中,该第二阈值可以为0.3。
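Further, in one embodiment of the present disclosure, in response to the first cross-correlation coefficient being greater than the second threshold, the left-channel and right-channel signals of the current frame are positively correlated, i.e., the current frame is a predominantly in-phase signal; in this case the second decorrelation processing mode may be used to decorrelate the current frame of the stereo audio signal and obtain the decorrelated two-channel signal. In one embodiment of the present disclosure, the second decorrelation processing mode may be second sum-difference downmix processing. Specifically, the second sum-difference downmix processing may include: processing the left-channel signal and the right-channel signal based on Formula 2 to obtain a primary-channel signal and a secondary-channel signal; Formula 2 is: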
Figure PCTCN2021133722-appb-000008
其中,Mid(n)为去相关处理后两声道信号中的主声道信号,Sid(n)为去相关处理后两声道信号中的次声道信号,L(n)为左声道信号,R(n)为右声道信号。
由此可知,本公开实施例会确定立体声音频信号当前帧是否为偏正相信号,以及,响应于确定立体声音频信号当前帧为偏正相信号,采用偏正相信号对应的第二去相关处理方式对立体声音频信号当前帧进行去相关处理得到去相关处理后两声道信号,以此大大提高了偏正相信号的编码压缩率。
进一步地,需要说明的是,在通信系统中,可能会存在“对音频信号进行去相关处理后得到的去相关处理后两声道信号的相关性反而大于等于去相关处理前的相关性”,也即是,去相关处理并未达到“去相关”的目的。由此,在本公开的一个实施例之中,在对当前帧进行第二去相关处理而获得去相关处理后两声道信号后,可以进一步计算出去相关处理后两声道信号的第三互相关系数,并通过判定该第三互相关系数(即去相关处理之后的互相关系数)和第一互相关系数(即去相关处理之前的互相关系数)之间的大小关系,来确定第二去相关处理是否达到“去相关”的目的。
其中,在本公开的一个实施例之中,计算出去相关处理后两声道信号的第三互相关系数的方法可以包括:
基于公式四确定第三互相关系数;该公式四可以为:
Figure PCTCN2021133722-appb-000009
其中,η (MS)为第三互相关系数,Mid(n)为去相关处理后两声道信号中主声道信号第n个样点,
Figure PCTCN2021133722-appb-000010
为去相关处理后两声道信号中主声道信号所有样点的平均值,Sid(n)为去相关处理后两声道信号中次声道信号第n个样点,
Figure PCTCN2021133722-appb-000011
为去相关处理后两声道信号中次声道信号所有样点的平均值,N为当前帧左声道信号或者右声道信号样点总数,即为当前帧帧长。
以及,在本公开的一个实施例之中,响应于第一互相关系数大于第三互相关系数,认为第二去相关处理方式达到了“去相关”的目的。
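Specifically, it should be recognized that the strength of the correlation between signals increases with the absolute value of the correlation coefficient; for a predominantly in-phase signal, whose correlation coefficient is positive, a smaller coefficient value means a smaller absolute value and therefore a weaker positive correlation.
On this basis, because the current frame of the stereo audio signal is a predominantly in-phase signal, a first cross-correlation coefficient that is greater than the third cross-correlation coefficient indicates that the positive correlation of the two-channel signal before the second decorrelation processing is stronger than that of the two-channel signal after it, so it can be determined that the second decorrelation processing mode has achieved the purpose of "decorrelation".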
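A short derivation shows why a sum-difference downmix can lower the correlation at all. It rests on two assumptions that the text does not confirm: that Formula 2 has the conventional form Mid(n) = (L(n) + R(n)) / 2, Sid(n) = (L(n) - R(n)) / 2, and that Formula 4 is a Pearson-style correlation. Write l(n) and r(n) for the mean-removed left and right samples.

```latex
\begin{aligned}
\mathrm{Mid}(n)-\overline{\mathrm{Mid}} &= \tfrac{1}{2}\bigl(l(n)+r(n)\bigr),
\qquad
\mathrm{Sid}(n)-\overline{\mathrm{Sid}} = \tfrac{1}{2}\bigl(l(n)-r(n)\bigr),\\[4pt]
\sum_{n}\bigl(\mathrm{Mid}(n)-\overline{\mathrm{Mid}}\bigr)\bigl(\mathrm{Sid}(n)-\overline{\mathrm{Sid}}\bigr)
&= \tfrac{1}{4}\sum_{n}\bigl(l(n)+r(n)\bigr)\bigl(l(n)-r(n)\bigr)
 = \tfrac{1}{4}\Bigl(\sum_{n} l(n)^{2}-\sum_{n} r(n)^{2}\Bigr).
\end{aligned}
```

Under these assumptions the numerator of η(MS) vanishes when both channels carry the same energy about their means, whatever η(LR) was. When the energies differ, some correlation remains, which is why comparing the coefficient after the downmix against the first cross-correlation coefficient is still useful.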
以及,在本公开的一个实施例之中,响应于确定第二去相关处理方式达到了“去相关”的目的,可以基于去相关处理后两声道信号得到编码码流,并确定标志位为第二值(例如可以为1),该第二值可以用于指示编码端所采用的去相关处理为第二去相关处理方式,以及可以将该标志位写入编码码流中发送至解码设备,以便解码设备可以基于该标志位采用对应的去相关重建方式来进行去相关重建。
其中,上述的“基于去相关处理后两声道信号得到编码码流”的具体方法可以参见前述实施例描述,本公开实施例在此不做赘述。
此外,需要说明的是,在本公开的一个实施例之中,立体声音频信号的不同帧所对应的第二阈值可以不同。其中,响应于不同帧对应的第二阈值不同,可以是将当前帧左声道信号和右声道信号的第一互相关系数与当前帧对应的第二阈值进行比较。
还需要说明的是,本实施例上述内容描述的是基于立体声音频信号当前帧左声道信号和右声道信号之间的互相关性来对立体声音频信号进行处理。在本公开的另一个实施例之中,还可以基于立体声音频信号当前帧左声道信号和右声道信号之间的相位来对立体声音频信号进行处理,具体而言,可以先确定当前帧左声道信号和右声道信号之间的第一相位,若当前帧左声道信号和右声道信号之间的第一相位介于第二区间,确定立体声音频信号当前帧为偏正相信号,采用第二去相关处理方式对立体声音频信号当前帧进行去相关处理获得去相关处理后两声道信号,计算去相关处理后两声道信号的第三相位,响应于第一相位大于第三相位,说明第二去相关处理方式达到“去相关”的目的,确定标志位为第二值,以及,基于去相关处理后两声道信号得到编码码流并将标志位写入所述编码码流发送至解码设备。其中,该第二区间可以为[0°,45°]。
综上所述,在本公开实施例提供的立体声音频信号处理方法之中,会先确定立体声音频信号当前帧左声道信号和右声道信号的第一互相关系数,以及,当第一互相关系数小于第一阈值时,采用第一去相关处理方式对立体声音频信号当前帧进行去相关处理以获得去相关处理后两声道信号,之后,会计算去相关处理后两声道信号的第二互相关系数,当第一互相关系数小于第二互相关系数时,确定标志位为第一值,以及基于去相关处理后两声道信号得到编码码流并将标志位写入编码码流发送至解码设备。其中,本公开实施例之中,第一阈值的取值范围为(-1,0),由此,当第一互相关系数小于第一阈值时,说明立体声音频信号当前帧为偏反相信号,此时,针对该偏反相信号会对应采用第一去相关处理方式,以此来确保后续的压缩率,则本公开实施例提供了一种针对于偏反相信号的判定方式和去相关处理方式,大大提高了偏反相信号的编码压缩率。
图3a为本公开实施例所提供的一种立体声音频信号处理方法的流程示意图,该方法由编码设备执行,如图3a所示,该立体声音频信号处理方法可以包括以下步骤:
步骤301a、确定立体声音频信号当前帧左声道信号和右声道信号的第一互相关系数。
关于步骤301a的相关介绍可以参考上述实施例描述,本公开实施例在此不做赘述。
步骤302a、响应于第一互相关系数大于等于第一阈值且小于等于第二阈值,直接将立体声音频信号当前帧确定为去相关处理后两声道信号,确定标志位为第三值,以及,基于去相关处理后两声道信号得到编码码流并将标志位写入所述编码码流。
其中,在本公开的一个实施例之中,该第一阈值可以是预先设定的,该第一阈值的取值范围为(-1,0)。示例的,在本公开的一个实施例之中,该第一阈值可以介于[-0.5,-0.1]之间。具体的,在本公开的一个实施例之中,该第一阈值可以为-0.3。以及,在本公开的一个实施例之中,该第二阈值可以是预先设定的,该第二阈值的取值范围可以为(0,1)。示例的,在本公开的一个实施例之中,该第二阈值可以介于[0.1,0.5]之间。具体的,在本公开的一个实施例之中,该第二阈值可以为0.3。
进一步地,在本公开的一个实施例之中,第一互相关系数大于等于第一阈值且小于等于第二阈值时,则说明当前帧左声道信号和右声道信号之间不相关,此时可以无需对立体声音频信号当前帧进行去相关处理,直接将立体声音频信号当前帧确定为去相关处理后两声道信号,并基于去相关处理后两声道信号得到编码码流,确定标志位为第三值(例如可以为2),该第三值可以用于指示编码端未采用去相关处理,以及可以将该标志位写入编码码流中发送至解码设备,以便解码设备可以基于该标志位来进行对应的去相关重建。
其中,上述的“基于去相关处理后两声道信号得到编码码流”的具体方法可以参见前述实施例描述,本公开实施例在此不做赘述。
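It should also be noted that the above describes processing the stereo audio signal based on the cross-correlation between the left-channel and right-channel signals of the current frame. In another embodiment of the present disclosure, the stereo audio signal may instead be processed based on the phase between the left-channel and right-channel signals of the current frame. Specifically, a first phase between the left-channel and right-channel signals of the current frame may be determined first; if the first phase falls within a third interval, the current frame of the stereo audio signal is determined to be an uncorrelated signal, the current frame is taken directly as the decorrelated two-channel signal, the flag is set to the third value, and the encoded bitstream is obtained from the decorrelated two-channel signal, with the flag written into the encoded bitstream and sent to the decoding device. The third interval may be (45°, 135°).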
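The phase intervals given for the phase-based variants, [135°, 180°], [0°, 45°], and (45°, 135°), together partition [0°, 180°]. A minimal sketch of that classification follows. How the inter-channel phase itself is measured is not specified in the text, so the function simply takes a phase in degrees as input; the function name and the returned labels are illustrative.

```python
def classify_by_phase(phase_deg):
    """Map an inter-channel phase in degrees to a frame category.

    Intervals follow the example values in the text:
      first interval  [135, 180] -> predominantly anti-phase
      second interval [0, 45]    -> predominantly in-phase
      third interval  (45, 135)  -> uncorrelated
    """
    if not 0.0 <= phase_deg <= 180.0:
        raise ValueError("phase is expected in [0, 180] degrees")
    if phase_deg >= 135.0:
        return "anti-phase"
    if phase_deg <= 45.0:
        return "in-phase"
    return "uncorrelated"
```

The boundary values 45° and 135° belong to the in-phase and anti-phase categories respectively, matching the closed first and second intervals and the open third interval.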
综上所述,在本公开实施例提供的立体声音频信号处理方法之中,会先确定立体声音频信号当前帧左声道信号和右声道信号的第一互相关系数,以及,响应于第一互相关系数小于第一阈值,采用第一去相关处理方式对立体声音频信号当前帧进行去相关处理以获得去相关处理后两声道信号,之后,会计算去相关处理后两声道信号的第二互相关系数,响应于第一互相关系数小于第二互相关系数,确定标志位为第一值,以及基于去相关处理后两声道信号得到编码码流并将标志位写入编码码流发送至解码设备。其中,本公开实施例之中,第一阈值的取值范围为(-1,0),由此,响应于第一互相关系数小于第一阈值,说明立体声音频信号当前帧为偏反相信号,此时,针对该偏反相信号会对应采用第一去相关处理方式,以此来确保后续的压缩率,则本公开实施例提供了一种针对于偏反相信号的判定方式和去相关处理方式,大大提高了偏反相信号的编码压缩率。
图3b为本公开实施例所提供的一种立体声音频信号处理方法的流程示意图,该方法由编码设备执行,如图3b所示,该立体声音频信号处理方法可以包括以下步骤:
步骤301b、确定立体声音频信号当前帧左声道信号和右声道信号的第一互相关系数。
关于步骤301b的相关介绍可以参考上述实施例描述,本公开实施例在此不做赘述。
步骤302b、响应于第一互相关系数小于第一阈值,采用第一去相关处理方式对立体声音频信号当前帧进行去相关处理获得去相关处理后两声道信号,计算去相关处理后两声道信号的第二互相关系数,响应于第一互相关系数大于等于第二互相关系数,直接将立体声音频信号当前帧确定为去相关处理后两声道信号,确定标志位为第三值,以及,基于去相关处理后两声道信号得到编码码流并将标志位写入所述编码码流。
其中,关于第一阈值、第一去相关处理方式、第一互相关系数和第二互相关系数的相关详细介绍可以参考上述实施例描述,本公开实施例在此不做赘述。
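Further, in one embodiment of the present disclosure, in response to the first cross-correlation coefficient being less than the first threshold and greater than or equal to the second cross-correlation coefficient, the first decorrelation processing mode is considered not to have achieved the purpose of "decorrelation".
Specifically, as explained above, a first cross-correlation coefficient below the first threshold indicates that the current frame of the stereo audio signal is a predominantly anti-phase signal (i.e., the smaller the correlation coefficient, the stronger the negative correlation). In this case, if the first cross-correlation coefficient (the coefficient before decorrelation) is greater than or equal to the second cross-correlation coefficient (the coefficient after decorrelation), the negative correlation of the two-channel signal before the first decorrelation processing is weaker than or equal to that of the two-channel signal after it, so it can be determined that the first decorrelation processing mode has not achieved the purpose of "decorrelation".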
基于此,在本公开的一个实施例之中,响应于确定出第一去相关处理方式未达到“去相关”的目的,若继续基于第一去相关处理方式得到的去相关处理后两声道信号进行编码操作,则会大大降低编码压缩率,因此,本公开实施例之中可以直接将立体声音频信号当前帧确定为去相关处理后两声道信号,确定标志位为第三值(例如可以为2),以及,基于去相关处理后两声道信号得到编码码流并将标志位写入所述编码码流发送至解码设备,以确保后续编码压缩率。其中,关于该部分内容的相关介绍可以参考前述实施例描述,本公开实施例在此不做赘述。
还需要说明的是,本实施例上述内容描述的是基于立体声音频信号当前帧左声道信号和右声道信号之间的互相关性来对立体声音频信号进行处理。在本公开的另一个实施例之中,还可以基于立体声音频信号当前帧左声道信号和右声道信号之间的相位来对立体声音频信号进行处理,具体而言,可以先确定当前帧左声道信号和右声道信号之间的第一相位,若当前帧左声道信号和右声道信号之间的第一相位介于第一区间,确定立体声音频信号当前帧为偏反相信号,采用第一去相关处理方式对立体声音频信号当前帧进行去相关处理获得去相关处理后两声道信号,计算去相关处理后两声道信号的第二相位,响应于第一相位小于等于第二相位,说明第一去相关处理方式未达到“去相关”的目的,则直接将立体声音频信号当前帧确定为去相关处理后两声道信号,确定标志位为第三值,以及,基于去相关处理后两声道信号得到编码码流并将标志位写入所述编码码流发送至解码设备。其中,该第一区间可以为[135°,180°]。
综上所述,在本公开实施例提供的立体声音频信号处理方法之中,会先确定立体声音频信号当前帧左声道信号和右声道信号的第一互相关系数,以及,当第一互相关系数小于第一阈值时,采用第一去相关处理方式对立体声音频信号当前帧进行去相关处理以获得去相关处理后两声道信号,之后,会计算去相关处理后两声道信号的第二互相关系数,响应于第一互相关系数小于第二互相关系数,确定标志位为第一值,以及基于去相关处理后两声道信号得到编码码流并将标志位写入编码码流发送至解码设备。其中,本公开实施例之中,第一阈值的取值范围为(-1,0),由此,响应于第一互相关系数小于第一阈值,说明立体声音频信号当前帧为偏反相信号,此时,针对该偏反相信号会对应采用第一去相关处理方式,以此来确保后续的压缩率,则本公开实施例提供了一种针对于偏反相信号的判定方式和去相关处理方式,大大提高了偏反相信号的编码压缩率。
图3c为本公开实施例所提供的一种立体声音频信号处理方法的流程示意图,该方法由编码设备执行,如图3c所示,该立体声音频信号处理方法可以包括以下步骤:
步骤301c、确定立体声音频信号当前帧左声道信号和右声道信号的第一互相关系数。
关于步骤301c的相关介绍可以参考上述实施例描述,本公开实施例在此不做赘述。
步骤302c、响应于第一互相关系数大于第二阈值,采用第二去相关处理方式对立体声音频信号当前帧进行去相关处理获得去相关处理后两声道信号,计算去相关处理后两声道信号的第三互相关系数,响应于第一互相关系数小于等于第三互相关系数,直接将立体声音频信号当前帧确定为去相关处理后两声道信号,确定标志位为第三值,以及,基于去相关处理后两声道信号得到编码码流并将标志位写入所述编码码流。
其中,关于第二阈值、第二去相关处理方式、第一互相关系数和第三互相关系数的相关详细介绍可以参考上述实施例描述,本公开实施例在此不做赘述。
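Further, in one embodiment of the present disclosure, in response to the first cross-correlation coefficient being greater than the second threshold and less than or equal to the third cross-correlation coefficient, the second decorrelation processing mode is considered not to have achieved the purpose of "decorrelation".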
具体而言,参考上述内容可知,由于响应于第一互相关系数大于第二阈值,说明立体声音频信号当前帧为偏正相信号(即:相关性系数越大,正相关性越高)。此时,若第一互相关系数(即去相关处理之前的互相关系数)小于等于第三互相关系数(即去相关处理之后的互相关系数),则说明执行第二去相关处理方式前两声道信号的正相关性低于等于执行第二去相关处理方式后两声道信号的正相关性,从而可以确定第二去相关处理方式未达到“去相关”的目的。
基于此,在本公开的一个实施例之中,响应于确定出第二去相关处理方式未达到“去相关”的目的,若继续基于第二去相关处理方式得到的去相关处理后两声道信号进行编码操作,则会大大降低编码压缩率,因此,本公开实施例之中可以直接将立体声音频信号当前帧确定为去相关处理后两声道信号,确定标志位为第三值(例如可以为2),以及,基于去相关处理后两声道信号得到编码码流并将标志位写入所述编码码流发送至解码设备。其中,关于该部分内容的相关介绍可以参考前述实施例描述,本公开实施例在此不做赘述。
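Taken together, the encoder-side embodiments amount to a three-way decision with a fallback. The sketch below strings them together under stated assumptions: the correlation is the Pearson form, the two downmixes use hypothetical forms of Formulas 1 and 2 (difference-as-primary for anti-phase frames, sum-as-primary for in-phase frames), the thresholds are the example values -0.3 and 0.3, and the flag values 0, 1, 2 are the example values from the text. All names are illustrative.

```python
import math

FLAG_FIRST, FLAG_SECOND, FLAG_NONE = 0, 1, 2  # example flag values from the text


def corr(a, b):
    """Pearson-style cross-correlation (assumed form of Formulas 3 and 4)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return num / den if den > 0 else 0.0


def choose_downmix(left, right, thr1=-0.3, thr2=0.3):
    """Return (flag, ch0, ch1) for one frame, following the decision rules."""
    eta1 = corr(left, right)
    if eta1 < thr1:
        # Predominantly anti-phase: hypothetical first sum-difference downmix.
        mid = [(l - r) / 2 for l, r in zip(left, right)]
        sid = [(l + r) / 2 for l, r in zip(left, right)]
        if eta1 < corr(mid, sid):       # decorrelation achieved
            return FLAG_FIRST, mid, sid
    elif eta1 > thr2:
        # Predominantly in-phase: hypothetical second sum-difference downmix.
        mid = [(l + r) / 2 for l, r in zip(left, right)]
        sid = [(l - r) / 2 for l, r in zip(left, right)]
        if eta1 > corr(mid, sid):       # decorrelation achieved
            return FLAG_SECOND, mid, sid
    # Uncorrelated frame, or the downmix did not help: pass L/R through.
    return FLAG_NONE, list(left), list(right)
```

A frame such as left = [10, -10, 10, -10], right = [1, 0, 0, -1] illustrates the fallback: the channels are positively correlated, but because the left channel dominates in energy, the sum and difference are even more correlated than the original pair, so the frame is passed through with the third flag value.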
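It should also be noted that the above describes processing the stereo audio signal based on the cross-correlation between the left-channel and right-channel signals of the current frame. In another embodiment of the present disclosure, the stereo audio signal may instead be processed based on the phase between the left-channel and right-channel signals of the current frame. Specifically, a first phase between the left-channel and right-channel signals of the current frame may be determined first; if the first phase falls within the second interval, the current frame of the stereo audio signal is determined to be a predominantly in-phase signal, the second decorrelation processing mode is used to decorrelate the current frame and obtain the decorrelated two-channel signal, and a third phase of the decorrelated two-channel signal is computed. In response to the first phase being less than or equal to the third phase, the second decorrelation processing mode has not achieved the purpose of "decorrelation"; the current frame of the stereo audio signal is then taken directly as the decorrelated two-channel signal, the flag is set to the third value, and the encoded bitstream is obtained from the decorrelated two-channel signal, with the flag written into the encoded bitstream and sent to the decoding device. The second interval may be [0°, 45°].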
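In summary, in the stereo audio signal processing method provided by the embodiments of the present disclosure, the first cross-correlation coefficient between the left-channel and right-channel signals of the current frame of the stereo audio signal is determined first; in response to the first cross-correlation coefficient being less than the first threshold, the first decorrelation processing mode is used to decorrelate the current frame and obtain the decorrelated two-channel signal; the second cross-correlation coefficient of the decorrelated two-channel signal is then computed; and, in response to the first cross-correlation coefficient being less than the second cross-correlation coefficient, the flag is set to the first value, the encoded bitstream is obtained from the decorrelated two-channel signal, and the flag is written into the encoded bitstream and sent to the decoding device. In the embodiments of the present disclosure, the first threshold takes values in (-1, 0); hence a first cross-correlation coefficient below the first threshold indicates that the current frame of the stereo audio signal is a predominantly anti-phase signal, for which the first decorrelation processing mode is used so as to preserve the subsequent compression ratio. The embodiments of the present disclosure thus provide a way of identifying and decorrelating predominantly anti-phase signals, which greatly improves the coding compression ratio for such signals.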
图4a为本公开实施例所提供的一种立体声音频信号处理方法的流程示意图,该方法由解码设备执行,如图4a所示,该立体声音频信号处理方法可以包括以下步骤:
步骤401、获取编码设备发送的编码码流。
步骤402、基于编码码流确定去相关处理后两声道信号和标志位。
其中,图4b为本公开实施例所提供的一种基于编码码流确定去相关处理后两声道信号的流程框图,如图4b所示,基于编码码流确定去相关处理后两声道信号的方法可以为:
获取到编码码流后,先解析该编码码流以获得编码位流、标志位、LSB信号、符号位信号、量化LPC参数以及小波边信息,再利用熵解码器对编码位流进行熵解码得到解码信号,然后通过后处理器基于LSB信号和符号位信号对解码信号进行处理以生成预测残差。之后,利用线性预测器根据量化LPC参数对预测残差进行重建,生成各子带信号,再利用整型提升小波基于小波边信息对各子带信号进行重构以得到去相关处理后两声道信号。
步骤403、响应于标志位为第一值,采用第一去相关重建方式对去相关处理后两声道信号进行去相关重建,并输出去相关重建后的音频信号。
其中,在本公开的一个实施例之中,响应于标志位为第一值(例如可以为0),则说明编码设备是采用第一去相关处理方式对立体声音频信号进行去相关处理的。基于此,可以采用与第一去相关处理方式对应的第一去相关重建方式对去相关处理后两声道信号进行去相关重建,并输出去相关重建后的音频信号。
在本公开的一个实施例之中,去相关处理后两声道信号可以包括主声道信号和次声道信号。以及,第一去相关重建方式可以包括:基于公式五对去相关处理后两声道信号进行去相关重建;公式五为:
Figure PCTCN2021133722-appb-000012
其中,Mid(n)为去相关处理后两声道信号中的主声道信号,Sid(n)为去相关处理后两声道信号中的次声道信号,L(n)为左声道信号,R(n)为右声道信号。
综上所述,在本公开实施例提供的立体声音频信号处理方法之中,会先确定立体声音频信号当前帧左声道信号和右声道信号的第一互相关系数,以及,响应于第一互相关系数小于第一阈值,采用第一去相关处理方式对立体声音频信号当前帧进行去相关处理以获得去相关处理后两声道信号,之后,会计算去相关处理后两声道信号的第二互相关系数,响应于第一互相关系数小于第二互相关系数,确定标志位为第一值,以及基于去相关处理后两声道信号得到编码码流并将标志位写入编码码流发送至解码设备。其中,本公开实施例之中,第一阈值的取值范围为(-1,0),由此,响应于第一互相关系数小于第一阈值,说明立体声音频信号当前帧为偏反相信号,此时,针对该偏反相信号会对应采用第一去相关处理方式,以此来确保后续的压缩率,则本公开实施例提供了一种针对于偏反相信号的判定方式和去相关处理方式,大大提高了偏反相信号的编码压缩率。
图5为本公开实施例所提供的一种立体声音频信号处理方法的流程示意图,该方法由解码设备执行,如图5所示,该立体声音频信号处理方法可以包括以下步骤:
步骤501、获取编码设备发送的编码码流。
步骤502、基于编码码流确定去相关处理后两声道信号和标志位。
其中,关于步骤501-502的相关介绍可以参考上述实施例描述,本公开实施例在此不做赘述。
步骤503、响应于标志位为第二值,采用第二去相关重建方式对所述去相关处理后两声道信号进行去相关重建,并输出去相关重建后的音频信号。
其中,在本公开的一个实施例之中,响应于标志位为第二值(例如可以为1),则说明编码设备是采用第二去相关处理方式对立体声音频信号进行去相关处理的。基于此,可以采用与第二去相关处理方式对应的第二去相关重建方式对去相关处理后两声道信号进行去相关重建,并输出去相关重建后的音频信号。
在本公开的一个实施例之中,去相关处理后两声道信号可以包括主声道信号和次声道信号。以及,第二去相关重建方式可以包括:基于公式六对所述去相关处理后两声道信号进行去相关重建;公式六为:
Figure PCTCN2021133722-appb-000013
其中,Mid(n)为去相关处理后两声道信号的主声道信号,Sid(n)为去相关处理后两声道信号的次声道信号,L(n)为左声道信号,R(n)为右声道信号。
综上所述,在本公开实施例提供的立体声音频信号处理方法之中,会先确定立体声音频信号当前帧左声道信号和右声道信号的第一互相关系数,以及,响应于第一互相关系数小于第一阈值,采用第一去相关处理方式对立体声音频信号当前帧进行去相关处理以获得去相关处理后两声道信号,之后,会计算去相关处理后两声道信号的第二互相关系数,响应于第一互相关系数小于第二互相关系数,确定标志位为第一值,以及基于去相关处理后两声道信号得到编码码流并将标志位写入编码码流发送至解码设备。其中,本公开实施例之中,第一阈值的取值范围为(-1,0),由此,响应于第一互相关系数小于第一阈值,说明立体声音频信号当前帧为偏反相信号,此时,针对该偏反相信号会对应采用第一去相关处理方式,以此来确保后续的压缩率,则本公开实施例提供了一种针对于偏反相信号的判定方式和去相关处理方式,大大提高了偏反相信号的编码压缩率。
图6为本公开实施例所提供的一种立体声音频信号处理方法的流程示意图,该方法由解码设备执行,如图6所示,该立体声音频信号处理方法可以包括以下步骤:
步骤601、获取编码设备发送的编码码流。
步骤602、基于编码码流确定去相关处理后两声道信号和标志位。
其中,关于步骤601-602的相关介绍可以参考上述实施例描述,本公开实施例在此不做赘述。
步骤603、响应于标志位为第三值,直接将去相关处理后两声道信号确定为去相关重建后的音频信号。
在本公开的一个实施例之中,响应于标志位为第三值(例如可以为2),则说明编码设备是未进行去相关处理的。基于此,可以直接将去相关处理后两声道信号确定为去相关重建后的音频信号。
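The three decoder-side embodiments can be read as a dispatch on the flag. The sketch below assumes the example flag values 0, 1, 2 from the text and hypothetical forms of Formulas 5 and 6, chosen as the exact inverses of an assumed encoder downmix (Mid = (L - R) / 2, Sid = (L + R) / 2 for the first mode; Mid = (L + R) / 2, Sid = (L - R) / 2 for the second). The real formulas are images not reproduced in this text, and the function name is illustrative.

```python
FLAG_FIRST, FLAG_SECOND, FLAG_NONE = 0, 1, 2  # example flag values from the text


def reconstruct(flag, ch0, ch1):
    """Rebuild (left, right) from the decoded two-channel signal.

    ch0 / ch1 are the primary / secondary channels when a downmix was used,
    or the original left / right channels when flag == FLAG_NONE.
    """
    if len(ch0) != len(ch1):
        raise ValueError("channels must have equal length")
    if flag == FLAG_FIRST:
        # Assumed inverse of the first downmix: L = Mid + Sid, R = Sid - Mid.
        left = [m + s for m, s in zip(ch0, ch1)]
        right = [s - m for m, s in zip(ch0, ch1)]
    elif flag == FLAG_SECOND:
        # Assumed inverse of the second downmix: L = Mid + Sid, R = Mid - Sid.
        left = [m + s for m, s in zip(ch0, ch1)]
        right = [m - s for m, s in zip(ch0, ch1)]
    elif flag == FLAG_NONE:
        # No decorrelation was applied at the encoder: pass through.
        left, right = list(ch0), list(ch1)
    else:
        raise ValueError("unknown flag value")
    return left, right
```

Carrying the flag in the bitstream is what lets the decoder pick the matching inverse without re-estimating any correlation itself.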
综上所述,在本公开实施例提供的立体声音频信号处理方法之中,会先确定立体声音频信号当前帧左声道信号和右声道信号的第一互相关系数,以及,响应于第一互相关系数小于第一阈值,采用第一去相关处理方式对立体声音频信号当前帧进行去相关处理以获得去相关处理后两声道信号,之后,会计算去相关处理后两声道信号的第二互相关系数,响应于第一互相关系数小于第二互相关系数,确定标志位为第一值,以及基于去相关处理后两声道信号得到编码码流并将标志位写入编码码流发送至解码设备。其中,本公开实施例之中,第一阈值的取值范围为(-1,0),由此,响应于第一互相关系数小于第一阈值,说明立体声音频信号当前帧为偏反相信号,此时,针对该偏反相信号会对应采用第一去相关处理方式,以此来确保后续的压缩率,则本公开实施例提供了一种针对于偏反相信号的判定方式和去相关处理方式,大大提高了偏反相信号的编码压缩率。
图7为本公开实施例所提供的一种立体声音频信号处理装置的结构示意图,应用于编码端,如图7所示,装置700可以包括:
确定模块701,用于确定立体声音频信号当前帧左声道信号和右声道信号的第一互相关系数;
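a processing module 702, configured to: in response to the first cross-correlation coefficient being less than a first threshold, decorrelate the current frame of the stereo audio signal using a first decorrelation processing mode to obtain a decorrelated two-channel signal; compute a second cross-correlation coefficient of the decorrelated two-channel signal; in response to the first cross-correlation coefficient being less than the second cross-correlation coefficient, set a flag to a first value; and obtain an encoded bitstream from the decorrelated two-channel signal, write the flag into the encoded bitstream, and send it to a decoding device, where the first threshold takes values in (-1, 0).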
综上所述,在本公开实施例提供的立体声音频信号处理装置之中,会先确定立体声音频信号当前帧左声道信号和右声道信号的第一互相关系数,以及,响应于第一互相关系数小于第一阈值,采用第一去相关处理方式对立体声音频信号当前帧进行去相关处理以获得去相关处理后两声道信号,之后,会计算去相关处理后两声道信号的第二互相关系数,响应于第一互相关系数小于第二互相关系数,确定标志位为第一值,以及基于去相关处理后两声道信号得到编码码流并将标志位写入编码码流发送至解码设备。其中,本公开实施例之中,第一阈值的取值范围为(-1,0),由此,响应于第一互相关系数小于第一阈值,说明立体声音频信号当前帧为偏反相信号,此时,针对该偏反相信号会对应采用第一去相关处理方式,以此来确保后续的压缩率,则本公开实施例提供了一种针对于偏反相信号的判定方式和去相关处理方式,大大提高了偏反相信号的编码压缩率。
可选的,在本公开的一个实施例之中,所述第一去相关处理方式包括第一和差下混处理。
可选的,在本公开的一个实施例之中,所述第一和差下混处理包括:
基于公式一对所述左声道信号和右声道信号进行处理以得到主声道信号和次声道信号;所述公式一为:
Figure PCTCN2021133722-appb-000014
其中,Mid(n)为主声道信号,Sid(n)为次声道信号,L(n)为左声道信号,R(n)为右声道信号。
可选的,在本公开的一个实施例之中,所述装置还用于:
响应于所述第一互相关系数大于第二阈值,采用第二去相关处理方式对所述立体声音频信号当前帧进行去相关处理获得去相关处理后两声道信号,计算所述去相关处理后两声道信号的第三互相关系数,响应于所述第一互相关系数大于所述第三互相关系数,确定标志位为第二值,以及,基于所述去相关处理后两声道信号得到编码码流并将所述标志位写入所述编码码流,所述第二阈值的取值范围(0,1)。
可选的,在本公开的一个实施例之中,所述第二去相关处理方式包括第二和差下混处理。
可选的,在本公开的一个实施例之中,所述第二和差下混处理包括:
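processing the left-channel signal and the right-channel signal based on Formula 2 to obtain a primary-channel signal and a secondary-channel signal; Formula 2 is: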
Figure PCTCN2021133722-appb-000015
其中,Mid(n)为主声道信号,Sid(n)为次声道信号,L(n)为左声道信号,R(n)为右声道信号。
可选的,在本公开的一个实施例之中,所述装置还用于:
响应于所述第一互相关系数大于等于所述第一阈值且小于等于所述第二阈值,或者所述第一互相关系数小于第一阈值且所述第一互相关系数大于等于所述第二互相关系数,或者所述第一互相关系数大于第二阈值且所述第一互相关系数小于等于所述第三互相关系数,直接将所述立体声音频信号当前帧确定为去相关处理后两声道信号,确定标志位为第三值,以及,基于所述去相关处理后两声道信号得到编码码流并将所述标志位写入所述编码码流。
可选的,在本公开的一个实施例之中,所述确定模块,还用于:
基于公式三确定所述左声道信号和右声道信号的第一互相关系数;所述公式三为:
Figure PCTCN2021133722-appb-000016
其中,η (LR)为当前帧左声道信号和右声道信号的互相关系数,L(n)为当前帧左声道信号第n个样点,
Figure PCTCN2021133722-appb-000017
为当前帧左声道信号所有样点的平均值,R(n)为当前帧右声道信号第n个样点,
Figure PCTCN2021133722-appb-000018
为当前帧右声道信号所有样点的平均值,N为当前帧左声道信号或者右声道信号样点总数,即为当前帧帧长。
可选的,在本公开的一个实施例之中,所述去相关处理后两声道信号包括主声道信号和次声道信号;
所述装置还用于:
基于公式四确定第二互相关系数和第三互相关系数;所述公式四为:
Figure PCTCN2021133722-appb-000019
其中,η (MS)为第二互相关系数或第三互相关系数,Mid(n)为去相关处理后两声道信号中主声道信号第n个样点,
Figure PCTCN2021133722-appb-000020
为去相关处理后两声道信号中主声道信号所有样点的平均值,Sid(n)为去相关处理后两声道信号中次声道信号第n个样点,
Figure PCTCN2021133722-appb-000021
为去相关处理后两声道信号中次声道信号所有样点的平均值,N为当前帧左声道信号或者右声道信号样点总数,即为当前帧帧长。
综上所述,在本公开实施例提供的立体声音频信号处理装置之中,会先确定立体声音频信号当前帧左声道信号和右声道信号的第一互相关系数,以及,响应于第一互相关系数小于第一阈值,采用第一去相关处理方式对立体声音频信号当前帧进行去相关处理以获得去相关处理后两声道信号,之后,会计算去相关处理后两声道信号的第二互相关系数,响应于第一互相关系数小于第二互相关系数,确定标志位为第一值,以及基于去相关处理后两声道信号得到编码码流并将标志位写入编码码流发送至解码设备。其中,本公开实施例之中,第一阈值的取值范围为(-1,0),由此,响应于第一互相关系数小于第一阈值,说明立体声音频信号当前帧为偏反相信号,此时,针对该偏反相信号会对应采用第一去相关处理方式,以此来确保后续的压缩率,则本公开实施例提供了一种针对于偏反相信号的判定方式和去相关处理方式,大大提高了偏反相信号的编码压缩率。
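Fig. 8 is a schematic structural diagram of a stereo audio signal processing apparatus provided by an embodiment of the present disclosure, applied at the decoding end. As shown in Fig. 8, the apparatus 800 may include: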
获取模块801,用于获取编码设备发送的编码码流;
确定模块802,用于基于所述编码码流确定去相关处理后两声道信号和标志位;
处理模块803,用于响应于所述标志位为第一值,采用第一去相关重建方式对所述去相关处理后两声道信号进行去相关重建,并输出去相关重建后的音频信号。
可选的,在本公开的一个实施例之中,所述第一去相关重建方式包括:
基于公式五对所述去相关处理后两声道信号进行去相关重建;所述公式五为:
Figure PCTCN2021133722-appb-000022
其中,Mid(n)为去相关处理后两声道信号中的主声道信号,Sid(n)为去相关处理后两声道信号中的次声道信号,L(n)为左声道信号,R(n)为右声道信号。
可选的,在本公开的一个实施例之中,所述装置还用于:
响应于所述标志位为第二值,采用第二去相关重建方式对所述去相关处理后两声道信号进行去相关重建,并输出去相关重建后的音频信号。
可选的,在本公开的一个实施例之中,所述第二去相关重建方式包括:
基于公式六对所述去相关处理后两声道信号进行去相关重建;所述公式六为:
Figure PCTCN2021133722-appb-000023
其中,Mid(n)为去相关处理后两声道信号的主声道信号,Sid(n)为去相关处理后两声道信号的次声道信号,L(n)为左声道信号,R(n)为右声道信号。
可选的,在本公开的一个实施例之中,所述装置还用于:
响应于所述标志位为第三值,直接将所述去相关处理后两声道信号确定为去相关重建后的音频信号。
图9是本公开一个实施例所提供的一种用户设备UE900的框图。例如,UE900可以是移动电话,计算机,数字广播终端设备,消息收发设备,游戏控制台,平板设备,医疗设备,健身设备,个人数字助理等。
参照图9,UE900可以包括以下至少一个组件:处理组件902,存储器904,电源组件906,多媒体组件908,音频组件910,输入/输出(I/O)的接口912,传感器组件913,以及通信组件916。
处理组件902通常控制UE900的整体操作,诸如与显示,电话呼叫,数据通信,相机操作和记录操作相关联的操作。处理组件902可以包括至少一个处理器920来执行指令,以完成上述的方法的全部或部分步骤。此外,处理组件902可以包括至少一个模块,便于处理组件902和其他组件之间的交互。例如,处理组件902可以包括多媒体模块,以方便多媒体组件908和处理组件902之间的交互。
存储器904被配置为存储各种类型的数据以支持在UE900的操作。这些数据的示例包括用于在UE900上操作的任何应用程序或方法的指令,联系人数据,电话簿数据,消息,图片,视频等。存储器904可以由任何类型的易失性或非易失性存储设备或者它们的组合实现,如静态随机存取存储器(SRAM),电可擦除可编程只读存储器(EEPROM),可擦除可编程只读存储器(EPROM),可编程只读存储器(PROM),只读存储器(ROM),磁存储器,快闪存储器,磁盘或光盘。
电源组件906为UE900的各种组件提供电力。电源组件906可以包括电源管理系统,至少一个电源,及其他与为UE900生成、管理和分配电力相关联的组件。
多媒体组件908包括在所述UE900和用户之间的提供一个输出接口的屏幕。在一些实施例中,屏幕可以包括液晶显示器(LCD)和触摸面板(TP)。如果屏幕包括触摸面板,屏幕可以被实现为触摸屏,以接收来自用户的输入信号。触摸面板包括至少一个触摸传感器以感测触摸、滑动和触摸面板上的手势。所述触摸传感器可以不仅感测触摸或滑动动作的边界,而且还检测与所述触摸或滑动操作相关的唤醒时间和压力。在一些实施例中,多媒体组件908包括一个前置摄像头和/或后置摄像头。当UE900处于操作模式,如拍摄模式或视频模式时,前置摄像头和/或后置摄像头可以接收外部的多媒体数据。每个前置摄像头和后置摄像头可以是一个固定的光学透镜系统或具有焦距和光学变焦能力。
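The audio component 910 is configured to output and/or input audio signals. For example, the audio component 910 includes a microphone (MIC), which is configured to receive external audio signals when the UE900 is in an operating mode such as a call mode, a recording mode, or a speech recognition mode. The received audio signals may be further stored in the memory 904 or sent via the communication component 916. In some embodiments, the audio component 910 further includes a speaker for outputting audio signals.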
I/O接口912为处理组件902和外围接口模块之间提供接口,上述外围接口模块可以是键盘,点击轮,按钮等。这些按钮可包括但不限于:主页按钮、音量按钮、启动按钮和锁定按钮。
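The sensor component 913 includes at least one sensor for providing status assessments of various aspects of the UE900. For example, the sensor component 913 may detect the open/closed state of the UE900 and the relative positioning of components, such as the display and keypad of the UE900; the sensor component 913 may also detect a change in position of the UE900 or of a component of the UE900, the presence or absence of user contact with the UE900, the orientation or acceleration/deceleration of the UE900, and a change in temperature of the UE900. The sensor component 913 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 913 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 913 may further include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.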
通信组件916被配置为便于UE900和其他设备之间有线或无线方式的通信。UE900可以接入基于通信标准的无线网络,如WiFi,2G或3G,或它们的组合。在一个示例性实施例中,通信组件916经由广播信道接收来自外部广播管理系统的广播信号或广播相关信息。在一个示例性实施例中,所述通信组件916还包括近场通信(NFC)模块,以促进短程通信。例如,在NFC模块可基于射频识别(RFID)技术,红外数据协会(IrDA)技术,超宽带(UWB)技术,蓝牙(BT)技术和其他技术来实现。
在示例性实施例中,UE900可以被至少一个应用专用集成电路(ASIC)、数字信号处理器(DSP)、数字信号处理设备(DSPD)、可编程逻辑器件(PLD)、现场可编程门阵列(FPGA)、控制器、微控制器、微处理器或其他电子元件实现,用于执行上述方法。
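Fig. 10 is a block diagram of a network-side device 1000 provided by an embodiment of the present disclosure. For example, the network-side device 1000 may be provided as a network-side device. Referring to Fig. 10, the network-side device 1000 includes a processing component 1022, which further includes at least one processor, and memory resources represented by a memory 1032 for storing instructions executable by the processing component 1022, such as application programs. The application programs stored in the memory 1032 may include one or more modules, each corresponding to a set of instructions. In addition, the processing component 1022 is configured to execute the instructions so as to perform any of the foregoing methods applied to the network-side device, for example, the method shown in Fig. 1a.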
网络侧设备1000还可以包括一个电源组件1026被配置为执行网络侧设备1000的电源管理,一个有线或无线网络接口1050被配置为将网络侧设备1000连接到网络,和一个输入输出(I/O)接口1058。网络侧设备1000可以操作基于存储在存储器1032的操作系统,例如Windows Server TM,Mac OS XTM,Unix TM,Linux TM,Free BSDTM或类似。
上述本公开提供的实施例中,分别从网络侧设备、UE的角度对本公开实施例提供的方法进行了介绍。为了实现上述本公开实施例提供的方法中的各功能,网络侧设备和UE可以包括硬件结构、软件模块,以硬件结构、软件模块、或硬件结构加软件模块的形式来实现上述各功能。上述各功能中的某个功能可以以硬件结构、软件模块、或者硬件结构加软件模块的方式来执行。
本公开实施例提供的一种通信装置。通信装置可包括收发模块和处理模块。收发模块可包括发送模块和/或接收模块,发送模块用于实现发送功能,接收模块用于实现接收功能,收发模块可以实现发送功能和/或接收功能。
通信装置可以是终端设备(如前述方法实施例中的终端设备),也可以是终端设备中的装置,还可以是能够与终端设备匹配使用的装置。或者,通信装置可以是网络设备,也可以是网络设备中的装置,还可以是能够与网络设备匹配使用的装置。
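The embodiments of the present disclosure provide another communication apparatus. The communication apparatus may be a network device or a terminal device (such as the terminal device in the foregoing method embodiments); it may also be a chip, a chip system, or a processor that supports a network device in implementing the above methods, or a chip, a chip system, or a processor that supports a terminal device in implementing the above methods. The apparatus may be used to implement the methods described in the above method embodiments; for details, refer to the description in those embodiments.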
通信装置可以包括一个或多个处理器。处理器可以是通用处理器或者专用处理器等。例如可以是基带处理器或中央处理器。基带处理器可以用于对通信协议以及通信数据进行处理,中央处理器可以用于对通信装置(如,网络侧设备、基带芯片,终端设备、终端设备芯片,DU或CU等)进行控制,执行计算机程序,处理计算机程序的数据。
可选的,通信装置中还可以包括一个或多个存储器,其上可以存有计算机程序,处理器执行所述计算机程序,以使得通信装置执行上述方法实施例中描述的方法。可选的,所述存储器中还可以存储有数据。通信装置和存储器可以单独设置,也可以集成在一起。
可选的,通信装置还可以包括收发器、天线。收发器可以称为收发单元、收发机、或收发电路等,用于实现收发功能。收发器可以包括接收器和发送器,接收器可以称为接收机或接收电路等,用于实现接收功能;发送器可以称为发送机或发送电路等,用于实现发送功能。
可选的,通信装置中还可以包括一个或多个接口电路。接口电路用于接收代码指令并传输至处理器。处理器运行所述代码指令以使通信装置执行上述方法实施例中描述的方法。
通信装置为终端设备(如前述方法实施例中的终端设备):处理器用于执行图1-图4a任一所示的方法。
通信装置为网络设备:收发器用于执行图5-图7任一所示的方法。
在一种实现方式中,处理器中可以包括用于实现接收和发送功能的收发器。例如该收发器可以是收发电路,或者是接口,或者是接口电路。用于实现接收和发送功能的收发电路、接口或接口电路可以是分开的,也可以集成在一起。上述收发电路、接口或接口电路可以用于代码/数据的读写,或者,上述收发电路、接口或接口电路可以用于信号的传输或传递。
在一种实现方式中,处理器可以存有计算机程序,计算机程序在处理器上运行,可使得通信装置执行上述方法实施例中描述的方法。计算机程序可能固化在处理器中,该种情况下,处理器可能由硬件实现。
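In one implementation, the communication apparatus may include a circuit that can implement the sending, receiving, or communication functions of the foregoing method embodiments. The processor and transceiver described in the present disclosure may be implemented on an integrated circuit (IC), an analog IC, a radio frequency integrated circuit (RFIC), a mixed-signal IC, an application specific integrated circuit (ASIC), a printed circuit board (PCB), an electronic device, and the like. The processor and transceiver may also be manufactured using various IC process technologies, such as complementary metal oxide semiconductor (CMOS), N-type metal oxide semiconductor (NMOS), P-type metal oxide semiconductor (PMOS), bipolar junction transistor (BJT), bipolar CMOS (BiCMOS), silicon germanium (SiGe), gallium arsenide (GaAs), and the like.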
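The communication apparatus described in the above embodiments may be a network device or a terminal device (such as the terminal device in the foregoing method embodiments), but the scope of the communication apparatus described in the present disclosure is not limited thereto, and the structure of the communication apparatus is likewise not limited. The communication apparatus may be a standalone device or part of a larger device. For example, the communication apparatus may be: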
(1)独立的集成电路IC,或芯片,或,芯片系统或子系统;
(2)具有一个或多个IC的集合,可选的,该IC集合也可以包括用于存储数据,计算机程序的存储部件;
(3)ASIC,例如调制解调器(Modem);
(4)可嵌入在其他设备内的模块;
(5)接收机、终端设备、智能终端设备、蜂窝电话、无线设备、手持机、移动单元、车载设备、网络设备、云设备、人工智能设备等等;
(6)其他等等。
对于通信装置可以是芯片或芯片系统的情况,芯片包括处理器和接口。其中,处理器的数量可以是一个或多个,接口的数量可以是多个。
可选的,芯片还包括存储器,存储器用于存储必要的计算机程序和数据。
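Those skilled in the art will further appreciate that the various illustrative logical blocks and steps listed in the embodiments of the present disclosure may be implemented by electronic hardware, computer software, or a combination of both. Whether such functions are implemented in hardware or software depends on the particular application and the design requirements of the overall system. Those skilled in the art may use various methods to implement the described functions for each particular application, but such implementations should not be understood as going beyond the scope of protection of the embodiments of the present disclosure.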
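The embodiments of the present disclosure further provide a stereo audio signal processing system, which includes the communication apparatus serving as a terminal device (such as the terminal device in the foregoing method embodiments) and the communication apparatus serving as a network device described in the foregoing embodiments.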
本公开还提供一种可读存储介质,其上存储有指令,该指令被计算机执行时实现上述任一方法实施例的功能。
本公开还提供一种计算机程序产品,该计算机程序产品被计算机执行时实现上述任一方法实施例的功能。
在上述实施例中,可以全部或部分地通过软件、硬件、固件或者其任意组合来实现。当使用软件实现时,可以全部或部分地以计算机程序产品的形式实现。所述计算机程序产品包括一个或多个计算机程序。在计算机上加载和执行所述计算机程序时,全部或部分地产生按照本公开实施例所述的流程或功能。所述计算机可以是通用计算机、专用计算机、计算机网络、或者其他可编程装置。所述计算机程序可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一个计算机可读存储介质传输,例如,所述计算机程序可以从一个网站站点、计算机、服务器或数据中心通过有线(例如同轴电缆、光纤、数字用户线(digital subscriber line,DSL))或无线(例如红外、无线、微波等)方式向另一个网站站点、计算机、服务器或数据中心进行传输。所述计算机可读存储介质可以是计算机能够存取的任何可用介质或者是包含一个或多个可用介质集成的服务器、数据中心等数据存储设备。所述可用介质可以是磁性介质(例如,软盘、硬盘、磁带)、光介质(例如,高密度数字视频光盘(digital video disc,DVD))、或者半导体介质(例如,固态硬盘(solid state disk,SSD))等。
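Those of ordinary skill in the art will understand that numerical designations such as "first" and "second" in the present disclosure are distinctions made only for convenience of description; they are not intended to limit the scope of the embodiments of the present disclosure, nor do they indicate any order.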
本公开中的至少一个还可以描述为一个或多个,多个可以是两个、三个、四个或者更多个,本公开不做限制。在本公开实施例中,对于一种技术特征,通过“第一”、“第二”、“第三”、“A”、“B”、“C”和“D”等区分该种技术特征中的技术特征,该“第一”、“第二”、“第三”、“A”、“B”、“C”和“D”描述的技术特征间无先后顺序或者大小顺序。
本领域技术人员在考虑说明书及实践这里公开的发明后,将容易想到本发明的其它实施方案。本公开旨在涵盖本发明的任何变型、用途或者适应性变化,这些变型、用途或者适应性变化遵循本发明的一般性原理并包括本公开未公开的本技术领域中的公知常识或惯用技术手段。说明书和实施例仅被视为示例性的,本公开的真正范围和精神由下面的权利要求指出。
应当理解的是,本公开并不局限于上面已经描述并在附图中示出的精确结构,并且可以在不脱离其范围进行各种修改和改变。本公开的范围仅由所附的权利要求来限制。

Claims (22)

  1. 一种立体声音频信号处理方法,其特征在于,应用于编码设备,包括:
    确定立体声音频信号当前帧左声道信号和右声道信号的第一互相关系数;
    响应于所述第一互相关系数小于第一阈值,采用第一去相关处理方式对所述立体声音频信号当前帧进行去相关处理获得去相关处理后两声道信号,计算所述去相关处理后两声道信号的第二互相关系数,响应于所述第一互相关系数小于所述第二互相关系数,确定标志位为第一值,以及,基于所述去相关处理后两声道信号得到编码码流并将所述标志位写入所述编码码流,所述第一阈值的取值范围为(-1,0)。
  2. 如权利要求1所述的方法,其特征在于,所述第一去相关处理方式包括第一和差下混处理。
  3. 如权利要求2所述的方法,其特征在于,所述第一和差下混处理包括:
    基于公式一对所述左声道信号和右声道信号进行处理以得到主声道信号和次声道信号;所述公式一为:
    Figure PCTCN2021133722-appb-100001
    其中,Mid(n)为主声道信号,Sid(n)为次声道信号,L(n)为左声道信号,R(n)为右声道信号。
  4. 如权利要求1所述的方法,其特征在于,所述方法还包括:
    响应于所述第一互相关系数大于第二阈值,采用第二去相关处理方式对所述立体声音频信号当前帧进行去相关处理获得去相关处理后两声道信号,计算所述去相关处理后两声道信号的第三互相关系数,响应于所述第一互相关系数大于所述第三互相关系数,确定标志位为第二值,以及,基于所述去相关处理后两声道信号得到编码码流并将所述标志位写入所述编码码流,所述第二阈值的取值范围(0,1)。
  5. 如权利要求4所述的方法,其特征在于,所述第二去相关处理方式包括第二和差下混处理。
  6. 如权利要求5所述的方法,其特征在于,所述第二和差下混处理包括:
    基于公式二对所述左声道信号和右声道信号进行处理以得到主声道信号;所述公式二为:
    Figure PCTCN2021133722-appb-100002
    其中,Mid(n)为主声道信号,Sid(n)为次声道信号,L(n)为左声道信号,R(n)为右声道信号。
  7. 如权利要求4所述的方法,其特征在于,所述方法还包括:
    响应于所述第一互相关系数大于等于所述第一阈值且小于等于所述第二阈值,或者所述第一互相关系数小于所述第一阈值且所述第一互相关系数大于等于所述第二互相关系数,或者所述第一互相关系数大于所述第二阈值且所述第一互相关系数小于等于所述第三互相关系数,直接将所述立体声音频信号当前帧确定为去相关处理后两声道信号,确定标志位为第三值,以及,基于所述去相关处理后两声道信号得到编码码流并将所述标志位写入所述编码码流。
  8. 如权利要求1所述的方法,其特征在于,所述确定立体声音频信号当前帧左声道信号和右声道信号的第一互相关系数,包括:
    基于公式三确定所述左声道信号和右声道信号的第一互相关系数;所述公式三为:
    Figure PCTCN2021133722-appb-100003
    其中,η (LR)为当前帧左声道信号和右声道信号的互相关系数,L(n)为当前帧左声道信号第n个样点,
    Figure PCTCN2021133722-appb-100004
    为当前帧左声道信号所有样点的平均值,R(n)为当前帧右声道信号第n个样点,
    Figure PCTCN2021133722-appb-100005
    为当前帧右声道信号所有样点的平均值,N为当前帧左声道信号或者右声道信号样点总数,即为当前帧帧长。
  9. 如权利要求4所述的方法,其特征在于,所述去相关处理后两声道信号包括主声道信号和次声道信号;
    计算所述去相关处理后两声道信号的第二互相关系数和第三互相关系数,包括:
    基于公式四确定第二互相关系数和第三互相关系数;所述公式四为:
    Figure PCTCN2021133722-appb-100006
    其中,η (MS)为第二互相关系数或第三互相关系数,Mid(n)为去相关处理后两声道信号中主声道信号第n个样点,
    Figure PCTCN2021133722-appb-100007
    为去相关处理后两声道信号中主声道信号所有样点的平均值,Sid(n)为去相关处理后两声道信号中次声道信号第n个样点,
    Figure PCTCN2021133722-appb-100008
    为去相关处理后两声道信号中次声道信号所有样点的平均值,N为当前帧左声道信号或者右声道信号样点总数,即为当前帧帧长。
  10. 一种立体声音频信号处理方法,其特征在于,应用于解码设备,包括:
    获取编码设备发送的编码码流;
    基于所述编码码流确定去相关处理后两声道信号和标志位;
    响应于所述标志位为第一值,采用第一去相关重建方式对所述去相关处理后两声道信号进行去相关重建,并输出去相关重建后的音频信号。
  11. 如权利要求10所述的方法,其特征在于,所述第一去相关重建方式包括:
    基于公式五对所述去相关处理后两声道信号进行去相关重建;所述公式五为:
    Figure PCTCN2021133722-appb-100009
    其中,Mid(n)为去相关处理后两声道信号中的主声道信号,Sid(n)为去相关处理后两声道信号中的次声道信号,L(n)为左声道信号,R(n)为右声道信号。
  12. 如权利要求10所述的方法,其特征在于,所述方法还包括:
    响应于所述标志位为第二值,采用第二去相关重建方式对所述去相关处理后两声道信号进行去相关重建,并输出去相关重建后的音频信号。
  13. 如权利要求12所述的方法,其特征在于,所述第二去相关重建方式包括:
    基于公式六对所述去相关处理后两声道信号进行去相关重建;所述公式六为:
    Figure PCTCN2021133722-appb-100010
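    where Mid(n) is the primary-channel signal of the decorrelated two-channel signal, Sid(n) is the secondary-channel signal of the decorrelated two-channel signal, L(n) is the left-channel signal, and R(n) is the right-channel signal.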
  14. 如权利要求10所述的方法,其特征在于,所述方法还包括:
    响应于所述标志位为第三值,直接将所述去相关处理后两声道信号确定为去相关重建后的音频信号。
  15. 一种立体声音频信号处理装置,其特征在于,包括:
    确定模块,用于确定立体声音频信号当前帧左声道信号和右声道信号的第一互相关系数;
    处理模块,用于响应于所述第一互相关系数小于第一阈值,采用第一去相关处理方式对所述立体声音频信号当前帧进行去相关处理获得去相关处理后两声道信号,计算所述去相关处理后两声道信号的第二互相关系数,响应于所述第一互相关系数小于所述第二互相关系数,确定标志位为第一值,以及,基于所述去相关处理后两声道信号得到编码码流并将所述标志位写入所述编码码流,所述第一阈值的取值范围为(-1,0)。
  16. 一种立体声音频信号处理装置,其特征在于,包括:
    获取模块,用于获取编码设备发送的编码码流;
    确定模块,用于基于所述编码码流确定去相关处理后两声道信号和标志位;
    处理模块,用于响应于所述标志位为第一值,采用第一去相关重建方式对所述去相关处理后两声道信号进行去相关重建,并输出去相关重建后的音频信号。
  17. 一种通信装置,其特征在于,所述装置包括处理器和存储器,所述存储器中存储有计算机程序,所述处理器执行所述存储器中存储的计算机程序,以使所述装置执行如权利要求1至9中任一项所述的方法。
  18. 一种通信装置,其特征在于,所述装置包括处理器和存储器,所述存储器中存储有计算机程序,所述处理器执行所述存储器中存储的计算机程序,以使所述装置执行如权利要求10至14中任一项所述的方法。
  19. 一种通信装置,其特征在于,包括:处理器和接口电路;
    所述接口电路,用于接收代码指令并传输至所述处理器;
    所述处理器,用于运行所述代码指令以执行如权利要求1至9中任一项所述的方法。
  20. 一种通信装置,其特征在于,包括:处理器和接口电路;
    所述接口电路,用于接收代码指令并传输至所述处理器;
    所述处理器,用于运行所述代码指令以执行如权利要求10至14任一所述的方法。
  21. 一种计算机可读存储介质,用于存储有指令,响应于所述指令被执行时,使如权利要求1至9中任一项所述的方法被实现。
  22. 一种计算机可读存储介质,用于存储有指令,响应于所述指令被执行时,使如权利要求10至14中任一项所述的方法被实现。
PCT/CN2021/133722 2021-11-26 2021-11-26 一种立体声音频信号处理方法、装置、编码设备、解码设备及存储介质 WO2023092505A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2021/133722 WO2023092505A1 (zh) 2021-11-26 2021-11-26 一种立体声音频信号处理方法、装置、编码设备、解码设备及存储介质
CN202180004116.0A CN114258568A (zh) 2021-11-26 2021-11-26 一种立体声音频信号处理方法、装置、编码设备、解码设备及存储介质

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/133722 WO2023092505A1 (zh) 2021-11-26 2021-11-26 一种立体声音频信号处理方法、装置、编码设备、解码设备及存储介质

Publications (1)

Publication Number Publication Date
WO2023092505A1 true WO2023092505A1 (zh) 2023-06-01

Family

ID=80796643

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/133722 WO2023092505A1 (zh) 2021-11-26 2021-11-26 一种立体声音频信号处理方法、装置、编码设备、解码设备及存储介质

Country Status (2)

Country Link
CN (1) CN114258568A (zh)
WO (1) WO2023092505A1 (zh)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080226085A1 (en) * 2007-03-12 2008-09-18 Noriyuki Takashima Audio Apparatus
CN102368385A (zh) * 2011-09-07 2012-03-07 中科开元信息技术(北京)有限公司 后向块自适应Golomb-Rice编解码方法及装置
CN110495105A (zh) * 2017-04-12 2019-11-22 华为技术有限公司 多声道信号的编解码方法和编解码器

Also Published As

Publication number Publication date
CN114258568A (zh) 2022-03-29


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21965238

Country of ref document: EP

Kind code of ref document: A1