WO2019037714A1 - Encoding method and encoding device for a stereo signal (立体声信号的编码方法和编码装置) - Google Patents

Encoding method and encoding device for a stereo signal (立体声信号的编码方法和编码装置)

Info

Publication number
WO2019037714A1
WO2019037714A1 (PCT/CN2018/101524, CN2018101524W)
Authority
WO
WIPO (PCT)
Prior art keywords
window
current frame
attenuation
linear prediction
prediction analysis
Prior art date
Application number
PCT/CN2018/101524
Other languages
English (en)
French (fr)
Inventor
Eyal Shlomot (苏谟特艾雅)
Jonathan Alastair Gibbs (吉布斯乔纳森·阿拉斯泰尔)
Haiting Li (李海婷)
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority to ES18848208T priority Critical patent/ES2873880T3/es
Priority to EP21160112.5A priority patent/EP3901949B1/en
Priority to KR1020207008343A priority patent/KR102380642B1/ko
Priority to KR1020227010056A priority patent/KR102486258B1/ko
Priority to EP18848208.7A priority patent/EP3664089B1/en
Publication of WO2019037714A1 publication Critical patent/WO2019037714A1/zh
Priority to US16/797,484 priority patent/US11244691B2/en
Priority to US17/552,682 priority patent/US11636863B2/en

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 - Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/45 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of analysis window
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 - Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 - Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques

Definitions

  • the present application relates to the field of audio signal coding and decoding technologies, and more particularly to an encoding method and an encoding device for a stereo signal.
  • the time-domain downmix processing is performed on the signal after the delay alignment processing to obtain the main channel signal and the secondary channel signal;
  • the inter-channel time difference, the time domain downmix processing parameters, the main channel signal, and the secondary channel signal are encoded to obtain an encoded code stream.
  • the channel whose signal lags in time may be selected from the left channel and the right channel of the stereo signal as the target channel according to the inter-channel time difference, while the other channel serves as the reference channel for the delay alignment processing of the target channel.
  • delay alignment processing is then performed on the signal of the target channel, so that there is no inter-channel time difference between the delay-aligned target channel signal and the reference channel signal.
  • the delay alignment processing also includes artificially reconstructing the forward signal of the target channel.
  • since a part of the signal of the target channel is artificially determined (including the transition segment signal and the forward signal), this artificially determined part may differ considerably from the real signal; as a result, when the mono coding algorithm performs linear prediction analysis on the primary channel signal and the secondary channel signal determined from the delay-aligned stereo signal, the obtained linear prediction coefficients may differ from the true linear prediction coefficients, thereby affecting the coding quality.
  • the application provides a coding method and an encoding device for a stereo signal to improve the accuracy of linear prediction in the encoding process.
  • the stereo signal in the present application may be an original stereo signal, a stereo signal composed of two signals included in a multi-channel signal, or a stereo signal composed of two signals that are jointly generated from multiple signals included in a multi-channel signal.
  • the encoding method of the stereo signal in the present application may be a coding method of a stereo signal used in the multi-channel encoding method.
  • a method for encoding a stereo signal is provided, comprising: determining a window length of an attenuation window of the current frame according to an inter-channel time difference of the current frame; determining a modified linear prediction analysis window according to the window length of the attenuation window of the current frame, wherein the value of at least a part of the points from the (L-sub_window_len)-th point to the (L-1)-th point of the modified linear prediction analysis window is smaller than the value of the corresponding point from the (L-sub_window_len)-th point to the (L-1)-th point of the initial linear prediction analysis window, sub_window_len is the window length of the attenuation window of the current frame, L is the window length of the modified linear prediction analysis window, and the window length of the modified linear prediction analysis window is equal to the window length of the initial linear prediction analysis window; and performing linear prediction analysis on the channel signal to be processed according to the modified linear prediction analysis window.
  • that is, the value of at least a part of the points from the (L-sub_window_len)-th point to the (L-1)-th point in the modified linear prediction analysis window is smaller than the value of the corresponding point from the (L-sub_window_len)-th point to the (L-1)-th point of the initial linear prediction analysis window.
  • optionally, the value of any point from the (L-sub_window_len)-th point to the (L-1)-th point of the modified linear prediction analysis window is smaller than the value of the corresponding point from the (L-sub_window_len)-th point to the (L-1)-th point of the initial linear prediction analysis window.
  • optionally, the determining a window length of the attenuation window of the current frame according to the inter-channel time difference of the current frame includes: determining the window length of the attenuation window of the current frame according to the inter-channel time difference of the current frame and a length of a preset transition segment.
  • optionally, the determining the window length of the attenuation window of the current frame according to the inter-channel time difference of the current frame and the length of the preset transition segment includes: determining a sum of an absolute value of the inter-channel time difference of the current frame and the length of the preset transition segment as the window length of the attenuation window of the current frame.
  • optionally, the determining the window length of the attenuation window of the current frame according to the inter-channel time difference of the current frame and the length of the preset transition segment includes: when the absolute value of the inter-channel time difference of the current frame is greater than or equal to the length of the preset transition segment, determining the sum of the absolute value of the inter-channel time difference of the current frame and the length of the preset transition segment as the window length of the attenuation window of the current frame; when the absolute value of the inter-channel time difference of the current frame is less than the length of the preset transition segment, determining N times the absolute value of the inter-channel time difference of the current frame as the window length of the attenuation window of the current frame, where N is a preset real number greater than 0 and less than L/MAX_DELAY, and MAX_DELAY is a preset real number greater than 0.
  • the above MAX_DELAY is the maximum value of the absolute value of the inter-channel time difference; it may be the maximum absolute value of the inter-channel time difference preset for encoding and decoding the stereo signal.
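  • as an illustrative sketch only (the symbol names below are taken from the description above; the exact implementation in the patent is not reproduced here), the window-length rule can be expressed as follows:
```python
def attenuation_window_length(cur_itd, ts2, n):
    """Sketch of the window-length rule described above (illustrative only).

    cur_itd: inter-channel time difference of the current frame, in samples
    ts2:     length of the preset transition segment
    n:       preset real number with 0 < n < L / MAX_DELAY
    """
    abs_itd = abs(cur_itd)
    if abs_itd >= ts2:
        # absolute time difference is at least the transition-segment length
        return abs_itd + ts2
    # absolute time difference is shorter than the transition segment
    return n * abs_itd
```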
  • optionally, the determining a modified linear prediction analysis window according to the window length of the attenuation window of the current frame includes: correcting the initial linear prediction analysis window according to the window length of the attenuation window of the current frame, wherein the attenuation value of each point from the (L-sub_window_len)-th point to the (L-1)-th point of the modified linear prediction analysis window, relative to the corresponding point from the (L-sub_window_len)-th point to the (L-1)-th point of the initial linear prediction analysis window, gradually increases.
  • the attenuation value here is the amount by which the value of a point in the modified linear prediction analysis window is reduced relative to the value of the corresponding point in the initial linear prediction analysis window.
  • for example, if the first point is any point from the (L-sub_window_len)-th point to the (L-1)-th point in the modified linear prediction analysis window and the second point is the corresponding point in the initial linear prediction analysis window, then the attenuation value is the amount by which the value of the first point is reduced relative to the value of the second point.
  • the modified linear predictive analysis window satisfies a formula:
  • w_adp(i) is the modified linear prediction analysis window
  • w(i) is the initial linear prediction analysis window
  • MAX_ATTEN is a preset real number greater than 0.
  • MAX_ATTEN may be the largest attenuation value among a plurality of attenuation values preset when the channel signal is encoded and decoded.
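  • the formula itself appears only as an image in the source text. A minimal sketch of one window modification consistent with the description (an attenuation that grows gradually from about 0 at point L-sub_window_len up to MAX_ATTEN at point L-1; the linear shape is an assumption) is:
```python
import numpy as np

def modify_lp_analysis_window(w, sub_window_len, max_atten):
    """Sketch only: attenuate the tail of the initial linear prediction analysis window.

    Assumes a linear ramp; the patent's exact formula is not reproduced in the text.
    """
    w = np.asarray(w, dtype=float)
    L = len(w)
    w_adp = w.copy()
    for i in range(L - sub_window_len, L):
        # attenuation grows gradually and reaches max_atten at the last point
        atten = (i - (L - sub_window_len) + 1) / sub_window_len * max_atten
        w_adp[i] = w[i] - atten
    return w_adp
```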
  • optionally, the determining a modified linear prediction analysis window according to the window length of the attenuation window of the current frame includes: determining the attenuation window of the current frame according to the window length of the attenuation window of the current frame; and correcting the initial linear prediction analysis window according to the attenuation window of the current frame, wherein the attenuation value of each point from the (L-sub_window_len)-th point to the (L-1)-th point of the modified linear prediction analysis window, relative to the corresponding point of the initial linear prediction analysis window, gradually increases.
  • optionally, the determining the attenuation window of the current frame according to the window length of the attenuation window of the current frame includes: determining the attenuation window of the current frame from a plurality of pre-stored candidate attenuation windows according to the window length of the attenuation window of the current frame, wherein the candidate attenuation windows correspond to different window length value ranges, and there is no intersection between the different value ranges.
  • the computational complexity in determining the attenuation window can be reduced.
  • specifically, the attenuation windows corresponding to different value ranges of the window length may be calculated and stored in advance, so that after the window length of the attenuation window of the current frame is determined, the attenuation window of the current frame can be selected directly from the pre-stored attenuation windows according to the value range into which that window length falls, which reduces the amount of calculation and simplifies the computational complexity.
  • the pre-selected attenuation window lengths used when calculating these candidate attenuation windows may be all possible values of the window length of the attenuation window, or a subset of those values.
  • the attenuation window of the current frame satisfies a formula:
  • sub_window(i) is an attenuation window of the current frame
  • MAX_ATTEN is a preset real number greater than 0.
  • MAX_ATTEN may be the largest attenuation value among a plurality of attenuation values preset when the channel signal is encoded and decoded.
  • the modified linear predictive analysis window satisfies a formula:
  • w_adp(i) is the window function of the modified linear prediction analysis window
  • w(i) is the initial linear prediction analysis window
  • sub_window(.) is an attenuation window of the current frame.
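  • again the formulas appear only as images in the source. A sketch of the two-step variant described above (first build an attenuation window of length sub_window_len, then subtract it from the tail of the initial window; the linear shape of the attenuation window and the subtraction convention are assumptions consistent with the surrounding text) is:
```python
import numpy as np

def linear_attenuation_window(sub_window_len, max_atten):
    # assumed linear shape: values grow from ~0 up to max_atten over sub_window_len points
    i = np.arange(sub_window_len)
    return (i + 1) / sub_window_len * max_atten

def apply_attenuation_window(w, sub_window):
    """w_adp(i) = w(i) for i < L - sub_window_len,
    w_adp(i) = w(i) - sub_window(i - (L - sub_window_len)) otherwise (sketch)."""
    w = np.asarray(w, dtype=float)
    L = len(w)
    sub_window_len = len(sub_window)
    w_adp = w.copy()
    w_adp[L - sub_window_len:] -= sub_window
    return w_adp
```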
  • optionally, the determining a modified linear prediction analysis window according to the window length of the attenuation window of the current frame includes: determining the modified linear prediction analysis window from a plurality of pre-stored candidate linear prediction analysis windows according to the window length of the attenuation window of the current frame, wherein the candidate linear prediction analysis windows correspond to different window length value ranges, and there is no intersection between the different value ranges.
  • the computational complexity in determining the corrected linear prediction analysis window can be reduced.
  • specifically, the modified linear prediction analysis windows corresponding to different value ranges of the attenuation window length may be calculated and stored in advance, so that after the window length of the attenuation window of the current frame is determined, the modified linear prediction analysis window can be selected directly from the pre-stored candidate windows according to that window length, which reduces the amount of calculation and simplifies the computational complexity.
  • the pre-selected attenuation window lengths used when calculating these candidate modified linear prediction analysis windows may be all possible values of the window length of the attenuation window, or a subset of those values.
  • optionally, before the modified linear prediction analysis window is determined according to the window length of the attenuation window of the current frame, the method further includes: correcting the window length of the attenuation window of the current frame according to a preset interval step to obtain the window length of the modified attenuation window, wherein the interval step is a preset positive integer; and the determining the modified linear prediction analysis window according to the window length of the attenuation window of the current frame includes: determining the modified linear prediction analysis window according to the initial linear prediction analysis window and the window length of the modified attenuation window.
  • the interval step is a positive integer smaller than a maximum window length of the attenuation window.
  • in this way, the possible values of the window length of the modified attenuation window are limited to a finite set, which makes it convenient to store the attenuation windows corresponding to those values in advance and reduces the complexity of the subsequent calculation.
  • the window length of the modified attenuation window satisfies the formula:
  • sub_window_len_mod is the window length of the modified attenuation window
  • len_step is the interval step size
  • optionally, the determining the modified linear prediction analysis window according to the initial linear prediction analysis window and the window length of the modified attenuation window includes: correcting the initial linear prediction analysis window according to the window length of the modified attenuation window.
  • optionally, the determining the modified linear prediction analysis window according to the initial linear prediction analysis window and the window length of the modified attenuation window includes: determining the attenuation window of the current frame according to the window length of the modified attenuation window; and correcting the initial linear prediction analysis window of the current frame according to the determined attenuation window.
  • optionally, the determining the attenuation window of the current frame according to the window length of the modified attenuation window includes: determining the attenuation window of the current frame from a plurality of pre-stored candidate attenuation windows according to the window length of the modified attenuation window, wherein the pre-stored candidate attenuation windows are the attenuation windows corresponding to the different possible values of the window length of the modified attenuation window.
  • the attenuation windows corresponding to a set of pre-selected values of the window length of the modified attenuation window may be calculated and stored in advance, so that once the window length of the modified attenuation window is determined, the attenuation window of the current frame can be selected directly from the pre-stored candidate attenuation windows according to that window length, which reduces the amount of calculation and simplifies the computational complexity.
  • the pre-selected values here may be all possible values of the window length of the modified attenuation window, or a subset of those values.
  • optionally, the determining the modified linear prediction analysis window according to the initial linear prediction analysis window of the current frame and the window length of the modified attenuation window includes: determining the modified linear prediction analysis window from a plurality of pre-stored candidate linear prediction analysis windows according to the window length of the modified attenuation window, wherein the pre-stored candidate linear prediction analysis windows are the modified linear prediction analysis windows corresponding to the different possible values of the window length of the modified attenuation window.
  • after the modified linear prediction analysis windows corresponding to a set of pre-selected values of the window length of the modified attenuation window have been calculated from the initial linear prediction analysis window of the current frame, they may be stored, so that once the window length of the modified attenuation window is determined, the modified linear prediction analysis window can be selected directly from the pre-stored candidate windows according to that window length, which reduces the amount of calculation and simplifies the computational complexity.
  • the pre-selected values of the window length of the modified attenuation window here may be all possible values of that window length, or a subset of those values.
  • an encoding apparatus comprising means for performing the first aspect or various implementations thereof.
  • an encoding apparatus is provided, comprising a memory for storing a program and a processor for executing the program, where the processor, when executing the program, performs the method of the first aspect or any possible implementation of the first aspect.
  • a computer readable storage medium storing program code for device execution, the program code comprising instructions for performing the method of the first aspect or various implementations thereof .
  • a chip is provided, comprising a processor and a communication interface, where the communication interface is configured to communicate with an external device, and the processor is configured to perform the method of the first aspect or any possible implementation of the first aspect.
  • optionally, the chip may further include a memory storing instructions; the processor is configured to execute the instructions stored in the memory, and when the instructions are executed, the processor performs the method of the first aspect or any possible implementation of the first aspect.
  • the chip is integrated on a terminal device or a network device.
  • FIG. 1 is a schematic flow chart of a time domain stereo coding method
  • FIG. 2 is a schematic flow chart of a time domain stereo decoding method
  • FIG. 3 is a schematic flowchart of a method for encoding a stereo signal according to an embodiment of the present application
  • FIG. 4 is a frequency spectrum diagram of a difference between a linear prediction coefficient and a true linear prediction coefficient obtained by a method of encoding a stereo signal according to an embodiment of the present application;
  • FIG. 5 is a schematic flowchart of a method for encoding a stereo signal according to an embodiment of the present application.
  • FIG. 6 is a schematic diagram of a delay alignment process according to an embodiment of the present application.
  • FIG. 7 is a schematic diagram of a delay alignment process according to an embodiment of the present application.
  • FIG. 8 is a schematic diagram of a delay alignment process according to an embodiment of the present application.
  • FIG. 9 is a schematic flowchart of a linear prediction analysis process of an embodiment of the present application.
  • FIG. 10 is a schematic flowchart of a linear prediction analysis process according to an embodiment of the present application.
  • FIG. 11 is a schematic block diagram of an encoding apparatus according to an embodiment of the present application.
  • FIG. 12 is a schematic block diagram of an encoding apparatus according to an embodiment of the present application.
  • FIG. 13 is a schematic diagram of a terminal device according to an embodiment of the present application.
  • FIG. 14 is a schematic diagram of a network device according to an embodiment of the present application.
  • FIG. 15 is a schematic diagram of a network device according to an embodiment of the present application.
  • FIG. 16 is a schematic diagram of a terminal device according to an embodiment of the present application.
  • FIG. 17 is a schematic diagram of a network device according to an embodiment of the present application.
  • FIG. 18 is a schematic diagram of a network device according to an embodiment of the present application.
  • the encoding method 100 specifically includes:
  • the encoder end estimates the inter-channel time difference of the stereo signal, and obtains the inter-channel time difference of the stereo signal.
  • the stereo signal includes a left channel signal and a right channel signal
  • the inter-channel time difference of the stereo signal refers to a time difference between the left channel signal and the right channel signal.
  • the primary channel signal and the secondary channel signal obtained after the downmix processing are separately encoded to obtain the code streams of the primary channel signal and the secondary channel signal, which are written into the stereo encoded code stream.
  • the decoding method 200 specifically includes:
  • the code stream in step 210 may be received by the decoding end from the encoding end.
  • primary channel signal decoding and secondary channel signal decoding are then performed on the code stream obtained in step 210, respectively, to obtain the primary channel signal and the secondary channel signal.
  • the present application proposes a new stereo coding method, which corrects the initial linear prediction analysis window so that, at the points corresponding to the artificially reconstructed forward signal of the target channel of the current frame, the values of the modified linear prediction analysis window are smaller than the values of the corresponding points of the uncorrected linear prediction analysis window.
  • this reduces the weight of the artificially reconstructed forward signal of the target channel of the current frame during linear prediction, thereby reducing the influence of the error between the artificially reconstructed forward signal and the true forward signal on the accuracy of the linear prediction analysis result; the difference between the linear prediction coefficients obtained by linear prediction analysis and the true linear prediction coefficients is therefore reduced, and the accuracy of the linear prediction analysis is improved.
  • FIG. 3 is a schematic flowchart of an encoding method in an embodiment of the present application.
  • the method 300 can be performed by an encoding end, which can be an encoder or a device having the function of encoding a stereo signal. It should be understood that method 300 may be part of the overall process of encoding the primary channel signal and the secondary channel signal obtained after the downmix processing in step 160 of method 100 above. Specifically, the method 300 may be a process of performing linear prediction on the primary channel signal or the secondary channel signal obtained after the downmix processing in the above step 160.
  • the method 300 above specifically includes:
  • specifically, the sum of the absolute value of the inter-channel time difference of the current frame and the length of the preset transition segment of the current frame may be directly determined as the window length of the attenuation window of the current frame.
  • the window length of the attenuation window of the current frame can be determined according to formula (1).
  • sub_window_len is the window length of the attenuation window
  • cur_itd is the inter-channel time difference of the current frame
  • abs(cur_itd) is the absolute value of the inter-channel time difference of the current frame
  • Ts2 is the length of the preset transition segment used to enhance the smoothness between the real forward signal of the current frame and the artificially reconstructed forward signal.
  • MAX_WIN_LEN = MAX_DELAY + Ts2 (2)
  • MAX_WIN_LEN is the maximum value of the window length of the attenuation window.
  • Ts2 in equation (2) is the same as that in equation (1).
  • MAX_DELAY is a preset real number greater than 0; further, MAX_DELAY may be the maximum value of the absolute value of the inter-channel time difference.
  • MAX_DELAY may be 40, and Ts2 may be 10.
  • in this case, the maximum value MAX_WIN_LEN of the window length of the attenuation window of the current frame is 50.
  • the window length of the attenuation window of the current frame may also be determined according to the magnitude relationship between the absolute value of the inter-channel time difference of the current frame and the length of the preset transition segment of the current frame.
  • the window length of the attenuation window of the current frame is the absolute value of the inter-channel time difference of the current frame.
  • the window length of the attenuation window of the current frame can be determined according to formula (3).
  • sub_window_len is the window length of the attenuation window
  • cur_itd is the inter-channel time difference of the current frame
  • abs(cur_itd) is the absolute value of the inter-channel time difference of the current frame
  • Ts2 is the length of the preset transition segment used to enhance the smoothness between the real forward signal of the current frame and the artificially reconstructed forward signal; N is a preset real number greater than 0 and less than L/MAX_DELAY; preferably, N is a preset integer greater than 0 and less than or equal to 2, for example, N is 2.
  • Ts2 is a preset positive integer.
  • for example, when the sampling rate is 16 kHz, Ts2 is 10.
  • when the sampling rate of the stereo signal is different, Ts2 may be set to the same value or to different values.
  • the maximum value of the window length of the attenuation window satisfies the formula (4) or the formula (5).
  • MAX_WIN_LEN = MAX_DELAY + Ts2 (4)
  • MAX_WIN_LEN = N*MAX_DELAY (5)
  • the sampling rate of the stereo signal is 16 kHz
  • MAX_DELAY can be 40
  • Ts2 can be 10
  • N can be 2.
  • in this case, the maximum value MAX_WIN_LEN of the window length of the attenuation window of the current frame is 50.
  • the sampling rate of the stereo signal is 16 kHz
  • MAX_DELAY can be 40
  • Ts2 can be 50
  • N can be 2.
  • in this case, the maximum value MAX_WIN_LEN of the window length of the attenuation window of the current frame is 80.
  • 320. Determine a modified linear prediction analysis window according to the window length of the attenuation window of the current frame, where the value of at least a part of the points from the (L-sub_window_len)-th point to the (L-1)-th point of the modified linear prediction analysis window is smaller than the value of the corresponding point from the (L-sub_window_len)-th point to the (L-1)-th point of the initial linear prediction analysis window, sub_window_len is the window length of the attenuation window of the current frame, L is the window length of the modified linear prediction analysis window, and the window length of the modified linear prediction analysis window is equal to the window length of the initial linear prediction analysis window.
  • optionally, the value of any point from the (L-sub_window_len)-th point to the (L-1)-th point of the modified linear prediction analysis window is smaller than the value of the corresponding point from the (L-sub_window_len)-th point to the (L-1)-th point of the initial linear prediction analysis window.
  • here, the corresponding point in the initial linear prediction analysis window of a point from the (L-sub_window_len)-th point to the (L-1)-th point of the modified linear prediction analysis window refers to the point in the initial linear prediction analysis window having the same index as that point.
  • for example, the corresponding point in the initial linear prediction analysis window of the (L-sub_window_len)-th point of the modified linear prediction analysis window is the (L-sub_window_len)-th point of the initial linear prediction analysis window.
  • determining the modified linear prediction analysis window according to the window length of the attenuation window of the current frame specifically: correcting the initial linear prediction analysis window according to the window length of the attenuation window of the current frame to obtain a modified linear prediction analysis window.
  • the value of the modified linear prediction analysis window from the L-sub_window_len point to the L-1 point is relative to the corresponding point in the L-sub_window_len point to the L-1 point of the initial linear prediction analysis window. The value of the attenuation value gradually increases.
  • the above attenuation value may be an attenuation value of the value of the point in the modified linear prediction analysis window relative to the value of the corresponding point in the linear prediction analysis window.
  • specifically, the attenuation value may be determined by the difference between the value of the (L-sub_window_len)-th point in the initial linear prediction analysis window and the value of the (L-sub_window_len)-th point in the modified linear prediction analysis window.
  • the first point is any point from the L-sub_window_len point to the L-1 point in the modified linear prediction attenuation window
  • the second point is the corresponding point in the linear prediction analysis window corresponding to the first point.
  • the above attenuation value may be a difference between the value of the first point and the value of the second point.
  • the initial linear prediction analysis window is corrected according to the window length of the attenuation window of the current frame in order to make the value of at least a part of the points from the (L-sub_window_len)-th point to the (L-1)-th point in the initial linear prediction analysis window smaller; that is, after the initial linear prediction analysis window is corrected to obtain the modified linear prediction analysis window, the value of at least a part of the points from the (L-sub_window_len)-th point to the (L-1)-th point of the modified linear prediction analysis window is smaller than the value of the corresponding point in the initial linear prediction analysis window.
  • the attenuation value corresponding to each point within the window length of the attenuation window, or the value of each point in the attenuation window, may or may not include 0; specifically, the values of the points in the attenuation window may all be real numbers less than or equal to 0, or all real numbers greater than or equal to 0.
  • when the initial linear prediction analysis window is corrected according to the window length of the attenuation window, the value of any point from the (L-sub_window_len)-th point to the (L-1)-th point in the initial linear prediction analysis window may be added to the value of the corresponding point in the attenuation window to obtain the value of the corresponding point in the modified linear prediction analysis window (for example, when the attenuation window values are less than or equal to 0).
  • alternatively, the value of the corresponding point in the attenuation window may be subtracted from the value of any point from the (L-sub_window_len)-th point to the (L-1)-th point in the initial linear prediction analysis window to obtain the value of the corresponding point in the modified linear prediction analysis window (for example, when the attenuation window values are greater than or equal to 0).
  • in either case, after the initial linear prediction analysis window is corrected, the value of any point from the (L-sub_window_len)-th point to the (L-1)-th point of the modified linear prediction analysis window is smaller than the value of the corresponding point from the (L-sub_window_len)-th point to the (L-1)-th point of the initial linear prediction analysis window.
  • any type of linear predictive analysis window can be selected as the initial linear predictive analysis window for the current frame.
  • the initial linear prediction analysis window of the current frame can be either a symmetric window or an asymmetric window.
  • the window length L of the initial linear prediction analysis window may be 320 points, and the initial linear prediction analysis window w(n) satisfies the formula (6):
  • the initial linear prediction analysis window can be determined in various ways.
  • the initial linear prediction analysis window can be obtained through real-time computation, or it can be obtained directly from pre-stored linear prediction analysis windows; these pre-stored linear prediction analysis windows can be computed in advance and stored in the form of a table.
  • the method of obtaining the linear prediction analysis window from the pre-stored linear prediction analysis window can quickly obtain the initial linear prediction analysis window, reduce the computational complexity, and improve the coding efficiency.
  • the above-described corrected linear prediction analysis window satisfies the formula (7), that is, the corrected linear prediction analysis window can be determined according to the formula (7).
  • sub_window_len is the window length of the attenuation window of the current frame
  • w_adp(i) is the modified linear prediction analysis window
  • w(i) is the initial linear prediction analysis window
  • L is the window length of the modified linear prediction analysis window, and MAX_ATTEN is a preset real number greater than 0.
  • MAX_ATTEN may specifically be the maximum attenuation value that can be obtained when the initial linear prediction analysis window is attenuated when the initial linear prediction analysis window is corrected.
  • the value of MAX_ATTEN may be 0.07, 0.08, etc.
  • MAX_ATTEN may be preset by a person skilled in the art based on experience.
  • determining the modified linear prediction analysis window according to the initial linear prediction analysis window and the window length of the attenuation window of the current frame specifically includes: determining the attenuation window of the current frame according to the window length of the attenuation window of the current frame; and correcting the initial linear prediction analysis window according to the attenuation window of the current frame, wherein the attenuation value of each point from the (L-sub_window_len)-th point to the (L-1)-th point of the modified linear prediction analysis window, relative to the corresponding point from the (L-sub_window_len)-th point to the (L-1)-th point of the initial linear prediction analysis window, gradually increases.
  • here, the gradual increase of the attenuation value means that as the index of the point from the (L-sub_window_len)-th point to the (L-1)-th point of the modified linear prediction analysis window increases, the attenuation value also increases; that is, the attenuation value of the (L-sub_window_len)-th point is the smallest, the attenuation value of the (L-1)-th point is the largest, and the attenuation value of the N-th point is larger than the attenuation value of the (N-1)-th point, where L-sub_window_len < N ≤ L-1.
  • the above attenuation window can be either a linear window or a non-linear window.
  • the attenuation window satisfies the formula (8) when the attenuation window is determined according to the window length of the attenuation window of the current frame, that is, the attenuation window can be determined according to the formula (8).
  • MAX_ATTEN is the maximum value of the attenuation value, and MAX_ATTEN has the same meaning in formula (8) as in equation (7).
  • the modified linear prediction analysis window obtained by correcting the initial linear prediction analysis window according to the attenuation window of the current frame satisfies formula (9); that is, after the attenuation window is determined according to formula (8) above, the modified linear prediction analysis window can be determined according to formula (9).
  • sub_window_len is the window length of the attenuation window of the current frame
  • sub_window(.) is the attenuation window of the current frame.
  • sub_window(i-(L-sub_window_len)) is the value of the attenuation window of the current frame at point i-(L-sub_window_len)
  • w_adp(i) is the modified linear prediction analysis window
  • w(i) is the initial linear prediction analysis window
  • L is the window length of the modified linear prediction analysis window.
  • optionally, the attenuation window of the current frame may be determined from a plurality of pre-stored candidate attenuation windows according to the window length of the attenuation window of the current frame, where the candidate attenuation windows correspond to different window length value ranges and there is no intersection between the different value ranges.
  • in this way, the computational complexity of determining the attenuation window can be reduced, and the attenuation window of the current frame obtained from the pre-stored attenuation windows can then be used directly to determine the modified linear prediction analysis window.
  • specifically, the attenuation windows corresponding to different value ranges of the window length may be calculated and stored in advance, so that after the window length of the attenuation window of the current frame is determined, the attenuation window of the current frame can be selected directly from the pre-stored attenuation windows according to the value range into which that window length falls, which reduces the amount of calculation and simplifies the computational complexity.
  • the pre-selected attenuation window lengths used when calculating these candidate attenuation windows may be all possible values of the window length of the attenuation window, or a subset of those values.
  • for example, when the window length of the attenuation window is 20, the corresponding attenuation window is denoted as sub_window_20(i); when the window length is 40, the corresponding attenuation window is denoted as sub_window_40(i); when the window length is 60, the corresponding attenuation window is denoted as sub_window_60(i); and when the window length is 80, the corresponding attenuation window is denoted as sub_window_80(i).
  • if the window length of the attenuation window of the current frame is less than 40, sub_window_20(i) may be determined as the attenuation window of the current frame; if the window length of the attenuation window of the current frame is greater than or equal to 40 and less than 60, sub_window_40(i) may be determined as the attenuation window of the current frame; if the window length of the attenuation window of the current frame is greater than or equal to 60 and less than 80, sub_window_60(i) may be determined as the attenuation window of the current frame; and if the window length of the attenuation window of the current frame is greater than or equal to 80, sub_window_80(i) may be determined as the attenuation window of the current frame.
  • that is, the attenuation window of the current frame may be determined directly from the plurality of pre-stored attenuation windows according to the value range into which the window length of the attenuation window of the current frame falls.
  • the attenuation window of the current frame can be determined according to formula (10).
  • sub_window(i) is the attenuation window of the current frame
  • sub_window_len is the window length of the attenuation window of the current frame
  • sub_window_20(i), sub_window_40(i), sub_window_60(i), and sub_window_80(i) are the pre-stored attenuation windows corresponding to attenuation window lengths of 20, 40, 60, and 80, respectively.
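  • as an illustrative sketch of the range-based selection expressed by formula (10) (the first branch, covering window lengths below 40, is inferred from the pattern of the other branches):
```python
def select_attenuation_window(sub_window_len, stored):
    """stored maps the pre-selected lengths 20, 40, 60, 80 to the pre-computed
    attenuation windows sub_window_20 ... sub_window_80 (illustrative sketch)."""
    if sub_window_len < 40:
        return stored[20]
    if sub_window_len < 60:
        return stored[40]
    if sub_window_len < 80:
        return stored[60]
    return stored[80]
```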
  • the attenuation window determined by the above formula (10) is a linear window.
  • the attenuation window in the present application may be a nonlinear window in addition to a linear window.
  • when the attenuation window is a nonlinear window, it can be determined according to any one of formula (11) to formula (13).
  • sub_window(i) is the attenuation window of the current frame
  • sub_window_len is the window length of the attenuation window of the current frame
  • MAX_ATTEN has the same meaning as above.
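  • formulas (11) to (13) are likewise not reproduced in the source text. Purely as an illustration of a nonlinear attenuation window that still rises gradually from about 0 to MAX_ATTEN (this specific quadratic shape is an assumption, not one of the patent's formulas):
```python
import numpy as np

def quadratic_attenuation_window(sub_window_len, max_atten):
    # illustrative nonlinear shape only; not taken from formulas (11)-(13)
    i = np.arange(sub_window_len)
    return ((i + 1) / sub_window_len) ** 2 * max_atten
```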
  • the corrected linear prediction analysis window can also be determined according to the formula (10).
  • the modified linear prediction analysis window obtained by correcting the initial linear prediction analysis window according to the attenuation window of the current frame satisfies formula (14); that is, after the attenuation window is determined according to formula (10) above, the modified linear prediction analysis window can be determined according to formula (14) to formula (17).
  • sub_window_len is the window length of the attenuation window of the current frame
  • w_adp(i) is the modified linear prediction analysis window
  • w(i) is the initial linear prediction analysis window
  • L is the window length of the modified linear prediction analysis window.
  • sub_window_20(.), sub_window_40(.), sub_window_60(.), and sub_window_80(.) are the pre-stored attenuation windows corresponding to attenuation window lengths of 20, 40, 60, and 80, respectively, which are calculated in advance according to any one of the above formulas and stored.
  • the modified linear prediction analysis window can then be determined according to the value range into which the window length of the attenuation window falls. For example, if the window length of the attenuation window of the current frame is 50, it lies between 40 and 60 (greater than or equal to 40 and less than 60), so the modified linear prediction analysis window can be determined according to formula (15); if the window length of the attenuation window of the current frame is 70, it lies between 60 and 80 (greater than or equal to 60 and less than 80), so the modified linear prediction analysis window can be determined according to formula (16).
  • the channel signal to be processed may be the primary channel signal or the secondary channel signal; further, the channel signal to be processed may also be the channel signal obtained after time-domain preprocessing of the primary channel signal or the secondary channel signal.
  • the primary channel signal and the secondary channel signal may be channel signals obtained after downmix processing.
  • performing linear prediction analysis on the channel signal to be processed according to the modified linear prediction analysis window specifically means windowing the channel signal to be processed with the modified linear prediction analysis window and then calculating the linear prediction coefficients of the current frame from the windowed signal; specifically, the Levinson-Durbin algorithm can be used.
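  • a minimal sketch of this analysis step (windowing followed by the standard Levinson-Durbin recursion; the prediction order and variable names are illustrative):
```python
import numpy as np

def lp_analysis(x, w_adp, order=16):
    """Window the channel signal with the modified analysis window, then derive
    linear prediction coefficients with the Levinson-Durbin recursion (sketch)."""
    xw = np.asarray(x, dtype=float) * np.asarray(w_adp, dtype=float)
    n = len(xw)
    # autocorrelation of the windowed signal
    r = np.array([np.dot(xw[:n - k], xw[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        k = -acc / err                      # reflection coefficient
        new_a = a.copy()
        for i in range(1, m):
            new_a[i] = a[i] + k * a[m - i]
        new_a[m] = k
        a = new_a
        err *= (1.0 - k * k)
    return a                                # [1, a1, ..., a_order]
```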
  • in the present application, the value of at least a part of the points from the (L-sub_window_len)-th point to the (L-1)-th point in the modified linear prediction analysis window is smaller than the value of the corresponding point from the (L-sub_window_len)-th point to the (L-1)-th point of the initial linear prediction analysis window, so that the weight of the artificially reconstructed signal of the target channel of the current frame (which can include the transition segment signal and the forward signal) is reduced during linear prediction.
  • this reduces the influence of the error between the artificially reconstructed signal and the true forward signal on the accuracy of the linear prediction analysis result; therefore, the difference between the linear prediction coefficients obtained by linear prediction analysis and the true linear prediction coefficients can be reduced, and the accuracy of the linear prediction analysis can be improved.
  • the spectral distortion between the linear prediction coefficients obtained in the prior art and the true linear prediction coefficients is large, while the spectral distortion between the linear prediction coefficients obtained by the present application and the true linear prediction coefficients is small; thus, the stereo signal encoding method in the embodiment of the present application can reduce the spectral distortion of the linear prediction coefficients obtained by linear prediction analysis and improve the accuracy of the linear prediction analysis.
  • determining the corrected linear prediction analysis window according to the window length of the attenuation window of the current frame includes: selecting, according to a window length of the attenuation window of the current frame, from a plurality of candidate linear prediction analysis windows stored in advance A modified linear prediction analysis window is determined, wherein the plurality of candidate linear prediction analysis windows correspond to different window length value ranges, and there is no intersection between different window length value ranges.
  • the plurality of candidate linear prediction analysis windows stored in advance are the corrected linear prediction analysis windows corresponding to the window lengths of the attenuation windows of the current frame in different value ranges.
  • specifically, the modified linear prediction analysis windows corresponding to different value ranges of the attenuation window length may be calculated and stored in advance, so that after the window length of the attenuation window of the current frame is determined, the modified linear prediction analysis window can be selected directly from the pre-stored candidate windows according to that window length, which reduces the amount of calculation and simplifies the computational complexity.
  • the pre-selected attenuation window lengths used when calculating these candidate modified linear prediction analysis windows may be all possible values of the window length of the attenuation window, or a subset of those values.
  • the corrected linear prediction analysis window may be determined according to formula (18).
  • w_adp(i) is the modified linear prediction analysis window
  • w(i) is the initial linear prediction analysis window
  • w_adp_20(i), w_adp_40(i), w_adp_60(i), and w_adp_80(i) are the pre-stored candidate modified linear prediction analysis windows corresponding to attenuation window lengths of 20, 40, 60, and 80, respectively.
  • the modified linear prediction analysis window may then be determined directly according to formula (18), based on the value range into which the window length of the attenuation window of the current frame falls.
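  • as an illustrative sketch of the lookup expressed by formula (18) (the branch boundaries are assumed to mirror those used for the pre-stored attenuation windows; formula (18) itself is not reproduced in the text):
```python
def select_modified_lp_window(sub_window_len, stored_w_adp):
    """stored_w_adp maps 20, 40, 60, 80 to the pre-computed modified linear
    prediction analysis windows w_adp_20 ... w_adp_80 (illustrative sketch)."""
    if sub_window_len < 40:
        return stored_w_adp[20]
    if sub_window_len < 60:
        return stored_w_adp[40]
    if sub_window_len < 80:
        return stored_w_adp[60]
    return stored_w_adp[80]
```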
  • optionally, before the modified linear prediction analysis window is determined according to the window length of the attenuation window, the method 300 further includes:
  • the window length of the attenuation window of the current frame is corrected according to a preset interval step size to obtain a window length of the modified attenuation window, wherein the interval step is a preset positive integer. Further, the interval step may be a positive integer smaller than a maximum window length of the attenuation window;
  • in this case, the determining the modified linear prediction analysis window according to the window length of the attenuation window specifically includes: determining the modified linear prediction analysis window according to the initial linear prediction analysis window and the window length of the modified attenuation window.
  • the window length of the attenuation window of the current frame may be determined according to the time difference between channels of the current frame, and then the window length of the attenuation window is corrected according to the preset interval step size, and the window length of the modified attenuation window is obtained. .
  • in this way, the number of possible window lengths is reduced and the window length of the modified attenuation window belongs to a set consisting of a finite number of constants, which makes the corresponding windows easy to store in advance and reduces the complexity of subsequent calculations.
  • the window length of the modified attenuation window satisfies the formula (19), that is, when the window length of the attenuation window is corrected according to the preset interval step, the window length of the attenuation window can be specifically corrected according to the formula (19).
  • sub_window_len_mod is the window length of the modified attenuation window
  • sub_window_len is the window length of the attenuation window
  • len_step is the interval step size
  • the interval step size can be a positive integer smaller than the maximum window length of the adaptive attenuation window, for example, 15, 20, etc.
  • the interval step may also be preset by a person skilled in the art based on experience.
  • for example, when the interval step is 20, the window length of the modified attenuation window only takes the values 0, 20, 40, 60, and 80, that is, the window length of the modified attenuation window belongs to the set {0, 20, 40, 60, 80}; when the window length of the modified attenuation window is 0, the initial linear prediction analysis window is used directly as the modified linear prediction analysis window.
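  • formula (19) is not reproduced in the source text. One rule consistent with the example above (rounding the window length down to a multiple of the interval step, which yields the set {0, 20, 40, 60, 80} when the interval step is 20 and the maximum window length is 80; the actual rounding direction in the patent may differ) is:
```python
import math

def modify_attenuation_window_length(sub_window_len, len_step=20):
    # assumed rounding rule: round down to a multiple of len_step (illustrative)
    return len_step * math.floor(sub_window_len / len_step)

# Example: with len_step = 20, sub_window_len = 50 gives a modified length of 40,
# and any sub_window_len below 20 gives 0, in which case the initial linear
# prediction analysis window is used directly.
```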
  • determining the corrected linear prediction analysis window according to the initial linear prediction analysis window and the window length of the modified attenuation window including: analyzing the initial linear prediction according to the window length of the modified attenuation window The window is corrected.
  • optionally, determining the modified linear prediction analysis window according to the initial linear prediction analysis window and the window length of the modified attenuation window further includes: determining the attenuation window of the current frame according to the window length of the modified attenuation window; and correcting the initial linear prediction analysis window according to the determined attenuation window.
  • determining the attenuation window of the current frame according to the window length of the modified attenuation window includes: determining, according to the window length of the modified attenuation window, the attenuation of the current frame from the plurality of candidate attenuation windows stored in advance The window, wherein the plurality of candidate attenuation windows pre-stored are attenuation windows corresponding to the window length of the modified attenuation window at different values.
  • the attenuation windows corresponding to a set of pre-selected values of the window length of the modified attenuation window may be calculated and stored in advance, so that once the window length of the modified attenuation window is determined, the attenuation window of the current frame can be selected directly from the pre-stored candidate attenuation windows according to that window length, which reduces the amount of calculation and simplifies the computational complexity.
  • the pre-selected values here may be all possible values of the window length of the modified attenuation window, or a subset of those values.
  • the attenuation window of the current frame may be determined according to formula (20).
  • sub_window_len_mod is the window length of the modified attenuation window
  • sub_window_20(i), sub_window_40(i), sub_window_60(i), and sub_window_80(i) are the pre-stored attenuation windows corresponding to window lengths of 20, 40, 60, and 80, respectively.
  • determining the corrected linear prediction analysis window according to the initial linear prediction analysis window and the window length of the modified attenuation window including: pre-storing a plurality of candidate linearities according to the window length of the modified attenuation window A modified linear prediction analysis window is determined in the prediction analysis window, wherein the plurality of candidate linear prediction analysis windows stored in advance are corrected linear prediction analysis windows corresponding to the window length of the modified attenuation window at different values.
  • the modified linear prediction analysis windows corresponding to a set of pre-selected values of the window length of the modified attenuation window may be calculated and stored in advance, so that once the window length of the modified attenuation window is determined, the modified linear prediction analysis window can be selected directly from the pre-stored candidate linear prediction analysis windows according to that window length, which reduces the amount of calculation and simplifies the computational complexity.
  • the window length of the pre-selected modified attenuation window herein is a subset of all possible values of the window length of the modified attenuation window or all possible values of the window length of the modified attenuation window.
  • the corrected linear prediction analysis window may be determined according to the formula (21).
  • w_adp(i) is the modified linear prediction analysis window
  • w(i) is the initial linear prediction analysis window
  • w_adp_20(i), w_adp_40(i), w_adp_60(i), and w_adp_80(i) are the pre-stored candidate modified linear prediction analysis windows corresponding to window lengths of the modified attenuation window of 20, 40, 60, and 80, respectively.
  • the method 300 shown in FIG. 3 is a part of the process of stereo signal encoding.
  • the encoding method of the stereo signal in the embodiment of the present application is described in detail below with reference to FIG. 5, covering the entire encoding process.
  • FIG. 5 is a schematic flowchart of a method for encoding a stereo signal according to an embodiment of the present application.
  • the method 500 of FIG. 5 specifically includes:
  • the stereo signal here is a time domain signal
  • the stereo signal specifically includes a left channel signal and a right channel signal.
  • the left and right channel signals of the current frame may be high-pass filtered to obtain the preprocessed left and right channel signals of the current frame.
  • the time domain preprocessing here may be other processing in addition to the high pass filtering processing, for example, performing pre-emphasis processing.
  • the cross-correlation coefficient between the left and right channels may be calculated according to the preprocessed left and right channel signals of the current frame, and the index value corresponding to the maximum value of the cross-correlation coefficient is then used as the inter-channel time difference of the current frame.
  • the estimation of the time difference between channels can be performed in the manners of the first mode to the third mode. It should be understood that the present application is not limited to the methods in the first mode to the third mode in performing the inter-channel time difference estimation, and the present application may also adopt other prior art techniques to implement the estimation of the time difference between channels.
  • the maximum and minimum values of the inter-channel time difference are T max and T min, respectively, where T max and T min are preset real numbers and T max > T min; the maximum value of the cross-correlation coefficient between the left and right channels can then be searched for over index values between the minimum and the maximum of the inter-channel time difference, and the index value corresponding to the searched maximum of the cross-correlation coefficient is determined as the inter-channel time difference of the current frame.
  • the values of T max and T min may be 40 and -40, respectively, so that the maximum value of the cross-correlation coefficient between the left and right channels can be searched in the range of -40 ⁇ i ⁇ 40, and then the correlation coefficient is The index value corresponding to the maximum value is taken as the inter-channel time difference of the current frame.
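A minimal sketch of this first manner is given below: it searches for the shift in [-40, 40] that maximizes a normalized time-domain cross-correlation between the two channels. The normalization and the brute-force search are illustrative assumptions rather than the exact procedure of the encoder.

```python
import numpy as np

def estimate_inter_channel_time_difference(x_left, x_right, t_min=-40, t_max=40):
    """Return the candidate shift that maximizes the left/right cross-correlation."""
    best_shift, best_corr = 0, -np.inf
    for shift in range(t_min, t_max + 1):
        if shift >= 0:
            a, b = x_left[shift:], x_right[:len(x_right) - shift]
        else:
            a, b = x_left[:len(x_left) + shift], x_right[-shift:]
        corr = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
        if corr > best_corr:
            best_corr, best_shift = corr, shift
    return best_shift
```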
  • the maximum and minimum values of the inter-channel time difference at the current sampling rate are T max and T min , respectively, where T max and T min are preset real numbers, and T max >T min .
  • the cross-correlation function between the left and right channels can be calculated according to the left and right channel signals of the current frame; the cross-correlation function of the current frame is then smoothed according to the cross-correlation functions of the previous L frames (L is an integer greater than or equal to 1) to obtain the smoothed cross-correlation function between the left and right channels, the maximum value of the smoothed cross-correlation coefficient is searched for within the range T min ≤ i ≤ T max, and the index value i corresponding to the maximum value is used as the inter-channel time difference of the current frame.
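A minimal sketch of this inter-frame smoothing of the cross-correlation function, assuming a simple first-order recursion; the weighting over the previous L frames used by the encoder may differ.

```python
import numpy as np

def smooth_cross_correlation(cc_current, cc_smoothed_prev, alpha=0.75):
    """Blend the current frame's cross-correlation with the smoothed history.

    cc_current       -- cross-correlation of the current frame, one value per shift
    cc_smoothed_prev -- smoothed cross-correlation carried over from previous frames
    alpha            -- smoothing factor (illustrative value)
    """
    return alpha * np.asarray(cc_smoothed_prev) + (1.0 - alpha) * np.asarray(cc_current)
```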
  • inter-frame smoothing may be performed according to the inter-channel time differences of the previous M frames of the current frame (M is an integer greater than or equal to 1) and the estimated inter-channel time difference of the current frame, and the smoothed inter-channel time difference is used as the final inter-channel time difference of the current frame.
  • the left and right channel signals for inter-channel time difference estimation are the left and right channel signals in the original stereo signal.
  • the left and right channel signals in the original stereo signal may refer to the collected analog-to-digital (A/D) converted Pulse Code Modulation (PCM) signals.
  • the sampling rate of the stereo audio signal may be 8 kHz, 16 kHz, 32 kHz, 44.1 kHz, 48 kHz, or the like.
  • one or both of the left channel signal and the right channel signal may be compressed or stretched according to the inter-channel time difference of the current frame, so that there is no inter-channel time difference between the left and right channel signals after the delay alignment processing.
  • the left and right channel signals after the delay alignment of the current frame obtained by the left and right channel signal delay alignment processing of the current frame are the stereo signals after the delay alignment processing of the current frame.
  • the target channel and the reference channel of the current frame are first selected according to the inter-channel time difference of the current frame and the inter-channel time difference of the previous frame; then, according to the magnitude relationship between the absolute value abs(cur_itd) of the inter-channel time difference of the current frame and the absolute value abs(prev_itd) of the inter-channel time difference of the previous frame of the current frame, the delay alignment processing can be performed in different manners.
  • the inter-channel delay difference of the current frame is recorded as cur_itd
  • the delay difference between the previous frames is recorded as prev_itd.
  • the relationship between the absolute value abs(cur_itd) of the inter-channel time difference of the current frame and the absolute value of the inter-channel time difference of the previous frame of the current frame may fall into the following three situations. It should be understood that the processing manner used in the delay alignment processing of the present application is not limited to the following three cases, and the present application may also adopt any other prior-art delay alignment processing method.
  • the Ts2 point signal is generated according to the reference channel signal of the current frame and the target channel signal of the current frame, as the N-Ts2 point to the N-1 point signal of the target channel after the delay alignment processing, And manually reconstruct the abs (cur_itd) point signal according to the reference channel signal, as the Nth point to the N+abs(cur_itd)-1 point signal of the target channel after the delay alignment processing.
  • abs() indicates an absolute value operation
  • the target channel signal delay abs(cur_itd) samples of the current frame are used as the target channel signal after the current frame delay alignment, and the reference channel signal of the current frame is used. Directly used as the reference channel signal after the current frame delay is aligned.
  • the absolute value of the inter-channel time difference of the current frame is smaller than the absolute value of the inter-channel time difference of the previous frame of the current frame, it is necessary to stretch the buffered target channel signal. Specifically, the signal from the -ts+abs(prev_itd)-abs(cur_itd) to the L-ts-1 point in the target channel signal of the current frame buffer is stretched into a signal of a length L point as a delay alignment. The -ts point to the L-ts-1 point signal of the processed target channel.
  • the signal from the L-ts point to the N-Ts2-1 point in the target channel signal of the current frame is directly used as the L-ts point to the N-Ts2-1 point signal of the target channel after the delay alignment processing.
  • a Ts2 point signal is generated according to the reference channel signal of the current frame and the target channel signal, as the N-Ts2 point to the N-1 point signal of the target channel after the delay alignment processing.
  • the abs (cur_itd) point signal is manually reconstructed according to the reference channel signal as the Nth point to the N+abs(cur_itd)-1 point signal of the target channel after the delay alignment processing.
  • L is the processing length of the delay alignment processing
  • the processing length L of the delay alignment processing can set different values for different sampling rates, or a uniform value can be used. In general, the easiest way is to preset a value based on the experience of the technician, such as 290.
  • the N-point signal starting from the abs (cur_itd) point of the target channel after the delay alignment processing is used as the target channel signal of the current frame after the delay alignment.
  • the reference channel signal of the current frame is directly used as the reference channel signal of the current frame after the delay is aligned.
  • the absolute value of the inter-channel time difference of the current frame is smaller than the absolute value of the inter-channel time difference of the previous frame of the current frame, it is necessary to compress the buffered target channel signal. Specifically, the signal from the -ts+abs(prev_itd)-abs(cur_itd) to the L-ts-1 point in the target channel signal of the current frame buffer is compressed into a signal having a length of L, as a delay alignment. The -ts point to the L-ts-1 point signal of the processed target channel.
  • the signal from the L-ts point to the N-Ts2-1 point in the target channel signal of the current frame is directly used as the L-ts point of the target channel after the delay alignment processing to N-Ts2-1 Point signal.
  • a Ts2 point signal is generated according to the reference channel signal of the current frame and the target channel signal as the N-Ts2 point to N-1 point signal of the target channel after the delay alignment processing.
  • an abs (cur_itd) point signal is generated according to the reference channel signal as the Nth point to the N+abs(cur_itd)-1 point signal of the target channel after the delay alignment processing.
  • L is still the processing length of the delay alignment process.
  • the N-point signal from the abs (cur_itd) point of the target channel after the delay alignment processing is still used as the target channel signal of the current frame after the delay alignment.
  • the reference channel signal of the current frame is directly used as the reference channel signal of the current frame after the delay is aligned.
  • any prior art quantization algorithm may be used to quantize the inter-channel time difference of the current frame to obtain a quantization index, and the quantization index is encoded and then written. Into the stream.
  • the channel combination scale factor of the current frame can be calculated according to the frame energy of the left and right channels.
  • the specific process is as follows:
  • the frame energy rms_L of the left channel of the current frame satisfies:
  • the frame energy rms_R of the right channel of the current frame satisfies:
  • x' L (i) is the left channel signal after the current frame delay is aligned
  • x' R (i) is the right channel signal after the current frame delay is aligned
  • i is the sample number.
  • the channel combination scale factor ratio of the current frame satisfies:
  • the channel combination scale factor is calculated based on the frame energy of the left and right channel signals.
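Because the frame-energy and scale-factor formulas above are referenced but not reproduced in this text, the sketch below uses one common formulation (root-mean-square energies and an energy-ratio scale factor) purely as an illustration of the calculation flow, not as the exact formulas of this method.

```python
import numpy as np

def channel_combination_scale_factor(x_left_aligned, x_right_aligned):
    """Frame energies of the delay-aligned channels and one plausible scale factor."""
    rms_l = np.sqrt(np.mean(np.square(x_left_aligned)))   # frame energy of the left channel
    rms_r = np.sqrt(np.mean(np.square(x_right_aligned)))  # frame energy of the right channel
    ratio = rms_r / (rms_l + rms_r + 1e-12)                # assumed definition of 'ratio'
    return rms_l, rms_r, ratio
```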
  • the time-domain downmix processing of the delay-aligned stereo signals may be performed by any one of the prior art time domain downmix processing methods.
  • the corresponding time-domain downmix processing method is selected, according to the method used to calculate the channel combination scale factor, to perform time-domain downmix processing on the delay-aligned stereo signal, so as to obtain the main channel signal and the secondary channel signal.
  • the time domain downmix processing may be performed according to the channel combination scale factor ratio.
  • the main channel signal and the secondary channel signal may be determined according to formula (20) during the time-domain downmix processing.
  • Y(i) is the main channel signal of the current frame
  • X(i) is the secondary channel signal of the current frame
  • x' L (i) is the left channel signal after the current frame delay is aligned
  • x' R (i) is the right channel signal after the current frame delay is aligned
  • i is the sample number
  • N is the frame length
  • ratio is the channel combination scale factor
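Formula (20) itself is not reproduced here; the following sketch shows one commonly used weighted time-domain downmix driven by the channel combination scale factor, and should be read as an assumption rather than the exact formula of this method.

```python
import numpy as np

def time_domain_downmix(x_left_aligned, x_right_aligned, ratio):
    """One possible downmix into a main channel Y(i) and a secondary channel X(i)."""
    x_l = np.asarray(x_left_aligned, dtype=float)
    x_r = np.asarray(x_right_aligned, dtype=float)
    main = ratio * x_l + (1.0 - ratio) * x_r        # assumed form of Y(i)
    secondary = ratio * x_l - (1.0 - ratio) * x_r   # assumed form of X(i)
    return main, secondary
```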
  • the monophonic signal encoding and decoding method may be used to encode the obtained main channel signal and the secondary channel signal after the downmix processing.
  • the parameter information obtained in the encoding of the main channel signal and/or the secondary channel signal of the previous frame, together with the total number of bits available for encoding the main channel signal and the secondary channel signal, may be used to allocate bits between the main channel coding and the secondary channel coding.
  • the main channel signal and the secondary channel signal are respectively encoded according to the bit allocation result, and the encoding index of the main channel encoding and the encoding index of the secondary channel encoding are obtained.
  • an Algebraic Code Excited Linear Prediction (ACELP) encoding method can be used.
  • the encoding method of the stereo signal in the embodiment of the present application may be part of the encoding of the primary channel signal and the secondary channel signal obtained after the downmixing process in step 570 of the above method 500.
  • the encoding method of the stereo signal in the embodiment of the present application may be a process of performing linear prediction on the main channel signal or the secondary channel signal obtained after the downmix processing in the above step 570.
  • there are various ways to perform linear prediction analysis on the stereo signal of the current frame: two linear prediction analyses may be performed on each of the main channel signal and the secondary channel signal of the current frame, or only one linear prediction analysis may be performed on each of the main channel signal and the secondary channel signal of the current frame. The two manners of linear prediction analysis are described in detail below with reference to FIG. 9 and FIG. 10, respectively.
  • FIG. 9 is a schematic flowchart of a linear prediction analysis process of an embodiment of the present application.
  • the linear prediction process shown in Figure 9 is to perform two linear prediction analysis on the main channel signal of the current frame.
  • the process of the linear predictive analysis shown in FIG. 9 specifically includes:
  • the preprocessing here may include sample rate conversion, pre-emphasis processing, and the like.
  • a main channel signal with a sampling rate of 16 kHz can be converted into a signal with a sampling rate of 12.8 kHz, which facilitates subsequent encoding in the Algebraic Code Excited Linear Prediction (ACELP) encoding mode.
  • the initial linear prediction analysis window in step 920 is equivalent to the linear prediction analysis window in step 310 above.
  • the first windowing process on the pre-processed main channel signal according to the initial linear prediction analysis window may be specifically performed according to formula (20).
  • s pre (n) is the signal after pre-emphasis processing
  • s wmid (n) is the signal after the first windowing process
  • L is the window length of the linear prediction analysis window
  • w(n) is the initial linear prediction analysis window.
  • the Levinson-Durbin algorithm may specifically be used when calculating the first set of linear prediction coefficients of the current frame.
  • the first set of linear prediction coefficients of the current frame may be calculated by using the Levinson-Durbin algorithm according to the signal s_wmid(n) obtained after the first windowing process.
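The following sketch shows the standard autocorrelation method with the Levinson-Durbin recursion applied to a windowed signal such as s_wmid(n); the prediction order of 16 is an illustrative choice, not a value taken from this method.

```python
import numpy as np

def lpc_levinson_durbin(s_w, order=16):
    """Linear prediction coefficients of a windowed signal via Levinson-Durbin.

    Returns a[1..order] such that the prediction-error filter is
    A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order.
    """
    # Autocorrelation for lags 0..order
    r = np.array([np.dot(s_w[: len(s_w) - k], s_w[k:]) for k in range(order + 1)])
    r[0] += 1e-9  # guard against an all-zero (silent) frame

    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for m in range(1, order + 1):
        k = -(r[m] + np.dot(a[1:m], r[m - 1:0:-1])) / err  # reflection coefficient
        a[1:m + 1] = a[1:m + 1] + k * a[m - 1::-1][:m]      # update a[1..m]
        err *= (1.0 - k * k)
    return a[1:]
```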
  • the modified linear prediction analysis window may be a linear prediction analysis window that satisfies the above formula (7) and formula (9).
  • the second windowing process on the preprocessed main channel signal according to the modified linear prediction analysis window may be specifically performed according to formula (27).
  • s pre (n) is the signal after pre-emphasis processing
  • s wend (n) is the signal after the second windowing process
  • L is the window length of the modified linear prediction analysis window
  • w_adp(n) is the modified linear prediction analysis window.
  • the Levinson-Durbin algorithm may specifically be used for this calculation.
  • the second set of linear prediction coefficients of the current frame may be calculated by using the Levinson-Durbin algorithm according to the signal s_wend(n) obtained after the second windowing process.
  • the process of performing linear prediction analysis on the secondary channel signal of the current frame is the same as the process of performing linear prediction analysis on the main channel signal of the current frame in steps 910 to 950 described above.
  • FIG. 10 is a schematic flowchart of a linear prediction analysis process of an embodiment of the present application.
  • the linear prediction process shown in Figure 10 performs a single linear prediction analysis on the main channel signal of the current frame.
  • the process of the linear predictive analysis shown in FIG. 10 specifically includes:
  • the preprocessing here may include sample rate conversion, pre-emphasis processing, and the like.
  • the initial linear prediction analysis window in step 1020 is equivalent to the initial linear prediction analysis window in step 320 above.
  • the window length of the attenuation window of the current frame may be first determined according to the inter-channel time difference of the current frame, and then the corrected linear prediction analysis window is determined according to the manner in step 320 above.
  • the windowing process of the preprocessed main channel signal according to the modified linear prediction analysis window can be performed according to formula (28).
  • s w (n) is the windowed signal
  • L is the window length of the modified linear prediction analysis window
  • w adp (n) is the modified linear prediction analysis window
  • the linear prediction coefficients of the current frame may specifically be calculated by using the Levinson-Durbin algorithm.
  • the linear prediction coefficients of the current frame can be calculated by using the Levinson-Durbin algorithm according to the windowed signal s_w(n).
  • the process of performing linear prediction analysis on the secondary channel signal of the current frame is the same as the process of performing linear prediction analysis on the main channel signal of the current frame in steps 1010 to 1040 described above.
  • the encoding method of the stereo signal of the embodiment of the present application has been described in detail above with reference to FIGS. 1 to 10.
  • the apparatus for encoding the stereo signal of the embodiment of the present application will be described below with reference to FIG. 11 and FIG. 12. It should be understood that the apparatus in FIG. 11 and FIG. 12 corresponds to the encoding method of the stereo signal in the embodiment of the present application, and the apparatus in FIG. 11 and FIG. 12 can perform the encoding method of the stereo signal of the embodiment of the present application.
  • the repeated description is appropriately omitted below.
  • FIG. 11 is a schematic block diagram of an encoding apparatus for a stereo signal according to an embodiment of the present application.
  • the apparatus 1100 of Figure 11 includes:
  • the first determining module 1110 is configured to determine, according to an inter-channel time difference of the current frame, a window length of the attenuation window of the current frame;
  • the second determining module 1120 is configured to determine a modified linear prediction analysis window according to a window length of the attenuation window of the current frame, where at least the L-sub_window_len point to the L-1 point of the modified linear prediction analysis window The value of a part of the points is smaller than the value of the corresponding point in the L-sub_window_len point to the L-1 point of the initial linear prediction analysis window, sub_window_len is the window length of the attenuation window of the current frame, and L is the linearity of the correction. Predicting a window length of the analysis window, the window length of the modified linear prediction analysis window being equal to a window length of the initial linear prediction analysis window;
  • the processing module 1130 is configured to perform linear prediction analysis on the channel signal to be processed according to the modified linear prediction analysis window.
  • the values of the points of the modified linear prediction analysis window that correspond to the artificially reconstructed forward signal of the target channel of the current frame are smaller than the values of the corresponding points of the unmodified linear prediction analysis window, thereby reducing the contribution of the artificially reconstructed forward signal of the target channel of the current frame during linear prediction, which reduces the effect of the error between the artificially reconstructed forward signal and the real forward signal on the accuracy of the linear prediction analysis result; therefore, the difference between the linear prediction coefficients obtained by the linear prediction analysis and the real linear prediction coefficients can be reduced, and the accuracy of the linear prediction analysis can be improved.
  • the value of any one of the points from the (L-sub_window_len)-th point to the (L-1)-th point of the modified linear prediction analysis window is smaller than the value of the corresponding point from the (L-sub_window_len)-th point to the (L-1)-th point of the initial linear prediction analysis window.
  • the first determining module 1110 is specifically configured to: determine, according to an inter-channel time difference of the current frame and a length of a preset transition segment, a window of the attenuation window of the current frame. long.
  • the first determining module 1110 is specifically configured to: determine a sum of an absolute value of an inter-channel time difference of the current frame and a length of the preset transition segment as the The window length of the attenuation window of the current frame.
  • the first determining module 1110 is specifically configured to: when the absolute value of the inter-channel time difference of the current frame is greater than or equal to the length of the preset transition segment, determine the sum of the absolute value of the inter-channel time difference of the current frame and the length of the preset transition segment as the window length of the attenuation window of the current frame; and when the absolute value of the inter-channel time difference of the current frame is less than the length of the preset transition segment, determine N times the absolute value of the inter-channel time difference of the current frame as the window length of the attenuation window of the current frame, where N is a preset real number greater than 0 and less than L/MAX_DELAY, and MAX_DELAY is a preset real number greater than 0.
  • the above MAX_DELAY is the maximum of the absolute value of the inter-channel time difference.
  • the second determining module 1120 is specifically configured to: modify the initial linear prediction analysis window according to the window length of the attenuation window of the current frame, where the attenuation of the values of the modified linear prediction analysis window from the (L-sub_window_len)-th point to the (L-1)-th point, relative to the values of the corresponding points from the (L-sub_window_len)-th point to the (L-1)-th point of the initial linear prediction analysis window, gradually increases.
  • the modified linear prediction analysis window satisfies a formula:
  • w adp (i) is a modified linear prediction analysis window
  • w(i) is the initial linear prediction analysis window
  • MAX_ATTEN is a preset real number greater than 0.
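The window-modification formula above is only referenced, not reproduced, in this text. The sketch below therefore assumes a linear attenuation that grows towards MAX_ATTEN over the last sub_window_len points, which matches the qualitative description (the attenuation gradually increases towards the end of the window) but not necessarily the exact formula.

```python
import numpy as np

def modify_analysis_window(w, sub_window_len, max_atten=0.07):
    """Attenuate the tail of the initial LP analysis window w (length L).

    Points closer to the artificially reconstructed forward signal are
    attenuated more, so they contribute less to the linear prediction analysis.
    The linear ramp and the max_atten value are illustrative assumptions.
    """
    L = len(w)
    w_adp = np.array(w, dtype=float)
    for i in range(L - sub_window_len, L):
        atten = (i - (L - sub_window_len) + 1) / sub_window_len * max_atten
        w_adp[i] = w[i] * (1.0 - atten)
    return w_adp
```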
  • the second determining module 1120 is specifically configured to: determine the attenuation window of the current frame according to the window length of the attenuation window of the current frame; and modify the initial linear prediction analysis window according to the attenuation window of the current frame, where the attenuation of the values of the modified linear prediction analysis window from the (L-sub_window_len)-th point to the (L-1)-th point, relative to the values of the corresponding points from the (L-sub_window_len)-th point to the (L-1)-th point of the initial linear prediction analysis window, gradually increases.
  • the second determining module 1120 is specifically configured to: determine the attenuation window of the current frame from a plurality of pre-stored candidate attenuation windows according to the window length of the attenuation window of the current frame, where the plurality of candidate attenuation windows correspond to different window length value ranges, and there is no intersection between the different window length value ranges.
  • the attenuation window of the current frame satisfies a formula:
  • sub_window(i) is an attenuation window of the current frame
  • MAX_ATTEN is a preset real number greater than 0.
  • the modified linear prediction analysis window satisfies a formula:
  • w adp (i) is a window function of the modified linear prediction analysis window
  • w(i) is the initial linear prediction analysis window
  • sub_window(.) is an attenuation window of the current frame.
  • the second determining module 1120 is specifically configured to: determine the modified linear prediction analysis window from a plurality of pre-stored candidate linear prediction analysis windows according to the window length of the attenuation window of the current frame, where the plurality of candidate linear prediction analysis windows correspond to different window length value ranges, and there is no intersection between the different window length value ranges.
  • before the second determining module 1120 determines the modified linear prediction analysis window according to the window length of the attenuation window of the current frame, the apparatus further includes:
  • the correction module 1140 is configured to modify the window length of the attenuation window of the current frame according to a preset interval step, to obtain a modified attenuation window length, where the interval step is a preset positive integer;
  • the second determining module 1120 is specifically configured to: determine a corrected linear prediction analysis window according to the initial linear prediction analysis window and the window length of the modified attenuation window.
  • the window length of the modified attenuation window satisfies the formula:
  • sub_window_len_mod is the window length of the modified attenuation window
  • len_step is the interval step size
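The formula for the modified attenuation window length is referenced above but not reproduced in this text. One plausible reading, consistent with the goal of keeping only a small set of candidate lengths, is to round the window length up to the nearest multiple of the interval step, as in this hypothetical sketch.

```python
import math

def modified_attenuation_window_length(sub_window_len, len_step=10):
    """Assumed quantization of the attenuation window length to a multiple of len_step."""
    return int(math.ceil(sub_window_len / len_step) * len_step)
```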
  • FIG. 12 is a schematic block diagram of an apparatus for encoding a stereo signal according to an embodiment of the present application.
  • the apparatus 1200 of Figure 12 includes:
  • the memory 1210 is configured to store a program.
  • the processor 1220 is configured to execute a program stored in the memory 1210. When the program in the memory 1210 is executed, the processor 1220 is specifically configured to: determine the current frame according to an inter-channel time difference of a current frame. a window length of the attenuation window; determining a modified linear prediction analysis window according to a window length of the attenuation window of the current frame, wherein the L-sub_window_len point to the L-1 point of the modified linear prediction analysis window The value of at least a part of the points is smaller than the value of the corresponding point in the L-sub_window_len point to the L-1 point of the initial linear prediction analysis window, and the sub_window_len is the window length of the attenuation window of the current frame, and L is the corrected a window length of the linear prediction analysis window, the window length of the modified linear prediction analysis window being equal to a window length of the initial linear prediction analysis window; linear prediction analysis of the channel signal to be processed according to the modified linear prediction analysis window .
  • the values of the points of the modified linear prediction analysis window that correspond to the artificially reconstructed forward signal of the target channel of the current frame are smaller than the values of the corresponding points of the unmodified linear prediction analysis window, thereby reducing the contribution of the artificially reconstructed forward signal of the target channel of the current frame during linear prediction, which reduces the effect of the error between the artificially reconstructed forward signal and the real forward signal on the accuracy of the linear prediction analysis result; therefore, the difference between the linear prediction coefficients obtained by the linear prediction analysis and the real linear prediction coefficients can be reduced, and the accuracy of the linear prediction analysis can be improved.
  • the value of any one of the points from the (L-sub_window_len)-th point to the (L-1)-th point of the modified linear prediction analysis window is smaller than the value of the corresponding point from the (L-sub_window_len)-th point to the (L-1)-th point of the initial linear prediction analysis window.
  • the processor 1220 is specifically configured to: determine a window length of the attenuation window of the current frame according to an inter-channel time difference of the current frame and a length of a preset transition segment.
  • the processor 1220 is specifically configured to: determine a sum of an absolute value of an inter-channel time difference of the current frame and a length of the preset transition segment as the current frame. The window length of the attenuation window.
  • the processor 1220 is specifically configured to: when the absolute value of the inter-channel time difference of the current frame is greater than or equal to the length of the preset transition segment, determine the sum of the absolute value of the inter-channel time difference of the current frame and the length of the preset transition segment as the window length of the attenuation window of the current frame; and when the absolute value of the inter-channel time difference of the current frame is less than the length of the preset transition segment, determine N times the absolute value of the inter-channel time difference of the current frame as the window length of the attenuation window of the current frame, where N is a preset real number greater than 0 and less than L/MAX_DELAY, and MAX_DELAY is a preset real number greater than 0.
  • the above MAX_DELAY is the maximum of the absolute value of the inter-channel time difference.
  • the processor 1220 is specifically configured to: modify the initial linear prediction analysis window according to the window length of the attenuation window of the current frame, where the attenuation of the values of the modified linear prediction analysis window from the (L-sub_window_len)-th point to the (L-1)-th point, relative to the values of the corresponding points from the (L-sub_window_len)-th point to the (L-1)-th point of the initial linear prediction analysis window, gradually increases.
  • the modified linear prediction analysis window satisfies a formula:
  • w adp (i) is a modified linear prediction analysis window
  • w(i) is the initial linear prediction analysis window
  • MAX_ATTEN is a preset real number greater than 0.
  • the processor 1220 is specifically configured to: determine the attenuation window of the current frame according to the window length of the attenuation window of the current frame; and modify the initial linear prediction analysis window according to the attenuation window of the current frame, where the attenuation of the values of the modified linear prediction analysis window from the (L-sub_window_len)-th point to the (L-1)-th point, relative to the values of the corresponding points from the (L-sub_window_len)-th point to the (L-1)-th point of the initial linear prediction analysis window, gradually increases.
  • the processor 1220 is specifically configured to: determine, according to a window length of the attenuation window of the current frame, an attenuation window of the current frame from a plurality of candidate attenuation windows stored in advance, where The plurality of candidate attenuation windows correspond to different window length value ranges, and there is no intersection between the different window length value ranges.
  • the attenuation window of the current frame satisfies a formula:
  • sub_window(i) is an attenuation window of the current frame
  • MAX_ATTEN is a preset real number greater than 0.
  • the modified linear prediction analysis window satisfies a formula:
  • w adp (i) is a window function of the modified linear prediction analysis window
  • w(i) is the initial linear prediction analysis window
  • sub_window(.) is an attenuation window of the current frame.
  • the processor 1220 is specifically configured to: determine, according to a window length of the attenuation window of the current frame, the corrected linear prediction analysis from a plurality of candidate linear prediction analysis windows stored in advance. a window, wherein the plurality of candidate linear prediction analysis windows correspond to different window length value ranges, and there is no intersection between the different window length value ranges.
  • the processor 1220 before the processor 1220 determines the corrected linear prediction analysis window according to the window length of the attenuation window of the current frame, the processor 1220 is further configured to: according to the preset interval step Long, correcting a window length of the attenuation window of the current frame to obtain a window length of the modified attenuation window, wherein the interval step is a preset positive integer; analyzing the window and the window according to the initial linear prediction The window length of the modified attenuation window determines the modified linear prediction analysis window.
  • the window length of the modified attenuation window satisfies the formula:
  • sub_window_len_mod is the window length of the modified attenuation window
  • len_step is the interval step size
  • the apparatus for encoding the stereo signal in the embodiment of the present application is described above with reference to FIG. 11 and FIG. 12; the following describes the terminal device and the network device in the embodiment of the present application.
  • the encoding method of the stereo signal can be performed by the terminal device or the network device in FIGS. 13 to 18.
  • the encoding device in the embodiment of the present application may be disposed in the terminal device or the network device in FIG. 13 to FIG. 18, and specifically, the encoding device in the embodiment of the present application may be the terminal device in FIG. 13 to FIG. 18 or Stereo encoder in network equipment.
  • the stereo encoder in the first terminal device performs stereo encoding on the collected stereo signal, and the channel encoder in the first terminal device may perform channel coding on the bitstream obtained by the stereo encoder; the data obtained through channel coding by the first terminal device is then transmitted to the second terminal device by using the first network device and the second network device.
  • after receiving the data from the second network device, the second terminal device performs channel decoding by using the channel decoder of the second terminal device to obtain the encoded bitstream of the stereo signal, the stereo decoder of the second terminal device recovers the stereo signal through decoding, and the terminal device plays back the stereo signal. This completes audio communication between different terminal devices.
  • the second terminal device may also encode the collected stereo signal and finally transmit the encoded data to the first terminal device by using the second network device and the first network device, and the first terminal device obtains the stereo signal through channel decoding and stereo decoding of the data.
  • the first network device and the second network device may be wireless network communication devices or wired network communication devices.
  • the first network device and the second network device can communicate via a digital channel.
  • the first terminal device or the second terminal device in FIG. 13 may perform the encoding and decoding method of the stereo signal in the embodiment of the present application.
  • the encoding device and the decoding device in the embodiment of the present application may respectively be the stereo encoder and the stereo decoder in the first terminal device or the second terminal device.
  • a network device can implement transcoding of an audio signal codec format. As shown in FIG. 14, if the codec format of the signal received by the network device is the codec format corresponding to other stereo decoders, the channel decoder in the network device performs channel decoding on the received signal to obtain other stereo decoding. Corresponding encoded code stream, other stereo decoders decode the encoded code stream to obtain a stereo signal, and the stereo encoder encodes the stereo signal to obtain a coded stream of the stereo signal. Finally, the channel encoder re-pairs the stereo signal. The coded code stream is channel coded to obtain the final signal (the signal can be transmitted to the terminal device or other network device).
  • the codec format corresponding to the stereo encoder in FIG. 14 is different from the codec format corresponding to the other stereo decoder. Assuming that the codec format of the other stereo decoder is the first codec format and the codec format corresponding to the stereo encoder is the second codec format, FIG. 14 shows the network device converting the codec format of the audio signal from the first codec format to the second codec format.
  • as shown in FIG. 15, the channel decoder of the network device first performs channel decoding to obtain the encoded bitstream of the stereo signal; the encoded bitstream of the stereo signal is then decoded by the stereo decoder to obtain a stereo signal, the stereo signal is encoded by the other stereo encoder according to the other codec format to obtain the encoded bitstream corresponding to the other stereo encoder, and finally the channel encoder performs channel coding on the encoded bitstream corresponding to the other stereo encoder to obtain the final signal (the signal can be transmitted to a terminal device or another network device).
  • as in the case of FIG. 14, the codec format corresponding to the stereo decoder in FIG. 15 is also different from the codec format corresponding to the other stereo encoder. Assuming that the codec format of the other stereo encoder is the first codec format and the codec format corresponding to the stereo decoder is the second codec format, FIG. 15 shows the network device converting the codec format of the audio signal from the second codec format to the first codec format.
  • the other stereo codec and the stereo codec in the network device correspond to different codec formats; therefore, transcoding of the stereo signal codec format is realized through the processing of the other stereo codec and the stereo codec.
  • the stereo encoder in FIG. 14 can implement the encoding method of the stereo signal in the embodiment of the present application
  • the stereo decoder in FIG. 15 can implement the decoding method of the stereo signal in the embodiment of the present application.
  • the encoding device in the embodiment of the present application may be a stereo encoder in the network device in FIG. 14, and the decoding device in the embodiment of the present application may be a stereo decoder in the network device in FIG.
  • the network device in FIG. 14 and FIG. 15 may specifically be a wireless network communication device or a wired network communication device.
  • the stereo encoder in the multi-channel encoder in the first terminal device performs stereo encoding on the stereo signal generated from the collected multi-channel signal, where the bitstream obtained by the multi-channel encoder includes the bitstream obtained by the stereo encoder; the channel encoder in the first terminal device then performs channel coding on the bitstream obtained by the multi-channel encoder, and the data obtained through channel coding by the first terminal device is transmitted to the second terminal device by using the first network device and the second network device.
  • after receiving the data from the second network device, the second terminal device performs channel decoding by using the channel decoder of the second terminal device to obtain the encoded bitstream of the multi-channel signal, where the encoded bitstream of the multi-channel signal includes the encoded bitstream of the stereo signal; the stereo decoder in the multi-channel decoder of the second terminal device recovers the stereo signal through decoding, the multi-channel decoder obtains the multi-channel signal by decoding on the basis of the recovered stereo signal, and the second terminal device plays back the multi-channel signal. This completes audio communication between different terminal devices.
  • the second terminal device may also encode the collected multi-channel signal (specifically, the stereo encoder in the multi-channel encoder in the second terminal device performs stereo encoding on the stereo signal generated from the collected multi-channel signal), the channel encoder in the second terminal device then performs channel coding on the bitstream obtained by the multi-channel encoder, and the data is finally transmitted to the first terminal device by using the second network device and the first network device; the first terminal device obtains the multi-channel signal through channel decoding and multi-channel decoding.
  • the first network device and the second network device may be a wireless network communication device or a wired network communication device.
  • the first network device and the second network device can communicate via a digital channel.
  • the first terminal device or the second terminal device in FIG. 16 can perform the codec method of the stereo signal in the embodiment of the present application.
  • the encoding device in the embodiment of the present application may be a stereo encoder in the first terminal device or the second terminal device
  • the decoding device in the embodiment of the present application may be stereo decoding in the first terminal device or the second terminal device. Device.
  • a network device can implement transcoding of an audio signal codec format. As shown in FIG. 17, if the codec format of the signal received by the network device is the codec format corresponding to another multi-channel decoder, the channel decoder in the network device performs channel decoding on the received signal to obtain the encoded bitstream corresponding to the other multi-channel decoder, the other multi-channel decoder decodes the encoded bitstream to obtain a multi-channel signal, and the multi-channel encoder encodes the multi-channel signal to obtain the encoded bitstream of the multi-channel signal, where the stereo encoder in the multi-channel encoder performs stereo encoding on the stereo signal generated from the multi-channel signal to obtain the encoded bitstream of the stereo signal, and the encoded bitstream of the multi-channel signal includes the encoded bitstream of the stereo signal. Finally, the channel encoder performs channel coding on the encoded bitstream to obtain the final signal (the signal can be transmitted to a terminal device or another network device).
  • as shown in FIG. 18, the channel decoder of the network device first performs channel decoding to obtain the encoded bitstream of the multi-channel signal; the encoded bitstream of the multi-channel signal is decoded by the multi-channel decoder to obtain a multi-channel signal, where the stereo decoder in the multi-channel decoder performs stereo decoding on the encoded bitstream of the stereo signal within the encoded bitstream of the multi-channel signal; the multi-channel signal is then encoded by another multi-channel encoder according to another codec format to obtain the encoded bitstream corresponding to the other multi-channel encoder, and finally the channel encoder performs channel coding on the encoded bitstream corresponding to the other multi-channel encoder to obtain the final signal (the signal can be transmitted to a terminal device or another network device).
  • the stereo encoder of FIG. 17 is capable of implementing the encoding method of the stereo signal in the present application
  • the stereo decoder of FIG. 18 is capable of implementing the decoding method of the stereo signal in the present application.
  • the encoding device in the embodiment of the present application may be a stereo encoder in the network device in FIG. 17, and the decoding device in the embodiment of the present application may be a stereo decoder in the network device in FIG.
  • the network device in FIG. 17 and FIG. 18 may specifically be a wireless network communication device or a wired network communication device.
  • the present application also provides a chip including a processor and a communication interface for communicating with an external device for performing a method of encoding a stereo signal of an embodiment of the present application.
  • the chip may further include a memory, where the memory stores an instruction, the processor is configured to execute an instruction stored on the memory, when the instruction is executed, The processor is configured to perform the encoding method of the stereo signal of the embodiment of the present application.
  • the chip is integrated on a terminal device or a network device.
  • the present application provides a computer readable medium storing program code for device execution, the program code comprising instructions for performing an encoding method of a stereo signal of an embodiment of the present application.
  • the disclosed systems, devices, and methods may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division of units is only a logical function division; in actual implementation there may be other division manners, for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the functions may be stored in a computer readable storage medium if implemented in the form of a software functional unit and sold or used as a standalone product.
  • the technical solution of the present application essentially, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the various embodiments of the present application.
  • the foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.


Abstract

An encoding method and encoding apparatus for a stereo signal. The method includes: determining the window length of the attenuation window of the current frame according to the inter-channel time difference of the current frame (310); determining a modified linear prediction analysis window according to the window length of the attenuation window of the current frame, where the values of at least some of the points from the (L-sub_window_len)-th point to the (L-1)-th point of the modified linear prediction analysis window are smaller than the values of the corresponding points from the (L-sub_window_len)-th point to the (L-1)-th point of the initial linear prediction analysis window, sub_window_len is the window length of the attenuation window of the current frame, L is the window length of the modified linear prediction analysis window, and the window length of the modified linear prediction analysis window is equal to the window length of the linear prediction analysis window of the current frame (320); and performing linear prediction analysis on the channel signal to be processed according to the modified linear prediction analysis window (330). The accuracy of linear prediction can be improved.

Description

Encoding method and encoding apparatus for a stereo signal
This application claims priority to Chinese Patent Application No. 201710731482.1, filed with the Chinese Patent Office on August 23, 2017 and entitled "Encoding Method and Encoding Apparatus for Stereo Signal", which is incorporated herein by reference in its entirety.
Technical Field
The present application relates to the field of audio signal encoding and decoding technologies, and more specifically, to an encoding method and an encoding apparatus for a stereo signal.
Background
A general process of encoding a stereo signal by using a time-domain stereo encoding technology is as follows:
performing inter-channel time difference estimation on the stereo signal;
performing delay alignment processing on the stereo signal according to the inter-channel time difference;
performing time-domain downmix processing on the delay-aligned signal according to parameters of the time-domain downmix processing, to obtain a main channel signal and a secondary channel signal; and
encoding the inter-channel time difference, the parameters of the time-domain downmix processing, the main channel signal, and the secondary channel signal, to obtain an encoded bitstream.
When delay alignment processing is performed on the stereo signal according to the inter-channel time difference, the channel that lags in time may first be selected, from the left channel and the right channel of the stereo signal, as the target channel according to the inter-channel time difference, and the other channel is selected as the reference channel used for delay alignment of the target channel. Delay alignment processing is then performed on the signal of the target channel, so that no inter-channel time difference exists between the delay-aligned target channel signal and the reference channel signal. In addition, the delay alignment processing further includes artificially reconstructing a forward signal of the target channel.
However, because a part of the signal of the target channel is artificially determined (including the transition segment signal and the forward signal), and this artificially determined part differs considerably from the real signal, the linear prediction coefficients obtained when linear prediction analysis is performed, by using a mono coding algorithm, on the main channel signal and the secondary channel signal determined from the delay-aligned stereo signal may differ to some extent from the real linear prediction coefficients, which in turn affects coding quality.
Summary
The present application provides an encoding method and an encoding apparatus for a stereo signal, to improve the accuracy of linear prediction in the encoding process.
It should be understood that the stereo signal in the present application may be an original stereo signal, may be a stereo signal composed of two signals included in a multi-channel signal, or may be a stereo signal composed of two signals jointly generated from multiple signals included in a multi-channel signal.
In addition, the stereo signal encoding method in the present application may be a stereo signal encoding method used in a multi-channel encoding method.
According to a first aspect, an encoding method for a stereo signal is provided, including: determining the window length of the attenuation window of the current frame according to the inter-channel time difference of the current frame; determining a modified linear prediction analysis window according to the window length of the attenuation window of the current frame, where the values of at least some of the points from the (L-sub_window_len)-th point to the (L-1)-th point of the modified linear prediction analysis window are smaller than the values of the corresponding points from the (L-sub_window_len)-th point to the (L-1)-th point of the initial linear prediction analysis window, sub_window_len is the window length of the attenuation window of the current frame, L is the window length of the modified linear prediction analysis window, and the window length of the modified linear prediction analysis window is equal to the window length of the initial linear prediction analysis window; and performing linear prediction analysis on the channel signal to be processed according to the modified linear prediction analysis window.
Because the values of at least some of the points from the (L-sub_window_len)-th point to the (L-1)-th point of the modified linear prediction analysis window are smaller than the values of the corresponding points of the initial linear prediction analysis window, the contribution of the artificially reconstructed signal of the target channel of the current frame (which may include the transition segment signal and the forward signal) during linear prediction is reduced, which reduces the effect of the error between the artificially reconstructed signal and the real signal on the accuracy of the linear prediction analysis result. Therefore, the difference between the linear prediction coefficients obtained through linear prediction analysis and the real linear prediction coefficients can be reduced, and the accuracy of linear prediction analysis can be improved.
With reference to the first aspect, in some implementations of the first aspect, the value of any point from the (L-sub_window_len)-th point to the (L-1)-th point of the modified linear prediction analysis window is smaller than the value of the corresponding point from the (L-sub_window_len)-th point to the (L-1)-th point of the initial linear prediction analysis window.
With reference to the first aspect, in some implementations of the first aspect, the determining the window length of the attenuation window of the current frame according to the inter-channel time difference of the current frame includes: determining the window length of the attenuation window of the current frame according to the inter-channel time difference of the current frame and the length of a preset transition segment.
With reference to the first aspect, in some implementations of the first aspect, the determining the window length of the attenuation window of the current frame according to the inter-channel time difference of the current frame and the length of the preset transition segment includes: determining the sum of the absolute value of the inter-channel time difference of the current frame and the length of the preset transition segment as the window length of the attenuation window of the current frame.
With reference to the first aspect, in some implementations of the first aspect, the determining the window length of the attenuation window of the current frame according to the inter-channel time difference of the current frame and the length of the preset transition segment includes: when the absolute value of the inter-channel time difference of the current frame is greater than or equal to the length of the preset transition segment, determining the sum of the absolute value of the inter-channel time difference of the current frame and the length of the preset transition segment as the window length of the attenuation window of the current frame; and when the absolute value of the inter-channel time difference of the current frame is less than the length of the preset transition segment, determining N times the absolute value of the inter-channel time difference of the current frame as the window length of the attenuation window of the current frame, where N is a preset real number greater than 0 and less than L/MAX_DELAY, and MAX_DELAY is a preset real number greater than 0.
Optionally, the foregoing MAX_DELAY is the maximum of the absolute value of the inter-channel time difference. It should be understood that the inter-channel time difference here may be an inter-channel time difference preset when the stereo signal is encoded and decoded.
With reference to the first aspect, in some implementations of the first aspect, the determining a modified linear prediction analysis window according to the window length of the attenuation window of the current frame includes: modifying the initial linear prediction analysis window according to the window length of the attenuation window of the current frame, where the attenuation of the values of the modified linear prediction analysis window from the (L-sub_window_len)-th point to the (L-1)-th point, relative to the values of the corresponding points from the (L-sub_window_len)-th point to the (L-1)-th point of the initial linear prediction analysis window, gradually increases.
The foregoing attenuation may be the attenuation of the value of a point in the modified linear prediction analysis window relative to the value of the corresponding point in the initial linear prediction analysis window.
Specifically, for example, a first point is any point from the (L-sub_window_len)-th point to the (L-1)-th point of the modified linear prediction analysis window, and a second point is the point in the initial linear prediction analysis window that corresponds to the first point. In this case, the foregoing attenuation may be the attenuation of the value of the first point relative to the value of the second point.
When delay alignment processing is performed on the channel signal, the forward signal of the target channel of the current frame needs to be artificially reconstructed. In the artificially reconstructed forward signal, the farther a point is from the real signal of the target channel of the current frame, the less accurately its signal value is estimated, and the modified linear prediction analysis window acts on the artificially reconstructed forward signal. Therefore, when the forward signal is processed by using the modified linear prediction analysis window of the present application, the weight, in linear prediction analysis, of those points of the artificially reconstructed forward signal that are farther from the real signal can be reduced, which can further improve the accuracy of linear prediction.
With reference to the first aspect, in some implementations of the first aspect, the modified linear prediction analysis window satisfies the formula:
[Equation image: PCTCN2018101524-appb-000001]
where w_adp(i) is the modified linear prediction analysis window, w(i) is the initial linear prediction analysis window, and
[Equation image: PCTCN2018101524-appb-000002]
where MAX_ATTEN is a preset real number greater than 0.
It should be understood that the foregoing MAX_ATTEN may be the maximum of a plurality of attenuation values preset when the channel signal is encoded and decoded.
With reference to the first aspect, in some implementations of the first aspect, the determining a modified linear prediction analysis window according to the window length of the attenuation window of the current frame includes: determining the attenuation window of the current frame according to the window length of the attenuation window of the current frame; and modifying the initial linear prediction analysis window according to the attenuation window of the current frame, where the attenuation of the values of the modified linear prediction analysis window from the (L-sub_window_len)-th point to the (L-1)-th point, relative to the values of the corresponding points from the (L-sub_window_len)-th point to the (L-1)-th point of the initial linear prediction analysis window, gradually increases.
With reference to the first aspect, in some implementations of the first aspect, the determining the attenuation window of the current frame according to the window length of the attenuation window of the current frame includes: determining the attenuation window of the current frame from a plurality of pre-stored candidate attenuation windows according to the window length of the attenuation window of the current frame, where the plurality of candidate attenuation windows correspond to different window length value ranges, and there is no intersection between the different window length value ranges.
Determining the attenuation window of the current frame from the plurality of pre-stored candidate attenuation windows can reduce the computational complexity of determining the attenuation window.
Specifically, after the corresponding attenuation windows are calculated for pre-selected window lengths that correspond to different value ranges of the attenuation window length, the attenuation windows corresponding to the different value ranges of the window length may be stored. In this way, after the window length of the attenuation window of the current frame is subsequently determined, the attenuation window of the current frame can be determined directly, from the plurality of pre-stored attenuation windows, according to the value range into which the window length of the attenuation window of the current frame falls, which can reduce the calculation process and simplify the computational complexity.
It should be understood that the pre-selected attenuation window lengths used in calculating the attenuation windows may be all possible values of the attenuation window length or a subset of all possible values of the attenuation window length.
With reference to the first aspect, in some implementations of the first aspect, the attenuation window of the current frame satisfies the formula:
[Equation image: PCTCN2018101524-appb-000003]
where sub_window(i) is the attenuation window of the current frame, and MAX_ATTEN is a preset real number greater than 0.
It should be understood that the foregoing MAX_ATTEN may be the maximum of a plurality of attenuation values preset when the channel signal is encoded and decoded.
With reference to the first aspect, in some implementations of the first aspect, the modified linear prediction analysis window satisfies the formula:
[Equation image: PCTCN2018101524-appb-000004]
where w_adp(i) is the window function of the modified linear prediction analysis window, w(i) is the initial linear prediction analysis window, and sub_window(.) is the attenuation window of the current frame.
With reference to the first aspect, in some implementations of the first aspect, the determining a modified linear prediction analysis window according to the window length of the attenuation window of the current frame includes: determining the modified linear prediction analysis window from a plurality of pre-stored candidate linear prediction analysis windows according to the window length of the attenuation window of the current frame, where the plurality of candidate linear prediction analysis windows correspond to different window length value ranges, and there is no intersection between the different window length value ranges.
Determining the modified linear prediction analysis window from the plurality of pre-stored candidate linear prediction analysis windows can reduce the computational complexity of determining the modified linear prediction analysis window.
Specifically, after the corresponding modified linear prediction analysis windows are calculated, according to the initial linear prediction analysis window, for pre-selected attenuation window lengths that correspond to different value ranges of the attenuation window length, the modified linear prediction analysis windows corresponding to the different value ranges of the window length may be stored. In this way, after the window length of the attenuation window of the current frame is subsequently determined, the modified linear prediction analysis window can be determined directly, from the plurality of pre-stored linear prediction analysis windows, according to the value range into which the window length of the attenuation window of the current frame falls, which can reduce the calculation process and simplify the computational complexity.
Optionally, the pre-selected attenuation window lengths used in calculating the modified linear prediction analysis windows may be all possible values of the attenuation window length or a subset of all possible values of the attenuation window length.
With reference to the first aspect, in some implementations of the first aspect, before the modified linear prediction analysis window is determined according to the window length of the attenuation window of the current frame, the method further includes: modifying the window length of the attenuation window of the current frame according to a preset interval step, to obtain a modified attenuation window length, where the interval step is a preset positive integer; and the determining a modified linear prediction analysis window according to the window length of the attenuation window of the current frame includes: determining the modified linear prediction analysis window according to the initial linear prediction analysis window and the modified attenuation window length.
Optionally, the foregoing interval step is a positive integer smaller than the maximum attenuation window length.
By modifying the window length of the attenuation window of the current frame with the preset interval step, the possible values of the modified attenuation window length are restricted to a finite set, which makes it convenient to store the attenuation windows corresponding to those possible values and thereby reduces the complexity of subsequent calculation.
With reference to the first aspect, in some implementations of the first aspect, the modified attenuation window length satisfies the formula:
[Equation image: PCTCN2018101524-appb-000005]
where sub_window_len_mod is the modified attenuation window length, and len_step is the interval step.
With reference to the first aspect, in some implementations of the first aspect, the determining the modified linear prediction analysis window according to the initial linear prediction analysis window and the modified attenuation window length includes: modifying the initial linear prediction analysis window according to the modified attenuation window length.
With reference to the first aspect, in some implementations of the first aspect, the determining the modified linear prediction analysis window according to the initial linear prediction analysis window and the modified attenuation window length includes: determining the attenuation window of the current frame according to the modified attenuation window length; and modifying the initial linear prediction analysis window of the current frame according to the modified attenuation window.
With reference to the first aspect, in some implementations of the first aspect, the determining the attenuation window of the current frame according to the modified attenuation window length includes: determining the attenuation window of the current frame from a plurality of pre-stored candidate attenuation windows according to the modified attenuation window length, where the plurality of pre-stored candidate attenuation windows are the attenuation windows corresponding to different values of the modified attenuation window length.
After the corresponding attenuation windows are calculated for a set of pre-selected modified attenuation window lengths, the attenuation windows corresponding to the pre-selected modified attenuation window lengths may be stored. In this way, after the modified attenuation window length is subsequently determined, the attenuation window of the current frame can be determined directly from the plurality of pre-stored candidate attenuation windows according to the modified attenuation window length, which can reduce the calculation process and simplify the computational complexity.
It should be understood that the pre-selected modified attenuation window lengths here may be all possible values of the modified attenuation window length or a subset of all possible values of the modified attenuation window length.
With reference to the first aspect, in some implementations of the first aspect, the determining the modified linear prediction analysis window according to the initial linear prediction analysis window and the modified attenuation window length includes: determining the modified linear prediction analysis window from a plurality of pre-stored candidate linear prediction analysis windows according to the modified attenuation window length, where the plurality of pre-stored candidate linear prediction analysis windows are the modified linear prediction analysis windows corresponding to different values of the modified attenuation window length.
After the corresponding modified linear prediction analysis windows are calculated according to the initial linear prediction analysis window and a set of pre-selected modified attenuation window lengths, the modified linear prediction analysis windows corresponding to the pre-selected modified attenuation window lengths may be stored. In this way, after the modified attenuation window length is subsequently determined, the modified linear prediction analysis window can be determined directly from the plurality of pre-stored candidate linear prediction analysis windows according to the modified attenuation window length, which can reduce the calculation process and simplify the computational complexity.
Optionally, the pre-selected modified attenuation window lengths here are all possible values of the modified attenuation window length or a subset of all possible values of the modified attenuation window length.
According to a second aspect, an encoding apparatus is provided, and the encoding apparatus includes modules configured to perform the method of the first aspect or its various implementations.
According to a third aspect, an encoding apparatus is provided, including a memory and a processor, where the memory is configured to store a program, the processor is configured to execute the program, and when the program is executed, the processor performs the method in the first aspect or any possible implementation of the first aspect.
According to a fourth aspect, a computer-readable storage medium is provided, where the computer-readable medium stores program code for execution by a device, and the program code includes instructions for performing the method in the first aspect or its various implementations.
According to a fifth aspect, a chip is provided, where the chip includes a processor and a communication interface, the communication interface is configured to communicate with an external device, and the processor is configured to perform the method in the first aspect or any possible implementation of the first aspect.
Optionally, as an implementation, the chip may further include a memory, where the memory stores instructions, the processor is configured to execute the instructions stored in the memory, and when the instructions are executed, the processor is configured to perform the method in the first aspect or any possible implementation of the first aspect.
Optionally, as an implementation, the chip is integrated in a terminal device or a network device.
附图说明
图1是时域立体声编码方法的示意性流程图;
图2是时域立体声解码方法的示意性流程图;
图3是本申请实施例的立体声信号的编码方法的示意性流程图;
图4是根据本申请实施例的立体声信号的编码方法得到的线性预测系数与真实的线性预测系数之间的差异的频谱图;
图5是本申请实施例的立体声信号的编码方法的示意性流程图;
图6是本申请实施例的时延对齐处理的示意图;
图7是本申请实施例的时延对齐处理的示意图;
图8是本申请实施例的时延对齐处理的示意图;
图9是本申请实施例的线性预测分析过程的示意性流程图;
图10是本申请实施例的线性预测分析过程的示意性流程图;
图11是本申请实施例的编码装置的示意性框图;
图12是本申请实施例的编码装置的示意性框图;
图13是本申请实施例的终端设备的示意图;
图14是本申请实施例的网络设备的示意图;
图15是本申请实施例的网络设备的示意图;
图16是本申请实施例的终端设备的示意图;
图17是本申请实施例的网络设备的示意图;
图18是本申请实施例的网络设备的示意图。
具体实施方式
下面将结合附图,对本申请中的技术方案进行描述。
为了便于理解本申请实施例的立体声信号编码方法,下面先结合图1和图2对时域立体声编解码方法的大致编解码过程进行简单的介绍。
图1是时域立体声编码方法的示意性流程图。该编码方法100具体包括:
110、编码端对立体声信号进行声道间时间差估计,得到立体声信号的声道间时间差。
其中,上述立体声信号包括左声道信号和右声道信号,立体声信号的声道间时间差是指左声道信号和右声道信号之间的时间差。
120、根据估计得到的声道间时间差对左声道信号和右声道信号进行时延对齐处理。
130、对立体声信号的声道间时间差进行编码,得到声道间时间差的编码索引,写入立体声编码码流。
140、确定声道组合比例因子,并对声道组合比例因子进行编码,得到声道组合比例因子的编码索引,写入立体声编码码流。
150、根据声道组合比例因子对时延对齐处理后的左声道信号和右声道信号进行时域下混处理。
160、对下混处理后得到的主要声道信号和次要声道信号分别进行编码,得到主要声道信号和次要声道信号的码流,写入立体声编码码流。
图2是时域立体声解码方法的示意性流程图。该解码方法200具体包括:
210、根据接收到的码流解码得到主要声道信号和次要声道信号。
步骤210中的码流可以是解码端从编码端接收到的，另外，步骤210相当于分别进行主要声道信号解码和次要声道信号解码，以得到主要声道信号和次要声道信号。
220、根据接收到的码流解码得到声道组合比例因子。
230、根据声道组合比例因子对主要声道信号和次要声道信号进行时域上混处理,得到时域上混处理后的左声道重建信号和右声道重建信号。
240、根据接收到的码流解码得到声道间时间差。
250、根据声道间时间差对时域上混处理后的左声道重建信号和右声道重建信号进行时延调整,得到解码后的立体声信号。
在上述方法100的步骤120中进行时延对齐处理时,需要人工重建当前帧的目标声道的前向信号,但是当前帧的目标声道的人工重建的前向信号与真实的前向信号之间的差异较大。因此,在进行线性预测分析时,这部分人工重建的前向信号会导致步骤160中对下混处理后得到的主要声道信号和次要声道信号分别进行编码时线性预测分析得到的线性预测系数不够准确,线性预测分析得到的线性预测系数与真实的线性预测系数之间存在一定的差距,因此,需要提出一种新的立体声信号的编码方法,该编码方法能够提高线性预测分析的准确性,减小线性预测分析得到的线性预测系数与真实的线性预测系数之间的差异。
因此，本申请提出了一种新的立体声编码方法，该方法通过对初始线性预测分析窗进行修正，使得修正的线性预测分析窗中与当前帧的目标声道的人工重建的前向信号对应点的取值小于未经修正的线性预测分析窗中与当前帧的目标声道的人工重建的前向信号对应点的取值，从而在线性预测时减小当前帧的目标声道的人工重建的前向信号所起的作用，进而降低人工重建的前向信号与真实的前向信号之间的误差对线性预测分析结果的准确性的影响，这样就可以减小线性预测分析得到的线性预测系数与真实的线性预测系数之间的差异，提高线性预测分析的准确性。
图3是本申请实施例的编码方法的示意性流程图。该方法300可以由编码端执行,该编码端可以是编码器或者是具有编码立体声信号功能的设备。应理解,方法300可以是上述方法100的步骤160中对下混处理后得到的主要声道信号和次要声道信号进行编码的整个过程中的一部分。具体地,方法300可以是上述步骤160中对下混处理后得到的主要声道信号或者次要声道信号进行线性预测的过程。
上述方法300具体包括:
310、根据当前帧的声道间时间差确定当前帧的衰减窗的窗长。
可选地,可以直接将当前帧的声道间时间差的绝对值与当前帧预先设定的过渡段(过渡段位于当前帧的真实信号与人工重建的前向信号之间)的长度的和确定为衰减窗的窗长。
具体地,可以根据公式(1)确定当前帧的衰减窗的窗长。
sub_window_len=abs(cur_itd)+Ts2                              (1)
在公式(1)中,sub_window_len为衰减窗的窗长,cur_itd为当前帧的声道间时间差,abs(cur_itd)为当前帧的声道间时间差的绝对值,Ts2是为了增强当前帧的真实信号与人工重建的前向信号之间的平滑而预先设定的过渡段的长度。
由公式(1)可知,衰减窗的窗长的最大值满足公式(2)。
MAX_WIN_LEN=MAX_DELAY+Ts2                          (2)
其中，MAX_WIN_LEN为衰减窗的窗长的最大值，Ts2在公式(2)中的含义与在公式(1)中的含义相同，MAX_DELAY为预设的大于0的实数，进一步地，MAX_DELAY可以为声道间时间差的绝对值所能取到的最大值。对于不同的编解码器来说，声道间时间差的绝对值所能取到的最大值可能会有所不同，并且MAX_DELAY可以由用户或编解码器的厂家根据需要进行设定。可以理解的是，在编解码器工作时，MAX_DELAY的具体取值已经是一个确定的值。
例如，当立体声信号的采样率为16KHz时，MAX_DELAY可以为40，Ts2可以为10，此时根据公式(2)可知衰减窗的窗长的最大值MAX_WIN_LEN为50。
可选地,还可以根据当前帧的声道间时间差的绝对值与当前帧的预先设定的过渡段的长度的大小关系来确定当前帧的衰减窗的窗长。
具体而言，在当前帧的声道间时间差的绝对值大于等于当前帧的预先设定的过渡段的长度时，当前帧的衰减窗的窗长为当前帧的声道间时间差的绝对值与当前帧的预先设定的过渡段的长度的和；在当前帧的声道间时间差的绝对值小于当前帧的预先设定的过渡段的长度时，当前帧的衰减窗的窗长为当前帧的声道间时间差的绝对值的N倍，理论上，N可以是预先设置的任何大于0且小于L/MAX_DELAY的实数，而在一般情况下，N可以为预先设定的大于0小于等于2的整数。
具体地,可以根据公式(3)确定当前帧的衰减窗的窗长。
sub_window_len=abs(cur_itd)+Ts2，abs(cur_itd)≥Ts2
sub_window_len=N*abs(cur_itd)，abs(cur_itd)<Ts2                              (3)
在公式(3)中，sub_window_len为衰减窗的窗长，cur_itd为当前帧的声道间时间差，abs(cur_itd)为当前帧的声道间时间差的绝对值，Ts2是为了增强当前帧的真实信号与人工重建的前向信号之间的平滑而预先设定的过渡段的长度，N为预先设定的大于0且小于L/MAX_DELAY的实数，优选地，N为预先设定的大于0小于等于2的整数，例如N为2。
可选地,Ts2为预设的正整数,例如,在采样率为16KHz时,Ts2为10。另外,当立体声信号的采样率不同时,Ts2既可以设置相同的值,也可以设置不同的值。
当根据公式(3)确定当前帧的衰减窗的窗长时,衰减窗的窗长的最大值满足公式(4)或者公式(5)。
MAX_WIN_LEN=MAX_DELAY+Ts2                            (4)
MAX_WIN_LEN=N*MAX_DELAY                               (5)
例如，当立体声信号的采样率为16KHz时，MAX_DELAY可以为40，Ts2可以为10，N可以为2，此时根据公式(4)可知衰减窗的窗长的最大值MAX_WIN_LEN为50。
例如，当立体声信号的采样率为16KHz时，MAX_DELAY可以为40，Ts2可以为50，N可以为2，此时根据公式(5)可知衰减窗的窗长的最大值MAX_WIN_LEN为80。
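为便于理解，下面给出一段示意性的Python代码草图，用以说明按公式(1)或公式(3)确定当前帧的衰减窗的窗长的过程（其中Ts2、N、MAX_DELAY等参数的取值仅为示例，该代码仅为便于理解的示意，不构成对实现方式的限定）：

    def atten_window_len_simple(cur_itd, Ts2=10):
        # 按公式(1)：衰减窗的窗长 = 当前帧声道间时间差的绝对值 + 过渡段长度Ts2
        return abs(cur_itd) + Ts2

    def atten_window_len_branch(cur_itd, Ts2=10, N=2):
        # 按公式(3)：依据abs(cur_itd)与Ts2的大小关系分两种情况确定窗长
        if abs(cur_itd) >= Ts2:
            return abs(cur_itd) + Ts2
        return N * abs(cur_itd)

    # 示例：采样率为16KHz、MAX_DELAY=40、Ts2=10时，按公式(2)窗长最大值为50
    sub_window_len = atten_window_len_branch(-23)   # abs(-23)>=10，结果为33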
320、根据当前帧的衰减窗的窗长确定修正的线性预测分析窗，其中，修正的线性预测分析窗的第L-sub_window_len点至第L-1点中的至少一部分点的取值小于初始线性预测分析窗的第L-sub_window_len点至第L-1点中对应点的取值，sub_window_len为当前帧的衰减窗的窗长，L为修正的线性预测分析窗的窗长，修正的线性预测分析窗的窗长等于初始线性预测分析窗的窗长。
进一步地,上述修正的线性预测分析窗的第L-sub_window_len点至第L-1点中任意一点的取值小于初始线性预测分析窗的第L-sub_window_len点至第L-1点中对应点的取值。
其中,修正的线性预测分析窗的第L-sub_window_len点至第L-1点中任意一点在初始线性预测分析窗的第L-sub_window_len点至第L-1点中的对应点,是指在初始线性预测分析窗中与所述任意一点具有相同的索引(index)的点,例如,修正的线性预测分析窗中的第L-sub_window_len点在初始线性预测分析窗的对应点是初始线性预测分析窗中的第L-sub_window_len点。
可选地,根据当前帧的衰减窗的窗长确定修正的线性预测分析窗,具体包括:根据当前帧的衰减窗的窗长对初始线性预测分析窗进行修正,以得到修正的线性预测分析窗。进一步地,修正的线性预测分析窗从第L-sub_window_len点至第L-1点的取值相对于所述初始线性预测分析窗的第L-sub_window_len点至第L-1点中对应点的取值的衰减值逐渐增大。
应理解,上述衰减值可以是修正的线性预测分析窗中的点的取值相对于线性预测分析窗中相应点的取值的衰减值。例如,在确定修正的线性预测分析窗中第L-sub_window_len点的取值相对于线性预测分析窗中相应点的取值的衰减值时,具体可以通过线性预测分析窗中第L-sub_window_len点的取值与修正的线性预测分析窗中第L-sub_window_len点的取值的差值确定。
例如，第一点为修正的线性预测分析窗中第L-sub_window_len点至第L-1点中的任意一点，而第二点则是线性预测分析窗中与第一点相对应的对应点。那么，上述衰减值可以是第一点的取值与第二点的取值的差值。
应理解，根据当前帧的衰减窗的窗长对初始线性预测分析窗进行修正，是为了使得初始线性预测分析窗中第L-sub_window_len点至第L-1点中的至少一部分点的取值变小，也就是说，在对初始线性预测分析窗进行修正得到修正的线性预测分析窗之后，修正的线性预测分析窗的第L-sub_window_len点至第L-1点中的至少一部分点的取值小于初始线性预测分析窗中对应点的取值。
应理解,衰减窗的窗长范围内各个点对应的衰减值或者衰减窗中各个点的取值可以包含0,也可以不包含0。并且,衰减窗的窗长范围内各个点的取值以及衰减窗中各个点的取值既可以是小于等于0的实数,也可以是大于等于0的实数。
当衰减窗中各个点的取值是小于等于0的实数时,在根据衰减窗的窗长对初始线性预测分析窗进行修正时可以将初始线性预测分析窗中第L-sub_window_len点至第L-1点中任意一点的取值与衰减窗中对应点的取值相加,进而得到修正的线性预测分析窗中对应点的取值。
而当衰减窗中各个点的取值是大于等于0的实数时,在根据衰减窗的窗长对初始线性预测分析窗进行修正时可以将初始线性预测分析窗中第L-sub_window_len点至第L-1点中任意一点的取值与衰减窗中对应点的取值相减,进而得到修正的线性预测分析窗中对应点的取值。
上述两段内容介绍了衰减窗中各个点取值分别是大于等于0的实数以及小于等于0的实数时,确定修正的线性预测分析窗中对应点的取值的方式。应理解,当衰减窗的窗长范 围内各个点的取值分别是大于等于0的实数以及小于等于0的实数时也可以分别采用与上述两段内容中类似的方式来确定修正的线性预测分析窗中对应点的取值。
还应理解,当上述衰减窗中所有点的取值均为非零的实数时,对初始线性预测分析窗进行修正后,修正的线性预测分析窗中L-sub_window_len点至第L-1点中任意一点的取值均小于初始线性预测分析窗的第L-sub_window_len点至第L-1点中对应点的取值。
而当上述衰减窗中某些点的取值为0时,在对初始线性预测分析窗进行修正后,修正的线性预测分析窗中L-sub_window_len点至第L-1点中至少一部分点的取值均小于初始线性预测分析窗的第L-sub_window_len点至第L-1点中对应点的取值。
应理解,可以选择任何一种类型的线性预测分析窗作为当前帧的初始线性预测分析窗。具体而言,当前帧的初始线性预测分析窗既可以是对称窗也可以是非对称窗。
进一步地,当立体声信号的采样率为12.8KHz时,初始线性预测分析窗的窗长L可以为320点,这时初始线性预测分析窗w(n)满足公式(6):
Figure PCTCN2018101524-appb-000007
其中，L=L1+L2，L1=188，L2=132。
另外,初始线性预测分析窗的确定方式有多种,既可以通过实时运算得到初始线性预测分析窗,也可以是从预先存储的线性预测分析窗中直接获取初始线性预测分析窗,这些预先存储好的线性预测分析窗可以是通过运算得到并以表格的形式存储起来的。
与通过实时运算获取初始线性预测分析窗的方式相比,从预先存储的线性预测分析窗中获取线性预测分析窗的方式能够快速获取初始线性预测分析窗,降低计算的复杂度,提高编码效率。
在对声道信号进行时延对齐处理时,需要人工重建当前帧的目标声道的前向信号,但是在人工重建的前向信号中,离当前帧的目标声道的真实信号越远的点的信号值估计的越不准确,而修正后的线性预测分析窗会作用于人工重建的前向信号,因此,采用本申请中的修正的线性预测分析窗对前向信号进行处理时,可以降低人工重建的前向信号离真实信号较远的点的信号在线性预测分析时的所占的比重,能够进一步提高线性预测的准确性。
具体地,上述修正的线性预测分析窗满足公式(7),也就是说,可以根据公式(7)确定修正的线性预测分析窗。
Figure PCTCN2018101524-appb-000008
在公式(7)中,sub_window_len为当前帧的衰减窗的窗长,w adp(i)为修正的线性预测分析窗,w(i)为初始线性预测分析窗,L为修正的线性预测分析窗的窗长,
Figure PCTCN2018101524-appb-000009
其中,MAX_ATTEN为预设的大于0的实数。
应理解，上述MAX_ATTEN具体可以是在对初始线性预测分析窗进行衰减修正时所能取得的最大的衰减值，MAX_ATTEN的取值可以为0.07、0.08等，MAX_ATTEN可以由技术人员根据经验预先设定。
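为便于理解上述修正过程，下面给出一段示意性的Python代码草图，其中假设衰减值从0线性增大到MAX_ATTEN，该假设仅用于示意，具体的衰减形状以公式(7)为准，该代码不构成对实现方式的限定：

    def modify_lp_window(w, sub_window_len, MAX_ATTEN=0.07):
        # 对初始线性预测分析窗w的最后sub_window_len个点施加逐渐增大的衰减，
        # 衰减值从0线性增大到MAX_ATTEN仅为本示例的假设，具体以公式(7)为准
        L = len(w)
        w_adp = list(w)
        for i in range(L - sub_window_len, L):
            atten = MAX_ATTEN * (i - (L - sub_window_len) + 1) / sub_window_len
            w_adp[i] = w[i] - atten
        return w_adp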
可选地,作为一个实施例,根据初始线性预测分析窗以及当前帧的衰减窗的窗长确定修正的线性预测分析窗具体包括:根据当前帧的衰减窗的窗长确定当前帧的衰减窗;根据当前帧的衰减窗对初始线性预测分析窗进行修正,其中,该修正的线性预测分析窗从第L-sub_window_len点至第L-1点的取值相对于初始线性预测分析窗的第L-sub_window_len点至第L-1点中对应点的取值的衰减值逐渐增大。其中,衰减值逐渐增大是指随着修正的线性预测分析窗从第L-sub_window_len点至第L-1点中的点的索引(index)逐渐增大,衰减值也逐渐增大,也就是说,第L-sub_window_len点的衰减值最小,第L-1点的衰减值最大,并且第N点的衰减值要大于第N-1点的衰减值,L-sub_window_len≤N≤L-1。
应理解,上述衰减窗既可以是线性窗也可以是非线性窗。
具体地,在根据当前帧的衰减窗的窗长确定衰减窗时上述衰减窗满足公式(8),也就是说可以根据公式(8)确定衰减窗。
Figure PCTCN2018101524-appb-000010
其中,MAX_ATTEN为衰减值中的最大值,MAX_ATTEN在公式(8)中的含义与在公式(7)中的含义相同。
上述根据当前帧的衰减窗对线性预测分析窗修正后得到的修正的线性预测分析窗满足公式(9)，也就是说，在根据上述公式(8)确定了衰减窗之后，接下来可以根据公式(9)确定修正的线性预测分析窗。
Figure PCTCN2018101524-appb-000011
在上述公式(8)和公式(9)中,sub_window_len为当前帧的衰减窗的窗长,sub_window(.)为当前帧的衰减窗,具体地,sub_window(i-(L-sub_window_len))为当前帧的衰减窗在点i-(L-sub_window_len)处的取值,w adp(i)为修正的线性预测分析窗,w(i)为初始线性预测分析窗,L为修正的线性预测分析窗的窗长。
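公式(8)和公式(9)所对应的“先确定衰减窗、再用衰减窗修正初始线性预测分析窗”的两步过程，可以用如下示意性的Python代码草图表示，其中衰减窗取线性递增的形状仅为本示例的假设，具体以公式(8)为准：

    def build_atten_window(sub_window_len, MAX_ATTEN=0.07):
        # 示意性的线性衰减窗：取值逐渐增大到MAX_ATTEN（形状仅为假设，具体以公式(8)为准）
        return [MAX_ATTEN * (i + 1) / sub_window_len for i in range(sub_window_len)]

    def apply_atten_window(w, sub_window):
        # 按公式(9)的思路，仅修正初始窗的最后sub_window_len个点：
        # w_adp(i) = w(i) - sub_window(i-(L-sub_window_len))
        L, sub_window_len = len(w), len(sub_window)
        w_adp = list(w)
        for i in range(L - sub_window_len, L):
            w_adp[i] = w[i] - sub_window[i - (L - sub_window_len)]
        return w_adp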
可选地,在根据当前帧的衰减窗的窗长确定衰减窗时,具体也可以根据当前帧的衰减窗的窗长,从预先存储的多个候选衰减窗中确定当前帧的衰减窗,其中,多个候选衰减窗对应不同的窗长取值范围,不同的窗长取值范围之间没有交集。
通过从预先存储的多个候选衰减窗中确定当前帧的衰减窗,能够减少确定衰减窗时的计算复杂度,接下来可以直接根据从预先存储的多个衰减窗中获取的当前帧的衰减窗来确定修正的线性预测分析窗。
具体地,在根据衰减窗的窗长在不同取值范围对应的预先选定的衰减窗的窗长分别计算出对应的衰减窗之后,可以将衰减窗的窗长在不同取值范围内对应的衰减窗存储起来,这样可以在后续确定当前帧的衰减窗的窗长后能够直接根据当前帧的衰减窗的窗长满足的取值范围从预先存储的多个衰减窗中确定出当前帧的衰减窗,能够减少计算过程,简化计算的复杂度。
应理解,在计算衰减窗时预先选定的衰减窗窗长可以是衰减窗的窗长的所有可能取值或者衰减窗的窗长的所有可能取值的子集。
具体地,假设衰减窗的窗长为20时对应的衰减窗记作sub_window_20(i),衰减窗的窗长为40时对应的衰减窗记作sub_window_40(i),衰减窗的窗长为60时对应的衰减窗记作sub_window_60(i),衰减窗的窗长为80时对应的衰减窗记作sub_window_80(i)。
因此,在根据当前帧的衰减窗的窗长,从预先存储的多个衰减窗中确定当前帧的衰减窗时,如果当前帧的衰减窗的窗长大于等于20,并且小于40,那么可以将sub_window_20(i)确定为当前帧的衰减窗;如果当前帧的衰减窗的窗长大于等于40,并且小于60,那么可以将sub_window_40(i)确定为当前帧的衰减窗;如果当前帧的衰减窗的窗长大于等于60,并且小于80,那么可以将sub_window_60(i)确定为当前帧的衰减窗;如果当前帧的衰减窗的窗长大于等于80,那么可以将sub_window_80(i)确定为当前帧的衰减窗。
具体地,在根据当前帧的衰减窗的窗长,从预先存储的多个衰减窗中确定当前帧的衰减窗时,可以直接根据当前帧的衰减窗的窗长的取值范围从多个预先存储的衰减窗中确定当前帧的衰减窗。具体而言,可以根据公式(10)来确定当前帧的衰减窗。
sub_window(i)=sub_window_20(i)，20≤sub_window_len<40
sub_window(i)=sub_window_40(i)，40≤sub_window_len<60
sub_window(i)=sub_window_60(i)，60≤sub_window_len<80
sub_window(i)=sub_window_80(i)，sub_window_len≥80                              (10)
其中,sub_window(i)为当前帧的衰减窗,sub_window_len为当前帧的衰减窗的窗长,sub_window_20(i),sub_window_40(i),sub_window_60(i),sub_window_80(i)为预先存储的衰减窗窗长分别为20、40、60、80时对应的衰减窗。
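公式(10)所描述的按窗长取值范围从预先存储的多个候选衰减窗中选取当前帧的衰减窗的过程，可以用如下示意性的Python代码草图表示（其中预存衰减窗的具体取值仅为构造示例，不构成对实现方式的限定）：

    def make_linear_atten(n, MAX_ATTEN=0.07):
        # 构造示意性的候选衰减窗，仅用于演示选取逻辑
        return [MAX_ATTEN * (i + 1) / n for i in range(n)]

    # 预先存储的候选衰减窗，分别对应窗长20、40、60、80
    sub_window_20, sub_window_40 = make_linear_atten(20), make_linear_atten(40)
    sub_window_60, sub_window_80 = make_linear_atten(60), make_linear_atten(80)

    def select_atten_window(sub_window_len):
        # 按公式(10)：根据当前帧衰减窗窗长所处的取值范围选取预存的候选衰减窗
        if sub_window_len >= 80:
            return sub_window_80
        if 60 <= sub_window_len < 80:
            return sub_window_60
        if 40 <= sub_window_len < 60:
            return sub_window_40
        if 20 <= sub_window_len < 40:
            return sub_window_20
        return None   # 窗长小于20的情况公式(10)未给出，此处不作处理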
应理解,上述公式(10)确定的衰减窗为线性窗。本申请中的衰减窗除了可以是线性窗之外,还可以是非线性窗。
当衰减窗为非线性窗时,可以根据公式(11)至公式(13)来确定。
Figure PCTCN2018101524-appb-000013
Figure PCTCN2018101524-appb-000014
Figure PCTCN2018101524-appb-000015
在上述公式(11)至公式(13)中,sub_window(i)为当前帧的衰减窗,sub_window_len为当前帧的衰减窗的窗长,MAX_ATTEN与上文中的含义相同。
应理解，在根据公式(11)至公式(13)确定了衰减窗之后，也可以根据公式(9)确定修正的线性预测分析窗。
上述根据当前帧的衰减窗对线性预测分析窗修正后得到的修正的线性预测分析窗满足公式(14)至公式(17)，也就是说，在根据上述公式(10)确定了衰减窗之后，接下来可以根据公式(14)至公式(17)确定修正的线性预测分析窗。
Figure PCTCN2018101524-appb-000016
Figure PCTCN2018101524-appb-000017
Figure PCTCN2018101524-appb-000018
Figure PCTCN2018101524-appb-000019
在上述公式(14)至公式(17)中,sub_window_len为当前帧的衰减窗的窗长,w adp(i)为修正的线性预测分析窗,w(i)为初始线性预测分析窗,L为修正的线性预测分析窗的窗长。sub_window_20(.),sub_window_40(.),sub_window_60(.),sub_window_80(.)为预先存储的衰减窗窗长分别为20、40、60、80的情况下对应的衰减窗,可以根据公式(10)至公式(13)中的任意一种事先计算并存储衰减窗窗长分别为20、40、60、80的情况下对应的衰减窗。
在根据公式(14)至公式(17)计算修正的线性预测分析窗时,只要得知了当前帧的衰减窗的窗长,就可以根据衰减窗窗长的取值所在的范围确定修正的线性预测分析窗。例如当前帧的衰减窗的窗长为50,那么当前帧的衰减窗的窗长的取值介于40和60之间(大于等于40小于60),因此,可以按照公式(15)确定修正的线性预测分析窗;而如果当前帧的衰减窗的窗长为70,那么当前帧的衰减窗的窗长的取值介于60和80之间(大于等于60小于80),此时可以按照公式(16)来确定修正的线性预测分析窗。
330、根据修正的线性预测分析窗对待处理的声道信号进行线性预测分析。
上述待处理的声道信号可以是主要声道信号或者次要声道信号,进一步地,上述待处理的声道信号还可以是对主要声道信号或者次要声道信号进行时域预处理后得到的声道信号。该主要声道信号和次要声道信号可以是下混处理后得到的声道信号。
根据修正的线性预测分析窗对待处理的声道信号进行线性预测分析,具体可以是根据修正的线性预测分析窗对待处理的声道信号进行加窗处理,然后再根据加窗处理后的信号计算当前帧的线性预测系数(具体可以采用莱文逊杜宾算法)。
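上述“先按修正的线性预测分析窗加窗、再采用莱文逊杜宾算法计算线性预测系数”的过程，可以用如下示意性的Python代码草图表示（其中预测阶数order的取值仅为示例，该代码仅为便于理解的示意，不构成对实现方式的限定）：

    def lp_analysis(signal, w_adp, order=16):
        # 1) 按修正的线性预测分析窗w_adp对待处理的声道信号加窗（要求len(signal)>=len(w_adp)）
        L = len(w_adp)
        s_w = [signal[n] * w_adp[n] for n in range(L)]
        # 2) 计算加窗信号的自相关
        r = [sum(s_w[n] * s_w[n - k] for n in range(k, L)) for k in range(order + 1)]
        # 3) 莱文逊杜宾递推求当前帧的线性预测系数
        a = [1.0] + [0.0] * order
        err = r[0]
        for m in range(1, order + 1):
            if err <= 0:
                break
            k_m = -(r[m] + sum(a[j] * r[m - j] for j in range(1, m))) / err
            new_a = a[:]
            for j in range(1, m):
                new_a[j] = a[j] + k_m * a[m - j]
            new_a[m] = k_m
            a = new_a
            err *= (1.0 - k_m * k_m)
        return a[1:]   # 当前帧的线性预测系数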
本申请中,由于修正的线性预测分析窗中第L-sub_window_len点至第L-1点中的至少一部分点的取值小于所述线性预测分析窗的第L-sub_window_len点至第L-1点中对应点的取值,从而在线性预测时能够减小当前帧的目标声道的人工重建的重建信号(该重建信号可以包括过渡段信号和前向信号)所起的作用,从而降低人工重建的重建信号与真实的前向信号之间的误差对线性预测分析结果的准确性的影响,因此,可以减小线性预测分析得到的线性预测系数与真实的线性预测系数之间的差异,提高线性预测分析的准确性。
具体地,如图4所示,现有方案中得到的线性预测系数与真实的线性预测系数之间的谱失真较大,而本申请得到的线性预测系数与真实的线性预测系数之间的谱失真较小,由此可见,本申请实施例的立体声信号的编码方法能够减小线性预测分析时得到的线性预测系数的谱失真,改善线性预测分析的准确性。
可选地,作为一个实施例,根据当前帧的衰减窗的窗长确定修正的线性预测分析窗,包括:根据当前帧的衰减窗的窗长,从预先存储的多个候选线性预测分析窗中确定修正的线性预测分析窗,其中,多个候选线性预测分析窗对应不同的窗长取值范围,不同的窗长取值范围之间没有交集。
预先存储的多个候选线性预测分析窗为所述当前帧的衰减窗的窗长在不同取值范围内对应的修正的线性预测分析窗。
具体地,在根据初始线性预测分析窗以及衰减窗的窗长在不同取值范围对应的预先选定的衰减窗的窗长分别计算出对应的修正的线性预测分析窗之后,可以将衰减窗的窗长在 不同取值范围内对应的修正的线性预测分析窗存储起来,这样可以在后续确定当前帧的衰减窗的窗长后能够直接根据当前帧的衰减窗的窗长满足的取值范围从预先存储的多个线性预测分析窗中确定出修正的线性预测分析窗,能够减少计算过程,简化计算的复杂度。
可选地,上述计算修正的线性预测分析窗时预先选定的衰减窗窗长可以是衰减窗的窗长的所有可能取值或者衰减窗的窗长的所有可能取值的子集。
具体地,在根据当前帧的衰减窗的窗长,从预先存储的多个候选线性预测分析窗中确定修正的线性预测分析时,可以根据公式(18)来确定修正的线性预测分析窗。
Figure PCTCN2018101524-appb-000020
其中,w adp(i)为修正的线性预测分析窗,w(i)为初始线性预测分析窗,w adp_20(i),w adp_40(i),w adp_60(i),w adp_80(i)为预先存储的多个线性预测分析窗。具体地,w adp_20(i),w adp_40(i),w adp_60(i),w adp_80(i)对应的衰减窗的窗长分别为20、40、60和80。
在根据公式(18)确定修正的线性预测分析窗时,在确定了当前帧的衰减窗的窗长的数值之后,根据当前帧的衰减窗的窗长满足的取值范围,可以直接根据公式(18)确定出修正的线性预测分析窗。
可选地,作为一个实施例,在根据衰减窗的窗长确定修正的线性预测分析窗之前,上述方法300还包括:
根据预设的间隔步长,对当前帧的衰减窗的窗长进行修正,以获得修正的衰减窗的窗长,其中,该间隔步长为预设的正整数。进一步地,该间隔步长可以为小于衰减窗的窗长最大值的正整数;
而当对衰减窗的窗长进行修正时,上述根据衰减窗的窗长确定修正的线性预测分析窗,具体包括:根据初始线性预测分析窗以及修正的衰减窗的窗长确定修正的线性预测分析窗。
具体地,可以先根据当前帧的声道间时间差确定当前帧的衰减窗的窗长,然后再根据预设的间隔步长,对衰减窗的窗长进行修正,得到修正的衰减窗的窗长。
通过采用预设的间隔步长对自适应衰减窗的窗长进行修正,能够降低衰减窗的窗长,并且使得修正后的衰减窗的窗长的取值属于由有限个常数组成的集合中,便于预先存储,能够降低后续计算的复杂度。
上述修正的衰减窗的窗长满足公式(19),也就是说在根据预设的间隔步长对衰减窗的窗长进行修正时具体可以根据公式(19)对衰减窗的窗长进行修正。
Figure PCTCN2018101524-appb-000021
其中,sub_window_len_mod为修正的衰减窗的窗长,
Figure PCTCN2018101524-appb-000022
为取整符号,sub_window_len为衰减窗的窗长,len_step为间隔步长,间隔步长可以是小于自适应衰减窗的窗长最大值的正整数,例如,15、20等等,间隔步长也可以由技术人员预先设定。
当sub_window_len的最大值为80，len_step为20时，修正的衰减窗的窗长的取值仅包含0,20,40,60,80，也就是说修正的衰减窗的窗长只属于{0,20,40,60,80}，当修正的衰减窗的窗长为0时，直接使用初始线性预测分析窗作为修正的线性预测分析窗。
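按预设的间隔步长对衰减窗的窗长进行修正的过程，可以用如下示意性的Python代码草图表示，其中采用向下取整仅为本示例的假设，具体取整方式以公式(19)为准：

    def modify_window_len(sub_window_len, len_step=20):
        # 将衰减窗窗长按间隔步长len_step取整（此处采用向下取整，仅为本示例的假设）
        return (sub_window_len // len_step) * len_step

    # len_step=20、窗长最大值为80时，修正后的窗长只会取0、20、40、60、80
    # 例如：modify_window_len(50)结果为40，modify_window_len(70)结果为60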
可选地,作为一个实施例,根据初始线性预测分析窗以及修正的衰减窗的窗长确定修正的线性预测分析窗,包括:根据所述修正的衰减窗的窗长对所述初始线性预测分析窗进行修正。
可选地,作为一个实施例,根据初始线性预测分析窗以及修正的衰减窗的窗长确定修正的线性预测分析窗,还包括:根据所述修正的衰减窗的窗长确定所述当前帧的衰减窗;根据所述修正的衰减窗对所述初始线性预测分析窗进行修正。
可选地,作为一个实施例,根据修正的衰减窗的窗长确定当前帧的衰减窗,包括:根据修正的衰减窗的窗长,从预先存储的多个候选衰减窗中确定当前帧的衰减窗,其中预先存储的多个候选衰减窗为修正的衰减窗的窗长在不同取值时对应的衰减窗。
在根据一组预先选定的修正的衰减窗的窗长分别计算出对应的衰减窗之后,可以将预先选定的修正的衰减窗的窗长对应的衰减窗存储起来,这样可以在后续确定修正的衰减窗的窗长后能够直接根据修正的衰减窗的窗长从预先存储的多个候选衰减窗中确定出当前帧的衰减窗,能够减少计算过程,简化计算的复杂度。
应理解,这里预先选定的修正的衰减窗的窗长可以为修正的衰减窗的窗长的所有可能的取值或修正的衰减窗的窗长的所有可能的取值的子集。
具体地,在根据修正的衰减窗的窗长,从预先存储的多个候选衰减窗中确定当前帧的衰减窗时,可以根据公式(20)来确定当前帧的衰减窗。
sub_window(i)=sub_window_20(i)，sub_window_len_mod=20
sub_window(i)=sub_window_40(i)，sub_window_len_mod=40
sub_window(i)=sub_window_60(i)，sub_window_len_mod=60
sub_window(i)=sub_window_80(i)，sub_window_len_mod=80                              (20)
其中,sub_window(i)为当前帧的衰减窗,sub_window_len_mod为修正的衰减窗的窗长,sub_window_20(i),sub_window_40(i),sub_window_60(i),sub_window_80(i)为预先存储的衰减窗窗长分别为20、40、60、80的情况下对应的衰减窗。因为当sub_window_len_mod等于0时,直接使用初始线性预测分析窗作为修正的线性预测分析窗,无须确定当前帧的衰减窗。
可选地,作为一个实施例,根据初始线性预测分析窗以及修正的衰减窗的窗长确定修正的线性预测分析窗,包括:根据修正的衰减窗的窗长,从预先存储的多个候选线性预测分析窗中确定修正的线性预测分析窗,其中预先存储的多个候选线性预测分析窗为修正的衰减窗的窗长在不同取值时对应的修正的线性预测分析窗。
在根据初始线性预测分析窗以及一组预先选定的修正的衰减窗的窗长分别计算出对应的修正的线性预测分析窗之后,可以将预先选定的修正的衰减窗的窗长对应的修正的线性预测分析窗存储起来,这样可以在后续确定修正的衰减窗的窗长后能够直接根据修正的衰减窗的窗长从预先存储的多个候选线性预测分析窗中确定出修正的线性预测分析窗,能够减少计算过程,简化计算的复杂度。
可选地,这里预先选定的修正的衰减窗的窗长为修正的衰减窗的窗长的所有可能的取值或修正的衰减窗的窗长的所有可能的取值的子集。
具体地,在根据修正的衰减窗的窗长,从预先存储的多个候选线性预测分析窗中确定修正的线性预测分析时,可以根据公式(21)来确定修正的线性预测分析窗。
Figure PCTCN2018101524-appb-000024
其中,w adp(i)为修正的线性预测分析窗,w(i)为初始线性预测分析窗,w adp_20(i),w adp_40(i),w adp_60(i),w adp_80(i)为预先存储的多个线性预测分析窗。具体地,w adp_20(i),w adp_40(i),w adp_60(i),w adp_80(i)对应的衰减窗的窗长分别为20、40、60和80。
应理解,上述图3所示的方法300是立体声信号编码的一部分过程,为了更好地理解本申请的立体声信号的编码方法,下面结合图5至图10对本申请实施例的立体声信号的编码方法的整个过程进行详细的介绍。
图5是本申请实施例的立体声信号的编码方法的示意性流程图。图5的方法500具体包括:
510、对立体声信号进行时域预处理。
具体地,这里的立体声信号为时域信号,立体声信号具体包括左声道信号和右声道信号,在对立体声信号进行时域处理时,具体可以是对当前帧的左、右声道信号进行高通滤波处理,得到预处理后的当前帧的左、右声道信号。另外,这里的时域预处理时除了高通滤波处理外还可以是其它处理,例如,进行预加重处理。
例如，立体声音频信号的采样率为16KHz，每帧信号为20ms，则帧长N=320，即每一帧包括320个样点。当前帧的立体声信号包括当前帧的左声道时域信号x L(n)，当前帧的右声道时域信号x R(n)，其中，n为样点序号，n=0,1,…,N-1，那么，通过对当前帧的左声道时域信号x L(n)，当前帧的右声道时域信号x R(n)进行时域预处理，得到当前帧预处理后的左声道时域信号
Figure PCTCN2018101524-appb-000025
和当前帧预处理后的右声道时域信号
Figure PCTCN2018101524-appb-000026
520、对当前帧预处理后的左、右声道时域信号进行声道间时间差估计,得到当前帧的左、右声道信号的声道间时间差。
在进行声道间时间差估计时具体可以根据当前帧预处理后的左、右声道信号计算左右声道间的互相关系数,然后将互相关系数的最大值对应的索引值作为当前帧的声道间时间差。
具体地,可以采用方式一至方式三中的方式来进行声道间时间差的估计。应理解,本申请在进行声道间时间差估计时不限于方式一至方式三中的方法,本申请还可以采用其它现有技术来实现对声道间时间差的估计。
方式一:
在当前采样率下,声道间时间差的最大值和最小值分别是T max和T min,其中,T max和T min为预先设定的实数,并且T max>T min,那么,可以搜索索引值在声道间时间差的最大值和最小值之间的左右声道间的互相关系数的最大值,最后将该搜索到的左右声道间的 互相关系数的最大值对应的索引值确定为当前帧的声道间时间差。具体地,T max和T min的取值可以分别为40和-40,这样就可以在-40≤i≤40范围内搜索左右声道间的互相关系数的最大值,然后将互相关系数的最大值对应的索引值作为当前帧的声道间时间差。
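方式一中在T min≤i≤T max范围内搜索左右声道间互相关最大值的过程，可以用如下示意性的Python代码草图表示（其中互相关的具体计算和归一化方式为本示例的假设，不构成对实现方式的限定）：

    def estimate_itd(x_left, x_right, T_min=-40, T_max=40):
        # 在T_min<=i<=T_max范围内搜索左右声道互相关的最大值，
        # 并将最大值对应的索引i作为当前帧的声道间时间差（示意实现）
        N = min(len(x_left), len(x_right))
        best_i, best_corr = 0, float('-inf')
        for i in range(T_min, T_max + 1):
            corr = 0.0
            for n in range(N):
                if 0 <= n - i < N:
                    corr += x_left[n] * x_right[n - i]
            if corr > best_corr:
                best_i, best_corr = i, corr
        return best_i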
方式二:
当前采样率下的声道间时间差的最大值和最小值分别是T max和T min,其中,T max和T min为预先设定的实数,并且T max>T min。那么,可以根据当前帧的左、右声道信号计算左右声道间的互相关函数,并根据前L帧(L为大于等于1的整数)的左右声道间的互相关函数对计算出来的当前帧的左右声道间的互相关函数进行平滑处理,得到平滑处理后的左右声道间的互相关函数,然后在T min≤i≤T max范围内搜索平滑处理后的左右声道间的互相关系数的最大值,并将该最大值对应的索引值i作为当前帧的声道间时间差。
方式三:
在根据方式一或方式二估计出了当前帧的声道间时间差之后,对当前帧的前M帧(M为大于等于1的整数)的声道间时间差和当前帧估计出的声道间时间差进行帧间平滑处理,将平滑处理后的声道间时间差作为当前帧最终的声道间时间差。
应理解,上述步骤510中对当前帧的左、右声道时域信号进行时域预处理并不是必须的步骤。如果没有时域预处理的步骤,那么,进行声道间时间差估计的左、右声道信号就是原始立体声信号中的左、右声道信号。该原始立体声信号中的左、右声道信号可以是指采集到的经过模数(A/D)转换后的脉冲编码调制(Pulse Code Modulation,PCM)信号。另外,立体声音频信号的采样率可以为8KHz、16KHz、32KHz、44.1KHz以及48KHz等等。
530、根据估计出来的声道间时间差对当前帧预处理后的左、右声道信号时域信号进行时延对齐处理。
具体地，在对当前帧的左、右声道信号进行时延对齐处理时可以根据当前帧的声道间时间差对左声道信号和右声道信号中的一路或者两路进行压缩或者拉伸处理，使得时延对齐处理后的左、右声道信号之间不存在声道间时间差。对当前帧的左、右声道信号时延对齐处理后得到的当前帧的时延对齐处理后的左、右声道信号即为当前帧的时延对齐处理后的立体声信号。
在根据声道间时间差对当前帧的左、右声道信号进行时延对齐处理时,首先要根据当前帧的声道间时延差和前一帧的声道间时延差选择当前帧的目标声道以及参考声道。然后根据当前帧的声道间时间差的绝对值abs(cur_itd)与当前帧的前一帧的声道间时间差的绝对值abs(prev_itd)的大小关系可以采取不同的方式进行时延对齐处理。
当前帧的声道间时延差记作cur_itd,前一帧声道间时延差记作prev_itd。具体地,根据当前帧的声道间时延差和前一帧的声道间时延差选择当前帧的目标声道以及参考声道可以是:如果cur_itd=0:则当前帧的目标声道与前一帧的目标声道保持一致;如果cur_itd<0:则当前帧的目标声道为左声道;如果cur_itd>0:则当前帧的目标声道为右声道。
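上述根据当前帧的声道间时延差的符号选择目标声道的逻辑，可以用如下示意性的Python代码草图表示：

    def select_target_channel(cur_itd, prev_target):
        # cur_itd等于0时目标声道与前一帧保持一致；小于0时为左声道；大于0时为右声道
        if cur_itd == 0:
            return prev_target
        return 'left' if cur_itd < 0 else 'right'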
在确定了目标声道和参考声道之后,可以根据当前帧的声道间时间差的绝对值abs(cur_itd)与当前帧的前一帧的声道间时间差的绝对值abs(prev_itd)的不同大小关系采取不同的时延对齐处理方式,具体可以包含以下三种情况,应理解,本申请在时延对齐处理时采用的处理方式不限于以下三种情况中的处理方式,本申请还可以采用其它任何现有技术中的时延对齐处理方式来进行时延对齐处理。
情况一:abs(cur_itd)等于abs(prev_itd)
在当前帧的声道间时间差的绝对值与当前帧的前一帧的声道间时间差的绝对值相等的情况下,不对目标声道的信号进行压缩或者拉伸处理。如图6所示,根据当前帧的参考声道信号以及当前帧的目标声道信号产生Ts2点信号,作为时延对齐处理后的目标声道的第N-Ts2点到N-1点信号,并根据参考声道信号人工重建abs(cur_itd)点信号,作为时延对齐处理后的目标声道的第N点到N+abs(cur_itd)-1点信号。其中,abs()表示取绝对值操作,当前帧的帧长为N,如果采样率为16KHz,则N=320,Ts2为预先设定的过渡段的长度,例如,Ts2=10。
最终在时延对齐处理后,是将当前帧的目标声道信号时延abs(cur_itd)个样点的信号,作为当前帧时延对齐后的目标声道信号,将当前帧的参考声道信号直接作为当前帧时延对齐后的参考声道信号。
情况二:abs(cur_itd)小于abs(prev_itd)
如图7所示，在当前帧的声道间时间差的绝对值小于当前帧的前一帧的声道间时间差的绝对值的情况下，需要对缓存的目标声道信号进行拉伸。具体地，将当前帧缓存的目标声道信号中从第-ts+abs(prev_itd)-abs(cur_itd)到第L-ts-1点的信号拉伸为长度L点的信号，作为时延对齐处理后的目标声道的第-ts点到第L-ts-1点信号。然后将当前帧的目标声道信号中从第L-ts点到N-Ts2-1点的信号直接作为时延对齐处理后的目标声道的第L-ts点到N-Ts2-1点信号。接下来，再根据当前帧的参考声道信号以及目标声道信号产生Ts2点信号，作为时延对齐处理后的目标声道的第N-Ts2点到N-1点信号。最后根据参考声道信号人工重建abs(cur_itd)点信号，作为时延对齐处理后的目标声道的第N点到N+abs(cur_itd)-1点信号。其中，ts为帧间平滑过渡段的长度，例如ts为abs(cur_itd)/2，L为时延对齐处理的处理长度，L可以是预设的小于等于当前速率下帧长N的任意正整数，一般会设为大于允许的最大声道间时延差的正整数，例如L=290，L=200等。时延对齐处理的处理长度L可以针对不同的采样率设置不同的值，也可以采用统一的值。一般情况下，最简单的方法就是根据技术人员的经验预设一个值，例如290。
最终在时延对齐处理后,是将时延对齐处理后的目标声道从第abs(cur_itd)点开始的N点信号,作为时延对齐后的当前帧的目标声道信号。将当前帧的参考声道信号直接作为时延对齐后当前帧的参考声道信号。
情况三:abs(cur_itd)大于abs(prev_itd)
如图8所示，在当前帧的声道间时间差的绝对值大于当前帧的前一帧的声道间时间差的绝对值的情况下，需要对缓存的目标声道信号进行压缩。具体地，将当前帧缓存的目标声道信号中从第-ts+abs(prev_itd)-abs(cur_itd)到第L-ts-1点的信号压缩为长度为L点的信号，作为时延对齐处理后的目标声道的第-ts点到第L-ts-1点信号。接下来，将当前帧的目标声道信号中从第L-ts点到N-Ts2-1点的信号直接作为时延对齐处理后的目标声道的第L-ts点到N-Ts2-1点信号。然后根据当前帧的参考声道信号以及目标声道信号产生Ts2点信号，作为时延对齐处理后的目标声道的第N-Ts2点到N-1点信号。然后再根据参考声道信号产生abs(cur_itd)点信号，作为时延对齐处理后的目标声道的第N点到N+abs(cur_itd)-1点信号。其中，L仍是时延对齐处理的处理长度。
最终在时延对齐处理后,仍是将时延对齐处理后的目标声道从第abs(cur_itd)点开始的N点信号,作为时延对齐后的当前帧的目标声道信号。将当前帧的参考声道信号直接作为 时延对齐后当前帧的参考声道信号。
540、对声道间时间差进行量化编码。
具体地,在对当前帧的声道间时间差进行量化编码时,可以使用任何现有技术中的量化算法对当前帧的声道间时间差进行量化处理,得到量化索引,并将量化索引编码后写入码流。
550、计算声道组合比例因子,并对其进行量化编码。
计算声道组合比例因子的方法多种,例如,可以根据左右声道的帧能量来计算当前帧的声道组合比例因子。具体过程如下:
(1)、根据当前帧时延对齐后的左右声道信号,计算左右声道信号的帧能量。
当前帧左声道的帧能量rms_L满足:
Figure PCTCN2018101524-appb-000027
当前帧右声道的帧能量rms_R满足:
Figure PCTCN2018101524-appb-000028
其中,x′ L(i)为当前帧时延对齐后的左声道信号,x′ R(i)为当前帧时延对齐后的右声道信号,i为样点序号。
(2)、然后再根据左右声道的帧能量,计算当前帧的声道组合比例因子。
当前帧的声道组合比例因子ratio满足:
Figure PCTCN2018101524-appb-000029
因此,根据左右声道信号的帧能量就计算得到了声道组合比例因子。
(3)、量化编码声道组合比例因子,写入码流。
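上述步骤550中先计算左右声道的帧能量、再计算声道组合比例因子的过程，可以用如下示意性的Python代码草图表示，其中帧能量按均方根计算、声道组合比例因子取rms_L/(rms_L+rms_R)均为本示例的假设，具体以上文给出的公式为准：

    import math

    def channel_combination_ratio(xL, xR):
        # 帧能量（此处按均方根计算，仅为示意）
        N = len(xL)
        rms_L = math.sqrt(sum(v * v for v in xL) / N)
        rms_R = math.sqrt(sum(v * v for v in xR) / N)
        if rms_L + rms_R == 0:
            return 0.5
        # 声道组合比例因子（此处取rms_L/(rms_L+rms_R)，仅为示意）
        return rms_L / (rms_L + rms_R)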
560、根据声道组合比例因子对时延对齐后的立体声信号进行时域下混处理,得到主要声道信号和次要声道信号。
具体而言,可以采用现有技术中的任何一种时域下混处理方法对时延对齐后的立体声信号进行时域下混处理。但是,在进行时域下混处理时需要根据计算声道组合比例因子的方法来选择对应的时域下混处理方式对时延对齐后的立体声信号进行时域处理,得到主要声道信号和次要声道信号。
例如，当采用上述步骤550中的方式计算出声道组合比例因子ratio之后，就可以根据声道组合比例因子ratio进行时域下混处理，例如，可以根据如下公式确定时域下混处理后的主要声道信号和次要声道信号。
Figure PCTCN2018101524-appb-000030
其中,Y(i)为当前帧的主要声道信号,X(i)为当前帧的次要声道信号,x′ L(i)为当前帧时延对齐后的左声道信号,x′ R(i)为当前帧时延对齐后的右声道信号,i为样点序号,N为帧长,ratio为声道组合比例因子。
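上述时域下混处理可以用如下示意性的Python代码草图表示，其中下混公式取Y(i)=ratio*x′L(i)+(1-ratio)*x′R(i)以及X(i)=ratio*x′L(i)-(1-ratio)*x′R(i)仅为本示例的假设，具体以上文给出的公式为准：

    def time_domain_downmix(xL, xR, ratio):
        # 示意性的时域下混：Y为主要声道信号，X为次要声道信号（公式形式为本示例的假设）
        Y = [ratio * l + (1.0 - ratio) * r for l, r in zip(xL, xR)]
        X = [ratio * l - (1.0 - ratio) * r for l, r in zip(xL, xR)]
        return Y, X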
570、对主要声道信号和次要声道信号进行编码。
应理解,可以采用单声道信号编解码方法对下混处理后的得到的主要声道信号和次要声道信号进行编码处理。具体地,可以根据前一帧的主要声道信号和/或前一帧的次要声道信号编码过程中得到的参数信息以及主要声道信号和次要声道信号编码的总比特数,对主要声道编码和次要声道编码的比特进行分配。然后根据比特分配结果分别对主要声道信号和次要声道信号进行编码,得到主要声道编码的编码索引以及次要声道编码的编码索引。另外,在对主要声道和次要声道编码时,可以采用代数码本激励线性预测编码(Algebraic Code Excited Linear Prediction,ACELP)的编码方式。
应理解,本申请实施例的立体声信号的编码方法可以是上述方法500中的步骤570中对下混处理后得到的主要声道信号和次要声道信号进行编码的一部分。具体地,本申请实施例的立体声信号的编码方法可以是上述步骤570中对下混处理后得到的主要声道信号或者次要声道信号进行线性预测的过程。在对当前帧的立体声信号进行线性预测分析的方式有多种,既可以对当前帧的主要声道信号和次要声道信号分别进行两次线性预测分析,也可以只对当前帧的主要声道信号和次要声道信号各进行一次线性预测分析。下面结合图9和图10分别对这两种线性预测分析的方式进行详细的描述。
图9是本申请实施例的线性预测分析过程的示意性流程图。图9中所示的线性预测过程是对当前帧的主要声道信号进行两次线性预测分析。图9所示的线性预测分析的过程具体包括:
910、对当前帧的主要声道信号进行时域预处理。
这里的预处理可以包括采样率转换、预加重处理等等。例如,可以将采样率为16KHz的主要声道信号转化为采样率为12.8KHz的信号,便于在后续采用代数码激励线性预测(Algebraic Code Excited Linear Prediction,ACELP)的编码方式时的编码处理。
920、获取当前帧的初始线性预测分析窗。
步骤920中的初始线性预测分析窗相当于上述步骤310中的线性预测分析窗。
930、根据初始线性预测分析窗对预处理后的主要声道信号进行第一次加窗处理,根据加窗处理后的信号计算当前帧的第一组线性预测系数。
根据初始线性预测分析窗对预处理后的主要声道信号进行第一次加窗处理具体可以根据公式(26)进行。
s wmid(n)=s pre(n-80)w(n),n=0,1,...,L-1                         (26)
其中,s pre(n)为预加重处理后的信号,s wmid(n)为第一次加窗处理后的信号,L为线性预测分析窗的窗长,w(n)为初始线性预测分析窗。
在计算当前帧的第一组线性预测系数时具体可以采用莱文逊杜宾算法计算。具体地,可以根据第一次加窗处理后的信号s wmid(n),采用莱文逊杜宾算法计算当前帧的第一组线性预测系数。
940、根据当前帧的声道间时间差自适应产生修正的线性预测分析窗。
该修正的线性预测分析窗可以是满足上述公式(7)和公式(9)的线性预测分析窗。
950、根据修正的线性预测分析窗对预处理后的主要声道信号进行第二次加窗处理,根据加窗处理后的信号计算当前帧的第二组线性预测系数。
根据修正的线性预测分析窗对预处理后的主要声道信号进行第二次加窗处理具体可以根据公式(27)进行。
s wend(n)=s pre(n+48)w adp(n),n=0,1,...,L-1                        (27)
其中,s pre(n)为预加重处理后的信号,s wend(n)为第二次加窗处理后的信号,L为修正的线性预测分析窗的窗长,w adp(n)为修正的线性预测分析窗。
在计算当前帧的第二组线性预测系数时具体可以采用莱文逊杜宾算法计算。具体地,可以根据第二次加窗处理后的信号s wend(n),采用莱文逊杜宾算法计算当前帧的第二组线性预测系数。
同样,对当前帧的次要声道信号进行线性预测分析的处理过程与上述步骤910至步骤950中对当前帧的主要声道信号进行线性预测分析的过程相同。
应理解，本申请中的立体声信号的编码方法采用的加窗处理方式与上述第二次加窗处理的方式相同。
图10是本申请实施例的线性预测分析过程的示意性流程图。图10中所示的线性预测过程是对当前帧的主要声道信号进行一次线性预测分析。图10所示的线性预测分析的过程具体包括:
1010、对当前帧的主要声道信号进行时域预处理。
这里的预处理可以包括采样率转换、预加重处理等等。
1020、获取当前帧的初始线性预测分析窗。
步骤1020中的初始线性预测分析窗相当于上述步骤320中的初始线性预测分析窗。
1030、根据当前帧的声道间时间差,自适应产生修正的线性预测分析窗。
具体地,可以根据当前帧的声道间时间差先确定当前帧的衰减窗的窗长,然后再按照上述步骤320中的方式确定修正的线性预测分析窗。
1040、根据修正的线性预测分析窗对预处理后的主要声道信号进行加窗处理,根据加窗处理后的信号计算当前帧的线性预测系数。
根据修正的线性预测分析窗对预处理后的主要声道信号进行加窗处理具体可以根据公式(28)进行。
s w(n)=s pre(n)w adp(n),n=0,1,...,L-1                        (28)
其中，s pre(n)为预加重处理后的信号，s w(n)为加窗处理后的信号，L为修正的线性预测分析窗的窗长，w adp(n)为修正的线性预测分析窗。
应理解,在计算当前帧的线性预测系数时具体可以采用莱文逊杜宾算法计算。具体地,可以根据加窗处理后的信号s w(n),采用莱文逊杜宾算法计算当前帧的线性预测系数。
同样,对当前帧的次要声道信号进行线性预测分析的处理过程与上述步骤1010至步骤1040中对当前帧的主要声道信号进行线性预测分析的过程相同。
上文结合图1至图10对本申请实施例的立体声信号的编码方法进行了详细的描述。下面结合图11和图12对本申请实施例的立体声信号的编码装置进行描述,应理解,图11至图12中的装置与本申请实施例的立体声信号的编码方法是对应的,并且图11和图12中的装置可以执行本申请实施例的立体声信号的编码方法。为了简洁,下面适当省略重复的描述。
图11是本申请实施例的立体声信号的编码装置的示意性框图。图11的装置1100包括:
第一确定模块1110,用于根据当前帧的声道间时间差确定所述当前帧的衰减窗的窗长;
第二确定模块1120,根据所述当前帧的衰减窗的窗长确定修正的线性预测分析窗,其中,所述修正的线性预测分析窗的第L-sub_window_len点至第L-1点中的至少一部分点的取值小于初始线性预测分析窗的第L-sub_window_len点至第L-1点中对应点的取值,sub_window_len为所述当前帧的衰减窗的窗长,L为所述修正的线性预测分析窗的窗长,所述修正的线性预测分析窗的窗长等于所述初始线性预测分析窗的窗长;
处理模块1130,用于根据所述修正的线性预测分析窗对待处理的声道信号进行线性预测分析。
本申请中,由于修正的线性预测分析窗中与当前帧的目标声道的人工重建的前向信号对应点的取值小于未经修正的线性预测分析窗中与当前帧的目标声道的人工重建的前向信号对应点的取值,从而在线性预测时能够减小当前帧的目标声道的人工重建的前向信号所起的作用,从而降低人工重建的前向信号与真实的前向信号之间的误差对线性预测分析结果的准确性的影响,因此,可以减小线性预测分析得到的线性预测系数与真实的线性预测系数之间的差异,提高线性预测分析的准确性。
可选地,作为一个实施例,所述修正的线性预测分析窗的第L-sub_window_len点至第L-1点中任意一点的取值小于所述初始线性预测分析窗的第L-sub_window_len点至第L-1点中对应点的取值。
可选地,作为一个实施例,所述第一确定模块1110具体用于:根据所述当前帧的声道间时间差以及预先设定的过渡段的长度,确定所述当前帧的衰减窗的窗长。
可选地,作为一个实施例,所述第一确定模块1110具体用于:将所述当前帧的声道间时间差的绝对值与所述预先设定的过渡段的长度的和确定为所述当前帧的衰减窗的窗长。
可选地,作为一个实施例,所述第一确定模块1110具体用于:在所述当前帧的声道间时间差的绝对值大于或者等于所述预先设定的过渡段的长度的情况下,将所述当前帧的声道间时间差的绝对值与所述预先设定的过渡段的长度的和确定为所述当前帧的衰减窗的窗长;在所述当前帧的声道间时间差的绝对值小于所述预先设定的过渡段的长度的情况下,将所述当前帧的声道间时间差的绝对值的N倍确定为所述当前帧的衰减窗的窗长,其中,N为预先设定的大于0小于L/MAX DELAY的实数,MAX DELAY为预设的大于0的实数。
可选地,上述MAX DELAY为声道间时间差的绝对值的最大值。
可选地,作为一个实施例,所述第二确定模块1120具体用于:根据所述当前帧的衰减窗的窗长对所述初始线性预测分析窗进行修正,其中,所述修正的线性预测分析窗从第L-sub_window_len点至第L-1点的取值相对于所述初始线性预测分析窗的第L-sub_window_len点至第L-1点中对应点的取值的衰减值逐渐增大。
可选地,作为一个实施例,所述修正的线性预测分析窗满足公式:
Figure PCTCN2018101524-appb-000031
其中,w adp(i)为修正的线性预测分析窗,w(i)为所述初始线性预测分析窗,
Figure PCTCN2018101524-appb-000032
其中,MAX_ATTEN为预设的大于0的实数。
可选地,作为一个实施例,所述第二确定模块1120具体用于:根据所述当前帧的衰减窗的窗长确定所述当前帧的衰减窗;根据所述当前帧的衰减窗对所述初始线性预测分析窗进行修正,其中,所述修正的线性预测分析窗从第L-sub_window_len点至第L-1点的取值相对于所述初始线性预测分析窗的第L-sub_window_len点至第L-1点中对应点的取值的衰减值逐渐增大。
可选地,作为一个实施例,所述第二确定模块1120具体用于:根据所述当前帧的衰减窗的窗长,从预先存储的多个候选衰减窗中确定所述当前帧的衰减窗,其中,所述多个候选衰减窗对应不同的窗长取值范围,所述不同的窗长取值范围之间没有交集。
可选地,作为一个实施例,所述当前帧的衰减窗满足公式:
Figure PCTCN2018101524-appb-000033
其中,sub_window(i)为所述当前帧的衰减窗,MAX_ATTEN为预设的大于0的实数。
可选地,作为一个实施例,所述修正的线性预测分析窗满足公式:
Figure PCTCN2018101524-appb-000034
其中,w adp(i)为所述修正的线性预测分析窗的窗函数,w(i)为所述初始线性预测分析窗,sub_window(.)为所述当前帧的衰减窗。
可选地,作为一个实施例,所述第二确定模块1120具体用于:根据所述当前帧的衰减窗的窗长,从预先存储的多个候选线性预测分析窗中确定所述修正的线性预测分析窗,其中,所述多个候选线性预测分析窗对应不同的窗长取值范围,所述不同的窗长取值范围之间没有交集。
可选地,作为一个实施例,在所述第二确定模块1120根据所述当前帧的衰减窗的窗长确定修正的线性预测分析窗之前,所述装置还包括:
修正模块1140,用于根据预设的间隔步长,对所述当前帧的衰减窗的窗长进行修正,以获得修正的衰减窗的窗长,其中,所述间隔步长为预设的正整数;
所述第二确定模块1120具体用于:根据所述初始线性预测分析窗以及所述修正的衰减窗的窗长确定修正的线性预测分析窗。
可选地,作为一个实施例,所述修正的衰减窗的窗长满足公式:
Figure PCTCN2018101524-appb-000035
其中,sub_window_len_mod为所述修正的衰减窗的窗长,len_step为所述间隔步长。
图12是本申请实施例的立体声信号的编码装置的示意性框图。图12的装置1200包括:
存储器1210,用于存储程序。
处理器1220,用于执行所述存储器1210中存储的程序,当所述存储器1210中的程序被执行时,所述处理器1220具体用于:根据当前帧的声道间时间差确定所述当前帧的衰减窗的窗长;根据所述当前帧的衰减窗的窗长确定修正的线性预测分析窗,其中,所述修正的线性预测分析窗的第L-sub_window_len点至第L-1点中的至少一部分点的取值小于初始线性预测分析窗的第L-sub_window_len点至第L-1点中对应点的取值, sub_window_len为所述当前帧的衰减窗的窗长,L为所述修正的线性预测分析窗的窗长,所述修正的线性预测分析窗的窗长等于所述初始线性预测分析窗的窗长;根据所述修正的线性预测分析窗对待处理的声道信号进行线性预测分析。
本申请中,由于修正的线性预测分析窗中与当前帧的目标声道的人工重建的前向信号对应点的取值小于未经修正的线性预测分析窗中与当前帧的目标声道的人工重建的前向信号对应点的取值,从而在线性预测时能够减小当前帧的目标声道的人工重建的前向信号所起的作用,从而降低人工重建的前向信号与真实的前向信号之间的误差对线性预测分析结果的准确性的影响,因此,可以减小线性预测分析得到的线性预测系数与真实的线性预测系数之间的差异,提高线性预测分析的准确性。
可选地,作为一个实施例,所述修正的线性预测分析窗的第L-sub_window_len点至第L-1点中任意一点的取值小于所述初始线性预测分析窗的第L-sub_window_len点至第L-1点中对应点的取值。
可选地,作为一个实施例,所述处理器1220具体用于:根据所述当前帧的声道间时间差以及预先设定的过渡段的长度,确定所述当前帧的衰减窗的窗长。
可选地,作为一个实施例,所述处理器1220具体用于:将所述当前帧的声道间时间差的绝对值与所述预先设定的过渡段的长度的和确定为所述当前帧的衰减窗的窗长。
可选地,作为一个实施例,所述处理器1220具体用于:在所述当前帧的声道间时间差的绝对值大于或者等于所述预先设定的过渡段的长度的情况下,将所述当前帧的声道间时间差的绝对值与所述预先设定的过渡段的长度的和确定为所述当前帧的衰减窗的窗长;在所述当前帧的声道间时间差的绝对值小于所述预先设定的过渡段的长度的情况下,将所述当前帧的声道间时间差的绝对值的N倍确定为所述当前帧的衰减窗的窗长,其中,N为预先设定的大于0小于L/MAX DELAY的实数,MAX DELAY为预设的大于0的实数。
可选地,上述MAX DELAY为声道间时间差的绝对值的最大值。
可选地，作为一个实施例，所述处理器1220具体用于：根据所述当前帧的衰减窗的窗长对所述初始线性预测分析窗进行修正，其中，所述修正的线性预测分析窗从第L-sub_window_len点至第L-1点的取值相对于所述初始线性预测分析窗的第L-sub_window_len点至第L-1点中对应点的取值的衰减值逐渐增大。
可选地,作为一个实施例,所述修正的线性预测分析窗满足公式:
Figure PCTCN2018101524-appb-000036
其中,w adp(i)为修正的线性预测分析窗,w(i)为所述初始线性预测分析窗,
Figure PCTCN2018101524-appb-000037
其中,MAX_ATTEN为预设的大于0的实数。
可选地,作为一个实施例,所述处理器1220具体用于:根据所述当前帧的衰减窗的窗长确定所述当前帧的衰减窗;根据所述当前帧的衰减窗对所述初始线性预测分析窗进行修正,其中,所述修正的线性预测分析窗从第L-sub_window_len点至第L-1点的取值相对于所述初始线性预测分析窗的第L-sub_window_len点至第L-1点中对应点的取值的衰减值逐渐增大。
可选地,作为一个实施例,所述处理器1220具体用于:根据所述当前帧的衰减窗的 窗长,从预先存储的多个候选衰减窗中确定所述当前帧的衰减窗,其中,所述多个候选衰减窗对应不同的窗长取值范围,所述不同的窗长取值范围之间没有交集。
可选地,作为一个实施例,所述当前帧的衰减窗满足公式:
Figure PCTCN2018101524-appb-000038
其中,sub_window(i)为所述当前帧的衰减窗,MAX_ATTEN为预设的大于0的实数。
可选地,作为一个实施例,所述修正的线性预测分析窗满足公式:
Figure PCTCN2018101524-appb-000039
其中,w adp(i)为所述修正的线性预测分析窗的窗函数,w(i)为所述初始线性预测分析窗,sub_window(.)为所述当前帧的衰减窗。
可选地,作为一个实施例,所述处理器1220具体用于:根据所述当前帧的衰减窗的窗长,从预先存储的多个候选线性预测分析窗中确定所述修正的线性预测分析窗,其中,所述多个候选线性预测分析窗对应不同的窗长取值范围,所述不同的窗长取值范围之间没有交集。
可选地,作为一个实施例,在所述处理器1220根据所述当前帧的衰减窗的窗长确定修正的线性预测分析窗之前,所述处理器1220还用于:根据预设的间隔步长,对所述当前帧的衰减窗的窗长进行修正,以获得修正的衰减窗的窗长,其中,所述间隔步长为预设的正整数;根据所述初始线性预测分析窗以及所述修正的衰减窗的窗长确定修正的线性预测分析窗。
可选地,作为一个实施例,所述修正的衰减窗的窗长满足公式:
Figure PCTCN2018101524-appb-000040
其中,sub_window_len_mod为所述修正的衰减窗的窗长,len_step为所述间隔步长。
上文结合图11和图12对本申请实施例的立体声信号的编码装置进行了描述,下面结合图13至图18本申请实施例的终端设备和网络设备进行描述,应理解,本申请实施例中的立体声信号的编码方法可以由图13至图18中的终端设备或者网络设备执行。另外,本申请实施例中的编码装置可以设置在图13至图18中的终端设备或者网络设备中,具体地,本申请实施例中的编码装置可以是图13至图18中的终端设备或者网络设备中的立体声编码器。
如图13所示，在音频通信中，第一终端设备中的立体声编码器对采集到的立体声信号进行立体声编码，第一终端设备中的信道编码器可以对立体声编码器得到的码流再进行信道编码，接下来，第一终端设备信道编码后得到的数据通过第一网络设备和第二网络设备传输到第二终端设备。第二终端设备在接收到第二网络设备的数据之后，第二终端设备的信道解码器进行信道解码，得到立体声信号编码码流，第二终端设备的立体声解码器再通过解码恢复出立体声信号，由第二终端设备进行该立体声信号的回放。这样就在不同的终端设备完成了音频通信。
应理解，在图13中，第二终端设备也可以对采集到的立体声信号进行编码，最终通过第二网络设备和第一网络设备将编码得到的数据传输给第一终端设备，第一终端设备通过对数据进行信道解码和立体声解码得到立体声信号。
在图13中，第一网络设备和第二网络设备可以是无线网络通信设备或者有线网络通信设备。第一网络设备和第二网络设备之间可以通过数字信道进行通信。
图13中的第一终端设备或者第二终端设备可以执行本申请实施例的立体声信号的编解码方法,本申请实施例中的编码装置、解码装置可以分别是第一终端设备或者第二终端设备中的立体声编码器、立体声解码器。
在音频通信中,网络设备可以实现音频信号编解码格式的转码。如图14所示,如果网络设备接收到的信号的编解码格式为其它立体声解码器对应的编解码格式,那么,网络设备中的信道解码器对接收到的信号进行信道解码,得到其它立体声解码器对应的编码码流,其它立体声解码器对该编码码流进行解码,得到立体声信号,立体声编码器再对立体声信号进行编码,得到立体声信号的编码码流,最后,信道编码器再对立体声信号的编码码流进行信道编码,得到最终的信号(该信号可以传输给终端设备或者其它的网络设备)。应理解,图14中的立体声编码器对应的编解码格式与其它立体声解码器对应的编解码格式不同。假设其它立体声解码器对应的编解码格式为第一编解码格式,立体声编码器对应的编解码格式为第二编解码格式,那么在图14中,通过网络设备就实现了将音频信号由第一编解码格式转化为第二编解码格式。
类似的,如图15所示,如果网络设备接收到的信号的编解码格式与立体声解码器对应的编解码格式相同,那么,在网络设备的信道解码器进行信道解码得到立体声信号的编码码流之后,可以由立体声解码器对立体声信号的编码码流进行解码,得到立体声信号,接下来,再由其它立体声编码器按照其它的编解码格式对该立体声信号进行编码,得到其它立体声编码器对应的编码码流,最后,信道编码器再对其它立体声编码器对应的编码码流进行信道编码,得到最终的信号(该信号可以传输给终端设备或者其它的网络设备)。与图14中的情况相同,图15中的立体声解码器对应的编解码格式与其它立体声编码器对应的编解码格式也是不同的。如果其它立体声编码器对应的编解码格式为第一编解码格式,立体声解码器对应的编解码格式为第二编解码格式,那么在图15中,通过网络设备就实现了将音频信号由第二编解码格式转化为第一编解码格式。
在图14和图15中,其它立体声编解码器和立体声编解码器分别对应不同的编解码格式,因此,经过其它立体声编解码器和立体声编解码器的处理就实现了立体声信号编解码格式的转码。
还应理解,图14中的立体声编码器能够实现本申请实施例中的立体声信号的编码方法,图15中的立体声解码器能够实现本申请实施例的立体声信号的解码方法。本申请实施例中的编码装置可以是图14中的网络设备中的立体声编码器,本申请实施例中的解码装置可以是图15中的网络设备中的立体声解码器。另外,图14和图15中的网络设备具体可以是无线网络通信设备或者有线网络通信设备。
如图16所示，在音频通信中，第一终端设备中的多声道编码器中的立体声编码器对由采集到的多声道信号生成的立体声信号进行立体声编码，多声道编码器得到的码流包含立体声编码器得到的码流，第一终端设备中的信道编码器可以对多声道编码器得到的码流再进行信道编码，接下来，第一终端设备信道编码后得到的数据通过第一网络设备和第二网络设备传输到第二终端设备。第二终端设备在接收到第二网络设备的数据之后，第二终端设备的信道解码器进行信道解码，得到多声道信号的编码码流，多声道信号的编码码流包含了立体声信号的编码码流，第二终端设备的多声道解码器中的立体声解码器再通过解码恢复出立体声信号，多声道解码器根据恢复出的立体声信号解码得到多声道信号，由第二终端设备进行该多声道信号的回放。这样就在不同的终端设备完成了音频通信。
应理解，在图16中，第二终端设备也可以对采集到的多声道信号进行编码（具体由第二终端设备中的多声道编码器中的立体声编码器对由采集到的多声道信号生成的立体声信号进行立体声编码，然后再由第二终端设备中的信道编码器对多声道编码器得到的码流进行信道编码），最终通过第二网络设备和第一网络设备传输给第一终端设备，第一终端设备通过信道解码和多声道解码得到多声道信号。
在图16中,第一网络设备和第二网络设备可以是无线网络通信设备或者有线网络通信设备。第一网络设备和第二网络设备之间可以通过数字信道进行通信。
图16中的第一终端设备或者第二终端设备可以执行本申请实施例的立体声信号的编解码方法。另外,本申请实施例中的编码装置可以是第一终端设备或者第二终端设备中的立体声编码器,本申请实施例中的解码装置可以是第一终端设备或者第二终端设备中的立体声解码器。
在音频通信中,网络设备可以实现音频信号编解码格式的转码。如图17所示,如果网络设备接收到的信号的编解码格式为其它多声道解码器对应的编解码格式,那么,网络设备中的信道解码器对接收到的信号进行信道解码,得到其它多声道解码器对应的编码码流,其它多声道解码器对该编码码流进行解码,得到多声道信号,多声道编码器再对多声道信号进行编码,得到多声道信号的编码码流,其中多声道编码器中的立体声编码器对由多声道信号生成的立体声信号进行立体声编码得到立体声信号的编码码流,多声道信号的编码码流包含了立体声信号的编码码流,最后,信道编码器再对编码码流进行信道编码,得到最终的信号(该信号可以传输给终端设备或者其它的网络设备)。
类似的,如图18所示,如果网络设备接收到的信号的编解码格式与多声道解码器对应的编解码格式相同,那么,在网络设备的信道解码器进行信道解码得到多声道信号的编码码流之后,可以由多声道解码器对多声道信号的编码码流进行解码,得到多声道信号,其中多声道解码器中的立体声解码器对多声道信号的编码码流中的立体声信号的编码码流进行立体声解码,接下来,再由其它多声道编码器按照其它的编解码格式对该多声道信号进行编码,得到其它多声道编码器对应的多声道信号的编码码流,最后,信道编码器再对其它多声道编码器对应的编码码流进行信道编码,得到最终的信号(该信号可以传输给终端设备或者其它的网络设备)。
应理解，在图17和图18中，其它多声道编解码器和多声道编解码器分别对应不同的编解码格式。例如，在图17中，其它多声道解码器对应的编解码格式为第一编解码格式，多声道编码器对应的编解码格式为第二编解码格式，那么在图17中，通过网络设备就实现了将音频信号由第一编解码格式转化为第二编解码格式。类似地，在图18中，假设多声道解码器对应的编解码格式为第二编解码格式，其它多声道编码器对应的编解码格式为第一编解码格式，那么在图18中，通过网络设备就实现了将音频信号由第二编解码格式转化为第一编解码格式。因此，经过其它多声道编解码器和多声道编解码器的处理就实现了音频信号编解码格式的转码。
还应理解,图17中的立体声编码器能够实现本申请中的立体声信号的编码方法,图18中的立体声解码器能够实现本申请中的立体声信号的解码方法。本申请实施例中的编 码装置可以是图17中的网络设备中的立体声编码器,本申请实施例中的解码装置可以是图18中的网络设备中的立体声解码器。另外,图17和图18中的网络设备具体可以是无线网络通信设备或者有线网络通信设备。
本申请还提供了一种芯片,所述芯片包括处理器与通信接口,所述通信接口用于与外部器件进行通信,所述处理器用于执行本申请实施例的立体声信号的编码方法。
可选地,作为一种实现方式,所述芯片还可以包括存储器,所述存储器中存储有指令,所述处理器用于执行所述存储器上存储的指令,当所述指令被执行时,所述处理器用于执行本申请实施例的立体声信号的编码方法。
可选地,作为一种实现方式,所述芯片集成在终端设备或者网络设备上。
本申请提供了一种计算机可读存储介质,所述计算机可读介质存储用于设备执行的程序代码,所述程序代码包括用于执行本申请实施例的立体声信号的编码方法的指令。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。
所述功能如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(read-only memory,ROM)、随机存取存储器(random access memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖 在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。

Claims (28)

  1. 一种立体声信号的编码方法,其特征在于,包括:
    根据当前帧的声道间时间差确定所述当前帧的衰减窗的窗长;
    根据所述当前帧的衰减窗的窗长确定修正的线性预测分析窗,其中,所述修正的线性预测分析窗的第L-sub_window_len点至第L-1点中的至少一部分点的取值小于初始线性预测分析窗的第L-sub_window_len点至第L-1点中对应点的取值,sub_window_len为所述当前帧的衰减窗的窗长,L为所述修正的线性预测分析窗的窗长,所述修正的线性预测分析窗的窗长等于所述初始线性预测分析窗的窗长;
    根据所述修正的线性预测分析窗对待处理的声道信号进行线性预测分析。
  2. 如权利要求1所述的方法,其特征在于,所述修正的线性预测分析窗的第L-sub_window_len点至第L-1点中任意一点的取值小于所述初始线性预测分析窗的第L-sub_window_len点至第L-1点中对应点的取值。
  3. 如权利要求1或2所述的方法,其特征在于,所述根据当前帧的声道间时间差确定所述当前帧的衰减窗的窗长,包括:
    根据所述当前帧的声道间时间差以及预先设定的过渡段的长度,确定所述当前帧的衰减窗的窗长。
  4. 如权利要求3所述的方法,其特征在于,所述根据所述当前帧的声道间时间差以及预先设定的过渡段的长度,确定所述当前帧的衰减窗的窗长,包括:
    将所述当前帧的声道间时间差的绝对值与所述预先设定的过渡段的长度的和确定为所述当前帧的衰减窗的窗长。
  5. 如权利要求3所述的方法,其特征在于,所述根据所述当前帧的声道间时间差以及预先设定的过渡段的长度,确定所述当前帧的衰减窗的窗长,包括:
    在所述当前帧的声道间时间差的绝对值大于或者等于所述预先设定的过渡段的长度的情况下,将所述当前帧的声道间时间差的绝对值与所述预先设定的过渡段的长度的和确定为所述当前帧的衰减窗的窗长;
    在所述当前帧的声道间时间差的绝对值小于所述预先设定的过渡段的长度的情况下,将所述当前帧的声道间时间差的绝对值的N倍确定为所述当前帧的衰减窗的窗长,其中,N为预先设定的大于0且小于L/MAX DELAY的实数,MAX DELAY为预设的大于0的实数。
  6. 如权利要求2-5中任一项所述的方法,其特征在于,所述根据所述当前帧的衰减窗的窗长确定修正的线性预测分析窗,包括:
    根据所述当前帧的衰减窗的窗长对所述初始线性预测分析窗进行修正,其中,所述修正的线性预测分析窗从第L-sub_window_len点至第L-1点的取值相对于所述初始线性预测分析窗的第L-sub_window_len点至第L-1点中对应点的取值的衰减值逐渐增大。
  7. 如权利要求6所述的方法,其特征在于,所述修正的线性预测分析窗满足公式:
    Figure PCTCN2018101524-appb-100001
    其中,w adp(i)为修正的线性预测分析窗,w(i)为所述初始线性预测分析窗,
    Figure PCTCN2018101524-appb-100002
    其中,MAX_ATTEN为预设的大于0的实数。
  8. 如权利要求2-5中任一项所述的方法,其特征在于,所述根据所述当前帧的衰减窗的窗长确定修正的线性预测分析窗,包括:
    根据所述当前帧的衰减窗的窗长确定所述当前帧的衰减窗;
    根据所述当前帧的衰减窗对所述初始线性预测分析窗进行修正,其中,所述修正的线性预测分析窗从第L-sub_window_len点至第L-1点的取值相对于所述初始线性预测分析窗的第L-sub_window_len点至第L-1点中对应点的取值的衰减值逐渐增大。
  9. 如权利要求8所述的方法,其特征在于,所述根据所述当前帧的衰减窗的窗长确定所述当前帧的衰减窗,包括:
    根据所述当前帧的衰减窗的窗长,从预先存储的多个候选衰减窗中确定所述当前帧的衰减窗,其中,所述多个候选衰减窗对应不同的窗长取值范围,所述不同的窗长取值范围之间没有交集。
  10. 如权利要求8所述的方法,其特征在于,所述当前帧的衰减窗满足公式:
    Figure PCTCN2018101524-appb-100003
    其中,sub_window(i)为所述当前帧的衰减窗,MAX_ATTEN为预设的大于0的实数。
  11. 如权利要求10所述的方法,其特征在于,所述修正的线性预测分析窗满足公式:
    Figure PCTCN2018101524-appb-100004
    其中,w adp(i)为所述修正的线性预测分析窗的窗函数,w(i)为所述初始线性预测分析窗,sub_window(.)为所述当前帧的衰减窗。
  12. 如权利要求2-5中任一项所述的方法,其特征在于,所述根据所述当前帧的衰减窗的窗长确定修正的线性预测分析窗,包括:
    根据所述当前帧的衰减窗的窗长,从预先存储的多个候选线性预测分析窗中确定所述修正的线性预测分析窗,其中,所述多个候选线性预测分析窗对应不同的窗长取值范围,所述不同的窗长取值范围之间没有交集。
  13. 如权利要求1-12中任一项所述的方法,其特征在于,在根据所述当前帧的衰减窗的窗长确定修正的线性预测分析窗之前,所述方法还包括:
    根据预设的间隔步长,对所述当前帧的衰减窗的窗长进行修正,以获得修正的衰减窗的窗长,其中,所述间隔步长为预设的正整数;
    所述根据所述当前帧的衰减窗的窗长确定修正的线性预测分析窗,包括:
    根据所述初始线性预测分析窗以及所述修正的衰减窗的窗长确定修正的线性预测分析窗。
  14. 如权利要求13所述的方法,其特征在于,所述修正的衰减窗的窗长满足公式:
    Figure PCTCN2018101524-appb-100005
    其中,sub_window_len_mod为所述修正的衰减窗的窗长,len_step为所述间隔步长。
  15. 一种编码装置,其特征在于,包括:
    第一确定模块,用于根据当前帧的声道间时间差确定所述当前帧的衰减窗的窗长;
    第二确定模块,根据所述当前帧的衰减窗的窗长确定修正的线性预测分析窗,其中, 所述修正的线性预测分析窗的第L-sub_window_len点至第L-1点中的至少一部分点的取值小于初始线性预测分析窗的第L-sub_window_len点至第L-1点中对应点的取值,sub_window_len为所述当前帧的衰减窗的窗长,L为所述修正的线性预测分析窗的窗长,所述修正的线性预测分析窗的窗长等于所述初始线性预测分析窗的窗长;
    处理模块,用于根据所述修正的线性预测分析窗对待处理的声道信号进行线性预测分析。
  16. 如权利要求15所述的装置,其特征在于,所述修正的线性预测分析窗的第L-sub_window_len点至第L-1点中任意一点的取值小于所述初始线性预测分析窗的第L-sub_window_len点至第L-1点中对应点的取值。
  17. 如权利要求15或16所述的装置,其特征在于,所述第一确定模块具体用于:
    根据所述当前帧的声道间时间差以及预先设定的过渡段的长度,确定所述当前帧的衰减窗的窗长。
  18. 如权利要求17所述的装置,其特征在于,所述第一确定模块具体用于:
    将所述当前帧的声道间时间差的绝对值与所述预先设定的过渡段的长度的和确定为所述当前帧的衰减窗的窗长。
  19. 如权利要求17所述的装置,其特征在于,所述第一确定模块具体用于:
    在所述当前帧的声道间时间差的绝对值大于或者等于所述预先设定的过渡段的长度的情况下,将所述当前帧的声道间时间差的绝对值与所述预先设定的过渡段的长度的和确定为所述当前帧的衰减窗的窗长;
    在所述当前帧的声道间时间差的绝对值小于所述预先设定的过渡段的长度的情况下,将所述当前帧的声道间时间差的绝对值的N倍确定为所述当前帧的衰减窗的窗长,其中,N为预先设定的大于0且小于L/MAX DELAY的实数,MAX DELAY为预设的大于0的实数。
  20. 如权利要求16-19中任一项所述的装置,其特征在于,所述第二确定模块具体用于:
    根据所述当前帧的衰减窗的窗长对所述初始线性预测分析窗进行修正,其中,所述修正的线性预测分析窗从第L-sub_window_len点至第L-1点的取值相对于所述初始线性预测分析窗的第L-sub_window_len点至第L-1点中对应点的取值的衰减值逐渐增大。
  21. 如权利要求20所述的装置,其特征在于,所述修正的线性预测分析窗满足公式:
    Figure PCTCN2018101524-appb-100006
    其中,w adp(i)为修正的线性预测分析窗,w(i)为所述初始线性预测分析窗,
    Figure PCTCN2018101524-appb-100007
    其中,MAX_ATTEN为预设的大于0的实数。
  22. 如权利要求16-19中任一项所述的装置,其特征在于,所述第二确定模块具体用于:
    根据所述当前帧的衰减窗的窗长确定所述当前帧的衰减窗;
    根据所述当前帧的衰减窗对所述初始线性预测分析窗进行修正,其中,所述修正的线性预测分析窗从第L-sub_window_len点至第L-1点的取值相对于所述初始线性预测分析窗 的第L-sub_window_len点至第L-1点中对应点的取值的衰减值逐渐增大。
  23. 如权利要求22所述的装置,其特征在于,所述第二确定模块具体用于:
    根据所述当前帧的衰减窗的窗长,从预先存储的多个候选衰减窗中确定所述当前帧的衰减窗,其中,所述多个候选衰减窗对应不同的窗长取值范围,所述不同的窗长取值范围之间没有交集。
  24. 如权利要求22所述的装置,其特征在于,所述当前帧的衰减窗满足公式:
    Figure PCTCN2018101524-appb-100008
    其中,sub_window(i)为所述当前帧的衰减窗,MAX_ATTEN为预设的大于0的实数。
  25. 如权利要求24所述的装置,其特征在于,所述修正的线性预测分析窗满足公式:
    Figure PCTCN2018101524-appb-100009
    其中,w adp(i)为所述修正的线性预测分析窗的窗函数,w(i)为所述初始线性预测分析窗,sub_window(.)为所述当前帧的衰减窗。
  26. 如权利要求16-19中任一项所述的装置,其特征在于,所述第二确定模块具体用于:
    根据所述当前帧的衰减窗的窗长,从预先存储的多个候选线性预测分析窗中确定所述修正的线性预测分析窗,其中,所述多个候选线性预测分析窗对应不同的窗长取值范围,所述不同的窗长取值范围之间没有交集。
  27. 如权利要求15-26中任一项所述的装置,其特征在于,在所述第二确定模块根据所述当前帧的衰减窗的窗长确定修正的线性预测分析窗之前,所述装置还包括:
    修正模块,用于根据预设的间隔步长,对所述当前帧的衰减窗的窗长进行修正,以获得修正的衰减窗的窗长,其中,所述间隔步长为预设的正整数;
    所述第二确定模块具体用于:
    根据所述初始线性预测分析窗以及所述修正的衰减窗的窗长确定修正的线性预测分析窗。
  28. 如权利要求27所述的装置,其特征在于,所述修正的衰减窗的窗长满足公式:
    Figure PCTCN2018101524-appb-100010
    其中,sub_window_len_mod为所述修正的衰减窗的窗长,len_step为所述间隔步长。
PCT/CN2018/101524 2017-08-23 2018-08-21 立体声信号的编码方法和编码装置 WO2019037714A1 (zh)

Priority Applications (7)

Application Number Priority Date Filing Date Title
ES18848208T ES2873880T3 (es) 2017-08-23 2018-08-21 Procedimiento de codificación y aparato de codificación para señal estéreo
EP21160112.5A EP3901949B1 (en) 2017-08-23 2018-08-21 Encoding apparatus
KR1020207008343A KR102380642B1 (ko) 2017-08-23 2018-08-21 스테레오 신호 인코딩 방법 및 인코딩 장치
KR1020227010056A KR102486258B1 (ko) 2017-08-23 2018-08-21 스테레오 신호 인코딩 방법 및 인코딩 장치
EP18848208.7A EP3664089B1 (en) 2017-08-23 2018-08-21 Encoding method and encoding apparatus for stereo signal
US16/797,484 US11244691B2 (en) 2017-08-23 2020-02-21 Stereo signal encoding method and encoding apparatus
US17/552,682 US11636863B2 (en) 2017-08-23 2021-12-16 Stereo signal encoding method and encoding apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710731482.1A CN109427338B (zh) 2017-08-23 2017-08-23 立体声信号的编码方法和编码装置
CN201710731482.1 2017-08-23

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/797,484 Continuation US11244691B2 (en) 2017-08-23 2020-02-21 Stereo signal encoding method and encoding apparatus

Publications (1)

Publication Number Publication Date
WO2019037714A1 true WO2019037714A1 (zh) 2019-02-28

Family

ID=65438398

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/101524 WO2019037714A1 (zh) 2017-08-23 2018-08-21 立体声信号的编码方法和编码装置

Country Status (6)

Country Link
US (2) US11244691B2 (zh)
EP (2) EP3901949B1 (zh)
KR (2) KR102380642B1 (zh)
CN (1) CN109427338B (zh)
ES (1) ES2873880T3 (zh)
WO (1) WO2019037714A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109427338B (zh) * 2017-08-23 2021-03-30 华为技术有限公司 立体声信号的编码方法和编码装置
CN113129910A (zh) * 2019-12-31 2021-07-16 华为技术有限公司 音频信号的编解码方法和编解码装置

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009107054A1 (en) * 2008-02-26 2009-09-03 Koninklijke Philips Electronics N.V. Method of embedding data in stereo image
CN102089809A (zh) * 2008-06-13 2011-06-08 诺基亚公司 用于提供改进的音频处理的方法、装置及计算机程序产品
CN102307323A (zh) * 2009-04-20 2012-01-04 华为技术有限公司 对多声道信号的声道延迟参数进行修正的方法
JP2013088522A (ja) * 2011-10-14 2013-05-13 Nippon Telegr & Teleph Corp <Ntt> 声道スペクトル抽出装置、声道スペクトル抽出方法及びプログラム
CN103403800A (zh) * 2011-02-02 2013-11-20 瑞典爱立信有限公司 确定多声道音频信号的声道间时间差
CN104205211A (zh) * 2012-04-05 2014-12-10 华为技术有限公司 多声道音频编码器以及用于对多声道音频信号进行编码的方法

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5956674A (en) 1995-12-01 1999-09-21 Digital Theater Systems, Inc. Multi-channel predictive subband audio coder using psychoacoustic adaptive bit allocation in frequency, time and over the multiple channels
SE519552C2 (sv) * 1998-09-30 2003-03-11 Ericsson Telefon Ab L M Flerkanalig signalkodning och -avkodning
GB2453117B (en) * 2007-09-25 2012-05-23 Motorola Mobility Inc Apparatus and method for encoding a multi channel audio signal
KR101441896B1 (ko) * 2008-01-29 2014-09-23 삼성전자주식회사 적응적 lpc 계수 보간을 이용한 오디오 신호의 부호화,복호화 방법 및 장치
EP2838086A1 (en) * 2013-07-22 2015-02-18 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. In an reduction of comb filter artifacts in multi-channel downmix with adaptive phase alignment
ES2955962T3 (es) * 2015-09-25 2023-12-11 Voiceage Corp Método y sistema que utiliza una diferencia de correlación a largo plazo entre los canales izquierdo y derecho para mezcla descendente en el dominio del tiempo de una señal de sonido estéreo en canales primarios y secundarios
US9978381B2 (en) 2016-02-12 2018-05-22 Qualcomm Incorporated Encoding of multiple audio signals
CN109427338B (zh) * 2017-08-23 2021-03-30 华为技术有限公司 立体声信号的编码方法和编码装置

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009107054A1 (en) * 2008-02-26 2009-09-03 Koninklijke Philips Electronics N.V. Method of embedding data in stereo image
CN102089809A (zh) * 2008-06-13 2011-06-08 诺基亚公司 用于提供改进的音频处理的方法、装置及计算机程序产品
CN102307323A (zh) * 2009-04-20 2012-01-04 华为技术有限公司 对多声道信号的声道延迟参数进行修正的方法
CN103403800A (zh) * 2011-02-02 2013-11-20 瑞典爱立信有限公司 确定多声道音频信号的声道间时间差
JP2013088522A (ja) * 2011-10-14 2013-05-13 Nippon Telegr & Teleph Corp <Ntt> 声道スペクトル抽出装置、声道スペクトル抽出方法及びプログラム
CN104205211A (zh) * 2012-04-05 2014-12-10 华为技术有限公司 多声道音频编码器以及用于对多声道音频信号进行编码的方法

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3664089A4

Also Published As

Publication number Publication date
KR20220044857A (ko) 2022-04-11
EP3664089A4 (en) 2020-08-19
EP3664089A1 (en) 2020-06-10
EP3901949B1 (en) 2022-12-28
ES2873880T3 (es) 2021-11-04
CN109427338A (zh) 2019-03-05
KR20200039789A (ko) 2020-04-16
US11636863B2 (en) 2023-04-25
US20200194015A1 (en) 2020-06-18
US20220108709A1 (en) 2022-04-07
KR102486258B1 (ko) 2023-01-09
KR102380642B1 (ko) 2022-03-29
US11244691B2 (en) 2022-02-08
EP3901949A1 (en) 2021-10-27
CN109427338B (zh) 2021-03-30
EP3664089B1 (en) 2021-03-31

Similar Documents

Publication Publication Date Title
RU2439718C1 (ru) Способ и устройство для обработки звукового сигнала
JP2021103326A (ja) チャネル間時間差を推定する装置及び方法
JP5688852B2 (ja) オーディオコーデックポストフィルタ
US20090204397A1 (en) Linear predictive coding of an audio signal
US20230352034A1 (en) Encoding and decoding methods, and encoding and decoding apparatuses for stereo signal
US11636863B2 (en) Stereo signal encoding method and encoding apparatus
US11922958B2 (en) Method and apparatus for determining weighting factor during stereo signal encoding
US11176954B2 (en) Encoding and decoding of multichannel or stereo audio signals
KR102353050B1 (ko) 스테레오 신호 인코딩에서의 신호 재구성 방법 및 디바이스

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18848208

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2018848208

Country of ref document: EP

Effective date: 20200305

ENP Entry into the national phase

Ref document number: 20207008343

Country of ref document: KR

Kind code of ref document: A