WO2018028170A1 - Encoding method and encoder for multi-channel signal - Google Patents

Encoding method and encoder for multi-channel signal

Info

Publication number
WO2018028170A1
WO2018028170A1 PCT/CN2017/074419 CN2017074419W WO2018028170A1 WO 2018028170 A1 WO2018028170 A1 WO 2018028170A1 CN 2017074419 W CN2017074419 W CN 2017074419W WO 2018028170 A1 WO2018028170 A1 WO 2018028170A1
Authority
WO
WIPO (PCT)
Prior art keywords
parameter
current frame
channel
signal
frame
Prior art date
Application number
PCT/CN2017/074419
Other languages
English (en)
French (fr)
Inventor
刘泽新
张兴涛
李海婷
苗磊
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to BR112019002656-8A (patent BR112019002656B1, pt)
Priority to EP22179454.8A (patent EP4120252A1, en)
Application filed by 华为技术有限公司
Priority to KR1020197005937A (patent KR102205596B1, ko)
Priority to EP17838306.3A (patent EP3493203B1, en)
Priority to KR1020227005726A (patent KR102486604B1, ko)
Priority to RU2019106315A (patent RU2705427C1, ru)
Priority to AU2017310759A (patent AU2017310759B2, en)
Priority to CA3033225A (patent CA3033225C, en)
Priority to JP2019507137A (patent JP6768924B2, ja)
Priority to KR1020217001206A (patent KR102367538B1, ko)
Priority to ES17838306T (patent ES2928335T3, es)
Publication of WO2018028170A1 (patent WO2018028170A1, zh)
Priority to US16/272,397 (patent US11133014B2, en)
Priority to AU2020267256A (patent AU2020267256B2, en)
Priority to US17/408,116 (patent US11935548B2, en)
Priority to AU2022218507A (patent AU2022218507B2, en)
Priority to US18/419,794 (patent US20240161756A1, en)
Priority to AU2024205199A (patent AU2024205199A1, en)

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008 Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/022 Blocking, i.e. grouping of samples in time; Choice of analysis windows; Overlap factoring
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/032 Quantisation or dequantisation of spectral components
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S3/00 Systems employing more than two channels, e.g. quadraphonic
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S2420/00 Techniques used in stereophonic systems covered by H04S but not provided for in its groups
    • H04S2420/01 Enhancing the perception of the sound image or of the spatial distribution using head related transfer functions [HRTF's] or equivalents thereof, e.g. interaural time difference [ITD] or interaural level difference [ILD]

Definitions

  • The present application relates to the field of audio signal coding, and more particularly to an encoding method and an encoder for a multi-channel signal.
  • Stereo conveys the orientation and distribution of each sound source, which improves the clarity, intelligibility, and presence of sound, and is therefore widely favored.
  • Stereo processing techniques mainly include Mid/Side (MS) encoding, Intensity Stereo (IS) encoding, and Parametric Stereo (PS) encoding.
  • MS Mid/Side
  • IS Intensity Stereo
  • PS Parametric Stereo
  • MS coding performs a sum-and-difference transform on the two channel signals based on the inter-channel correlation.
  • The energy of the channels is mainly concentrated in the sum channel, so that the inter-channel redundancy is removed.
  • The bit rate saving depends on the correlation of the input signals. When the correlation between the left and right channel signals is poor, the left channel signal and the right channel signal need to be transmitted separately.
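The sum-and-difference idea behind MS coding can be illustrated with a minimal sketch. The scaling by 1/2 and the function names `ms_encode`, `ms_decode`, and `energy` are assumptions made for this illustration; the source does not specify them.

```python
def ms_encode(left, right):
    """Sum/difference (mid/side) transform of two equal-length channels.

    The 1/2 scaling is an assumption of this sketch, not taken from the source.
    """
    mid = [(l + r) / 2.0 for l, r in zip(left, right)]
    side = [(l - r) / 2.0 for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Invert ms_encode: left = mid + side, right = mid - side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


def energy(signal):
    """Sum of squared samples."""
    return sum(v * v for v in signal)


# Two highly correlated channels: most of the energy lands in the sum channel.
left = [0.5, -0.25, 0.75, 1.0]
right = [0.5, -0.5, 0.75, 0.75]
mid, side = ms_encode(left, right)
```

For strongly correlated inputs such as these, the side signal carries little energy, which is where the bit rate saving comes from; for weakly correlated inputs the side signal is no longer small.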
  • IS coding exploits the fact that the human auditory system is insensitive to the phase difference of the high frequency components of the channels (for example, components above 2 kHz), and simplifies the high frequency components of the left and right signals accordingly.
  • IS coding is only effective for high frequency components; extending IS coding to low frequencies causes severe artifacts.
  • PS coding is based on the binaural auditory model. As shown in Figure 1 (where x_L is the left channel time domain signal and x_R is the right channel time domain signal), during PS encoding the encoding end converts the stereo signal into a mono signal and a small number of spatial parameters (also called spatial perception parameters) that describe the spatial sound field. As shown in Figure 2, after the decoder receives the mono signal and the spatial parameters, it recovers the stereo signal by combining them. Compared with MS coding, PS coding has a higher compression ratio, so it can obtain a higher coding gain while maintaining good sound quality. In addition, PS coding can operate over the full audio bandwidth and restores the spatial perception of stereo well.
  • Multi-channel parameters include inter-channel coherence (IC), inter-channel level difference (ILD), inter-channel time difference (ITD), and inter-channel phase difference (IPD).
  • The IC describes the cross-correlation or coherence between channels. It determines the perceived width of the sound field and improves the sense of space and the stability of the audio signal.
  • The ILD is used to distinguish the horizontal azimuth of the stereo sound source and describes the energy difference between the channels, which affects the frequency content of the entire spectrum.
  • ITD and IPD are spatial parameters that represent the horizontal azimuth of the sound source and describe the differences in time and phase between the channels. ILD, ITD, and IPD determine the human ear's perception of the sound source position, can effectively locate the sound field, and play an important role in the recovery of stereo signals.
  • The multi-channel parameters calculated according to the existing PS coding method are often unstable (the parameter values jump back and forth). If the downmix signal is calculated based on such multi-channel parameters, the downmix signal will be discontinuous, and as a result the stereo quality obtained by the decoder is poor. For example, the stereo image played at the decoder end shifts frequently, and audible clicks may even occur.
  • The present application provides an encoding method and an encoder for a multi-channel signal to improve the stability of multi-channel parameters in PS encoding, thereby improving the encoding quality of the audio signal.
  • In a first aspect, a method for encoding a multi-channel signal is provided, including:
  • acquiring a multi-channel signal of a current frame, and determining an initial multi-channel parameter of the current frame;
  • determining a difference parameter according to the initial multi-channel parameter of the current frame and multi-channel parameters of the previous K frames of the current frame, the difference parameter being used to represent a difference between the initial multi-channel parameter of the current frame and the multi-channel parameters of the previous K frames, where K is an integer greater than or equal to 1;
  • determining a multi-channel parameter of the current frame according to the difference parameter and a feature parameter of the current frame; and
  • encoding the multi-channel signal according to the multi-channel parameter of the current frame.
  • The multi-channel parameter of the current frame is determined after comprehensively considering the difference between the current frame and the previous K frames and the feature parameter of the current frame. This manner of determination is more reasonable: compared with directly reusing the multi-channel parameter of the previous frame for the current frame, it better ensures the accuracy of the inter-channel information of the multi-channel signal.
  • The determining, according to the difference parameter and the feature parameter of the current frame, of the multi-channel parameter of the current frame includes: when the difference parameter satisfies a first preset condition, determining the multi-channel parameter of the current frame according to the feature parameter of the current frame.
  • The difference parameter is an absolute value of a difference between the initial multi-channel parameter of the current frame and the multi-channel parameter of the previous frame of the current frame, and
  • the first preset condition is that the difference parameter is greater than a preset first threshold.
  • Alternatively, the difference parameter is a product of the initial multi-channel parameter of the current frame and the multi-channel parameter of the previous frame of the current frame, and
  • the first preset condition is that the difference parameter is less than or equal to zero.
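The two forms of the difference parameter and the first preset condition described above can be sketched as follows. The function names and the example threshold are hypothetical and chosen only for illustration.

```python
def jump_condition(initial_param, previous_param, first_threshold):
    """First form: the difference parameter is the absolute difference,
    and the condition holds when it exceeds a preset first threshold."""
    difference = abs(initial_param - previous_param)
    return difference > first_threshold


def sign_condition(initial_param, previous_param):
    """Second form: the difference parameter is the product of the two
    values, and the condition holds when it is less than or equal to zero
    (the sign changed, or one of the values is zero)."""
    difference = initial_param * previous_param
    return difference <= 0


# Example with ITD values in samples and a hypothetical threshold of 5.
large_jump = jump_condition(12, 2, first_threshold=5)
sign_flip = sign_condition(4, -3)
```

Either condition flags a frame whose initial parameter departs noticeably from the previous frame, which is the case where the feature parameter of the current frame is then consulted.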
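  • The determining, according to the feature parameter of the current frame, of the multi-channel parameter of the current frame includes: determining the multi-channel parameter of the current frame according to a correlation parameter of the current frame, where the correlation parameter is used to characterize the degree of correlation between the current frame and the previous frame of the current frame.
  • The method further comprises:
  • determining the correlation parameter according to a target channel signal in the multi-channel signal of the current frame and a target channel signal in the multi-channel signal of the previous frame.
  • The determining of the correlation parameter according to the target channel signal in the multi-channel signal of the current frame and the target channel signal in the multi-channel signal of the previous frame includes: determining the correlation parameter according to a frequency domain parameter of the target channel signal in the multi-channel signal of the current frame and a frequency domain parameter of the target channel signal in the multi-channel signal of the previous frame, where
  • the frequency domain parameter is at least one of a frequency domain amplitude value and a frequency domain coefficient of the target channel signal.
  • Alternatively, the method further comprises:
  • determining the correlation parameter according to a pitch period of the current frame and a pitch period of the previous frame.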
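One plausible way to turn the frequency domain amplitude values of two consecutive frames into a single correlation parameter is a normalized cross-correlation. This particular formula is an assumption of the sketch below; the text only states that the correlation parameter is derived from frequency domain parameters of the target channel in both frames.

```python
import math


def correlation_parameter(current_amplitudes, previous_amplitudes):
    """Normalized cross-correlation of two amplitude spectra.

    Returns a value near 1.0 when the current frame's spectrum resembles
    the previous frame's, and 0.0 when either spectrum is all zeros.
    The formula is a hypothetical choice for illustration.
    """
    numerator = sum(c * p for c, p in zip(current_amplitudes, previous_amplitudes))
    energy_current = sum(c * c for c in current_amplitudes)
    energy_previous = sum(p * p for p in previous_amplitudes)
    denominator = math.sqrt(energy_current * energy_previous)
    return numerator / denominator if denominator > 0 else 0.0


same_shape = correlation_parameter([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
no_overlap = correlation_parameter([1.0, 0.0], [0.0, 1.0])
```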
  • The determining, according to the feature parameter of the current frame, of the multi-channel parameter of the current frame includes: when the feature parameter satisfies a second preset condition, determining the multi-channel parameter of the current frame according to the multi-channel parameters of the previous T frames of the current frame, where T is an integer greater than or equal to 1.
  • The determining, according to the multi-channel parameters of the previous T frames of the current frame, of the multi-channel parameter of the current frame includes:
  • determining the multi-channel parameter of the previous T frames as the multi-channel parameter of the current frame, where T is equal to 1.
  • Alternatively, the determining, according to the multi-channel parameters of the previous T frames of the current frame, of the multi-channel parameter of the current frame includes:
  • determining the multi-channel parameter of the current frame according to a change trend of the multi-channel parameters of the previous T frames, where T is greater than or equal to 2.
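The two options for deriving the current frame's parameter from the previous T frames can be sketched as below. Modelling the "change trend" as a linear extrapolation of the last step is an assumption of this sketch, and `parameter_from_history` is a hypothetical name.

```python
def parameter_from_history(history):
    """Derive the current frame's parameter from previous frames.

    history holds the multi-channel parameters of the previous T frames,
    oldest first. With T == 1 the previous value is reused as is. With
    T >= 2 the last step is extrapolated linearly, which is one simple
    (assumed) way to follow the change trend.
    """
    if not history:
        raise ValueError("at least one previous frame is required")
    if len(history) == 1:
        return history[-1]
    delta = history[-1] - history[-2]
    return history[-1] + delta


reused = parameter_from_history([5])           # T = 1: reuse previous value
extrapolated = parameter_from_history([2, 4])  # T = 2: continue the trend
```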
  • The feature parameter includes at least one of a correlation parameter and a peak-to-average ratio parameter of the current frame, where
  • the correlation parameter is used to characterize the degree of correlation between the current frame and the previous frame of the current frame,
  • the peak-to-average ratio parameter is used to represent a peak-to-average ratio of a signal of at least one channel of the multi-channel signal of the current frame, and
  • the second preset condition is that the feature parameter is greater than a preset threshold.
  • The initial multi-channel parameter of the current frame includes at least one of: an initial inter-channel coherence (IC) value of the current frame, an initial inter-channel time difference (ITD) value of the current frame, an initial inter-channel phase difference (IPD) value of the current frame, an initial overall phase difference (OPD) value of the current frame, and an initial inter-channel level difference (ILD) value of the current frame.
  • The feature parameter of the current frame includes at least one of the following parameters of the current frame: a correlation parameter, a peak-to-average ratio parameter, a signal-to-noise ratio parameter, and a spectral tilt parameter. The correlation parameter is used to characterize the degree of correlation between the current frame and the previous frame; the peak-to-average ratio parameter is used to characterize a peak-to-average ratio of a signal of at least one channel of the multi-channel signal of the current frame; the signal-to-noise ratio parameter is used to characterize a signal-to-noise ratio of a signal of at least one channel of the multi-channel signal of the current frame; and the spectral tilt parameter is used to characterize the degree of spectral tilt of a signal of at least one channel of the multi-channel signal of the current frame.
  • An encoder is provided, including:
  • an acquiring unit configured to acquire a multi-channel signal of a current frame;
  • a first determining unit configured to determine an initial multi-channel parameter of the current frame;
  • a second determining unit configured to determine a difference parameter according to the initial multi-channel parameter of the current frame and multi-channel parameters of the previous K frames of the current frame, where the difference parameter is used to represent a difference between the initial multi-channel parameter of the current frame and the multi-channel parameters of the previous K frames, and K is an integer greater than or equal to 1;
  • a third determining unit configured to determine, according to the difference parameter and a feature parameter of the current frame, a multi-channel parameter of the current frame; and
  • a coding unit configured to encode the multi-channel signal according to the multi-channel parameter of the current frame.
  • The multi-channel parameter of the current frame is determined after comprehensively considering the difference between the current frame and the previous K frames and the feature parameter of the current frame. This manner of determination is more reasonable: compared with directly reusing the multi-channel parameter of the previous frame for the current frame, it better ensures the accuracy of the inter-channel information of the multi-channel signal.
  • The third determining unit is specifically configured to: when the difference parameter satisfies a first preset condition, determine the multi-channel parameter of the current frame according to the feature parameter of the current frame.
  • The difference parameter is an absolute value of a difference between the initial multi-channel parameter of the current frame and the multi-channel parameter of the previous frame of the current frame, and
  • the first preset condition is that the difference parameter is greater than a preset first threshold.
  • Alternatively, the difference parameter is a product of the initial multi-channel parameter of the current frame and the multi-channel parameter of the previous frame of the current frame, and
  • the first preset condition is that the difference parameter is less than or equal to zero.
  • The third determining unit is specifically configured to determine, according to a correlation parameter of the current frame, the multi-channel parameter of the current frame, where the correlation parameter is used to characterize the degree of correlation between the current frame and the previous frame of the current frame.
  • The encoder further includes:
  • a fourth determining unit configured to determine the correlation parameter according to the target channel signal in the multi-channel signal of the current frame and the target channel signal in the multi-channel signal of the previous frame.
  • The fourth determining unit is specifically configured to determine the correlation parameter according to a frequency domain parameter of the target channel signal in the multi-channel signal of the current frame and a frequency domain parameter of the target channel signal in the multi-channel signal of the previous frame, the frequency domain parameter being at least one of a frequency domain amplitude value and a frequency domain coefficient of the target channel signal.
  • Alternatively, the encoder further includes:
  • a fifth determining unit configured to determine the correlation parameter according to a pitch period of the current frame and a pitch period of the previous frame.
  • The third determining unit is specifically configured to: when the feature parameter satisfies a second preset condition, determine the multi-channel parameter of the current frame according to the multi-channel parameters of the previous T frames of the current frame, T being an integer greater than or equal to 1.
  • The third determining unit is specifically configured to determine the multi-channel parameter of the previous T frames as the multi-channel parameter of the current frame, where T is equal to 1.
  • Alternatively, the third determining unit is specifically configured to determine the multi-channel parameter of the current frame according to a change trend of the multi-channel parameters of the previous T frames, where T is greater than or equal to 2.
  • The feature parameter includes at least one of a correlation parameter and a peak-to-average ratio parameter of the current frame, where
  • the correlation parameter is used to represent the degree of correlation between the current frame and the previous frame of the current frame,
  • the peak-to-average ratio parameter is used to represent a peak-to-average ratio of a signal of at least one channel of the multi-channel signal of the current frame, and
  • the second preset condition is that the feature parameter is greater than a preset threshold.
  • The initial multi-channel parameter of the current frame includes at least one of: an initial inter-channel coherence (IC) value of the current frame, an initial inter-channel time difference (ITD) value of the current frame, an initial inter-channel phase difference (IPD) value of the current frame, an initial overall phase difference (OPD) value of the current frame, and an initial inter-channel level difference (ILD) value of the current frame.
  • The feature parameter of the current frame includes at least one of the following parameters of the current frame: a correlation parameter, a peak-to-average ratio parameter, a signal-to-noise ratio parameter, and a spectral tilt parameter, where
  • the correlation parameter is used to represent the degree of correlation between the current frame and the previous frame,
  • the peak-to-average ratio parameter is used to represent a peak-to-average ratio of a signal of at least one channel of the multi-channel signal of the current frame,
  • the signal-to-noise ratio parameter is used to represent a signal-to-noise ratio of a signal of at least one channel of the multi-channel signal of the current frame, and
  • the spectral tilt parameter is used to characterize the degree of spectral tilt of a signal of at least one channel of the multi-channel signal of the current frame.
  • An encoder is provided, comprising a memory and a processor, the memory being used to store a program and the processor being used to execute the program; when the program is executed, the processor performs the method of the first aspect.
  • A computer readable medium is provided, storing program code for execution by an encoder, the program code comprising instructions for performing the method of the first aspect.
  • The multi-channel parameter of the current frame is determined after comprehensively considering the difference between the current frame and the previous K frames and the feature parameter of the current frame. This manner of determination is more reasonable: compared with directly reusing the multi-channel parameter of the previous frame for the current frame, it better ensures the accuracy of the inter-channel information of the multi-channel signal.
  • FIG. 3 is an exemplary flow chart of a time domain based ITD parameter extraction method in the prior art.
  • FIG. 4 is an exemplary flow chart of a frequency domain based ITD parameter extraction method in the prior art.
  • FIG. 5 is a schematic flowchart of a method for encoding a multi-channel signal according to an embodiment of the present application.
  • Figure 6 is a detailed flow diagram of step 540 of Figure 5.
  • FIG. 7 is a schematic flowchart of a method for encoding a multi-channel signal according to an embodiment of the present application.
  • FIG. 8 is a schematic block diagram of an encoder according to an embodiment of the present application.
  • FIG. 9 is a schematic structural diagram of an encoder according to an embodiment of the present application.
  • the stereo signal can also be referred to as a multi-channel signal.
  • The function and meaning of the multi-channel parameters ILD, ITD, and IPD of the multi-channel signal were briefly introduced above. For ease of understanding, ILD, ITD, and IPD are described in more detail below, taking as an example the case where the signal picked up by the first microphone is the first channel signal and the signal picked up by the second microphone is the second channel signal.
  • The ILD describes the energy difference between the first channel signal and the second channel signal. It is typically calculated as the ratio of the energies of the two channels and then converted to the logarithmic domain. For example, an ILD value greater than 0 means that the energy of the first channel signal is higher than the energy of the second channel signal; an ILD value equal to 0 means that the two energies are equal; and an ILD value less than 0 means that the energy of the first channel signal is lower than the energy of the second channel signal.
  • Alternatively, an ILD value less than 0 may mean that the energy of the first channel signal is higher than the energy of the second channel signal, an ILD value equal to 0 that the two energies are equal, and an ILD value greater than 0 that the energy of the first channel signal is lower than the energy of the second channel signal. It should be understood that the above numerical values are merely examples, and the relationship between the value of the ILD and the energy difference between the first channel signal and the second channel signal may be defined according to experience or actual needs.
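The ILD calculation described above (energy ratio converted to the logarithmic domain) can be sketched as follows. Expressing the result in decibels via 10·log10 and adding a small `eps` guard against division by zero are assumptions of this sketch; it uses the first sign convention, where a positive value means the first channel has more energy.

```python
import math


def ild_db(first_channel, second_channel, eps=1e-12):
    """Inter-channel level difference as a log-domain energy ratio.

    Positive: first channel has more energy; zero: equal energy;
    negative: first channel has less energy. The dB scaling and the eps
    guard are choices made for this sketch only.
    """
    energy_first = sum(v * v for v in first_channel)
    energy_second = sum(v * v for v in second_channel)
    return 10.0 * math.log10((energy_first + eps) / (energy_second + eps))


equal_level = ild_db([1.0, 1.0], [1.0, 1.0])
first_louder = ild_db([2.0, 2.0], [1.0, 1.0])
```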
  • The ITD describes the time difference between the first channel signal and the second channel signal, that is, the difference between the times at which the sound generated by the sound source reaches the first microphone and the second microphone. For example, an ITD value greater than 0 means that the sound reaches the first microphone earlier than it reaches the second microphone; an ITD value equal to 0 means that the sound reaches both microphones at the same time; and an ITD value less than 0 means that the sound reaches the first microphone later than it reaches the second microphone.
  • Alternatively, an ITD value less than 0 may mean that the sound reaches the first microphone earlier than it reaches the second microphone, an ITD value equal to 0 that the sound reaches both microphones at the same time, and an ITD value greater than 0 that the sound reaches the first microphone later than it reaches the second microphone. It should be understood that the above values are merely examples, and the relationship between the value of the ITD and the time difference between the first channel signal and the second channel signal may be defined according to experience or actual needs.
  • The IPD describes the phase difference between the first channel signal and the second channel signal. It is usually combined with the ITD so that the decoder can recover the phase information of the multi-channel signal.
  • The existing methods of calculating multi-channel parameters may cause the multi-channel parameters to be discontinuous.
  • Taking the case where the multi-channel signal consists of left and right channel signals and the multi-channel parameter is the ITD value as an example, the existing calculation methods and their shortcomings are described in detail below with reference to FIG. 3 and FIG. 4.
  • the ITD value can be calculated in various ways, for example, the ITD value can be calculated in the time domain, or the ITD value can be calculated in the frequency domain.
  • FIG. 3 is an exemplary flowchart of a time domain based ITD value calculation method.
  • the method of Figure 3 includes:
  • The ITD parameter can be calculated from the left and right channel time domain signals using time domain cross-correlation functions, for example by calculating the cross-correlation functions C_n(i) and C_p(i) over the range 0 ≤ i ≤ T_max.
  • If max(C_n(i)) > max(C_p(i)), T_1 takes the opposite of the index value corresponding to max(C_n(i)); otherwise T_1 takes the index value corresponding to max(C_p(i)). Here i is the index value at which the cross-correlation function is computed, x_R is the right channel time domain signal, x_L is the left channel time domain signal, T_max corresponds to the maximum value of the ITD value at different sampling rates, and Length is the frame length.
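The time domain selection rule can be sketched as below. The exact definitions of C_n(i) and C_p(i) are not reproduced in this text, so the forms used here (C_n correlates x_R with x_L advanced by i samples, C_p correlates x_L with x_R advanced by i samples, each summed over Length - i samples) are an assumption of the sketch.

```python
def time_domain_itd(x_left, x_right, t_max):
    """Estimate T_1 from two cross-correlation functions over 0 <= i <= t_max.

    Assumed definitions (not confirmed by the source text):
      c_n(i) = sum_j x_right[j] * x_left[j + i]
      c_p(i) = sum_j x_left[j]  * x_right[j + i]
    with j running over the Length - i samples where both terms exist.
    """
    length = len(x_left)

    def c_n(i):
        return sum(x_right[j] * x_left[j + i] for j in range(length - i))

    def c_p(i):
        return sum(x_left[j] * x_right[j + i] for j in range(length - i))

    lags = range(t_max + 1)
    best_n = max(lags, key=c_n)  # index of max(c_n(i))
    best_p = max(lags, key=c_p)  # index of max(c_p(i))
    if c_n(best_n) > c_p(best_p):
        return -best_n           # opposite of the index for max(c_n(i))
    return best_p


# An impulse in the left channel, and the same impulse 3 samples later on the right.
left = [0.0] * 10
right = [0.0] * 10
left[2] = 1.0
right[5] = 1.0
itd = time_domain_itd(left, right, t_max=4)
```

With these assumed definitions, a positive result means the right channel lags the left, and swapping the two inputs flips the sign.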
  • FIG. 4 is an exemplary flow chart of a frequency domain based ITD value calculation method.
  • the method of Figure 4 includes:
  • The time-frequency transform may use a Discrete Fourier Transform (DFT) or a Modified Discrete Cosine Transform (MDCT) to transform the time domain signal into a frequency domain signal.
  • DFT Discrete Fourier Transform
  • MDCT Modified Discrete Cosine Transform
  • The time-frequency transform may employ the DFT; specifically, the DFT may be performed using the formula $X(k) = \sum_{n=0}^{L-1} x(n)\, e^{-j 2\pi n k / L}$, $0 \le k < L$, where
  • n is the index value of the sample of the time domain signal,
  • k is the index value of the frequency bin of the frequency domain signal,
  • L is the time-frequency transform length, and
  • x(n) is the left channel time domain signal or the right channel time domain signal.
  • The L frequency bins of the frequency domain signal may be divided into a plurality of subbands; the b-th subband contains the frequency bins A_{b-1} ≤ k ≤ A_b - 1.
  • Within the search range -T_max ≤ j ≤ T_max, an amplitude is calculated for each candidate value j.
  • The ITD value of the b-th subband can then be taken as the index value j of the sample corresponding to the maximum of the calculated amplitudes.
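A subband search of this kind can be sketched as follows. The amplitude formula is not reproduced in this text, so the sketch assumes the magnitude of the subband cross-spectrum X_L(k)·X_R*(k) rotated by e^{-j2πkj/L}; the function names and the sign convention (a positive result means the right channel lags) are likewise assumptions.

```python
import cmath
import math


def dft(x):
    """Naive O(L^2) DFT, adequate for a short illustration."""
    size = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * n * k / size) for n in range(size))
            for k in range(size)]


def subband_itd(x_left, x_right, band_start, band_end, t_max):
    """Search -t_max <= j <= t_max for the lag maximizing the subband amplitude.

    The subband covers bins band_start <= k <= band_end - 1, matching
    A_{b-1} <= k <= A_b - 1. The amplitude expression is an assumed form.
    """
    size = len(x_left)
    spec_left = dft(x_left)
    spec_right = dft(x_right)

    def amplitude(j):
        total = sum(spec_left[k] * spec_right[k].conjugate()
                    * cmath.exp(-2j * math.pi * k * j / size)
                    for k in range(band_start, band_end))
        return abs(total)

    return max(range(-t_max, t_max + 1), key=amplitude)


# Right channel is the left channel delayed by 3 samples.
left = [0.0] * 16
right = [0.0] * 16
left[2] = 1.0
right[5] = 1.0
itd_band = subband_itd(left, right, band_start=1, band_end=8, t_max=4)
```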
  • When the calculated ITD value is considered to be inaccurate, the ITD value of the current frame is set to zero. Affected by factors such as background noise, reverberation, and several people speaking at the same time, the ITD value calculated according to the existing PS coding method may frequently be set to zero, causing the ITD value to jump back and forth. The downmix signal calculated using such ITD values may be discontinuous between frames, resulting in poor auditory quality of the multi-channel signal.
  • One feasible processing method is as follows: when the calculated multi-channel parameter of the current frame is considered to be inaccurate, the multi-channel parameter of the previous frame of the current frame is reused.
  • This kind of processing can effectively solve the problem of multi-channel parameters jumping back and forth.
  • However, this processing may cause the following problem: if the signal quality of the current frame is good, the calculated multi-channel parameter of the current frame is generally accurate. If the above processing is still applied in this case, the current frame may reuse the multi-channel parameter of the previous frame and discard its own, more accurate multi-channel parameter, which makes the inter-channel information of the multi-channel signal inaccurate.
  • FIG. 5 is a schematic flowchart of a method for encoding a multi-channel signal according to an embodiment of the present application.
  • the method of Figure 5 includes:
  • the number of multi-channel signals is not specifically limited in the embodiment of the present application.
  • The multi-channel signal may be a two-channel signal, a three-channel signal, or a signal with more than three channels.
  • the multi-channel signal may include a left channel signal and a right channel signal.
  • the multi-channel signal can include a left channel signal, a center channel signal, a right channel signal, and a back channel signal.
  • the initial multi-channel parameters of the current frame can be used to characterize the correlation between the multi-channel signals.
  • The initial multi-channel parameters of the current frame include at least one of: an initial IC value of the current frame, an initial ITD value of the current frame, an initial IPD value of the current frame, an initial OPD value of the current frame, an initial ILD value of the current frame, and so on.
  • Taking the ITD value as an example, step 520 may use the time domain based ITD value calculation method shown in FIG. 3, the frequency domain based ITD value calculation method shown in FIG. 4, or a mixed domain (time domain + frequency domain) based ITD value calculation method, in which
  • L_i(f) represents the frequency domain coefficient of the left channel frequency domain signal,
  • argmax() returns the index at which the maximum of multiple values is attained, and
  • IDFT() denotes the inverse discrete Fourier transform.
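A mixed domain estimate built from these three ingredients can be sketched as follows. The sketch assumes the form argmax(IDFT(L(f)·R*(f))), where R*(f) is the complex conjugate of the right channel spectrum; the mapping from IDFT index to signed lag, the restriction to |lag| ≤ t_max, and all function names are assumptions of this illustration.

```python
import cmath
import math


def dft(x):
    """Naive forward DFT."""
    size = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * n * k / size) for n in range(size))
            for k in range(size)]


def idft(spectrum):
    """Naive inverse DFT."""
    size = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * n * k / size)
                for k in range(size)) / size
            for n in range(size)]


def mixed_domain_itd(x_left, x_right, t_max):
    """argmax over the IDFT of the cross-spectrum L(f) * conj(R(f)).

    IDFT output index m is mapped to lag -m (modulo the frame length) so
    that a positive result means the right channel lags the left; this
    convention is chosen for the sketch only.
    """
    size = len(x_left)
    cross = [l * r.conjugate() for l, r in zip(dft(x_left), dft(x_right))]
    correlation = [value.real for value in idft(cross)]
    return max(range(-t_max, t_max + 1), key=lambda lag: correlation[(-lag) % size])


# Right channel is the left channel delayed by 3 samples.
left = [0.0] * 16
right = [0.0] * 16
left[2] = 1.0
right[5] = 1.0
itd_mixed = mixed_domain_itd(left, right, t_max=4)
```

The cross-spectrum is formed in the frequency domain while the peak search happens on its time domain image, which is why the approach is described as mixed domain.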
  • the pre-K frame of the current frame refers to the pre-K frame in the vicinity of the current frame among all the frames of the audio signal to be encoded.
  • the pre-K frames appearing below refer to the first K frames of the current frame
  • the previous frame appearing below refers to the previous frame of the current frame.
  • the representation of the multi-channel parameters may be a numerical value, and therefore, the multi-channel parameters may also be referred to as multi-channel parameter values.
  • the feature parameters of the current frame may include mono parameters of the current frame, which may be used to characterize the signal of one channel of the multi-channel signal of the current frame.
  • determining the multi-channel parameters of the current frame as described in step 540 may include modifying the initial multi-channel parameters to obtain the multi-channel parameters of the current frame. Taking the case where the feature parameter of the current frame is the mono parameter of the current frame as an example, step 540 may include: correcting the initial multi-channel parameter of the current frame according to the difference parameter and the mono parameter of the current frame, to obtain the multi-channel parameter of the current frame.
  • the feature parameters of the current frame include at least one of the following parameters of the current frame: a correlation parameter, a peak-to-average ratio parameter, a signal to noise ratio parameter, and a spectral tilt parameter.
  • the correlation parameter is used to represent the correlation degree between the current frame and the previous frame
  • the peak-to-average ratio parameter is used to represent the peak-to-average ratio of the signal of at least one channel of the multi-channel signal of the current frame
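  • the signal-to-noise ratio parameter is used to represent the signal-to-noise ratio or signal-to-noise ratio characteristics of the signal of at least one channel of the multi-channel signal of the current frame
  • the spectral tilt parameter is used to represent the spectral tilt or spectral energy trend of the signal of at least one channel of the multi-channel signal of the current frame.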
  • operations such as mono audio coding, spatial parameter coding, and bit stream multiplexing shown in FIG. 1 may be performed.
  • for the specific coding method, reference may be made to the prior art.
  • the multi-channel parameter of the current frame is determined after comprehensively considering both the difference between the current frame and the previous K frames and the feature parameters of the current frame, which is a more reasonable way of determining it. Compared with directly reusing the multi-channel parameters of the previous frame for the current frame, this better ensures the accuracy of the inter-channel information of the multi-channel signal.
  • the implementation of step 540 is described in detail below.
  • step 540 may include: if the difference parameter satisfies the first preset condition, adjusting the value of the initial multi-channel parameter of the current frame according to the value of the feature parameter of the current frame, to obtain the multi-channel parameter of the current frame.
  • step 540 may include: if the feature parameter of the current frame satisfies the first preset condition, adjusting the value of the initial multi-channel parameter of the current frame according to the value of the difference parameter, to obtain the multi-channel parameter of the current frame.
  • the first preset condition may be a single condition, or may be a combination of multiple conditions.
  • when the first preset condition is satisfied, the determination may be continued in combination with other conditions, and the next step is performed only when all the conditions are satisfied.
  • step 540 may include:
  • there are multiple ways to define the difference parameter, and different ways of defining the difference parameter may correspond to different first preset conditions.
  • the difference parameter and its corresponding first preset condition are described in detail below.
  • the difference parameter may be the difference between the initial multi-channel parameter of the current frame and the multi-channel parameter of the previous frame, or the absolute value of that difference;
  • the first preset condition may be that the difference parameter is greater than a preset first threshold.
  • the first threshold may be 0.3-0.7 times the target value.
  • for example, the first threshold may be 0.5 times the target value, where the target value is the larger of the absolute value of the multi-channel parameter of the previous frame and the absolute value of the initial multi-channel parameter of the current frame.
  • the difference parameter may be the difference between the initial multi-channel parameter of the current frame and the mean value of the multi-channel parameters of the previous K frames, or the absolute value of that difference;
  • the first preset condition may be that the difference parameter is greater than a preset first threshold, which may be 0.3-0.7 times the target value.
  • for example, the first threshold may be 0.5 times the target value, where the target value is a multi-channel parameter of the previous frame.
  • the difference parameter may be a product of an initial multi-channel parameter of the current frame and a multi-channel parameter of the previous frame; the first preset condition may be that the difference parameter is less than or equal to zero.
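  • as a minimal sketch of two of the condition families above (an absolute difference compared against a threshold proportional to a target value, and a sign test on the product), the Python fragment below may help. The function names, the `factor` argument, and the choice of the larger absolute value as the target value are assumptions made for illustration, not definitions taken from the application.

```python
def exceeds_relative_threshold(initial_param, prev_param, factor=0.5):
    """Check |initial - previous| > factor * target value.

    Assumption: the target value is the larger of the two absolute values;
    factor lies in the 0.3-0.7 range quoted in the text (0.5 by default).
    """
    target = max(abs(initial_param), abs(prev_param))
    return abs(initial_param - prev_param) > factor * target


def sign_changed_or_zero(initial_param, prev_param):
    """Check that the product of the two parameters is less than or equal to zero."""
    return initial_param * prev_param <= 0
```

  • with these helpers, an initial value of 10 against a previous value of 3 satisfies the first variant (7 is greater than 0.5 × 10), while 10 against 8 does not.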
  • the specific implementation of step 544 is described in detail below.
  • step 544 may include determining the multi-channel parameter of the current frame according to a correlation parameter and/or a spectral tilt parameter of the current frame, where the correlation parameter is used to represent the degree of correlation between the current frame and the previous frame, and the spectral tilt parameter is used to represent the spectral tilt or spectral energy trend of the signal of at least one channel of the multi-channel signal of the current frame.
  • step 544 may include determining the multi-channel parameter of the current frame according to a correlation parameter and/or a peak-to-average ratio parameter of the current frame, where the correlation parameter is used to represent the degree of correlation between the current frame and the previous frame, and the peak-to-average ratio parameter is used to represent the peak-to-average ratio of the signal of at least one channel of the multi-channel signal of the current frame.
  • the correlation parameter can be used to characterize the degree of correlation between the current frame and the previous frame.
  • the degree of correlation of the current frame with the previous frame may be characterized by the degree of correlation of the target channel signal in the multi-channel signal of the current frame and the previous frame.
  • the target channel signal of the current frame and the target channel signal of the previous frame correspond to each other; that is, if the target channel signal of the current frame is the left channel signal, the target channel signal of the previous frame is the left channel signal; if the target channel signal of the current frame is the right channel signal, the target channel signal of the previous frame is the right channel signal; if the target channel signal of the current frame is the left and right channel signals, the target channel signal of the previous frame is the left and right channel signals.
  • the target channel signal may be a target channel time domain signal or a target channel frequency domain signal.
  • taking the case where the target channel signal is a frequency domain signal as an example, determining the correlation parameter according to the target channel signal in the multi-channel signals of the current frame and the previous frame may include: determining the correlation parameter according to frequency domain parameters of the target channel signal in the multi-channel signals of the current frame and the previous frame, where the frequency domain parameters of the target channel signal include frequency domain amplitude values and/or frequency domain coefficients of the target channel signal.
  • the frequency domain amplitude value of the target channel signal may refer to a frequency domain amplitude value of some or all of the sub-bands of the target channel signal.
  • for example, it may be the frequency domain amplitude values of the subbands in the low frequency portion of the target channel signal.
  • taking the left channel frequency domain signal as the target channel signal as an example, assume that the low frequency portion of the left channel frequency domain signal includes M subbands, each including N frequency domain amplitude values. The normalized cross-correlation value of the frequency domain amplitude values of each subband of the current frame and the previous frame can then be calculated according to the following formula, yielding M normalized cross-correlation values in one-to-one correspondence with the M subbands:
  • the M normalized cross-correlation values may be determined as the correlation parameters of the current frame and the previous frame; or, the sum of the M normalized cross-correlation values or the mean of the M normalized cross-correlation values may be determined as the correlation parameter of the current frame.
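  • the formula itself is not reproduced in this text, so the following Python sketch assumes the conventional normalized cross-correlation (inner product divided by the square root of the product of the two energies). The function names and the equal-width subband split are illustrative assumptions.

```python
import math


def subband_ncc(cur_band, prev_band):
    """Normalized cross-correlation of two equally long amplitude vectors.

    Assumption: the standard definition is used, since the application's
    own formula is not available in this text.
    """
    num = sum(c * p for c, p in zip(cur_band, prev_band))
    den = math.sqrt(sum(c * c for c in cur_band) * sum(p * p for p in prev_band))
    return num / den if den > 0 else 0.0


def correlation_parameter(cur_amps, prev_amps, num_subbands):
    """Return the M per-subband values and their mean.

    cur_amps / prev_amps hold M * N low-frequency amplitude values of the
    current and previous frame; each subband takes N consecutive values.
    """
    n = len(cur_amps) // num_subbands
    values = [
        subband_ncc(cur_amps[b * n:(b + 1) * n], prev_amps[b * n:(b + 1) * n])
        for b in range(num_subbands)
    ]
    return values, sum(values) / len(values)
```

  • returning both the per-subband values and their mean mirrors the options in the text, where either the M values or an aggregate of them may serve as the correlation parameter.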
  • the above manner of calculating a correlation parameter based on frequency domain amplitude values may be replaced with calculating a correlation parameter based on frequency domain coefficients.
  • the above manner of calculating the correlation parameter based on the frequency domain amplitude value may be replaced with calculating the correlation parameter based on the absolute value of the frequency domain coefficient.
  • the multi-channel signal of the current frame may refer to the multi-channel signal of one or more subframes of the current frame; for the same reason, the multi-channel signal of the previous frame may refer to the multi-channel signal of one or more subframes of the previous frame.
  • that is, the correlation parameter can be calculated based on all the multi-channel signals of the current frame and the previous frame, or based on the multi-channel signals of one or some subframes of the current frame and the previous frame.
  • the normalized cross-correlation values of the left and right channel time domain signals of the current frame and the previous frame can be calculated according to the following formula to obtain N normalized cross-correlation values, and the largest normalized cross-correlation value is then searched for among the N normalized cross-correlation values:
  • L(n) represents the left channel time domain signal
  • R(n) represents the right channel time domain signal
  • N is the total number of samples of the left channel time domain signal
  • L is the number of samples by which the nth sample of the right channel time domain signal is offset from the nth sample of the left channel time domain signal.
  • the maximum normalized cross-correlation value calculated by the above equation can be used as the correlation parameter of the current frame.
  • the multi-channel signal of the current frame may refer to the multi-channel signal of one or more subframes of the current frame; for the same reason, the multi-channel signal of the previous frame may refer to the multi-channel signal of one or more subframes of the previous frame.
  • for example, a plurality of maximum normalized cross-correlation values corresponding to a plurality of subframes may be calculated by using the above formula in units of subframes, and then one or more of the following may be used as the correlation parameter of the current frame: the plurality of maximum normalized cross-correlation values, the sum of the plurality of maximum normalized cross-correlation values, or the mean of the plurality of maximum normalized cross-correlation values.
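  • the time domain formula is likewise missing from this text. The sketch below shows one plausible reading, in which a normalized cross-correlation between a left and a right channel buffer is evaluated over a range of sample offsets and the largest value is kept. The function name, the `max_lag` parameter, and the normalization by the full-buffer energies are assumptions.

```python
import math


def max_normalized_xcorr(left, right, max_lag):
    """Search offsets in [-max_lag, max_lag] for the largest normalized
    cross-correlation between two time domain buffers.

    Returns (best_value, best_lag). A positive lag means the right channel
    content appears later than the matching left channel content.
    """
    n = min(len(left), len(right))
    energy = math.sqrt(sum(v * v for v in left[:n]) * sum(v * v for v in right[:n]))
    if energy == 0.0:
        return 0.0, 0
    best_value, best_lag = float("-inf"), 0
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:  # skip samples that fall outside the buffer
                acc += left[i] * right[j]
        value = acc / energy
        if value > best_value:
            best_value, best_lag = value, lag
    return best_value, best_lag
```

  • applied per subframe, the returned maxima can then be summed or averaged, as the text describes.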
  • the degree of correlation of the current frame with the previous frame may be characterized by the degree of correlation of the pitch period of the current frame and the previous frame.
  • the correlation parameter can be determined based on the pitch period of the current frame and the pitch period of the previous frame.
  • the pitch period of the current frame or the previous frame may include the pitch period of each subframe of the current frame or the previous frame.
  • the pitch period of the current frame, or of each subframe in the current frame, may be calculated according to an existing pitch period algorithm, and the pitch period of the previous frame, or of each subframe in the previous frame, may be calculated in the same way. Then, the deviation between the pitch periods of the current frame and the previous frame is calculated, or the deviation between the pitch periods of each subframe in the current frame and each subframe in the previous frame is calculated. The calculated pitch period deviation can then be used as the correlation parameter of the current frame and the previous frame.
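  • a small Python sketch of the subframe-wise variant follows. It assumes the pitch periods have already been estimated by some existing algorithm, and that the deviation is an absolute difference between corresponding subframes; reducing the deviations to their mean is a further illustrative assumption.

```python
def pitch_deviation(cur_pitches, prev_pitches):
    """Absolute pitch period deviation between corresponding subframes.

    cur_pitches / prev_pitches are per-subframe pitch periods (in samples)
    of the current and previous frame. Returns the per-subframe deviations
    and their mean; a smaller deviation suggests a stronger correlation.
    """
    deviations = [abs(c - p) for c, p in zip(cur_pitches, prev_pitches)]
    return deviations, sum(deviations) / len(deviations)
```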
  • the peak-to-average ratio parameters of the current frame are described in detail below.
  • the peak-to-average ratio parameter of the current frame can be used to characterize the peak-to-average ratio of the signal of at least one of the multi-channel signals of the current frame.
  • taking the case where the multi-channel signal includes a left channel signal and a right channel signal as an example, the peak-to-average ratio parameter may be the peak-to-average ratio of the left channel signal, the peak-to-average ratio of the right channel signal, or a combination of the peak-to-average ratios of the left channel signal and the right channel signal.
  • the peak-to-average ratio parameter can be calculated in a variety of ways. For example, it can be calculated based on the frequency domain amplitude value of the frequency domain signal. As another example, it can be calculated based on the frequency domain coefficients of the frequency domain signal or the absolute values of the frequency domain coefficients.
  • the frequency domain amplitude value of the frequency domain signal may refer to a frequency domain amplitude value of some or all of the subbands of the frequency domain signal.
  • for example, it may be the frequency domain amplitude values of the subbands in the low frequency portion of the frequency domain signal.
  • assume that the low frequency portion of the left channel frequency domain signal includes M subbands, each including N frequency domain amplitude values. The peak-to-average ratio of the N frequency domain amplitude values of each subband can be calculated, yielding M peak-to-average ratios in one-to-one correspondence with the M subbands; the M peak-to-average ratios, the sum of the M peak-to-average ratios, or the mean of the M peak-to-average ratios is then taken as the peak-to-average ratio parameter of the current frame.
  • to reduce computational complexity, the ratio of the maximum frequency domain amplitude value of each sub-band to the sum of the N frequency domain amplitude values of that sub-band may be used as the peak-to-average ratio.
  • when the peak-to-average ratio is compared with a preset threshold, the maximum frequency domain amplitude value of each sub-band may be compared with the product of the preset threshold and the sum of the N frequency domain amplitude values of that sub-band; or the maximum frequency domain amplitude value may be compared with the product of the preset threshold and the average of the N frequency domain amplitude values of that sub-band.
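  • the two ideas above (peak divided by sum, and a threshold test that avoids the division) can be sketched in Python as follows. The function names and the equal-width subband split are illustrative assumptions.

```python
def subband_pars(amps, num_subbands):
    """Peak-to-sum ratio of each subband.

    amps holds M * N non-negative amplitude values; each subband takes N
    consecutive values. Dividing by the sum rather than the mean follows
    the lower-complexity variant described in the text.
    """
    n = len(amps) // num_subbands
    pars = []
    for b in range(num_subbands):
        band = amps[b * n:(b + 1) * n]
        total = sum(band)
        pars.append(max(band) / total if total > 0 else 0.0)
    return pars


def par_exceeds(band, threshold):
    """Division-free test equivalent to max(band) / sum(band) > threshold."""
    return max(band) > threshold * sum(band)
```

  • multiplying the threshold by the sum gives the same decision as dividing the peak by the sum for non-negative amplitudes, while avoiding a division per subband.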
  • the multi-channel signal of the current frame may refer to a multi-channel signal of one or more sub-frames of the current frame.
  • the characteristic parameters of the current frame may also include the signal to noise ratio parameter of the current frame, and the signal to noise ratio parameter is described in detail below.
  • the signal to noise ratio parameter of the current frame can be used to characterize the signal to noise ratio or signal to noise ratio characteristic of at least one of the multi-channel signals of the current frame.
  • the signal-to-noise ratio parameter of the current frame may include one or more parameters, and the specific selection manner of the parameter is not limited in the embodiment of the present application.
  • the signal to noise ratio parameter of the current frame may include at least one of: a sub-band signal to noise ratio of the multi-channel signal, a modified sub-band signal to noise ratio, a segmented signal to noise ratio, a modified segmented signal to noise ratio, a full band signal to noise ratio, a modified full band signal to noise ratio, and other parameters that can characterize the signal to noise ratio characteristics of the multi-channel signal.
  • the signal to noise ratio parameter of the current frame can be calculated using all of the signals of the multi-channel signal.
  • a portion of the multi-channel signal can be used to calculate a signal to noise ratio parameter for the current frame.
  • the signal of any one of the multi-channel signals can be adaptively selected to calculate the signal-to-noise ratio parameter of the current frame.
  • the data representing the multi-channel signal may be weighted averaged to form a new signal, and then the signal-to-noise ratio of the new signal is used to characterize the signal-to-noise ratio parameter of the current frame.
  • the characteristic parameters of the current frame may also include the spectral tilt parameters of the current frame, and the spectral tilt parameters are described in detail below.
  • the spectral tilt parameter of the current frame can be used to characterize the spectral tilt or spectral energy trend of the signal of at least one of the multi-channel signals of the current frame. It should be understood that the greater the degree of spectral tilt, the weaker the voicedness of the signal; the smaller the degree of spectral tilt, the stronger the voicedness of the signal.
  • the manner of determining the multi-channel parameters of the current frame based on the feature parameters of the current frame in step 544 is described in detail below.
  • whether the current frame reuses the multi-channel parameters of the previous frame may be determined according to the feature parameters of the current frame.
  • the current frame may reuse the multi-channel parameter of the previous frame if the feature parameter satisfies the second preset condition.
  • the initial multi-channel parameter of the current frame may be used as the multi-channel parameter of the current frame in the case that the feature parameter does not satisfy the second preset condition. It should be understood that the embodiments of the present application do not specifically limit the processing performed when the feature parameter does not satisfy the second preset condition.
  • for example, the initial multi-channel parameters may be corrected by other existing methods.
  • whether to determine the multi-channel parameter of the current frame according to the change trend of the multi-channel parameters of the previous T frames may be decided according to the feature parameter of the current frame, where T is greater than or equal to 2.
  • in the case that the feature parameter satisfies the second preset condition, the multi-channel parameter of the current frame may be determined according to the change trend of the multi-channel parameters of the previous T frames.
  • the initial multi-channel parameter of the current frame may be used as the multi-channel parameter of the current frame in the case that the feature parameter does not satisfy the second preset condition. It should be understood that the embodiments of the present application do not specifically limit the processing performed when the feature parameter does not satisfy the second preset condition.
  • for example, the initial multi-channel parameters may be corrected by other existing methods.
  • the second preset condition may be a single condition or a combination of multiple conditions.
  • when the second preset condition is satisfied, the determination may be continued in combination with other conditions, and the next step is performed only when all the conditions are satisfied.
  • the previous T frames of the current frame refer to the T frames immediately preceding the current frame among all the frames of the audio signal to be encoded.
  • ITD[i] represents the ITD value of the current frame
  • ITD[i-1] represents the ITD value of the previous frame of the current frame
  • ITD[i-2] represents the ITD value of the frame preceding the previous frame of the current frame.
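  • the expression relating ITD[i] to ITD[i-1] and ITD[i-2] is not reproduced in this text. Purely as an assumed illustration of following a change trend with T = 2, the Python sketch below continues the most recent frame-to-frame change linearly; the function name and the linear rule are hypothetical and may differ from the application's actual formula.

```python
def extrapolate_itd(itd_prev1, itd_prev2):
    """Continue the trend of the two preceding frames (assumed linear rule).

    itd_prev1 is ITD[i-1] and itd_prev2 is ITD[i-2]; the result is a
    candidate for ITD[i].
    """
    delta = itd_prev1 - itd_prev2  # most recent frame-to-frame change
    return itd_prev1 + delta
```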
  • the second preset condition may be defined in multiple manners, and the setting of the second preset condition is related to the selection of the feature parameter, which is not specifically limited in this embodiment of the present application.
  • taking as an example the case where the feature parameter is the correlation parameter and/or the peak-to-average ratio parameter, the correlation parameter is the mean of the correlation values of the multi-channel signals of the current frame and the previous frame over the sub-bands, and the peak-to-average ratio parameter is the mean of the peak-to-average ratios of the multi-channel signal of the current frame over the sub-bands, the second preset condition may be one or more of the following conditions:
  • the correlation parameter is greater than the second threshold, where the second threshold may be in the range of 0.6 to 0.95, for example, 0.85;
  • the peak-to-average ratio parameter is greater than the third threshold, where the third threshold may be in the range of 0.4 to 0.8, for example, 0.6;
  • the correlation parameter is greater than the fourth threshold and the correlation value of a certain sub-band is greater than the fifth threshold, where the fourth threshold may be in the range of 0.6 to 0.85, for example, 0.7, and the fifth threshold may be in the range of 0.8 to 0.95, for example, 0.9;
  • the peak-to-average ratio parameter is greater than the sixth threshold and the peak-to-average ratio of a certain sub-band is greater than the seventh threshold, where the sixth threshold may be in the range of 0.4 to 0.75, for example, 0.55, and the seventh threshold may be in the range of 0.6 to 0.9, for example, 0.7;
  • the second threshold in the above may be greater than the fourth threshold, and the fourth threshold may be less than the fifth threshold; or the third threshold may be greater than the sixth threshold, and the sixth threshold may be less than the seventh threshold.
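  • the four conditions can be sketched in Python as below, using the example threshold values quoted above as defaults. Combining the conditions with a logical OR is an assumption (the text allows any one or more of them to be used), and the function and argument names are illustrative.

```python
def second_condition_met(corr_per_band, par_per_band,
                         t2=0.85, t3=0.6, t4=0.7, t5=0.9, t6=0.55, t7=0.7):
    """Evaluate the example second preset condition.

    corr_per_band: per-subband correlation values (current vs. previous frame).
    par_per_band:  per-subband peak-to-average ratios of the current frame.
    t2..t7:        second to seventh thresholds, defaulting to the example
                   values given in the text.
    Assumption: satisfying any one of the four conditions is sufficient.
    """
    corr_mean = sum(corr_per_band) / len(corr_per_band)
    par_mean = sum(par_per_band) / len(par_per_band)
    return (
        corr_mean > t2
        or par_mean > t3
        or (corr_mean > t4 and max(corr_per_band) > t5)
        or (par_mean > t6 and max(par_per_band) > t7)
    )
```

  • the defaults respect the ordering stated in the text: the second threshold exceeds the fourth, which is below the fifth, and the third exceeds the sixth, which is below the seventh.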
  • the relationship between the peak-to-average ratio parameter and the preset threshold value needs to be determined.
  • to avoid a division, the comparison of the peak-to-average ratio parameter with the preset threshold may be converted into a comparison of the peak value with a target value, where the target value may be the product of the preset threshold and the mean value used in the peak-to-average ratio, or may be the product of the preset threshold and the sum of the parameters used to calculate the peak-to-average ratio.
  • for example, if the parameters used to calculate the peak-to-average ratio are the frequency domain amplitude values of a sub-band and each sub-band includes N frequency domain amplitude values, then when the peak-to-average ratio is compared with the preset threshold, the maximum frequency domain amplitude value of each sub-band can be compared with the product of the preset threshold and the sum of the N frequency domain amplitude values of that sub-band; alternatively, the maximum frequency domain amplitude value of each sub-band can be compared with the product of the preset threshold and the average of the N frequency domain amplitude values of that sub-band.
  • FIG. 7 mainly illustrates that the multi-channel signal of the current frame includes a left channel signal and a right channel signal, and the multi-channel parameter is an ITD value.
  • the example of FIG. 7 is merely intended to help those skilled in the art understand the embodiments of the present application, and is not intended to limit the embodiments of the present application to the specific numerical values or specific scenarios illustrated. A person skilled in the art will be able to make various modifications or changes according to the example of FIG. 7, and such modifications or changes also fall within the scope of the embodiments of the present application.
  • FIG. 7 is a schematic flowchart of a method for encoding a multi-channel signal according to an embodiment of the present application. It should be understood that the processing steps or operations illustrated in FIG. 7 are merely examples, and the embodiments of the present application may also perform other operations or variations of the various operations in FIG. Moreover, the various steps in FIG. 7 may be performed in a different order than that presented in FIG. 7, and it is possible that not all operations in FIG. 7 are to be performed.
  • the method of Figure 7 includes:
  • steps 720-740 can be expressed by:
  • L i (f) represents the frequency domain coefficient of the left channel frequency domain signal, R* i (f) represents the conjugate of the frequency domain coefficient of the right channel frequency domain signal, argmax() selects the position at which the maximum of multiple values occurs, and IDFT() represents the inverse discrete Fourier transform.
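  • the expression for steps 720-740 is not reproduced in this text. Going by the symbols defined above, the sketch below assumes that the ITD is the shift at which the inverse DFT of the cross-spectrum (left spectrum multiplied by the conjugate of the right spectrum) is largest. The plain O(n²) DFT, the `max_shift` search range, and the sign convention are illustrative assumptions.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform (O(n^2), for illustration only)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
            for f in range(n)]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[f] * cmath.exp(2j * cmath.pi * f * t / n) for f in range(n)) / n
            for t in range(n)]


def estimate_itd(left, right, max_shift):
    """Return the shift in [-max_shift, max_shift] that maximizes the
    circular cross-correlation obtained as IDFT(L(f) * conj(R(f))).

    With this convention, a right channel delayed by d samples relative to
    the left channel yields the value -d.
    """
    cross = [l * r.conjugate() for l, r in zip(dft(left), dft(right))]
    xcorr = [v.real for v in idft(cross)]
    n = len(xcorr)
    return max(range(-max_shift, max_shift + 1), key=lambda s: xcorr[s % n])
```

  • a practical encoder would use an FFT and would typically restrict the search range to the physically plausible delay between the channels.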
  • steps 760-770 can refer to the prior art and will not be described in detail herein.
  • Step 750 corresponds to step 530 in FIG. 5, and any of the implementations given in step 530 may be employed. Several alternative implementations are listed below.
  • Step 1: the low frequency portion of the left channel frequency domain signal of the current frame may be divided into M subbands, each including N frequency domain amplitude values.
  • Step 2: the correlation parameter between the current frame and the previous frame may be calculated according to the following formula:
  • the correlation parameter between the current frame and the previous frame is thereby obtained; the correlation parameter may be the normalized cross-correlation value of each sub-band, or may be the sum or the mean of the normalized cross-correlation values of the sub-bands.
  • Step 3: the peak-to-average ratio of each sub-band of the current frame is calculated.
  • step 2 and step 3 may be performed simultaneously or sequentially.
  • the peak-to-average ratio of each sub-band can be expressed as the ratio of the peak value to the mean value of the frequency domain amplitude values of the sub-band; it can also be expressed as the ratio of the peak value to the sum of the frequency domain amplitude values in the sub-band, which reduces the computational complexity.
  • the peak-to-average ratio parameter of the multi-channel signal of the current frame is thereby obtained; the peak-to-average ratio parameter may be the peak-to-average ratio of each sub-band, or may be the sum or the mean of the peak-to-average ratios of the sub-bands.
  • Step 4: if the initial ITD value of the current frame and the ITD value of the previous frame satisfy the first preset condition, determine, according to the correlation parameter and/or the peak-to-average ratio parameter of the current frame, whether the current frame reuses the ITD value of the previous frame.
  • the first preset condition can be, for example:
  • the product of the ITD value of the previous frame and the initial ITD value of the current frame is 0; or,
  • the product of the ITD value of the previous frame and the initial ITD value of the current frame is negative; or,
  • the absolute value of the difference between the ITD value of the previous frame and the initial ITD value of the current frame is greater than half of the target value, where the target value is the larger of the absolute value of the ITD value of the previous frame and the absolute value of the initial ITD value of the current frame.
  • the first preset condition may be a single condition, or may be a combination of multiple conditions.
  • when the first preset condition is satisfied, the determination may be continued in combination with other conditions, and the next step is performed only when all the conditions are satisfied.
  • determining whether the current frame reuses the ITD value of the previous frame according to the correlation parameter and/or the peak-to-average ratio parameter of the current frame may specifically include determining whether the correlation parameter and/or the peak-to-average ratio parameter of the current frame satisfy the second preset condition; in the case that the correlation parameter and/or the peak-to-average ratio parameter of the current frame satisfy the second preset condition, the current frame reuses the ITD value of the previous frame.
  • the second preset condition can be, for example:
  • the mean value of the normalized cross-correlation values of each sub-band is greater than the first threshold
  • the mean value of the peak-to-average ratio of each sub-band is greater than the second threshold
  • the mean value of the normalized cross-correlation value of each sub-band is greater than a third threshold and the normalized cross-correlation value of a sub-band is greater than a fourth threshold;
  • the mean value of the peak-to-average ratio of each sub-band is greater than a fifth threshold and the peak-to-average ratio of a certain sub-band is greater than a sixth threshold;
  • the first threshold is greater than the third threshold, the third threshold is less than the fourth threshold; the second threshold is greater than the fifth threshold, and the fifth threshold is less than the sixth threshold.
  • the second preset condition may be a single condition or a combination of multiple conditions.
  • when the second preset condition is satisfied, the determination may be continued in combination with other conditions, and the next step is performed only when all the conditions are satisfied.
  • the left channel frequency domain signal of the current frame described above may be the left channel frequency domain signal of a certain subframe or some subframes in the current frame, and the left channel frequency domain signal of the previous frame described above may be the left channel frequency domain signal of a certain subframe or some subframes in the previous frame.
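  • the decision in step 4 can be sketched end to end in Python. The sketch takes the first preset condition as the union of the three example conditions, and reduces the second preset condition to its two simplest forms (mean correlation or mean peak-to-average ratio above a threshold). The function name, the default thresholds of 0.85 and 0.6, and this particular combination are assumptions for illustration.

```python
def select_itd(initial_itd, prev_itd, corr_mean, par_mean,
               corr_threshold=0.85, par_threshold=0.6):
    """Choose between the initial ITD of the current frame and the ITD of
    the previous frame.

    First condition (assumed union of the examples): the product of the two
    ITD values is zero or negative, or their absolute difference exceeds
    half of the larger absolute value.
    Second condition (assumed simplification): the mean normalized
    cross-correlation or the mean peak-to-average ratio exceeds its threshold.
    """
    target = max(abs(initial_itd), abs(prev_itd))
    first = (initial_itd * prev_itd <= 0
             or abs(prev_itd - initial_itd) > 0.5 * target)
    if not first:
        return initial_itd  # no abrupt change: keep the initial estimate
    second = corr_mean > corr_threshold or par_mean > par_threshold
    return prev_itd if second else initial_itd
```

  • for example, an initial ITD of -4 following a previous ITD of 5 is replaced by 5 when the frames are strongly correlated (mean correlation 0.9), but is kept when they are not (mean correlation 0.2).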
  • the correlation parameter can be calculated by the parameters of the current frame and the previous frame, or can be calculated by the parameters of a certain subframe or some subframes in the current frame and the previous frame.
  • the peak-to-average ratio parameter can be calculated by the parameters of the current frame, or can be calculated by using a certain subframe or some subframes in the current frame.
  • the second implementation manner is different from the foregoing implementation manner in that the implementation manner is to calculate the correlation parameter between the current frame and the previous frame based on the frequency domain amplitude value of the subband, and the implementation manner is based on the frequency domain coefficient or frequency of the subband.
  • the absolute value of the domain coefficient calculates the correlation parameter of the current frame and the previous frame.
  • the implementation mode 2 is similar to the specific implementation process of the foregoing implementation manner, and is not described in detail herein.
  • the implementation method 3 differs from the above implementation manner in that the above implementation manner is based on the frequency-domain amplitude value of the sub-band to calculate the peak-to-average ratio parameter, and the implementation manner is based on the absolute value of the sub-band frequency domain coefficient to calculate the peak-to-average ratio parameter. .
  • the third implementation manner is similar to the specific implementation process of the foregoing implementation manner, and is not described in detail herein.
  • Implementation 4 differs from the foregoing implementation in that the foregoing implementation calculates the correlation parameter and/or the peak-to-average ratio parameter from the left-channel frequency-domain signal, whereas Implementation 4 calculates them from the right-channel frequency-domain signal.
  • The rest of Implementation 4 is similar to the foregoing implementation and is not described in detail here.
  • Implementation 5 differs from the foregoing implementations in that they calculate the correlation parameter and/or the peak-to-average ratio parameter from either the left-channel or the right-channel frequency-domain signal, whereas Implementation 5 calculates them from both the left-channel and the right-channel frequency-domain signals.
  • One set of correlation parameters and/or peak-to-average ratio parameters may be calculated from the left-channel frequency-domain signal, and another set from the right-channel frequency-domain signal. One of the two sets is then selected as the final correlation parameter and/or peak-to-average ratio parameter.
  • The other steps of Implementation 5 are similar to the foregoing implementations and are not described in detail here.
  • Implementation 6 differs from the foregoing implementation in that the foregoing implementation calculates the correlation parameter from frequency-domain signals, whereas Implementation 6 calculates it from time-domain signals.
  • the correlation parameter of the current frame and the previous frame can be calculated by:
  • L(n) represents the left channel time domain signal
  • R(n) represents the right channel time domain signal
  • N is the total number of samples of the left channel time domain signal
  • L is the offset, in samples, between the n-th sample of the right-channel signal and the n-th sample of the left-channel signal.
  • The left-channel and right-channel time-domain signals here may be the entire left-channel and right-channel signals of the current frame, or the left-channel and right-channel signals of one or more subframes of the current frame.
  • Implementation 7 differs from the foregoing implementation in that the foregoing implementation determines whether the current frame reuses the ITD value of the previous frame, whereas Implementation 7 determines whether the ITD value of the current frame is estimated from the trend of the ITD values of the previous T frames of the current frame, where T is an integer greater than or equal to 2.
  • The ITD value of the current frame, ITD[i], can be calculated as follows:
  • ITD[i-1] represents the ITD value of the previous frame of the current frame
  • ITD[i-2] represents the ITD value of the frame before the previous frame of the current frame.
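The formula for ITD[i] is not reproduced in this text. As a minimal sketch only, and assuming the trend is a simple linear one (the most recent change in ITD is repeated once), the estimate could look like the following. The function name `extrapolate_itd` and the linear rule are illustrative assumptions, not the formula of the application.

```python
def extrapolate_itd(history):
    """Estimate ITD[i] from the ITD values of the previous T >= 2 frames.

    history: ITD values of previous frames, oldest first, so
             history[-1] is ITD[i-1] and history[-2] is ITD[i-2].
    Assumption: linear trend, ITD[i] = ITD[i-1] + (ITD[i-1] - ITD[i-2]).
    """
    if len(history) < 2:
        raise ValueError("need the ITD values of at least two previous frames")
    delta = history[-1] - history[-2]  # most recent frame-to-frame change
    return history[-1] + delta
```

With T greater than 2, a fit over more of the history would also be possible; the two-point rule is simply the smallest case that matches the two symbols defined above.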
  • Implementation 8 differs from the foregoing implementation in that the foregoing implementation calculates the correlation parameter between the current frame and the previous frame from the time-domain/frequency-domain signals of the two frames, whereas Implementation 8 calculates it from the pitch periods of the current frame and the previous frame.
  • The pitch period of the current frame (or of its subframes) may be calculated with an existing pitch-period algorithm, and the corresponding pitch period of the previous frame is calculated in the same way. The deviation between the pitch period of the current frame and that of the previous frame is then calculated and used as the correlation parameter of the current frame and the previous frame.
  • The deviation may be the deviation between the pitch periods of the current frame and the previous frame as a whole, the deviation between the pitch periods of one or more subframes in the two frames, the sum of the pitch-period deviations over some subframes of the two frames, or the mean of those deviations.
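The sum and mean variants described above can be sketched as follows. This is an illustration under stated assumptions: the per-subframe pitch periods are taken as already computed by some existing pitch estimator, the deviation of a subframe pair is assumed to be the absolute difference, and the name `pitch_deviation` is hypothetical.

```python
def pitch_deviation(current_pitches, previous_pitches, mode="mean"):
    """Pitch-period deviation between the current and the previous frame.

    current_pitches / previous_pitches: pitch periods (in samples) of the
    corresponding subframes; a single-element list represents a whole frame.
    Assumption: the per-subframe deviation is the absolute difference.
    """
    if not current_pitches or len(current_pitches) != len(previous_pitches):
        raise ValueError("need matching, non-empty lists of pitch periods")
    deviations = [abs(c - p) for c, p in zip(current_pitches, previous_pitches)]
    if mode == "sum":
        return sum(deviations)
    if mode == "mean":
        return sum(deviations) / len(deviations)
    raise ValueError("mode must be 'sum' or 'mean'")
```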
  • Implementation 9 differs from the foregoing implementation in that the foregoing implementation determines the ITD value of the current frame based on the correlation parameter and/or the peak-to-average ratio parameter, whereas Implementation 9 determines it based on the correlation parameter and/or the spectral tilt parameter.
  • The second preset condition may be: the correlation value in the correlation parameter of the current frame and the previous frame is greater than a threshold, and/or the spectral tilt value in the spectral tilt parameter is less than a threshold. (It should be understood that a larger spectral tilt value indicates a weaker voiced character of the signal, and a smaller spectral tilt value indicates a stronger voiced character.)
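The "and/or" form of this second preset condition can be written as a small predicate. The sketch below is only illustrative: the default thresholds `0.8` and `0.0` are placeholders chosen for the example (the text gives no values), and `require_both` is a hypothetical switch between the "and" and the "or" reading.

```python
def meets_second_condition(correlation, spectral_tilt,
                           corr_threshold=0.8, tilt_threshold=0.0,
                           require_both=True):
    """Check the second preset condition of Implementation 9.

    correlation:   correlation value between current and previous frame
    spectral_tilt: spectral tilt value of the current frame
    Thresholds are placeholder values, not taken from the application.
    """
    corr_ok = correlation > corr_threshold    # strongly correlated frames
    tilt_ok = spectral_tilt < tilt_threshold  # small tilt: strongly voiced
    if require_both:
        return corr_ok and tilt_ok
    return corr_ok or tilt_ok
```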
  • Implementation 10 differs from the foregoing implementation in that the foregoing implementation calculates the ITD value of the current frame, whereas Implementation 10 calculates the IPD value of the current frame. It should be understood that the ITD-related calculations in steps 710-770 need to be replaced with the corresponding IPD-related calculations.
  • For the calculation of the IPD value, refer to the prior art; it is not described in detail here.
  • FIG. 8 is a schematic block diagram of an encoder according to an embodiment of the present application.
  • the encoder 800 of Figure 8 includes:
  • an obtaining unit 810, configured to obtain a multi-channel signal of a current frame;
  • a first determining unit 820, configured to determine an initial multi-channel parameter of the current frame;
  • a second determining unit 830, configured to determine a difference parameter according to the initial multi-channel parameter of the current frame and the multi-channel parameters of the previous K frames of the current frame, where the difference parameter is used to characterize the difference between the initial multi-channel parameter of the current frame and the multi-channel parameters of the previous K frames, and K is an integer greater than or equal to 1;
  • a third determining unit 840, configured to determine a multi-channel parameter of the current frame according to the difference parameter and a feature parameter of the current frame; and
  • an encoding unit 850, configured to encode the multi-channel signal according to the multi-channel parameter of the current frame.
  • The multi-channel parameter of the current frame is determined after jointly considering the difference between the current frame and the previous K frames and the feature parameter of the current frame. This is more reasonable than having the current frame directly reuse the multi-channel parameter of the previous frame, and better ensures the accuracy of the inter-channel information of the multi-channel signal.
  • The third determining unit 840 is specifically configured to determine the multi-channel parameter of the current frame according to the feature parameter of the current frame when the difference parameter meets a first preset condition.
  • The difference parameter is the absolute value of the difference between the initial multi-channel parameter of the current frame and the multi-channel parameter of the previous frame of the current frame, and the first preset condition is that the difference parameter is greater than a preset first threshold.
  • Alternatively, the difference parameter is the product of the initial multi-channel parameter of the current frame and the multi-channel parameter of the previous frame of the current frame, and the first preset condition is that the difference parameter is less than or equal to 0.
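The two definitions of the difference parameter lead to two simple checks: the absolute-difference form detects a large jump, and the product form detects a sign change (or a zero) between consecutive frames. A minimal sketch, with hypothetical function names:

```python
def abs_difference_triggered(initial_value, previous_value, first_threshold):
    """First preset condition, absolute-difference form:
    |initial - previous| > preset first threshold."""
    return abs(initial_value - previous_value) > first_threshold


def sign_flip_triggered(initial_value, previous_value):
    """First preset condition, product form:
    initial * previous <= 0, i.e. the two values differ in sign
    or at least one of them is zero."""
    return initial_value * previous_value <= 0
```

For example, with an ITD of 3 samples in the previous frame and an initial ITD of -2 in the current frame, the product is -6, so the product form of the condition is met.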
  • The third determining unit 840 is specifically configured to determine the multi-channel parameter of the current frame according to a correlation parameter of the current frame, where the correlation parameter is used to characterize the degree of correlation between the current frame and the previous frame of the current frame.
  • The third determining unit 840 is specifically configured to determine the multi-channel parameter of the current frame according to a peak-to-average ratio parameter of the current frame, where the peak-to-average ratio parameter is used to characterize the peak-to-average ratio of the signal of at least one channel in the multi-channel signal of the current frame.
  • The third determining unit 840 is specifically configured to determine the multi-channel parameter of the current frame according to both the correlation parameter and the peak-to-average ratio parameter of the current frame, where the correlation parameter is used to characterize the degree of correlation between the current frame and the previous frame of the current frame, and the peak-to-average ratio parameter is used to characterize the peak-to-average ratio of the signal of at least one channel in the multi-channel signal of the current frame.
  • the encoder further includes:
  • a fourth determining unit configured to determine the correlation parameter according to the target channel signal in the multi-channel signal of the current frame and the target channel signal in the multi-channel signal of the previous frame.
  • The fourth determining unit is specifically configured to determine the correlation parameter according to a frequency-domain parameter of the target channel signal in the multi-channel signal of the current frame and a frequency-domain parameter of the target channel signal in the multi-channel signal of the previous frame, where the frequency-domain parameter is at least one of a frequency-domain amplitude value and a frequency-domain coefficient of the target channel signal.
  • the encoder further includes:
  • a fifth determining unit configured to determine the correlation parameter according to a pitch period of the current frame and a pitch period of the previous frame.
  • The third determining unit 840 is specifically configured to determine the multi-channel parameter of the current frame according to the multi-channel parameters of the previous T frames of the current frame when the feature parameter meets a second preset condition, where T is an integer greater than or equal to 1.
  • The third determining unit 840 is specifically configured to determine the multi-channel parameter of the previous T frames as the multi-channel parameter of the current frame, where T is equal to 1.
  • The third determining unit 840 is specifically configured to determine the multi-channel parameter of the current frame according to the trend of the multi-channel parameters of the previous T frames, where T is greater than or equal to 2.
  • The feature parameter includes at least one of a correlation parameter and a peak-to-average ratio parameter of the current frame, where the correlation parameter is used to characterize the degree of correlation between the current frame and the previous frame of the current frame, the peak-to-average ratio parameter is used to characterize the peak-to-average ratio of the signal of at least one channel in the multi-channel signal of the current frame, and the second preset condition is that the feature parameter is greater than a preset threshold.
  • The initial multi-channel parameter of the current frame includes at least one of the following: an initial inter-channel coherence (IC) value of the current frame, an initial inter-channel time difference (ITD) value of the current frame, an initial inter-channel phase difference (IPD) value of the current frame, an initial overall phase difference (OPD) value of the current frame, and an initial inter-channel level difference (ILD) value of the current frame.
  • The feature parameter of the current frame includes at least one of the following parameters of the current frame: a correlation parameter, a peak-to-average ratio parameter, a signal-to-noise ratio parameter, and a spectral tilt parameter. The correlation parameter is used to characterize the degree of correlation between the current frame and the previous frame; the peak-to-average ratio parameter is used to characterize the peak-to-average ratio of the signal of at least one channel in the multi-channel signal of the current frame; the signal-to-noise ratio parameter is used to characterize the signal-to-noise ratio of the signal of at least one channel in the multi-channel signal of the current frame; and the spectral tilt parameter is used to characterize the degree of spectral tilt of the signal of at least one channel in the multi-channel signal of the current frame.
  • FIG. 9 is a schematic block diagram of an encoder according to an embodiment of the present application.
  • the encoder 900 of Figure 9 includes:
  • a memory 910 configured to store a program
  • a processor 920, configured to execute the program. When the program is executed, the processor 920 is configured to: obtain a multi-channel signal of a current frame; determine an initial multi-channel parameter of the current frame; determine a difference parameter according to the initial multi-channel parameter of the current frame and the multi-channel parameters of the previous K frames of the current frame, where the difference parameter is used to characterize the difference between the initial multi-channel parameter of the current frame and the multi-channel parameters of the previous K frames, and K is an integer greater than or equal to 1; determine a multi-channel parameter of the current frame according to the difference parameter and a feature parameter of the current frame; and encode the multi-channel signal according to the multi-channel parameter of the current frame.
  • The multi-channel parameter of the current frame is determined after jointly considering the difference between the current frame and the previous K frames and the feature parameter of the current frame. This is more reasonable than having the current frame directly reuse the multi-channel parameter of the previous frame, and better ensures the accuracy of the inter-channel information of the multi-channel signal.
  • The processor 920 is specifically configured to determine the multi-channel parameter of the current frame according to the feature parameter of the current frame when the difference parameter meets a first preset condition.
  • The difference parameter is the absolute value of the difference between the initial multi-channel parameter of the current frame and the multi-channel parameter of the previous frame of the current frame, and the first preset condition is that the difference parameter is greater than a preset first threshold.
  • Alternatively, the difference parameter is the product of the initial multi-channel parameter of the current frame and the multi-channel parameter of the previous frame of the current frame, and the first preset condition is that the difference parameter is less than or equal to 0.
  • The processor 920 is specifically configured to determine the multi-channel parameter of the current frame according to a correlation parameter of the current frame, where the correlation parameter is used to characterize the degree of correlation between the current frame and the previous frame of the current frame.
  • The processor 920 is specifically configured to determine the multi-channel parameter of the current frame according to a peak-to-average ratio parameter of the current frame, where the peak-to-average ratio parameter is used to characterize the peak-to-average ratio of the signal of at least one channel in the multi-channel signal of the current frame.
  • The processor 920 is specifically configured to determine the multi-channel parameter of the current frame according to both the correlation parameter and the peak-to-average ratio parameter of the current frame, where the correlation parameter is used to characterize the degree of correlation between the current frame and the previous frame of the current frame, and the peak-to-average ratio parameter is used to characterize the peak-to-average ratio of the signal of at least one channel in the multi-channel signal of the current frame.
  • The processor 920 is further configured to determine the correlation parameter according to the target channel signal in the multi-channel signal of the current frame and the target channel signal in the multi-channel signal of the previous frame.
  • The processor 920 is specifically configured to determine the correlation parameter according to a frequency-domain parameter of the target channel signal in the multi-channel signal of the current frame and a frequency-domain parameter of the target channel signal in the multi-channel signal of the previous frame, where the frequency-domain parameter is the frequency-domain amplitude value of the target channel signal.
  • The processor 920 is specifically configured to determine the correlation parameter according to a frequency-domain parameter of the target channel signal in the multi-channel signal of the current frame and a frequency-domain parameter of the target channel signal in the multi-channel signal of the previous frame, where the frequency-domain parameter is the frequency-domain coefficient of the target channel signal.
  • The processor 920 is specifically configured to determine the correlation parameter according to a frequency-domain parameter of the target channel signal in the multi-channel signal of the current frame and a frequency-domain parameter of the target channel signal in the multi-channel signal of the previous frame, where the frequency-domain parameter is the frequency-domain amplitude value and the frequency-domain coefficient of the target channel signal.
  • the processor 920 is further configured to determine the correlation parameter according to a pitch period of the current frame and a pitch period of the previous frame.
  • The processor 920 is specifically configured to determine the multi-channel parameter of the current frame according to the multi-channel parameters of the previous T frames of the current frame when the feature parameter meets a second preset condition, where T is an integer greater than or equal to 1.
  • The processor 920 is specifically configured to determine the multi-channel parameter of the previous T frames as the multi-channel parameter of the current frame, where T is equal to 1.
  • The processor 920 is specifically configured to determine the multi-channel parameter of the current frame according to the trend of the multi-channel parameters of the previous T frames, where T is greater than or equal to 2.
  • The feature parameter includes at least one of a correlation parameter and a peak-to-average ratio parameter of the current frame, where the correlation parameter is used to characterize the degree of correlation between the current frame and the previous frame of the current frame, the peak-to-average ratio parameter is used to characterize the peak-to-average ratio of the signal of at least one channel in the multi-channel signal of the current frame, and the second preset condition is that the feature parameter is greater than a preset threshold.
  • The initial multi-channel parameter of the current frame includes at least one of the following: an initial inter-channel coherence (IC) value of the current frame, an initial inter-channel time difference (ITD) value of the current frame, an initial inter-channel phase difference (IPD) value of the current frame, an initial overall phase difference (OPD) value of the current frame, and an initial inter-channel level difference (ILD) value of the current frame.
  • The feature parameter of the current frame includes at least one of the following parameters of the current frame: a correlation parameter, a peak-to-average ratio parameter, a signal-to-noise ratio parameter, and a spectral tilt parameter. The correlation parameter is used to characterize the degree of correlation between the current frame and the previous frame; the peak-to-average ratio parameter is used to characterize the peak-to-average ratio of the signal of at least one channel in the multi-channel signal of the current frame; the signal-to-noise ratio parameter is used to characterize the signal-to-noise ratio of the signal of at least one channel in the multi-channel signal of the current frame; and the spectral tilt parameter is used to characterize the degree of spectral tilt of the signal of at least one channel in the multi-channel signal of the current frame.
  • the disclosed systems, devices, and methods may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division of the unit is only a logical function division.
  • There may be other division manners in practice; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • If the functions are implemented in the form of software functional units and sold or used as standalone products, they may be stored in a computer-readable storage medium. Based on such an understanding, the technical solution of the present application, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product.
  • the computer software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, server, or network device, etc.) to perform all or part of the steps of the methods described in various embodiments of the present application.
  • The foregoing storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, and the like.

Abstract

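A multi-channel signal encoding method and an encoder. The encoding method includes: obtaining a multi-channel signal of a current frame (510); determining an initial multi-channel parameter of the current frame (520); determining a difference parameter according to the initial multi-channel parameter of the current frame and the multi-channel parameters of the previous K frames of the current frame (530), where the difference parameter is used to characterize the difference between the initial multi-channel parameter of the current frame and the multi-channel parameters of the previous K frames, and K is an integer greater than or equal to 1; determining a multi-channel parameter of the current frame according to the difference parameter and a feature parameter of the current frame (540); and encoding the multi-channel signal according to the multi-channel parameter of the current frame (550). The method better ensures the accuracy of the inter-channel information of the multi-channel signal.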

Description

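Multi-channel signal encoding method and encoder
This application claims priority to Chinese Patent Application No. 201610652506.X, filed with the Chinese Patent Office on August 10, 2016 and entitled "Multi-channel signal encoding method and encoder", which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the field of audio signal encoding, and more specifically, to a multi-channel signal encoding method and an encoder.
Background
As quality of life improves, demand for high-quality audio keeps growing. Compared with a mono signal, stereo conveys the direction and spatial distribution of each sound source and improves clarity, intelligibility, and the sense of presence, and is therefore widely preferred.
The main stereo processing techniques are mid/side (MS) encoding, intensity stereo (IS) encoding, and parametric stereo (PS) encoding.
MS encoding applies a sum/difference transform to the two channels based on inter-channel correlation. Most of the energy is concentrated in the sum channel, so inter-channel redundancy is removed. In MS encoding, the bit-rate saving depends on the correlation of the input signals; when the left-channel and right-channel signals are weakly correlated, the left-channel signal and the right-channel signal need to be transmitted separately.
IS encoding simplifies the high-frequency components of the left and right signals, exploiting the fact that the human auditory system is insensitive to phase differences in the high-frequency components of the channels (for example, components above 2 kHz). However, IS encoding is effective only for high-frequency components; extending it to low frequencies causes severe artifacts.
PS encoding is based on a binaural auditory model. As shown in FIG. 1 (where xL is the left-channel time-domain signal and xR is the right-channel time-domain signal), during PS encoding the encoder converts the stereo signal into a mono signal and a small number of spatial parameters (also called spatial perception parameters) that describe the spatial sound field. As shown in FIG. 2, after obtaining the mono signal and the spatial parameters, the decoder restores the stereo signal by using the spatial parameters. Compared with MS encoding, PS encoding has a higher compression ratio and can therefore achieve a higher coding gain while maintaining good sound quality. In addition, PS encoding can operate over the full audio bandwidth and reproduces the spatial perception of stereo well.
In PS encoding, the multi-channel parameters (also called spatial parameters) include inter-channel coherence (IC), inter-channel level difference (ILD), inter-channel time difference (ITD), overall phase difference (OPD), and inter-channel phase difference (IPD). IC describes the cross-correlation or coherence between channels; it determines the perceived extent of the sound field and can improve the sense of space and the acoustic stability of the audio signal. ILD is used to distinguish the horizontal azimuth of a stereo source and describes the energy difference between channels; it affects the frequency components of the entire spectrum. ITD and IPD are spatial parameters that represent the horizontal direction of the sound source and describe the time and phase differences between channels. ILD, ITD, and IPD determine how the human ear perceives the position of a sound source, can effectively determine the sound-field position, and play an important role in restoring the stereo signal.
During stereo recording, factors such as background noise, reverberation, and several people speaking at once often make the multi-channel parameters calculated by existing PS encoding unstable (the parameter values jump back and forth). If the downmix signal is calculated from such multi-channel parameters, the downmix signal becomes discontinuous, and the stereo quality obtained at the decoder is poor: for example, the stereo image played at the decoder shifts frequently, or playback even sounds as if it stutters.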
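Summary
This application provides a multi-channel signal encoding method and an encoder, to improve the stability of the multi-channel parameters in PS encoding and thereby improve the encoding quality of the audio signal.
According to a first aspect, a multi-channel signal encoding method is provided, including:
obtaining a multi-channel signal of a current frame;
determining an initial multi-channel parameter of the current frame;
determining a difference parameter according to the initial multi-channel parameter of the current frame and the multi-channel parameters of the previous K frames of the current frame, where the difference parameter is used to characterize the difference between the initial multi-channel parameter of the current frame and the multi-channel parameters of the previous K frames, and K is an integer greater than or equal to 1;
determining a multi-channel parameter of the current frame according to the difference parameter and a feature parameter of the current frame; and
encoding the multi-channel signal according to the multi-channel parameter of the current frame.
The multi-channel parameter of the current frame is determined after jointly considering the difference between the current frame and the previous K frames and the feature parameter of the current frame. This is more reasonable than having the current frame directly reuse the multi-channel parameter of the previous frame, and better ensures the accuracy of the inter-channel information of the multi-channel signal.
With reference to the first aspect, in some implementations of the first aspect, the determining a multi-channel parameter of the current frame according to the difference parameter and a feature parameter of the current frame includes:
when the difference parameter meets a first preset condition, determining the multi-channel parameter of the current frame according to the feature parameter of the current frame.
With reference to the first aspect, in some implementations of the first aspect, the difference parameter is the absolute value of the difference between the initial multi-channel parameter of the current frame and the multi-channel parameter of the previous frame of the current frame, and the first preset condition is that the difference parameter is greater than a preset first threshold.
With reference to the first aspect, in some implementations of the first aspect, the difference parameter is the product of the initial multi-channel parameter of the current frame and the multi-channel parameter of the previous frame of the current frame, and the first preset condition is that the difference parameter is less than or equal to 0.
With reference to the first aspect, in some implementations of the first aspect, the determining the multi-channel parameter of the current frame according to the feature parameter of the current frame includes:
determining the multi-channel parameter of the current frame according to a correlation parameter of the current frame, where the correlation parameter is used to characterize the degree of correlation between the current frame and the previous frame of the current frame.
With reference to the first aspect, in some implementations of the first aspect, the method further includes:
determining the correlation parameter according to a target channel signal in the multi-channel signal of the current frame and a target channel signal in the multi-channel signal of the previous frame.
With reference to the first aspect, in some implementations of the first aspect, the determining the correlation parameter according to a target channel signal in the multi-channel signal of the current frame and a target channel signal in the multi-channel signal of the previous frame includes:
determining the correlation parameter according to a frequency-domain parameter of the target channel signal in the multi-channel signal of the current frame and a frequency-domain parameter of the target channel signal in the multi-channel signal of the previous frame, where the frequency-domain parameter is at least one of a frequency-domain amplitude value and a frequency-domain coefficient of the target channel signal.
With reference to the first aspect, in some implementations of the first aspect, the method further includes:
determining the correlation parameter according to a pitch period of the current frame and a pitch period of the previous frame.
With reference to the first aspect, in some implementations of the first aspect, the determining the multi-channel parameter of the current frame according to the feature parameter of the current frame includes:
when the feature parameter meets a second preset condition, determining the multi-channel parameter of the current frame according to the multi-channel parameters of the previous T frames of the current frame, where T is an integer greater than or equal to 1.
With reference to the first aspect, in some implementations of the first aspect, the determining the multi-channel parameter of the current frame according to the multi-channel parameters of the previous T frames of the current frame includes:
determining the multi-channel parameter of the previous T frames as the multi-channel parameter of the current frame, where T is equal to 1.
With reference to the first aspect, in some implementations of the first aspect, the determining the multi-channel parameter of the current frame according to the multi-channel parameters of the previous T frames of the current frame includes:
determining the multi-channel parameter of the current frame according to the trend of the multi-channel parameters of the previous T frames, where T is greater than or equal to 2.
With reference to the first aspect, in some implementations of the first aspect, the feature parameter includes at least one of a correlation parameter and a peak-to-average ratio parameter of the current frame, where the correlation parameter is used to characterize the degree of correlation between the current frame and the previous frame of the current frame, the peak-to-average ratio parameter is used to characterize the peak-to-average ratio of the signal of at least one channel in the multi-channel signal of the current frame, and the second preset condition is that the feature parameter is greater than a preset threshold.
With reference to the first aspect, in some implementations of the first aspect, the initial multi-channel parameter of the current frame includes at least one of the following: an initial inter-channel coherence (IC) value of the current frame, an initial inter-channel time difference (ITD) value of the current frame, an initial inter-channel phase difference (IPD) value of the current frame, an initial overall phase difference (OPD) value of the current frame, and an initial inter-channel level difference (ILD) value of the current frame.
With reference to the first aspect, in some implementations of the first aspect, the feature parameter of the current frame includes at least one of the following parameters of the current frame: a correlation parameter, a peak-to-average ratio parameter, a signal-to-noise ratio parameter, and a spectral tilt parameter, where the correlation parameter is used to characterize the degree of correlation between the current frame and the previous frame, the peak-to-average ratio parameter is used to characterize the peak-to-average ratio of the signal of at least one channel in the multi-channel signal of the current frame, the signal-to-noise ratio parameter is used to characterize the signal-to-noise ratio of the signal of at least one channel in the multi-channel signal of the current frame, and the spectral tilt parameter is used to characterize the degree of spectral tilt of the signal of at least one channel in the multi-channel signal of the current frame.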
第二方面,提供一种编码器,包括:
获取单元,用于获取当前帧的多声道信号;
第一确定单元,用于确定所述当前帧的初始多声道参数;
第二确定单元,用于根据所述当前帧的初始多声道参数,以及所述当前帧的前K帧的多声道参数,确定差异参数,所述差异参数用于表征所述当前帧的初始多声道参数与所述前K帧的多声道参数的差异,其中,K为大于或等于1的整数;
第三确定单元,用于根据所述差异参数和所述当前帧的特征参数,确定所述当前帧的多声道参数;
编码单元,用于根据所述当前帧的多声道参数对所述多声道信号进行编码。
当前帧的多声道参数是在综合考虑了当前帧与前K帧之间的差异以及当前帧的特征参数之后确定的,这样的确定方式更加合理,与当前帧直接复用前一帧的多声道参数的方式相比,能够更好地保证多声道信号的声道间信息的准确性。
结合第二方面,在第二方面的某些实现方式中,所述第三确定单元具体用于在所述 差异参数满足第一预设条件的情况下,根据所述当前帧的特征参数,确定所述当前帧的多声道参数。
结合第二方面,在第二方面的某些实现方式中,所述差异参数为所述当前帧的初始多声道参数和所述当前帧的前一帧的多声道参数的差值的绝对值,所述第一预设条件为所述差异参数大于预设的第一阈值。
结合第二方面,在第二方面的某些实现方式中,所述差异参数为所述当前帧的初始多声道参数和所述当前帧的前一帧的多声道参数的乘积,所述第一预设条件为所述差异参数小于或等于0。
结合第二方面,在第二方面的某些实现方式中,所述第三确定单元具体用于根据所述当前帧的相关性参数,确定所述当前帧的多声道参数,其中,所述相关性参数用于表征所述当前帧与所述当前帧的前一帧的相关程度。
结合第二方面,在第二方面的某些实现方式中,所述编码器还包括:
第四确定单元,用于根据所述当前帧的多声道信号中的目标声道信号,以及所述前一帧的多声道信号中的目标声道信号,确定所述相关性参数。
结合第二方面,在第二方面的某些实现方式中,所述第四确定单元具体用于根据所述当前帧的多声道信号中的目标声道信号的频域参数,以及所述前一帧的多声道信号中的目标声道信号的频域参数,确定所述相关性参数,所述频域参数为所述目标声道信号的频域幅度值和频域系数中的至少一个。
结合第二方面,在第二方面的某些实现方式中,所述编码器还包括:
第五确定单元,用于根据所述当前帧的基音周期,以及所述前一帧的基音周期,确定所述相关性参数。
结合第二方面,在第二方面的某些实现方式中,所述第三确定单元具体用于在所述特征参数满足第二预设条件的情况下,根据所述当前帧的前T帧的多声道参数,确定所述当前帧的多声道参数,T为大于或等于1的整数。
结合第二方面,在第二方面的某些实现方式中,所述第三确定单元具体用于将所述前T帧的多声道参数确定为所述当前帧的多声道参数,其中,T等于1。
结合第二方面,在第二方面的某些实现方式中,所述第三确定单元具体用于根据所述前T帧的多声道参数的变化趋势,确定所述当前帧的多声道参数,其中,T大于或等于2。
结合第二方面,在第二方面的某些实现方式中,所述特征参数包括所述当前帧的相关性参数和峰均比参数中的至少一个,所述相关性参数用于表征所述当前帧与所述当前帧的前一帧的相关程度,所述峰均比参数用于表征所述当前帧的多声道信号中的至少一个声道的信号的峰均比,所述第二预设条件为所述特征参数大于预设阈值。
结合第二方面,在第二方面的某些实现方式中,所述当前帧的初始多声道参数包括以下中的至少一种:所述当前帧的初始声道间相关性IC值,所述当前帧的初始声道间时间差ITD值,所述当前帧的初始声道间相位差IPD值,当前帧的初始整体相位差OPD值,以及所述当前帧的初始声道间电平差ILD值。
结合第二方面,在第二方面的某些实现方式中,所述当前帧的特征参数包括所述当前帧的以下中的至少一种:相关性参数,峰均比参数,信噪比参数,以及谱倾斜参数,所述相关性参数用于表征所述当前帧与所述前一帧的相关程度,所述峰均比参数用于表 征所述当前帧的多声道信号中的至少一个声道的信号的峰均比,所述信噪比参数用于表征所述当前帧的多声道信号中的至少一个声道的信号的信噪比,所述谱倾斜参数用于表征所述当前帧的多声道信号中的至少一个声道的信号的频谱倾斜程度。
第三方面,提供一种编码器,包括存储器和处理器,所述存储器用于存储程序,所述处理器用于执行程序,当所述程序被执行时,所述处理器执行第一方面中的方法。
第四方面,提供一种计算机可读介质,所述计算机可读介质存储用于编码器执行的程序代码,所述程序代码包括用于执行第一方面中的方法的指令。
本申请中,当前帧的多声道参数是在综合考虑了当前帧与前K帧之间的差异以及当前帧的特征参数之后确定的,这样的确定方式更加合理,与当前帧直接复用前一帧的多声道参数的方式相比,能够更好地保证多声道信号的声道间信息的准确性。
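Brief Description of Drawings
FIG. 1 is a flowchart of PS encoding in the prior art.
FIG. 2 is a flowchart of PS decoding in the prior art.
FIG. 3 is an example flowchart of a time-domain ITD parameter extraction method in the prior art.
FIG. 4 is an example flowchart of a frequency-domain ITD parameter extraction method in the prior art.
FIG. 5 is a schematic flowchart of a multi-channel signal encoding method according to an embodiment of this application.
FIG. 6 is a detailed flowchart of step 540 in FIG. 5.
FIG. 7 is a schematic flowchart of a multi-channel signal encoding method according to an embodiment of this application.
FIG. 8 is a schematic block diagram of an encoder according to an embodiment of this application.
FIG. 9 is a schematic structural diagram of an encoder according to an embodiment of this application.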
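Description of Embodiments
It should be noted that a stereo signal may also be referred to as a multi-channel signal. The roles and meanings of the multi-channel parameters ILD, ITD, and IPD were briefly introduced above. For ease of understanding, the following describes ILD, ITD, and IPD in more detail, taking the signal picked up by a first microphone as the first-channel signal and the signal picked up by a second microphone as the second-channel signal.
ILD describes the energy difference between the first-channel signal and the second-channel signal. It is generally calculated as the ratio of the energies of the left and right channels and then converted to the logarithmic domain. For example, an ILD value greater than 0 may indicate that the energy of the first-channel signal is higher than that of the second-channel signal; an ILD value equal to 0, that the two energies are equal; and an ILD value less than 0, that the energy of the first-channel signal is lower than that of the second-channel signal. Under the opposite convention, an ILD value less than 0 indicates that the energy of the first-channel signal is higher than that of the second-channel signal; an ILD value equal to 0, that the two energies are equal; and an ILD value greater than 0, that the energy of the first-channel signal is lower than that of the second-channel signal. It should be understood that these values are only examples; the relationship between the ILD value and the energy difference between the first-channel signal and the second-channel signal may be defined based on experience or actual requirements.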
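The energy-ratio-then-logarithm calculation can be sketched as below. This is a minimal illustration, not the codec's definition: the `10 * log10` scaling (decibels), the first-over-second orientation of the ratio (so that a positive ILD means the first channel has more energy), and the small `eps` guard against silent frames are all assumptions.

```python
import math


def ild_db(first_channel, second_channel, eps=1e-12):
    """Inter-channel level difference of one frame, in dB.

    Assumed convention: ILD = 10 * log10(E_first / E_second), so ILD > 0
    means the first channel carries more energy than the second.
    eps avoids division by zero and log(0) on silent frames.
    """
    energy_first = sum(x * x for x in first_channel)
    energy_second = sum(x * x for x in second_channel)
    return 10.0 * math.log10((energy_first + eps) / (energy_second + eps))
```

Swapping the two arguments gives the opposite sign convention mentioned in the text.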
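ITD describes the time difference between the first-channel signal and the second-channel signal, that is, the difference between the times at which the sound produced by a source reaches the first microphone and the second microphone. For example, an ITD value greater than 0 may indicate that the sound reaches the first microphone earlier than it reaches the second microphone; an ITD value equal to 0, that the sound reaches both microphones at the same time; and an ITD value less than 0, that the sound reaches the first microphone later than it reaches the second microphone. Under the opposite convention, an ITD value less than 0 indicates that the sound reaches the first microphone earlier than it reaches the second microphone; an ITD value equal to 0, that the sound reaches both microphones at the same time; and an ITD value greater than 0, that the sound reaches the first microphone later than it reaches the second microphone. It should be understood that these values are only examples; the relationship between the ITD value and the time difference between the first-channel signal and the second-channel signal may be defined based on experience or actual requirements.
IPD describes the phase difference between the first-channel signal and the second-channel signal. This parameter is usually combined with ITD and used by the decoder to restore the phase information of the multi-channel signal.
As described above, the existing way of calculating multi-channel parameters causes the parameters to be discontinuous. For ease of understanding, the following describes the existing calculation and its drawbacks in detail with reference to FIG. 3 and FIG. 4, taking left-channel and right-channel signals as the multi-channel signal and the ITD value as the multi-channel parameter.
In the prior art, the ITD value can be calculated in several ways; for example, it can be calculated in the time domain or in the frequency domain.
FIG. 3 is an example flowchart of a time-domain ITD value calculation method. The method in FIG. 3 includes:
310. Calculate the ITD value based on the left-channel and right-channel time-domain signals.
Specifically, the ITD parameter may be calculated from the left-channel and right-channel time-domain signals by using a time-domain cross-correlation function. For example, in the range 0≤i≤Tmax, calculate:
Figure PCTCN2017074419-appb-000001
Figure PCTCN2017074419-appb-000002
If
Figure PCTCN2017074419-appb-000003
then T1 is the negative of the index corresponding to max(Cn(i)); otherwise, T1 is the index corresponding to max(Cp(i)). Here, i is the index used in calculating the cross-correlation function, xR is the right-channel time-domain signal, xL is the left-channel time-domain signal, Tmax corresponds to the maximum ITD value at the given sampling rate, and Length is the frame length.
320. Quantize the ITD value.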
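The formula images for Cn(i), Cp(i), and the comparison are not reproduced in this text, so the following is only a sketch of step 310 under explicit assumptions: Cn(i) is taken to be the plain (unnormalized) correlation of xR with xL advanced by i samples, Cp(i) the correlation of xL with xR advanced by i samples, and the missing condition is taken to be max(Cn) > max(Cp). The function name `time_domain_itd` is hypothetical, and quantization (step 320) is omitted.

```python
def time_domain_itd(x_left, x_right, t_max):
    """Time-domain ITD estimate T1 for one frame (illustrative sketch).

    Assumed forms (the original formula images are unavailable):
      Cn(i) = sum_j xR[j] * xL[j + i]
      Cp(i) = sum_j xL[j] * xR[j + i],   0 <= i <= t_max
    T1 = -argmax Cn if max(Cn) > max(Cp), else argmax Cp.
    """
    length = len(x_left)
    if len(x_right) != length or not 0 <= t_max < length:
        raise ValueError("channels must have equal length and t_max < length")

    def xcorr(a, b, lag):
        # Correlate a with b advanced by `lag` samples, inside the frame.
        return sum(a[j] * b[j + lag] for j in range(length - lag))

    c_n = [xcorr(x_right, x_left, i) for i in range(t_max + 1)]
    c_p = [xcorr(x_left, x_right, i) for i in range(t_max + 1)]
    best_n = max(range(t_max + 1), key=c_n.__getitem__)
    best_p = max(range(t_max + 1), key=c_p.__getitem__)
    if c_n[best_n] > c_p[best_p]:
        return -best_n
    return best_p
```

Under these assumptions, a right channel that lags the left channel by two samples yields T1 = 2, and the mirrored case yields T1 = -2; as the text notes, which sign means "earlier" is a matter of convention.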
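FIG. 4 is an example flowchart of a frequency-domain ITD value calculation method. The method in FIG. 4 includes:
410. Perform a time-frequency transform on the left-channel and right-channel time-domain signals to obtain left-channel and right-channel frequency-domain signals.
Specifically, the time-frequency transform may use techniques such as the discrete Fourier transform (DFT) or the modified discrete cosine transform (MDCT) to transform the time-domain signal into a frequency-domain signal.
For example, for the input left-channel and right-channel time-domain signals, the time-frequency transform may be a DFT; specifically, the DFT may be performed by using the following formula.
Figure PCTCN2017074419-appb-000004
Here, n is the index of a sample of the time-domain signal, k is the index of a frequency bin of the frequency-domain signal, and L is the length of the time-frequency transform. x(n) is the left-channel or right-channel time-domain signal.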
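The formula image is not reproduced here. As an illustration that uses the same symbols, the textbook DFT, X(k) = sum over n from 0 to L-1 of x(n)·exp(-j·2π·n·k/L), can be written directly as below. The absence of a normalization factor and of any analysis window is an assumption; the application's exact formula may differ.

```python
import cmath


def dft(x):
    """Direct O(L^2) discrete Fourier transform of one channel.

    x: time-domain samples x(n), n = 0..L-1 (left or right channel).
    Returns X(k) for k = 0..L-1, with no normalization (assumed).
    """
    length = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / length)
            for n in range(length))
        for k in range(length)
    ]
```

A production encoder would use an FFT; the direct sum is shown only because it mirrors the definition term by term.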
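420. Calculate the ITD value based on the left-channel and right-channel frequency-domain signals.
Specifically, the L frequency bins of the frequency-domain signal may be divided into multiple subbands; the b-th subband contains the frequency bins A_{b-1} ≤ k ≤ A_b − 1. Within the search range −Tmax ≤ j ≤ Tmax, the amplitude may be calculated by using the following formula:
Figure PCTCN2017074419-appb-000005
The ITD value of the b-th subband may then be
Figure PCTCN2017074419-appb-000006
that is, the index of the sample corresponding to the maximum value calculated by the foregoing formula.
430. Quantize the ITD value.
In the prior art, if the peak of the cross-correlation coefficient of the multi-channel signal of the current frame is small, the calculated ITD value is considered inaccurate, and the ITD value of the current frame is set to zero. Because of background noise, reverberation, several people speaking at once, and similar factors, the ITD value calculated by existing PS encoding is frequently set to zero, so the ITD value jumps back and forth. The downmix signal calculated from such ITD values is discontinuous between frames, which degrades the perceived quality of the multi-channel signal.
To address the jumping of the multi-channel parameters, one feasible approach is as follows: when the calculated multi-channel parameter of the current frame is considered inaccurate, the multi-channel parameter of the previous frame of the current frame is reused. This approach resolves the jumping well, but it may cause the following problem. If the signal quality in the current frame is good, the calculated multi-channel parameter of the current frame is generally fairly accurate. If the foregoing approach is still applied in this case, the current frame may still reuse the multi-channel parameter of the previous frame and discard its own, more accurate parameter, which makes the inter-channel information of the multi-channel signal inaccurate.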
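The following describes in detail, with reference to FIG. 5 and FIG. 6, the audio signal encoding method according to embodiments of this application.
FIG. 5 is a schematic flowchart of a multi-channel signal encoding method according to an embodiment of this application. The method in FIG. 5 includes:
510. Obtain a multi-channel signal of a current frame.
It should be noted that the number of channels in the multi-channel signal is not specifically limited in the embodiments of this application. The multi-channel signal may be a two-channel signal, a three-channel signal, or a signal with more than three channels. For example, the multi-channel signal may include a left-channel signal and a right-channel signal. As another example, the multi-channel signal may include a left-channel signal, a center-channel signal, a right-channel signal, and a rear-channel signal.
520. Determine an initial multi-channel parameter of the current frame.
In some embodiments, the initial multi-channel parameter of the current frame may be used to characterize the correlation between the channel signals of the multi-channel signal.
In some embodiments, the initial multi-channel parameter of the current frame includes at least one of the following: an initial IC value of the current frame, an initial ITD value of the current frame, an initial IPD value of the current frame, an initial OPD value of the current frame, and an initial ILD value of the current frame.
The initial multi-channel parameter of the current frame may be calculated in several ways; for details, refer to the prior art. Taking the ITD value as the multi-channel parameter, step 520 may use the time-domain ITD calculation shown in FIG. 3, the frequency-domain ITD calculation shown in FIG. 4, or a mixed-domain (time domain + frequency domain) ITD calculation based on the following formula:
Figure PCTCN2017074419-appb-000007
Here, Li(f) denotes the frequency-domain coefficients of the left-channel frequency-domain signal,
Figure PCTCN2017074419-appb-000008
denotes the conjugate of the frequency-domain coefficients of the right-channel frequency-domain signal; argmax() denotes selecting the position of the maximum among multiple values, and IDFT() denotes the inverse discrete Fourier transform.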
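530. Determine a difference parameter based on the initial multi-channel parameter of the current frame and the multi-channel parameters of the previous K frames of the current frame. The difference parameter characterizes the difference between the initial multi-channel parameter of the current frame and the multi-channel parameters of the previous K frames, where K is an integer greater than or equal to 1.
It should be understood that the previous K frames of the current frame are the K frames that immediately precede the current frame among all frames of the audio signal to be encoded. For example, assume that the audio signal to be encoded includes 10 frames and K = 1; if the current frame is the 5th of the 10 frames, the previous K frames are the 4th frame. As another example, assume that the audio signal includes 10 frames and K = 2; if the current frame is the 7th of the 10 frames, the previous K frames are the 5th and 6th frames.
Unless otherwise specified, "previous K frames" below refers to the previous K frames of the current frame, and "previous frame" below refers to the frame immediately preceding the current frame.
540. Determine the multi-channel parameter of the current frame based on the difference parameter and a characteristic parameter of the current frame.
It should be noted that a multi-channel parameter (including an initial multi-channel parameter) may take the form of a numerical value, so a multi-channel parameter may also be called a multi-channel parameter value.
In some embodiments, the characteristic parameter of the current frame may include a mono parameter of the current frame, which may be used to characterize a property of the signal of one channel in the multi-channel signal of the current frame.
In some embodiments, determining the multi-channel parameter of the current frame in step 540 may include correcting the initial multi-channel parameter to obtain the multi-channel parameter of the current frame. Taking the mono parameter of the current frame as the characteristic parameter, step 540 may include: correcting the initial multi-channel parameter of the current frame based on the difference parameter and the mono parameter of the current frame, to obtain the multi-channel parameter of the current frame.
In some embodiments, the characteristic parameter of the current frame includes at least one of the following parameters of the current frame: a correlation parameter, a peak-to-average ratio parameter, a signal-to-noise ratio parameter, and a spectral tilt parameter. The correlation parameter characterizes the degree of correlation between the current frame and the previous frame. The peak-to-average ratio parameter characterizes the peak-to-average ratio of the signal of at least one channel in the multi-channel signal of the current frame. The signal-to-noise ratio parameter characterizes the signal-to-noise ratio of the signal of at least one channel in the multi-channel signal of the current frame. The spectral tilt parameter characterizes the spectral tilt, or the trend of the spectral energy, of the signal of at least one channel in the multi-channel signal of the current frame.
550. Encode the multi-channel signal based on the multi-channel parameter of the current frame.
For example, operations such as the mono audio encoding, spatial parameter encoding, and bitstream multiplexing shown in Figure 1 may be performed; for the specific encoding method, refer to the prior art.
In the embodiments of this application, the multi-channel parameter of the current frame is determined by jointly considering the difference between the current frame and the previous K frames and the characteristic parameter of the current frame. This is more reasonable than having the current frame directly reuse the multi-channel parameter of the previous frame, and better preserves the accuracy of the inter-channel information of the multi-channel signal.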
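The following describes implementations of step 540 in detail.
Optionally, in some embodiments, step 540 may include: when the difference parameter meets a first preset condition, adjusting the value of the initial multi-channel parameter of the current frame based on the value of the characteristic parameter of the current frame, to obtain the multi-channel parameter of the current frame.
Optionally, in some embodiments, step 540 may include: when the characteristic parameter of the current frame meets a first preset condition, adjusting the value of the initial multi-channel parameter of the current frame based on the value of the difference parameter, to obtain the multi-channel parameter of the current frame.
It should be understood that the first preset condition may be a single condition or a combination of several conditions. In addition, when the first preset condition is met, further conditions may be checked, and the subsequent steps are performed only when all conditions are met.
Optionally, in some embodiments, as shown in Figure 6, step 540 may include:
542. Determine whether the difference parameter meets the first preset condition.
544. When the difference parameter meets the first preset condition, determine the multi-channel parameter of the current frame based on the characteristic parameter of the current frame.
It should be understood that the difference parameter may be defined in several ways, and different definitions may correspond to different first preset conditions. The following describes the difference parameter and the corresponding first preset condition in detail.
Optionally, in some embodiments, the difference parameter may be the difference, or the absolute value of the difference, between the initial multi-channel parameter of the current frame and the multi-channel parameter of the previous frame. The first preset condition may be that the difference parameter is greater than a preset first threshold. The first threshold may be 0.3 to 0.7 times a target value, for example, 0.5 times the target value, where the target value is whichever of the multi-channel parameter of the previous frame and the initial multi-channel parameter of the current frame has the larger absolute value.
Optionally, in some embodiments, the difference parameter may be the difference, or the absolute value of the difference, between the initial multi-channel parameter of the current frame and the mean of the multi-channel parameters of the previous K frames. The first preset condition may be that the difference parameter is greater than a preset first threshold. The first threshold may be 0.3 to 0.7 times a target value, for example, 0.5 times the target value, where the target value is whichever of the multi-channel parameter of the previous frame and the initial multi-channel parameter of the current frame has the larger absolute value.
Optionally, in some embodiments, the difference parameter may be the product of the initial multi-channel parameter of the current frame and the multi-channel parameter of the previous frame. The first preset condition may be that the difference parameter is less than or equal to 0.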
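The difference-parameter variants above can be illustrated with a short Python sketch. It is only an example under stated assumptions: the function name `first_condition_met` is hypothetical, the two variants (product not greater than 0, or absolute difference above a fraction of the target value) are combined with a logical OR, and the factor 0.5 is the example value from the 0.3 to 0.7 range. The text leaves the actual combination of conditions open.

```python
def first_condition_met(initial_param, prev_param, ratio=0.5):
    """Check an example first preset condition on the difference parameter."""
    # Variant 1: the product of the two parameters is zero or negative,
    # i.e. one of them is zero or their signs differ.
    if initial_param * prev_param <= 0:
        return True

    # Variant 2: the absolute difference exceeds ratio * target, where the
    # target is the larger of the two absolute values.
    target = max(abs(initial_param), abs(prev_param))
    return abs(initial_param - prev_param) > ratio * target
```

For instance, an initial ITD of 10 against a previous ITD of 4 gives an absolute difference of 6, which exceeds half of the target value 10, so the condition holds; an initial ITD of 10 against a previous ITD of 8 does not trigger it.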
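The following describes specific implementations of step 544 in detail.
Optionally, in some embodiments, step 544 may include: determining the multi-channel parameter of the current frame based on the correlation parameter and/or the spectral tilt parameter of the current frame, where the correlation parameter characterizes the degree of correlation between the current frame and the previous frame, and the spectral tilt parameter characterizes the spectral tilt, or the trend of the spectral energy, of the signal of at least one channel in the multi-channel signal of the current frame.
Optionally, in some embodiments, step 544 may include: determining the multi-channel parameter of the current frame based on the correlation parameter and/or the peak-to-average ratio parameter of the current frame, where the correlation parameter characterizes the degree of correlation between the current frame and the previous frame, and the peak-to-average ratio parameter characterizes the peak-to-average ratio of the signal of at least one channel in the multi-channel signal of the current frame.
The following describes the correlation parameter of the current frame in detail.
Specifically, the correlation parameter may be used to characterize the degree of correlation between the current frame and the previous frame. This degree of correlation may be represented in several ways, and different representations may correspond to different ways of calculating the correlation parameter. The following describes them with reference to specific embodiments.
Optionally, in some embodiments, the degree of correlation between the current frame and the previous frame may be represented by the degree of correlation between the target channel signals in the multi-channel signals of the current frame and the previous frame. It should be understood that the target channel signal of the current frame and the target channel signal of the previous frame correspond to each other: if the target channel signal of the current frame is the left channel signal, the target channel signal of the previous frame is the left channel signal; if the target channel signal of the current frame is the right channel signal, the target channel signal of the previous frame is the right channel signal; and if the target channel signals of the current frame are the left and right channel signals, the target channel signals of the previous frame are the left and right channel signals. It should further be understood that the target channel signal may be a target channel time-domain signal or a target channel frequency-domain signal.
Taking a frequency-domain target channel signal as an example, determining the correlation parameter based on the target channel signals in the multi-channel signals of the current frame and the previous frame may specifically include: determining the correlation parameter based on frequency-domain parameters of the target channel signals in the multi-channel signals of the current frame and the previous frame, where the frequency-domain parameters of a target channel signal include its frequency-domain amplitude values and/or its frequency-domain coefficients.
In some embodiments, the frequency-domain amplitude values of the target channel signal may be the frequency-domain amplitude values of some or all subbands of the target channel signal, for example, of the subbands in the low-frequency part of the target channel signal.
Specifically, taking the left channel frequency-domain signal as the target channel signal, assume that the low-frequency part of the left channel frequency-domain signal includes M subbands and each subband includes N frequency-domain amplitude values. The normalized cross-correlation value between the frequency-domain amplitude values of each subband of the current frame and of the previous frame may be calculated by using the following formula, to obtain M normalized cross-correlation values in one-to-one correspondence with the M subbands:
$$cor(i) = \frac{\sum_{j=0}^{N-1} |L(i*N+j)| \cdot |L^{(-1)}(i*N+j)|}{\sqrt{\sum_{j=0}^{N-1} |L(i*N+j)|^2 \cdot \sum_{j=0}^{N-1} |L^{(-1)}(i*N+j)|^2}}, \quad i = 0, 1, \ldots, M-1$$
where $|L(i*N+j)|$ represents the $j$-th frequency-domain amplitude value of the $i$-th subband in the low-frequency part of the left channel frequency-domain signal of the current frame, $|L^{(-1)}(i*N+j)|$ represents the $j$-th frequency-domain amplitude value of the $i$-th subband in the low-frequency part of the left channel frequency-domain signal of the previous frame, and $cor(i)$ represents the normalized cross-correlation value of the $i$-th of the M subbands.
Then, the M normalized cross-correlation values may be determined as the correlation parameter between the current frame and the previous frame; alternatively, the sum or the average of the M normalized cross-correlation values may be determined as the correlation parameter of the current frame.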
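The per-subband normalized cross-correlation can be sketched in Python as follows. The sketch assumes the usual normalized form (the sum of products of corresponding amplitude values divided by the square root of the product of the two subband energies); the function name `subband_correlation` and the handling of an all-zero subband are assumptions made for illustration.

```python
import math


def subband_correlation(cur_mag, prev_mag, m, n):
    """Return M normalized cross-correlation values, one per subband.

    cur_mag / prev_mag hold the M*N low-frequency amplitude values of the
    current and previous frame, subband i occupying indices i*N .. i*N+N-1.
    """
    cors = []
    for i in range(m):
        cur = cur_mag[i * n:(i + 1) * n]
        prev = prev_mag[i * n:(i + 1) * n]
        num = sum(c * p for c, p in zip(cur, prev))
        den = math.sqrt(sum(c * c for c in cur) * sum(p * p for p in prev))
        # Assumption: report 0.0 for a silent subband instead of dividing by zero.
        cors.append(num / den if den > 0.0 else 0.0)
    return cors
```

The frame-level correlation parameter can then be the list itself, `sum(cors)`, or `sum(cors) / m`, matching the three options named in the text.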
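In some embodiments, the foregoing calculation of the correlation parameter based on frequency-domain amplitude values may be replaced with a calculation based on frequency-domain coefficients.
In some embodiments, the foregoing calculation of the correlation parameter based on frequency-domain amplitude values may be replaced with a calculation based on the absolute values of frequency-domain coefficients.
It should be understood that the multi-channel signal of the current frame may refer to the multi-channel signal of one or more subframes of the current frame; likewise, the multi-channel signal of the previous frame may refer to the multi-channel signal of one or more subframes of the previous frame. In other words, the correlation parameter may be calculated from all the multi-channel signals of the current frame and the previous frame, or from the multi-channel signals of one or some subframes of the current frame and the previous frame.
Taking left and right channel time-domain signals as the target channel signals, the normalized cross-correlation value between the left and right channel time-domain signals of the current frame and those of the previous frame may be calculated at each sample by using the following formula, to obtain N normalized cross-correlation values, and the maximum normalized cross-correlation value is then searched out from the N values:
Figure PCTCN2017074419-appb-000010
where L(n) represents the left channel time-domain signal, R(n) represents the right channel time-domain signal, N is the total number of samples of the left channel time-domain signal, and L is the number of samples of offset between the n-th sample of the right channel time-domain signal and the n-th sample of the left channel time-domain signal.
In some embodiments, the maximum normalized cross-correlation value calculated by the foregoing formula may be used as the correlation parameter of the current frame.
It should be understood that the multi-channel signal of the current frame may refer to the multi-channel signal of one or more subframes of the current frame; likewise, the multi-channel signal of the previous frame may refer to the multi-channel signal of one or more subframes of the previous frame. For example, the foregoing formula may be applied per subframe to obtain multiple maximum normalized cross-correlation values in one-to-one correspondence with multiple subframes, and one or more of the following may then be used as the correlation parameter of the current frame: the multiple maximum normalized cross-correlation values themselves, their sum, or their mean.
The foregoing describes ways of calculating the correlation parameter from time-domain or frequency-domain signals. The following describes in detail a way of calculating the correlation parameter from the pitch period.
Optionally, in some embodiments, the degree of correlation between the current frame and the previous frame may be represented by the degree of correlation between the pitch periods of the current frame and the previous frame. In this case, the correlation parameter may be determined based on the pitch period of the current frame and the pitch period of the previous frame.
In some embodiments, the pitch period of the current frame or the previous frame may include the pitch period of each subframe of that frame.
Specifically, the pitch period of the current frame, or of each subframe in the current frame, and the pitch period of the previous frame, or of each subframe in the previous frame, may be calculated by using an existing pitch period algorithm. Then, the deviation between the pitch period of the current frame and that of the previous frame is calculated, or the deviation between the pitch periods of the subframes in the current frame and those of the subframes in the previous frame is calculated. The calculated pitch period deviation may then be used as the correlation parameter between the current frame and the previous frame.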
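As a small illustration of the pitch-based variant, the Python sketch below turns per-subframe pitch periods into a single deviation value. The pitch estimator itself is not shown, since the text only refers to existing algorithms. The function name `pitch_deviation` and the choice of the mean absolute deviation are assumptions for illustration; a sum of deviations would be an equally valid reading.

```python
def pitch_deviation(cur_pitches, prev_pitches):
    """Mean absolute pitch-period deviation between matching subframes."""
    assert cur_pitches and len(cur_pitches) == len(prev_pitches)
    deviations = [abs(c - p) for c, p in zip(cur_pitches, prev_pitches)]
    return sum(deviations) / len(deviations)
```

Note that, unlike a cross-correlation value, a smaller deviation plausibly indicates a stronger similarity between the two frames, so any threshold test on this parameter would need to be oriented accordingly.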
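The following describes the peak-to-average ratio parameter of the current frame in detail.
The peak-to-average ratio parameter of the current frame may be used to characterize the peak-to-average ratio of the signal of at least one channel in the multi-channel signal of the current frame.
For example, if the multi-channel signal includes a left channel signal and a right channel signal, the peak-to-average ratio parameter may be the peak-to-average ratio of the left channel signal, the peak-to-average ratio of the right channel signal, or a combination of the two.
The peak-to-average ratio parameter may be calculated in several ways. For example, it may be calculated from the frequency-domain amplitude values of a frequency-domain signal. As another example, it may be calculated from the frequency-domain coefficients of a frequency-domain signal or from their absolute values.
In some embodiments, the frequency-domain amplitude values of the frequency-domain signal may be the frequency-domain amplitude values of some or all subbands of the frequency-domain signal, for example, of the subbands in the low-frequency part of the frequency-domain signal.
Taking the left channel frequency-domain signal as an example, assume that its low-frequency part includes M subbands and each subband includes N frequency-domain amplitude values. The peak-to-average ratio of the N frequency-domain amplitude values of each subband may be calculated to obtain M peak-to-average ratios in one-to-one correspondence with the M subbands, and then the M peak-to-average ratios, their sum, or their mean may be used as the peak-to-average ratio parameter of the current frame. It should be noted that, to reduce computational complexity, the ratio of the maximum frequency-domain amplitude value of a subband to the sum of the N frequency-domain amplitude values of that subband may be used as the peak-to-average ratio. When the peak-to-average ratio is compared with a preset threshold, the maximum frequency-domain amplitude value may be compared with the product of the preset threshold and the sum of the N frequency-domain amplitude values of the subband, or with the product of the preset threshold and the average of the N frequency-domain amplitude values of the subband.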
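The reduced-complexity form of the peak-to-average ratio, and the division-free threshold comparison, can be sketched as follows. The function names are hypothetical, and the sketch uses the peak-to-sum variant described above (the peak-to-mean variant differs only by the factor N).

```python
def subband_peak_ratios(mags, m, n):
    """Peak-to-sum ratio of each of the M subbands (N amplitude values each)."""
    ratios = []
    for i in range(m):
        band = mags[i * n:(i + 1) * n]
        total = sum(band)
        # Assumption: a silent subband is reported as 0.0.
        ratios.append(max(band) / total if total > 0 else 0.0)
    return ratios


def peak_ratio_exceeds(band, threshold):
    """Test max(band) / sum(band) > threshold without performing a division."""
    return max(band) > threshold * sum(band)
```

Because the peak is divided by the sum of non-negative amplitude values, this ratio never exceeds 1, which would be consistent with example thresholds below 1; that reading is an inference rather than something the text states.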
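In some embodiments, the multi-channel signal of the current frame may refer to the multi-channel signal of one or more subframes of the current frame.
The characteristic parameter of the current frame may further include a signal-to-noise ratio parameter of the current frame, which is described in detail below.
The signal-to-noise ratio parameter of the current frame may be used to characterize the signal-to-noise ratio, or a signal-to-noise ratio characteristic, of at least one channel in the multi-channel signal of the current frame.
It should be understood that the signal-to-noise ratio parameter of the current frame may include one or more parameters, and the embodiments of this application do not limit how they are selected. For example, the signal-to-noise ratio parameter of the current frame may include at least one of the following for the multi-channel signal: a subband signal-to-noise ratio, a modified subband signal-to-noise ratio, a segmental signal-to-noise ratio, a modified segmental signal-to-noise ratio, a full-band signal-to-noise ratio, a modified full-band signal-to-noise ratio, and any other parameter that can characterize the signal-to-noise ratio of the multi-channel signal.
It should be noted that the embodiments of this application do not specifically limit how the signal-to-noise ratio parameter is determined.
For example, the signal-to-noise ratio parameter of the current frame may be calculated from all the signals of the multi-channel signal.
As another example, the signal-to-noise ratio parameter of the current frame may be calculated from some of the signals of the multi-channel signal.
As another example, the signal of any one channel of the multi-channel signal may be adaptively selected to calculate the signal-to-noise ratio parameter of the current frame.
As another example, a weighted average of the data representing the multi-channel signal may first be computed to form a new signal, and the signal-to-noise ratio of the new signal may then be used as the signal-to-noise ratio parameter of the current frame.
The characteristic parameter of the current frame may further include a spectral tilt parameter of the current frame, which is described in detail below.
The spectral tilt parameter of the current frame may be used to characterize the spectral tilt, or the trend of the spectral energy, of the signal of at least one channel in the multi-channel signal of the current frame. It should be understood that a greater spectral tilt indicates weaker voicing of the signal, and a smaller spectral tilt indicates stronger voicing of the signal.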
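The following describes in detail how, in step 544, the multi-channel parameter of the current frame is determined based on the characteristic parameter of the current frame.
Optionally, in some embodiments, whether the current frame reuses the multi-channel parameter of the previous frame may be determined based on the characteristic parameter of the current frame.
For example, when the characteristic parameter meets a second preset condition, the current frame may reuse the multi-channel parameter of the previous frame. Alternatively, when the characteristic parameter does not meet the second preset condition, the initial multi-channel parameter of the current frame may be used as the multi-channel parameter of the current frame. It should be understood that the embodiments of this application do not specifically limit the handling when the characteristic parameter does not meet the second preset condition; for example, the initial multi-channel parameter may also be corrected in another existing way.
Optionally, in some embodiments, whether the multi-channel parameter of the current frame is determined from the trend of the multi-channel parameters of the previous T frames may be decided based on the characteristic parameter of the current frame, where T is greater than or equal to 2.
For example, when the characteristic parameter meets the second preset condition, the multi-channel parameter of the current frame may be determined from the trend of the multi-channel parameters of the previous T frames. Alternatively, when the characteristic parameter does not meet the second preset condition, the initial multi-channel parameter of the current frame may be used as the multi-channel parameter of the current frame. It should be understood that the embodiments of this application do not specifically limit the handling when the characteristic parameter does not meet the second preset condition; for example, the initial multi-channel parameter may also be corrected in another existing way.
It should be understood that the second preset condition may be a single condition or a combination of several conditions. In addition, when the second preset condition is met, further conditions may be checked, and the subsequent steps are performed only when all conditions are met.
It should be understood that the previous T frames of the current frame are the T frames that immediately precede the current frame among all frames of the audio signal to be encoded. For example, if the audio signal to be encoded includes 10 frames, T = 2, and the current frame is the 5th of the 10 frames, the previous T frames are the 3rd and 4th frames.
It should be understood that the multi-channel parameter of the current frame may be determined from the trend of the multi-channel parameters of the previous T frames in several ways. Taking the ITD value as the multi-channel parameter, the ITD value ITD[i] of the current frame may be calculated as follows:
ITD[i] = ITD[i-1] + delta
where delta = ITD[i-1] - ITD[i-2], ITD[i-1] represents the ITD value of the frame preceding the current frame, and ITD[i-2] represents the ITD value of the frame preceding that frame.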
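The trend-based estimate for T = 2 is a linear extrapolation, which can be written in a few lines of Python. The function name `extrapolate_itd` and the list-based history are hypothetical; any clamping to the valid ITD range or quantization is left out because the text does not describe it here.

```python
def extrapolate_itd(history):
    """Estimate ITD[i] from the ITD values of the two preceding frames.

    history holds earlier ITD values in chronological order, so
    history[-1] is ITD[i-1] and history[-2] is ITD[i-2].
    """
    assert len(history) >= 2
    delta = history[-1] - history[-2]
    return history[-1] + delta
```

For example, previous ITD values of 2 and then 5 give delta = 3 and an estimate of 8 for the current frame.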
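The following describes the second preset condition in detail.
It should be understood that the second preset condition may be defined in several ways, and its setting depends on which characteristic parameter is selected. The embodiments of this application do not specifically limit this.
Taking the correlation parameter and/or the peak-to-average ratio parameter as the characteristic parameter, where the correlation parameter is the mean of the correlation values of the multi-channel signals of the current frame and the previous frame over the subbands, and the peak-to-average ratio parameter is the mean of the peak-to-average ratios of the multi-channel signal of the current frame over the subbands, the second preset condition may be one or more of the following conditions:
the correlation parameter is greater than a second threshold, where the second threshold may be, for example, in the range 0.6 to 0.95, for example, 0.85;
the peak-to-average ratio parameter is greater than a third threshold, where the third threshold may be, for example, in the range 0.4 to 0.8, for example, 0.6;
the correlation parameter is greater than a fourth threshold and the correlation value of some subband is greater than a fifth threshold, where the fourth threshold may be in the range 0.6 to 0.85, for example, 0.7, and the fifth threshold may be in the range 0.8 to 0.95, for example, 0.9;
the peak-to-average ratio parameter is greater than a sixth threshold and the peak-to-average ratio of some subband is greater than a seventh threshold, where the sixth threshold may be in the range 0.4 to 0.75, for example, 0.55, and the seventh threshold may be in the range 0.6 to 0.9, for example, 0.7.
The second threshold may be greater than the fourth threshold, and the fourth threshold may be less than the fifth threshold; similarly, the third threshold may be greater than the sixth threshold, and the sixth threshold may be less than the seventh threshold.
It should be noted that, when the characteristic parameter includes the peak-to-average ratio parameter and the second preset condition includes the peak-to-average ratio parameter being greater than or equal to a preset threshold, the peak-to-average ratio parameter has to be compared with the preset threshold. To simplify the calculation, this comparison may be converted into a comparison between the peak in the peak-to-average ratio and a target value. The target value may be the product of the preset threshold and the average in the peak-to-average ratio, or the product of the preset threshold and the sum of the parameters used to calculate the peak-to-average ratio. For example, if the parameters used to calculate the peak-to-average ratio are the frequency-domain amplitude values of a subband and each subband includes N frequency-domain amplitude values, then, when the peak-to-average ratio is compared with the preset threshold, the maximum frequency-domain amplitude value of each subband may be compared with the product of the preset threshold and the sum of the N frequency-domain amplitude values of that subband, or with the product of the preset threshold and the average of the N frequency-domain amplitude values of that subband.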
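The four example conditions can be gathered into one predicate, as in the Python sketch below. It is an illustration under assumptions: the function name `second_condition_met` is hypothetical, the default thresholds are the example values quoted above (0.85, 0.6, 0.7, 0.9, 0.55, 0.7), and the conditions are combined with a logical OR, which is only one of the combinations the text allows.

```python
def second_condition_met(cors, ratios,
                         thr2=0.85, thr3=0.6,
                         thr4=0.7, thr5=0.9,
                         thr6=0.55, thr7=0.7):
    """Evaluate an example second preset condition.

    cors   -- per-subband correlation values (current vs. previous frame)
    ratios -- per-subband peak-to-average ratios of the current frame
    """
    mean_cor = sum(cors) / len(cors)
    mean_ratio = sum(ratios) / len(ratios)
    return (mean_cor > thr2
            or mean_ratio > thr3
            # A lower mean is accepted when at least one subband is very high.
            or (mean_cor > thr4 and max(cors) > thr5)
            or (mean_ratio > thr6 and max(ratios) > thr7))
```

The paired conditions explain the stated ordering of the thresholds: the mean threshold is relaxed (fourth below second, sixth below third) only when a single subband clears a stricter bar (fifth, seventh).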
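The following describes the embodiments of this application in more detail with reference to the example of Figure 7. Figure 7 is described mainly for the case in which the multi-channel signal of the current frame includes a left channel signal and a right channel signal and the multi-channel parameter is the ITD value. It should be noted that the example of Figure 7 is intended only to help a person skilled in the art understand the embodiments of this application, not to limit the embodiments to the specific values or scenarios illustrated. A person skilled in the art can obviously make various equivalent modifications or changes based on the example of Figure 7, and such modifications or changes also fall within the scope of the embodiments of this application.
Figure 7 is a schematic flowchart of the multi-channel signal encoding method according to an embodiment of this application. It should be understood that the processing steps or operations shown in Figure 7 are only examples; the embodiments of this application may also perform other operations or variations of the operations in Figure 7. In addition, the steps in Figure 7 may be performed in an order different from that presented in Figure 7, and not all operations in Figure 7 necessarily have to be performed.
The method of Figure 7 includes:
710. Perform time-frequency transformation on the left and right channel time-domain signals of the current frame to obtain left and right channel frequency-domain signals.
720. Perform a normalized cross-correlation operation on the left and right channel frequency-domain signals to obtain a target frequency-domain signal.
730. Perform frequency-time transformation on the target frequency-domain signal to obtain a target time-domain signal.
740. Determine the initial ITD value of the current frame based on the target time-domain signal.
The process described in steps 720 to 740 may be expressed by the following formula:
$$ITD = \arg\max\left(IDFT\left(L_i(f) \cdot R_i^*(f)\right)\right)$$
where $L_i(f)$ represents the frequency-domain coefficients of the left channel frequency-domain signal, $R_i^*(f)$ represents the conjugate of the frequency-domain coefficients of the right channel frequency-domain signal, $\arg\max(\cdot)$ selects the index at which the maximum of multiple values occurs, and $IDFT(\cdot)$ denotes the inverse discrete Fourier transform.
750. Perform ITD fine control to calculate the ITD value of the current frame.
760. Apply a phase shift to the left and right channel time-domain signals based on the ITD value of the current frame.
770. Downmix the left and right channel time-domain signals.
For the implementation of steps 760 and 770, refer to the prior art; details are not described here.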
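Steps 710 to 740 amount to a cross-correlation computed through the frequency domain. The Python sketch below illustrates this with a naive DFT from the standard library, so it favours clarity over speed. It assumes the initial ITD is the lag that maximizes the inverse transform of the product of the left spectrum and the conjugated right spectrum; it omits any normalization of the cross-spectrum, restricts the search to a hypothetical range of `t_max` samples, and uses hypothetical function names.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (or its inverse)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def cross_spectrum_itd(x_left, x_right, t_max):
    """Initial ITD estimate: argmax over lags of IDFT(L(f) * conj(R(f)))."""
    n = len(x_left)
    spec_l = dft(x_left)                                         # step 710
    spec_r = dft(x_right)
    cross = [l * r.conjugate() for l, r in zip(spec_l, spec_r)]  # step 720
    corr = [c.real for c in dft(cross, inverse=True)]            # step 730
    # Step 740: negative lags wrap around to the end of the circular result.
    lags = range(-t_max, t_max + 1)
    return max(lags, key=lambda lag: corr[lag % n])
```

With this transform convention, a right channel that lags the left channel by three samples produces -3, and the mirrored case produces 3; as with any ITD estimator, the sign convention is a matter of definition.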
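Step 750 corresponds to step 530 in Figure 5 and may use any of the implementations given for step 530. Several optional implementations are listed below.
Implementation 1:
Step 1: The low-frequency part of the left channel frequency-domain signal of the current frame may be divided into M subbands, each containing N frequency-domain amplitude values.
Step 2: The correlation parameter between the current frame and the previous frame may be calculated by using the following formula:
$$cor(i) = \frac{\sum_{j=0}^{N-1} |L(i*N+j)| \cdot |L^{(-1)}(i*N+j)|}{\sqrt{\sum_{j=0}^{N-1} |L(i*N+j)|^2 \cdot \sum_{j=0}^{N-1} |L^{(-1)}(i*N+j)|^2}}, \quad i = 0, 1, \ldots, M-1$$
where $|L(i*N+j)|$ represents the $j$-th frequency-domain amplitude value of the $i$-th subband in the low-frequency part of the left channel frequency-domain signal of the current frame, $|L^{(-1)}(i*N+j)|$ represents the $j$-th frequency-domain amplitude value of the $i$-th subband in the low-frequency part of the left channel frequency-domain signal of the previous frame, and $cor(i)$ represents the normalized cross-correlation value corresponding to the $i$-th of the M subbands.
It should be understood that step 2 yields the correlation parameter between the current frame and the previous frame. The correlation parameter may be the normalized cross-correlation values of the subbands, or the mean of the normalized cross-correlation values of the subbands.
Step 3: Calculate the peak-to-average ratio of each subband of the current frame.
It should be understood that step 2 and step 3 may be performed at the same time or one after the other. In addition, the peak-to-average ratio of a subband may be expressed as the ratio of the peak to the mean of the frequency-domain amplitude values of the subband, or as the ratio of the peak to the sum of the frequency-domain amplitude values in the subband, which reduces computational complexity.
It should be understood that step 3 yields the peak-to-average ratio parameter of the multi-channel signal of the current frame. The peak-to-average ratio parameter may be the peak-to-average ratios of the subbands, their sum, or their mean.
Step 4: If the initial ITD value of the current frame and the ITD value of the previous frame meet the first preset condition, determine, based on the correlation parameter and/or the peak-to-average ratio parameter of the current frame, whether the current frame reuses the ITD value of the previous frame.
The first preset condition may be, for example:
the product of the ITD value of the previous frame and the initial ITD value of the current frame is 0; or
the product of the ITD value of the previous frame and the initial ITD value of the current frame is negative; or
the absolute value of the difference between the ITD value of the previous frame and the initial ITD value of the current frame is greater than half of a target value, where the target value is whichever of the ITD value of the previous frame and the initial ITD value of the current frame has the larger absolute value.
It should be noted that the first preset condition may be a single condition or a combination of several conditions. In addition, when the first preset condition is met, further conditions may be checked, and the subsequent steps are performed only when all conditions are met.
Determining, based on the correlation parameter and/or the peak-to-average ratio parameter of the current frame, whether the current frame reuses the ITD value of the previous frame may specifically mean: determining whether the correlation parameter and/or the peak-to-average ratio parameter of the current frame meets the second preset condition, and, if so, having the current frame reuse the ITD value of the previous frame.
The second preset condition may be, for example:
the mean of the normalized cross-correlation values of the subbands is greater than a first threshold; or
the mean of the peak-to-average ratios of the subbands is greater than a second threshold; or
the mean of the normalized cross-correlation values of the subbands is greater than a third threshold and the normalized cross-correlation value of some subband is greater than a fourth threshold; or
the mean of the peak-to-average ratios of the subbands is greater than a fifth threshold and the peak-to-average ratio of some subband is greater than a sixth threshold;
where the first threshold is greater than the third threshold, the third threshold is less than the fourth threshold, the second threshold is greater than the fifth threshold, and the fifth threshold is less than the sixth threshold.
It should be noted that the second preset condition may be a single condition or a combination of several conditions. In addition, when the second preset condition is met, further conditions may be checked, and the subsequent steps are performed only when all conditions are met.
It should be noted that the left channel frequency-domain signal of the current frame described above may be the left channel frequency-domain signal of one or some subframes of the current frame, and the left channel frequency-domain signal of the previous frame described above may be the left channel frequency-domain signal of one or some subframes of the previous frame. In other words, the correlation parameter may be calculated from parameters of the current frame and the previous frame, or from parameters of one or some subframes of the current frame and the previous frame. Likewise, the peak-to-average ratio parameter may be calculated from parameters of the current frame, or from parameters of one or some subframes of the current frame.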
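Putting steps 1 to 4 together, the ITD fine control of implementation 1 can be sketched as a single decision function. The Python sketch is an illustration under assumptions: the function name `refine_itd` is hypothetical, only the two simplest second-condition tests are used, the thresholds 0.85 and 0.6 are example values, and the initial ITD is kept whenever either condition fails, which is one of the fallbacks the text permits.

```python
def refine_itd(initial_itd, prev_itd, mean_cor, mean_ratio,
               cor_thr=0.85, ratio_thr=0.6):
    """Decide whether the current frame reuses the previous frame's ITD.

    mean_cor   -- mean per-subband normalized cross-correlation value
    mean_ratio -- mean per-subband peak-to-average ratio
    """
    # First preset condition: zero or opposite-sign product, or a jump
    # larger than half of the larger absolute ITD value.
    sign_change = initial_itd * prev_itd <= 0
    target = max(abs(initial_itd), abs(prev_itd))
    large_jump = abs(initial_itd - prev_itd) > 0.5 * target
    if not (sign_change or large_jump):
        return initial_itd

    # Second preset condition: the frame closely resembles the previous one,
    # so the jump is treated as unreliable and the previous ITD is reused.
    if mean_cor > cor_thr or mean_ratio > ratio_thr:
        return prev_itd
    return initial_itd
```

For example, an initial ITD that drops to 0 after a previous ITD of 6 is replaced by 6 when the mean correlation is 0.9, but is kept at 0 when the mean correlation is only 0.3.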
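Implementation 2:
Implementation 2 differs from the foregoing implementation in that the foregoing implementation calculates the correlation parameter between the current frame and the previous frame from the frequency-domain amplitude values of the subbands, whereas implementation 2 calculates it from the frequency-domain coefficients of the subbands or their absolute values. The rest of implementation 2 is similar to the foregoing implementation and is not described in detail here.
Implementation 3:
Implementation 3 differs from the foregoing implementation in that the foregoing implementation calculates the peak-to-average ratio parameter from the frequency-domain amplitude values of the subbands, whereas implementation 3 calculates it from the absolute values of the frequency-domain coefficients of the subbands. The rest of implementation 3 is similar to the foregoing implementation and is not described in detail here.
Implementation 4:
Implementation 4 differs from the foregoing implementation in that the foregoing implementation calculates the correlation parameter and/or the peak-to-average ratio parameter from the left channel frequency-domain signal, whereas implementation 4 calculates them from the right channel frequency-domain signal. The rest of implementation 4 is similar to the foregoing implementation and is not described in detail here.
Implementation 5:
Implementation 5 differs from the foregoing implementations in that they calculate the correlation parameter and/or the peak-to-average ratio parameter from the left channel frequency-domain signal or the right channel frequency-domain signal, whereas implementation 5 calculates them from both the left and right channel frequency-domain signals.
In a specific implementation, one set of correlation and/or peak-to-average ratio parameters may be calculated from the left channel frequency-domain signal, and another set from the right channel frequency-domain signal. The set with the larger values may then be selected as the final correlation parameter and/or peak-to-average ratio parameter. The rest of implementation 5 is similar to the foregoing implementations and is not described in detail here.
Implementation 6:
Implementation 6 differs from the foregoing implementations in that they calculate the correlation parameter from frequency-domain signals, whereas implementation 6 calculates it from time-domain signals.
Specifically, the correlation parameter between the current frame and the previous frame may be calculated by using the following formula:
Figure PCTCN2017074419-appb-000014
where L(n) represents the left channel time-domain signal, R(n) represents the right channel time-domain signal, N is the total number of samples of the left channel time-domain signal, and L is the number of samples of offset between the n-th sample of the right channel signal and the n-th sample of the left channel signal.
It should be understood that the left channel time-domain signal and the right channel time-domain signal here may be all the left and right channel signals in the current frame, or the left and right channel signals of one or some subframes of the current frame.
The rest of implementation 6 is similar to the foregoing implementations and is not described in detail here.
Implementation 7:
Implementation 7 differs from the foregoing implementations in that they determine whether the current frame reuses the ITD value of the previous frame, whereas implementation 7 determines whether the ITD value of the current frame is estimated from the trend of the ITD values of the previous T frames of the current frame, where T is an integer greater than or equal to 2.
The ITD value ITD[i] of the current frame may be calculated as follows:
ITD[i] = ITD[i-1] + delta,
where delta = ITD[i-1] - ITD[i-2], ITD[i-1] represents the ITD value of the frame preceding the current frame, and ITD[i-2] represents the ITD value of the frame preceding that frame.
Implementation 8:
Implementation 8 differs from the foregoing implementations in that they calculate the correlation parameter between the current frame and the previous frame from the time-domain or frequency-domain signals of the two frames, whereas implementation 8 calculates the correlation parameter from the pitch periods of the current frame and the previous frame.
Specifically, an existing pitch period algorithm may be used to calculate the pitch period of the current frame, or of each subframe in the current frame, and the corresponding pitch period of the previous frame. The deviation between the pitch periods of the current frame and the previous frame is then calculated and used as the correlation parameter between the current frame and the previous frame.
It should be understood that the deviation between the pitch periods of the current frame and the previous frame may be the deviation between the pitch periods of the two frames as a whole, the deviation between the pitch periods of one or some subframes of the two frames, the sum of the pitch period deviations of some subframes of the two frames, or the mean of the pitch period deviations of some subframes of the two frames.
Implementation 9:
Implementation 9 differs from the foregoing implementations in that they determine the ITD value of the current frame based on the correlation parameter and/or the peak-to-average ratio parameter, whereas implementation 9 determines it based on the correlation parameter and/or the spectral tilt parameter.
In this case, the second preset condition may be that the correlation value in the correlation parameter between the current frame and the previous frame is greater than a threshold, and/or that the spectral tilt value in the spectral tilt parameter is less than a threshold. (It should be understood that a larger spectral tilt value indicates weaker voicing of the signal, and a smaller spectral tilt value indicates stronger voicing.)
The rest of implementation 9 is similar to the foregoing implementations and is not described in detail here.
Implementation 10:
Implementation 10 differs from the foregoing implementations in that they calculate the ITD value of the current frame, whereas implementation 10 calculates the IPD value of the current frame. It should be understood that the ITD-related calculations in steps 710 to 770 all have to be replaced with the corresponding IPD-related processes; for the calculation of the IPD value, refer to the prior art. Details are not described here.
The rest of implementation 10 is broadly similar to the foregoing implementations and is not described in detail here.
It should be understood that the ten implementations above are merely examples. In practice, these implementations may be substituted for or combined with one another to obtain new implementations, which, for brevity, are not listed one by one here.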
下面对本申请的装置实施例进行描述,由于装置实施例可以执行上述方法,因此未详细描述的部分可以参见前面各方法实施例。
图8是本申请实施例的编码器的示意性框图。图8的编码器800包括:
获取单元810,用于获取当前帧的多声道信号;
第一确定单元820,用于确定所述当前帧的初始多声道参数;
第二确定单元830,用于根据所述当前帧的初始多声道参数,以及所述当前帧的前K帧的多声道参数,确定差异参数,所述差异参数用于表征所述当前帧的初始多声道参数与所述前K帧的多声道参数的差异,其中,K为大于或等于1的整数;
第三确定单元840,用于根据所述差异参数和所述当前帧的特征参数,确定所述当前帧的多声道参数;
编码单元850,用于根据所述当前帧的多声道参数对所述多声道信号进行编码。
本申请实施例中,当前帧的多声道参数是在综合考虑了当前帧与前K帧之间的差异以及当前帧的特征参数之后确定的,这样的确定方式更加合理,与当前帧直接复用前一帧的多声道参数的方式相比,能够更好地保证多声道信号的声道间信息的准确性。
可选地,在一些实施例中,所述第三确定单元840具体用于在所述差异参数满足第一预设条件的情况下,根据所述当前帧的特征参数,确定所述当前帧的多声道参数。
可选地,在一些实施例中,所述差异参数为所述当前帧的初始多声道参数和所述当前帧的前一帧的多声道参数的差值的绝对值,所述第一预设条件为所述差异参数大于预设的第一阈值。
可选地,在一些实施例中,所述差异参数为所述当前帧的初始多声道参数和所述当前帧的前一帧的多声道参数的乘积,所述第一预设条件为所述差异参数小于或等于0。
可选地,在一些实施例中,所述第三确定单元840具体用于根据所述当前帧的相关性参数,确定所述当前帧的多声道参数,其中,所述相关性参数用于表征所述当前帧与所述当前帧的前一帧的相关程度。
可选地,在一些实施例中,所述第三确定单元840具体用于根据所述当前帧的峰均比参数,确定所述当前帧的多声道参数,其中所述峰均比参数用于表征所述当前帧的多声道信号中的至少一个声道的信号的峰均比。
可选地,在一些实施例中,所述第三确定单元840具体用于根据所述当前帧的相关性参数和峰均比参数,确定所述当前帧的多声道参数,其中,所述相关性参数用于表征所述当前帧与所述当前帧的前一帧的相关程度,所述峰均比参数用于表征所述当前帧的多声道信号中的至少一个声道的信号的峰均比。
可选地,在一些实施例中,所述编码器还包括:
第四确定单元,用于根据所述当前帧的多声道信号中的目标声道信号,以及所述前一帧的多声道信号中的目标声道信号,确定所述相关性参数。
可选地,在一些实施例中,所述第四确定单元具体用于根据所述当前帧的多声道信号中的目标声道信号的频域参数,以及所述前一帧的多声道信号中的目标声道信号的频域参数,确定所述相关性参数,所述频域参数为所述目标声道信号的频域幅度值和频域系数中的至少一个。
可选地,在一些实施例中,所述编码器还包括:
第五确定单元,用于根据所述当前帧的基音周期,以及所述前一帧的基音周期,确定所述相关性参数。
可选地,在一些实施例中,所述第三确定单元840具体用于在所述特征参数满足第二预设条件的情况下,根据所述当前帧的前T帧的多声道参数,确定所述当前帧的多声道参数,T为大于或等于1的整数。
可选地,在一些实施例中,所述第三确定单元840具体用于将所述前T帧的多声道参数确定为所述当前帧的多声道参数,其中,T等于1。
可选地,在一些实施例中,所述第三确定单元840具体用于根据所述前T帧的多声道参数的变化趋势,确定所述当前帧的多声道参数,其中,T大于或等于2。
可选地,在一些实施例中,所述特征参数包括所述当前帧的相关性参数和/或峰均比参数,所述相关性参数用于表征所述当前帧与所述当前帧的前一帧的相关程度,所述峰均比参数用于表征所述当前帧的多声道信号中的至少一个声道的信号的峰均比,所述第二预设条件为所述特征参数大于预设的阈值。
可选地,在一些实施例中,所述当前帧的初始多声道参数包括以下中的至少一种:所述当前帧的初始声道间相关性IC值,所述当前帧的初始声道间时间差ITD值,所述当前帧的初始声道间相位差IPD值,当前帧的初始整体相位差OPD值,以及所述当前帧的初始声道间电平差ILD值。
可选地,在一些实施例中,所述当前帧的特征参数包括所述当前帧的以下中的至少一种:相关性参数,峰均比参数,信噪比参数,以及谱倾斜参数,所述相关性参数用于表征所述当前帧与所述前一帧的相关程度,所述峰均比参数用于表征所述当前帧的多声道信号中的至少一个声道的信号的峰均比,所述信噪比参数用于表征所述当前帧的多声道信号中的至少一个声道的信号的信噪比,所述谱倾斜参数用于表征所述当前帧的多声道信号中的至少一个声道的信号的频谱倾斜程度。
图9是本申请实施例的编码器的示意性框图。图9的编码器900包括:
存储器910,用于存储程序;
处理器920,用于执行程序,当所述程序被执行时,所述处理器920用于获取当前帧的多声道信号;确定所述当前帧的初始多声道参数;根据所述当前帧的初始多声道参数,以及所述当前帧的前K帧的多声道参数,确定差异参数,所述差异参数用于表征所述当前帧的初始多声道参数与所述前K帧的多声道参数的差异,其中,K为大于或等于1的整数;根据所述差异参数和所述当前帧的特征参数,确定所述当前帧的多声道参数;根据所述当前帧的多声道参数对所述多声道信号进行编码。
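In the embodiments of this application, the multi-channel parameter of the current frame is determined by jointly considering the difference between the current frame and the previous K frames and the characteristic parameter of the current frame. This is more reasonable than having the current frame directly reuse the multi-channel parameter of the previous frame, and better preserves the accuracy of the inter-channel information of the multi-channel signal.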
可选地,在一些实施例中,所述处理器920具体用于在所述差异参数满足第一预设条件的情况下,根据所述当前帧的特征参数,确定所述当前帧的多声道参数。
可选地,在一些实施例中,所述差异参数为所述当前帧的初始多声道参数和所述当前帧的前一帧的多声道参数的差值的绝对值,所述第一预设条件为所述差异参数大于预设的第一阈值。
可选地,在一些实施例中,所述差异参数为所述当前帧的初始多声道参数和所述当前帧的前一帧的多声道参数的乘积,所述第一预设条件为所述差异参数小于或等于0。
可选地,在一些实施例中,所述处理器920具体用于根据所述当前帧的相关性参数,确定所述当前帧的多声道参数,其中,所述相关性参数用于表征所述当前帧与所述当前帧的前一帧的相关程度。
可选地,在一些实施例中,所述处理器920具体用于根据所述当前帧的峰均比参数,确定所述当前帧的多声道参数,其中,所述峰均比参数用于表征所述当前帧的多声道信号中的至少一个声道的信号的峰均比。
可选地,在一些实施例中,所述处理器920具体用于根据所述当前帧的相关性参数和峰均比参数,确定所述当前帧的多声道参数,其中,所述相关性参数用于表征所述当前帧与所述当前帧的前一帧的相关程度,所述峰均比参数用于表征所述当前帧的多声道信号中的至少一个声道的信号的峰均比。
可选地,在一些实施例中,所述处理器920还用于根据所述当前帧的多声道信号中的目标声道信号,以及所述前一帧的多声道信号中的目标声道信号,确定所述相关性参数。
可选地,在一些实施例中,所述处理器920具体用于根据所述当前帧的多声道信号中的目标声道信号的频域参数,以及所述前一帧的多声道信号中的目标声道信号的频域参数,确定所述相关性参数,所述频域参数为所述目标声道信号的频域幅度值。
可选地,在一些实施例中,所述处理器920具体用于根据所述当前帧的多声道信号中的目标声道信号的频域参数,以及所述前一帧的多声道信号中的目标声道信号的频域参数,确定所述相关性参数,所述频域参数为所述目标声道信号的频域系数。
可选地,在一些实施例中,所述处理器920具体用于根据所述当前帧的多声道信号中的目标声道信号的频域参数,以及所述前一帧的多声道信号中的目标声道信号的频域参数,确定所述相关性参数,所述频域参数为所述目标声道信号的频域幅度值和频域系数。
可选地,在一些实施例中,所述处理器920还用于根据所述当前帧的基音周期,以及所述前一帧的基音周期,确定所述相关性参数。
可选地,在一些实施例中,所述处理器920具体用于在所述特征参数满足第二预设条件的情况下,根据所述当前帧的前T帧的多声道参数,确定所述当前帧的多声道参数,T为大于或等于1的整数。
可选地,在一些实施例中,所述处理器920具体用于将所述前T帧的多声道参数确定为所述当前帧的多声道参数,其中,T等于1。
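Optionally, in some embodiments, the processor 920 is specifically configured to determine the multi-channel parameter of the current frame based on the trend of the multi-channel parameters of the previous T frames, where T is greater than or equal to 2.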
可选地,在一些实施例中,所述特征参数包括所述当前帧的相关性参数和/或峰均比参数,所述相关性参数用于表征所述当前帧与所述当前帧的前一帧的相关程度,所述峰均比参数用于表征所述当前帧的多声道信号中的至少一个声道的信号的峰均比,所述第二预设条件为所述特征参数大于预设的阈值。
可选地,在一些实施例中,所述当前帧的初始多声道参数包括以下中的至少一种:所述当前帧的初始声道间相关性IC值,所述当前帧的初始声道间时间差ITD值,所述当前帧的初始声道间相位差IPD值,当前帧的初始整体相位差OPD值,以及所述当前帧的初始声道间电平差ILD值。
可选地,在一些实施例中,所述当前帧的特征参数包括所述当前帧的以下中的至少一种:相关性参数,峰均比参数,信噪比参数,以及谱倾斜参数,所述相关性参数用于表征所述当前帧与所述前一帧的相关程度,所述峰均比参数用于表征所述当前帧的多声道信号中的至少一个声道的信号的峰均比,所述信噪比参数用于表征所述当前帧的多声道信号中的至少一个声道的信号的信噪比,所述谱倾斜参数用于表征所述当前帧的多声道信号中的至少一个声道的信号的频谱倾斜程度。
本文中术语“和/或”表示可以存在三种关系。例如,A和/或B可以表示:单独存在A,同时存在A和B,单独存在B这三种情况。另外,本文中的字符“/”一般表示前后关联对象是一种“或”的关系。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。
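If the functions are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing descriptions are merely specific implementations of this application, but the protection scope of this application is not limited thereto. Any variation or replacement that a person skilled in the art can readily conceive of within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.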

Claims (28)

  1. 一种多声道信号的编码方法,其特征在于,包括:
    获取当前帧的多声道信号;
    确定所述当前帧的初始多声道参数;
    根据所述当前帧的初始多声道参数,以及所述当前帧的前K帧的多声道参数,确定差异参数,所述差异参数用于表征所述当前帧的初始多声道参数与所述前K帧的多声道参数的差异,其中,K为大于或等于1的整数;
    根据所述差异参数和所述当前帧的特征参数,确定所述当前帧的多声道参数;
    根据所述当前帧的多声道参数对所述多声道信号进行编码。
  2. 如权利要求1所述的方法,其特征在于,所述根据所述差异参数和所述当前帧的特征参数,确定所述当前帧的多声道参数,包括:
    在所述差异参数满足第一预设条件的情况下,根据所述当前帧的特征参数,确定所述当前帧的多声道参数。
  3. 如权利要求2所述的方法,其特征在于,所述差异参数为所述当前帧的初始多声道参数和所述当前帧的前一帧的多声道参数的差值的绝对值,所述第一预设条件为所述差异参数大于预设的第一阈值。
  4. 如权利要求2所述的方法,其特征在于,所述差异参数为所述当前帧的初始多声道参数和所述当前帧的前一帧的多声道参数的乘积,所述第一预设条件为所述差异参数小于或等于0。
  5. 如权利要求2-4中任一项所述的方法,其特征在于,所述根据所述当前帧的特征参数,确定所述当前帧的多声道参数,包括:
    根据所述当前帧的相关性参数,确定所述当前帧的多声道参数,其中,所述相关性参数用于表征所述当前帧与所述当前帧的前一帧的相关程度。
  6. 如权利要求5所述的方法,其特征在于,所述方法还包括:
    根据所述当前帧的多声道信号中的目标声道信号,以及所述前一帧的多声道信号中的目标声道信号,确定所述相关性参数。
  7. 如权利要求6所述的方法,其特征在于,所述根据所述当前帧的多声道信号中的目标声道信号,以及所述前一帧的多声道信号中的目标声道信号,确定所述相关性参数,包括:
    根据所述当前帧的多声道信号中的目标声道信号的频域参数,以及所述前一帧的多声道信号中的目标声道信号的频域参数,确定所述相关性参数,所述频域参数为所述目标声道信号的频域幅度值和频域系数中的至少一个。
  8. 如权利要求5所述的方法,其特征在于,所述方法还包括:
    根据所述当前帧的基音周期,以及所述前一帧的基音周期,确定所述相关性参数。
  9. 如权利要求2-8中任一项所述的方法,其特征在于,所述根据所述当前帧的特征参数,确定所述当前帧的多声道参数,包括:
    在所述特征参数满足第二预设条件的情况下,根据所述当前帧的前T帧的多声道参数,确定所述当前帧的多声道参数,T为大于或等于1的整数。
  10. 如权利要求9所述的方法,其特征在于,所述根据所述当前帧的前T帧的多声道参数,确定所述当前帧的多声道参数,包括:
    将所述前T帧的多声道参数确定为所述当前帧的多声道参数,其中,T等于1。
  11. 如权利要求9所述的方法,其特征在于,所述根据所述当前帧的前T帧的多声道参数,确定所述当前帧的多声道参数,包括:
    根据所述前T帧的多声道参数的变化趋势,确定所述当前帧的多声道参数,其中,T大于或等于2。
  12. 如权利要求9-11中任一项所述的方法,其特征在于,所述当前帧的特征参数包括所述当前帧的相关性参数和峰均比参数中的至少一个,所述相关性参数用于表征所述当前帧与所述当前帧的前一帧的相关程度,所述峰均比参数用于表征所述当前帧的多声道信号中的至少一个声道的信号的峰均比,所述第二预设条件为所述特征参数大于预设阈值。
  13. 如权利要求1-12中任一项所述的方法,其特征在于,所述当前帧的初始多声道参数包括以下中的至少一种:所述当前帧的初始声道间相关性IC值,所述当前帧的初始声道间时间差ITD值,所述当前帧的初始声道间相位差IPD值,当前帧的初始整体相位差OPD值,以及所述当前帧的初始声道间电平差ILD值。
  14. 如权利要求1-13中任一项所述的方法,其特征在于,所述当前帧的特征参数包括所述当前帧的以下中的至少一种:相关性参数,峰均比参数,信噪比参数,以及谱倾斜参数,所述相关性参数用于表征所述当前帧与所述前一帧的相关程度,所述峰均比参数用于表征所述当前帧的多声道信号中的至少一个声道的信号的峰均比,所述信噪比参数用于表征所述当前帧的多声道信号中的至少一个声道的信号的信噪比,所述谱倾斜参数用于表征所述当前帧的多声道信号中的至少一个声道的信号的频谱倾斜程度。
  15. 一种编码器,其特征在于,包括:
    获取单元,用于获取当前帧的多声道信号;
    第一确定单元,用于确定所述当前帧的初始多声道参数;
    第二确定单元,用于根据所述当前帧的初始多声道参数,以及所述当前帧的前K帧的多声道参数,确定差异参数,所述差异参数用于表征所述当前帧的初始多声道参数与所述前K帧的多声道参数的差异,其中,K为大于或等于1的整数;
    第三确定单元,用于根据所述差异参数和所述当前帧的特征参数,确定所述当前帧的多声道参数;
    编码单元,用于根据所述当前帧的多声道参数对所述多声道信号进行编码。
  16. 如权利要求15所述的编码器,其特征在于,所述第三确定单元具体用于在所述差异参数满足第一预设条件的情况下,根据所述当前帧的特征参数,确定所述当前帧的多声道参数。
  17. 如权利要求16所述的编码器,其特征在于,所述差异参数为所述当前帧的初始多声道参数和所述当前帧的前一帧的多声道参数的差值的绝对值,所述第一预设条件为所述差异参数大于预设的第一阈值。
  18. 如权利要求16所述的编码器,其特征在于,所述差异参数为所述当前帧的初始多声道参数和所述当前帧的前一帧的多声道参数的乘积,所述第一预设条件为所述差异参数小于或等于0。
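  19. The encoder according to any one of claims 16 to 18, wherein the third determining unit is specifically configured to determine the multi-channel parameter of the current frame based on a correlation parameter of the current frame, where the correlation parameter is used to characterize the degree of correlation between the current frame and the frame preceding the current frame.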
  20. 如权利要求19所述的编码器,其特征在于,所述编码器还包括:
    第四确定单元,用于根据所述当前帧的多声道信号中的目标声道信号,以及所述前一帧的多声道信号中的目标声道信号,确定所述相关性参数。
  21. 如权利要求20所述的编码器,其特征在于,所述第四确定单元具体用于根据所述当前帧的多声道信号中的目标声道信号的频域参数,以及所述前一帧的多声道信号中的目标声道信号的频域参数,确定所述相关性参数,所述频域参数为所述目标声道信号的频域幅度值和频域系数中的至少一个。
  22. 如权利要求19所述的编码器,其特征在于,所述编码器还包括:
    第五确定单元,用于根据所述当前帧的基音周期,以及所述前一帧的基音周期,确定所述相关性参数。
  23. 如权利要求16-22中任一项所述的编码器,其特征在于,所述第三确定单元具体用于在所述特征参数满足第二预设条件的情况下,根据所述当前帧的前T帧的多声道参数,确定所述当前帧的多声道参数,T为大于或等于1的整数。
  24. 如权利要求23所述的编码器,其特征在于,所述第三确定单元具体用于将所述前T帧的多声道参数确定为所述当前帧的多声道参数,其中,T等于1。
  25. 如权利要求23所述的编码器,其特征在于,所述第三确定单元具体用于根据所述前T帧的多声道参数的变化趋势,确定所述当前帧的多声道参数,其中,T大于或等于2。
  26. 如权利要求23-25中任一项所述的编码器,其特征在于,所述特征参数包括所述当前帧的相关性参数和峰均比参数中的至少一个,所述相关性参数用于表征所述当前帧与所述当前帧的前一帧的相关程度,所述峰均比参数用于表征所述当前帧的多声道信号中的至少一个声道的信号的峰均比,所述第二预设条件为所述特征参数大于预设阈值。
  27. 如权利要求15-26中任一项所述的编码器,其特征在于,所述当前帧的初始多声道参数包括以下中的至少一种:所述当前帧的初始声道间相关性IC值,所述当前帧的初始声道间时间差ITD值,所述当前帧的初始声道间相位差IPD值,当前帧的初始整体相位差OPD值,以及所述当前帧的初始声道间电平差ILD值。
  28. 如权利要求15-27中任一项所述的编码器,其特征在于,所述当前帧的特征参数包括所述当前帧的以下中的至少一种:相关性参数,峰均比参数,信噪比参数,以及谱倾斜参数,所述相关性参数用于表征所述当前帧与所述前一帧的相关程度,所述峰均比参数用于表征所述当前帧的多声道信号中的至少一个声道的信号的峰均比,所述信噪比参数用于表征所述当前帧的多声道信号中的至少一个声道的信号的信噪比,所述谱倾斜参数用于表征所述当前帧的多声道信号中的至少一个声道的信号的频谱倾斜程度。
PCT/CN2017/074419 2016-08-10 2017-02-22 多声道信号的编码方法和编码器 WO2018028170A1 (zh)

Priority Applications (17)

Application Number Priority Date Filing Date Title
JP2019507137A JP6768924B2 (ja) 2016-08-10 2017-02-22 マルチチャネル信号の符号化方法およびエンコーダ
CA3033225A CA3033225C (en) 2016-08-10 2017-02-22 Multi-channel signal encoding method and encoder
KR1020197005937A KR102205596B1 (ko) 2016-08-10 2017-02-22 다중 채널 신호 인코딩 방법 및 인코더
EP17838306.3A EP3493203B1 (en) 2016-08-10 2017-02-22 Method for encoding multi-channel signal and encoder
KR1020227005726A KR102486604B1 (ko) 2016-08-10 2017-02-22 다중 채널 신호 인코딩 방법 및 인코더
RU2019106315A RU2705427C1 (ru) 2016-08-10 2017-02-22 Способ кодирования многоканального сигнала и кодировщик
AU2017310759A AU2017310759B2 (en) 2016-08-10 2017-02-22 Multi-channel signal encoding method and encoder
BR112019002656-8A BR112019002656B1 (pt) 2016-08-10 2017-02-22 Método de codificação de sinal de canal múltiplo, codificador, e meio de armazenamento legível por computador
EP22179454.8A EP4120252A1 (en) 2016-08-10 2017-02-22 Multi-channel signal encoder and computer readable medium
ES17838306T ES2928335T3 (es) 2016-08-10 2017-02-22 Método para codificar señales multicanal y codificador
KR1020217001206A KR102367538B1 (ko) 2016-08-10 2017-02-22 다중 채널 신호 인코딩 방법 및 인코더
US16/272,397 US11133014B2 (en) 2016-08-10 2019-02-11 Multi-channel signal encoding method and encoder
AU2020267256A AU2020267256B2 (en) 2016-08-10 2020-11-12 Multi-channel signal encoding method and encoder
US17/408,116 US11935548B2 (en) 2016-08-10 2021-08-20 Multi-channel signal encoding method and encoder
AU2022218507A AU2022218507B2 (en) 2016-08-10 2022-08-17 Multi-channel signal encoding method and encoder
US18/419,794 US20240161756A1 (en) 2016-08-10 2024-01-23 Multi-Channel Signal Encoding Method and Encoder
AU2024205199A AU2024205199A1 (en) 2016-08-10 2024-07-30 Multi-channel signal encoding method and encoder

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610652506.X 2016-08-10
CN201610652506.XA CN107731238B (zh) 2016-08-10 2016-08-10 多声道信号的编码方法和编码器

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/272,397 Continuation US11133014B2 (en) 2016-08-10 2019-02-11 Multi-channel signal encoding method and encoder

Publications (1)

Publication Number Publication Date
WO2018028170A1 true WO2018028170A1 (zh) 2018-02-15

Family

ID=61161463

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/074419 WO2018028170A1 (zh) 2016-08-10 2017-02-22 多声道信号的编码方法和编码器

Country Status (10)

Country Link
US (3) US11133014B2 (zh)
EP (2) EP4120252A1 (zh)
JP (4) JP6768924B2 (zh)
KR (3) KR102486604B1 (zh)
CN (1) CN107731238B (zh)
AU (4) AU2017310759B2 (zh)
CA (1) CA3033225C (zh)
ES (1) ES2928335T3 (zh)
RU (1) RU2705427C1 (zh)
WO (1) WO2018028170A1 (zh)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020069219A1 (en) 2018-09-26 2020-04-02 Cala Health, Inc. Predictive therapy neurostimulation systems
US10765856B2 (en) 2015-06-10 2020-09-08 Cala Health, Inc. Systems and methods for peripheral nerve stimulation to treat tremor with detachable therapy and monitoring units
US10905879B2 (en) 2014-06-02 2021-02-02 Cala Health, Inc. Methods for peripheral nerve stimulation
US11331480B2 (en) 2017-04-03 2022-05-17 Cala Health, Inc. Systems, methods and devices for peripheral neuromodulation for treating diseases related to overactive bladder
US11344722B2 (en) 2016-01-21 2022-05-31 Cala Health, Inc. Systems, methods and devices for peripheral neuromodulation for treating diseases related to overactive bladder
US11596785B2 (en) 2015-09-23 2023-03-07 Cala Health, Inc. Systems and methods for peripheral nerve stimulation in the finger or hand to treat hand tremors
US11857778B2 (en) 2018-01-17 2024-01-02 Cala Health, Inc. Systems and methods for treating inflammatory bowel disease through peripheral nerve stimulation
US11890468B1 (en) 2019-10-03 2024-02-06 Cala Health, Inc. Neurostimulation systems with event pattern detection and classification

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107731238B (zh) * 2016-08-10 2021-07-16 华为技术有限公司 多声道信号的编码方法和编码器
CN108877815B (zh) * 2017-05-16 2021-02-23 华为技术有限公司 一种立体声信号处理方法及装置
CN110556118B (zh) * 2018-05-31 2022-05-10 华为技术有限公司 立体声信号的编码方法和装置
CN110556116B (zh) 2018-05-31 2021-10-22 华为技术有限公司 计算下混信号和残差信号的方法和装置
CN109243471B (zh) * 2018-09-26 2022-09-23 杭州联汇科技股份有限公司 一种快速编码广播用数字音频的方法
CN112233682B (zh) * 2019-06-29 2024-07-16 华为技术有限公司 一种立体声编码方法、立体声解码方法和装置
CN115346537A (zh) * 2021-05-14 2022-11-15 华为技术有限公司 一种音频编码、解码方法及装置
CN114365509B (zh) * 2021-12-03 2024-03-01 北京小米移动软件有限公司 一种立体声音频信号处理方法及设备/存储介质/装置
CN115691515A (zh) * 2022-07-12 2023-02-03 南京拓灵智能科技有限公司 一种音频编解码方法及装置

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1954642A (zh) * 2004-06-30 2007-04-25 德商弗朗霍夫应用研究促进学会 多信道合成器及产生多信道输出信号方法
CN101188878A (zh) * 2007-12-05 2008-05-28 武汉大学 一种立体声音频信号的空间参数量化及熵编码方法及其所用系统结构
CN102157151A (zh) * 2010-02-11 2011-08-17 华为技术有限公司 一种多声道信号编码方法、解码方法、装置和系统
CN104246873A (zh) * 2012-02-17 2014-12-24 华为技术有限公司 用于编码多声道音频信号的参数编码器
CN104641414A (zh) * 2012-07-19 2015-05-20 诺基亚公司 立体声音频信号编码器

Family Cites Families (53)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5659520A (en) * 1995-04-24 1997-08-19 Sonatech, Inc. Super short baseline navigation using phase-delay processing of spread-spectrum-coded reply signals
US6168568B1 (en) * 1996-10-04 2001-01-02 Karmel Medical Acoustic Technologies Ltd. Phonopneumograph system
ATE420432T1 (de) * 2000-04-24 2009-01-15 Qualcomm Inc Verfahren und vorrichtung zur prädiktiven quantisierung von stimmhaften sprachsignalen
ES2268340T3 (es) * 2002-04-22 2007-03-16 Koninklijke Philips Electronics N.V. Representacion de audio parametrico de multiples canales.
AU2003244932A1 (en) * 2002-07-12 2004-02-02 Koninklijke Philips Electronics N.V. Audio coding
ATE527654T1 (de) * 2004-03-01 2011-10-15 Dolby Lab Licensing Corp Mehrkanal-audiodecodierung
KR100745688B1 (ko) * 2004-07-09 2007-08-03 한국전자통신연구원 다채널 오디오 신호 부호화/복호화 방법 및 장치
SE0402650D0 (sv) 2004-11-02 2004-11-02 Coding Tech Ab Improved parametric stereo compatible coding of spatial audio
RU2393550C2 (ru) * 2005-06-30 2010-06-27 ЭлДжи ЭЛЕКТРОНИКС ИНК. Устройство и способ кодирования и декодирования звукового сигнала
RU2376656C1 (ru) * 2005-08-30 2009-12-20 ЭлДжи ЭЛЕКТРОНИКС ИНК. Способ кодирования и декодирования аудиосигнала и устройство для его осуществления
EP1953736A4 (en) * 2005-10-31 2009-08-05 Panasonic Corp STEREO CODING DEVICE AND METHOD FOR PREDICTING STEREO SIGNAL
US7839948B2 (en) * 2005-12-02 2010-11-23 Qualcomm Incorporated Time slicing techniques for variable data rate encoding
ATE448638T1 (de) * 2006-04-13 2009-11-15 Fraunhofer Ges Forschung Audiosignaldekorrelator
EP2063416B1 (en) * 2006-09-13 2011-11-16 Nippon Telegraph And Telephone Corporation Feeling detection method, feeling detection device, feeling detection program containing the method, and recording medium containing the program
KR101505831B1 (ko) * 2007-10-30 2015-03-26 삼성전자주식회사 멀티 채널 신호의 부호화/복호화 방법 및 장치
US8239210B2 (en) * 2007-12-19 2012-08-07 Dts, Inc. Lossless multi-channel audio codec
PL2301020T3 (pl) * 2008-07-11 2013-06-28 Fraunhofer Ges Forschung Urządzenie i sposób do kodowania/dekodowania sygnału audio z użyciem algorytmu przełączania aliasingu
EP2169665B1 (en) * 2008-09-25 2018-05-02 LG Electronics Inc. A method and an apparatus for processing a signal
US8666752B2 (en) * 2009-03-18 2014-03-04 Samsung Electronics Co., Ltd. Apparatus and method for encoding and decoding multi-channel signal
CN102307323B (zh) * 2009-04-20 2013-12-18 华为技术有限公司 对多声道信号的声道延迟参数进行修正的方法
CN101582262B (zh) * 2009-06-16 2011-12-28 武汉大学 一种空间音频参数帧间预测编解码方法
CN102025892A (zh) * 2009-09-16 2011-04-20 索尼株式会社 镜头转换检测方法及装置
CN102498515B (zh) * 2009-09-17 2014-06-18 延世大学工业学术合作社 处理音频信号的方法和设备
AU2010303039B9 (en) * 2009-09-29 2014-10-23 Dolby International Ab Audio signal decoder, audio signal encoder, method for providing an upmix signal representation, method for providing a downmix signal representation, computer program and bitstream using a common inter-object-correlation parameter value
PL2491551T3 (pl) * 2009-10-20 2015-06-30 Fraunhofer Ges Forschung Urządzenie do dostarczania reprezentacji sygnału upmixu w oparciu o reprezentację sygnału downmixu, urządzenie do dostarczania strumienia bitów reprezentującego wielokanałowy sygnał audio, sposoby, program komputerowy i strumień bitów wykorzystujący sygnalizację sterowania zniekształceniami
ES2656815T3 (es) * 2010-03-29 2018-02-28 Fraunhofer-Gesellschaft Zur Förderung Der Angewandten Forschung Procesador de audio espacial y procedimiento para proporcionar parámetros espaciales en base a una señal de entrada acústica
US9112591B2 (en) * 2010-04-16 2015-08-18 Samsung Electronics Co., Ltd. Apparatus for encoding/decoding multichannel signal and method thereof
US8305099B2 (en) 2010-08-31 2012-11-06 Nxp B.V. High speed full duplex test interface
KR101429564B1 (ko) * 2010-09-28 2014-08-13 후아웨이 테크놀러지 컴퍼니 리미티드 디코딩된 다중채널 오디오 신호 또는 디코딩된 스테레오 신호를 포스트프로세싱하기 위한 장치 및 방법
US9514757B2 (en) * 2010-11-17 2016-12-06 Panasonic Intellectual Property Corporation Of America Stereo signal encoding device, stereo signal decoding device, stereo signal encoding method, and stereo signal decoding method
PL2671222T3 (pl) * 2011-02-02 2016-08-31 Ericsson Telefon Ab L M Określanie międzykanałowej różnicy czasu wielokanałowego sygnału audio
US9117440B2 (en) * 2011-05-19 2015-08-25 Dolby International Ab Method, apparatus, and medium for detecting frequency extension coding in the coding history of an audio signal
CN102800317B (zh) * 2011-05-25 2014-09-17 华为技术有限公司 信号分类方法及设备、编解码方法及设备
JP6063555B2 (ja) * 2012-04-05 2017-01-18 華為技術有限公司Huawei Technologies Co.,Ltd. マルチチャネルオーディオエンコーダ及びマルチチャネルオーディオ信号を符号化する方法
US9601122B2 (en) * 2012-06-14 2017-03-21 Dolby International Ab Smooth configuration switching for multichannel audio
US20140086416A1 (en) * 2012-07-15 2014-03-27 Qualcomm Incorporated Systems, methods, apparatus, and computer-readable media for three-dimensional audio coding using basis function coefficients
KR20140017338A (ko) * 2012-07-31 2014-02-11 인텔렉추얼디스커버리 주식회사 오디오 신호 처리 장치 및 방법
EP2922052B1 (en) 2012-11-13 2021-10-13 Samsung Electronics Co., Ltd. Method for determining an encoding mode
WO2014108738A1 (en) * 2013-01-08 2014-07-17 Nokia Corporation Audio signal multi-channel parameter encoder
CN116665683A (zh) * 2013-02-21 2023-08-29 杜比国际公司 用于参数化多声道编码的方法
WO2014174344A1 (en) * 2013-04-26 2014-10-30 Nokia Corporation Audio signal encoder
US9412385B2 (en) * 2013-05-28 2016-08-09 Qualcomm Incorporated Performing spatial masking with respect to spherical harmonic coefficients
KR20160015280A (ko) * 2013-05-28 2016-02-12 노키아 테크놀로지스 오와이 오디오 신호 인코더
CN104282309A (zh) * 2013-07-05 2015-01-14 杜比实验室特许公司 丢包掩蔽装置和方法以及音频处理系统
EP2830052A1 (en) * 2013-07-22 2015-01-28 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audio decoder, audio encoder, method for providing at least four audio channel signals on the basis of an encoded representation, method for providing an encoded representation on the basis of at least four audio channel signals and computer program using a bandwidth extension
EP2838086A1 (en) * 2013-07-22 2015-02-18 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. In an reduction of comb filter artifacts in multi-channel downmix with adaptive phase alignment
CN104681029B (zh) * 2013-11-29 2018-06-05 华为技术有限公司 立体声相位参数的编码方法及装置
US9595269B2 (en) * 2015-01-19 2017-03-14 Qualcomm Incorporated Scaling for gain shape circuitry
EP3067886A1 (en) * 2015-03-09 2016-09-14 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audio encoder for encoding a multichannel signal and audio decoder for decoding an encoded audio signal
JP6721977B2 (ja) * 2015-12-15 2020-07-15 Panasonic Intellectual Property Corporation of America Audio/speech signal encoding device, audio/speech signal decoding device, audio/speech signal encoding method, and audio/speech signal decoding method
WO2017125559A1 (en) * 2016-01-22 2017-07-27 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatuses and methods for encoding or decoding an audio multi-channel signal using spectral-domain resampling
US9978381B2 (en) * 2016-02-12 2018-05-22 Qualcomm Incorporated Encoding of multiple audio signals
CN107731238B (zh) * 2016-08-10 2021-07-16 华为技术有限公司 Multi-channel signal encoding method and encoder

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
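CN1954642A (zh) * 2004-06-30 2007-04-25 德商弗朗霍夫应用研究促进学会 Multi-channel synthesizer and method for generating a multi-channel output signal
CN101188878A (zh) * 2007-12-05 2008-05-28 武汉大学 Method for spatial parameter quantization and entropy coding of stereo audio signals, and system structure used therefor
CN102157151A (zh) * 2010-02-11 2011-08-17 华为技术有限公司 Multi-channel signal encoding method, decoding method, apparatus, and system
CN104246873A (zh) * 2012-02-17 2014-12-24 华为技术有限公司 Parametric encoder for encoding a multi-channel audio signal
CN104641414A (zh) * 2012-07-19 2015-05-20 诺基亚公司 Stereo audio signal encoder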

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10905879B2 (en) 2014-06-02 2021-02-02 Cala Health, Inc. Methods for peripheral nerve stimulation
US10960207B2 (en) 2014-06-02 2021-03-30 Cala Health, Inc. Systems for peripheral nerve stimulation
US12109413B2 (en) 2014-06-02 2024-10-08 Cala Health, Inc. Systems and methods for peripheral nerve stimulation to treat tremor
US10765856B2 (en) 2015-06-10 2020-09-08 Cala Health, Inc. Systems and methods for peripheral nerve stimulation to treat tremor with detachable therapy and monitoring units
US11596785B2 (en) 2015-09-23 2023-03-07 Cala Health, Inc. Systems and methods for peripheral nerve stimulation in the finger or hand to treat hand tremors
US11344722B2 (en) 2016-01-21 2022-05-31 Cala Health, Inc. Systems, methods and devices for peripheral neuromodulation for treating diseases related to overactive bladder
US11918806B2 (en) 2016-01-21 2024-03-05 Cala Health, Inc. Systems, methods and devices for peripheral neuromodulation of the leg
US11331480B2 (en) 2017-04-03 2022-05-17 Cala Health, Inc. Systems, methods and devices for peripheral neuromodulation for treating diseases related to overactive bladder
US11857778B2 (en) 2018-01-17 2024-01-02 Cala Health, Inc. Systems and methods for treating inflammatory bowel disease through peripheral nerve stimulation
WO2020069219A1 (en) 2018-09-26 2020-04-02 Cala Health, Inc. Predictive therapy neurostimulation systems
EP4338662A3 (en) * 2018-09-26 2024-04-17 Cala Health, Inc. Predictive therapy neurostimulation systems
US11890468B1 (en) 2019-10-03 2024-02-06 Cala Health, Inc. Neurostimulation systems with event pattern detection and classification

Also Published As

Publication number Publication date
EP3493203A4 (en) 2019-06-19
JP2019527856A (ja) 2019-10-03
AU2022218507B2 (en) 2024-05-02
US20240161756A1 (en) 2024-05-16
ES2928335T3 (es) 2022-11-17
US11133014B2 (en) 2021-09-28
US20190172474A1 (en) 2019-06-06
CN107731238A (zh) 2018-02-23
AU2020267256B2 (en) 2022-05-26
KR20220028159A (ko) 2022-03-08
JP2021009399A (ja) 2021-01-28
CA3033225A1 (en) 2018-02-15
RU2705427C1 (ru) 2019-11-07
CN107731238B (zh) 2021-07-16
KR20190034302A (ko) 2019-04-01
EP4120252A1 (en) 2023-01-18
BR112019002656A2 (pt) 2019-05-28
CA3033225C (en) 2021-11-16
AU2017310759B2 (en) 2020-12-03
AU2017310759A1 (en) 2019-02-28
JP7091411B2 (ja) 2022-06-27
JP6768924B2 (ja) 2020-10-14
US11935548B2 (en) 2024-03-19
KR102486604B1 (ko) 2023-01-09
US20210383815A1 (en) 2021-12-09
EP3493203A1 (en) 2019-06-05
KR20210008566A (ko) 2021-01-22
JP2024063059A (ja) 2024-05-10
EP3493203B1 (en) 2022-07-27
KR102367538B1 (ko) 2022-02-24
KR102205596B1 (ko) 2021-01-20
JP2022137052A (ja) 2022-09-21
AU2024205199A1 (en) 2024-08-15
JP7443423B2 (ja) 2024-03-05
AU2022218507A1 (en) 2022-09-08
AU2020267256A1 (en) 2020-12-10

Similar Documents

Publication Publication Date Title
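WO2018028170A1 (zh) Multi-channel signal encoding method and encoder
WO2018028171A1 (zh) Multi-channel signal encoding method and encoder
WO2017206794A1 (zh) Method and apparatus for extracting an inter-channel phase difference parameter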

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 17838306; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 3033225; Country of ref document: CA)
ENP Entry into the national phase (Ref document number: 2019507137; Country of ref document: JP; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)
ENP Entry into the national phase (Ref document number: 20197005937; Country of ref document: KR; Kind code of ref document: A)
ENP Entry into the national phase (Ref document number: 2017310759; Country of ref document: AU; Date of ref document: 20170222; Kind code of ref document: A)
ENP Entry into the national phase (Ref document number: 2017838306; Country of ref document: EP; Effective date: 20190227)
REG Reference to national code (Ref country code: BR; Ref legal event code: B01A; Ref document number: 112019002656; Country of ref document: BR)
ENP Entry into the national phase (Ref document number: 112019002656; Country of ref document: BR; Kind code of ref document: A2; Effective date: 20190208)