WO2018028170A1 - Encoding method and encoder for multi-channel signal - Google Patents
Encoding method and encoder for multi-channel signal
- Publication number
- WO2018028170A1 (international application PCT/CN2017/074419)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- parameter
- current frame
- channel
- signal
- frame
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/022—Blocking, i.e. grouping of samples in time; Choice of analysis windows; Overlap factoring
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/032—Quantisation or dequantisation of spectral components
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S3/00—Systems employing more than two channels, e.g. quadraphonic
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2420/00—Techniques used stereophonic systems covered by H04S but not provided for in its groups
- H04S2420/01—Enhancing the perception of the sound image or of the spatial distribution using head related transfer functions [HRTF's] or equivalents thereof, e.g. interaural time difference [ITD] or interaural level difference [ILD]
Definitions
- the present application relates to the field of audio signal coding, and more particularly to an encoding method and encoder for a multi-channel signal.
- Stereo conveys the direction and spatial distribution of each sound source, which improves the clarity, intelligibility and sense of presence of the sound, and is therefore widely favored.
- Stereo processing techniques mainly include Mid/Side (MS) encoding, Intensity Stereo (IS) encoding, and Parametric Stereo (PS) encoding.
- MS coding applies a sum/difference transform to the two channel signals based on the inter-channel correlation.
- The energy of the channels is then concentrated mainly in the sum channel, so that the inter-channel redundancy is removed.
- In MS coding, the rate saving depends on the correlation of the input signals. When the correlation between the left and right channel signals is poor, the left channel signal and the right channel signal need to be transmitted separately.
- IS coding exploits the fact that the human auditory system is insensitive to the phase difference of the high-frequency components of the channels (for example, components above 2 kHz), and simplifies the high-frequency components of the left and right signals accordingly.
- IS coding is effective only for high-frequency components. Extending it to low frequencies causes severe audible artifacts.
- PS coding is based on a binaural auditory model. As shown in Figure 1 (where x_L is the left channel time domain signal and x_R is the right channel time domain signal), during PS encoding the encoding end converts the stereo signal into a mono signal and a small number of spatial parameters (also called spatial perception parameters) that describe the spatial sound field. As shown in Figure 2, after the decoder receives the mono signal and the spatial parameters, it recovers the stereo signal from the mono signal in combination with the spatial parameters. Compared with MS coding, PS coding has a higher compression ratio and can therefore obtain a higher coding gain while maintaining good sound quality. In addition, PS coding can work over the full audio bandwidth and can restore the spatial perception of stereo well.
- The multi-channel parameters include the inter-channel coherence (IC), the inter-channel level difference (ILD), the inter-channel time difference (ITD), and the inter-channel phase difference (IPD).
- the IC describes the cross-correlation or coherence between channels, which determines the perception of the sound field range and improves the spatial and acoustic stability of the audio signal.
- ILD is used to distinguish the horizontal direction of the stereo source and describes the energy difference between the channels, which will affect the frequency content of the entire spectrum.
- ITD and IPD are spatial parameters that represent the horizontal orientation of the sound source and describe the difference in time and phase between the channels. ILD, ITD and IPD can determine the human ear's perception of the sound source position, can effectively determine the sound field position, and play an important role in the recovery of stereo signals.
- The multi-channel parameters calculated according to the existing PS coding method are often unstable (the parameter values jump back and forth). If the downmix signal is calculated from such multi-channel parameters, the downmix signal will be discontinuous, and as a result the stereo quality obtained by the decoder is poor. For example, the stereo image played at the decoder end shakes frequently, and audible clicks may even occur.
- the present application provides an encoding method and an encoder for a multi-channel signal to improve the stability of multi-channel parameters in PS encoding, thereby improving the encoding quality of the audio signal.
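- a method for encoding a multi-channel signal, including: acquiring a multi-channel signal of a current frame; determining an initial multi-channel parameter of the current frame;
- determining a difference parameter according to the initial multi-channel parameter of the current frame and the multi-channel parameters of the previous K frames of the current frame, the difference parameter being used to represent the difference between the initial multi-channel parameter of the current frame and the multi-channel parameters of the previous K frames, where K is an integer greater than or equal to 1;
- determining a multi-channel parameter of the current frame according to the difference parameter and a feature parameter of the current frame; and encoding the multi-channel signal according to the multi-channel parameter of the current frame.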
- The multi-channel parameter of the current frame is determined after comprehensively considering both the difference between the current frame and the previous K frames and the feature parameter of the current frame, which is a more reasonable way of determining it. Compared with directly reusing the multi-channel parameter of the previous frame for the current frame, this better ensures the accuracy of the inter-channel information of the multi-channel signal.
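- the determining, according to the difference parameter and the feature parameter of the current frame, of the multi-channel parameter of the current frame includes: when the difference parameter satisfies a first preset condition, determining the multi-channel parameter of the current frame according to the feature parameter of the current frame.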
- the difference parameter is an absolute value of a difference between an initial multi-channel parameter of the current frame and a multi-channel parameter of a previous frame of the current frame
- the first preset condition is that the difference parameter is greater than a preset first threshold.
- the difference parameter is a product of an initial multi-channel parameter of the current frame and a multi-channel parameter of a previous frame of the current frame,
- the first preset condition is that the difference parameter is less than or equal to zero.
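The two variants of the difference parameter and of the first preset condition described above can be sketched as follows. This is a minimal illustration, not the application's implementation: the function names, and the choice of selecting a variant by passing or omitting `threshold`, are assumptions made for the example.

```python
from typing import Optional


def abs_difference(initial_param: float, previous_param: float) -> float:
    """Variant 1: absolute difference between the current frame's initial
    parameter and the previous frame's parameter."""
    return abs(initial_param - previous_param)


def sign_product(initial_param: float, previous_param: float) -> float:
    """Variant 2: product of the two parameters. A value <= 0 means the
    parameter changed sign or one of the two values is zero."""
    return initial_param * previous_param


def first_condition_met(initial_param: float,
                        previous_param: float,
                        threshold: Optional[float] = None) -> bool:
    """Return True when the first preset condition holds.

    With a threshold (hypothetical parameter), variant 1 is used:
    the absolute difference must exceed the threshold.
    Without a threshold, variant 2 is used: the product must be <= 0.
    """
    if threshold is not None:
        return abs_difference(initial_param, previous_param) > threshold
    return sign_product(initial_param, previous_param) <= 0
```

For an ITD expressed in samples, for example, `first_condition_met(10, 2, threshold=5)` flags a jump of 8 samples, while `first_condition_met(4, -3)` flags a sign change.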
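- the determining, according to the feature parameter of the current frame, of the multi-channel parameter of the current frame includes: determining the multi-channel parameter of the current frame according to a correlation parameter of the current frame, where the correlation parameter is used to characterize the degree of correlation between the current frame and the previous frame of the current frame.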
- the method further comprises:
- the correlation parameter is determined according to a target channel signal in the multi-channel signal of the current frame and a target channel signal in the multi-channel signal of the previous frame.
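- the determining of the correlation parameter according to the target channel signal in the multi-channel signal of the current frame and the target channel signal in the multi-channel signal of the previous frame includes: determining the correlation parameter according to a frequency domain parameter of the target channel signal in the multi-channel signal of the current frame and a frequency domain parameter of the target channel signal in the multi-channel signal of the previous frame,
- where the frequency domain parameter is at least one of a frequency domain amplitude value and a frequency domain coefficient of the target channel signal.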
- the method further comprises:
- the correlation parameter is determined according to a pitch period of the current frame and a pitch period of the previous frame.
- the determining, according to the feature parameter of the current frame, of the multi-channel parameter of the current frame includes: when the feature parameter satisfies a second preset condition, determining the multi-channel parameter of the current frame according to the multi-channel parameters of the previous T frames of the current frame, where T is an integer greater than or equal to 1.
- the determining, according to the multi-channel parameters of the previous T frames of the current frame, of the multi-channel parameter of the current frame includes:
- determining the multi-channel parameter of the previous T frame as the multi-channel parameter of the current frame, where T is equal to 1.
- alternatively, the determining, according to the multi-channel parameters of the previous T frames of the current frame, of the multi-channel parameter of the current frame includes:
- determining the multi-channel parameter of the current frame according to a change trend of the multi-channel parameters of the previous T frames, where T is greater than or equal to 2.
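The two cases above (T equal to 1, and T greater than or equal to 2) can be sketched as follows. The text only says that a "change trend" is used when T is at least 2; the linear extrapolation by the mean frame-to-frame step used here is an assumption chosen for illustration, and `param_from_history` is a hypothetical name.

```python
from typing import Sequence


def param_from_history(history: Sequence[float]) -> float:
    """Derive the current frame's parameter from the previous T frames.

    `history` holds the parameters of the previous T frames, oldest first.
    T == 1: reuse the previous frame's parameter unchanged.
    T >= 2: continue the trend (assumed here to be the mean step between
            consecutive frames) from the most recent value.
    """
    if not history:
        raise ValueError("at least one previous frame is required")
    if len(history) == 1:
        return history[-1]
    steps = [later - earlier for earlier, later in zip(history, history[1:])]
    mean_step = sum(steps) / len(steps)
    return history[-1] + mean_step
```

With a flat history the extrapolation reduces to plain reuse of the previous value, so the T = 1 case is a special case of the same behavior.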
- the feature parameter includes at least one of a correlation parameter and a peak-to-average ratio parameter of the current frame,
- the correlation parameter is used to characterize the degree of correlation between the current frame and the previous frame of the current frame,
- the peak-to-average ratio parameter is used to represent the peak-to-average ratio of the signal of at least one channel of the multi-channel signal of the current frame, and
- the second preset condition is that the feature parameter is greater than a preset threshold.
- the initial multi-channel parameter of the current frame includes at least one of: an initial inter-channel correlation (IC) value of the current frame, an initial inter-channel time difference (ITD) value of the current frame, an initial inter-channel phase difference (IPD) value of the current frame, an initial overall phase difference (OPD) value of the current frame, and an initial inter-channel level difference (ILD) value of the current frame.
- the feature parameter of the current frame includes at least one of the following parameters of the current frame: a correlation parameter, a peak-to-average ratio parameter, a signal-to-noise ratio parameter, and a spectral tilt parameter, the correlation parameter being used to characterize the degree of correlation between the current frame and the previous frame, the peak-to-average ratio parameter being used to characterize the peak-to-average ratio of the signal of at least one channel of the multi-channel signal of the current frame, the signal-to-noise ratio parameter being used to characterize the signal-to-noise ratio of the signal of at least one channel of the multi-channel signal of the current frame, and the spectral tilt parameter being used to characterize the degree of spectral tilt of the signal of at least one channel of the multi-channel signal of the current frame.
- an encoder including:
- An acquiring unit configured to acquire a multi-channel signal of a current frame
- a first determining unit configured to determine an initial multi-channel parameter of the current frame
- a second determining unit configured to determine a difference parameter according to an initial multi-channel parameter of the current frame, and a multi-channel parameter of a front K frame of the current frame, where the difference parameter is used to represent the current frame a difference between an initial multi-channel parameter and a multi-channel parameter of the pre-K frame, wherein K is an integer greater than or equal to 1;
- a third determining unit configured to determine, according to the difference parameter and a feature parameter of the current frame, a multi-channel parameter of the current frame
- a coding unit configured to encode the multi-channel signal according to the multi-channel parameter of the current frame.
- The multi-channel parameter of the current frame is determined after comprehensively considering both the difference between the current frame and the previous K frames and the feature parameter of the current frame, which is a more reasonable way of determining it. Compared with directly reusing the multi-channel parameter of the previous frame for the current frame, this better ensures the accuracy of the inter-channel information of the multi-channel signal.
- the third determining unit is specifically configured to: when the difference parameter satisfies a first preset condition, determine the multi-channel parameter of the current frame according to the feature parameter of the current frame.
- the difference parameter is an absolute value of a difference between an initial multi-channel parameter of the current frame and a multi-channel parameter of a previous frame of the current frame
- the first preset condition is that the difference parameter is greater than a preset first threshold.
- the difference parameter is a product of an initial multi-channel parameter of the current frame and a multi-channel parameter of a previous frame of the current frame,
- the first preset condition is that the difference parameter is less than or equal to zero.
- the third determining unit is specifically configured to determine the multi-channel parameter of the current frame according to a correlation parameter of the current frame, where the correlation parameter is used to characterize the degree of correlation between the current frame and the previous frame of the current frame.
- the encoder further includes:
- a fourth determining unit configured to determine the correlation parameter according to the target channel signal in the multi-channel signal of the current frame and the target channel signal in the multi-channel signal of the previous frame.
- the fourth determining unit is specifically configured to determine the correlation parameter according to a frequency domain parameter of the target channel signal in the multi-channel signal of the current frame and a frequency domain parameter of the target channel signal in the multi-channel signal of the previous frame, the frequency domain parameter being at least one of a frequency domain amplitude value and a frequency domain coefficient of the target channel signal.
- the encoder further includes:
- a fifth determining unit configured to determine the correlation parameter according to a pitch period of the current frame and a pitch period of the previous frame.
- the third determining unit is specifically configured to: when the feature parameter satisfies a second preset condition, determine the multi-channel parameter of the current frame according to the multi-channel parameters of the previous T frames of the current frame, T being an integer greater than or equal to 1.
- the third determining unit is specifically configured to determine the multi-channel parameter of the previous T frame as the multi-channel parameter of the current frame, where T is equal to 1.
- the third determining unit is specifically configured to determine the multi-channel parameter of the current frame according to a change trend of the multi-channel parameters of the previous T frames, where T is greater than or equal to 2.
- the feature parameter includes at least one of a correlation parameter and a peak-to-average ratio parameter of the current frame,
- the correlation parameter is used to represent the degree of correlation between the current frame and the previous frame of the current frame,
- the peak-to-average ratio parameter is used to represent the peak-to-average ratio of the signal of at least one channel of the multi-channel signal of the current frame, and
- the second preset condition is that the feature parameter is greater than a preset threshold.
- the initial multi-channel parameter of the current frame includes at least one of: an initial inter-channel correlation (IC) value of the current frame, an initial inter-channel time difference (ITD) value of the current frame, an initial inter-channel phase difference (IPD) value of the current frame, an initial overall phase difference (OPD) value of the current frame, and an initial inter-channel level difference (ILD) value of the current frame.
- the feature parameter of the current frame includes at least one of the following parameters of the current frame: a correlation parameter, a peak-to-average ratio parameter, a signal-to-noise ratio parameter, and a spectral tilt parameter,
- the correlation parameter being used to represent the degree of correlation between the current frame and the previous frame,
- the peak-to-average ratio parameter being used to represent the peak-to-average ratio of the signal of at least one channel of the multi-channel signal of the current frame,
- the signal-to-noise ratio parameter being used to represent the signal-to-noise ratio of the signal of at least one channel of the multi-channel signal of the current frame, and
- the spectral tilt parameter being used to characterize the degree of spectral tilt of the signal of at least one channel of the multi-channel signal of the current frame.
- an encoder comprising a memory and a processor, the memory being used to store a program and the processor being used to execute the program, where, when the program is executed, the processor performs the method of the first aspect.
- a computer readable medium storing program code for execution by an encoder, the program code comprising instructions for performing the method of the first aspect.
- The multi-channel parameter of the current frame is determined after comprehensively considering both the difference between the current frame and the previous K frames and the feature parameter of the current frame, which is a more reasonable way of determining it. Compared with directly reusing the multi-channel parameter of the previous frame for the current frame, this better ensures the accuracy of the inter-channel information of the multi-channel signal.
- FIG. 3 is an exemplary flow chart of a time domain based ITD parameter extraction method in the prior art.
- FIG. 4 is an exemplary flow chart of a frequency domain based ITD parameter extraction method in the prior art.
- FIG. 5 is a schematic flowchart of a method for encoding a multi-channel signal according to an embodiment of the present application.
- Figure 6 is a detailed flow diagram of step 540 of Figure 5.
- FIG. 7 is a schematic flowchart of a method for encoding a multi-channel signal according to an embodiment of the present application.
- FIG. 8 is a schematic block diagram of an encoder according to an embodiment of the present application.
- FIG. 9 is a schematic structural diagram of an encoder according to an embodiment of the present application.
- the stereo signal can also be referred to as a multi-channel signal.
- the function and meaning of the multi-channel parameters ILD, ITD and IPD of the multi-channel signal are briefly introduced.
- Taking the signal picked up by a first microphone as the first channel signal and the signal picked up by a second microphone as the second channel signal as an example, ILD, ITD and IPD are described in more detail below.
- The ILD describes the energy difference between the first channel signal and the second channel signal. It is typically calculated as the ratio of the energies of the two channels and then converted to the logarithmic domain. For example, if the ILD value is greater than 0, the energy of the first channel signal is higher than that of the second channel signal; if the ILD value is equal to 0, the two energies are equal; if the ILD value is less than 0, the energy of the first channel signal is lower than that of the second channel signal.
- Alternatively, the opposite sign convention may be used: if the ILD is less than 0, the energy of the first channel signal is higher than that of the second channel signal; if the ILD is equal to 0, the two energies are equal; if the ILD is greater than 0, the energy of the first channel signal is lower than that of the second channel signal. It should be understood that the above numerical values are merely examples, and the relationship between the value of the ILD and the energy difference between the first channel signal and the second channel signal may be defined according to experience or actual needs.
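The energy-ratio definition of the ILD can be sketched as below. The text only states that the energy ratio is converted to the logarithmic domain; the decibel scaling (10·log10), the small `eps` guard against silent frames, and the function name `ild_db` are assumptions made for the example. The sketch uses the first sign convention (positive when the first channel carries more energy).

```python
import math
from typing import Sequence


def ild_db(first: Sequence[float], second: Sequence[float],
           eps: float = 1e-12) -> float:
    """Inter-channel level difference of one frame, in dB.

    Energy is the sum of squared samples of each channel. `eps` keeps the
    ratio finite when a channel is silent (hypothetical safeguard).
    """
    energy_first = sum(s * s for s in first)
    energy_second = sum(s * s for s in second)
    return 10.0 * math.log10((energy_first + eps) / (energy_second + eps))
```

Doubling the amplitude of the first channel quadruples its energy, which gives an ILD of about 6.02 dB under these assumptions.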
- The ITD describes the time difference between the first channel signal and the second channel signal, that is, the difference between the times at which the sound generated by the sound source reaches the first microphone and the second microphone. For example, if the ITD value is greater than 0, the sound generated by the sound source reaches the first microphone earlier than it reaches the second microphone; if the ITD value is equal to 0, the sound reaches the first microphone and the second microphone at the same time; if the ITD value is less than 0, the sound reaches the first microphone later than it reaches the second microphone.
- Alternatively, the opposite sign convention may be used: if the ITD is less than 0, the sound generated by the sound source reaches the first microphone earlier than it reaches the second microphone; if the ITD is equal to 0, the sound reaches the first microphone and the second microphone at the same time; if the ITD is greater than 0, the sound reaches the first microphone later than it reaches the second microphone. It should be understood that the above values are merely examples, and the relationship between the value of the ITD and the time difference between the first channel signal and the second channel signal may be defined according to experience or actual needs.
- the IPD describes the phase difference between the first channel signal and the second channel signal, which is usually combined with the ITD for the decoder to recover the phase information of the multi-channel signal.
- the calculation method of the existing multi-channel parameters may cause the phenomenon that the multi-channel parameters are discontinuous.
- Taking left and right channel signals as the multi-channel signal and the ITD value as the multi-channel parameter as an example, the calculation methods of the existing multi-channel parameters and their shortcomings are described in detail below with reference to FIG. 3 and FIG. 4.
- the ITD value can be calculated in various ways, for example, the ITD value can be calculated in the time domain, or the ITD value can be calculated in the frequency domain.
- FIG. 3 is an exemplary flowchart of a time domain based ITD value calculation method.
- the method of Figure 3 includes:
- the ITD parameter can be calculated by using a time domain cross-correlation function based on the left and right channel time domain signals, for example, in the range 0 ≤ i ≤ T_max, by calculating:
- If max(C_n(i)) > max(C_p(i)), T_1 takes the opposite of the index value corresponding to max(C_n(i)); otherwise T_1 takes the index value corresponding to max(C_p(i)). Here i is the index value of the computed cross-correlation function, x_R is the right channel time domain signal, x_L is the left channel time domain signal, T_max corresponds to the maximum ITD value at different sampling rates, and Length is the frame length.
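The cross-correlation expressions themselves are not reproduced in this text. The sketch below assumes a common form, C_n(i) = Σ x_R(j)·x_L(j+i) and C_p(i) = Σ x_L(j)·x_R(j+i) for j = 0 … Length-1-i, and assumes that the negated index is chosen when the maximum of C_n exceeds the maximum of C_p. Both points are assumptions, and `itd_time_domain` is a hypothetical name.

```python
from typing import Sequence


def itd_time_domain(x_l: Sequence[float], x_r: Sequence[float],
                    t_max: int) -> int:
    """Estimate the ITD (in samples) of one frame by time domain
    cross-correlation over lags 0..t_max in both directions."""
    length = len(x_l)
    if len(x_r) != length:
        raise ValueError("both channels must have the same frame length")
    if not 0 <= t_max < length:
        raise ValueError("t_max must lie in [0, frame length)")

    def xcorr(a: Sequence[float], b: Sequence[float], lag: int) -> float:
        # Correlate a(j) with b(j + lag) over the overlapping samples.
        return sum(a[j] * b[j + lag] for j in range(length - lag))

    c_n = [xcorr(x_r, x_l, i) for i in range(t_max + 1)]  # assumed C_n(i)
    c_p = [xcorr(x_l, x_r, i) for i in range(t_max + 1)]  # assumed C_p(i)

    best_n = max(range(t_max + 1), key=c_n.__getitem__)
    best_p = max(range(t_max + 1), key=c_p.__getitem__)

    # Assumed branch rule: negate the index when C_n holds the larger peak.
    if c_n[best_n] > c_p[best_p]:
        return -best_n
    return best_p
```

With an impulse at sample 5 in the left channel and at sample 8 in the right channel, this sketch returns 3. Swapping the channels returns -3.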
- FIG. 4 is an exemplary flow chart of a frequency domain based ITD value calculation method.
- the method of Figure 4 includes:
- the time-frequency transform may use techniques such as the Discrete Fourier Transform (DFT) or the Modified Discrete Cosine Transform (MDCT) to transform the time domain signal into a frequency domain signal.
- the time-frequency transform may employ a DFT transform, and specifically, the DFT transform may be performed using the following formula.
- n is the index value of the sample of the time domain signal
- k is the index value of the frequency point of the frequency domain signal
- L is the time frequency transform length.
- x(n) is the left channel time domain signal or the right channel time domain signal.
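The formula itself is not reproduced in this text. For reference, the standard DFT written with the symbols defined above (n, k, L and x(n)) has the following form. The application's exact expression, including any scaling or windowing, may differ:

```latex
X(k) = \sum_{n=0}^{L-1} x(n)\, e^{-j \frac{2\pi n k}{L}}, \qquad 0 \le k < L
```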
- the L frequency bins of the frequency domain signal may be divided into a plurality of subbands, and the b-th subband includes the frequency bins A_{b-1} ≤ k ≤ A_b - 1.
- Within the search range -T_max ≤ j ≤ T_max, the amplitude can be calculated using the following formula:
- The ITD value of the b-th subband can then be taken as the index value of the sample corresponding to the maximum value calculated by the above formula.
- In some cases the calculated ITD value is considered to be inaccurate, and the ITD value of the current frame is then set to zero. Under the influence of factors such as background noise, reverberation, and several people speaking at once, the ITD value calculated according to the existing PS coding method may frequently be set to zero, causing the ITD value to jump back and forth.
- A downmix signal calculated using such ITD values may be discontinuous between frames, resulting in poor auditory quality of the multi-channel signal.
- One feasible processing method is as follows: when the calculated multi-channel parameter of the current frame is considered to be inaccurate, the multi-channel parameter of the previous frame of the current frame is reused.
- This processing solves the problem of the multi-channel parameter jumping back and forth well.
- However, it may cause the following problem: if the signal quality of the current frame is good, the calculated multi-channel parameter of the current frame is generally accurate. If the above processing is still applied in this case, the current frame may still reuse the multi-channel parameter of the previous frame and discard its own relatively accurate multi-channel parameter, which makes the inter-channel information of the multi-channel signal inaccurate.
- FIG. 5 is a schematic flowchart of a method for encoding a multi-channel signal according to an embodiment of the present application.
- the method of Figure 5 includes:
- the number of multi-channel signals is not specifically limited in the embodiment of the present application.
- the multi-channel signal may be a two-channel signal, a three-channel signal, or a signal with more than three channels.
- the multi-channel signal may include a left channel signal and a right channel signal.
- the multi-channel signal can include a left channel signal, a center channel signal, a right channel signal, and a back channel signal.
- the initial multi-channel parameters of the current frame can be used to characterize the correlation between the multi-channel signals.
- the initial multi-channel parameters of the current frame include at least one of: an initial IC value of the current frame, an initial ITD value of the current frame, an initial IPD value of the current frame, an initial OPD value of the current frame, an initial ILD value of the current frame, and so on.
- step 520 may use the time domain based ITD value calculation method shown in FIG. 3, may use the frequency domain based ITD value calculation method shown in FIG. 4, or may use a mixed domain (time domain + frequency domain) based ITD value calculation method according to the following formula:
- where L_i(f) represents the frequency domain coefficient of the left channel frequency domain signal,
- argmax() denotes taking the index at which the maximum of multiple values occurs, and
- IDFT() denotes the inverse discrete Fourier transform.
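The mixed-domain formula is not reproduced in this text. A common reading of the symbols listed above is a cross-spectrum search of the form ITD = argmax(IDFT(L_i(f)·R_i*(f))), where R_i(f) would be the right channel frequency domain coefficient and * the complex conjugate. That form, the symbol R_i(f), and the names used below are assumptions. The plain O(N²) transforms are written out only to keep the sketch free of external libraries.

```python
import cmath
from typing import List, Sequence


def dft(x: Sequence[float]) -> List[complex]:
    """Direct discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * t * k / n)
                for t in range(n)) for k in range(n)]


def idft(spectrum: Sequence[complex]) -> List[complex]:
    """Direct inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * t * k / n)
                for k in range(n)) / n for t in range(n)]


def itd_mixed_domain(x_l: Sequence[float], x_r: Sequence[float],
                     t_max: int) -> int:
    """Return the lag in [-t_max, t_max] that maximizes the circular
    cross-correlation obtained from the cross-spectrum L(f) * conj(R(f))."""
    n = len(x_l)
    if len(x_r) != n:
        raise ValueError("both channels must have the same frame length")
    if t_max < 0 or 2 * t_max >= n:
        raise ValueError("t_max must satisfy 0 <= 2 * t_max < frame length")
    cross = [l * r.conjugate() for l, r in zip(dft(x_l), dft(x_r))]
    corr = [value.real for value in idft(cross)]
    # Negative lags wrap around to the end of the circular correlation.
    return max(range(-t_max, t_max + 1), key=lambda lag: corr[lag % n])
```

In this sketch a right channel delayed by 3 samples relative to the left yields -3. The sign depends on which channel is conjugated, so it is a convention rather than a property of the method.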
- the previous K frames of the current frame refer to the K frames immediately preceding the current frame among all the frames of the audio signal to be encoded.
- the previous K frames appearing below refer to the previous K frames of the current frame,
- and the previous frame appearing below refers to the previous frame of the current frame.
- the representation of the multi-channel parameters may be a numerical value, and therefore, the multi-channel parameters may also be referred to as multi-channel parameter values.
- the feature parameters of the current frame may include mono parameters of the current frame, which may be used to characterize the characteristics of a signal of a certain one of the multi-channel signals of the current frame.
- determining the multi-channel parameters of the current frame as described in step 540 may include correcting the initial multi-channel parameters to obtain the multi-channel parameters of the current frame. Taking the case in which the feature parameter of the current frame is the mono parameter of the current frame as an example, step 540 may include: correcting the initial multi-channel parameters of the current frame according to the difference parameter and the mono parameter of the current frame to obtain the multi-channel parameters of the current frame.
- the feature parameters of the current frame include at least one of the following parameters of the current frame: a correlation parameter, a peak-to-average ratio parameter, a signal to noise ratio parameter, and a spectral tilt parameter.
- the correlation parameter is used to represent the degree of correlation between the current frame and the previous frame
- the peak-to-average ratio parameter is used to represent the peak-to-average ratio of the signal of at least one channel of the multi-channel signal of the current frame
- the signal-to-noise ratio parameter is used to represent the signal-to-noise ratio of the signal of at least one channel of the multi-channel signal of the current frame
- the spectral tilt parameter is used to represent the spectral tilt, or the spectral energy trend, of the signal of at least one channel of the multi-channel signal of the current frame.
- operations such as mono audio coding, spatial parameter coding, and bit stream multiplexing shown in FIG. 1 may be performed.
- for the specific coding method, reference may be made to the prior art.
- the multi-channel parameter of the current frame is determined after jointly considering the difference between the current frame and the previous K frames and the feature parameters of the current frame, which is a more reasonable way of determining it. Compared with having the current frame directly reuse the multi-channel parameters of the previous frame, this better ensures the accuracy of the inter-channel information of the multi-channel signal.
- The implementation of step 540 is described in detail below.
- step 540 may include: if the difference parameter satisfies the first preset condition, adjusting the value of the initial multi-channel parameter of the current frame according to the value of the feature parameter of the current frame, to obtain the multi-channel parameter of the current frame.
- alternatively, step 540 may include: if the feature parameter of the current frame satisfies the first preset condition, adjusting the value of the initial multi-channel parameter of the current frame according to the value of the difference parameter, to obtain the multi-channel parameter of the current frame.
- the first preset condition may be a single condition or a combination of multiple conditions.
- in addition, when the first preset condition is satisfied, the determination may continue in combination with other conditions, and the subsequent steps are performed only if all the conditions are met.
- step 540 may include:
- there are multiple ways to define the difference parameter, and different definitions of the difference parameter may correspond to different first preset conditions.
- the difference parameter and its corresponding first preset condition are described in detail below.
- the difference parameter may be the difference between the initial multi-channel parameter of the current frame and the multi-channel parameter of the previous frame, or the absolute value of that difference;
- the first preset condition may be that the difference parameter is greater than a preset first threshold.
- the first threshold may be 0.3 to 0.7 times the target value.
- for example, the first threshold may be 0.5 times the target value, where the target value is whichever of the multi-channel parameter of the previous frame and the initial multi-channel parameter of the current frame has the larger absolute value.
- the difference parameter may be the difference between the initial multi-channel parameter of the current frame and the mean of the multi-channel parameters of the previous K frames, or the absolute value of that difference;
- the first preset condition may be that the difference parameter is greater than a preset first threshold, which may be 0.3 to 0.7 times the target value.
- the first threshold may be 0.5 times the target value, where the target value is a multi-channel parameter of the previous frame.
- the difference parameter may be a product of an initial multi-channel parameter of the current frame and a multi-channel parameter of the previous frame; the first preset condition may be that the difference parameter is less than or equal to zero.
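The three definitions of the difference parameter and their first preset conditions can be sketched as follows. This is a minimal illustration: the function names are hypothetical, the default factor of 0.5 is the example value from the text, and taking the target value as the larger of the two absolute values is an assumed reading.

```python
def exceeds_relative_threshold(initial, previous, factor=0.5):
    """Definition 1: |initial - previous| > factor * target, where the target
    is assumed to be the larger of the two absolute values."""
    target = max(abs(initial), abs(previous))
    return abs(initial - previous) > factor * target


def deviates_from_mean(initial, previous_k, threshold):
    """Definition 2: |initial - mean(previous K values)| > threshold.
    The threshold is passed in explicitly (hypothetical interface)."""
    mean_previous = sum(previous_k) / len(previous_k)
    return abs(initial - mean_previous) > threshold


def sign_changed_or_zero(initial, previous):
    """Definition 3: the product of the two values is less than or equal to zero."""
    return initial * previous <= 0
```

The third test is the cheapest of the three, since it needs only one multiplication, and it fires whenever the parameter changes sign between frames or either value is zero.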
- The specific implementation of step 544 is described in detail below.
- step 544 may include determining the multi-channel parameter of the current frame according to a correlation parameter and/or a spectral tilt parameter of the current frame, where the correlation parameter represents the degree of correlation between the current frame and the previous frame, and the spectral tilt parameter characterizes the spectral tilt or spectral energy trend of the signal of at least one channel of the multi-channel signal of the current frame.
- alternatively, step 544 may include determining the multi-channel parameter of the current frame according to a correlation parameter and/or a peak-to-average ratio parameter of the current frame, where the correlation parameter represents the degree of correlation between the current frame and the previous frame, and the peak-to-average ratio parameter characterizes the peak-to-average ratio of the signal of at least one channel of the multi-channel signal of the current frame.
- the correlation parameter can be used to characterize the degree of correlation between the current frame and the previous frame.
- the degree of correlation of the current frame with the previous frame may be characterized by the degree of correlation of the target channel signal in the multi-channel signal of the current frame and the previous frame.
- the target channel signal of the current frame and the target channel signal of the previous frame correspond to each other: if the target channel signal of the current frame is the left channel signal, the target channel signal of the previous frame is also the left channel signal; if the target channel signal of the current frame is the right channel signal, the target channel signal of the previous frame is also the right channel signal; and if the target channel signal of the current frame is the left and right channel signals, the target channel signal of the previous frame is also the left and right channel signals.
- the target channel signal may be a target channel time domain signal or a target channel frequency domain signal.
- taking a frequency domain target channel signal as an example, determining the correlation parameter according to the target channel signal in the multi-channel signals of the current frame and the previous frame may include: determining the correlation parameter according to frequency domain parameters of the target channel signal in the multi-channel signals of the current frame and the previous frame, where the frequency domain parameters of the target channel signal include frequency domain amplitude values and/or frequency domain coefficients of the target channel signal.
- the frequency domain amplitude value of the target channel signal may refer to a frequency domain amplitude value of some or all of the sub-bands of the target channel signal.
- for example, it may be the frequency domain amplitude values of the subbands in the low frequency portion of the target channel signal.
- taking the left channel frequency domain signal as the target channel signal as an example, assume that the low frequency portion of the left channel frequency domain signal includes M subbands and each subband includes N frequency domain amplitude values. The normalized cross-correlation value of the frequency domain amplitude values of each subband of the current frame and the previous frame can be calculated according to the following formula, giving M normalized cross-correlation values in one-to-one correspondence with the M subbands:
- the M normalized cross-correlation values may be determined as the correlation parameter of the current frame and the previous frame; alternatively, the sum or the mean of the M normalized cross-correlation values may be determined as the correlation parameter of the current frame.
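The per-subband calculation described above can be sketched in Python as follows. The sketch assumes the textbook definition of normalized cross-correlation (inner product divided by the product of the Euclidean norms), which may differ in detail from the formula of the application; the function names and the flat list layout of the amplitude values are hypothetical.

```python
import math


def subband_correlations(cur_amp, prev_amp, num_subbands, subband_size):
    """Return one normalized cross-correlation value per subband.

    cur_amp and prev_amp hold num_subbands * subband_size frequency domain
    amplitude values of the current frame and the previous frame.
    """
    values = []
    for i in range(num_subbands):
        start, stop = i * subband_size, (i + 1) * subband_size
        cur, prev = cur_amp[start:stop], prev_amp[start:stop]
        numerator = sum(c * p for c, p in zip(cur, prev))
        denominator = math.sqrt(sum(c * c for c in cur) * sum(p * p for p in prev))
        # An all-zero subband carries no correlation information.
        values.append(numerator / denominator if denominator > 0 else 0.0)
    return values


def correlation_parameter(values):
    """Use the mean of the subband values as the frame-level parameter."""
    return sum(values) / len(values)
```

Because amplitude values are non-negative, each subband value lies between 0 and 1, which matches the threshold ranges quoted later for the correlation parameter.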
- the above manner of calculating a correlation parameter based on frequency domain amplitude values may be replaced with calculating a correlation parameter based on frequency domain coefficients.
- the above manner of calculating the correlation parameter based on the frequency domain amplitude value may be replaced with calculating the correlation parameter based on the absolute value of the frequency domain coefficient.
- the multi-channel signal of the current frame may refer to the multi-channel signal of one or more subframes of the current frame; likewise, the multi-channel signal of the previous frame may refer to the multi-channel signal of one or more subframes of the previous frame.
- that is, the correlation parameter can be calculated based on the entire multi-channel signals of the current frame and the previous frame, or based on the multi-channel signals of one or some subframes of the current frame and the previous frame.
- the normalized cross-correlation values of the left and right channel time domain signals of the current frame and the previous frame can be calculated according to the following formula to obtain N normalized cross-correlation values, and the largest normalized cross-correlation value is then searched for among the N values:
- L(n) represents the left channel time domain signal
- R(n) represents the right channel time domain signal
- N is the total number of samples of the left channel time domain signal
- L is the number of samples by which the nth sample of the right channel time domain signal is offset from the nth sample of the left channel time domain signal.
- the maximum normalized cross-correlation value calculated by the above equation can be used as the correlation parameter of the current frame.
- the multi-channel signal of the current frame may refer to the multi-channel signal of one or more subframes of the current frame; likewise, the multi-channel signal of the previous frame may refer to the multi-channel signal of one or more subframes of the previous frame.
- for example, a plurality of maximum normalized cross-correlation values corresponding to a plurality of subframes may be calculated using the above formula in units of subframes, and then one or more of the following may be used as the correlation parameter of the current frame: the largest of these maximum normalized cross-correlation values, their sum, or their mean.
- the degree of correlation of the current frame with the previous frame may be characterized by the degree of correlation of the pitch period of the current frame and the previous frame.
- the correlation parameter can be determined based on the pitch period of the current frame and the pitch period of the previous frame.
- the pitch period of the current frame or the previous frame may include the pitch period of each subframe of the current frame or the previous frame.
- the pitch period of the current frame, or of each subframe in the current frame, and the pitch period of the previous frame, or of each subframe in the previous frame, may be calculated according to an existing pitch period algorithm. Then the deviation between the pitch periods of the current frame and the previous frame, or between the pitch periods of each subframe in the current frame and each subframe in the previous frame, is calculated. The calculated pitch period deviation can then be used as the correlation parameter of the current frame and the previous frame.
- the peak-to-average ratio parameters of the current frame are described in detail below.
- the peak-to-average ratio parameter of the current frame can be used to characterize the peak-to-average ratio of the signal of at least one of the multi-channel signals of the current frame.
- for example, when the multi-channel signal includes a left channel signal and a right channel signal, the peak-to-average ratio parameter may be the peak-to-average ratio of the left channel signal, the peak-to-average ratio of the right channel signal, or a combination of the peak-to-average ratios of the left channel signal and the right channel signal.
- the peak-to-average ratio parameter can be calculated in a variety of ways. For example, it can be calculated based on the frequency domain amplitude value of the frequency domain signal. As another example, it can be calculated based on the frequency domain coefficients of the frequency domain signal or the absolute values of the frequency domain coefficients.
- the frequency domain amplitude value of the frequency domain signal may refer to a frequency domain amplitude value of some or all of the subbands of the frequency domain signal.
- for example, it may be the frequency domain amplitude values of the subbands in the low frequency portion of the frequency domain signal.
- for example, assume that the low frequency portion of the left channel frequency domain signal includes M subbands and each subband includes N frequency domain amplitude values. The peak-to-average ratio of the N frequency domain amplitude values of each subband can be calculated to obtain M peak-to-average ratios in one-to-one correspondence with the M subbands, and the M peak-to-average ratios, their sum, or their mean is then taken as the peak-to-average ratio parameter of the current frame.
- to reduce computational complexity, the ratio of the maximum frequency domain amplitude value of each subband to the sum of the N frequency domain amplitude values of that subband may be used in place of the peak-to-average ratio.
- when the peak-to-average ratio is compared with a preset threshold, the maximum frequency domain amplitude value of each subband may be compared with the product of the preset threshold and the sum of the N frequency domain amplitude values of that subband; or the maximum frequency domain amplitude value may be compared with the product of the preset threshold and the average of the N frequency domain amplitude values of that subband.
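The peak-to-sum form of the ratio and the division-free threshold comparison can be sketched as follows. The function names and the flat list layout are hypothetical, and the amplitude values are assumed to be non-negative.

```python
def subband_peak_ratios(amplitudes, num_subbands, subband_size):
    """Return max(subband) / sum(subband) for each of the subbands."""
    ratios = []
    for i in range(num_subbands):
        band = amplitudes[i * subband_size:(i + 1) * subband_size]
        total = sum(band)
        # An all-zero subband is given a ratio of 0 to avoid dividing by zero.
        ratios.append(max(band) / total if total > 0 else 0.0)
    return ratios


def peak_exceeds_threshold(band, threshold):
    """Division-free test equivalent to max(band) / sum(band) > threshold."""
    return max(band) > threshold * sum(band)
```

With non-negative amplitudes the peak-to-sum ratio of a subband with N values lies between 1/N and 1, so thresholds below 1 remain meaningful in this form.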
- the multi-channel signal of the current frame may refer to a multi-channel signal of one or more sub-frames of the current frame.
- the characteristic parameters of the current frame may also include the signal to noise ratio parameter of the current frame, and the signal to noise ratio parameter is described in detail below.
- the signal to noise ratio parameter of the current frame can be used to characterize the signal to noise ratio or signal to noise ratio characteristic of at least one of the multi-channel signals of the current frame.
- the signal-to-noise ratio parameter of the current frame may include one or more parameters, and the specific selection manner of the parameter is not limited in the embodiment of the present application.
- the signal to noise ratio parameter of the current frame may include at least one of: a sub-band signal to noise ratio of the multi-channel signal, a modified sub-band signal to noise ratio, a segmented signal to noise ratio, a modified segmented signal to noise ratio, a full band signal to noise ratio, a modified full band signal to noise ratio, and other parameters that can characterize the signal to noise ratio characteristics of the multi-channel signal.
- the signal to noise ratio parameter of the current frame can be calculated using all of the signals of the multi-channel signal.
- a portion of the multi-channel signal can be used to calculate a signal to noise ratio parameter for the current frame.
- the signal of any one of the multi-channel signals can be adaptively selected to calculate the signal-to-noise ratio parameter of the current frame.
- the data representing the multi-channel signal may be weighted averaged to form a new signal, and then the signal-to-noise ratio of the new signal is used to characterize the signal-to-noise ratio parameter of the current frame.
- the characteristic parameters of the current frame may also include the spectral tilt parameters of the current frame, and the spectral tilt parameters are described in detail below.
- the spectral tilt parameter of the current frame can be used to characterize the spectral tilt or spectral energy trend of the signal of at least one channel of the multi-channel signal of the current frame. It should be understood that the greater the degree of spectral tilt, the weaker the voicedness of the signal; the smaller the degree of spectral tilt, the stronger the voicedness of the signal.
- The manner of determining the multi-channel parameters of the current frame based on the feature parameters of the current frame in step 544 is described in detail below.
- whether the current frame multiplexes (that is, reuses) the multi-channel parameters of the previous frame may be determined according to the feature parameters of the current frame.
- the multi-channel parameter of the previous frame may be multiplexed in the current frame if the feature parameter satisfies the second preset condition.
- the initial multi-channel parameter of the current frame may be used as the multi-channel parameter of the current frame in the case that the feature parameter does not satisfy the second preset condition. It should be understood that the embodiments of the present application do not specifically limit how the case in which the feature parameter does not satisfy the second preset condition is handled.
- the initial multi-channel parameters may be corrected by other existing methods.
- whether the multi-channel parameter of the current frame is determined according to the change trend of the multi-channel parameters of the previous T frames may be determined according to the feature parameter of the current frame, where T is greater than or equal to 2.
- if the feature parameter satisfies the second preset condition, the multi-channel parameter of the current frame may be determined according to the change trend of the multi-channel parameters of the previous T frames.
- the initial multi-channel parameter of the current frame may be used as the multi-channel parameter of the current frame in the case that the feature parameter does not satisfy the second preset condition. It should be understood that the embodiments of the present application do not specifically limit how the case in which the feature parameter does not satisfy the second preset condition is handled.
- the initial multi-channel parameters may be corrected by other existing methods.
- the second preset condition may be a condition or a combination of multiple conditions.
- in addition, when the second preset condition is satisfied, the determination may continue in combination with other conditions, and the subsequent steps are performed only if all the conditions are met.
- the previous T frames of the current frame refer to the T frames immediately preceding the current frame among all frames of the audio signal to be encoded.
- ITD[i] represents the ITD value of the current frame
- ITD[i-1] represents the ITD value of the previous frame of the current frame
- ITD[i-2] represents the ITD value of the frame preceding the previous frame of the current frame.
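One simple way to determine the current value from the change trend of the previous frames, for T = 2, is linear extrapolation. The sketch below assumes ITD[i] = ITD[i-1] + (ITD[i-1] - ITD[i-2]); this is an illustrative choice built from the symbols defined above and is not confirmed as the formula of the application. The function name is hypothetical.

```python
def extrapolate_from_trend(previous_values):
    """Extrapolate the current value from the last two values (oldest first).

    Assumes a linear trend: the step between the two most recent frames
    is applied once more to the most recent value.
    """
    if len(previous_values) < 2:
        raise ValueError("at least two previous frames are required (T >= 2)")
    delta = previous_values[-1] - previous_values[-2]
    return previous_values[-1] + delta
```

For instance, with ITD[i-2] = 4 and ITD[i-1] = 6, the extrapolated value under this assumption is 8.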
- the second preset condition may be defined in multiple manners, and the setting of the second preset condition is related to the selection of the feature parameter, which is not specifically limited in this embodiment of the present application.
- taking as an example the case in which the feature parameter is the correlation parameter and/or the peak-to-average ratio parameter, the correlation parameter is the mean of the correlation values of the multi-channel signals of the current frame and the previous frame over the subbands, and the peak-to-average ratio parameter is the mean of the peak-to-average ratios of the multi-channel signal of the current frame over the subbands, the second preset condition may be one or more of the following conditions:
- the correlation parameter is greater than the second threshold, where the second threshold may range from 0.6 to 0.95, for example 0.85;
- the peak-to-average ratio parameter is greater than the third threshold, where the third threshold may range from 0.4 to 0.8, for example 0.6;
- the correlation parameter is greater than the fourth threshold and the correlation value of a certain subband is greater than the fifth threshold, where the fourth threshold may range from 0.6 to 0.85, for example 0.7, and the fifth threshold may range from 0.8 to 0.95, for example 0.9;
- the peak-to-average ratio parameter is greater than the sixth threshold and the peak-to-average ratio of a certain subband is greater than the seventh threshold, where the sixth threshold may range from 0.4 to 0.75, for example 0.55, and the seventh threshold may range from 0.6 to 0.9, for example 0.7.
- the second threshold in the above may be greater than the fourth threshold, and the fourth threshold may be less than the fifth threshold; or the third threshold may be greater than the sixth threshold, and the sixth threshold may be less than the seventh threshold.
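The conditions listed above can be sketched as a single check. The default thresholds are the example values given in the text; combining the four conditions with a logical OR is an assumption, since the text only says that one or more of them may be used. The function and parameter names are hypothetical.

```python
def second_condition_met(band_corr, band_par,
                         t2=0.85, t3=0.6, t4=0.7, t5=0.9, t6=0.55, t7=0.7):
    """Check the example second preset condition.

    band_corr: per-subband correlation values of the current and previous frame.
    band_par:  per-subband peak-to-average ratios of the current frame.
    """
    mean_corr = sum(band_corr) / len(band_corr)
    mean_par = sum(band_par) / len(band_par)
    return (
        mean_corr > t2                                  # strong overall correlation
        or mean_par > t3                                # strong overall peakiness
        or (mean_corr > t4 and max(band_corr) > t5)     # moderate mean, one strong subband
        or (mean_par > t6 and max(band_par) > t7)       # moderate mean, one peaky subband
    )
```

The looser mean thresholds (fourth and sixth) only count when some single subband clears a stricter per-subband threshold, which is consistent with the ordering of the thresholds stated above.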
- in checking the above conditions, the relationship between the peak-to-average ratio parameter and a preset threshold needs to be determined.
- the comparison of the peak-to-average ratio with the preset threshold may be converted into a comparison of the peak value with a target value, where the target value may be the product of the preset threshold and the mean value used in the peak-to-average ratio, or the product of the preset threshold and the sum of the parameters used to calculate the peak-to-average ratio.
- for example, suppose the parameters used to calculate the peak-to-average ratio are the frequency domain amplitude values of the subbands and each subband includes N frequency domain amplitude values. When the peak-to-average ratio is compared with the preset threshold, the maximum frequency domain amplitude value of each subband may be compared with the product of the preset threshold and the sum of the N frequency domain amplitude values of that subband, or with the product of the preset threshold and the average of the N frequency domain amplitude values of that subband.
- FIG. 7 is described mainly by taking as an example the case in which the multi-channel signal of the current frame includes a left channel signal and a right channel signal and the multi-channel parameter is an ITD value.
- the example of FIG. 7 is merely intended to help those skilled in the art understand the embodiments of the present application, and is not intended to limit the embodiments of the present application to the specific numerical values or specific scenarios illustrated. A person skilled in the art can make various modifications or changes according to the example of FIG. 7, and such modifications or changes also fall within the scope of the embodiments of the present application.
- FIG. 7 is a schematic flowchart of a method for encoding a multi-channel signal according to an embodiment of the present application. It should be understood that the processing steps or operations illustrated in FIG. 7 are merely examples, and the embodiments of the present application may also perform other operations or variations of the various operations in FIG. 7. Moreover, the various steps in FIG. 7 may be performed in a different order than that presented in FIG. 7, and it is possible that not all operations in FIG. 7 are to be performed.
- the method of Figure 7 includes:
- steps 720-740 can be expressed by:
- $L_i(f)$ represents the frequency domain coefficient of the left channel frequency domain signal; $R_i^*(f)$ represents the conjugate of the frequency domain coefficient of the right channel frequency domain signal; argmax() denotes taking the maximum over multiple values; and IDFT() denotes the inverse discrete Fourier transform.
- steps 760-770 can refer to the prior art and will not be described in detail herein.
- Step 750 corresponds to step 530 in FIG. 5, and any of the implementations given in step 530 may be employed. Several alternative implementations are listed below.
- Step 1: the low frequency portion of the left channel frequency domain signal of the current frame may be divided into M subbands, each including N frequency domain amplitude values.
- Step 2: the correlation parameter between the current frame and the previous frame may be calculated according to the following formula:
- through the above calculation, the correlation parameter between the current frame and the previous frame is obtained; the correlation parameter may be the normalized cross-correlation value of each subband, or may be the sum or the mean of the normalized cross-correlation values of the subbands.
- Step 3: the peak-to-average ratio of each subband of the current frame is calculated.
- Step 2 and Step 3 may be performed simultaneously or sequentially.
- the peak-to-average ratio of each subband can be expressed as the ratio of the peak value to the mean value of the frequency domain amplitude values of that subband; it can also be expressed as the ratio of the peak value to the sum of the frequency domain amplitude values in that subband, which reduces the computational complexity.
- in this way, the peak-to-average ratio parameter of the multi-channel signal of the current frame is obtained; it may be the peak-to-average ratio of each subband, the sum of the peak-to-average ratios of the subbands, or the mean of the peak-to-average ratios of the subbands.
- Step 4: if the initial ITD value of the current frame and the ITD value of the previous frame satisfy the first preset condition, determine, according to the correlation parameter and/or the peak-to-average ratio parameter of the current frame, whether the current frame multiplexes the ITD value of the previous frame.
- the first preset condition can be, for example:
- the product of the ITD value of the previous frame and the initial ITD value of the current frame is 0; or,
- the product of the ITD value of the previous frame and the initial ITD value of the current frame is negative; or,
- the absolute value of the difference between the ITD value of the previous frame and the initial ITD value of the current frame is greater than half of the target value, where the target value is whichever of the ITD value of the previous frame and the initial ITD value of the current frame has the larger absolute value.
- the first preset condition may be a single condition or a combination of multiple conditions.
- in addition, when the first preset condition is satisfied, the determination may continue in combination with other conditions, and the subsequent steps are performed only if all the conditions are met.
- determining whether the current frame multiplexes the ITD value of the previous frame according to the correlation parameter and/or the peak-to-average ratio parameter of the current frame may specifically include: determining whether the correlation parameter and/or the peak-to-average ratio parameter of the current frame satisfy the second preset condition, and, if they do, having the current frame multiplex the ITD value of the previous frame.
- the second preset condition can be, for example:
- the mean value of the normalized cross-correlation values of each sub-band is greater than the first threshold;
- the mean value of the peak-to-average ratio of each sub-band is greater than the second threshold;
- the mean value of the normalized cross-correlation value of each sub-band is greater than a third threshold and the normalized cross-correlation value of a sub-band is greater than a fourth threshold;
- the mean value of the peak-to-average ratio of each sub-band is greater than a fifth threshold and the peak-to-average ratio of a certain sub-band is greater than a sixth threshold;
- the first threshold is greater than the third threshold, the third threshold is less than the fourth threshold; the second threshold is greater than the fifth threshold, and the fifth threshold is less than the sixth threshold.
- the second preset condition may be a condition or a combination of multiple conditions.
- in addition, when the second preset condition is satisfied, the determination may continue in combination with other conditions, and the subsequent steps are performed only if all the conditions are met.
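The decision flow of Step 4 can be sketched as follows. The sketch treats the three example first preset conditions as alternatives, takes the target value as the larger of the two absolute ITD values, and receives the outcome of the second preset condition as a boolean; these choices and the function names are assumptions for illustration.

```python
def first_condition_met(initial_itd, previous_itd):
    """Example first preset condition on the initial and previous ITD values."""
    # Product is zero or negative: one value is zero or the sign has flipped.
    if initial_itd * previous_itd <= 0:
        return True
    # Otherwise require a jump larger than half of the larger magnitude.
    target = max(abs(initial_itd), abs(previous_itd))
    return abs(initial_itd - previous_itd) > 0.5 * target


def select_itd(initial_itd, previous_itd, second_condition_met):
    """Reuse the previous ITD only when both preset conditions hold."""
    if first_condition_met(initial_itd, previous_itd) and second_condition_met:
        return previous_itd
    return initial_itd
```

Under these assumptions, an initial ITD of -5 following a previous ITD of 6 is replaced by 6 when the frame is strongly correlated with its predecessor, while a small change from 7 to 6 is accepted as is.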
- the left channel frequency domain signal of the current frame described above may be the left channel frequency domain signal of one or some subframes in the current frame, and the left channel frequency domain signal of the previous frame described above may be the left channel frequency domain signal of one or some subframes in the previous frame.
- the correlation parameter can be calculated by the parameters of the current frame and the previous frame, or can be calculated by the parameters of a certain subframe or some subframes in the current frame and the previous frame.
- the peak-to-average ratio parameter can be calculated by the parameters of the current frame, or can be calculated by using a certain subframe or some subframes in the current frame.
- Implementation 2 differs from the foregoing implementation in that the foregoing implementation calculates the correlation parameter between the current frame and the previous frame based on the frequency domain amplitude values of the subbands, whereas Implementation 2 calculates it based on the frequency domain coefficients of the subbands or the absolute values of those coefficients.
- the specific process of Implementation 2 is otherwise similar to that of the foregoing implementation and is not described in detail here.
- Implementation 3 differs from the foregoing implementation in that the foregoing implementation calculates the peak-to-average ratio parameter based on the frequency domain amplitude values of the subbands, whereas Implementation 3 calculates it based on the absolute values of the frequency domain coefficients of the subbands.
- the specific process of Implementation 3 is otherwise similar to that of the foregoing implementation and is not described in detail here.
- Implementation 4 differs from the foregoing implementation in that the foregoing implementation calculates the correlation parameter and/or the peak-to-average ratio parameter based on the left channel frequency domain signal, whereas Implementation 4 calculates them based on the right channel frequency domain signal.
- the specific process of Implementation 4 is otherwise similar to that of the foregoing implementation and is not described in detail here.
- Implementation 5 differs from the foregoing implementations in that the foregoing implementations calculate the correlation parameter and/or the peak-to-average ratio parameter based on the left channel frequency domain signal or the right channel frequency domain signal, whereas Implementation 5 calculates them based on both the left and right channel frequency domain signals.
- for example, one set of correlation parameters and/or peak-to-average ratio parameters may be calculated from the left channel frequency domain signal, and another set from the right channel frequency domain signal. Then one of the two sets may be selected as the final correlation parameter and/or peak-to-average ratio parameter.
- the other processes of Implementation 5 are similar to those of the foregoing implementations and are not described in detail here.
- the difference between the implementation manner 6 and the foregoing implementation manner is that the foregoing implementation manner is based on the frequency domain signal to calculate the correlation parameter, and the implementation manner 6 is to calculate the correlation parameter based on the time domain signal.
- the correlation parameter of the current frame and the previous frame can be calculated by:
- L(n) represents the left channel time domain signal
- R(n) represents the right channel time domain signal
- N is the total number of samples of the left channel time domain signal
- L is the number of samples by which the nth sample of the right channel signal is offset from the nth sample of the left channel signal.
- The left channel time-domain signal and the right channel time-domain signal here may be the entire left channel and right channel signals of the current frame, or the left channel and right channel signals of one or more subframes of the current frame.
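The symbols defined above (two time-domain channel signals, a sample count N, and a sample offset) match the ingredients of a lagged cross-correlation. The sketch below is one plausible form only, a normalized cross-correlation at a given offset, and should not be read as the exact expression of this application. The function name and the parameter name `lag` are hypothetical; `lag` plays the role of the offset called L in the text and is renamed to avoid a clash with L(n).

```python
import math


def lagged_cross_correlation(left, right, lag):
    """Normalized cross-correlation between left[n] and right[n + lag].

    Assumption: both channels hold N samples of the same frame or subframe,
    and 0 <= lag < N. The result lies in [-1, 1] for non-silent input.
    """
    n_total = len(left)
    if len(right) != n_total:
        raise ValueError("channels must have the same number of samples")
    if not 0 <= lag < n_total:
        raise ValueError("lag must satisfy 0 <= lag < N")
    overlap = n_total - lag
    num = sum(left[n] * right[n + lag] for n in range(overlap))
    energy_l = sum(x * x for x in left[:overlap])
    energy_r = sum(x * x for x in right[lag:])
    denom = math.sqrt(energy_l * energy_r)
    # Silent input: no meaningful correlation, report 0 (illustrative choice).
    return num / denom if denom > 0.0 else 0.0


# The right channel here is the left channel delayed by one sample.
value = lagged_cross_correlation([1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 2.0, 3.0], 1)
```

Passing the samples of a single subframe instead of the whole frame corresponds to the subframe option described above.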
- Implementation manner 7 differs from the foregoing implementation manner in that the foregoing manner determines whether the current frame reuses the ITD value of the previous frame, whereas implementation manner 7 determines whether the ITD value of the current frame is estimated from the trend of the ITD values of the previous T frames of the current frame, where T is an integer greater than or equal to 2.
- the ITD value of the current frame, ITD[i] can be calculated as follows:
- ITD[i-1] represents the ITD value of the previous frame of the current frame
- ITD[i-2] represents the ITD value of the frame preceding the previous frame of the current frame.
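One simple way to estimate a value from the trend of ITD[i-1] and ITD[i-2] is linear extrapolation with T = 2. The sketch below assumes that form purely for illustration; the helper name `extrapolate_itd` is hypothetical, and the expression actually used by this application may differ.

```python
def extrapolate_itd(previous_itds):
    """Estimate ITD[i] from the trend of the two most recent ITD values.

    previous_itds is ordered oldest to newest, so previous_itds[-1] is
    ITD[i-1] and previous_itds[-2] is ITD[i-2].
    Assumption: linear trend, ITD[i] = ITD[i-1] + (ITD[i-1] - ITD[i-2]).
    """
    if len(previous_itds) < 2:
        raise ValueError("trend estimation needs at least two previous frames")
    delta = previous_itds[-1] - previous_itds[-2]
    return previous_itds[-1] + delta


# ITD[i-2] = 3 and ITD[i-1] = 5 samples: the trend continues upward.
estimate = extrapolate_itd([3, 5])
```

With a constant history the estimate equals the previous frame's ITD value, so this form reduces to plain reuse of the previous value when there is no trend.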
- Implementation manner 8 differs from the foregoing implementation manner in that the foregoing manner calculates the correlation parameter of the current frame and the previous frame based on the time-domain or frequency-domain signals of the current frame and the previous frame, whereas implementation manner 8 calculates the correlation parameter based on the pitch periods of the current frame and the previous frame.
- The pitch period of the current frame may be calculated according to an existing pitch period algorithm, and the pitch period of the previous frame is calculated in the same way; the deviation between the pitch periods of the current frame and the previous frame is then calculated and used as the correlation parameter of the current frame and the previous frame.
- The deviation of the pitch periods of the current frame and the previous frame may be the deviation between the overall pitch periods of the two frames, the deviation between the pitch periods of one or more subframes in the current frame and the previous frame, the sum of the deviations of the pitch periods of some subframes in the two frames, or the mean of those deviations.
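The subframe variants of the pitch-period deviation can be sketched as follows. The pitch periods are taken as already computed by some existing pitch algorithm; the function name, the `mode` argument, and the use of absolute differences are assumptions made for illustration.

```python
def pitch_deviation(current_pitches, previous_pitches, mode="mean"):
    """Deviation between the pitch periods of the current and previous frame.

    Each argument lists the pitch periods (in samples) of matching subframes;
    a one-element list models the whole-frame case.
    Assumption: the per-subframe deviation is the absolute difference.
    """
    if not current_pitches or len(current_pitches) != len(previous_pitches):
        raise ValueError("need the same non-zero number of subframes per frame")
    deviations = [abs(c - p) for c, p in zip(current_pitches, previous_pitches)]
    if mode == "sum":
        return float(sum(deviations))
    if mode == "mean":
        return sum(deviations) / len(deviations)
    raise ValueError("mode must be 'sum' or 'mean'")


# Two subframes per frame: only the first subframe's pitch period changed.
total = pitch_deviation([50, 52], [48, 52], mode="sum")
average = pitch_deviation([50, 52], [48, 52], mode="mean")
```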
- Implementation manner 9 differs from the foregoing implementation manner in that the foregoing manner determines the ITD value of the current frame based on the correlation parameter and/or the peak-to-average ratio parameter, whereas implementation manner 9 determines the ITD value of the current frame based on the correlation parameter and/or the spectral tilt parameter.
- The second preset condition may be: the correlation value in the correlation parameter of the current frame and the previous frame is greater than a certain threshold, and/or the spectral tilt value in the spectral tilt parameter is less than a certain threshold (it should be understood that the larger the spectral tilt value, the weaker the voicedness of the signal; the smaller the spectral tilt value, the stronger the voicedness of the signal).
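The second preset condition of implementation manner 9 can be written as a small predicate. In this sketch the "and/or" of the text is modeled by a hypothetical `require_both` flag, and the threshold values in the usage lines are placeholders, not values taken from this application.

```python
def meets_second_condition(correlation, spectral_tilt,
                           corr_threshold, tilt_threshold,
                           require_both=True):
    """Check the correlation / spectral tilt form of the second condition.

    The correlation must exceed its threshold and/or the spectral tilt must
    fall below its threshold (a smaller tilt value means stronger voicing).
    """
    corr_ok = correlation > corr_threshold
    tilt_ok = spectral_tilt < tilt_threshold
    return (corr_ok and tilt_ok) if require_both else (corr_ok or tilt_ok)


# Placeholder thresholds, chosen only to exercise both branches.
strongly_voiced = meets_second_condition(0.9, 0.1, 0.8, 0.3)
weakly_voiced = meets_second_condition(0.9, 0.5, 0.8, 0.3)
```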
- Implementation manner 10 differs from the foregoing implementation manner in that the foregoing manner calculates the ITD value of the current frame, whereas implementation manner 10 calculates the IPD value of the current frame. It should be understood that the calculation process related to the ITD value in steps 710-770 needs to be replaced with the corresponding process for the IPD value.
- For the calculation of the IPD value, refer to the prior art; it is not described in detail here.
- FIG. 8 is a schematic block diagram of an encoder according to an embodiment of the present application.
- the encoder 800 of Figure 8 includes:
- an obtaining unit 810, configured to acquire a multi-channel signal of a current frame;
- a first determining unit 820, configured to determine an initial multi-channel parameter of the current frame;
- a second determining unit 830, configured to determine a difference parameter according to the initial multi-channel parameter of the current frame and the multi-channel parameters of the previous K frames of the current frame, where the difference parameter is used to characterize the difference between the initial multi-channel parameter of the current frame and the multi-channel parameters of the previous K frames, and K is an integer greater than or equal to 1;
- a third determining unit 840, configured to determine a multi-channel parameter of the current frame according to the difference parameter and a feature parameter of the current frame;
- an encoding unit 850, configured to encode the multi-channel signal according to the multi-channel parameter of the current frame.
- The multi-channel parameter of the current frame is determined after comprehensively considering the difference between the current frame and the previous K frames and the feature parameter of the current frame. This manner of determination is more reasonable and, compared with having the current frame directly reuse the multi-channel parameters of the previous frame, better ensures the accuracy of the inter-channel information of the multi-channel signal.
- The third determining unit 840 is specifically configured to, if the difference parameter meets a first preset condition, determine the multi-channel parameter of the current frame according to the feature parameter of the current frame.
- The difference parameter is the absolute value of the difference between the initial multi-channel parameter of the current frame and the multi-channel parameter of the previous frame of the current frame, and the first preset condition is that the difference parameter is greater than a preset first threshold.
- The difference parameter is the product of the initial multi-channel parameter of the current frame and the multi-channel parameter of the previous frame of the current frame, and the first preset condition is that the difference parameter is less than or equal to zero.
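The two forms of the difference parameter and their first preset conditions can be sketched as below. The function names are hypothetical and the threshold in the usage lines is a placeholder. For a signed parameter such as an ITD value, a product less than or equal to zero means that the two values have opposite signs or that at least one of them is zero.

```python
def first_condition_abs(initial_param, previous_param, first_threshold):
    """Difference parameter = |initial - previous|; met when above threshold."""
    return abs(initial_param - previous_param) > first_threshold


def first_condition_product(initial_param, previous_param):
    """Difference parameter = initial * previous; met when <= 0."""
    return initial_param * previous_param <= 0


# Placeholder threshold of 5 samples for an ITD-like parameter.
jumped = first_condition_abs(10, 2, 5)
steady = first_condition_abs(3, 2, 5)
flipped = first_condition_product(4, -3)
```

In both forms, meeting the condition indicates that the initial estimate departs noticeably from the previous frame, which is the case in which the feature parameter of the current frame is consulted.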
- The third determining unit 840 is specifically configured to determine the multi-channel parameter of the current frame according to a correlation parameter of the current frame, where the correlation parameter is used to characterize the degree of correlation between the current frame and the previous frame of the current frame.
- The third determining unit 840 is specifically configured to determine the multi-channel parameter of the current frame according to a peak-to-average ratio parameter of the current frame, where the peak-to-average ratio parameter is used to characterize the peak-to-average ratio of the signal of at least one channel in the multi-channel signal of the current frame.
- The third determining unit 840 is specifically configured to determine the multi-channel parameter of the current frame according to the correlation parameter and the peak-to-average ratio parameter of the current frame, where the correlation parameter is used to characterize the degree of correlation between the current frame and the previous frame of the current frame, and the peak-to-average ratio parameter is used to characterize the peak-to-average ratio of the signal of at least one channel in the multi-channel signal of the current frame.
- the encoder further includes:
- a fourth determining unit configured to determine the correlation parameter according to the target channel signal in the multi-channel signal of the current frame and the target channel signal in the multi-channel signal of the previous frame.
- The fourth determining unit is specifically configured to determine the correlation parameter according to a frequency-domain parameter of the target channel signal in the multi-channel signal of the current frame and a frequency-domain parameter of the target channel signal in the multi-channel signal of the previous frame, where the frequency-domain parameter is at least one of a frequency-domain amplitude value and a frequency-domain coefficient of the target channel signal.
- the encoder further includes:
- a fifth determining unit configured to determine the correlation parameter according to a pitch period of the current frame and a pitch period of the previous frame.
- The third determining unit 840 is specifically configured to, if the feature parameter meets a second preset condition, determine the multi-channel parameter of the current frame according to the multi-channel parameters of the previous T frames of the current frame, where T is an integer greater than or equal to 1.
- The third determining unit 840 is specifically configured to determine the multi-channel parameter of the previous T frames as the multi-channel parameter of the current frame, where T is equal to 1.
- The third determining unit 840 is specifically configured to determine the multi-channel parameter of the current frame according to the trend of the multi-channel parameters of the previous T frames, where T is greater than or equal to 2.
- The feature parameter includes a correlation parameter and/or a peak-to-average ratio parameter of the current frame, where the correlation parameter is used to characterize the degree of correlation between the current frame and the previous frame of the current frame, the peak-to-average ratio parameter is used to characterize the peak-to-average ratio of the signal of at least one channel in the multi-channel signal of the current frame, and the second preset condition is that the feature parameter is greater than a preset threshold.
- The initial multi-channel parameter of the current frame includes at least one of the following: an initial inter-channel correlation (IC) value of the current frame, an initial inter-channel time difference (ITD) value of the current frame, an initial inter-channel phase difference (IPD) value of the current frame, an initial overall phase difference (OPD) value of the current frame, and an initial inter-channel level difference (ILD) value of the current frame.
- The feature parameter of the current frame includes at least one of the following: a correlation parameter, a peak-to-average ratio parameter, a signal-to-noise ratio parameter, and a spectral tilt parameter.
- The correlation parameter is used to characterize the degree of correlation between the current frame and the previous frame; the peak-to-average ratio parameter is used to characterize the peak-to-average ratio of the signal of at least one channel in the multi-channel signal of the current frame; the signal-to-noise ratio parameter is used to characterize the signal-to-noise ratio of the signal of at least one channel in the multi-channel signal of the current frame; and the spectral tilt parameter is used to characterize the degree of spectral tilt of the signal of at least one channel in the multi-channel signal of the current frame.
- FIG. 9 is a schematic block diagram of an encoder according to an embodiment of the present application.
- the encoder 900 of Figure 9 includes:
- a memory 910 configured to store a program
- a processor 920 configured to execute the program; when the program is executed, the processor 920 is configured to: acquire a multi-channel signal of a current frame; determine an initial multi-channel parameter of the current frame; determine a difference parameter according to the initial multi-channel parameter of the current frame and the multi-channel parameters of the previous K frames of the current frame, where the difference parameter is used to characterize the difference between the initial multi-channel parameter of the current frame and the multi-channel parameters of the previous K frames, and K is an integer greater than or equal to 1; determine a multi-channel parameter of the current frame according to the difference parameter and a feature parameter of the current frame; and encode the multi-channel signal according to the multi-channel parameter of the current frame.
- The multi-channel parameter of the current frame is determined after comprehensively considering the difference between the current frame and the previous K frames and the feature parameter of the current frame. This manner of determination is more reasonable and, compared with having the current frame directly reuse the multi-channel parameters of the previous frame, better ensures the accuracy of the inter-channel information of the multi-channel signal.
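The decision flow executed by the processor can be sketched end to end for a single scalar parameter such as an ITD value. This is a minimal sketch under stated assumptions: K = 1 and T = 1, the absolute-difference form of the difference parameter, a single feature parameter compared against a threshold, and a fallback to the initial parameter whenever a condition is not met. The function name and the threshold values are hypothetical.

```python
def determine_parameter(initial_param, previous_params, feature_param,
                        first_threshold, feature_threshold):
    """Choose the multi-channel parameter of the current frame.

    previous_params holds the final parameters of earlier frames, oldest
    first. Assumptions: K = 1, T = 1, and the initial parameter is kept
    whenever either preset condition is not met.
    """
    previous = previous_params[-1]
    difference = abs(initial_param - previous)
    if difference <= first_threshold:
        # First preset condition not met: the initial estimate is kept.
        return initial_param
    if feature_param <= feature_threshold:
        # Second preset condition not met: the frames are too dissimilar
        # (or the signal too weakly voiced) to justify reusing history.
        return initial_param
    # Both conditions met: reuse the previous frame's parameter (T = 1).
    return previous


# A jump from 2 to 10 while the frames stay highly correlated is smoothed out.
smoothed = determine_parameter(10.0, [2.0], 0.9, 5.0, 0.5)
# The same jump with low correlation is accepted as a real change.
accepted = determine_parameter(10.0, [2.0], 0.1, 5.0, 0.5)
```

The sketch reflects the rationale stated above: the previous frame's parameter is reused only when the initial estimate changes sharply and the feature parameter indicates that the two frames are similar.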
- The processor 920 is specifically configured to, if the difference parameter meets a first preset condition, determine the multi-channel parameter of the current frame according to the feature parameter of the current frame.
- The difference parameter is the absolute value of the difference between the initial multi-channel parameter of the current frame and the multi-channel parameter of the previous frame of the current frame, and the first preset condition is that the difference parameter is greater than a preset first threshold.
- The difference parameter is the product of the initial multi-channel parameter of the current frame and the multi-channel parameter of the previous frame of the current frame, and the first preset condition is that the difference parameter is less than or equal to zero.
- The processor 920 is specifically configured to determine the multi-channel parameter of the current frame according to a correlation parameter of the current frame, where the correlation parameter is used to characterize the degree of correlation between the current frame and the previous frame of the current frame.
- The processor 920 is specifically configured to determine the multi-channel parameter of the current frame according to a peak-to-average ratio parameter of the current frame, where the peak-to-average ratio parameter is used to characterize the peak-to-average ratio of the signal of at least one channel in the multi-channel signal of the current frame.
- The processor 920 is specifically configured to determine the multi-channel parameter of the current frame according to the correlation parameter and the peak-to-average ratio parameter of the current frame, where the correlation parameter is used to characterize the degree of correlation between the current frame and the previous frame of the current frame, and the peak-to-average ratio parameter is used to characterize the peak-to-average ratio of the signal of at least one channel in the multi-channel signal of the current frame.
- The processor 920 is further configured to determine the correlation parameter according to the target channel signal in the multi-channel signal of the current frame and the target channel signal in the multi-channel signal of the previous frame.
- The processor 920 is specifically configured to determine the correlation parameter according to a frequency-domain parameter of the target channel signal in the multi-channel signal of the current frame and a frequency-domain parameter of the target channel signal in the multi-channel signal of the previous frame, where the frequency-domain parameter is a frequency-domain amplitude value of the target channel signal.
- The processor 920 is specifically configured to determine the correlation parameter according to a frequency-domain parameter of the target channel signal in the multi-channel signal of the current frame and a frequency-domain parameter of the target channel signal in the multi-channel signal of the previous frame, where the frequency-domain parameter is a frequency-domain coefficient of the target channel signal.
- The processor 920 is specifically configured to determine the correlation parameter according to a frequency-domain parameter of the target channel signal in the multi-channel signal of the current frame and a frequency-domain parameter of the target channel signal in the multi-channel signal of the previous frame, where the frequency-domain parameter is a frequency-domain amplitude value and a frequency-domain coefficient of the target channel signal.
- The processor 920 is further configured to determine the correlation parameter according to the pitch period of the current frame and the pitch period of the previous frame.
- The processor 920 is specifically configured to, if the feature parameter meets a second preset condition, determine the multi-channel parameter of the current frame according to the multi-channel parameters of the previous T frames of the current frame, where T is an integer greater than or equal to 1.
- The processor 920 is specifically configured to determine the multi-channel parameter of the previous T frames as the multi-channel parameter of the current frame, where T is equal to 1.
- The processor 920 is specifically configured to determine the multi-channel parameter of the current frame according to the trend of the multi-channel parameters of the previous T frames, where T is greater than or equal to 2.
- The feature parameter includes a correlation parameter and/or a peak-to-average ratio parameter of the current frame, where the correlation parameter is used to characterize the degree of correlation between the current frame and the previous frame of the current frame, the peak-to-average ratio parameter is used to characterize the peak-to-average ratio of the signal of at least one channel in the multi-channel signal of the current frame, and the second preset condition is that the feature parameter is greater than a preset threshold.
- The initial multi-channel parameter of the current frame includes at least one of the following: an initial inter-channel correlation (IC) value of the current frame, an initial inter-channel time difference (ITD) value of the current frame, an initial inter-channel phase difference (IPD) value of the current frame, an initial overall phase difference (OPD) value of the current frame, and an initial inter-channel level difference (ILD) value of the current frame.
- The feature parameter of the current frame includes at least one of the following: a correlation parameter, a peak-to-average ratio parameter, a signal-to-noise ratio parameter, and a spectral tilt parameter.
- The correlation parameter is used to characterize the degree of correlation between the current frame and the previous frame; the peak-to-average ratio parameter is used to characterize the peak-to-average ratio of the signal of at least one channel in the multi-channel signal of the current frame; the signal-to-noise ratio parameter is used to characterize the signal-to-noise ratio of the signal of at least one channel in the multi-channel signal of the current frame; and the spectral tilt parameter is used to characterize the degree of spectral tilt of the signal of at least one channel in the multi-channel signal of the current frame.
- the disclosed systems, devices, and methods may be implemented in other manners.
- the device embodiments described above are merely illustrative.
- the division of the unit is only a logical function division.
- There may be other division manners in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
- the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
- the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
- each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
- the functions may be stored in a computer readable storage medium if implemented in the form of a software functional unit and sold or used as a standalone product. Based on such understanding, the technical solution of the present application or the part contributing to the prior art or the part of the technical solution may be embodied in the form of a software product.
- the computer software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, server, or network device, etc.) to perform all or part of the steps of the methods described in various embodiments of the present application.
- The foregoing storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, and the like.
Claims (28)
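- A method for encoding a multi-channel signal, comprising: acquiring a multi-channel signal of a current frame; determining an initial multi-channel parameter of the current frame; determining a difference parameter according to the initial multi-channel parameter of the current frame and multi-channel parameters of the previous K frames of the current frame, wherein the difference parameter is used to characterize a difference between the initial multi-channel parameter of the current frame and the multi-channel parameters of the previous K frames, and K is an integer greater than or equal to 1; determining a multi-channel parameter of the current frame according to the difference parameter and a feature parameter of the current frame; and encoding the multi-channel signal according to the multi-channel parameter of the current frame.
- The method according to claim 1, wherein the determining a multi-channel parameter of the current frame according to the difference parameter and a feature parameter of the current frame comprises: if the difference parameter meets a first preset condition, determining the multi-channel parameter of the current frame according to the feature parameter of the current frame.
- The method according to claim 2, wherein the difference parameter is an absolute value of a difference between the initial multi-channel parameter of the current frame and a multi-channel parameter of a previous frame of the current frame, and the first preset condition is that the difference parameter is greater than a preset first threshold.
- The method according to claim 2, wherein the difference parameter is a product of the initial multi-channel parameter of the current frame and a multi-channel parameter of a previous frame of the current frame, and the first preset condition is that the difference parameter is less than or equal to 0.
- The method according to any one of claims 2 to 4, wherein the determining the multi-channel parameter of the current frame according to the feature parameter of the current frame comprises: determining the multi-channel parameter of the current frame according to a correlation parameter of the current frame, wherein the correlation parameter is used to characterize a degree of correlation between the current frame and the previous frame of the current frame.
- The method according to claim 5, further comprising: determining the correlation parameter according to a target channel signal in the multi-channel signal of the current frame and a target channel signal in a multi-channel signal of the previous frame.
- The method according to claim 6, wherein the determining the correlation parameter according to a target channel signal in the multi-channel signal of the current frame and a target channel signal in a multi-channel signal of the previous frame comprises: determining the correlation parameter according to a frequency-domain parameter of the target channel signal in the multi-channel signal of the current frame and a frequency-domain parameter of the target channel signal in the multi-channel signal of the previous frame, wherein the frequency-domain parameter is at least one of a frequency-domain amplitude value and a frequency-domain coefficient of the target channel signal.
- The method according to claim 5, further comprising: determining the correlation parameter according to a pitch period of the current frame and a pitch period of the previous frame.
- The method according to any one of claims 2 to 8, wherein the determining the multi-channel parameter of the current frame according to the feature parameter of the current frame comprises: if the feature parameter meets a second preset condition, determining the multi-channel parameter of the current frame according to multi-channel parameters of the previous T frames of the current frame, wherein T is an integer greater than or equal to 1.
- The method according to claim 9, wherein the determining the multi-channel parameter of the current frame according to multi-channel parameters of the previous T frames of the current frame comprises: determining the multi-channel parameter of the previous T frames as the multi-channel parameter of the current frame, wherein T is equal to 1.
- The method according to claim 9, wherein the determining the multi-channel parameter of the current frame according to multi-channel parameters of the previous T frames of the current frame comprises: determining the multi-channel parameter of the current frame according to a trend of the multi-channel parameters of the previous T frames, wherein T is greater than or equal to 2.
- The method according to any one of claims 9 to 11, wherein the feature parameter of the current frame comprises at least one of a correlation parameter and a peak-to-average ratio parameter of the current frame, the correlation parameter is used to characterize a degree of correlation between the current frame and the previous frame of the current frame, the peak-to-average ratio parameter is used to characterize a peak-to-average ratio of a signal of at least one channel in the multi-channel signal of the current frame, and the second preset condition is that the feature parameter is greater than a preset threshold.
- The method according to any one of claims 1 to 12, wherein the initial multi-channel parameter of the current frame comprises at least one of the following: an initial inter-channel correlation (IC) value of the current frame, an initial inter-channel time difference (ITD) value of the current frame, an initial inter-channel phase difference (IPD) value of the current frame, an initial overall phase difference (OPD) value of the current frame, and an initial inter-channel level difference (ILD) value of the current frame.
- The method according to any one of claims 1 to 13, wherein the feature parameter of the current frame comprises at least one of the following parameters of the current frame: a correlation parameter, a peak-to-average ratio parameter, a signal-to-noise ratio parameter, and a spectral tilt parameter, wherein the correlation parameter is used to characterize a degree of correlation between the current frame and the previous frame, the peak-to-average ratio parameter is used to characterize a peak-to-average ratio of a signal of at least one channel in the multi-channel signal of the current frame, the signal-to-noise ratio parameter is used to characterize a signal-to-noise ratio of a signal of at least one channel in the multi-channel signal of the current frame, and the spectral tilt parameter is used to characterize a degree of spectral tilt of a signal of at least one channel in the multi-channel signal of the current frame.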
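- An encoder, comprising: an obtaining unit, configured to acquire a multi-channel signal of a current frame; a first determining unit, configured to determine an initial multi-channel parameter of the current frame; a second determining unit, configured to determine a difference parameter according to the initial multi-channel parameter of the current frame and multi-channel parameters of the previous K frames of the current frame, wherein the difference parameter is used to characterize a difference between the initial multi-channel parameter of the current frame and the multi-channel parameters of the previous K frames, and K is an integer greater than or equal to 1; a third determining unit, configured to determine a multi-channel parameter of the current frame according to the difference parameter and a feature parameter of the current frame; and an encoding unit, configured to encode the multi-channel signal according to the multi-channel parameter of the current frame.
- The encoder according to claim 15, wherein the third determining unit is specifically configured to, if the difference parameter meets a first preset condition, determine the multi-channel parameter of the current frame according to the feature parameter of the current frame.
- The encoder according to claim 16, wherein the difference parameter is an absolute value of a difference between the initial multi-channel parameter of the current frame and a multi-channel parameter of a previous frame of the current frame, and the first preset condition is that the difference parameter is greater than a preset first threshold.
- The encoder according to claim 16, wherein the difference parameter is a product of the initial multi-channel parameter of the current frame and a multi-channel parameter of a previous frame of the current frame, and the first preset condition is that the difference parameter is less than or equal to 0.
- The encoder according to any one of claims 16 to 18, wherein the third determining unit is specifically configured to determine the multi-channel parameter of the current frame according to a correlation parameter of the current frame, wherein the correlation parameter is used to characterize a degree of correlation between the current frame and the previous frame of the current frame.
- The encoder according to claim 19, further comprising: a fourth determining unit, configured to determine the correlation parameter according to a target channel signal in the multi-channel signal of the current frame and a target channel signal in a multi-channel signal of the previous frame.
- The encoder according to claim 20, wherein the fourth determining unit is specifically configured to determine the correlation parameter according to a frequency-domain parameter of the target channel signal in the multi-channel signal of the current frame and a frequency-domain parameter of the target channel signal in the multi-channel signal of the previous frame, wherein the frequency-domain parameter is at least one of a frequency-domain amplitude value and a frequency-domain coefficient of the target channel signal.
- The encoder according to claim 19, further comprising: a fifth determining unit, configured to determine the correlation parameter according to a pitch period of the current frame and a pitch period of the previous frame.
- The encoder according to any one of claims 16 to 22, wherein the third determining unit is specifically configured to, if the feature parameter meets a second preset condition, determine the multi-channel parameter of the current frame according to multi-channel parameters of the previous T frames of the current frame, wherein T is an integer greater than or equal to 1.
- The encoder according to claim 23, wherein the third determining unit is specifically configured to determine the multi-channel parameter of the previous T frames as the multi-channel parameter of the current frame, wherein T is equal to 1.
- The encoder according to claim 23, wherein the third determining unit is specifically configured to determine the multi-channel parameter of the current frame according to a trend of the multi-channel parameters of the previous T frames, wherein T is greater than or equal to 2.
- The encoder according to any one of claims 23 to 25, wherein the feature parameter comprises at least one of a correlation parameter and a peak-to-average ratio parameter of the current frame, the correlation parameter is used to characterize a degree of correlation between the current frame and the previous frame of the current frame, the peak-to-average ratio parameter is used to characterize a peak-to-average ratio of a signal of at least one channel in the multi-channel signal of the current frame, and the second preset condition is that the feature parameter is greater than a preset threshold.
- The encoder according to any one of claims 15 to 26, wherein the initial multi-channel parameter of the current frame comprises at least one of the following: an initial inter-channel correlation (IC) value of the current frame, an initial inter-channel time difference (ITD) value of the current frame, an initial inter-channel phase difference (IPD) value of the current frame, an initial overall phase difference (OPD) value of the current frame, and an initial inter-channel level difference (ILD) value of the current frame.
- The encoder according to any one of claims 15 to 27, wherein the feature parameter of the current frame comprises at least one of the following parameters of the current frame: a correlation parameter, a peak-to-average ratio parameter, a signal-to-noise ratio parameter, and a spectral tilt parameter, wherein the correlation parameter is used to characterize a degree of correlation between the current frame and the previous frame, the peak-to-average ratio parameter is used to characterize a peak-to-average ratio of a signal of at least one channel in the multi-channel signal of the current frame, the signal-to-noise ratio parameter is used to characterize a signal-to-noise ratio of a signal of at least one channel in the multi-channel signal of the current frame, and the spectral tilt parameter is used to characterize a degree of spectral tilt of a signal of at least one channel in the multi-channel signal of the current frame.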
Priority Applications (17)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2019507137A JP6768924B2 (ja) | 2016-08-10 | 2017-02-22 | Multi-channel signal encoding method and encoder |
CA3033225A CA3033225C (en) | 2016-08-10 | 2017-02-22 | Multi-channel signal encoding method and encoder |
KR1020197005937A KR102205596B1 (ko) | 2016-08-10 | 2017-02-22 | Multi-channel signal encoding method and encoder |
EP17838306.3A EP3493203B1 (en) | 2016-08-10 | 2017-02-22 | Method for encoding multi-channel signal and encoder |
KR1020227005726A KR102486604B1 (ko) | 2016-08-10 | 2017-02-22 | Multi-channel signal encoding method and encoder |
RU2019106315A RU2705427C1 (ru) | 2016-08-10 | 2017-02-22 | Multi-channel signal encoding method and encoder |
AU2017310759A AU2017310759B2 (en) | 2016-08-10 | 2017-02-22 | Multi-channel signal encoding method and encoder |
BR112019002656-8A BR112019002656B1 (pt) | 2016-08-10 | 2017-02-22 | Multi-channel signal encoding method, encoder, and computer-readable storage medium |
EP22179454.8A EP4120252A1 (en) | 2016-08-10 | 2017-02-22 | Multi-channel signal encoder and computer readable medium |
ES17838306T ES2928335T3 (es) | 2016-08-10 | 2017-02-22 | Method for encoding multi-channel signals and encoder |
KR1020217001206A KR102367538B1 (ko) | 2016-08-10 | 2017-02-22 | Multi-channel signal encoding method and encoder |
US16/272,397 US11133014B2 (en) | 2016-08-10 | 2019-02-11 | Multi-channel signal encoding method and encoder |
AU2020267256A AU2020267256B2 (en) | 2016-08-10 | 2020-11-12 | Multi-channel signal encoding method and encoder |
US17/408,116 US11935548B2 (en) | 2016-08-10 | 2021-08-20 | Multi-channel signal encoding method and encoder |
AU2022218507A AU2022218507B2 (en) | 2016-08-10 | 2022-08-17 | Multi-channel signal encoding method and encoder |
US18/419,794 US20240161756A1 (en) | 2016-08-10 | 2024-01-23 | Multi-Channel Signal Encoding Method and Encoder |
AU2024205199A AU2024205199A1 (en) | 2016-08-10 | 2024-07-30 | Multi-channel signal encoding method and encoder |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610652506.X | 2016-08-10 | ||
CN201610652506.XA CN107731238B (zh) | 2016-08-10 | 2016-08-10 | Multi-channel signal encoding method and encoder |
Related Child Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/272,397 Continuation US11133014B2 (en) | 2016-08-10 | 2019-02-11 | Multi-channel signal encoding method and encoder |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2018028170A1 (zh) | 2018-02-15 |
Family
ID=61161463
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2017/074419 WO2018028170A1 (zh) | 2016-08-10 | 2017-02-22 | Multi-channel signal encoding method and encoder |
Country Status (10)
Country | Link |
---|---|
US (3) | US11133014B2 (zh) |
EP (2) | EP4120252A1 (zh) |
JP (4) | JP6768924B2 (zh) |
KR (3) | KR102486604B1 (zh) |
CN (1) | CN107731238B (zh) |
AU (4) | AU2017310759B2 (zh) |
CA (1) | CA3033225C (zh) |
ES (1) | ES2928335T3 (zh) |
RU (1) | RU2705427C1 (zh) |
WO (1) | WO2018028170A1 (zh) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020069219A1 (en) | 2018-09-26 | 2020-04-02 | Cala Health, Inc. | Predictive therapy neurostimulation systems |
US10765856B2 (en) | 2015-06-10 | 2020-09-08 | Cala Health, Inc. | Systems and methods for peripheral nerve stimulation to treat tremor with detachable therapy and monitoring units |
US10905879B2 (en) | 2014-06-02 | 2021-02-02 | Cala Health, Inc. | Methods for peripheral nerve stimulation |
US11331480B2 (en) | 2017-04-03 | 2022-05-17 | Cala Health, Inc. | Systems, methods and devices for peripheral neuromodulation for treating diseases related to overactive bladder |
US11344722B2 (en) | 2016-01-21 | 2022-05-31 | Cala Health, Inc. | Systems, methods and devices for peripheral neuromodulation for treating diseases related to overactive bladder |
US11596785B2 (en) | 2015-09-23 | 2023-03-07 | Cala Health, Inc. | Systems and methods for peripheral nerve stimulation in the finger or hand to treat hand tremors |
US11857778B2 (en) | 2018-01-17 | 2024-01-02 | Cala Health, Inc. | Systems and methods for treating inflammatory bowel disease through peripheral nerve stimulation |
US11890468B1 (en) | 2019-10-03 | 2024-02-06 | Cala Health, Inc. | Neurostimulation systems with event pattern detection and classification |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107731238B (zh) * | 2016-08-10 | 2021-07-16 | 华为技术有限公司 | Multi-channel signal encoding method and encoder |
CN108877815B (zh) * | 2017-05-16 | 2021-02-23 | 华为技术有限公司 | Stereo signal processing method and apparatus |
CN110556118B (zh) * | 2018-05-31 | 2022-05-10 | 华为技术有限公司 | Stereo signal encoding method and apparatus |
CN110556116B (zh) | 2018-05-31 | 2021-10-22 | 华为技术有限公司 | Method and apparatus for calculating a downmix signal and a residual signal |
CN109243471B (zh) * | 2018-09-26 | 2022-09-23 | 杭州联汇科技股份有限公司 | Method for fast encoding of digital audio for broadcasting |
CN112233682B (zh) * | 2019-06-29 | 2024-07-16 | 华为技术有限公司 | Stereo encoding method, stereo decoding method, and apparatus |
CN115346537A (zh) * | 2021-05-14 | 2022-11-15 | 华为技术有限公司 | Audio encoding and decoding method and apparatus |
CN114365509B (zh) * | 2021-12-03 | 2024-03-01 | 北京小米移动软件有限公司 | Stereo audio signal processing method and device/storage medium/apparatus |
CN115691515A (zh) * | 2022-07-12 | 2023-02-03 | 南京拓灵智能科技有限公司 | Audio encoding and decoding method and apparatus |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
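CN1954642A (zh) * | 2004-06-30 | 2007-04-25 | 德商弗朗霍夫应用研究促进学会 | Multi-channel synthesizer and method for generating a multi-channel output signal |
CN101188878A (zh) * | 2007-12-05 | 2008-05-28 | 武汉大学 | Spatial parameter quantization and entropy coding method for stereo audio signals and system structure used therefor |
CN102157151A (zh) * | 2010-02-11 | 2011-08-17 | 华为技术有限公司 | Multi-channel signal encoding method, decoding method, apparatus, and system |
CN104246873A (zh) * | 2012-02-17 | 2014-12-24 | 华为技术有限公司 | Parametric encoder for encoding a multi-channel audio signal |
CN104641414A (zh) * | 2012-07-19 | 2015-05-20 | 诺基亚公司 | Stereo audio signal encoder |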
Family Cites Families (53)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5659520A (en) * | 1995-04-24 | 1997-08-19 | Sonatech, Inc. | Super short baseline navigation using phase-delay processing of spread-spectrum-coded reply signals |
US6168568B1 (en) * | 1996-10-04 | 2001-01-02 | Karmel Medical Acoustic Technologies Ltd. | Phonopneumograph system |
ATE420432T1 (de) * | 2000-04-24 | 2009-01-15 | Qualcomm Inc | Method and apparatus for predictive quantization of voiced speech signals |
ES2268340T3 (es) * | 2002-04-22 | 2007-03-16 | Koninklijke Philips Electronics N.V. | Parametric multi-channel audio representation |
AU2003244932A1 (en) * | 2002-07-12 | 2004-02-02 | Koninklijke Philips Electronics N.V. | Audio coding |
ATE527654T1 (de) * | 2004-03-01 | 2011-10-15 | Dolby Lab Licensing Corp | Multichannel audio decoding |
KR100745688B1 (ko) * | 2004-07-09 | 2007-08-03 | 한국전자통신연구원 | Method and apparatus for encoding/decoding a multi-channel audio signal |
SE0402650D0 (sv) | 2004-11-02 | 2004-11-02 | Coding Tech Ab | Improved parametric stereo compatible coding of spatial audio |
RU2393550C2 (ru) * | 2005-06-30 | 2010-06-27 | ЭлДжи ЭЛЕКТРОНИКС ИНК. | Apparatus and method for encoding and decoding an audio signal |
RU2376656C1 (ru) * | 2005-08-30 | 2009-12-20 | ЭлДжи ЭЛЕКТРОНИКС ИНК. | Method for encoding and decoding an audio signal and apparatus for implementing it |
EP1953736A4 (en) * | 2005-10-31 | 2009-08-05 | Panasonic Corp | STEREO CODING DEVICE AND METHOD FOR PREDICTING STEREO SIGNAL |
US7839948B2 (en) * | 2005-12-02 | 2010-11-23 | Qualcomm Incorporated | Time slicing techniques for variable data rate encoding |
ATE448638T1 (de) * | 2006-04-13 | 2009-11-15 | Fraunhofer Ges Forschung | Audio signal decorrelator |
EP2063416B1 (en) * | 2006-09-13 | 2011-11-16 | Nippon Telegraph And Telephone Corporation | Feeling detection method, feeling detection device, feeling detection program containing the method, and recording medium containing the program |
KR101505831B1 (ko) * | 2007-10-30 | 2015-03-26 | 삼성전자주식회사 | Method and apparatus for encoding/decoding a multi-channel signal |
US8239210B2 (en) * | 2007-12-19 | 2012-08-07 | Dts, Inc. | Lossless multi-channel audio codec |
PL2301020T3 (pl) * | 2008-07-11 | 2013-06-28 | Fraunhofer Ges Forschung | Apparatus and method for encoding/decoding an audio signal using an aliasing switching algorithm |
EP2169665B1 (en) * | 2008-09-25 | 2018-05-02 | LG Electronics Inc. | A method and an apparatus for processing a signal |
US8666752B2 (en) * | 2009-03-18 | 2014-03-04 | Samsung Electronics Co., Ltd. | Apparatus and method for encoding and decoding multi-channel signal |
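CN102307323B (zh) * | 2009-04-20 | 2013-12-18 | 华为技术有限公司 | Method for correcting a channel delay parameter of a multi-channel signal |
CN101582262B (zh) * | 2009-06-16 | 2011-12-28 | 武汉大学 | Inter-frame predictive encoding and decoding method for spatial audio parameters |
CN102025892A (zh) * | 2009-09-16 | 2011-04-20 | 索尼株式会社 | Shot change detection method and apparatus |
CN102498515B (zh) * | 2009-09-17 | 2014-06-18 | 延世大学工业学术合作社 | Method and device for processing an audio signal |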
AU2010303039B9 (en) * | 2009-09-29 | 2014-10-23 | Dolby International Ab | Audio signal decoder, audio signal encoder, method for providing an upmix signal representation, method for providing a downmix signal representation, computer program and bitstream using a common inter-object-correlation parameter value |
PL2491551T3 (pl) * | 2009-10-20 | 2015-06-30 | Fraunhofer Ges Forschung | Apparatus for providing an upmix signal representation on the basis of a downmix signal representation, apparatus for providing a bitstream representing a multi-channel audio signal, methods, computer program and bitstream using a distortion control signaling |
ES2656815T3 (es) * | 2010-03-29 | 2018-02-28 | Fraunhofer-Gesellschaft Zur Förderung Der Angewandten Forschung | Spatial audio processor and method for providing spatial parameters based on an acoustic input signal |
US9112591B2 (en) * | 2010-04-16 | 2015-08-18 | Samsung Electronics Co., Ltd. | Apparatus for encoding/decoding multichannel signal and method thereof |
US8305099B2 (en) | 2010-08-31 | 2012-11-06 | Nxp B.V. | High speed full duplex test interface |
KR101429564B1 (ko) * | 2010-09-28 | 2014-08-13 | 후아웨이 테크놀러지 컴퍼니 리미티드 | Apparatus and method for postprocessing a decoded multi-channel audio signal or a decoded stereo signal |
US9514757B2 (en) * | 2010-11-17 | 2016-12-06 | Panasonic Intellectual Property Corporation Of America | Stereo signal encoding device, stereo signal decoding device, stereo signal encoding method, and stereo signal decoding method |
PL2671222T3 (pl) * | 2011-02-02 | 2016-08-31 | Ericsson Telefon Ab L M | Determining the inter-channel time difference of a multi-channel audio signal |
US9117440B2 (en) * | 2011-05-19 | 2015-08-25 | Dolby International Ab | Method, apparatus, and medium for detecting frequency extension coding in the coding history of an audio signal |
CN102800317B (zh) * | 2011-05-25 | 2014-09-17 | 华为技术有限公司 | Signal classification method and device, and encoding/decoding method and device |
JP6063555B2 (ja) * | 2012-04-05 | 2017-01-18 | 華為技術有限公司Huawei Technologies Co.,Ltd. | Multi-channel audio encoder and method for encoding a multi-channel audio signal |
US9601122B2 (en) * | 2012-06-14 | 2017-03-21 | Dolby International Ab | Smooth configuration switching for multichannel audio |
US20140086416A1 (en) * | 2012-07-15 | 2014-03-27 | Qualcomm Incorporated | Systems, methods, apparatus, and computer-readable media for three-dimensional audio coding using basis function coefficients |
KR20140017338A (ko) * | 2012-07-31 | 2014-02-11 | 인텔렉추얼디스커버리 주식회사 | Audio signal processing apparatus and method |
EP2922052B1 (en) | 2012-11-13 | 2021-10-13 | Samsung Electronics Co., Ltd. | Method for determining an encoding mode |
WO2014108738A1 (en) * | 2013-01-08 | 2014-07-17 | Nokia Corporation | Audio signal multi-channel parameter encoder |
CN116665683A (zh) * | 2013-02-21 | 2023-08-29 | 杜比国际公司 | Method for parametric multi-channel encoding |
WO2014174344A1 (en) * | 2013-04-26 | 2014-10-30 | Nokia Corporation | Audio signal encoder |
US9412385B2 (en) * | 2013-05-28 | 2016-08-09 | Qualcomm Incorporated | Performing spatial masking with respect to spherical harmonic coefficients |
KR20160015280A (ko) * | 2013-05-28 | 2016-02-12 | 노키아 테크놀로지스 오와이 | Audio signal encoder |
CN104282309A (zh) * | 2013-07-05 | 2015-01-14 | 杜比实验室特许公司 | Packet loss concealment apparatus and method, and audio processing system |
EP2830052A1 (en) * | 2013-07-22 | 2015-01-28 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Audio decoder, audio encoder, method for providing at least four audio channel signals on the basis of an encoded representation, method for providing an encoded representation on the basis of at least four audio channel signals and computer program using a bandwidth extension |
EP2838086A1 (en) * | 2013-07-22 | 2015-02-18 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | In an reduction of comb filter artifacts in multi-channel downmix with adaptive phase alignment |
CN104681029B (zh) * | 2013-11-29 | 2018-06-05 | 华为技术有限公司 | Method and apparatus for encoding stereo phase parameters |
US9595269B2 (en) * | 2015-01-19 | 2017-03-14 | Qualcomm Incorporated | Scaling for gain shape circuitry |
EP3067886A1 (en) * | 2015-03-09 | 2016-09-14 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Audio encoder for encoding a multichannel signal and audio decoder for decoding an encoded audio signal |
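JP6721977B2 (ja) * | 2015-12-15 | 2020-07-15 | パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカPanasonic Intellectual Property Corporation of America | Speech/audio signal encoding device, speech/audio signal decoding device, speech/audio signal encoding method, and speech/audio signal decoding method |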
WO2017125559A1 (en) * | 2016-01-22 | 2017-07-27 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Apparatuses and methods for encoding or decoding an audio multi-channel signal using spectral-domain resampling |
US9978381B2 (en) * | 2016-02-12 | 2018-05-22 | Qualcomm Incorporated | Encoding of multiple audio signals |
CN107731238B (zh) * | 2016-08-10 | 2021-07-16 | 华为技术有限公司 | Multi-channel signal encoding method and encoder |
2016
- 2016-08-10 CN: application CN201610652506.XA, publication CN107731238B (zh), status Active
2017
- 2017-02-22 RU: application RU2019106315A, publication RU2705427C1 (ru), status Active
- 2017-02-22 EP: application EP22179454.8A, publication EP4120252A1 (en), status Pending
- 2017-02-22 EP: application EP17838306.3A, publication EP3493203B1 (en), status Active
- 2017-02-22 CA: application CA3033225A, publication CA3033225C (en), status Active
- 2017-02-22 AU: application AU2017310759A, publication AU2017310759B2 (en), status Active
- 2017-02-22 KR: application KR1020227005726A, publication KR102486604B1 (ko), status IP Right Grant
- 2017-02-22 KR: application KR1020217001206A, publication KR102367538B1 (ko), status IP Right Grant
- 2017-02-22 WO: application PCT/CN2017/074419, publication WO2018028170A1 (zh), status unknown
- 2017-02-22 KR: application KR1020197005937A, publication KR102205596B1 (ko), status IP Right Grant
- 2017-02-22 JP: application JP2019507137A, publication JP6768924B2 (ja), status Active
- 2017-02-22 ES: application ES17838306T, publication ES2928335T3 (es), status Active
2019
- 2019-02-11 US: application US16/272,397, publication US11133014B2 (en), status Active
2020
- 2020-09-23 JP: application JP2020158348A, publication JP7091411B2 (ja), status Active
- 2020-11-12 AU: application AU2020267256A, publication AU2020267256B2 (en), status Active
2021
- 2021-08-20 US: application US17/408,116, publication US11935548B2 (en), status Active
2022
- 2022-06-15 JP: application JP2022096616A, publication JP7443423B2 (ja), status Active
- 2022-08-17 AU: application AU2022218507A, publication AU2022218507B2 (en), status Active
2024
- 2024-01-23 US: application US18/419,794, publication US20240161756A1 (en), status Pending
- 2024-02-21 JP: application JP2024024588A, publication JP2024063059A (ja), status Pending
- 2024-07-30 AU: application AU2024205199A, publication AU2024205199A1 (en), status Pending
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10905879B2 (en) | 2014-06-02 | 2021-02-02 | Cala Health, Inc. | Methods for peripheral nerve stimulation |
US10960207B2 (en) | 2014-06-02 | 2021-03-30 | Cala Health, Inc. | Systems for peripheral nerve stimulation |
US12109413B2 (en) | 2014-06-02 | 2024-10-08 | Cala Health, Inc. | Systems and methods for peripheral nerve stimulation to treat tremor |
US10765856B2 (en) | 2015-06-10 | 2020-09-08 | Cala Health, Inc. | Systems and methods for peripheral nerve stimulation to treat tremor with detachable therapy and monitoring units |
US11596785B2 (en) | 2015-09-23 | 2023-03-07 | Cala Health, Inc. | Systems and methods for peripheral nerve stimulation in the finger or hand to treat hand tremors |
US11344722B2 (en) | 2016-01-21 | 2022-05-31 | Cala Health, Inc. | Systems, methods and devices for peripheral neuromodulation for treating diseases related to overactive bladder |
US11918806B2 (en) | 2016-01-21 | 2024-03-05 | Cala Health, Inc. | Systems, methods and devices for peripheral neuromodulation of the leg |
US11331480B2 (en) | 2017-04-03 | 2022-05-17 | Cala Health, Inc. | Systems, methods and devices for peripheral neuromodulation for treating diseases related to overactive bladder |
US11857778B2 (en) | 2018-01-17 | 2024-01-02 | Cala Health, Inc. | Systems and methods for treating inflammatory bowel disease through peripheral nerve stimulation |
WO2020069219A1 (en) | 2018-09-26 | 2020-04-02 | Cala Health, Inc. | Predictive therapy neurostimulation systems |
EP4338662A3 (en) * | 2018-09-26 | 2024-04-17 | Cala Health, Inc. | Predictive therapy neurostimulation systems |
US11890468B1 (en) | 2019-10-03 | 2024-02-06 | Cala Health, Inc. | Neurostimulation systems with event pattern detection and classification |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
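WO2018028170A1 (zh) | | Multi-channel signal encoding method and encoder |
WO2018028171A1 (zh) | | Multi-channel signal encoding method and encoder |
WO2017206794A1 (zh) | | Method and apparatus for extracting an inter-channel phase difference parameter |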
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 17838306; Country of ref document: EP; Kind code of ref document: A1 |
 | ENP | Entry into the national phase | Ref document number: 3033225; Country of ref document: CA |
 | ENP | Entry into the national phase | Ref document number: 2019507137; Country of ref document: JP; Kind code of ref document: A |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | ENP | Entry into the national phase | Ref document number: 20197005937; Country of ref document: KR; Kind code of ref document: A |
 | ENP | Entry into the national phase | Ref document number: 2017310759; Country of ref document: AU; Date of ref document: 20170222; Kind code of ref document: A |
 | ENP | Entry into the national phase | Ref document number: 2017838306; Country of ref document: EP; Effective date: 20190227 |
 | REG | Reference to national code | Ref country code: BR; Ref legal event code: B01A; Ref document number: 112019002656; Country of ref document: BR |
 | ENP | Entry into the national phase | Ref document number: 112019002656; Country of ref document: BR; Kind code of ref document: A2; Effective date: 20190208 |