EP3427259A1 - Method and apparatus for increasing stability of an inter-channel time difference parameter
- Publication number
- EP3427259A1 (application EP17709654.2A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- ictd
- estimate
- valid
- icc
- hang
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/26—Pre-filtering or post-filtering
- G10L19/265—Pre-filtering, e.g. high frequency emphasis prior to encoding
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0272—Voice signal separating
- G10L21/0308—Voice signal separating characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/06—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being correlation coefficients
Definitions
- the present application relates to parametric coding of spatial audio or stereo signals.
- Spatial or 3D audio is a generic formulation which denotes various kinds of multi-channel audio signals.
- the audio scene is represented by a spatial audio format.
- Typical spatial audio formats defined by the capturing method are for example denoted as stereo, binaural, ambisonics, etc.
- Spatial audio rendering systems are able to render spatial audio scenes with stereo (left and right channels 2.0) or more advanced multichannel audio signals (2.1, 5.1, 7.1, etc.).
- Recent technologies for the transmission and manipulation of such audio signals allow the end user to have an enhanced audio experience with higher spatial quality often resulting in a better intelligibility as well as an augmented reality.
- Spatial audio coding techniques, such as MPEG Surround or MPEG-H 3D Audio, generate a compact representation of spatial audio signals which is compatible with data-rate-constrained applications such as streaming over the internet.
- The transmission of spatial audio signals is however limited when the data rate constraint is strong, and therefore post-processing of the decoded audio channels is also used to enhance the spatial audio playback.
- Commonly used techniques are for example able to blindly up-mix decoded mono or stereo signals into multi-channel audio (5.1 channels or more). In order to efficiently render spatial audio scenes, the spatial audio coding and processing technologies make use of the spatial characteristics of the multi-channel audio signal.
- the time and level differences between the channels of the spatial audio capture are used to approximate the inter-aural cues which characterize our perception of directional sounds in space. Since the inter-channel time and level differences are only an approximation of what the auditory system is able to detect (i.e. the inter-aural time and level differences at the ear entrances), it is of high importance that the inter-channel time difference is relevant from a perceptual aspect.
- the inter-channel time and level differences are commonly used to model the directional components of multi-channel audio signals, while the inter-channel cross-correlation - that models the inter-aural cross-correlation (IACC) - is used to characterize the width of the audio image. Especially for lower frequencies the stereo image may as well be modeled with inter-channel phase differences (ICPD).
- inter-aural level difference ILD
- inter-aural time difference ITD
- inter-aural coherence or correlation IC or IACC
- inter-channel level difference ICLD
- inter-channel time difference ICTD
- inter-channel coherence or correlation ICC
- FIG 2 illustrates a basic block diagram of a parametric stereo coder 200.
- a stereo signal pair is input to the stereo encoder 201.
- the parameter extraction 202 aids the down-mix process, where a downmixer 204 prepares a single channel representation of the two input channels to be encoded with a mono encoder 206. That is, the stereo channels are down-mixed into a mono signal 207 that is encoded and transmitted to the decoder 203 together with encoded parameters 205 describing the spatial image.
- the stereo parameters are represented in spectral sub-bands on a perceptual frequency scale such as the equivalent rectangular bandwidth (ERB) scale.
- ERB equivalent rectangular bandwidth
- the decoder performs stereo synthesis based on the decoded mono signal and the transmitted parameters. That is, the decoder reconstructs the single channel using a mono decoder 210 and synthesizes the stereo channels using the parametric representation.
- the decoded mono signal and received encoded parameters are input to a parametric synthesis unit 212 or process that decodes the parameters, synthesizes the stereo channels using the decoded parameters, and outputs a synthesized stereo signal pair.
- Since the encoded parameters are used to render spatial audio for the human auditory system, it is important that the inter-channel parameters are extracted and encoded with perceptual considerations for maximized perceived quality.
- Stereo and multi-channel audio signals are complex signals difficult to model especially when the environment is noisy or reverberant or when various audio components of the mixtures overlap in time and frequency i.e. noisy speech, speech over music or
- the object of the embodiments is to increase the stability of the ICTD parameter, thereby improving both the down-mix signal that is encoded by the mono codec and the perceived stability in the spatial audio rendering in the decoder.
- a method for increasing stability of an inter-channel time difference (ICTD) parameter in parametric audio coding wherein a multi-channel audio input signal comprising at least two channels is received.
- the method comprises obtaining an ICTD estimate, ICTD_est(m), for an audio frame m and a stability estimate of said ICTD estimate, and determining whether the obtained ICTD estimate, ICTD_est(m), is valid. If ICTD_est(m) is not found valid, and a determined sufficient number of valid ICTD estimates have been found in preceding frames, a hang-over time is determined using the stability estimate.
- a previously obtained valid ICTD parameter, ICTD(m-1), is selected as an output parameter, ICTD(m), during the hang-over time.
- the output parameter, ICTD(m), is set to zero if a valid ICTD_est(m) is not found during the hang-over time.
- an apparatus for parametric audio coding.
- the apparatus is configured to receive a multi-channel audio input signal comprising at least two channels, and to obtain an ICTD estimate, ICTD_est(m), for an audio frame m.
- the apparatus is configured to determine whether the obtained ICTD estimate, ICTD_est(m), is valid and to obtain a stability estimate of said ICTD estimate.
- the apparatus is further configured to determine a hang-over time using the stability estimate if ICTD_est(m) is not found valid and a determined sufficient number of valid ICTD estimates have been found in preceding frames, to select a previously obtained valid ICTD parameter, ICTD(m-1), as an output parameter, ICTD(m), during the hang-over time, and to set the output parameter, ICTD(m), to zero if a valid ICTD_est(m) is not found during the hang-over time.
- a computer program comprises instructions which, when executed on at least one processor, cause the at least one processor to obtain an ICTD estimate, ICTD_est(m), for an audio frame m and a stability estimate of said ICTD estimate, and to determine whether the obtained ICTD estimate, ICTD_est(m), is valid. If ICTD_est(m) is not found valid, and a determined sufficient number of valid ICTD estimates have been found in preceding frames, the instructions cause the at least one processor to determine a hang-over time using the stability estimate, to select a previously obtained valid ICTD parameter, ICTD(m-1), as an output parameter, ICTD(m), during the hang-over time, and to set the output parameter, ICTD(m), to zero if a valid ICTD_est(m) is not found during the hang-over time.
- a method comprises obtaining a long term estimate of the stability of the ICTD parameter by averaging an ICC measure, and when reliable ICTD estimates cannot be obtained, using this stability estimate to determine a hysteresis period, or hang-over time, when a previously obtained reliable ICTD estimate is used. If reliable ICTD estimates are not obtained within the hysteresis period, the ICTD is set to zero.
- Figure 1 illustrates spatial audio playback with a 5.1 surround system.
- Figure 2 illustrates a basic block diagram of a parametric stereo coder.
- FIG. 3 illustrates the pure delay situation.
- Figure 4a is a flow chart illustration of the ICTD/ICC processing according to an embodiment.
- Figure 4b is a flow chart illustration of the ICTD/ICC processing in the branch of relevant ICTD est (m) according to an embodiment.
- Figure 4c is a flow chart illustration of the ICTD/ICC processing in the branch of non-relevant ICTD est (m) according to an embodiment.
- Figure 5 shows a mapping function for determining a number of hang-over frames according to an embodiment.
- Figure 6 illustrates an example of how the ITD hang-over logic is applied according to an embodiment.
- Figure 7 illustrates an example of a parameter hysteresis unit.
- Figure 8 is another example illustration of a parameter hysteresis unit.
- Figure 9 illustrates an apparatus for implementing the methods described herein.
- Figure 10 illustrates a parameter hysteresis unit according to an embodiment.
- the ICC is conventionally obtained as the maximum of the CCF which is normalized by the signal energies as follows
- the time lag τ corresponding to the ICC is determined as the ICTD between the channels x and y.
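A minimal sketch of the conventional estimate described above, in plain Python: the CCF is normalized by the signal energies, its maximum is taken as the ICC, and the lag of that maximum as the ICTD. The function names, the `max_lag` search limit and the lag sign convention (a positive lag means y lags x) are assumptions made for illustration and are not taken from the application.

```python
import math


def normalized_ccf(x, y, max_lag):
    """Map each lag in [-max_lag, max_lag] to r_xy[lag] / sqrt(E_x * E_y).

    Assumed convention: r_xy[lag] = sum_n x[n] * y[n + lag].
    """
    energy = math.sqrt(sum(v * v for v in x) * sum(v * v for v in y))
    ccf = {}
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for n in range(len(x)):
            if 0 <= n + lag < len(y):
                acc += x[n] * y[n + lag]
        ccf[lag] = acc / energy if energy > 0.0 else 0.0
    return ccf


def icc_and_ictd(x, y, max_lag):
    """Return (ICC, ICTD): the CCF maximum and the lag where it occurs."""
    ccf = normalized_ccf(x, y, max_lag)
    ictd = max(ccf, key=ccf.get)
    return ccf[ictd], ictd


# Usage: y is x delayed by two samples.
x = [0, 0, 1, 2, 3, 2, 1, 0, 0, 0, 0, 0]
y = [0, 0, 0, 0, 1, 2, 3, 2, 1, 0, 0, 0]
icc, ictd = icc_and_ictd(x, y, max_lag=4)
```

For a pure delay the two channels are perfectly correlated at the true lag, so the normalized maximum reaches one in this example.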
- the cross-correlation function can equivalently be expressed as a function of the cross-spectrum of the frequency spectra X[k] and Y[k] (with discrete frequency index k) as
- X[k] is the discrete Fourier transform (DFT) of the time domain signal x[n], i.e.
- the delta functions might then be spread into each other and make it difficult to identify the several delays within the signal frame.
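- generalized cross-correlation GCC
- phase transform PHAT
- the generalized cross-correlation applies a frequency weighting to the cross-spectrum.
- Especially for spatial audio, the phase transform is used as the frequency weighting.
- the phase transform is basically based on the absolute value of each frequency coefficient, i.e. the cross-spectrum is normalized by its magnitude so that only the phase information remains.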
- Figure 3 illustrates the pure delay situation.
- the middle plot shows the cross-correlation function (CCF) of the two signals. It corresponds to the autocorrelation of the source, displaced by a convolution with a delta function δ(τ - τ0).
- the bottom plot shows the GCC-PHAT of the input signals, yielding a delta function for the pure delay situation.
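The pure delay case can be reproduced with a short sketch, assuming a circular delay and a naive O(N²) DFT written out so that the example needs only the standard library. The helper names `dft` and `gcc_phat`, the `eps` guard and the choice of conj(X[k])·Y[k] as the cross-spectrum (so that a positive lag means y lags x) are illustrative assumptions, not the application's definitions.

```python
import cmath


def dft(values, inverse=False):
    """Naive discrete Fourier transform (or its inverse) of a sequence."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += values[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def gcc_phat(x, y, eps=1e-12):
    """Circular GCC-PHAT: inverse DFT of the magnitude-normalized cross-spectrum."""
    X, Y = dft(x), dft(y)
    cross = [a.conjugate() * b for a, b in zip(X, Y)]
    # PHAT weighting: keep only the phase of each cross-spectrum coefficient.
    weighted = [c / max(abs(c), eps) for c in cross]
    return [v.real for v in dft(weighted, inverse=True)]


# Usage: a decaying source and a copy delayed (circularly) by 3 samples.
N = 16
x = [0.5 ** n for n in range(N)]
y = [x[(n - 3) % N] for n in range(N)]
r = gcc_phat(x, y)
peak_lag = max(range(N), key=lambda t: r[t])
```

Because the weighting removes the source magnitude spectrum, `r` is close to a unit impulse at the delay, whereas the plain CCF would show the source autocorrelation shifted to that lag.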
- the present method is based on an adaptive hang-over time, also called a hang-over period, that depends on the long-term estimate of the ICC.
- a long term estimate of the stability of the ICTD parameter is obtained by averaging an ICC measure.
- the stability estimate is used to determine a hysteresis period, or hang-over time, when a previously obtained reliable estimate is used. If reliable estimates are not obtained within the hysteresis period, the ICTD is set to zero.
- spatial representation parameters for an audio input consisting of two or more audio channels. Each channel is segmented into time frames m.
- the spatial parameters are typically obtained for channel pairs, and for a stereo setup this pair is simply the left and right channel.
- n denotes sample number
- m denotes frame number.
- a cross-correlation measure and an ICTD estimate are obtained for each frame m. After the ICC(m) and ICTD_est(m) for the current frame have been obtained, a decision is made whether ICTD_est(m) is valid, i.e. relevant/useful/reliable, or not.
- the ICC is filtered to obtain an estimate of the peak envelope of the ICC.
- the output ICTD parameter ICTD(m) is set to the valid estimate ICTD_est(m).
- the terms “ICTD measure”, “ICTD parameter” and “ICTD value” are used interchangeably for ICTD(m).
- the hang-over counter N_HO is set to zero to indicate no hang-over state.
- a long term estimate of the ICC, ICC_LP(m), is initialized to 0.
- the counter N_HO keeps track of the number of hang-over frames to be used and the counter ICTD_count is used for maintaining the number of consecutively observed valid ICTD values. Both counters may be initialized to 0.
- the realization with discrete frame counters is just an example for implementing an adaptive hysteresis. For instance, a real-valued counter, a floating point counter or a fractional time counter may also be used, and the adaptive increment/decrement may also assume fractional values.
- the processing steps are repeated for each frame m. Given the input waveform signals x[n, m] and y[n, m] of frame m, a cross-correlation measure is obtained in block 403. In this embodiment the Generalized Cross Correlation with Phase Transform (GCC-PHAT) is used.
- GCC-PHAT Generalized Cross Correlation with Phase Transform
- an ICTD estimate, ICTD_est(m), is also obtained.
- the estimates for ICC and ICTD will be obtained using the same cross-correlation method to consume the least amount of computational power.
- the τ that maximizes the cross-correlation may be selected as the ICTD estimate.
- the GCC PHAT is used.
- the search range for τ would be limited to the range of ICTDs that needs to be represented, but it is also limited by the length of the audio frame and/or the length of the DFT used for the correlation computation (see N in equation (5)). This means that the audio frame length and DFT analysis windows need to be long enough to accommodate the longest time difference τ_max that needs to be represented.
- the search range would be [-τ_max, τ_max]. After the ICC(m) and ICTD_est(m) for the current frame have been obtained, a decision is made in block 407 whether ICTD_est(m) is valid or not. This may be done by comparing the relative peak magnitude of a cross-correlation function to a threshold ICC_thres(m) based on the cross-correlation function.
- Another method is to sort the search range and use the value at e.g. the 95 percentile multiplied with a constant.
- sort() is a function that sorts the input vector in ascending order.
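The percentile-based threshold can be sketched as follows. The 95th-percentile position comes from the text; the multiplying constant `scale=3.0`, the use of magnitudes and the function names are hypothetical choices made only to keep the example runnable.

```python
def icc_threshold(ccf, position=0.95, scale=3.0):
    """ICC_thres(m): a constant times the value at a fixed position of the sorted CCF.

    `scale` is a hypothetical constant; the text does not state its value.
    """
    ordered = sorted(abs(v) for v in ccf)      # ascending order, as sort() in the text
    index = min(int(position * len(ordered)), len(ordered) - 1)
    return scale * ordered[index]


def ictd_is_valid(ccf, position=0.95, scale=3.0):
    """Treat the estimate as valid when the CCF peak exceeds the threshold."""
    icc = max(abs(v) for v in ccf)
    return icc > icc_threshold(ccf, position, scale)


# Usage: one distinct peak over a low floor versus a flat correlation.
peaky = [0.05] * 39 + [0.9]
flat = [0.3] * 40
peaky_valid = ictd_is_valid(peaky)   # peak stands well above the floor
flat_valid = ictd_is_valid(flat)     # no lag stands out
```

Tying the threshold to the CCF itself makes the decision relative: a peak only counts when it clearly exceeds the bulk of the correlation values in the same frame.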
- the steps of block 409, outlined in figure 4b, are carried out.
- the ICC is filtered to obtain an estimate of the peak envelope of the ICC. This may be done using a first order IIR filter where the filter coefficient (forgetting/update factor) is dependent on the current ICC value relative to the last filtered ICC value.
- the motivation is to have an estimate of the last highest ICCs when coming to a situation where the ICC has dropped to a low level (and not just indicate the last few values in the transition to a low ICC).
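A minimal sketch of such a peak-following first order IIR filter is given below. The two coefficients (`attack`, `release`) and the function name are assumptions; the text only states that the update factor depends on whether the current ICC is above or below the last filtered value.

```python
def update_icc_lp(icc_lp_prev, icc, attack=0.9, release=0.1):
    """One step of a first order IIR filter that tracks the peak envelope of the ICC.

    attack/release are hypothetical coefficients: the filter follows rising
    values quickly and forgets past peaks slowly.
    """
    alpha = attack if icc > icc_lp_prev else release
    return alpha * icc + (1.0 - alpha) * icc_lp_prev


# Usage: two high-ICC frames followed by two low-ICC frames.
icc_lp = 0.0
trace = []
for icc in [0.8, 0.8, 0.1, 0.1]:
    icc_lp = update_icc_lp(icc_lp, icc)
    trace.append(icc_lp)
```

After the drop to 0.1 the filtered value stays above 0.6 in this example, so it still reflects the recent high ICC values rather than the transition itself.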
- the counter ICTD_count is incremented to keep track of the number of consecutive valid ICTDs.
- the ICTD_count is set to ICTD_maxcount if it is determined in block 423 that ICTD_maxcount is exceeded or if the system is currently in an ICTD hang-over state, i.e. N_HO > 0.
- the former criterion is there to prevent the counter from wrapping around in a limited precision integer number.
- the latter criterion would capture the event that a valid ICTD is found during a hang-over period. Setting the ICTD_count to ICTD_maxcount will trigger a new hang-over period, which may be desirable in this case.
- the output ICTD measure ICTD(m) is set to the valid estimate ICTD_est(m).
- the hang-over counter N_HO is also set to zero to indicate that the current state is not a hang-over state.
- If a sufficient number of valid ICTD measurements have been found in the preceding frames, which is determined in block 431, a hysteresis period, or hang-over time, is calculated in block 433.
- the sufficient number of valid ICTD measurements has been found if ICTD_count = ICTD_maxcount.
- In one embodiment ICTD_maxcount = 2, which means that two consecutive valid ICTD measurements are enough to trigger the hang-over logic.
- a higher ICTD_maxcount such as 3, 4 or 5 would also be possible. This would further restrict the hang-over logic to be used only when longer sequences of valid ICTD measurements have been observed.
- the hang-over time N_HO is adaptive and depends on the ICC such that if the recent ICC estimates have been low (corresponding to a low ICC_LP(m)), the hang-over time should be long, and vice versa.
- N_HOmax, c and d may be set to e.g.
- any parameter indicating the correlation, i.e. coherence or similarity, between the channels may be used as a control parameter ICC(m), but the mapping function described in equation (22) has to be adapted to give a suitable number of hang-over frames for the low/high correlation cases.
- a low correlation situation should give around 3-8 frames of hang-over, while a high correlation case should give 0 frames of hangover.
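One possible mapping with the behaviour described above is a clamped linear function of the long-term ICC. This form is an assumption, since the mapping equation is not reproduced here; the parameter names follow N_HOmax, c and d from the text, but the default values are hypothetical and chosen only so that low correlation yields several frames and high correlation yields none.

```python
def hangover_frames(icc_lp, n_ho_max=6, c=-10.0, d=8.0):
    """Map the long-term ICC to a number of hang-over frames.

    Assumed form: N_HO = max(0, min(N_HOmax, c * ICC_LP + d)), rounded to an
    integer. n_ho_max, c and d are hypothetical values.
    """
    return max(0, min(n_ho_max, round(c * icc_lp + d)))


# Usage: low correlation gives a long hang-over, high correlation gives none.
low = hangover_frames(0.2)
mid = hangover_frames(0.5)
high = hangover_frames(0.9)
```

With these illustrative values the three calls return 6, 3 and 0 frames, which stays within the 3-8 frame range mentioned for low correlation and gives no hang-over for high correlation.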
- If ICTD_count < ICTD_maxcount, this means either that an insufficient number of consecutive ICTD estimates have been registered in the past frames, or that the current state is a hang-over state.
- It is then checked whether N_HO > 0. If N_HO = 0, then ICTD(m) is set to 0 in block 439.
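The per-frame decision logic of the valid and non-valid branches can be put together in one sketch. Several points are assumptions rather than statements from the text: all tuning constants, the clamped linear hang-over mapping, resetting ICTD_count on a non-valid frame, and letting a newly computed hang-over take effect in the same frame. The names `HangoverState` and `process_frame` are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical tuning values, chosen only to make the sketch runnable.
ICTD_MAXCOUNT = 2              # the text mentions 2 as one embodiment
N_HO_MAX, C, D = 6, -10.0, 8.0
ATTACK, RELEASE = 0.9, 0.1


@dataclass
class HangoverState:
    ictd_prev: int = 0         # ICTD(m-1)
    ictd_count: int = 0        # consecutive valid estimates
    n_ho: int = 0              # remaining hang-over frames
    icc_lp: float = 0.0        # long-term (peak envelope) ICC


def process_frame(state, ictd_est, icc, valid):
    """Return ICTD(m) for one frame and update the state in place."""
    if valid:
        # Valid branch: update the ICC envelope and the counter, pass the estimate.
        alpha = ATTACK if icc > state.icc_lp else RELEASE
        state.icc_lp = alpha * icc + (1.0 - alpha) * state.icc_lp
        state.ictd_count += 1
        if state.ictd_count > ICTD_MAXCOUNT or state.n_ho > 0:
            state.ictd_count = ICTD_MAXCOUNT
        state.n_ho = 0
        ictd = ictd_est
    else:
        if state.ictd_count == ICTD_MAXCOUNT:
            # Enough valid frames were seen: start an adaptive hang-over.
            state.n_ho = max(0, min(N_HO_MAX, round(C * state.icc_lp + D)))
        state.ictd_count = 0   # assumption: the run of valid frames is broken
        if state.n_ho > 0:
            ictd = state.ictd_prev     # hold the previous value
            state.n_ho -= 1
        else:
            ictd = 0
    state.ictd_prev = ictd
    return ictd


# Usage: two valid frames (ICTD 5, ICC 0.5) followed by four non-valid frames.
state = HangoverState()
frames = [(5, 0.5, True), (5, 0.5, True)] + [(0, 0.1, False)] * 4
outputs = [process_frame(state, *f) for f in frames]
```

In this run the long-term ICC is about 0.5 when the estimate becomes non-valid, so the previous ICTD is held for three frames before the output falls back to zero.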
- the top plot shows the audio input channels, in this case left and right of a stereo recording.
- the second plot shows the ICC(m) and ICC_LP(m) of the example file, and the bottom plot shows the ITD hang-over counter N_HO. It can be seen that the low correlation during the noisy speech segment at the beginning of the file triggers ITD hang-over frames, while the clean speech segment does not trigger any hang-over frames.
- FIG. 7 shows a parameter hysteresis unit 700 that takes ICTD_est(m), ICC(m) and Valid(ICTD_est(m)) as input parameters.
- the final parameter is a decision whether the ICTD est (m) is valid or not.
- the output parameter is the selected ICTD (m).
- An input 701 of the parameter hysteresis unit may be communicatively coupled to the parameter extraction unit 202 shown in figure 2, and an output 703 of the parameter hysteresis unit may be communicatively coupled to the parameter encoder 208 shown in figure 2.
- the parameter hysteresis unit may be comprised in the parameter extraction unit 202 shown in figure 2.
- Figure 8 describes a parameter hysteresis unit, or a hang-over logic unit 700 in more detail.
- correlation estimator 801. there may be benefits of having the ICC measure decoupled from the ICTD estimation. Further, the described method does not imply a certain method of deciding if the ICTD parameter is valid (i.e. reliable), but can be
- the ICC estimate is filtered by an ICC filter 805 to form a long-term estimate of the ICC, preferably tuned to follow the peaks of the ICC.
- An ICTD counter 807 keeps track of the number of consecutive valid ICTD estimates, ICTD_count, as well as the number of hang-over frames in a hang-over state, N_HO.
- FIG. 9 shows an example of an apparatus performing the method illustrated in Figures 4a-4c.
- the apparatus 900 comprises a processor 910, e.g. a central processing unit (CPU), and a computer program product 920 in the form of a memory for storing the instructions, e.g. computer program 930 that, when retrieved from the memory and executed by the processor 910 causes the apparatus 900 to perform processes connected with embodiments of the present adaptive parameter hysteresis processing.
- the apparatus may further comprise an input node for receiving input parameters, and an output node for outputting processed parameters.
- the input node and the output node are both communicatively coupled to the processor 910.
- the software or computer program 930 may be realized as a computer program product, which is normally carried or stored on a computer-readable medium, preferably non-volatile computer-readable storage medium.
- the computer-readable medium may include one or more removable or non-removable memory devices including, but not limited to, a Read-Only Memory (ROM), a Random Access Memory (RAM), a Compact Disc (CD), a Digital Versatile Disc (DVD), a Universal Serial Bus (USB) memory and a Hard Disk Drive (HDD).
- Figure 10 shows a device 1000 comprising a parameter hysteresis unit that is illustrated in Figures 7 and 8.
- the device may be an encoder, e.g., an audio encoder.
- An input signal is a stereo or multi-channel audio signal.
- the output signal is an encoded mono signal with encoded parameters describing the spatial image.
- the device may further comprise a transmitter (not shown) for transmitting the output signal to an audio decoder.
- the device may further comprise a downmixer and a parameter extraction unit/module, and a mono encoder and a parameter encoder as shown in figure 2.
- a device comprises obtaining units for obtaining a cross-correlation measure and an ICTD estimate, and a decision unit for deciding whether ICTD_est(m) is valid or not.
- the device further comprises an obtaining unit for obtaining an estimate of the peak envelope of the ICC, and determining units for determining whether a sufficient number of valid ICTD measurements have been found in the preceding frames and for determining whether a current state is a hang-over state.
- the device further comprises an output unit for outputting the ICTD measure.
- the method for increasing stability of an inter-channel time difference (ICTD) parameter in parametric audio coding comprises receiving a multi-channel audio input signal comprising at least two channels, obtaining an ICTD estimate, ICTD_est(m), for an audio frame m, determining whether the obtained ICTD estimate, ICTD_est(m), is valid, and obtaining a stability estimate of said ICTD estimate.
- If ICTD_est(m) is not found valid, and a determined sufficient number of valid ICTD estimates have been found in preceding frames, the method comprises determining a hang-over time using the stability estimate, selecting a previously obtained valid ICTD parameter, ICTD(m-1), as an output parameter, ICTD(m), during the hang-over time, and setting the output parameter, ICTD(m), to zero if a valid ICTD_est(m) is not found during the hang-over time.
- the stability estimate is an inter channel correlation (ICC) measure between a channel pair for an audio frame m.
- the stability estimate is a low-pass filtered inter-channel correlation, ICC LP (m).
- the stability estimate is calculated by averaging the ICC measure, ICC(m).
- the hang-over time is adaptive. For instance, the hang-over is applied with an increasing number of frames for decreasing ICC_LP(m). In an embodiment a Generalized Cross Correlation with Phase Transform is used for obtaining the ICC measure for the frame m.
- ICTD_est(m) is determined to be valid if the inter-channel correlation measure, ICC(m), is larger than a threshold ICC_thres(m).
- the validity of the obtained ICTD estimate, ICTD_est(m), is determined by comparing a relative peak magnitude of a cross-correlation function to a threshold, ICC_thres(m), based on the cross-correlation function.
- ICC_thres(m) may be formed by a constant multiplied by a value of the cross-correlation at a predetermined position in an ordered set of cross-correlation values for frame m.
- In an embodiment the sufficient number of valid ICTD estimates is 2.
- Embodiments of the present invention may be implemented in software, hardware, application logic or a combination of software, hardware and application logic.
- the software, application logic and/or hardware may reside on a memory, a microprocessor or a central processing unit. If desired, part of the software, application logic and/or hardware may reside on a host device or on a memory, a microprocessor or a central processing unit of the host.
- the application logic, software or an instruction set is maintained on any one of various conventional computer-readable media.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Mathematical Physics (AREA)
- Quality & Reliability (AREA)
- Stereophonic System (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP19189961.6A EP3582219B1 (de) | 2016-03-09 | 2017-03-08 | Method and apparatus for increasing stability of an inter-channel time difference parameter |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201662305683P | 2016-03-09 | 2016-03-09 | |
PCT/EP2017/055430 WO2017153466A1 (en) | 2016-03-09 | 2017-03-08 | A method and apparatus for increasing stability of an inter-channel time difference parameter |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP19189961.6A Division EP3582219B1 (de) | Method and apparatus for increasing stability of an inter-channel time difference parameter | 2016-03-09 | 2017-03-08 |
Publications (2)
Publication Number | Publication Date |
---|---|
EP3427259A1 (de) | 2019-01-16 |
EP3427259B1 (de) | 2019-08-07 |
Family
ID=58264521
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP19189961.6A Active EP3582219B1 (de) | Method and apparatus for increasing stability of an inter-channel time difference parameter | 2016-03-09 | 2017-03-08 |
EP17709654.2A Active EP3427259B1 (de) | Method and apparatus for increasing stability of an inter-channel time difference parameter | 2016-03-09 | 2017-03-08 |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP19189961.6A Active EP3582219B1 (de) | Method and apparatus for increasing stability of an inter-channel time difference parameter | 2016-03-09 | 2017-03-08 |
Country Status (8)
Country | Link |
---|---|
US (4) | US10832689B2 (de) |
EP (2) | EP3582219B1 (de) |
JP (2) | JP6641027B2 (de) |
AR (1) | AR107842A1 (de) |
AU (1) | AU2017229323B2 (de) |
ES (1) | ES2877061T3 (de) |
WO (1) | WO2017153466A1 (de) |
ZA (1) | ZA201804224B (de) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107742521B (zh) | 2016-08-10 | 2021-08-13 | Huawei Technologies Co., Ltd. | Encoding method and encoder for multi-channel signals |
CN109215667B (zh) | 2017-06-29 | 2020-12-22 | Huawei Technologies Co., Ltd. | Delay estimation method and apparatus |
EP3588495A1 (de) * | 2018-06-22 | 2020-01-01 | FRAUNHOFER-GESELLSCHAFT zur Förderung der angewandten Forschung e.V. | Multi-channel audio coding |
US11606659B2 (en) * | 2021-03-29 | 2023-03-14 | Zoox, Inc. | Adaptive cross-correlation |
JP2024521486A (ja) * | 2021-06-15 | 2024-05-31 | Telefonaktiebolaget LM Ericsson (publ) | Improved stability of inter-channel time difference (ITD) estimator for coincident stereo capture |
WO2024160859A1 (en) | 2023-01-31 | 2024-08-08 | Telefonaktiebolaget Lm Ericsson (Publ) | Refined inter-channel time difference (itd) selection for multi-source stereo signals |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH05130067A (ja) * | 1991-10-31 | 1993-05-25 | Nec Corp | Variable-threshold voice detector |
WO2010037426A1 (en) * | 2008-10-03 | 2010-04-08 | Nokia Corporation | An apparatus |
US8504378B2 (en) * | 2009-01-22 | 2013-08-06 | Panasonic Corporation | Stereo acoustic signal encoding apparatus, stereo acoustic signal decoding apparatus, and methods for the same |
PL3035330T3 (pl) * | 2011-02-02 | 2020-05-18 | Telefonaktiebolaget Lm Ericsson (Publ) | Determining the inter-channel time difference of a multi-channel audio signal |
DK3182409T3 (en) * | 2011-02-03 | 2018-06-14 | Ericsson Telefon Ab L M | Determining the inter-channel time difference for a multi-channel signal |
ES2571742T3 (es) * | 2012-04-05 | 2016-05-26 | Huawei Tech Co Ltd | Method for determining an encoding parameter for a multi-channel audio signal and a multi-channel audio encoder |
EP2834813B1 (de) * | 2012-04-05 | 2015-09-30 | Huawei Technologies Co., Ltd. | Multi-channel audio encoder and method for encoding a multi-channel audio signal |
EP2648418A1 (de) * | 2012-04-05 | 2013-10-09 | Thomson Licensing | Synchronization of multimedia streams |
JP5970985B2 (ja) * | 2012-07-05 | 2016-08-17 | Oki Electric Industry Co., Ltd. | Audio signal processing device, method, and program |
- 2017
  - 2017-03-08: JP, application JP2018546695A, publication JP6641027B2 (Active)
  - 2017-03-08: US, application US16/082,137, publication US10832689B2 (Active)
  - 2017-03-08: EP, application EP19189961.6A, publication EP3582219B1 (Active)
  - 2017-03-08: ES, application ES19189961T, publication ES2877061T3 (Active)
  - 2017-03-08: WO, application PCT/EP2017/055430, publication WO2017153466A1 (Application Filing)
  - 2017-03-08: EP, application EP17709654.2A, publication EP3427259B1 (Active)
  - 2017-03-08: AU, application AU2017229323A, publication AU2017229323B2 (Active)
  - 2017-03-09: AR, application ARP170100591A, publication AR107842A1 (IP Right Grant)
- 2018
  - 2018-06-22: ZA, application ZA201804224A, publication ZA201804224B (status unknown)
- 2019
  - 2019-12-26: JP, application JP2019236198A, publication JP6858836B2 (Active)
- 2020
  - 2020-10-09: US, application US17/066,541, publication US11380337B2 (Active)
- 2022
  - 2022-06-16: US, application US17/842,499, publication US11869518B2 (Active)
- 2023
  - 2023-12-04: US, application US18/528,082, publication US20240177719A1 (Pending)
Also Published As
Publication number | Publication date |
---|---|
US20200286495A1 (en) | 2020-09-10 |
US11380337B2 (en) | 2022-07-05 |
US11869518B2 (en) | 2024-01-09 |
AU2017229323B2 (en) | 2020-01-16 |
EP3582219B1 (de) | 2021-05-05 |
WO2017153466A1 (en) | 2017-09-14 |
JP6858836B2 (ja) | 2021-04-14 |
ES2877061T3 (es) | 2021-11-16 |
AU2017229323A1 (en) | 2018-07-05 |
US20240177719A1 (en) | 2024-05-30 |
EP3427259B1 (de) | 2019-08-07 |
AR107842A1 (es) | 2018-06-13 |
JP6641027B2 (ja) | 2020-02-05 |
ZA201804224B (en) | 2019-11-27 |
US20210027793A1 (en) | 2021-01-28 |
US20220392463A1 (en) | 2022-12-08 |
US10832689B2 (en) | 2020-11-10 |
JP2019511864A (ja) | 2019-04-25 |
EP3582219A1 (de) | 2019-12-18 |
JP2020065283A (ja) | 2020-04-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11869518B2 (en) | Method and apparatus for increasing stability of an inter-channel time difference parameter | |
US11942098B2 (en) | Method and apparatus for adaptive control of decorrelation filters | |
EP2671222B1 (de) | Determining the inter-channel time difference of a multi-channel audio signal |
Legal Events
Code | Title | Description |
---|---|---|
STAA | Information on the status of an EP patent application or granted EP patent | Free format text: STATUS: UNKNOWN |
STAA | Information on the status of an EP patent application or granted EP patent | Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE |
PUAI | Public reference made under Article 153(3) EPC to a published international application that has entered the European phase | Free format text: ORIGINAL CODE: 0009012 |
STAA | Information on the status of an EP patent application or granted EP patent | Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE |
17P | Request for examination filed | Effective date: 20180828 |
AK | Designated contracting states | Kind code of ref document: A1 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
AX | Request for extension of the European patent | Extension state: BA ME |
GRAP | Despatch of communication of intention to grant a patent | Free format text: ORIGINAL CODE: EPIDOSNIGR1 |
STAA | Information on the status of an EP patent application or granted EP patent | Free format text: STATUS: GRANT OF PATENT IS INTENDED |
DAV | Request for validation of the European patent (deleted) | |
DAX | Request for extension of the European patent (deleted) | |
INTG | Intention to grant announced | Effective date: 20190226 |
GRAS | Grant fee paid | Free format text: ORIGINAL CODE: EPIDOSNIGR3 |
GRAA | (expected) grant | Free format text: ORIGINAL CODE: 0009210 |
STAA | Information on the status of an EP patent application or granted EP patent | Free format text: STATUS: THE PATENT HAS BEEN GRANTED |
AK | Designated contracting states | Kind code of ref document: B1 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
REG | Reference to a national code | Ref country code: GB Ref legal event code: FG4D |
REG | Reference to a national code | Ref country code: CH Ref legal event code: EP Ref country code: AT Ref legal event code: REF Ref document number: 1165063 Country of ref document: AT Kind code of ref document: T Effective date: 20190815 |
REG | Reference to a national code | Ref country code: DE Ref legal event code: R096 Ref document number: 602017005971 Country of ref document: DE |
REG | Reference to a national code | Ref country code: IE Ref legal event code: FG4D |
REG | Reference to a national code | Ref country code: SE Ref legal event code: TRGR |
REG | Reference to a national code | Ref country code: NL Ref legal event code: MP Effective date: 20190807 |
REG | Reference to a national code | Ref country code: LT Ref legal event code: MG4D |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] | Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT. Ref country code (effective date): BG (20191107), LT (20190807), NL (20190807), FI (20190807), NO (20191107), HR (20190807), PT (20191209) |
REG | Reference to a national code | Ref country code: AT Ref legal event code: MK05 Ref document number: 1165063 Country of ref document: AT Kind code of ref document: T Effective date: 20190807 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] | Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT. Ref country code (effective date): LV (20190807), IS (20191207), AL (20190807), GR (20191108), RS (20190807), ES (20190807) |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] | Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT. Ref country code (effective date): DK (20190807), AT (20190807), PL (20190807), IT (20190807), EE (20190807), RO (20190807) |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] | Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT. Ref country code (effective date): IS (20200224), SM (20190807), CZ (20190807), SK (20190807) |
REG | Reference to a national code | Ref country code: DE Ref legal event code: R097 Ref document number: 602017005971 Country of ref document: DE |
PLBE | No opposition filed within time limit | Free format text: ORIGINAL CODE: 0009261 |
STAA | Information on the status of an EP patent application or granted EP patent | Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
PG2D | Information on lapse in contracting state deleted | Ref country code: IS |
26N | No opposition filed | Effective date: 20200603 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] | Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT. Ref country code (effective date): MC (20190807) |
REG | Reference to a national code | Ref country code: CH Ref legal event code: PL |
REG | Reference to a national code | Ref country code: BE Ref legal event code: MM Effective date: 20200331 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] | Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES. Ref country code (effective date): LU (20200308) |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] | Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES. Ref country code (effective date): LI (20200331), IE (20200308), CH (20200331) |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] | Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES. Ref country code (effective date): BE (20200331) |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] | Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT. Ref country code (effective date): MT (20190807), CY (20190807) |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] | Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT. Ref country code (effective date): MK (20190807) |
P01 | Opt-out of the competence of the Unified Patent Court (UPC) registered | Effective date: 20230517 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] | Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT. Ref country code (effective date): SI (20190807) |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to EPO] | Year of fee payment: 8. Ref country code (payment date): DE (20240327), GB (20240327) |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to EPO] | Year of fee payment: 8. Ref country code (payment date): TR (20240226), SE (20240327), FR (20240325) |