WO2012034890A1 - Cross product enhanced subband block based harmonic transposition - Google Patents

Cross product enhanced subband block based harmonic transposition

Publication number
WO2012034890A1
Authority
WO
WIPO (PCT)
Prior art keywords
subband
samples
analysis
synthesis
input
Prior art date
Application number
PCT/EP2011/065318
Other languages
French (fr)
Inventor
Lars Villemoes
Original Assignee
Dolby International Ab
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dolby International Ab filed Critical Dolby International Ab
Priority to DK11763872.6T priority Critical patent/DK2617035T3/en
Priority to IL291501A priority patent/IL291501B2/en
Priority to KR1020237026369A priority patent/KR102694615B1/en
Priority to JP2013528595A priority patent/JP5951614B2/en
Priority to KR1020177014269A priority patent/KR101863035B1/en
Priority to IL296448A priority patent/IL296448A/en
Priority to UAA201304657A priority patent/UA105988C2/en
Priority to KR1020227029790A priority patent/KR102564590B1/en
Priority to MX2013002876A priority patent/MX2013002876A/en
Priority to SG2013011804A priority patent/SG188229A1/en
Priority to EP21204206.3A priority patent/EP3975178B1/en
Priority to KR1020197023879A priority patent/KR102073544B1/en
Priority to US13/822,601 priority patent/US9172342B2/en
Priority to KR1020187033935A priority patent/KR101980070B1/en
Priority to IL303921A priority patent/IL303921B1/en
Priority to KR1020147026155A priority patent/KR101744621B1/en
Priority to IL313284A priority patent/IL313284A/en
Priority to KR1020197013601A priority patent/KR102014696B1/en
Priority to BR122019025115-0A priority patent/BR122019025115B1/en
Priority to EP11763872.6A priority patent/EP2617035B1/en
Priority to AU2011304113A priority patent/AU2011304113C1/en
Priority to BR122019025142-8A priority patent/BR122019025142B1/en
Priority to PL11763872T priority patent/PL2617035T3/en
Priority to BR112013005676-2A priority patent/BR112013005676B1/en
Priority to ES11763872T priority patent/ES2699750T3/en
Priority to KR1020247026002A priority patent/KR20240122593A/en
Priority to EP22202637.9A priority patent/EP4148732B1/en
Priority to KR1020207002646A priority patent/KR102312475B1/en
Priority to EP22202639.5A priority patent/EP4145445B1/en
Priority to KR1020217032100A priority patent/KR102439053B1/en
Priority to KR1020137009361A priority patent/KR101610626B1/en
Priority to EP21204205.5A priority patent/EP3975177B1/en
Priority to KR1020187014134A priority patent/KR101924326B1/en
Priority to BR122019025121-5A priority patent/BR122019025121B1/en
Priority to CN201180044307.6A priority patent/CN103262164B/en
Priority to EP18198247.1A priority patent/EP3503100A1/en
Priority to RU2013117038/08A priority patent/RU2551817C2/en
Priority to IL298230A priority patent/IL298230B2/en
Priority to CA2808353A priority patent/CA2808353C/en
Publication of WO2012034890A1 publication Critical patent/WO2012034890A1/en
Priority to IL224785A priority patent/IL224785A/en
Priority to AU2015202647A priority patent/AU2015202647B2/en
Priority to IL240068A priority patent/IL240068A/en
Priority to US14/854,498 priority patent/US9735750B2/en
Priority to US15/480,859 priority patent/US9940941B2/en
Priority to AU2017204074A priority patent/AU2017204074C1/en
Priority to IL253387A priority patent/IL253387B/en
Priority to US15/904,702 priority patent/US10192562B2/en
Priority to IL259070A priority patent/IL259070A/en
Priority to AU2018241064A priority patent/AU2018241064B2/en
Priority to US16/211,563 priority patent/US10446161B2/en
Priority to IL265722A priority patent/IL265722B/en
Priority to US16/545,359 priority patent/US10706863B2/en
Priority to AU2020200340A priority patent/AU2020200340B2/en
Priority to US16/917,171 priority patent/US11355133B2/en
Priority to IL278478A priority patent/IL278478B/en
Priority to AU2021200095A priority patent/AU2021200095B2/en
Priority to IL285298A priority patent/IL285298B/en
Priority to AU2022201270A priority patent/AU2022201270B2/en
Priority to US17/829,733 priority patent/US11817110B2/en
Priority to AU2023201183A priority patent/AU2023201183B2/en
Priority to US18/376,913 priority patent/US12033645B2/en
Priority to US18/675,865 priority patent/US20240312470A1/en
Priority to AU2024204430A priority patent/AU2024204430A1/en

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208Noise filtering
    • G10L21/0216Noise filtering characterised by the method used for estimating noise
    • G10L21/0232Processing in the frequency domain
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/0204Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using subband decomposition
    • G10L19/0208Subband vocoders
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/022Blocking, i.e. grouping of samples in time; Choice of analysis windows; Overlap factoring
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/022Blocking, i.e. grouping of samples in time; Choice of analysis windows; Overlap factoring
    • G10L19/025Detection of transients or attacks for time/frequency resolution switching
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/26Pre-filtering or post-filtering
    • G10L19/265Pre-filtering, e.g. high frequency emphasis prior to encoding
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/038Speech enhancement, e.g. noise reduction or echo cancellation using band spreading techniques
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/04Time compression or expansion
    • HELECTRICITY
    • H03ELECTRONIC CIRCUITRY
    • H03GCONTROL OF AMPLIFICATION
    • H03G3/00Gain control in amplifiers or frequency changers
    • HELECTRICITY
    • H03ELECTRONIC CIRCUITRY
    • H03GCONTROL OF AMPLIFICATION
    • H03G3/00Gain control in amplifiers or frequency changers
    • H03G3/20Automatic control
    • H03G3/30Automatic control in amplifiers having semiconductor devices
    • H03G3/3089Control of digital or coded signals

Definitions

  • the present invention relates to audio source coding systems which make use of a harmonic transposition method for high-frequency reconstruction (HFR), to digital effect processors, such as exciters which generate harmonic distortion to add brightness to a processed signal, and to time stretchers which prolong a signal duration with maintained spectral content.
  • HFR high-frequency reconstruction
  • In WO98/57436 the concept of transposition was established as a method to recreate a high frequency band from a lower frequency band of an audio signal.
  • a substantial saving in bitrate can be obtained by using this concept in audio coding.
  • a low bandwidth signal is presented to a core waveform coder and the higher frequencies are regenerated using transposition and additional side information of very low bitrate describing the target spectral shape at the decoder side.
  • the harmonic transposition defined in WO98/57436 performs very well for complex musical material in a situation with low cross over frequency.
  • the principle of a harmonic transposition is that a sinusoid with frequency $\omega$ is mapped to a sinusoid with frequency $Q\omega$, where $Q > 1$ is an integer defining the order of the transposition. In contrast to this, a single sideband modulation (SSB) based HFR maps a sinusoid with frequency $\omega$ to a sinusoid with frequency $\omega + \Delta\omega$, where $\Delta\omega$ is a fixed frequency shift. Given a core signal with low bandwidth, a dissonant ringing artifact will result from the SSB transposition.
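The difference between the two frequency mappings can be illustrated with a small sketch. The function names and the example partials (a 200 Hz fundamental, a 450 Hz shift) are illustrative assumptions and do not come from the patent text. The sketch only maps frequencies; it does not process audio.

```python
def harmonic_transpose(freqs_hz, order):
    """Map each sinusoid frequency f to order * f (harmonic transposition)."""
    return [order * f for f in freqs_hz]


def ssb_transpose(freqs_hz, shift_hz):
    """Map each sinusoid frequency f to f + shift_hz (SSB-style fixed shift)."""
    return [f + shift_hz for f in freqs_hz]


def is_harmonic_series(freqs_hz, fundamental_hz, tol=1e-9):
    """Return True if every frequency is an integer multiple of fundamental_hz."""
    return all(
        abs(f / fundamental_hz - round(f / fundamental_hz)) < tol
        for f in freqs_hz
    )


# Hypothetical low-band partials of a periodic signal (fundamental 200 Hz).
partials = [200.0, 400.0, 600.0]

print(harmonic_transpose(partials, 2))  # [400.0, 800.0, 1200.0]
print(ssb_transpose(partials, 450.0))   # [650.0, 850.0, 1050.0]
```

In this example the harmonically transposed partials stay on multiples of the 200 Hz fundamental, while the shifted partials do not, which is one way to see why a fixed shift can sound dissonant. The order-2 output also skips the multiples 600 Hz and 1000 Hz.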
  • SSB single sideband modulation
  • high quality harmonic HFR methods employ complex modulated filter banks with very fine frequency resolution and a high degree of oversampling to reach the required audio quality.
  • the fine resolution is necessary to avoid unwanted intermodulation distortion arising from the nonlinear treatment of sums of sinusoids.
  • the high quality methods aim at having at most one sinusoid in each subband.
  • a high degree of oversampling in time is necessary to avoid alias type distortion, and a certain degree of oversampling in frequency is necessary to avoid pre-echoes for transient signals.
  • the obvious drawback is that the computational complexity becomes very high.
  • Another common drawback associated with harmonic transposers becomes apparent for signals with a prominent periodic structure. Such signals are superpositions of harmonically related sinusoids with frequencies $\Omega, 2\Omega, 3\Omega, \ldots$, where $\Omega$ is the fundamental frequency.
  • Upon harmonic transposition of order $Q$, the output sinusoids have frequencies $Q\Omega, 2Q\Omega, 3Q\Omega, \ldots$ which, in case
  • the present invention achieves at least one of these objects by providing devices and methods as set forth in the independent claims.
  • the invention provides a system configured to generate a time stretched and/or frequency transposed signal from an input signal.
  • the system comprises:
  • each analysis subband signal comprises a plurality of complex- valued analysis samples, each having a phase and a magnitude;
  • a subband processing unit configured to determine a synthesis subband signal from the Y analysis subband signals using a subband transposition factor Q and a subband stretch factor S, at least one of Q and S being greater than one, wherein the subband processing unit comprises:
  • a block extractor configured to:
  • a nonlinear frame processing unit configured to generate, on the basis of
  • the phase of the processed sample is based on the respective phases of the corresponding input sample in each of the Y frames of input samples
  • the magnitude of the processed sample is based on the magnitude of the corresponding input sample in each of the Y frames of input samples
  • an overlap and add unit configured to determine the synthesis subband signal by overlapping and adding the samples of a sequence of frames of processed samples
  • a synthesis filter bank configured to generate the time stretched and/or frequency transposed signal from the synthesis subband signal.
  • the invention provides a method for generating a time-stretched and/or frequency-transposed signal from an input signal.
  • the method comprises:
  • each analysis subband signal comprises a plurality of complex-valued analysis samples, each having a phase and a magnitude;
  • the phase of the processed sample is based on the respective phases of the corresponding input sample in at least one of the Y frames of input samples; and the magnitude of the processed sample is based on the magnitude of the corresponding input sample in each of the Y frames of input samples;
  • determining the synthesis subband signal by overlapping and adding the samples of a sequence of frames of processed samples; and
  • generating the time stretched and/or frequency transposed signal from the synthesis subband signal.
  • Y is an arbitrary integer greater than one.
  • a third aspect of the invention provides a computer program product including a computer readable medium (or data carrier) storing software instructions for causing a programmable computer to execute the method according to the second aspect.
  • the invention is based on the realization that the general concept of cross-product enhanced HFR will provide improved results when the data are processed arranged in blocks of complex subband samples. Inter alia, this makes it possible to apply a frame-wise phase offset to the samples, which has been found to reduce intermodulation products in some situations. It is further possible to apply a magnitude adjustment, which may lead to similar advantageous effects.
  • the inventive implementation of cross-product enhanced HFR includes subband block based harmonic transposition, which may significantly reduce intermodulation products.
  • a filter bank with a coarser frequency resolution and/or a lower degree of oversampling can be used while preserving a high output quality. In subband block based processing, a time block of complex subband samples is processed with a common phase modification, and the superposition of several modified samples to form an output subband sample has the net effect of suppressing intermodulation products which would otherwise occur when the input subband signal consists of several sinusoids.
  • Transposition based on block based subband processing has much lower computational complexity than high-resolution transposers and reaches almost the same quality for many signals.
  • the non-linear processing unit uses as input Y "corresponding" frames of input samples in the sense that the frames are synchronous or near synchronous.
  • the samples in the respective frames may relate to time intervals having a substantial time overlap between the frames.
  • the term “corresponding” is also used with respect to samples to indicate that these are synchronous or approximately so.
  • the term "frame" will be used interchangeably with "block".
  • the "block hop size" may be equal to the frame length (possibly adjusted with respect to downsampling if such is applied) or may be smaller than the frame length (possibly adjusted with respect to downsampling if such is applied), in which case consecutive frames overlap in the sense that an input sample may belong to more than one frame.
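The relation between frame length and block hop size can be sketched as follows. This is a minimal illustration and not the patent's block extractor: it ignores downsampling and interpolation, and the names `extract_frames`, `frame_len` and `hop` are assumptions introduced here.

```python
def extract_frames(samples, frame_len, hop):
    """Split a subband signal into frames of frame_len samples.

    A new frame starts every `hop` samples. If hop < frame_len, consecutive
    frames overlap, so one input sample can belong to more than one frame.
    """
    if frame_len <= 0 or hop <= 0:
        raise ValueError("frame_len and hop must be positive")
    frames = []
    start = 0
    while start + frame_len <= len(samples):
        frames.append(samples[start:start + frame_len])
        start += hop
    return frames


# Hop smaller than the frame length: samples 2 and 3 appear in both frames.
overlapping = extract_frames([0, 1, 2, 3, 4, 5], frame_len=4, hop=2)

# Hop equal to the frame length: the frames tile the signal without overlap.
tiled = extract_frames([0, 1, 2, 3, 4, 5], frame_len=3, hop=3)
```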
  • the system does not necessarily generate every processed sample in a frame by determining its phase and magnitude based on the phase and magnitude of all Y corresponding frames of input samples; without departing from the invention, the system may generate the phase and/or magnitude of some processed samples based on a smaller number of ...
  • the analysis filter bank is a quadrature mirror filter (QMF) bank or pseudo-QMF bank with any number of taps and points. It may for instance be a 64-point QMF bank.
  • the analysis filter bank may further be chosen from the class of windowed discrete Fourier transforms or wavelet transforms.
  • the synthesis filter bank matches the analysis filter bank by being, respectively, an inverse QMF bank, an inverse pseudo-QMF bank etc. It is known that such filter banks may have a relatively coarse frequency resolution and/or a relatively low degree of oversampling. Unlike the prior art, the invention may be embodied using such relatively simpler components without necessarily suffering from a decreased output quality; hence such embodiments represent an economic advantage over the prior art.
  • the analysis filter bank includes $N > 1$ analysis subbands indexed by an analysis subband index $n = 0, \ldots, N - 1$;
  • an analysis subband is associated with a frequency band of the input signal.
  • one or more of the following is true of the synthesis filter bank:
  • a synthesis time stride is $\Delta t_S$;
  • the synthesis filter bank includes $M > 1$ synthesis subbands indexed by a synthesis subband index $m = 0, \ldots, M - 1$;
  • a synthesis subband is associated with a frequency band of the time-stretched and/or frequency-transposed signal.
  • the control data may specify subbands (e.g., identified by indices) that differ in frequency by a fundamental frequency of the input signal.
  • the indices identifying the subbands may differ by an integer approximating the ratio of such fundamental frequency divided by the analysis frequency spacing. This will lead to a psychoacoustically pleasing output, as the new spectral components generated by the harmonic transposition will be compatible with the series of natural harmonics.
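As a small worked example of this index spacing, the sketch below rounds the ratio of fundamental frequency to analysis frequency spacing to the nearest integer. The function name and the numeric values (a 375 Hz band spacing, a 750 Hz fundamental) are assumptions chosen for illustration. Rounding to nearest is only one possible way to obtain "an integer approximating the ratio".

```python
def subband_index_offset(fundamental_hz, analysis_spacing_hz):
    """Integer approximating fundamental_hz / analysis_spacing_hz."""
    if analysis_spacing_hz <= 0:
        raise ValueError("analysis_spacing_hz must be positive")
    return round(fundamental_hz / analysis_spacing_hz)


# Hypothetical values: subbands spaced 375 Hz apart, fundamental at 750 Hz.
offset = subband_index_offset(750.0, 375.0)  # the two source subbands differ by 2
```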
  • the (input) analysis and (output) synthesis subband indices are chosen so as to satisfy equation (16) below.
  • A parameter a appearing in equation (16) makes it applicable to both oddly and evenly stacked filter banks.
  • with subband indices obtained as an approximate (e.g., least squares) solution to equation (16), the new spectral component obtained by harmonic transposition will be likely to be compatible with the series of natural harmonics.
  • the HFR will be likely to provide a faithful reconstruction of an original signal which has had its high-frequency content removed.
  • a further development of the preceding embodiment provides a way of selecting parameter r appearing in equation (16) and representing the order of the cross-product transposition. Given an output subband index m, each value of the transposition order r will determine two analysis subband indices $n_1, n_2$. This further development assesses the magnitudes of the two subbands for a number of r options and selects the value which maximizes the minimum of the two analysis subband magnitudes. This way of selecting indices may avoid the need to restore sufficient magnitude by amplifying weak components of the input signal, which may lead to poor output quality.
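The max-min selection rule described here can be sketched as below. Because equation (16) is not reproduced in this excerpt, the mapping from each order r to its pair of analysis subband indices is passed in as a precomputed dictionary. That dictionary, the function name and the example magnitudes are all assumptions made for illustration.

```python
def select_cross_order(subband_mags, candidates):
    """Pick the order r whose weaker source subband is strongest.

    subband_mags: magnitude per analysis subband index.
    candidates:   dict mapping each candidate order r to its (n1, n2) pair,
                  assumed to have been derived elsewhere (e.g. from eq. (16)).
    Returns (best_r, best_min_magnitude); ties keep the smallest r.
    """
    best_r, best_min = None, -1.0
    for r, (n1, n2) in sorted(candidates.items()):
        weakest = min(subband_mags[n1], subband_mags[n2])
        if weakest > best_min:
            best_r, best_min = r, weakest
    return best_r, best_min


# Hypothetical magnitudes and candidate index pairs for one target subband.
mags = [0.1, 0.9, 0.2, 0.8, 0.05]
pairs = {1: (1, 2), 2: (1, 3)}
best = select_cross_order(mags, pairs)  # order 2 wins: min(0.9, 0.8) > min(0.9, 0.2)
```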
  • the subband magnitudes may be computed in a manner per se known, such as by the square root of squared input samples forming a frame (block) or part of a frame.
  • a subband magnitude may also be computed as a magnitude of a central or near-central sample in a frame. Such a computation may provide a simple yet adequate magnitude measure.
  • a synthesis subband may receive contributions from harmonic transposition instances according to both direct processing and cross-product based processing.
  • decision criteria may be applied to determine whether a particular possibility of regenerating a missing partial by cross-product based processing is to be used or not.
  • this further development may be adapted to refrain from using one cross subband processing unit if one of the following conditions is fulfilled:
  • the ratio of the magnitude $M_S$ of the direct source term analysis subband yielding the synthesis subband and the least magnitude M in an optimal pair of cross source terms yielding the synthesis subband is greater than a predetermined constant
  • a fundamental frequency is smaller than the analysis filter bank spacing $\Delta f_A$.
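The two conditions above can be combined into a simple decision function. This is a sketch under stated assumptions: the parameter names are introduced here, the value of the predetermined constant (`ratio_limit`) is not given in this excerpt, and a zero cross-term magnitude is treated as an infinite ratio.

```python
def use_cross_processing(direct_mag, cross_min_mag, fundamental_hz,
                         analysis_spacing_hz, ratio_limit):
    """Return False if either stated condition says to skip cross processing.

    direct_mag:    magnitude of the direct source term analysis subband.
    cross_min_mag: least magnitude in the optimal pair of cross source terms.
    ratio_limit:   the predetermined constant (hypothetical value here).
    """
    # Condition: fundamental frequency below the analysis filter bank spacing.
    if fundamental_hz < analysis_spacing_hz:
        return False
    # Condition: direct term dominates the weakest cross term by too much.
    if cross_min_mag <= 0.0 or direct_mag / cross_min_mag > ratio_limit:
        return False
    return True


# Hypothetical numbers: the cross terms are strong enough and the fundamental
# is wider than one band, so cross-product processing would be kept.
keep = use_cross_processing(1.0, 0.8, 750.0, 375.0, ratio_limit=4.0)
```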
  • the invention includes downsampling (decimation) of the input signal. Indeed, one or more of the frames of input samples may be determined by downsampling the complex-valued analysis samples in a subband, as may be effected by the block extractor.
  • Equation (15) defines a relationship between the downsampling factors $D_1$, $D_2$, the subband stretch factor S and the subband transposition factor Q, and further with phase coefficients ...; this ensures a matching of the phase of the processed samples with the other components of the input signal, to which the processed samples are to be added.
  • the frames of processed samples are windowed before they are overlapped and added together.
  • a windowing unit may be adapted to apply a finite-length window function to frames of processed samples. Suitable window functions are enumerated in the appended claims.
  • WO2010/081892 are not entirely compatible with subband block based processing techniques from the outset. Although such a method may be satisfactorily applied to one of the subband samples in a block, it might lead to aliasing artifacts if it were extended in the straightforward manner to the other samples of the block.
  • window functions comprising window samples which add up - when weighted by complex weights and shifted by a hop size - to a substantially constant sequence.
  • the hop size may be the product of the block hop size h and the subband stretch factor S.
  • the use of such window functions reduces the impact of aliasing artifacts.
  • such window functions may also allow for other measures for reducing artifacts, such as phase rotations of processed samples.
  • consecutive complex weights, which are applied for assessing the condition on the window samples, differ only by a fixed phase rotation.
  • said fixed phase rotation is proportional to a fundamental frequency of the input signal.
  • the phase rotation may also be proportional to the order of the cross-product transposition to be applied and/or to the physical transposition parameter and/or to the difference of the downsampling factors and/or to the analysis time stride.
  • the phase rotation may be given by equation (21), at least in an approximate sense.
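The window condition described above (shifted, complex-weighted copies of the window adding up to a substantially constant sequence) can be checked numerically. The sketch below is an assumption-laden illustration: equation (21) is not reproduced in this excerpt, so the weights are simply modeled as exp(i·k·phase_step) for the k-th shifted copy, and a periodic Hann window is used only as a familiar test case, not as the window the patent prescribes.

```python
import cmath
import math


def hann_window(length):
    """Periodic Hann window of the given length."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / length) for n in range(length)]


def weighted_overlap_sum(window, hop, phase_step=0.0, n_frames=8):
    """Sum copies of `window` shifted by k*hop and weighted by exp(1j*k*phase_step).

    Consecutive weights differ only by the fixed rotation `phase_step`.
    Only the region where the full set of copies overlaps is returned.
    """
    length = len(window)
    total = [0j] * (hop * (n_frames - 1) + length)
    for k in range(n_frames):
        weight = cmath.exp(1j * k * phase_step)
        for n, w in enumerate(window):
            total[k * hop + n] += weight * w
    return total[length - hop:hop * n_frames]


def is_substantially_constant(sequence, tol=1e-9):
    """True if every element is within tol of the first element."""
    return all(abs(v - sequence[0]) <= tol for v in sequence)


# With no phase rotation, a periodic Hann window at 50 % overlap sums to a constant.
flat = is_substantially_constant(weighted_overlap_sum(hann_window(8), hop=4))
```

With a nonzero `phase_step` the same Hann window no longer satisfies the check, which gives some intuition for why the window would have to be adapted when a phase rotation tied to the fundamental frequency is introduced.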
  • the present invention enables cross-product enhanced harmonic transposition by modifying the synthesis windowing in response to a fundamental frequency parameter.
  • successive frames of processed samples are added with a certain overlap.
  • the frames of processed samples are suitably shifted by a hop size which is the block hop size h upscaled by the subband stretch factor S.
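A minimal overlap-add sketch shows how windowed frames are accumulated at the upscaled hop size. The helper names are assumptions, and the sketch assumes an integer stretch factor so that the output hop is a whole number of samples.

```python
def synthesis_hop(block_hop, stretch):
    """Output hop size: block hop size h upscaled by the stretch factor S."""
    return block_hop * stretch


def overlap_add(frames, window, hop):
    """Window each frame and add it into the output at offset k * hop."""
    if not frames:
        return []
    length = len(window)
    out = [0j] * (hop * (len(frames) - 1) + length)
    for k, frame in enumerate(frames):
        for n in range(length):
            out[k * hop + n] += window[n] * frame[n]
    return out


# Two hypothetical frames of four samples, rectangular window,
# block hop h = 1 and stretch factor S = 2, so the output hop is 2.
hop_out = synthesis_hop(1, 2)
signal = overlap_add([[1, 1, 1, 1], [1, 1, 1, 1]], [1, 1, 1, 1], hop_out)
```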
  • the system may regenerate missing partials not only by a cross-product based approach (such as by equation (13)) but also by a direct subband approach (such as by equation (5) or (11)).
  • a control unit is configured to control the operation of the system, including which approach is to be used to regenerate a particular missing partial.
  • a processed sample on the basis of more than three samples, i.e., for Y > 3.
  • a processed sample may be obtained by multiple instances of cross-product based harmonic transposition, by multiple instances of direct subband processing, or by a combination of cross-product transposition and direct transposition.
  • One embodiment is configured to determine a processed sample as a complex number having a magnitude which is a mean value of the respective magnitudes of corresponding input samples.
  • the mean value may be a (weighted) arithmetic, (weighted) geometric or (weighted) harmonic mean of two or more input samples.
  • the mean is based on two complex input samples.
  • the magnitude of the processed sample is a weighted geometric value. More preferably, the geometric value is weighted by parameters p and 1 - p, as in equation (13).
  • the geometrical magnitude weighting parameter p is a real number inversely proportional to the subband transposition factor Q.
  • the parameter p may further be inversely proportional to the stretch factor S.
  • the system is adapted to determine a processed sample as a complex number having a phase which is a linear combination of respective phases of corresponding input samples in the frames of input samples.
  • the linear combination may comprise phases relating to two input samples (Y = 2).
  • the linear combination of two phases may apply integer nonzero coefficients, the sum of which is equal to the stretch factor S multiplied by the subband transposition factor Q.
  • the phase obtained by such linear combination is further adjusted by a fixed phase correction parameter.
  • the phase of the processed sample may be given by equation (13).
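The verbal description of the processed sample for two inputs (a geometric magnitude weighted by p and 1 - p, and a phase that is a linear combination with integer coefficients plus a fixed correction) can be sketched as follows. Equation (13) itself is not reproduced in this excerpt, so this is a reading of the prose and not the patent's formula. In particular, which input receives weight p and which receives 1 - p, and the names `t1`, `t2` and `theta`, are assumptions.

```python
import cmath


def cross_product_sample(x1, x2, t1, t2, p, theta=0.0):
    """Combine two corresponding complex input samples into one processed sample.

    Magnitude: weighted geometric mean |x1|**(1 - p) * |x2|**p.
    Phase:     t1 * arg(x1) + t2 * arg(x2) + theta, with integer t1, t2
               (per the text, t1 + t2 equals S * Q).
    """
    magnitude = abs(x1) ** (1.0 - p) * abs(x2) ** p
    phase = t1 * cmath.phase(x1) + t2 * cmath.phase(x2) + theta
    return cmath.rect(magnitude, phase)


# Hypothetical inputs with S = 1 and Q = 2, so t1 + t2 = 2.
x1 = cmath.rect(4.0, 0.1)
x2 = cmath.rect(1.0, 0.2)
y = cross_product_sample(x1, x2, t1=1, t2=1, p=0.5)
# Here |y| is sqrt(4 * 1) = 2 and the phase of y is 0.1 + 0.2 = 0.3.
```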
  • the block extractor (or an analogous step in a method according to the invention) is adapted to interpolate two or more analysis samples from an analysis subband signal in order to obtain one input sample which will be included in a frame (block). Such interpolation may enable downsampling of the input signal by a non-integer factor.
  • the analysis samples to be interpolated may or may not be consecutive.
  • the configuration of the subband processing may be controlled by control data provided from outside the unit effecting the processing.
  • the control data may relate to momentary acoustic properties of the input signal.
  • the system itself may include a section adapted to determine momentary acoustic properties of the signal, such as the (dominant) fundamental frequency of the signal. Knowledge of the fundamental frequency provides guidance in selecting the analysis subbands from which the processed samples are to be derived. Suitably, the spacing of the analysis subbands is proportional to such fundamental frequency of the input signal.
  • the control data may also be provided from outside the system, preferably by being included in a coding format suitable for transmission as a bit stream over a digital communication network. In addition to the control data, such coding format may include information relating to lower-frequency components of a signal (e.g., components ...
  • the format preferably does not include complete information relating to higher-frequency components (pos. 702), which may be regenerated by the invention.
  • the invention may in particular provide a decoding system with a control data reception unit configured to receive such control data, whether included in a received bit stream that also encodes the input signal or received as a separate signal or bit stream.
  • a hardware implementation may include a pre-normalizer for rescaling the magnitudes of the corresponding input samples in some of the Y frames on which a frame of processed samples is to be based.
  • a processed sample can be computed as a (weighted) complex product of rescaled and, possibly, non-rescaled input samples.
  • An input sample appearing as a rescaled factor in the product normally need not reappear as a non-rescaled factor.
  • the phase correction parameter, it is possible to evaluate equation (13) as a product of (possibly rescaled) complex input samples. This represents a computational advantage in comparison with separate treatments of the magnitude and the phase of a processed sample.
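One way to see why a product of rescaled samples can replace separate magnitude and phase handling is the following sketch. It is not taken from the patent: the rescaling exponents are derived here under the assumptions that the target magnitude is |x1|**(1 - p) * |x2|**p, that the target phase is t1·arg(x1) + t2·arg(x2) with positive integers t1 and t2, and that no phase correction is applied.

```python
import cmath


def rescale(x, exponent):
    """Return x with magnitude |x|**exponent and unchanged phase."""
    r = abs(x)
    if r == 0.0:
        return 0j
    return x * (r ** (exponent - 1.0))


def product_form(x1, x2, t1, t2, p):
    """Processed sample as a product of integer powers of rescaled inputs."""
    u1 = rescale(x1, (1.0 - p) / t1)  # after raising to t1: |x1|**(1 - p)
    u2 = rescale(x2, p / t2)          # after raising to t2: |x2|**p
    return (u1 ** t1) * (u2 ** t2)


def polar_form(x1, x2, t1, t2, p):
    """Reference: magnitude and phase computed separately."""
    magnitude = abs(x1) ** (1.0 - p) * abs(x2) ** p
    phase = t1 * cmath.phase(x1) + t2 * cmath.phase(x2)
    return cmath.rect(magnitude, phase)


# Hypothetical inputs; the two forms agree up to floating-point rounding.
a = product_form(3 + 4j, 1j, t1=2, t2=1, p=1.0 / 3.0)
b = polar_form(3 + 4j, 1j, t1=2, t2=1, p=1.0 / 3.0)
```

The product form needs only magnitude rescaling and complex multiplications, and avoids evaluating phase angles explicitly.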
  • a system may comprise a plurality of subband processing units, each of which is configured to determine an intermediate synthesis subband signal using a different subband transposition factor and/or a different subband stretch factor and/or transposition method differing by being cross-product based or direct.
  • the subband processing units may be arranged in parallel, for parallel operation.
  • the system may further comprise a merging unit arranged downstream of the subband processing units and upstream of the synthesis filter bank.
  • the merging unit may be adapted to merge (e.g., by mixing together) corresponding intermediate synthesis subband signals to obtain the synthesis subband signal.
  • a system according to the embodiment may further comprise a core decoder for decoding a bit stream into an input signal. It may also comprise an HFR processing unit adapted to apply spectral band information, notably by performing spectral shaping.
  • the operation of the HFR processing unit may be controlled by information encoded in the bit stream.
  • One embodiment provides HFR of multi-dimensional signals, e.g., in a system for reproducing audio in a stereo format comprising Z channels, such as left, right, center, surround etc.
  • the processed samples of each channel are based on the same number of input samples although the stretch factor S and transposition factor Q for each band may vary between channels.
  • the implementation may comprise an analysis filter bank for producing Y analysis subband signals from each channel, a subband processing unit for generating Z subband signals and a synthesis filter bank for generating Z time stretched and/or frequency transposed signals which form the output signal.
  • the output signal may comprise output channels that are based on different numbers of analysis subband signals. For instance, it may be advisable to devote a greater amount of computational resources to HFR of acoustically prominent channels; e.g., channels to be reproduced by audio sources located in front of a listener may be favored over surround or rear channels.
  • Fig. 1 illustrates the principle of subband block based harmonic transposition.
  • Fig. 2 illustrates the operation of nonlinear subband block processing with one subband input.
  • Fig. 3 illustrates the operation of nonlinear subband block processing with two subband inputs.
  • Fig. 4 illustrates the operation of cross product enhanced subband block based harmonic transposition.
  • Fig. 5 illustrates an example scenario for the application of subband block based transposition using several orders of transposition in a HFR enhanced audio codec.
  • Fig. 6 illustrates an example scenario for the operation of a multiple order subband block based transposition applying a 64 band QMF analysis filter bank.
  • Figs. 7 and 8 illustrate experimental results of the described subband block based transposition method.
  • Fig. 9 shows a detail of the non-linear processing unit of Fig. 2, including a pre-normalizer and a multiplier.
  • Fig. 1 illustrates the principle of subband block based transposition, time stretch, or a combination of transposition and time stretch.
  • the input time domain signal is fed to an analysis filter bank 101 which provides a multitude of complex valued subband signals. These are fed to the subband processing unit 102, whose operation can be influenced by the control data 104.
  • Each output subband can either be obtained from the processing of one or from two input subbands, or even as a superposition of the results of several such processed subbands.
  • the optional control data 104 describes the configuration and parameters of the subband processing, which may be adapted to the signal to be transposed. For the case of cross product enhanced transposition, this data may carry information relating to a dominating fundamental frequency.
  • Fig. 2 illustrates the operation of nonlinear subband block processing with one subband input. Given the target values of physical time stretch and transposition, and the physical parameters of the analysis and synthesis filter banks 101 and 103, one deduces subband time stretch and transposition parameters as well as a source subband index for each target subband index. The aim of the subband block processing then is to realize the corresponding transposition, time stretch, or a combination of transposition and time stretch of the complex valued source subband signal in order to produce the target subband signal.
  • a block extractor 201 samples a finite frame of samples from the complex valued input signal.
  • the frame is defined by an input pointer position and the subband transposition factor.
  • This frame undergoes nonlinear processing in processing section 202 and is subsequently windowed by windows of finite and possibly variable length in windowing section 203.
  • the resulting samples are added to previously output samples in an overlap and add unit 204 where the output frame position is defined by an output pointer position.
  • the input pointer is incremented by a fixed amount and the output pointer is incremented by the subband stretch factor times the same amount.
  • An iteration of this chain of operations will produce an output signal with duration being the subband stretch factor times the input subband signal duration, up to the length of the synthesis window, and with complex frequencies transposed by the subband transposition factor.
  • the control signal 104 may influence each of the three sections 201, 202, 203.
  • Fig. 3 illustrates the operation of nonlinear subband block processing with two subband inputs.
  • the configuration of sections 301-1, 301-2, 302, 303, as well as the values of the two source subband indices, may depend on the output 403 of a cross processing control unit 404.
  • the aim of the subband block processing is to realize the corresponding transposition, time stretch, or a combination of transposition and time stretch of the combination of the two complex valued source subband signals in order to produce the target subband signal.
  • a first block extractor 301-1 samples a finite time frame of samples from the first complex valued source subband.
  • the second block extractor 301-2 samples a finite frame of samples from the second complex valued source subband.
  • the frames are defined by a common input pointer position and the subband transposition factor.
  • the two frames undergo nonlinear processing in 302 and are subsequently windowed by a finite length window in windowing section 303.
  • the overlap and add unit 204 may have a similar or identical structure to that shown in Fig. 2. An iteration of this chain of operations will produce an output signal with duration being the subband stretch factor times the duration of the longer of the two input subband signals (up to the length of the synthesis window). In case the two input subband signals carry the same frequencies, the output signal will have complex frequencies transposed by the subband transposition factor. In the case that the two subband signals carry different frequencies, the present invention teaches that the windowing 303 can be adapted to generate an output signal which has a target frequency suitable for the generation of missing partials in the transposed signal.
  • Fig. 4 illustrates the principle of cross product enhanced subband block based transposition, time stretch, or a combination of transposition and time stretch.
  • a cross subband processing unit 402 is also fed with the multitude of complex valued subband signals, and its operation is influenced by the cross processing control data 403.
  • the cross processing control data 403 may vary for each input pointer position and consists of at least
  • a cross processing control unit 404 furnishes this cross processing control data 403 given a portion of the control data 104 describing a fundamental frequency and the multitude of complex valued subband signals output from the analysis filter bank 101.
  • the control data 104 may also carry other signal dependent configuration parameters which influence the cross product processing.
  • the two main configuration parameters of the overall harmonic transposer and/or time stretcher are the desired physical time stretch factor $S_\varphi$ and the desired physical transposition factor $Q_\varphi$.
  • the filter banks 101 and 103 can be of any complex exponential modulated type such as QMF or a windowed DFT or a wavelet transform.
  • the analysis filter bank 101 and the synthesis filter bank 103 can be evenly or oddly stacked in the modulation and can be defined from a wide range of prototype filters and/or windows. While all these second order choices affect the details in the subsequent design such as phase corrections and subband mapping management, the main system design parameters for the subband processing can typically be derived from the two quotients $\Delta t_S / \Delta t_A$ and $\Delta f_S / \Delta f_A$ of the following four filter bank parameters, all measured in physical units.
  • $\Delta t_A$ is the subband sample time step or time stride of the analysis filter bank 101 (e.g. measured in seconds [s]);
  • $\Delta f_A$ is the subband frequency spacing of the analysis filter bank 101 (e.g. measured in Hertz [1/s]);
  • $\Delta t_S$ is the subband sample time step or time stride of the synthesis filter bank 103 (e.g. measured in seconds [s]);
  • $\Delta f_S$ is the subband frequency spacing of the synthesis filter bank 103 (e.g. measured in Hertz [1/s]).
  • $S$ is the subband stretch factor, i.e. the stretch factor which is applied within the subband processing unit 102 as a ratio of input and output samples in order to achieve an overall physical time stretch of the time domain signal by $S_\varphi$;
  • $Q$ is the subband transposition factor, i.e. the transposition factor which is applied within the subband processing unit 102 in order to achieve an overall physical frequency transposition of the time domain signal by the factor $Q_\varphi$;
  • n denotes an index of an analysis subband entering the subband processing unit 102
  • m denotes an index of a corresponding synthesis subband at the output of the subband processing unit 102.
  • An output sinusoid at the output of the synthesis filter bank 103 of the desired transposed physical frequency $Q_\varphi \Omega$ will result from feeding the synthesis subband with index $m \approx Q_\varphi \Omega / \Delta f_S$ with a complex subband signal of discrete angular frequency $2\pi Q_\varphi \Omega \Delta t_S$.
  • care should be taken in order to avoid the synthesis of aliased output frequencies different from $Q_\varphi \Omega$. Typically this can be avoided by making appropriate second order choices as discussed, e.g. by selecting appropriate analysis and/or synthesis filter banks.
  • the discrete frequency $2\pi Q_\varphi \Omega \Delta t_S$ at the output of the subband processing unit 102 should correspond to the discrete time frequency $2\pi \Omega \Delta t_A$ at the input of the subband processing unit 102 multiplied by the subband transposition factor $Q$. I.e., by setting equal $2\pi Q \Omega \Delta t_A$ and $2\pi Q_\varphi \Omega \Delta t_S$, the following relation between the physical transposition factor $Q_\varphi$ and the subband transposition factor $Q$ may be determined: $Q = \frac{\Delta t_S}{\Delta t_A} Q_\varphi$.
  • the subband index mapping may depend on the details of the filter bank parameters. In particular, if the fraction of the frequency spacing of the synthesis filter bank 103 and the analysis filter bank 101 is different from the physical transposition factor $Q_\varphi$, one or two source subbands may be assigned to a given target subband. In the case of two source subbands, it may be preferable to use two adjacent source subbands with index n, n+1, respectively.
  • the first and second source subbands are given by either (n(m), n(m) + 1) or (n(m) + 1, n(m)).
  • let x(k) be the input signal to the block extractor 201, and let h be the input block stride.
  • x(k) is a complex valued analysis subband signal of an analysis subband with index n.
  • the phase correction parameter $\theta$ depends on the filter bank details and the source and target subband indices. In an embodiment, the phase correction parameter $\theta$ may be determined experimentally by sweeping a set of input sinusoids. Furthermore, the phase correction parameter
  • the phase correction parameter $\theta$ may be set to zero, or omitted.
  • the phase modification factor T should be an integer such that the coefficients $T - 1$ and $1$ are integers in the linear combination of phases in the first line of formula (5).
  • formula (5) specifies that the phase of an output frame sample is determined by offsetting the phase of a corresponding input frame sample by a constant offset value.
  • This constant offset value may depend on the modification factor T , which itself depends on the subband stretch factor and/or the subband transposition factor.
  • the constant offset value may depend on the phase of a particular input frame sample from the input frame. This particular input frame sample is kept fixed for the determination of the phase of all the output frame samples of a given block. In the case of formula (5), the phase of the center sample of the input frame is used as the phase of the particular input frame sample.
  • the second line of formula (5) specifies that the magnitude of a sample of the output frame may depend on the magnitude of the corresponding sample of the input frame. Furthermore, the magnitude of a sample of the output frame may depend on the magnitude of a particular input frame sample. This particular input frame sample may be used for the determination of the magnitude of all the output frame samples. In the case of formula (5), the center sample of the input frame is used as the particular input frame sample. In an embodiment, the magnitude of a sample of the output frame may correspond to the geometrical mean of the magnitude of the corresponding sample of the input frame and the particular input frame sample.
  • a window w of length L is applied on the output frame, resulting in the windowed output frame
  • the overlap and add unit 204 applies a block stride of Sh, i.e., a time stride which is S times higher than the input block stride h. Due to this difference in time strides of formula (4) and (?) the duration of the output signal z(k) is S times the duration of the input signal x(k), i.e., the synthesis subband signal has been stretched by the subband stretch factor S compared to the analysis subband signal. It should be noted that this observation typically applies if the length L of the window is negligible in comparison to the signal duration.
  • the subband processing 102 i.e., an analysis subband signal corresponding to a complex sinusoid
  • the subband processing unit 102 may use the control data 104 to set certain processing parameters, e.g. the block length of the block extractors.
  • the nonlinear processing 302 produces the output frame. The processing in 303 is again described by (6) and (7), and the overlap and add unit 204 is identical to the overlap and add processing described in the context of the single input case.
  • if the ratio of the frequency spacing $\Delta f_S$ of the synthesis filter bank 103 and the frequency spacing $\Delta f_A$ of the analysis filter bank 101 is different from the desired physical transposition factor $Q_\varphi$, it may be beneficial to determine the samples of a synthesis subband with index m from two analysis subbands with index n, n + 1, respectively.
  • the corresponding index n may be given by the integer value obtained by truncating the analysis index value n given by formula (3).
  • One of the analysis subband signals, e.g., the analysis subband signal corresponding to index n, is fed into the first block extractor 301-1 and the other analysis subband signal, e.g. the one corresponding to index n + 1, is fed into the second block extractor 301-2.
  • a synthesis subband signal corresponding to index m is determined in accordance with the processing outlined above.
  • the assignment of the adjacent analysis subband signals to the two block extractors 301-1 and 301-2 may be based on the remainder that is obtained when truncating the index value of formula (3), i.e. the difference of the exact index value given by formula (3) and the truncated integer value n obtained from formula (3). If the remainder is greater than 0.5, then the analysis subband signal corresponding to index n may be assigned to the second block extractor 301-2, otherwise this analysis subband signal may be assigned to the first block extractor 301-1.
  • the parameters may be designed such that input sub- band signals sharing the same complex frequency ⁇ .
  • the design criteria are different.
  • the aim of a cross product addition is to produce output at the frequencies $Q_\varphi \Omega + r \Omega_0$ for $r = 1, \ldots, Q_\varphi - 1$ given inputs at frequencies $\Omega$ and $\Omega + \Omega_0$, where $\Omega_0$ is a fundamental frequency belonging to a dominant pitched component of the input signal.
  • $p = \Omega_0 / \Delta f_A$ is the fundamental frequency measured in units of the analysis filter bank frequency spacing.
  • that is, if $p < 1$, it may be advantageous to cancel the addition of a cross product.
  • a cross product should not be added to an output subband which already has a significant main contribution from the transposition without cross products. Moreover, at most one of cases $r = 1, \ldots, Q_\varphi - 1$ should contribute to the cross product output.
  • these rules may be carried out by performing the following three steps for each target output subband index m
  • Another variation is to expand the maximization in point 1 to more than Q - 1 choices, for example defined by a finite list of candidate values for fundamental frequency measured in analysis frequency spacing units p .
  • Yet another variation is to apply different measures of the subband magnitudes, such as the magnitude of a fixed sample, a maximal magnitude, an average magnitude, a magnitude in $\ell^p$-norm sense, etc.
  • the list of target source bands m selected for addition of a cross product together with the values of $n_1$ and $n_2$ constitutes a main part of the cross processing control data 403. What remains to be described is the configuration parameters $D_1$, $D_2$, $\rho$, the nonnegative integer parameters $T_1$, $T_2$ appearing in the phase rotation (13) and the synthesis window w to be used in the cross subband processing 402. Inserting the sinusoidal model for the cross product situation leads to the following source subband signals:
  • the desired output subband is of the form
  • the magnitude weighting parameter may be advantageously chosen to $\rho = r / Q_\varphi$. As can be seen, these configuration parameters only depend on the fundamental frequency $\Omega_0$ through the selection of r. However, for (18) to hold, a new condition on the synthesis window w emerges, namely
  • a synthesis window which satisfies (21) either exactly or approximately is to be provided as the last piece of cross processing control data 403.
  • Fig. 5 illustrates an example scenario for the application of subband block based transposition using several orders of transposition in a HFR enhanced audio codec.
  • a transmitted bit-stream is received at a core decoder 501, which provides a low bandwidth decoded core signal at a sampling frequency fs.
  • the low bandwidth decoded core signal is resampled to the output sampling frequency 2fs by means of a complex modulated 32 band QMF analysis bank 502 followed by a 64 band QMF synthesis bank (inverse QMF) 505.
  • the high frequency content of the output signal is obtained by feeding the higher subbands of the 64 band QMF synthesis bank 505 with the output bands from a multiple transposer unit 503, subject to spectral shaping and modification performed by a HFR processing unit 504.
  • the multiple transposer 503 takes as input the decoded core signal and outputs a multitude of subband signals which represent the 64 QMF band analysis of a superposition or combination of several transposed signal components.
  • each component corresponds to an integer physical transposition without time stretch of the core signal ($Q_\varphi = 2, 3, \ldots$, and $S_\varphi = 1$).
  • the transposer control signal 104 contains data describing a fundamental frequency. This data can either be transmitted via the bitstream from the corresponding audio encoder, deduced by pitch detection in the decoder, or obtained from a combination of transmitted and detected information.
  • Fig. 6 illustrates an example scenario for the operation of a multiple order subband block based transposition applying a single 64 band QMF analysis filter bank.
  • the merge unit 603 simply selects and combines the relevant subbands from each transposition factor branch into a single multitude of QMF subbands to be fed into the HFR processing unit.
  • a short $L = 2$ tap window can be used, with $R_1 = R_2 = 1$, in order to keep the additional complexity of the cross products addition to a minimum.
  • a short $L = 2$ tap window can be used, with $R_1 = R_2 = 1$ and satisfying
  • the low frequency part 701 of the signal is to be used as input for a multiple transposer.
  • the purpose of the transposer is to generate a signal as close as possible to the high frequency part 702 of the input signal, so that transmission of the high-frequency part 702 becomes non-imperative and available bit rate can be used economically.
  • Fig. 8 depicts the amplitude spectrum of outputs from a transposer which has the low frequency part 701 of the signal of Fig. 7 as input.
  • the three different panels 801-803 represent the final output obtained by using different settings of the cross processing control data.
  • the additional output signal components compared to 801 do not align well with the desired harmonic series. This shows that it leads to insufficient audio quality to use the procedure inherited from the design of direct subband processing for the cross product processing.
  • Fig. 9 shows a portion of the non-linear frame processing unit 202 including sections configured to receive two input samples $u_1$, $u_2$ and to generate based on these a processed sample w, whose magnitude is given by a geometric mean of the magnitudes of the input samples and whose phase is a linear combination of the phases of the input samples, that is,
  • Computer readable media may comprise computer storage media (or non-transitory media) and communication media (or transitory media).
  • computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data.
  • Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD), magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer.
  • communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media.
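
As a hedged illustration of the parameter relations described in the list above (the four filter bank parameters, the subband transposition factor, and the truncated source index of formula (3)), the following Python sketch derives the subband factors from the physical ones. The function names and the numeric values are hypothetical. The stretch relation and the index relation are assumptions obtained by the same matching argument as for the transposition factor, and they ignore any stacking offset of the filter banks.

```python
def subband_factors(dt_a, dt_s, s_phys, q_phys):
    """Return (S, Q) for the subband processing unit.

    Q follows from equating 2*pi*Q*Omega*dt_a with 2*pi*q_phys*Omega*dt_s.
    S is assumed to follow from matching durations: S * dt_s = s_phys * dt_a.
    """
    q_sub = (dt_s / dt_a) * q_phys
    s_sub = (dt_a / dt_s) * s_phys
    return s_sub, q_sub


def source_index(m, df_a, df_s, q_phys):
    """Return (n, remainder) for the target subband index m.

    Assumption: target band m sits near frequency m * df_s, so its source
    frequency is m * df_s / q_phys, i.e. analysis index (df_s / df_a) * m / q_phys.
    The integer index n is obtained by truncation, as described in the text.
    """
    exact = (df_s / df_a) * m / q_phys
    n = int(exact)
    return n, exact - n


# Hypothetical filter bank parameters in seconds and Hertz.
s_sub, q_sub = subband_factors(dt_a=0.004, dt_s=0.008, s_phys=1.0, q_phys=2.0)
n, remainder = source_index(m=20, df_a=125.0, df_s=125.0, q_phys=3.0)
print(s_sub, q_sub, n, round(remainder, 3))
```

In this example the remainder exceeds 0.5, so under the assignment rule quoted above the subband with index n would go to the second block extractor.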

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Quality & Reliability (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Superconductors And Manufacturing Methods Therefor (AREA)
  • Golf Clubs (AREA)
  • Vibration Dampers (AREA)
  • Complex Calculations (AREA)
  • Selective Calling Equipment (AREA)

Abstract

The invention provides an efficient implementation of cross-product enhanced high-frequency reconstruction (HFR), wherein a new component at frequency QΩ + rΩ0 is generated on the basis of existing components at Ω and Ω + Ω0. The invention provides a block-based harmonic transposition, wherein a time block of complex subband samples is processed with a common phase modification. Superposition of several modified samples has the net effect of limiting undesirable intermodulation products, thereby enabling a coarser frequency resolution and/or lower degree of oversampling to be used. In one embodiment, the invention further includes a window function suitable for use with block-based cross-product enhanced HFR. A hardware embodiment of the invention may include an analysis filter bank (101), a subband processing unit (102) configurable by control data (104) and a synthesis filter bank (103).

Description

CROSS PRODUCT ENHANCED SUBBAND BLOCK BASED HARMONIC TRANSPOSITION
TECHNICAL FIELD
The present invention relates to audio source coding systems which make use of a harmonic transposition method for high-frequency reconstruction (HFR), to digital effect processors, such as exciters which generate harmonic distortion to add brightness to a processed signal, and to time stretchers which prolong a signal duration with maintained spectral content.
BACKGROUND OF THE INVENTION
In WO98/57436 the concept of transposition was established as a method to recreate a high frequency band from a lower frequency band of an audio signal. A substantial saving in bitrate can be obtained by using this concept in audio coding. In an HFR based audio coding system, a low bandwidth signal is presented to a core waveform coder and the higher frequencies are regenerated using transposition and additional side information of very low bitrate describing the target spectral shape at the decoder side. For low bitrates, where the bandwidth of the core coded signal is narrow, it becomes increasingly important to recreate a high band with perceptually pleasant characteristics. The harmonic transposition defined in WO98/57436 performs very well for complex musical material in a situation with low cross over frequency. The principle of a harmonic transposition is that a sinusoid with frequency $\omega$ is mapped to a sinusoid with frequency $Q_\varphi \omega$ where $Q_\varphi > 1$ is an integer defining the order of the transposition. In contrast to this, a single sideband modulation (SSB) based HFR maps a sinusoid with frequency $\omega$ to a sinusoid with frequency $\omega + \Delta\omega$ where $\Delta\omega$ is a fixed frequency shift. Given a core signal with low bandwidth, a dissonant ringing artifact will result from the SSB transposition.
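The contrast between the two mappings can be made concrete with a small sketch. It is not taken from the source: the function names, the 170 Hz fundamental and the 1000 Hz shift are illustrative assumptions. It maps the partials of a harmonic tone under both schemes and checks whether the result still lies on the original harmonic grid.

```python
def harmonic_transpose(freqs, order):
    """Map each sinusoid frequency w to order * w."""
    return [order * w for w in freqs]


def ssb_transpose(freqs, shift):
    """Map each sinusoid frequency w to w + shift."""
    return [w + shift for w in freqs]


def is_harmonic_series(freqs, fundamental):
    """True if every frequency is an integer multiple of the fundamental."""
    return all(abs(w / fundamental - round(w / fundamental)) < 1e-9 for w in freqs)


# Hypothetical core signal: the first five partials of a 170 Hz tone.
fundamental = 170.0
core = [fundamental * k for k in range(1, 6)]

harmonic_out = harmonic_transpose(core, 2)
ssb_out = ssb_transpose(core, 1000.0)  # arbitrary fixed shift

print(is_harmonic_series(harmonic_out, fundamental))
print(is_harmonic_series(ssb_out, fundamental))
```

With these values only the harmonically transposed partials stay on multiples of the fundamental, which is one way to read the remark about dissonant artifacts from a fixed frequency shift.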
In order to reach the best possible audio quality, state of the art high quality harmonic HFR methods employ complex modulated filter banks with very fine frequency resolution and a high degree of oversampling to reach the required audio quality. The fine resolution is necessary to avoid unwanted intermodulation distortion arising from the nonlinear treatment of sums of sinusoids. With sufficiently narrow subbands, the high quality methods aim at having at most one sinusoid in each subband. A high degree of oversampling in time is necessary to avoid alias type distortion, and a certain degree of oversampling in frequency is necessary to avoid pre-echoes for transient signals. The obvious drawback is that the computational complexity becomes very high.
Another common drawback associated with harmonic transposers becomes apparent for signals with a prominent periodic structure. Such signals are superimpositions of harmonically related sinusoids with frequencies $\Omega, 2\Omega, 3\Omega, \ldots$, where $\Omega$ is the fundamental frequency. Upon harmonic transposition of order $Q_\varphi$, the output sinusoids have frequencies $Q_\varphi\Omega, 2Q_\varphi\Omega, 3Q_\varphi\Omega, \ldots$, which, in case of $Q_\varphi > 1$, is only a strict subset of the desired full harmonic series. In terms of resulting audio quality a "ghost" pitch corresponding to the transposed fundamental frequency $Q_\varphi\Omega$ will typically be perceived. Often the harmonic transposition results in a "metallic" sounding character of the encoded and decoded audio signal.
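The subset property can be checked by counting harmonics. The sketch below is illustrative only: the function name and the numbers (200 Hz fundamental, order 3, a 2 to 4 kHz target band) are assumptions, and it presumes that the core signal contains every source partial that is needed.

```python
import math


def split_partials(fundamental, order, f_lo, f_hi):
    """Split the harmonic indices in [f_lo, f_hi] into produced and missing ones.

    A harmonic k * fundamental is produced by transposition of the given
    order only if k is a multiple of the order.
    """
    k_min = math.ceil(f_lo / fundamental)
    k_max = math.floor(f_hi / fundamental)
    wanted = range(k_min, k_max + 1)
    produced = [k for k in wanted if k % order == 0]
    missing = [k for k in wanted if k % order != 0]
    return produced, missing


# Hypothetical case: 200 Hz fundamental, third order, target band 2 kHz to 4 kHz.
produced, missing = split_partials(200.0, 3, 2000.0, 4000.0)
print(produced)
print(missing)
```

Here only three of the eleven harmonics in the target band are generated, spaced three fundamentals apart, which corresponds to the "ghost" pitch described above.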
In WO2010/081892, which is incorporated herein by reference, the method of cross products was developed to address the above ghost pitch problem in the case of high quality transposition. Given partial or transmitted full information on the fundamental frequency value of the dominating harmonic part of the signal to be transposed with higher fidelity, the nonlinear subband modifications are supplemented with nonlinear combinations of at least two different analysis subbands, where the distances between the analysis subband indices are related to the fundamental frequency. The result is to regenerate the missing partials in the transposed output, which however happens at a considerable computational cost.
In view of the above shortcomings of available HFR methods, it is an object of the present invention to provide a more efficient implementation of cross-product enhanced HFR. In particular, it is an object to provide such a method enabling a high-fidelity audio reproduction at a reduced computational effort compared to available techniques.
The present invention achieves at least one of these objects by providing devices and methods as set forth in the independent claims.
In a first aspect, the invention provides a system configured to generate a time stretched and/or frequency transposed signal from an input signal. The system comprises:
• an analysis filter bank configured to derive a number Y of analysis subband signals from the input signal, wherein each analysis subband signal comprises a plurality of complex-valued analysis samples, each having a phase and a magnitude;
• a subband processing unit configured to determine a synthesis subband signal from the Y analysis subband signals using a subband transposition factor Q and a subband stretch factor S, at least one of Q and S being greater than one, wherein the subband processing unit comprises:
o a block extractor configured to:
i) form Y frames of L input samples, each frame being extracted from said plurality of complex-valued analysis samples in an analysis subband signal and the frame length being L > 1; and
ii) apply a block hop size of h samples to said plurality of analysis samples, prior to forming a subsequent frame of L input samples, thereby generating a sequence of frames of input samples;
o a nonlinear frame processing unit configured to generate, on the basis of Y corresponding frames of input samples formed by the block extractor, a frame of processed samples by determining a phase and magnitude for each processed sample of the frame, wherein, for at least one processed sample:
i) the phase of the processed sample is based on the respective phases of the corresponding input sample in each of the Y frames of input samples; and
ii) the magnitude of the processed sample is based on the magnitude of the corresponding input sample in each of the Y frames of input samples; and
o an overlap and add unit configured to determine the synthesis subband signal by overlapping and adding the samples of a sequence of frames of processed samples; and
• a synthesis filter bank configured to generate the time stretched and/or frequency transposed signal from the synthesis subband signal.
The system may be operable for any positive integer value of Y. However, it is operable at least for Y= 2.
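The nonlinear frame processing unit can be pictured with a small sketch. It is a sketch under stated assumptions rather than the claimed processing rule: the geometric-mean magnitude and the linear combination of phases are taken from the Fig. 9 description of this document, while the equal magnitude weighting, the function names and the integer phase weights are hypothetical.

```python
import cmath
import math


def process_sample(inputs, phase_weights):
    """Combine Y corresponding input samples into one processed sample."""
    count = len(inputs)
    # Geometric mean of the Y magnitudes (equal weighting is an assumption).
    magnitude = math.prod(abs(u) for u in inputs) ** (1.0 / count)
    # Integer-weighted sum of the Y phases.
    phase = sum(t * cmath.phase(u) for t, u in zip(phase_weights, inputs))
    return cmath.rect(magnitude, phase)


def process_frames(frames, phase_weights):
    """Apply process_sample position by position across Y frames of equal length."""
    return [process_sample(samples, phase_weights) for samples in zip(*frames)]


# Hypothetical Y = 2 frames of L = 2 samples each.
frame_a = [cmath.rect(2.0, 0.1), cmath.rect(2.0, 0.2)]
frame_b = [cmath.rect(8.0, 0.3), cmath.rect(8.0, 0.4)]
processed = process_frames([frame_a, frame_b], phase_weights=(1, 2))
print(abs(processed[0]), cmath.phase(processed[0]))
```

The first processed sample has magnitude sqrt(2 * 8) = 4 and phase 0.1 + 2 * 0.3 = 0.7, up to floating point rounding.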
In a second aspect the invention provides a method for generating a time-stretched and/or frequency-transposed signal from an input signal. The method comprises:
• deriving a number Y ≥ 2 of analysis subband signals from the input signal, wherein each analysis subband signal comprises a plurality of complex-valued analysis samples, each having a phase and a magnitude;
• forming Y frames of L input samples, each frame being extracted from said plurality of complex-valued analysis samples in an analysis subband signal and the frame length being L > 1;
• applying a block hop size of h samples to said plurality of analysis samples, prior to deriving a subsequent frame of L input samples, thereby generating a sequence of frames of input samples;
• generating, on the basis of Y corresponding frames of input samples, a frame of processed samples by determining a phase and a magnitude for each processed sample of the frame, wherein, for at least one processed sample:
o the phase of the processed sample is based on the respective phases of the corresponding input sample in at least one of the Y frames of input samples; and
o the magnitude of the processed sample is based on the magnitude of the corresponding input sample in each of the Y frames of input samples;
• determining the synthesis subband signal by overlapping and adding the samples of a sequence of frames of processed samples; and
• generating the time stretched and/or frequency transposed signal from the synthesis subband signal.
Here, Y is an arbitrary integer greater than one. The system according to the first aspect is operable to carry out the method at least for Y = 2.
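The method steps for Y = 2 can be strung together in a compact sketch. It is a simplified illustration under several assumptions that are not in the source: the block extractor applies no downsampling, the output stride equals the hop size (no time stretch), the magnitude rule is an equally weighted geometric mean, and the names, the window and the phase weights t1, t2 are hypothetical.

```python
import cmath
import math


def extract_frames(signal, frame_len, hop):
    """Form frames of frame_len samples, advancing by hop samples each time."""
    starts = range(0, len(signal) - frame_len + 1, hop)
    return [signal[s:s + frame_len] for s in starts]


def process_pair(frame_a, frame_b, t1, t2):
    """Per sample: geometric-mean magnitude and weighted sum of phases."""
    return [
        cmath.rect(math.sqrt(abs(a) * abs(b)),
                   t1 * cmath.phase(a) + t2 * cmath.phase(b))
        for a, b in zip(frame_a, frame_b)
    ]


def overlap_add(frames, stride, window):
    """Window each processed frame and add it at multiples of stride."""
    out = [0j] * (stride * (len(frames) - 1) + len(window))
    for index, frame in enumerate(frames):
        for offset, (w, y) in enumerate(zip(window, frame)):
            out[index * stride + offset] += w * y
    return out


def transpose_two_subbands(sub_a, sub_b, frame_len, hop, window, t1, t2):
    """Run extraction, nonlinear processing and overlap-add for two subbands."""
    frames_a = extract_frames(sub_a, frame_len, hop)
    frames_b = extract_frames(sub_b, frame_len, hop)
    processed = [process_pair(fa, fb, t1, t2) for fa, fb in zip(frames_a, frames_b)]
    return overlap_add(processed, hop, window)  # stride == hop: no time stretch


# Hypothetical inputs: two complex sinusoids with different discrete frequencies.
num = 16
sub_a = [cmath.exp(1j * 0.2 * k) for k in range(num)]
sub_b = [cmath.exp(1j * 0.5 * k) for k in range(num)]
out = transpose_two_subbands(sub_a, sub_b, frame_len=2, hop=1,
                             window=[0.5, 0.5], t1=1, t2=1)
print(len(out))
```

Away from the two edge samples, the output is a complex sinusoid at the combined discrete frequency 0.2 + 0.5 = 0.7, because the two overlapping window taps sum to one.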
A third aspect of the invention provides a computer program product including a computer readable medium (or data carrier) storing software instructions for causing a programmable computer to execute the method according to the second aspect.
The invention is based on the realization that the general concept of cross-product enhanced HFR will provide improved results when the data are processed arranged in blocks of complex subband samples. Inter alia, this makes it possible to apply a frame-wise phase offset to the samples, which has been found to reduce intermodulation products in some situations. It is further possible to apply a magnitude adjustment, which may lead to similar advantageous effects. The inventive implementation of cross-product enhanced HFR includes subband block based harmonic transposition, which may significantly reduce intermodulation products. Hence, a filter bank with a coarser frequency resolution and/or a lower degree of oversampling (such as a QMF filter bank) can be used while preserving a high output quality. In subband block based processing, a time block of complex subband samples is processed with a common phase modification, and the superposition of several modified samples to form an output subband sample has the net effect of suppressing intermodulation products which would otherwise occur when the input subband signal consists of several sinusoids. Transposition based on block based subband processing has much lower computational complexity than high-resolution transposers and reaches almost the same quality for many signals.
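A common phase modification of a block can be sketched as a single rotation applied to every sample. The specific offset used below, (T - 1) times the phase of the center sample, is an assumption made for illustration, as are the function name and the example values. Magnitudes are left untouched here, although the text also allows a magnitude adjustment.

```python
import cmath


def apply_common_phase_offset(block, t):
    """Rotate every sample of the block by the same phase offset.

    Assumption: the offset is (t - 1) times the phase of the center sample,
    so the center sample ends up with t times its original phase.
    """
    center = block[len(block) // 2]
    rotation = cmath.exp(1j * (t - 1) * cmath.phase(center))
    return [x * rotation for x in block]


# Hypothetical block: five samples of a complex sinusoid, discrete frequency 0.3.
block = [cmath.exp(1j * 0.3 * k) for k in range(4, 9)]
modified = apply_common_phase_offset(block, t=2)
print(cmath.phase(block[2]), cmath.phase(modified[2]))
```

Because the rotation is the same for all samples, phase differences between neighbouring samples inside the block are preserved, while the phase of the block as a whole is advanced.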
For the purpose of this disclosure, it is noted that in embodiments where Y ≥ 2, the non-linear processing unit uses as input Y "corresponding" frames of input samples in the sense that the frames are synchronous or near synchronous. E.g., the samples in the respective frames may relate to time intervals having a substantial time overlap between the frames. The term "corresponding" is also used with respect to samples to indicate that these are synchronous or approximately so. Further, the term "frame" will be used interchangeably with "block". Consequently, the "block hop size" may be equal to the frame length (possibly adjusted with respect to downsampling if such is applied) or may be smaller than the frame length (possibly adjusted with respect to downsampling if such is applied), in which case consecutive frames overlap in the sense that an input sample may belong to more than one frame. The system does not necessarily generate every processed sample in a frame by determining its phase and magnitude based on the phase and magnitude of all Y corresponding frames of input samples; without departing from the invention, the system may generate the phase and/or magnitude of some processed samples based on a smaller number of corresponding input samples, or on one input sample only.
In one embodiment, the analysis filter bank is a quadrature mirror filter (QMF) bank or pseudo-QMF bank with any number of taps and points. It may for instance be a 64-point QMF bank. The analysis filter bank may further be chosen from the class of windowed discrete Fourier transforms or wavelet transforms. Advantageously, the synthesis filter bank matches the analysis filter bank by being, respectively, an inverse QMF bank, an inverse pseudo-QMF bank etc. It is known that such filter banks may have a relatively coarse frequency resolution and/or a relatively low degree of oversampling. Unlike the prior art, the invention may be embodied using such relatively simpler components without necessarily suffering from a decreased output quality; hence such embodiments represent an economic advantage over the prior art.
In one embodiment, one or more of the following is true of the analysis filter bank:
• an analysis time stride is $\Delta t_A$;
• an analysis frequency spacing is $\Delta f_A$;
• the analysis filter bank includes $N > 1$ analysis subbands indexed by an analysis subband index $n = 0, \ldots, N-1$;
• an analysis subband is associated with a frequency band of the input signal.
In one embodiment, one or more of the following is true of the synthesis filter bank:
• a synthesis time stride is $\Delta t_S$;
• a synthesis frequency spacing is $\Delta f_S$;
• the synthesis filter bank includes $M > 1$ synthesis subbands indexed by a synthesis subband index $m = 0, \ldots, M-1$;
• a synthesis subband is associated with a frequency band of the time-stretched and/or frequency-transposed signal.
In one embodiment, the nonlinear frame processing unit is adapted to input two frames (Y = 2) in order to generate one frame of processed samples, and the subband processing unit includes a cross processing control unit for generating cross processing control data. By thereby specifying the quantitative and/or qualitative characteristics of the subband processing, the invention achieves flexibility and adaptability. The control data may specify subbands (e.g., identified by indices) that differ in frequency by a fundamental frequency of the input signal. In other words, the indices identifying the subbands may differ by an integer approximating the ratio of such fundamental frequency divided by the analysis frequency spacing. This will lead to a psychoacoustically pleasing output, as the new spectral components generated by the harmonic transposition will be compatible with the series of natural harmonics.
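The index relationship described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `cross_index_offset` and `candidate_pairs` are hypothetical, the numerical values are invented for the example, and plain rounding is just one possible way of forming "an integer approximating the ratio".

```python
def cross_index_offset(fundamental_hz, analysis_spacing_hz):
    """Integer approximating fundamental frequency / analysis frequency spacing.

    Hypothetical helper: Python's round() is used as the approximation rule.
    """
    if fundamental_hz <= 0 or analysis_spacing_hz <= 0:
        raise ValueError("frequencies must be positive")
    return round(fundamental_hz / analysis_spacing_hz)


def candidate_pairs(num_subbands, offset):
    """All analysis subband index pairs (n, n + offset) inside the filter bank."""
    if offset < 1:
        return []  # fundamental too small relative to the subband spacing
    return [(n, n + offset) for n in range(num_subbands - offset)]


# Example values (assumed): 375 Hz subband spacing, 750 Hz fundamental.
offset = cross_index_offset(750.0, 375.0)
print(offset, candidate_pairs(6, offset))
```

An offset of zero yields no pairs, which matches the intuition that subbands cannot be separated by a fundamental frequency that is much smaller than the subband spacing.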
In a further development of the preceding embodiment, the (input) analysis and (output) synthesis subband indices are chosen so as to satisfy equation (16) below. A parameter a appearing in this equation makes it applicable to both oddly and evenly stacked filter banks. When subband indices are obtained as an approximate (e.g., least squares) solution to equation (16), the new spectral component obtained by harmonic transposition will be likely to be compatible with the series of natural harmonics. Hence, the HFR will be likely to provide a faithful reconstruction of an original signal which has had its high-frequency content removed.
A further development of the preceding embodiment provides a way of selecting parameter r appearing in equation (16) and representing the order of the cross-product transposition. Given an output subband index m, each value of the transposition order r will determine two analysis subband indices $n_1$, $n_2$. This further development assesses the magnitudes of the two subbands for a number of r options and selects the value which maximizes the minimum of the two analysis subband magnitudes. This way of selecting indices may avoid the need to restore sufficient magnitude by amplifying weak components of the input signal, which may lead to poor output quality. In this connection, the subband magnitudes may be computed in a manner known per se, such as by the square root of squared input samples forming a frame (block) or part of a frame. A subband magnitude may also be computed as a magnitude of a central or near-central sample in a frame. Such a computation may provide a simple yet adequate magnitude measure.
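The max-min selection rule can be sketched as follows. The mapping from a transposition order r to the index pair is given by equation (16), which is not reproduced in this section, so the sketch takes that mapping as an input dictionary. The names `block_magnitude` and `select_order` are hypothetical, and the root-sum-of-squares reading of the block magnitude is an assumption.

```python
import math


def block_magnitude(frame):
    """Square root of the summed squared moduli of the samples in a frame
    (one assumed reading of the magnitude measure)."""
    return math.sqrt(sum(abs(x) ** 2 for x in frame))


def select_order(candidates, magnitudes):
    """Pick the order r whose weaker source subband is strongest.

    candidates: dict mapping r -> (n1, n2), assumed to come from equation (16).
    magnitudes: sequence of analysis subband magnitudes, indexed by subband.
    Returns (best_r, min_magnitude_for_best_r).
    """
    best_r, best_value = None, -1.0
    for r, (n1, n2) in sorted(candidates.items()):
        value = min(magnitudes[n1], magnitudes[n2])
        if value > best_value:
            best_r, best_value = r, value
    return best_r, best_value


# Invented example data, for illustration only.
mags = [0.1, 0.9, 0.5, 0.05, 0.7]
print(select_order({1: (0, 4), 2: (1, 2), 3: (2, 3)}, mags))
```

Ties are resolved in favor of the smallest r here; the text does not prescribe a tie-breaking rule.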
In a further development of the preceding embodiment, a synthesis subband may receive contributions from harmonic transposition instances according to both direct processing and cross-product based processing. In this connection, decision criteria may be applied to determine whether a particular possibility of regenerating a missing partial by cross-product based processing is to be used or not. For instance, this further development may be adapted to refrain from using one cross subband processing unit if one of the following conditions is fulfilled:
a) the ratio of the magnitude $M_S$ of the direct source term analysis subband yielding the synthesis subband and the least magnitude M in an optimal pair of cross source terms yielding the synthesis subband is greater than a predetermined constant;
b) the synthesis subband already receives a significant contribution from a direct processing unit;
c) a fundamental frequency $\Omega_0$ is smaller than the analysis filter bank spacing $\Delta f_A$.
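The three conditions above can be combined into a single predicate, sketched below. The function name, argument names and the threshold value are hypothetical. How a "significant contribution" in condition b) is measured is not specified in this section, so the caller supplies it as a boolean.

```python
def skip_cross_processing(m_direct, m_cross_min, fundamental_hz,
                          analysis_spacing_hz, ratio_threshold,
                          direct_is_significant=False):
    """Return True if the cross subband processing should not be used.

    m_direct:    magnitude of the direct source term analysis subband.
    m_cross_min: least magnitude in the chosen pair of cross source terms.
    """
    # a) direct term dominates the weaker cross term by more than the constant;
    #    a zero cross magnitude is treated as an unbounded ratio (assumption).
    cond_a = m_cross_min <= 0 or (m_direct / m_cross_min) > ratio_threshold
    # b) decided elsewhere and passed in by the caller.
    cond_b = direct_is_significant
    # c) fundamental frequency smaller than the analysis subband spacing.
    cond_c = fundamental_hz < analysis_spacing_hz
    return cond_a or cond_b or cond_c


print(skip_cross_processing(1.0, 0.5, 750.0, 375.0, ratio_threshold=4.0))
```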
In one embodiment, the invention includes downsampling (decimation) of the input signal. Indeed, one or more of the frames of input samples may be determined by downsampling the complex-valued analysis samples in a subband, as may be effected by the block extractor.
In a further development of the preceding embodiment, the downsampling factors to be applied satisfy equation (15) below. Not both downsampling factors are allowed to be zero, as this corresponds to a trivial case. Equation (15) defines a relationship between the downsampling factors $D_1$, $D_2$, the subband stretch factor S and the subband transposition factor Q, and further the phase coefficients. This ensures a matching of the phase of the processed samples with the other components of the input signal, to which the processed samples are to be added.
In one embodiment, the frames of processed samples are windowed before they are overlapped and added together. A windowing unit may be adapted to apply a finite-length window function to frames of processed samples. Suitable window functions are enumerated in the appended claims.
The inventor has realized that cross-product methods of the type disclosed in WO2010/081892 are not entirely compatible with subband block based processing techniques from the outset. Although such a method may be satisfactorily applied to one of the subband samples in a block, it might lead to aliasing artifacts if it were extended in the straightforward manner to the other samples of the block. To this end, one embodiment applies window functions comprising window samples which add up - when weighted by complex weights and shifted by a hop size - to a substantially constant sequence. The hop size may be the product of the block hop size h and the subband stretch factor S. The use of such window functions reduces the impact of aliasing artifacts. Alternatively or additionally, such window functions may also allow for other measures for reducing artifacts, such as phase rotations of processed samples.
Preferably, consecutive complex weights, which are applied for assessing the condition on the window samples, differ only by a fixed phase rotation. Further preferably, said fixed phase rotation is proportional to a fundamental frequency of the input signal. The phase rotation may also be proportional to the order of the cross-product transposition to be applied and/or to the physical transposition parameter and/or to the difference of the downsampling factors and/or to the analysis time stride. The phase rotation may be given by equation (21), at least in an approximate sense.
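The window condition can be checked numerically. The sketch below assumes that the complex weight applied to the l-th shifted copy of the window is exp(i·l·φ), which is one way to realize weights that "differ only by a fixed phase rotation". The function names are hypothetical, and the value of φ from equation (21) is not reproduced here, so it is left as a free parameter.

```python
import cmath
import math


def weighted_overlap_sum(window, hop, phase_step, num_frames):
    """Sum over l of exp(i*l*phase_step) * w(k - l*hop), for l = 0..num_frames-1."""
    total = [0j] * (hop * (num_frames - 1) + len(window))
    for l in range(num_frames):
        weight = cmath.exp(1j * l * phase_step)
        for k, w in enumerate(window):
            total[l * hop + k] += weight * w
    return total


def steady_state_ripple(window, hop, phase_step=0.0, num_frames=16):
    """Largest deviation of the weighted sum from its mean, ignoring the edges."""
    total = weighted_overlap_sum(window, hop, phase_step, num_frames)
    steady = total[len(window):len(total) - len(window)]
    mean = sum(steady) / len(steady)
    return max(abs(v - mean) for v in steady)


# Periodic Hann window of length 8 (illustrative choice, not from the text).
hann = [0.5 - 0.5 * math.cos(2 * math.pi * n / 8) for n in range(8)]
print(steady_state_ripple(hann, hop=4), steady_state_ripple(hann, hop=3))
```

With zero phase rotation and a hop of half the window length, the Hann window sums to a constant, whereas a hop of 3 leaves a visible ripple. In the setting of the text, the hop passed to the check would be S·h.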
In one embodiment, the present invention enables cross-product enhanced harmonic transposition by modifying the synthesis windowing in response to a fundamental frequency parameter.
In one embodiment, successive frames of processed samples are added with a certain overlap. To achieve the suitable overlap, the frames of processed samples are suitably shifted by a hop size which is the block hop size h upscaled by the subband stretch factor S. Hence, if the overlap of consecutive frames of input samples is L - h, then the overlap of consecutive frames of processed samples may be S(L - h).
In one embodiment, the system according to the invention is operable to generate a processed sample not only on the basis of Y = 2 input samples, but also on the basis of Y = 1 sample only. Hence, the system may regenerate missing partials not only by a cross-product based approach (such as by equation (13)) but also by a direct subband approach (such as by equation (5) or (11)). Preferably, a control unit is configured to control the operation of the system, including which approach is to be used to regenerate a particular missing partial.
In a further development of the preceding embodiment, the system is operable to generate a processed sample on the basis of three or more samples, i.e., for Y ≥ 3. For instance, a processed sample may be obtained by multiple instances of cross-product based harmonic transposition, by multiple instances of direct subband processing, or by a combination of cross-product transposition and direct transposition. This option of adapting the transposition method provides for a powerful and versatile HFR. Consequently, this embodiment is operable to carry out the method according to the second aspect of the invention for Y = 3, 4, 5 etc.
One embodiment is configured to determine a processed sample as a complex number having a magnitude which is a mean value of the respective magnitudes of corresponding input samples. The mean value may be a (weighted) arithmetic, (weighted) geometric or (weighted) harmonic mean of two or more input samples. In the case Y = 2, the mean is based on two complex input samples. Preferably, the magnitude of the processed sample is a weighted geometric value. More preferably, the geometric value is weighted by parameters ρ and 1 - ρ, as in equation (13). Here, the geometrical magnitude weighting parameter ρ is a real number inversely proportional to the subband transposition factor Q. The parameter ρ may further be inversely proportional to the stretch factor S.
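For the case Y = 2, the three kinds of weighted mean mentioned above can be sketched as follows. Equation (13) is not reproduced in this section, so the assignment of the weights 1 - ρ and ρ to the first and second sample is an assumption made only for illustration, as are the function names.

```python
def weighted_arithmetic(x1, x2, rho):
    """(1 - rho)*|x1| + rho*|x2|"""
    return (1.0 - rho) * abs(x1) + rho * abs(x2)


def weighted_geometric(x1, x2, rho):
    """|x1|**(1 - rho) * |x2|**rho"""
    return abs(x1) ** (1.0 - rho) * abs(x2) ** rho


def weighted_harmonic(x1, x2, rho):
    """1 / ((1 - rho)/|x1| + rho/|x2|); defined as 0 if either magnitude is 0."""
    a, b = abs(x1), abs(x2)
    if a == 0 or b == 0:
        return 0.0
    return 1.0 / ((1.0 - rho) / a + rho / b)


x1, x2 = 4 + 0j, 0 + 1j  # magnitudes 4 and 1 (invented example values)
print(weighted_arithmetic(x1, x2, 0.5),
      weighted_geometric(x1, x2, 0.5),
      weighted_harmonic(x1, x2, 0.5))
```

The geometric mean attenuates the output strongly whenever one of the two source subbands is weak, which is consistent with the preference for selecting source pairs whose weaker member is as strong as possible.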
In one embodiment, the system is adapted to determine a processed sample as a complex number having a phase which is a linear combination of respective phases of corresponding input samples in the frames of input samples. In particular, the linear combination may comprise phases relating to two input samples (Y = 2). The linear combination of two phases may apply integer nonzero coefficients, the sum of which is equal to the stretch factor S multiplied by the subband transposition factor Q. Optionally, the phase obtained by such linear combination is further adjusted by a fixed phase correction parameter. The phase of the processed sample may be given by equation (13).
In one embodiment, the block extractor (or an analogous step in a method according to the invention) is adapted to interpolate two or more analysis samples from an analysis subband signal in order to obtain one input sample which will be included in a frame (block). Such interpolation may enable downsampling of the input signal by a non-integer factor. The analysis samples to be interpolated may or may not be consecutive.
In one embodiment, the configuration of the subband processing may be controlled by control data provided from outside the unit effecting the processing. The control data may relate to momentary acoustic properties of the input signal. For instance, the system itself may include a section adapted to determine momentary acoustic properties of the signal, such as the (dominant) fundamental frequency of the signal. Knowledge of the fundamental frequency provides guidance in selecting the analysis subbands from which the processed samples are to be derived. Suitably, the spacing of the analysis subbands is proportional to such fundamental frequency of the input signal. As an alternative, the control data may also be provided from outside the system, preferably by being included in a coding format suitable for transmission as a bit stream over a digital communication network. In addition to the control data, such coding format may include information relating to lower-frequency components of a signal. For reasons of economy, the format preferably does not include complete information relating to higher-frequency components (pos. 702), which may be regenerated by the invention. The invention may in particular provide a decoding system with a control data reception unit configured to receive such control data, whether included in a received bit stream that also encodes the input signal or received as a separate signal or bit stream.
One embodiment provides a technique for efficiently carrying out computations occasioned by the inventive method. To this end, a hardware implementation may include a pre-normalizer for rescaling the magnitudes of the corresponding input samples in some of the Y frames on which a frame of processed samples is to be based. After such rescaling, a processed sample can be computed as a (weighted) complex product of rescaled and, possibly, non-rescaled input samples. An input sample appearing as a rescaled factor in the product normally need not reappear as a non-rescaled factor. With the possible exception of the phase correction parameter θ, it is possible to evaluate equation (13) as a product of (possibly rescaled) complex input samples. This represents a computational advantage in comparison with separate treatments of the magnitude and the phase of a processed sample.
In one embodiment, a system configured for the case Y = 2 comprises two block extractors adapted to form one frame of input samples each, in parallel operation.
In a further development of the embodiments representing Y ≥ 3, a system may comprise a plurality of subband processing units, each of which is configured to determine an intermediate synthesis subband signal using a different subband transposition factor and/or a different subband stretch factor and/or a transposition method differing by being cross-product based or direct. The subband processing units may be arranged in parallel, for parallel operation. In this embodiment, the system may further comprise a merging unit arranged downstream of the subband processing units and upstream of the synthesis filter bank. The merging unit may be adapted to merge (e.g., by mixing together) corresponding intermediate synthesis subband signals to obtain the synthesis subband signal. As already noted, the intermediate synthesis subband signals which are merged may have been obtained by both direct and cross-product based harmonic transposition. A system according to the embodiment may further comprise a core decoder for decoding a bit stream into an input signal. It may also comprise an HFR processing unit adapted to apply spectral band information, notably by performing spectral shaping. The operation of the HFR processing unit may be controlled by information encoded in the bit stream.
One embodiment provides HFR of multi-dimensional signals, e.g. in a system for reproducing audio in a stereo format comprising Z channels, such as left, right, center, surround etc. In one possible implementation for processing an input signal with a plurality of channels, the processed samples of each channel are based on the same number of input samples although the stretch factor S and transposition factor Q for each band may vary between channels. To this end, the implementation may comprise an analysis filter bank for producing Y analysis subband signals from each channel, a subband processing unit for generating Z subband signals and a synthesis filter bank for generating Z time stretched and/or frequency transposed signals which form the output signal.
In variations to the preceding embodiment, the output signal may comprise output channels that are based on different numbers of analysis subband signals. For instance, it may be advisable to devote a greater amount of computational resources to HFR of acoustically prominent channels; e.g., channels to be reproduced by audio sources located in front of a listener may be favored over surround or rear channels.
It is emphasized that the invention relates to all combinations of the above features, even if these are recited in different claims.
BRIEF DESCRIPTION OF THE DRAWINGS
The present invention will now be described by way of illustrative examples, not limiting the scope or spirit of the invention, with reference to the accompanying drawings.
Fig. 1 illustrates the principle of subband block based harmonic transposition.
Fig. 2 illustrates the operation of nonlinear subband block processing with one subband input.
Fig. 3 illustrates the operation of nonlinear subband block processing with two subband inputs.
Fig. 4 illustrates the operation of cross product enhanced subband block based harmonic transposition.
Fig. 5 illustrates an example scenario for the application of subband block based transposition using several orders of transposition in a HFR enhanced audio codec.
Fig. 6 illustrates an example scenario for the operation of a multiple order subband block based transposition applying a 64 band QMF analysis filter bank.
Figs. 7 and 8 illustrate experimental results of the described subband block based transposition method.
Fig. 9 shows a detail of the non-linear processing unit of Fig. 2, including a pre-normalizer and a multiplier.
DESCRIPTION OF PREFERRED EMBODIMENTS
The embodiments described below are merely illustrative of the principles of the present invention CROSS PRODUCT ENHANCED SUBBAND BLOCK BASED HARMONIC TRANSPOSITION. It is understood that modifications and variations of the arrangements and the details described herein will be apparent to others skilled in the art. It is the intent, therefore, that the invention be limited only by the scope of the appended patent claims and not by the specific details presented by way of description and explanation of the embodiments herein.
Fig. 1 illustrates the principle of subband block based transposition, time stretch, or a combination of transposition and time stretch. The input time domain signal is fed to an analysis filter bank 101 which provides a multitude of complex valued subband signals. These are fed to the subband processing unit 102, whose operation can be influenced by the control data 104. Each output subband can either be obtained from the processing of one or from two input subbands, or even as a superposition of the results of several such processed subbands. The multitude of complex valued output subbands is fed to a synthesis filter bank 103, which in turn outputs the modified time domain signal. The optional control data 104 describes the configuration and parameters of the subband processing, which may be adapted to the signal to be transposed. For the case of cross product enhanced transposition, this data may contain information relating to a dominating fundamental frequency.
Fig. 2 illustrates the operation of nonlinear subband block processing with one subband input. Given the target values of physical time stretch and transposition, and the physical parameters of the analysis and synthesis filter banks 101 and 103, one deduces subband time stretch and transposition parameters as well as a source subband index for each target subband index. The aim of the subband block processing then is to realize the corresponding transposition, time stretch, or a combination of transposition and time stretch of the complex valued source subband signal in order to produce the target subband signal.
A block extractor 201 samples a finite frame of samples from the complex valued input signal. The frame is defined by an input pointer position and the subband transposition factor. This frame undergoes nonlinear processing in processing section 202 and is subsequently windowed by windows of finite and possibly variable length in windowing section 203. The resulting samples are added to previously output samples in an overlap and add unit 204 where the output frame position is defined by an output pointer position. The input pointer is incremented by a fixed amount and the output pointer is incremented by the subband stretch factor times the same amount. An iteration of this chain of operations will produce an output signal with duration being the subband stretch factor times the input subband signal duration, up to the length of the synthesis window, and with complex frequencies transposed by the subband transposition factor. The control signal 104 may influence each of the three sections 201, 202, 203.
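The chain of extraction, nonlinear processing, windowing and overlap-add can be sketched as a loop over blocks. This is a minimal illustration and not the patented implementation: the function name `overlap_add_chain` is hypothetical, the nonlinear step is passed in as a callable, the transposition factor and stretch factor are restricted to integers, the frame is centered on the input pointer, and samples outside the signal are taken as zero.

```python
def overlap_add_chain(x, process, window, hop, stretch, q=1):
    """Block-based processing of one complex subband signal x (a list).

    process: callable mapping an input frame to an output frame of equal length.
    window:  synthesis window, its length L is the block length.
    hop:     input pointer increment h; the output pointer moves by stretch*hop.
    q:       integer subband transposition factor used to spread input addresses.
    """
    length = len(window)
    r1 = length // 2                      # samples taken before the pointer
    n_blocks = len(x) // hop
    out = [0j] * (stretch * hop * n_blocks + length)
    for l in range(n_blocks):
        in_ptr = l * hop                  # input pointer position
        out_ptr = l * stretch * hop       # output pointer position
        frame = []
        for k in range(-r1, length - r1):
            idx = in_ptr + q * k
            frame.append(x[idx] if 0 <= idx < len(x) else 0j)
        processed = process(frame)
        for j, (w, y) in enumerate(zip(window, processed)):
            out[out_ptr + j] += w * y     # overlap and add
    return out


# Identity processing with a flat window reproduces the input, delayed by r1.
x = [complex(i) for i in range(12)]
y = overlap_add_chain(x, lambda f: f, [0.5] * 4, hop=2, stretch=1)
print(y[2:12])
```

With `stretch=2` the same call produces an output roughly twice as long, which corresponds to the statement that the output duration is the subband stretch factor times the input duration, up to the window length.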
Fig. 3 illustrates the operation of nonlinear subband block processing with two subband inputs. Given the target values of physical time stretch and transposition, and the physical parameters of the analysis and synthesis filter banks 101 and 103, one deduces subband time stretch and transposition parameters as well as two source subband indices for each target subband index. In case the nonlinear subband block processing is to be used for creation of missing partials through cross product addition, the configuration of sections 301-1, 301-2, 302, 303, as well as the values of the two source subband indices, may depend on the output 403 of a cross processing control unit 404. The aim of the subband block processing is to realize the corresponding transposition, time stretch, or a combination of transposition and time stretch of the combination of the two complex valued source subband signals in order to produce the target subband signal. A first block extractor 301-1 samples a finite time frame of samples from the first complex valued source subband, and the second block extractor 301-2 samples a finite frame of samples from the second complex valued source subband. The frames are defined by a common input pointer position and the subband transposition factor. The two frames undergo nonlinear processing in 302 and are subsequently windowed by a finite length window in windowing section 303. The overlap and add unit 204 may have a similar or identical structure to that shown in Fig. 2. An iteration of this chain of operations will produce an output signal with duration being the subband stretch factor times the longest of the two input subband signals (up to the length of the synthesis window). In case the two input subband signals carry the same frequencies, the output signal will have complex frequencies transposed by the subband transposition factor. In the case that the two subband signals carry different frequencies, the present invention teaches that the windowing 303 can be adapted to generate an output signal which has a target frequency suitable for the generation of missing partials in the transposed signal.
Fig. 4 illustrates the principle of cross product enhanced subband block based transposition, time stretch, or a combination of transposition and time stretch. The direct subband processing unit 401 can be of the kind already described with reference to Fig. 2 (section 202) or Fig. 3. A cross subband processing unit 402 is also fed with the multitude of complex valued subband signals, and its operation is influenced by the cross processing control data 403. The cross subband processing unit 402 performs nonlinear subband block processing of the type with two subband inputs described in Fig. 3, and the output target subbands are added to those from the direct subband processing 401 in adder 405. The cross processing control data 403 may vary for each input pointer position and consists of at least
• a selected list of target subband indices;
• a pair of source subband indices for each selected target subband index; and
• a finite length synthesis window.
A cross processing control unit 404 furnishes this cross processing control data 403 given a portion of the control data 104 describing a fundamental frequency and the multitude of complex valued subband signals output from the analysis filter bank 101. The control data 104 may also carry other signal dependent configuration parameters which influence the cross product processing.
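As an illustration of what the cross processing control data 403 carries, the three listed items can be grouped in a small container. The class name, field names and the validation rule are hypothetical and chosen only for this sketch; the text does not prescribe a data layout.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class CrossProcessingControlData:
    """Hypothetical container for the control data valid at one input pointer position."""
    target_indices: List[int]                  # selected target subband indices
    source_pairs: Dict[int, Tuple[int, int]]   # target index -> pair of source indices
    synthesis_window: List[float]              # finite length synthesis window

    def validate(self) -> None:
        """Check that every selected target has a source pair and the window is non-empty."""
        missing = [m for m in self.target_indices if m not in self.source_pairs]
        if missing:
            raise ValueError(f"no source pair for target subbands {missing}")
        if not self.synthesis_window:
            raise ValueError("synthesis window must have at least one sample")


control = CrossProcessingControlData(
    target_indices=[10, 11],
    source_pairs={10: (4, 6), 11: (4, 7)},
    synthesis_window=[0.5, 1.0, 0.5],
)
control.validate()
print(control.source_pairs[10])
```

Since the control data may vary for each input pointer position, a sequence of such objects, one per block, would be a natural way to drive the cross subband processing unit in a software model.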
In the following text, a description of principles of cross product enhanced subband block based time stretch and transposition will be outlined with reference to Figs. 1-4, and by adding appropriate mathematical terminology.
The two main configuration parameters of the overall harmonic transposer and/or time stretcher are
• $S_\varphi$ : the desired physical time stretch factor, and
• $Q_\varphi$ : the desired physical transposition factor.
The filter banks 101 and 103 can be of any complex exponential modulated type such as QMF or a windowed DFT or a wavelet transform. The analysis filter bank 101 and the synthesis filter bank 103 can be evenly or oddly stacked in the modulation and can be defined from a wide range of prototype filters and/or windows. While all these second order choices affect the details in the subsequent design such as phase corrections and subband mapping management, the main system design parameters for the subband processing can typically be derived from the two quotients $\Delta t_S / \Delta t_A$ and $\Delta f_S / \Delta f_A$ of the following four filter bank parameters, all measured in physical units. In the above quotients,
• $\Delta t_A$ is the subband sample time step or time stride of the analysis filter bank 101 (e.g. measured in seconds [s]);
• $\Delta f_A$ is the subband frequency spacing of the analysis filter bank 101 (e.g. measured in Hertz [1/s]);
• $\Delta t_S$ is the subband sample time step or time stride of the synthesis filter bank 103 (e.g. measured in seconds [s]); and
• $\Delta f_S$ is the subband frequency spacing of the synthesis filter bank 103 (e.g. measured in Hertz [1/s]).
For the configuration of the subband processing unit 102, the following parameters should be computed:
• S : the subband stretch factor, i.e. the stretch factor which is applied within the subband processing unit 102 as a ratio of input and output samples in order to achieve an overall physical time stretch of the time domain signal by $S_\varphi$;
• Q : the subband transposition factor, i.e. the transposition factor which is applied within the subband processing unit 102 in order to achieve an overall physical frequency transposition of the time domain signal by the factor $Q_\varphi$; and
• the correspondence between source and target subband indices, wherein n denotes an index of an analysis subband entering the subband processing unit 102, and m denotes an index of a corresponding synthesis subband at the output of the subband processing unit 102.
In order to determine the subband stretch factor S, it is observed that an input signal to the analysis filter bank 101 of physical duration D corresponds to a number $D/\Delta t_A$ of analysis subband samples at the input to the subband processing unit 102. These $D/\Delta t_A$ samples will be stretched to $S \cdot D/\Delta t_A$ samples by the subband processing unit 102, which applies the subband stretch factor S. At the output of the synthesis filter bank 103 these $S \cdot D/\Delta t_A$ samples result in an output signal having a physical duration of $\Delta t_S \cdot S \cdot D/\Delta t_A$. Since this latter duration should meet the specified value $S_\varphi \cdot D$, i.e. since the duration of the time domain output signal should be time stretched compared to the time domain input signal by the physical time stretch factor $S_\varphi$, the following design rule is obtained:
$$S = \frac{\Delta t_A}{\Delta t_S}\, S_\varphi. \qquad (1)$$
In order to determine the subband transposition factor Q which is applied within the subband processing unit 102 in order to achieve a physical transposition $Q_\varphi$, it is observed that an input sinusoid to the analysis filter bank 101 of physical frequency $\Omega$ will result in a complex analysis subband signal with discrete time angular frequency $\omega = 2\pi\Omega \cdot \Delta t_A$, and the main contribution occurs within the analysis subband with index $n \approx \Omega / \Delta f_A$. An output sinusoid at the output of the synthesis filter bank 103 of the desired transposed physical frequency $Q_\varphi \cdot \Omega$ will result from feeding the synthesis subband with index $m \approx Q_\varphi \cdot \Omega / \Delta f_S$ with a complex subband signal of discrete angular frequency $2\pi Q_\varphi \cdot \Omega \cdot \Delta t_S$. In this context, care should be taken in order to avoid the synthesis of aliased output frequencies different from $Q_\varphi \cdot \Omega$. Typically this can be avoided by making appropriate second order choices as discussed, e.g. by selecting appropriate analysis and/or synthesis filter banks. The discrete frequency $2\pi Q_\varphi \cdot \Omega \cdot \Delta t_S$ at the output of the subband processing unit 102 should correspond to the discrete time frequency $\omega = 2\pi\Omega \cdot \Delta t_A$ at the input of the subband processing unit 102 multiplied by the subband transposition factor Q. I.e., by setting equal $2\pi Q \cdot \Omega \cdot \Delta t_A$ and $2\pi Q_\varphi \cdot \Omega \cdot \Delta t_S$, the following relation between the physical transposition factor $Q_\varphi$ and the subband transposition factor Q may be determined:
$$Q = \frac{\Delta t_S}{\Delta t_A}\, Q_\varphi. \qquad (2)$$
Likewise, the appropriate source or analysis subband index n of the subband processing unit 102 for a given target or synthesis subband index m should obey
$$n \approx \frac{\Delta f_S}{\Delta f_A} \cdot \frac{1}{Q_\varphi}\, m. \qquad (3)$$
In one embodiment, it holds that $\Delta f_S / \Delta f_A = Q_\varphi$, i.e. the frequency spacing of the synthesis filter bank 103 corresponds to the frequency spacing of the analysis filter bank 101 multiplied by the physical transposition factor, and the one-to-one mapping of analysis to synthesis subband index n = m can be applied. In other embodiments, the subband index mapping may depend on the details of the filter bank parameters. In particular, if the fraction of the frequency spacing of the synthesis filter bank 103 and the analysis filter bank 101 is different from the physical transposition factor $Q_\varphi$, one or two source subbands may be assigned to a given target subband. In the case of two source subbands, it may be preferable to use two adjacent source subbands with index n, n+1, respectively.
That is, the first and second source subbands are given by either ( n(m) , n(m) + 1 ) or ( n(m) + 1 , n(m) ).
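The design rules for S, Q and the source index can be collected in a short helper. The sketch assumes the relations S = (Δt_A/Δt_S)·S_φ, Q = (Δt_S/Δt_A)·Q_φ and n ≈ (Δf_S/Δf_A)·m/Q_φ that follow from the derivation above. The function and parameter names are hypothetical, and plain rounding is only one possible way of turning the approximate relation for n into an integer index; stacking details of the filter banks are ignored.

```python
def subband_parameters(s_phys, q_phys, dt_a, dt_s):
    """Return (S, Q) from the physical factors and the two time strides."""
    s = dt_a / dt_s * s_phys
    q = dt_s / dt_a * q_phys
    return s, q


def source_index(m, q_phys, df_a, df_s):
    """Nearest analysis subband index n for synthesis subband index m."""
    return round(df_s / df_a * m / q_phys)


# Assumed example: transposition by 2 with no physical time stretch, using a
# synthesis bank with twice the frequency spacing and half the time stride.
S, Q = subband_parameters(s_phys=1.0, q_phys=2.0, dt_a=1.0, dt_s=0.5)
print(S, Q, source_index(5, q_phys=2.0, df_a=1.0, df_s=2.0))
```

In this configuration the physical transposition is realized as a subband time stretch (S = 2, Q = 1) with the one-to-one mapping n = m, in line with the embodiment where the synthesis spacing equals the analysis spacing multiplied by the physical transposition factor.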
The subband processing of Fig. 2 with a single source subband will now be described as a function of the subband processing parameters S and Q. Let x(k) be the input signal to the block extractor 201, and let h be the input block stride. I.e., x(k) is a complex valued analysis subband signal of an analysis subband with index n. The block extracted by the block extractor 201 can without loss of generality be considered to be defined by the L = R_1 + R_2 samples
$$x_l(k) = x(Qk + hl), \quad k = -R_1, \ldots, R_2 - 1, \tag{4}$$
wherein the integer l is a block counting index, L is the block length and R_1, R_2 are nonnegative integers. Note that for Q = 1, the block is extracted from consecutive samples but for Q > 1, a downsampling is performed in such a manner that the input addresses are stretched out by the factor Q. If Q is an integer this operation is typically straightforward to perform, whereas an interpolation method may be required for non-integer values of Q. This statement is relevant also for non-integer values of the increment h, i.e. of the input block stride. In an embodiment, short interpolation filters, e.g. filters having two filter taps, can be applied to the complex valued subband signal. For instance, if a sample at the fractional time index k + 0.5 is required, a two tap interpolation of the form x(k + 0.5) ≈ a x(k) + b x(k + 1), where the coefficients a, b may be constants or may depend on a subband index (see, e.g., WO2004/097794 and WO2007/085275), may ensure a sufficient quality.
An interesting special case of formula (4) is R_1 = 0, R_2 = 1 where the extracted block consists of a single sample, i.e. the block length is L = 1.
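The block extraction of formula (4) can be illustrated with a minimal Python sketch. It assumes integer values of Q and h and treats samples outside the signal as zero; the function names `extract_block` and `interp_half` and the default interpolation coefficients a = b = 0.5 are illustrative assumptions, not part of the disclosure.

```python
def extract_block(x, l, Q, h, R1, R2):
    """Return [x(Q*k + h*l) for k = -R1, ..., R2 - 1], following formula (4).

    Assumes integer Q and h; addresses outside the signal are read as zero.
    """
    block = []
    for k in range(-R1, R2):
        idx = Q * k + h * l
        block.append(x[idx] if 0 <= idx < len(x) else 0j)
    return block


def interp_half(x, k, a=0.5, b=0.5):
    """Two-tap estimate of x(k + 0.5); a = b = 0.5 is only an example choice."""
    return a * x[k] + b * x[k + 1]


x = [complex(n, 0) for n in range(20)]
# With Q = 2 the input addresses are stretched: this picks x[1], x[3], x[5], x[7].
print(extract_block(x, l=5, Q=2, h=1, R1=2, R2=2))
# Special case R1 = 0, R2 = 1: a single-sample block (L = 1).
print(extract_block(x, l=3, Q=1, h=1, R1=0, R2=1))
```

For non-integer Q or h, the address Q*k + h*l would be fractional, and a short interpolator such as `interp_half` could supply the missing value.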
With the polar representation of a complex number z = |z| exp(i∠z), wherein |z| is the magnitude of the complex number and ∠z is the phase of the complex number, the nonlinear processing unit 202 producing the output frame y_l from the input frame x_l is advantageously defined by the phase modification factor T = SQ through
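$$\begin{cases} \angle y_l(k) = (T-1)\,\angle x_l(0) + \angle x_l(k) + \theta \\ |y_l(k)| = |x_l(0)|^{\rho}\,|x_l(k)|^{1-\rho} \end{cases} \tag{5}$$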
where ρ ∈ [0,1] is a geometrical magnitude weighting parameter. The case ρ = 0 corresponds to a pure phase modification of the extracted block. A particularly attractive value of the magnitude weighting is ρ = 1 - 1/T, for which a certain computational complexity relief is obtained irrespective of the block length L, and the resulting transient response is somewhat improved over the case ρ = 0. The phase correction parameter θ depends on the filter bank details and the source and target subband indices. In an embodiment, the phase correction parameter θ may be determined experimentally by sweeping a set of input sinusoids. Furthermore, the phase correction parameter θ may be derived by studying the phase difference of adjacent target subband complex sinusoids or by optimizing the performance for a Dirac pulse type of input signal. Finally, with a suitable design of the analysis and synthesis filter banks 101 and 103, the phase correction parameter θ may be set to zero, or omitted. The phase modification factor T should be an integer such that the coefficients T - 1 and 1 are integers in the linear combination of phases in the first line of formula (5). With this assumption, i.e. with the assumption that the phase modification factor T is an integer, the result of the nonlinear modification is well defined even though phases are ambiguous by identification modulo 2π.
In words, formula (5) specifies that the phase of an output frame sample is determined by offsetting the phase of a corresponding input frame sample by a constant offset value. This constant offset value may depend on the modification factor T, which itself depends on the subband stretch factor and/or the subband transposition factor. Furthermore, the constant offset value may depend on the phase of a particular input frame sample from the input frame. This particular input frame sample is kept fixed for the determination of the phase of all the output frame samples of a given block. In the case of formula (5), the phase of the center sample of the input frame is used as the phase of the particular input frame sample.
The second line of formula (5) specifies that the magnitude of a sample of the output frame may depend on the magnitude of the corresponding sample of the input frame. Furthermore, the magnitude of a sample of the output frame may depend on the magnitude of a particular input frame sample. This particular input frame sample may be used for the determination of the magnitude of all the output frame samples. In the case of formula (5), the center sample of the input frame is used as the particular input frame sample. In an embodiment, the magnitude of a sample of the output frame may correspond to the geometrical mean of the magnitude of the corresponding sample of the input frame and the particular input frame sample.
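The verbal description of formula (5) can be turned into a short Python sketch. It assumes that the phase offset equals (T - 1) times the phase of the center sample plus θ, and that the magnitude is the geometric weighting |x_l(0)|^ρ |x_l(k)|^(1-ρ), as the two preceding paragraphs suggest; the function name `nonlinear_frame` is hypothetical.

```python
import cmath


def nonlinear_frame(x_block, R1, T, rho=0.0, theta=0.0):
    """Apply the rule described for formula (5) to one block.

    x_block holds x_l(k) for k = -R1, ..., R2 - 1, so x_block[R1] is the
    center sample x_l(0). T is assumed to be an integer.
    """
    center = x_block[R1]
    out = []
    for sample in x_block:
        # Phase: offset the sample phase by (T - 1) * center phase + theta.
        phase = (T - 1) * cmath.phase(center) + cmath.phase(sample) + theta
        # Magnitude: weighted geometric mean of center and current sample.
        mag = abs(center) ** rho * abs(sample) ** (1.0 - rho)
        out.append(cmath.rect(mag, phase))
    return out


block = [cmath.rect(2.0, 0.3), cmath.rect(1.0, 0.5), cmath.rect(4.0, 0.1)]
frame = nonlinear_frame(block, R1=1, T=2, rho=0.5)
```

Because T is an integer, the result does not depend on which 2π branch `cmath.phase` returns for the input phases.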
In the windowing unit 203, a window w of length L is applied on the output frame, resulting in the windowed output frame
$$z_l(k) = w(k)\,y_l(k), \quad k = -R_1, \ldots, R_2 - 1. \tag{6}$$
Finally, it is assumed that all frames are extended by zeros, and the overlap and add operation 204 is defined by
$$z(k) = \sum_l z_l(k - Shl), \tag{7}$$
wherein it should be noted that the overlap and add unit 204 applies a block stride of Sh, i.e., a time stride which is S times higher than the input block stride h. Due to this difference in time strides of formulas (4) and (7), the duration of the output signal z(k) is S times the duration of the input signal x(k), i.e., the synthesis subband signal has been stretched by the subband stretch factor S compared to the analysis subband signal. It should be noted that this observation typically applies if the length L of the window is negligible in comparison to the signal duration.
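As a rough illustration of the windowing and overlap-add steps (6) and (7), the following Python sketch places frame l at output offset S*h*l. The function name `overlap_add` and the convention that output indices start at zero are assumptions made for this example only.

```python
def overlap_add(frames, window, R1, S, h, out_len):
    """Window each frame and overlap-add with output stride S*h.

    frames[l][j] holds y_l(k) for k = j - R1; frames are zero elsewhere.
    Output samples with index outside 0 .. out_len - 1 are discarded.
    """
    z = [0j] * out_len
    for l, frame in enumerate(frames):
        for j, y in enumerate(frame):
            k = j - R1            # local index within the frame
            n = k + S * h * l     # output time index
            if 0 <= n < out_len:
                z[n] += window[j] * y
    return z


frames = [[1, 1], [1, 1], [1, 1]]
# Stride S*h = 2 with a two-tap window: frames tile the output without overlap.
stretched = overlap_add(frames, window=[1, 1], R1=1, S=2, h=1, out_len=4)
# Stride S*h = 1: consecutive frames overlap by one sample.
dense = overlap_add(frames, window=[1, 1], R1=1, S=1, h=1, out_len=3)
```

With the same three frames, the S = 2 case covers twice as many output samples as the S = 1 case, which is the time stretch described above.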
For the case where a complex sinusoid is used as input to the subband processing 102, i.e., an analysis subband signal corresponding to a complex sinusoid
$$x(k) = C \exp(i\omega k), \tag{8}$$
it may be determined by applying the formulas (4)-(7) that the output of the subband processing 102, i.e. the corresponding synthesis subband signal, is given by
$$z(k) = |C| \exp\!\big[i(T\angle C + \theta + Q\omega k)\big]\,K \tag{9}$$
independently of ρ. Hence, a complex sinusoid of discrete time frequency ω will be transformed into a complex sinusoid with discrete time frequency Qω provided the synthesis window shifts with a stride of Sh sum up to the same constant value K for all k,
$$\sum_l w(k - Shl) = K. \tag{10}$$
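The requirement that window shifts with stride Sh sum to a constant K can be checked numerically. The sketch below assumes that this condition reads: the sum over l of w(k - Shl) equals K for all k. The helper name `window_overlap_constant` is hypothetical. Because the sum is periodic in k with period Sh, it suffices to test one period.

```python
def window_overlap_constant(window, R1, stride):
    """Return K if the shifted windows sum to a constant, else None.

    window[j] holds w(k) for k = j - R1; stride is the integer S*h > 0.
    """
    sums = []
    for k in range(stride):
        total = 0
        for j, w in enumerate(window):
            offset = j - R1
            # w(offset) contributes at time k if k - stride*l == offset for some l.
            if (k - offset) % stride == 0:
                total += w
        sums.append(total)
    if all(abs(s - sums[0]) < 1e-12 for s in sums):
        return sums[0]
    return None


# Rectangular window of even length L = 10 (R1 = R2 = 5) with stride S*h = 2.
print(window_overlap_constant([1] * 10, R1=5, stride=2))
```

A rectangular window of odd length with stride 2 would fail this check, since even and odd output positions would receive a different number of contributions.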
It is illustrative to consider the special case of pure transposition where S = 1 and T = Q. If the input block stride is h = 1 and R_1 = 0, R_2 = 1, all the above, i.e. notably formula (5), reduces to the point-wise or sample based phase modification rule
$$\begin{cases} \angle z(k) = T\,\angle x(k) + \theta \\ |z(k)| = |x(k)| \end{cases} \tag{11}$$
The subband processing unit 102 may use the control data 104 to set certain processing parameters, e.g. the block length of the block extractors.
In the following, the description of the subband processing will be extended to cover the case of Fig. 3 with two subband inputs. Let x^(1)(k) be the input subband signal to the first block extractor 301-1 and let x^(2)(k) be the input subband signal to the second block extractor 301-2. Each extractor can use a different downsampling factor, leading to the extracted blocks
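$$\begin{cases} x_l^{(1)}(k) = x^{(1)}(D_1 k + hl) \\ x_l^{(2)}(k) = x^{(2)}(D_2 k + hl) \end{cases} \quad k = -R_1, \ldots, R_2 - 1. \tag{12}$$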
The nonlinear processing 302 produces the output frame y_l and may be defined by
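$$\begin{cases} \angle y_l(k) = T_1\,\angle x_l^{(1)}(k) + T_2\,\angle x_l^{(2)}(k) + \theta \\ |y_l(k)| = |x_l^{(1)}(k)|^{1-\rho}\,|x_l^{(2)}(k)|^{\rho} \end{cases} \tag{13}$$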
The processing in 303 is again described by (6) and (7) and 204 is identical to the overlap and add processing described in the context of the single input case.
The definition of the nonnegative real parameters D_1, D_2, ρ and the nonnegative integer parameters T_1, T_2 and the synthesis window w now depends on the desired operation mode. Note that if the same subband is fed to both inputs, x^(1)(k) = x^(2)(k), and D_1 = Q, D_2 = 0, T_1 = 1, T_2 = T - 1, the operations in (12) and (13) reduce to those of (4) and (5) in the single input case.
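The reduction to the single input case noted above can be checked with a small Python sketch. It assumes that the two-input rule combines the phases with integer weights T_1, T_2 (plus an optional θ) and the magnitudes with exponents 1 - ρ and ρ. This form is inferred from the surrounding text rather than quoted from it, and the function names are hypothetical.

```python
import cmath


def cross_frame(x1_block, x2_block, T1, T2, rho, theta=0.0):
    """Assumed two-input rule: weighted phase sum, geometric magnitude weighting."""
    out = []
    for a, b in zip(x1_block, x2_block):
        phase = T1 * cmath.phase(a) + T2 * cmath.phase(b) + theta
        mag = abs(a) ** (1.0 - rho) * abs(b) ** rho
        out.append(cmath.rect(mag, phase))
    return out


def single_frame(block, center, T, rho, theta=0.0):
    """Single-input rule as described for formula (5)."""
    return [
        cmath.rect(
            abs(center) ** rho * abs(s) ** (1.0 - rho),
            (T - 1) * cmath.phase(center) + cmath.phase(s) + theta,
        )
        for s in block
    ]


# Same subband on both inputs: D1 = Q yields the usual block, while D2 = 0
# makes the second block repeat the center sample x(h*l) for every k.
block = [cmath.rect(1.5, 0.2), cmath.rect(0.7, -0.4), cmath.rect(2.0, 1.1)]
center = block[1]
T = 3
two_input = cross_frame(block, [center] * len(block), T1=1, T2=T - 1, rho=0.5)
one_input = single_frame(block, center, T, rho=0.5)
```

Under these assumptions `two_input` and `one_input` agree sample by sample, which is the reduction stated in the text.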
In one embodiment, wherein the ratio of the frequency spacing Δf_S of the synthesis filter bank 103 and the frequency spacing Δf_A of the analysis filter bank 101 is different from the desired physical transposition factor Q_φ, it may be beneficial to determine the samples of a synthesis subband with index m from two analysis subbands with index n, n + 1, respectively. For a given index m, the corresponding index n may be given by the integer value obtained by truncating the analysis index value n given by formula (3). One of the analysis subband signals, e.g., the analysis subband signal corresponding to index n, is fed into the first block extractor 301-1 and the other analysis subband signal, e.g. the one corresponding to index n + 1, is fed into the second block extractor 301-2. Based on these two analysis subband signals a synthesis subband signal corresponding to index m is determined in accordance with the processing outlined above. The assignment of the adjacent analysis subband signals to the two block extractors 301-1 and 301-2 may be based on the remainder that is obtained when truncating the index value of formula (3), i.e. the difference of the exact index value given by formula (3) and the truncated integer value n obtained from formula (3). If the remainder is greater than 0.5, then the analysis subband signal corresponding to index n may be assigned to the second block extractor 301-2, otherwise this analysis subband signal may be assigned to the first block extractor 301-1. In this operation mode, the parameters may be designed such that input subband signals sharing the same complex frequency ω,
$$\begin{cases} x^{(1)}(k) = C_1 \exp(i\omega k) \\ x^{(2)}(k) = C_2 \exp(i\omega k) \end{cases} \tag{14}$$
lead to an output subband signal being a complex sinusoid with discrete time frequency Qω. It turns out that this happens if the following relations hold:
$$\begin{cases} Q = T_1 D_1 + T_2 D_2 \\ SQ = T_1 + T_2 \end{cases} \tag{15}$$
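The assignment of two adjacent analysis subbands to the block extractors, as described before formula (14), can be sketched as follows. The sketch assumes that formula (3) gives the exact source index as F·m/Q_φ with F = Δf_S/Δf_A, which is consistent with the later reference to eq. 3; the function name `source_subbands` is hypothetical.

```python
import math


def source_subbands(m, F, Q_phi):
    """Return (index for extractor 301-1, index for extractor 301-2).

    Assumes the exact source index of formula (3) is F * m / Q_phi.
    """
    exact = F * m / Q_phi
    n = math.floor(exact)          # truncation for nonnegative indices
    remainder = exact - n
    if remainder > 0.5:
        # Subband n goes to the second extractor, so n + 1 goes to the first.
        return (n + 1, n)
    return (n, n + 1)


for m in (3, 4, 5):
    print(m, source_subbands(m, F=2, Q_phi=3))
```

With F = 2 and Q_φ = 3, target subband m = 4 has the exact source index 8/3, so the pair (3, 2) is returned, i.e. the nearer subband feeds the first extractor.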
For the operation mode of generating missing partials by means of cross products, the design criteria are different. Returning to the physical transposition parameter Q_φ, the aim of a cross product addition is to produce output at the frequencies Q_φ Ω + r Ω_0 for r = 1, ..., Q_φ - 1 given inputs at frequencies Ω and Ω + Ω_0, where Ω_0 is a fundamental frequency belonging to a dominant pitched component of the input signal. As described in WO2010/081892, the selective addition of those terms will result in a completion of the harmonic series and a significant reduction of the ghost pitch artifact.
A constructive algorithm for operating the cross processing control 404 will now be outlined.
Given a target output subband index m, the parameter r = 1, ..., Q_φ - 1 and the fundamental frequency Ω_0, one can deduce appropriate source subband indices n_1 and n_2 by solving the following system of equations in an approximate sense,
Figure imgf000019_0002
where σ = 1/2 for oddly stacked filter bank modulation (as typically used for QMF and MDCT filter banks) and σ = 0 for evenly stacked filter bank modulation (as typically used for FFT filter banks).
With the definitions
• p = Ω_0/Δf_A : the fundamental frequency measured in units of the analysis filter bank frequency spacing;
• F = Δf_S/Δf_A : the quotient of synthesis to analysis subband frequency spacing; and
• n_f = ((m + σ)F - rp)/Q_φ - σ : the real valued target for an integer valued lower source index,
an example of advantageous approximate solution to (16) is given by selecting n_1 as the integer closest to n_f, and n_2 as the integer closest to n_f + p.
If the fundamental frequency is smaller than the analysis filter bank spacing, that is if p < 1, it may be advantageous to cancel the addition of a cross product.
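The index selection just described can be sketched in Python. The sketch assumes the real valued target n_f = ((m + σ)F - rp)/Q_φ - σ, rounds half-way cases upward (a choice not specified in the text), and cancels the cross product when p < 1. The function name `cross_source_indices` is hypothetical.

```python
import math


def cross_source_indices(m, r, p, F, Q_phi, sigma=0.5):
    """Return (n1, n2) for a cross product term, or None if p < 1.

    sigma = 0.5 corresponds to oddly stacked modulation (e.g. QMF),
    sigma = 0 to evenly stacked modulation.
    """
    if p < 1:
        return None                      # cross product addition is canceled
    n_f = ((m + sigma) * F - r * p) / Q_phi - sigma
    n1 = math.floor(n_f + 0.5)           # integer closest to n_f
    n2 = math.floor(n_f + p + 0.5)       # integer closest to n_f + p
    return n1, n2


# Example with F = 2, Q_phi = 2, r = 1 and p close to 5.
print(cross_source_indices(m=20, r=1, p=5.0196, F=2, Q_phi=2))
```

For this example the result is (17, 23); the center frequencies of the two sources, 17.5 and 23.5 analysis spacings, add up to 41 analysis spacings, which is the center of target subband m = 20 since (m + σ)F = 41.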
As it is taught in WO2010/081892, a cross product should not be added to an output subband which already has a significant main contribution from the transposition without cross products. Moreover, at most one of the cases r = 1, ..., Q_φ - 1 should contribute to the cross product output. Here, these rules may be carried out by performing the following three steps for each target output subband index m:
1. Compute the maximum M_C over all choices of r = 1, ..., Q_φ - 1 of the minimum of the candidate source subband magnitudes |x^(1)| and |x^(2)| evaluated in (or from a neighborhood of) the central time slot k = hl, wherein the source subbands x^(1) and x^(2) may be given by indices n_1 and n_2 as in equation (16);
2. Compute the corresponding magnitude M_S for the direct source term |x| obtained from a source subband with index n ≈ (F/Q_φ) m (cf. eq. 3);
3. Activate the cross term from a winning choice for M_C in point 1 above only if M_C > q M_S, where q is a predetermined threshold value.
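The three steps above can be sketched as a small decision function. The data layout (a dictionary mapping each candidate r to the pair of source magnitudes at the central time slot) and the name `select_cross_term` are assumptions made for illustration.

```python
def select_cross_term(cross_mags, direct_mag, q):
    """Return the winning r, or None if no cross term should be activated.

    cross_mags: dict mapping r to (|x1|, |x2|) at the central time slot.
    direct_mag: magnitude M_S of the direct source term.
    q: predetermined threshold value.
    """
    best_r, M_C = None, 0.0
    for r, (mag1, mag2) in cross_mags.items():
        candidate = min(mag1, mag2)      # step 1: minimum of the two sources
        if best_r is None or candidate > M_C:
            best_r, M_C = r, candidate   # step 1: maximum over r
    M_S = direct_mag                     # step 2
    if best_r is not None and M_C > q * M_S:   # step 3
        return best_r
    return None


mags = {1: (0.9, 0.2), 2: (0.5, 0.6)}
print(select_cross_term(mags, direct_mag=0.1, q=2.0))  # weak direct term
print(select_cross_term(mags, direct_mag=0.4, q=2.0))  # strong direct term
```

In the first call r = 2 wins with M_C = 0.5 and is activated; in the second call the direct contribution is too strong and no cross term is added.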
Variations to this procedure may be desirable depending on the particular system configuration parameters. One such variation is to replace the hard thresholding of point 3 with softer rules depending on the quotient M_C/M_S. Another variation is to expand the maximization in point 1 to more than Q_φ - 1 choices, for example defined by a finite list of candidate values for fundamental frequency measured in analysis frequency spacing units p. Yet another variation is to apply different measures of the subband magnitudes, such as the magnitude of a fixed sample, a maximal magnitude, an average magnitude, a magnitude in ℓ^p-norm sense, etc.
The list of target subbands m selected for addition of a cross product together with the values of n_1 and n_2 constitutes a main part of the cross processing control data 403. What remains to be described is the configuration parameters D_1, D_2, ρ, the nonnegative integer parameters T_1, T_2 appearing in the phase rotation (13) and the synthesis window w to be used in the cross subband processing 402. Inserting the sinusoidal model for the cross product situation leads to the following source subband signals:
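$$\begin{cases} x^{(1)}(k) = C_1 \exp(i\omega k) \\ x^{(2)}(k) = C_2 \exp\!\big(i(\omega + \omega_0)k\big) \end{cases} \tag{17}$$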
where ω = 2πΩΔt_A and ω_0 = 2πΩ_0Δt_A. Likewise, the desired output subband is of the form
$$z(k) = C_3 \exp\!\big[iQ(\omega + r\omega_0/Q_\varphi)k\big]. \tag{18}$$
Computations reveal that this target output can be achieved if (15) is fulfilled jointly with
$$\frac{T_2}{T_1 + T_2} = \frac{r}{Q_\varphi}. \tag{19}$$
The conditions (15) and (19) are equivalent to
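$$\begin{cases} T_1 = (Q_\varphi - r)\,SQ/Q_\varphi \\ T_2 = r\,SQ/Q_\varphi \\ T_1 D_1 + T_2 D_2 = Q \end{cases} \tag{20}$$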
which defines the integer factors T_1, T_2 for the phase modification in (13) and provides some design freedom in setting the values of downsampling factors D_1, D_2. The magnitude weighting parameter may be advantageously chosen to be ρ = r/Q_φ. As can be seen, these configuration parameters only depend on the fundamental frequency Ω_0 through the selection of r. However, for (18) to hold, a new condition on the synthesis window w emerges, namely
Figure imgf000021_0003
A synthesis window which satisfies (21) either exactly or approximately is to be provided as the last piece of cross processing control data 403.
It is noted that the above algorithm for computing cross processing control data 403 on the basis of input parameters, such as a target output subband index m and a fundamental frequency Ω_0, is of a purely exemplifying nature and as such does not limit the scope of the invention. Variations of this disclosure within the skilled person's knowledge and routine experimentation - e.g., a further subband block based processing method providing a signal (18) as output in response to input signals (17) - fall entirely within the scope of the invention.
Fig. 5 illustrates an example scenario for the application of subband block based transposition using several orders of transposition in a HFR enhanced audio codec. A transmitted bit-stream is received at a core decoder 501, which provides a low bandwidth decoded core signal at a sampling frequency f_s. The low bandwidth decoded core signal is resampled to the output sampling frequency 2f_s by means of a complex modulated 32 band QMF analysis bank 502 followed by a 64 band QMF synthesis bank (inverse QMF) 505. The two filter banks 502 and 505 share the same physical parameters Δt_S = Δt_A and Δf_S = Δf_A, and the HFR processing unit 504 simply lets through the unmodified lower subbands corresponding to the low bandwidth core signal. The high frequency content of the output signal is obtained by feeding the higher subbands of the 64 band QMF synthesis bank 505 with the output bands from a multiple transposer unit 503, subject to spectral shaping and modification performed by a HFR processing unit 504. The multiple transposer 503 takes as input the decoded core signal and outputs a multitude of subband signals which represent the 64 QMF band analysis of a superposition or combination of several transposed signal components. The objective is that if the HFR processing is bypassed, each component corresponds to an integer physical transposition without time stretch of the core signal (Q_φ = 2, 3, ..., and S_φ = 1). In the inventive scenario, the transposer control signal 104 contains data describing a fundamental frequency. This data can either be transmitted via the bitstream from the corresponding audio encoder, deduced by pitch detection in the decoder, or obtained from a combination of transmitted and detected information.
Fig. 6 illustrates an example scenario for the operation of a multiple order subband block based transposition applying a single 64 band QMF analysis filter bank. Here three transposition orders Q_φ = 2, 3, 4 are to be produced and delivered in the domain of a 64 band QMF operating at output sampling rate 2f_s. The merge unit 603 simply selects and combines the relevant subbands from each transposition factor branch into a single multitude of QMF subbands to be fed into the HFR processing unit. The objective is specifically that the processing chain of a 64 band QMF analysis 601, a subband processing unit 602-Q_φ, and a 64 band QMF synthesis 505 results in a physical transposition of Q_φ with S_φ = 1 (i.e. no stretch). Identifying these three blocks with 101, 102 and 103 of Fig. 1, one finds that Δt_A = 64/f_s and Δf_A = f_s/128 so Δt_S/Δt_A = 1/2 and F = Δf_S/Δf_A = 2. A design of specific configuration parameters for 602-Q_φ will be described separately for each case Q_φ = 2, 3, 4. For all cases, the analysis stride is chosen to be h = 1, and it is assumed that the normalized fundamental frequency parameter p = Ω_0/Δf_A = 128Ω_0/f_s is known.
Consider first the case Q_φ = 2. Then 602-2 has to perform a subband stretch of S = 2, a subband transposition of Q = 1 (i.e. none) and the correspondence between source n and target subbands m is given by n ≈ m for the direct subband processing. In the inventive scenario of cross product addition, there is only one type of cross product to consider, namely r = 1 (see above, after equation (15)), and the equations (20) reduce to T_1 = T_2 = 1 and D_1 + D_2 = 1. An exemplary solution consists of choosing D_1 = 0 and D_2 = 1. For the direct processing synthesis window, a rectangular window of even length L = 10 with R_1 = R_2 = 5 may be used as it satisfies the condition (10). For the cross processing synthesis window, a short L = 2 tap window can be used, with R_1 = R_2 = 1, in order to keep the additional complexity of the cross products addition to a minimum. After all, the beneficial effect of using a long block for the subband processing is most notable in the case of complex audio signals, where unwanted intermodulation terms are suppressed; for the case of a dominant pitch, such artifacts are less probable to occur. The L = 2 tap window is the shortest one that can satisfy (10) since h = 1 and S = 2. By the present invention, however, the window advantageously satisfies (21). For the parameters at hand, this amounts to
Figure imgf000023_0001
which is fulfilled by choosing w(0) = 1 and w(-1) = exp(iα) = exp(iπp/2).
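The effect of this window choice can be checked numerically for the case Q_φ = 2. The sketch below feeds unit sinusoids at discrete frequencies ω and ω + ω_0 into an assumed cross product rule (phase sum with T_1 = T_2 = 1, unit magnitudes) and compares the output with the target sinusoid of formula (18), whose frequency here is ω + ω_0/2. It uses ω_0 = πp, which follows from Δt_A = 64/f_s and p = 128Ω_0/f_s. Function names and test values are illustrative assumptions.

```python
import cmath
import math


def cross_product_output(omega, omega0, w_minus1, n_blocks=16):
    """Cross term for Q_phi = 2: S = 2, Q = 1, h = 1, D1 = 0, D2 = 1, T1 = T2 = 1.

    Two-tap window on k in {-1, 0} with w(0) = 1 and w(-1) = w_minus1.
    Returns a dict mapping output time index n to z(n).
    """
    S, h, D1, D2, T1, T2 = 2, 1, 0, 1, 1, 1
    window = {-1: w_minus1, 0: 1.0}
    z = {}
    for l in range(n_blocks):
        for k in (-1, 0):
            a = cmath.exp(1j * omega * (D1 * k + h * l))
            b = cmath.exp(1j * (omega + omega0) * (D2 * k + h * l))
            y = cmath.rect(1.0, T1 * cmath.phase(a) + T2 * cmath.phase(b))
            n = k + S * h * l
            z[n] = z.get(n, 0j) + window[k] * y
    return z


def max_error(z, omega, omega0):
    """Largest deviation from the target exp(i * (omega + omega0 / 2) * n)."""
    return max(abs(z[n] - cmath.exp(1j * (omega + omega0 / 2) * n)) for n in z)


p = 5.0196
omega, omega0 = 0.7, math.pi * p
tuned = cross_product_output(omega, omega0, cmath.exp(1j * math.pi * p / 2))
plain = cross_product_output(omega, omega0, 1.0)
print(max_error(tuned, omega, omega0), max_error(plain, omega, omega0))
```

With the phase-rotated tap the output matches the target sinusoid to numerical precision, whereas the plain window w(0) = w(-1) = 1 leaves a phase error on every second sample.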
For the case Q_φ = 3, the specifications for 602-3 given by (1)-(3) are that it has to perform a subband stretch of S = 2, a subband transposition of Q = 3/2 and that the correspondence between source n and target m subbands for the direct term processing is given by n ≈ 2m/3. There are two types of cross product terms r = 1, 2, and the equations (20) reduce to
$$\begin{cases} T_1 = 3 - r \\ T_2 = r \\ (3 - r)D_1 + rD_2 = 3/2 \end{cases}$$
An exemplary solution consists of choosing the downsampling parameters as
• D_1 = 0 and D_2 = 3/2 for r = 1;
• D_1 = 3/2 and D_2 = 0 for r = 2.
For the direct processing synthesis window, a rectangular window of even length L = 8 with R_1 = R_2 = 4 may be used. For the cross processing synthesis window, a short L = 2 tap window can be used, with R_1 = R_2 = 1 and satisfying
Figure imgf000024_0001
which is fulfilled by choosing w(0) = 1 and w(-1) = exp(iα).
For the case Q_φ = 4, the specifications for 602-4 given by (1)-(3) are that it has to perform a subband stretch of S = 2, a subband transposition of Q = 2 and that the correspondence between source n and target subbands m for the direct term processing is given by n ≈ m/2. There are three types of cross product terms r = 1, 2, 3, and the equations (20) reduce to
$$\begin{cases} T_1 = 4 - r \\ T_2 = r \\ (4 - r)D_1 + rD_2 = 2 \end{cases}$$
An exemplary solution consists of choosing
• D_1 = 0 and D_2 = 2 for r = 1;
• D_1 = 0 and D_2 = 1 for r = 2;
• D_1 = 2 and D_2 = 0 for r = 3.
For the direct processing synthesis window, a rectangular window of even length L = 6 with R_1 = R_2 = 3 may be used. For the cross processing synthesis window, a short L = 2 tap window can be used, with R_1 = R_2 = 1, and satisfying
Figure imgf000024_0003
which is fulfilled by choosing w(0) = 1 and w(-1) = exp(iα).
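The exemplary parameter choices for Q_φ = 2, 3, 4 can be cross-checked with exact rational arithmetic. The sketch assumes that the phase factors follow from T_1 + T_2 = SQ together with T_2/(T_1 + T_2) = r/Q_φ, and that Q = Q_φ/2 for the filter bank configuration of Fig. 6 (F = 2, S = 2). The function name `phase_factors` is hypothetical.

```python
from fractions import Fraction


def phase_factors(Q_phi, r, S, Q):
    """T1, T2 from T1 + T2 = S*Q and T2 / (T1 + T2) = r / Q_phi."""
    T2 = Fraction(r, Q_phi) * S * Q
    T1 = S * Q - T2
    return T1, T2


# (Q_phi, r) -> (D1, D2), as listed in the text for each transposition order.
examples = {
    (2, 1): (0, 1),
    (3, 1): (0, Fraction(3, 2)),
    (3, 2): (Fraction(3, 2), 0),
    (4, 1): (0, 2),
    (4, 2): (0, 1),
    (4, 3): (2, 0),
}

S = 2
for (Q_phi, r), (D1, D2) in examples.items():
    Q = Fraction(Q_phi, 2)                 # subband transposition when F = 2
    T1, T2 = phase_factors(Q_phi, r, S, Q)
    assert T1 == Q_phi - r and T2 == r     # integer phase factors
    assert T1 * D1 + T2 * D2 == Q          # first relation of (15)
    print(Q_phi, r, T1, T2, D1, D2)
```

Every listed pair (D_1, D_2) satisfies the first relation of (15) with T_1 = Q_φ - r and T_2 = r, so the examples are mutually consistent under these assumptions.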
In each of the above cases where more than one r value is applicable, a selection will take place, e.g., similarly to the three-step procedure described before equation (17).
Fig. 7 depicts the amplitude spectrum of a harmonic signal with fundamental frequency Ω_0 = 564.7 Hz. The low frequency part 701 of the signal is to be used as input for a multiple transposer. The purpose of the transposer is to generate a signal as close as possible to the high frequency part 702 of the input signal, so that transmission of the high-frequency part 702 becomes non-imperative and available bit rate can be used economically.
Fig. 8 depicts the amplitude spectrum of outputs from a transposer which has the low frequency part 701 of the signal of Fig. 7 as input. The multiple transposer is constructed by using 64 band QMF filter banks, input sampling frequency f_s = 14400 Hz, and in accordance with the description of Fig. 5. For clarity however, only the two transposition orders Q_φ = 2, 3 are considered. The three different panels 801-803 represent the final output obtained by using different settings of the cross processing control data.
The top panel 801 depicts the output spectrum obtained if all cross product processing is canceled and only the direct subband processing 401 is active. This will be the case if the cross processing control 404 receives no pitch or p = 0. Transposition by Q_φ = 2 generates the output in the range from 4 to 8 kHz and transposition by Q_φ = 3 generates the output in the range from 8 to 12 kHz. As can be seen, the created partials are increasingly far apart and the output deviates significantly from the target high frequency signal 702. Audible double and triple "ghost" pitch artifacts will be present in the resulting audio output.
The middle panel 802 depicts the output spectrum obtained if cross product processing is active, the pitch parameter p = 5 is used (which is an approximation to 128Ω_0/f_s = 5.0196), but a simple two tap synthesis window with w(0) = w(-1) = 1, satisfying condition (10), is used for the cross subband processing. This amounts to a straightforward combination of subband block based processing and cross-product enhanced harmonic transposition. As can be seen, the additional output signal components compared to 801 do not align well with the desired harmonic series. This shows that using the procedure inherited from the design of direct subband processing for the cross product processing leads to insufficient audio quality.
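The quoted value of the pitch parameter can be reproduced from the figures given for Figs. 7 and 8. The short Python sketch below assumes Δf_A = f_s/128, as stated for the 64 band QMF analysis; the variable names are illustrative.

```python
f_s = 14400.0            # input sampling frequency in Hz (Fig. 8)
omega_0_hz = 564.7       # fundamental frequency in Hz (Fig. 7)

delta_f_a = f_s / 128    # analysis frequency spacing of the 64 band QMF
p_exact = omega_0_hz / delta_f_a
p_used = round(p_exact)  # integer approximation used for the middle panel

print(f"{p_exact:.4f}", p_used)  # 5.0196 5
```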
The bottom panel 803 depicts the output spectrum obtained from the same scenario as for the middle panel 802, but now with the cross subband processing synthesis windows given by the formulas described in the cases Q_φ = 2, 3 of Fig. 5. That is, a two tap window of the form w(0) = 1 and w(-1) = exp(iα) satisfying (21) and with the feature taught by the present invention that it depends on the value of p. As can be seen, the combined output signal aligns very well with the desired harmonic series of 702.
Fig. 9 shows a portion of the nonlinear frame processing unit 202 including sections configured to receive two input samples u_1, u_2 and to generate based on these a processed sample w, whose magnitude is given by a geometric mean of the magnitudes of the input samples and whose phase is a linear combination of the phases of the input samples, that is,
$$\begin{cases} |w| = |u_1|^{\rho}\,|u_2|^{1-\rho} \\ \arg w = T_1 \arg u_1 + T_2 \arg u_2 \end{cases} \tag{22}$$
It is possible to obtain the processed sample w according to this specification by pre-normalizing each of the input samples u_1, u_2 at a respective pre-normalizer 901, 902 and multiplying the pre-normalized input samples v_1 = u_1/|u_1|^a, v_2 = u_2/|u_2|^b at a weighted multiplier 910, which outputs w = v_1^α v_2^β. Clearly, the operation of the pre-normalizers 901, 902 and the weighted multiplier 910 is determined by input parameters a, b, α and β. It is easy to verify that equations (22) will be fulfilled if α = T_1, β = T_2, a = 1 - ρ/T_1, b = 1 - (1 - ρ)/T_2. The skilled person will readily be able to generalize this layout to an arbitrary number N_0 of input samples, wherein a multiplier is supplied with N_0 input samples, of which some or all have undergone pre-normalization. One observes, then, that a common pre-normalization (a = b, implying that the pre-normalizers 901, 902 produce identical results) is possible if the parameter ρ is set to ρ = T_1/(T_1 + T_2). This results in a computational advantage when many subbands are considered, since a common pre-normalization step can be effected on all candidate subbands prior to the multiplication. In an advantageous hardware implementation, a plurality of identically functioning pre-normalizers is replaced by a single unit which alternates between samples from different subbands in a time-division fashion.
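The pre-normalization layout of Fig. 9 can be checked against equations (22) with a few lines of Python. The sketch assumes nonzero input samples and positive integers T_1, T_2; the function name `fig9_sample` is hypothetical.

```python
import cmath


def fig9_sample(u1, u2, T1, T2, rho):
    """Pre-normalize u1, u2 and combine them in a weighted multiplier.

    Uses alpha = T1, beta = T2, a = 1 - rho/T1, b = 1 - (1 - rho)/T2.
    Requires u1, u2 nonzero and T1, T2 positive integers.
    """
    a = 1 - rho / T1
    b = 1 - (1 - rho) / T2
    v1 = u1 / abs(u1) ** a        # pre-normalizer 901
    v2 = u2 / abs(u2) ** b        # pre-normalizer 902
    return v1 ** T1 * v2 ** T2    # weighted multiplier 910


u1, u2 = cmath.rect(2.0, 0.3), cmath.rect(3.0, -0.5)
T1, T2 = 2, 1
rho_common = T1 / (T1 + T2)       # gives a == b, i.e. a common pre-normalization
w = fig9_sample(u1, u2, T1, T2, rho_common)
print(abs(w), cmath.phase(w))
```

For these inputs the magnitude equals 2^(2/3) · 3^(1/3) and the phase equals 2 · 0.3 - 0.5 = 0.1, in agreement with (22).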
Further embodiments of the present invention will become apparent to a person skilled in the art after reading the description above. Even though the present description and drawings disclose embodiments and examples, the invention is not restricted to these specific examples. Numerous modifications and variations can be made without departing from the scope of the present invention, which is defined by the accompanying claims.
The systems and methods disclosed hereinabove may be implemented as software, firmware, hardware or a combination thereof. Certain components or all components may be implemented as software executed by a digital signal processor or microprocessor, or be implemented as hardware or as an application-specific integrated circuit. Such software may be distributed on computer readable media, which may comprise computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to a person skilled in the art, computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. Further, it is well known to the skilled person that communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media.

Claims

1. A system configured to generate a time stretched and/or frequency transposed signal from an input signal, the system comprising:
an analysis filter bank (101) configured to derive a number Y ≥ 1 of analysis subband signals from the input signal, wherein each analysis subband signal comprises a plurality of complex-valued analysis samples, each having a phase and a magnitude;
a subband processing unit (102) configured to generate a synthesis subband signal from the Y analysis subband signals using a subband transposition factor Q and a subband stretch factor S, at least one of Q and S being greater than one, wherein the subband processing unit (102) comprises:
a block extractor (201) configured to:
i) form Y frames of L input samples, each frame being extracted from said plurality of complex-valued analysis samples in an analysis subband signal and the frame length being L > 1; and
ii) apply a block hop size of h samples to said plurality of analysis samples, prior to forming a subsequent frame of L input samples, thereby generating a sequence of frames of input samples;
a nonlinear frame processing unit (202) configured to generate, on the basis of Y corresponding frames of input samples formed by the block extractor, a frame of processed samples by determining a phase and magnitude for each processed sample of the frame, wherein, for at least one processed sample:
i) the phase of the processed sample is based on the respective phases of the corresponding input sample in each of the Y frames of input samples; and
ii) the magnitude of the processed sample is based on the magnitude of the corresponding input sample in each of the Y frames of input samples;
and
an overlap and add unit (204) configured to determine the synthesis subband signal by overlapping and adding the samples of a sequence of frames of processed samples;
and
a synthesis filter bank (103) configured to generate the time stretched and/or frequency transposed signal from the synthesis subband signal,
wherein the system is operable at least for Y = 2.
2. The system of claim 1, wherein
the analysis filter bank (101) is one of a quadrature mirror filter bank, a windowed discrete Fourier transform or a wavelet transform; and
the synthesis filter bank (103) is a corresponding inverse filter bank or transform.
3. The system of claim 2, wherein the analysis filter bank (101) is a 64-point quadrature mirror filter bank and the synthesis filter bank (103) is an inverse 64-point quadrature mirror filter bank.
4. The system of any one of the preceding claims, wherein:
the analysis filter bank (101) applies an analysis time stride Δt_A to the input signal; the analysis filter bank has an analysis frequency spacing Δf_A;
the analysis filter bank has a number N of analysis subbands, with N > 1, where n is an analysis subband index with n = 0, ..., N - 1;
an analysis subband of the N analysis subbands is associated with a frequency band of the input signal;
the synthesis filter bank (103) applies a synthesis time stride Δt_S to the synthesis subband signal;
the synthesis filter bank has a synthesis frequency spacing Δf_S;
the synthesis filter bank has a number M of synthesis subbands, with M > 1, where m is a synthesis subband index with m = 0, ..., M - 1; and
a synthesis subband of the M synthesis subbands is associated with a frequency band of the time stretched and/or frequency transposed signal.
5. The system of claim 4, wherein the subband processing unit (102) is configured for Y = 2 and further comprises a cross processing control unit (404) configured to generate cross processing control data (403) defining subband indices n_1, n_2 associated with the analysis subband signals in such manner that the subband indices differ by an integer p approximating the ratio of a fundamental frequency Ω_0 of the input signal and the analysis frequency spacing Δf_A.
6. The system of claim 4, wherein the subband processing unit (102) is configured for Y = 2 and further comprises a cross processing control unit (404) configured to generate cross processing control data (403) defining subband indices n_1, n_2 associated with the analysis subband signals and the synthesis subband index m, which subband indices are related by being approximate integer solutions of
Figure imgf000029_0001
where Ω_0 is a fundamental frequency of the input signal,
σ = 0 or 1/2,
Q_φ is a physical transposition factor, and r is an integer satisfying 1 ≤ r ≤ Q_φ - 1.
7. The system of claim 6, wherein the cross processing control unit (404) is configured to generate processing control data such that the subband indices n_1, n_2 are based on a value of r which maximizes the minimum of the subband magnitudes of the two frames formed by extracting analysis samples from analysis subband signals.
8. The system of claim 7, wherein the subband magnitude of each frame of L input samples is the magnitude of a central or near-central sample.
9. The system of any one of the preceding claims, wherein the block extractor (201) is configured to derive at least one frame of input samples by downsampling the complex-valued analysis samples in an analysis subband signal.
10. The system of claim 9 configured for Y = 2, wherein the block extractor is configured to derive a first and second frame of input samples by downsampling the complex-valued analysis samples in a first and second analysis subband signal, respectively, by downsampling factors D_1 and D_2 satisfying
$$\begin{cases} Q = T_1 D_1 + T_2 D_2 \\ SQ = T_1 + T_2 \end{cases}$$
and either D_1 ≥ 0, D_2 > 0 or D_1 > 0, D_2 ≥ 0,
and wherein the nonlinear frame processing unit (202) is configured to determine the phase of the processed sample based on a linear combination, with non-negative integer coefficients T_1, T_2, of respective phases of the corresponding input sample in a first and second frame of input samples.
11. The system of any one of the preceding claims, wherein the subband processing unit (102) further comprises a windowing unit (203) upstream of the overlap and add unit (204) and configured to apply a finite-length window function to the frame of processed samples.
12. The system of claim 11, wherein the window function has a length which corresponds to the frame length L and the window function is one of a:
Gaussian window,
cosine window,
raised cosine window,
Hamming window,
Hann window,
rectangular window,
Bartlett window, and
Blackman window.
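For illustration, a few of the listed window functions of length L can be generated and applied to a frame of processed samples as sketched below. The symmetric textbook definitions are used; the application does not state which variant (symmetric or periodic) is intended, so that choice and all function names are assumptions.

```python
import math


def rectangular(L):
    return [1.0] * L


def hann(L):
    # Symmetric Hann window; requires L > 1.
    return [0.5 - 0.5 * math.cos(2 * math.pi * k / (L - 1)) for k in range(L)]


def hamming(L):
    # Symmetric Hamming window with the conventional 0.54 / 0.46 coefficients.
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (L - 1)) for k in range(L)]


def bartlett(L):
    # Triangular window that reaches zero at both ends.
    return [1.0 - abs(2.0 * k / (L - 1) - 1.0) for k in range(L)]


def apply_window(frame, window):
    """Multiply a frame of processed samples by a window of the same length."""
    if len(frame) != len(window):
        raise ValueError("window length must equal the frame length L")
    return [w * y for w, y in zip(window, frame)]


windowed = apply_window([2j] * 5, bartlett(5))
```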
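13. The system of claim 11, wherein the window function comprises a plurality of window samples, and wherein overlapped and added window samples of a plurality of window functions, when weighted by complex weights and shifted with a hop size of Sh, form a substantially constant sequence.
14. The system of claim 13, wherein consecutive complex weights differ only by a fixed phase rotation.
15. The system of claim 14, wherein the phase rotation is proportional to a fundamental frequency of the input signal.
16. The system of any one of the preceding claims, wherein the overlap and add unit (204) applies a hop size to consecutive frames of processed samples, the hop size being equal to the block hop size h times the subband stretch factor S.
17. The system of any one of the preceding claims, operable at least for Y = 1 and Y = 2.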
18. The system of claim 17, operable for at least one further value Y ≥ 3.
19. The system of any one of the preceding claims configured for Y = 2, wherein the frame processing unit (202) is configured to determine the magnitude of the processed sample as a mean value of the magnitude of the corresponding input sample in a first frame of input samples and the magnitude of the corresponding input sample in a second frame of input samples.
20. The system of claim 19, wherein the nonlinear frame processing unit (202) is configured to determine the magnitude of the processed sample as a weighted geometric mean value.
21. The system of claim 20, wherein the geometric magnitude weighting parameters are p and 1 - p, where p is a real number inversely proportional to the subband transposition factor Q.
22. The system of any one of the preceding claims configured for Y = 2, wherein the nonlinear frame processing unit (202) is configured to determine the phase of the processed sample based on a linear combination, with non-negative integer coefficients (T_1, T_2), of respective phases of the corresponding input sample in a first and second frame of input samples.
23. The system of claim 22, wherein the sum of said integer coefficients is the product Q × S of the stretch factor and the transposition factor.
24. The system of claim 22, wherein the phase of the processed sample corresponds to said linear combination of phases plus a phase correction parameter Θ.
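The magnitude and phase rules of claims 19 to 24 can be combined in a short sketch for a single processed sample in the Y = 2 case. The function name and the numeric values are hypothetical; the weighting parameter p of claim 21 is passed in as `rho` rather than derived from Q, because the text gives only its proportionality.

```python
import cmath


def processed_sample(u1, u2, T1, T2, rho, theta=0.0):
    """One processed sample from two corresponding input samples.

    Magnitude: weighted geometric mean |u1|**rho * |u2|**(1 - rho).
    Phase:     T1 * arg(u1) + T2 * arg(u2) + theta, with T1, T2
               non-negative integers and theta a phase correction.
    """
    magnitude = abs(u1) ** rho * abs(u2) ** (1.0 - rho)
    phase = T1 * cmath.phase(u1) + T2 * cmath.phase(u2) + theta
    return cmath.rect(magnitude, phase)


# Hypothetical inputs: magnitudes 4 and 1, phases 0.3 and 0.5 rad,
# T1 = T2 = 1 (so T1 + T2 = Q * S = 2) and rho = 1/2.
y = processed_sample(cmath.rect(4.0, 0.3), cmath.rect(1.0, 0.5), 1, 1, 0.5)
```

In this example the output magnitude is sqrt(4 × 1) = 2 and the output phase is 0.3 + 0.5 = 0.8 rad.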
25. The system of any one of the preceding claims, wherein the block extractor (201) is configured to interpolate two or more analysis samples to derive an input sample.
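Claim 25 does not specify an interpolation method. As one possible illustration, the sketch below uses linear interpolation between the two neighbouring complex analysis samples, which makes non-integer downsampling factors usable. The helper names and the choice of linear interpolation are assumptions.

```python
import math


def interpolated_sample(samples, position):
    """Linearly interpolate a complex subband signal at a fractional index."""
    if not 0 <= position <= len(samples) - 1:
        raise IndexError("position outside the analysis subband signal")
    i = math.floor(position)
    frac = position - i
    if frac == 0:
        return samples[i]
    return (1.0 - frac) * samples[i] + frac * samples[i + 1]


def extract_frame(samples, start, D, L):
    """Frame of L input samples read at positions start + D*k (D may be fractional)."""
    return [interpolated_sample(samples, start + D * k) for k in range(L)]


# Hypothetical subband signal read with downsampling factor D = 0.5.
frame = extract_frame([0j, 2 + 2j, 4j, 6 + 0j], start=0, D=0.5, L=3)
```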
26. The system of any one of the preceding claims, further comprising a control data reception unit configured to receive control data (104), wherein the subband processing unit (102) is configured to determine the synthesis subband signal by taking into account the control data.
27. The system of claim 26 configured for Y = 2, said control data (104) including a fundamental frequency Ω_0 of the input signal, wherein the subband processing unit (102) is configured to determine the analysis subbands, from which the processed samples are to be derived, in such manner that their frequency spacing is proportional to the fundamental frequency.
28. The system of any one of the preceding claims, wherein the non-linear processing unit (102) comprises:
a pre-normalizer (901, 902) configured to rescale the magnitudes of the corresponding input samples in at least one of the Y frames of input samples (v_m = u_m / |u_m|^{β_m}); and
a complex multiplier (910) configured to determine the processed sample by computing a weighted complex product (\prod_{m \notin M} u_m^{α_m} \prod_{m \in M} v_m^{α_m}) of factors equal to the corresponding input sample in at least two of the Y frames of input samples, at least one of the factors (v_m, m ∈ M ≠ ∅) being derived from a sample with a magnitude rescaled by the pre-normalizer.
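The pre-normalizer and complex multiplier of claim 28 can be sketched for Y = 2 as follows. The exponents used here are not taken from the application; they are one assumed choice under which the product v_1^{T_1} v_2^{T_2} has the weighted geometric mean magnitude |u_1|^rho |u_2|^(1 - rho) and the phase T_1 arg(u_1) + T_2 arg(u_2). All names are hypothetical.

```python
import cmath


def pre_normalize(u, beta):
    """Rescale the magnitude of u to |u|**(1 - beta), leaving its phase untouched.

    Assumes u is non-zero.
    """
    return u / abs(u) ** beta


def cross_product_sample(u1, u2, T1, T2, rho):
    """Processed sample as a product of integer powers of pre-normalized samples."""
    if T1 < 1 or T2 < 1:
        raise ValueError("this sketch assumes positive integer exponents")
    # Assumed exponents: chosen so that (1 - beta1) * T1 = rho and
    # (1 - beta2) * T2 = 1 - rho.
    v1 = pre_normalize(u1, 1.0 - rho / T1)
    v2 = pre_normalize(u2, 1.0 - (1.0 - rho) / T2)
    # Integer powers multiply the phases by T1 and T2 with no phase unwrapping.
    return v1 ** T1 * v2 ** T2


y = cross_product_sample(cmath.rect(4.0, 0.3), cmath.rect(9.0, -0.2), 2, 1, 0.5)
```

Working with products of integer powers avoids explicit phase extraction, which is one plausible motivation for the multiplier structure; here the result has magnitude 4^0.5 × 9^0.5 = 6 and phase 2 × 0.3 - 0.2 = 0.4 rad.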
29. The system of any one of the preceding claims configured for Y = 2, comprising:
an analysis filter bank (101) configured to derive a first and a second analysis subband signal from the input signal;
a subband processing unit (102) configured to determine a synthesis subband signal from the first and second analysis subband signals, wherein the subband processing unit (102) comprises:
a first block extractor (301-1) configured to:
i) form a first frame of L input samples from said plurality of complex-valued analysis samples in the first analysis subband signal, the frame length being L > 1; and
ii) apply a block hop size of h samples to said plurality of analysis samples, prior to forming a subsequent frame of L input samples, thereby generating a first sequence of frames of input samples;
a second block extractor (301-2) configured to:
i) form a second frame of L input samples from said plurality of complex-valued analysis samples in the second analysis subband signal; and
ii) apply the block hop size of h samples to said plurality of analysis samples, prior to forming a subsequent frame of L input samples, thereby generating a second sequence of frames of input samples;
a nonlinear frame processing unit (302) configured to generate, on the basis of the first and second frames of input samples, a frame of processed samples; and
an overlap and add unit (204) configured to form the synthesis subband signal; and
a synthesis filter bank (103) configured to generate the time stretched and/or frequency transposed signal from the synthesis subband signal.
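The data flow through the two block extractors, the frame processing unit and the overlap and add unit of claim 29 can be sketched as below. This is a simplified illustration with hypothetical names: there is no downsampling or windowing, the frame processing rule is supplied by the caller, and the output hop size S·h follows the overlap-add rule stated elsewhere in the claims.

```python
def extract_frames(samples, L, h):
    """Frames of L consecutive samples, advancing by the block hop size h."""
    return [samples[s:s + L] for s in range(0, len(samples) - L + 1, h)]


def overlap_add(frames, hop):
    """Sum frames of equal length placed hop samples apart."""
    if not frames:
        return []
    L = len(frames[0])
    out = [0j] * (hop * (len(frames) - 1) + L)
    for l, frame in enumerate(frames):
        for k, value in enumerate(frame):
            out[l * hop + k] += value
    return out


def cross_subband_process(x1, x2, L, h, S, process):
    """Two analysis subband signals in, one synthesis subband signal out.

    process(frame1, frame2) returns a frame of L processed samples;
    S is assumed to be an integer so that S * h is a valid hop size.
    """
    frames1 = extract_frames(x1, L, h)
    frames2 = extract_frames(x2, L, h)
    processed = [process(f1, f2) for f1, f2 in zip(frames1, frames2)]
    return overlap_add(processed, S * h)


# Hypothetical run: constant inputs and a plain sample-wise product.
x = [1 + 0j] * 6
y = cross_subband_process(x, x, L=4, h=2, S=1,
                          process=lambda f1, f2: [a * b for a, b in zip(f1, f2)])
```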
30. The system of any one of the preceding claims, further comprising:
a plurality of subband processing units (401, 402; 503: 602-2, 602-3, 602-4), each configured to determine an intermediate synthesis subband signal using a different value of the subband transposition factor Q and/or of the subband stretch factor S; and
a merging unit (405; 603) arranged downstream of said plurality of subband processing units and upstream of the synthesis filter bank (103) configured to merge corresponding intermediate synthesis subband signals in order to determine the synthesis subband signal.
31. The system of claim 30, further comprising:
a core decoder (501) arranged upstream of the analysis filter bank (101) configured to decode a bit stream into the input signal; and
a high-frequency reconstruction, HFR, processing unit (504) arranged downstream of the merging unit (405; 603) and upstream of the synthesis filter bank (103; 505) configured to apply spectral band information derived from the bit stream to the synthesis subband signal, such as by performing spectral shaping of the synthesis subband signal.
32. The system of claim 30, wherein at least one of the subband processing units is a direct subband processing unit (401), which is configured to determine one synthesis subband signal from one analysis subband signal using a subband transposition factor Q and a subband stretch factor S, and at least one is a cross subband processing unit (402), which is configured to determine one synthesis subband signal from two analysis subband signals using a subband transposition factor Q and a subband stretch factor S, which are independent of the first two factors.
33. The system of claim 32 configured for Y= 2, wherein:
the analysis filter bank (101) applies an analysis time stride Δt_A to the input signal;
the analysis filter bank has an analysis frequency spacing Δf_A;
the analysis filter bank has a number N of analysis subbands, with N > 1, where n is an analysis subband index with n = 0, ..., N - 1;
an analysis subband of the N analysis subbands is associated with a frequency band of the input signal;
the synthesis filter bank (103) applies a synthesis time stride Δt_S to the synthesis subband signal;
the synthesis filter bank has a synthesis frequency spacing Δf_S;
the synthesis filter bank has a number M of synthesis subbands, with M > 1, where m is a synthesis subband index with m = 0, ..., M - 1; and
a synthesis subband of the M synthesis subbands is associated with a frequency band of the time stretched and/or frequency transposed signal,
said system being configured to deactivate at least one cross subband processing unit (402) if, for a given synthesis subband, one of the following conditions is satisfied:
a) the ratio of the magnitude M_S of the direct source term analysis subband yielding the synthesis subband and the least magnitude M_C in an optimal pair of cross source terms yielding the synthesis subband is greater than a predetermined constant q;
b) the synthesis subband has a significant contribution from a direct processing unit;
c) a fundamental frequency Ω_0 is smaller than the analysis filter bank spacing Δf_A.
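The three deactivation conditions can be expressed as a simple per-subband decision. The sketch below is illustrative only: the function and parameter names are hypothetical, and how a "significant contribution" from direct processing is detected is left to the caller as a boolean flag.

```python
def cross_processing_active(direct_magnitude, cross_min_magnitude, q,
                            fundamental, analysis_spacing,
                            direct_contribution_significant=False):
    """Return False when any of conditions a), b) or c) calls for deactivation."""
    # a) ratio M_S / M_C greater than q, written without a division so that
    #    a zero cross-term magnitude does not raise an error.
    if direct_magnitude > q * cross_min_magnitude:
        return False
    # b) the synthesis subband already has a significant direct contribution.
    if direct_contribution_significant:
        return False
    # c) fundamental frequency smaller than the analysis filter bank spacing.
    if fundamental < analysis_spacing:
        return False
    return True


# Hypothetical values: M_S = 1.0, M_C = 0.5, q = 4, Omega_0 = 200, spacing = 100.
active = cross_processing_active(1.0, 0.5, 4.0, 200.0, 100.0)
```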
34. The system of any one of the preceding claims, wherein:
the analysis filter bank (101) is configured to form Y × Z analysis subband signals from the input signal;
the subband processing unit (102) is configured to generate Z synthesis subband signals from the Y × Z analysis subband signals, applying a pair of S and Q values for each group of Y analysis subband signals on which one synthesis subband signal is based; and
the synthesis filter bank (103) is configured to generate Z time stretched and/or frequency transposed signals from the Z synthesis subband signals.
35. A method for generating a time stretched and/or frequency transposed signal from an input signal, the method comprising:
deriving a number Y ≥ 2 of analysis subband signals from the input signal, wherein each analysis subband signal comprises a plurality of complex-valued analysis samples, each having a phase and a magnitude;
forming Y frames of L input samples, each frame being extracted from said plurality of complex-valued analysis samples in an analysis subband signal and the frame length being L > 1;
applying a block hop size of h samples to said plurality of analysis samples, prior to deriving a subsequent frame of L input samples, thereby generating a sequence of frames of input samples;
generating, on the basis of Y corresponding frames of input samples, a frame of processed samples by determining a phase and a magnitude for each processed sample of the frame, wherein, for at least one processed sample:
i) the phase of the processed sample is based on the respective phases of the corresponding input sample in each of the frames of input samples; and
ii) the magnitude of the processed sample is based on the magnitude of the corresponding input sample in each of the Y frames of input samples;
determining the synthesis subband signal by overlapping and adding the samples of a sequence of frames of processed samples; and
generating the time stretched and/or frequency transposed signal from the synthesis subband signal.
36. The method of claim 35, wherein a frame of processed samples is based on Y = 2 corresponding frames of input samples, which are formed by extracting samples from two analysis subband signals representing frequencies that differ approximately by a fundamental frequency Ω_0 of the input signal.
37. The method of claim 35 or 36, wherein:
a frame of processed samples is based on Y = 2 corresponding frames of input samples, which are formed by extracting samples from two analysis subband signals approximately representing frequencies Ω and Ω + Ω_0; and
the synthesis subband signal approximately represents a frequency Q_φ Ω + rΩ_0, where r is an integer satisfying 1 ≤ r ≤ Q_φ - 1 and Q = (Δt_S/Δt_A) Q_φ, where Δt_A and Δt_S are analysis and synthesis time strides, respectively.
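The frequency relation in claim 37 can be made concrete with a small numerical sketch. It lists the frequencies Q_φ Ω + rΩ_0 reachable from a source pair (Ω, Ω + Ω_0), assuming an integer Q_φ and the inclusive range 1 ≤ r ≤ Q_φ - 1. The function name and the example frequencies are hypothetical.

```python
def cross_product_targets(omega, omega0, q_phi):
    """Frequencies q_phi * omega + r * omega0 for r = 1, ..., q_phi - 1."""
    return [q_phi * omega + r * omega0 for r in range(1, q_phi)]


# Hypothetical harmonic source: partials at 1000 Hz and 1200 Hz (Omega_0 = 200 Hz).
targets_q3 = cross_product_targets(1000.0, 200.0, 3)
targets_q2 = cross_product_targets(1000.0, 200.0, 2)
```

For Q_φ = 3, direct transposition of the two partials alone would yield 3000 Hz and 3600 Hz; the cross products at 3200 Hz and 3400 Hz fall on the multiples of Ω_0 in between.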
38. The method of claim 37, wherein the frequency Ω is selected in order to maximize the smaller of the subband magnitudes of the two frames of input samples extracted from analysis subband signals representing frequencies Ω and Ω + Ω_0.
39. The method of claim 38, wherein the subband magnitude of a frame of input samples is the magnitude of a central or near-central sample.
40. The method of any one of claims 35 to 39, wherein said forming frames of input samples includes downsampling complex-valued analysis samples in an analysis subband signal.
41. The method of claim 40, wherein:
a frame of processed samples is based on Y = 2 corresponding frames of input samples;
a first frame of input samples is extracted from samples in a first analysis subband signal while applying a downsampling factor D_1;
a second frame of input samples is extracted from samples in a second analysis subband signal while applying a downsampling factor D_2;
the downsampling factors satisfy
Q = T_1 D_1 + T_2 D_2,
SQ = T_1 + T_2,
and either D_1 ≥ 0, D_2 > 0 or D_1 > 0, D_2 ≥ 0; and
the phase of the processed sample is based on a linear combination, with non-negative integer coefficients T_1, T_2, of respective phases of the corresponding input sample in a first and second frame of input samples.
42. The method of any one of claims 35 to 41, wherein said determining the synthesis subband signal further comprises applying a finite-length window function to each frame in the sequence of frames of processed samples prior to overlapping and adding them.
43. The method of claim 42, wherein the window function has a length which corresponds to the frame length L and the window function is one of a:
Gaussian window,
cosine window,
raised cosine window,
Hamming window,
Hann window,
rectangular window,
Bartlett window, and
Blackman window.
44. The method of claim 42, wherein the window function comprises a plurality of window samples, and wherein overlapped and added window samples of a plurality of window functions, when weighted by complex weights and shifted with a hop size of Sh, form a substantially constant sequence.
45. The method of claim 44, wherein consecutive complex weights differ only by a fixed phase rotation.
46. The method of claim 45, wherein the phase rotation is proportional to a fundamental frequency of the input signal.
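The "substantially constant sequence" condition on overlapped and added window samples can be checked numerically. The sketch below covers only the simplest special case, in which all complex weights equal 1 (zero phase rotation), using a periodic Hann window and a hop of half the window length. The function names are hypothetical, and the general case with non-trivial phase rotation is not modelled here.

```python
import math


def periodic_hann(L):
    return [0.5 - 0.5 * math.cos(2 * math.pi * k / L) for k in range(L)]


def weighted_window_sum(window, hop, weights):
    """Sum of copies of the window, each scaled by a complex weight and shifted by hop."""
    L = len(window)
    total = [0j] * (hop * (len(weights) - 1) + L)
    for l, c in enumerate(weights):
        for k, w in enumerate(window):
            total[l * hop + k] += c * w
    return total


# Six unit-weighted Hann windows of length 8 shifted by a hop of 4 samples.
total = weighted_window_sum(periodic_hann(8), hop=4, weights=[1.0] * 6)
steady_state = total[4:24]   # region where exactly two windows overlap
```

In the steady-state region every summed sample equals 1, because two Hann windows offset by half their length add up to a constant.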
47. The method of any one of claims 35 to 46, wherein said determining the synthesis subband signal includes overlapping consecutive frames of processed samples by applying a hop size equal to the block hop size h times the subband stretch factor S.
48. The method of any one of claims 35 to 47, wherein:
a frame of processed samples is based on Y = 2 corresponding frames of input samples; and
the magnitude of the processed sample is determined as a mean value of the magnitude of the corresponding input sample in a first frame of input samples and the magnitude of the corresponding input sample in a second frame of input samples.
49. The method of claim 48, wherein said mean value of magnitudes is a weighted geometric mean value.
50. The method of claim 49, wherein geometric magnitude weighting parameters are p and 1 - p, where p is a real number inversely proportional to the subband transposition factor Q.
51. The method of any one of claims 35 to 50, wherein:
a frame of processed samples is based on Y = 2 corresponding frames of input samples; and
wherein the phase of the processed sample is determined as a linear combination, with non-negative integer coefficients (T_1, T_2), of respective phases of the corresponding input sample in a first and second frame of input samples.
52. The method of claim 51, wherein the sum of said non-negative integer coefficients is the product Q × S of the stretch factor and the transposition factor.
53. The method of claim 51, wherein the phase of the processed sample corresponds to said linear combination plus a phase correction parameter Θ.
54. The method of any one of claims 35 to 53, wherein at least one input sample is derived by interpolating two or more analysis samples.
55. The method of any one of claims 35 to 54, further comprising receiving control data to be taken into account in said generating a frame of processed samples.
56. The method of claim 55, wherein:
a frame of processed samples is based on Y = 2 corresponding frames of input samples;
said control data include a fundamental frequency Ω_0 of the input signal; and
the two analysis subbands, from which the input samples in each frame are extracted, represent frequencies differing by the fundamental frequency.
57. The method of any one of claims 35 to 56, wherein said generating a frame of processed samples comprises:
rescaling a magnitude of at least one input sample; and
computing a processed sample as a weighted complex product (\prod_{m \notin M} u_m^{α_m} \prod_{m \in M} v_m^{α_m}) of factors equal to the corresponding input sample in at least two of the Y frames of input samples, wherein at least one of the factors (v_m = u_m / |u_m|^{β_m}, m ∈ M ≠ ∅) is an input sample with a rescaled magnitude.
58. The method of any one of claims 35 to 57, comprising generating a plurality of intermediate synthesis subband signals, wherein each is generated on the basis of a plurality of corresponding frames of input samples and using a different value of the subband transposition factor Q and/or of the subband stretch factor S,
wherein said determining the synthesis subband signal includes merging corresponding intermediate synthesis subband signals.
59. The method of claim 58, further comprising:
decoding a bit stream to obtain the input signal, from which the analysis subband signals are to be derived; and
applying spectral band information derived from the bit stream to the synthesis subband signal, such as by performing spectral shaping of the synthesis subband signal.
60. The method of claim 58, wherein at least one of the intermediate synthesis subband signals is generated by direct subband processing, on the basis of one analysis subband signal and using a subband transposition factor Q and a subband stretch factor S, and at least one of the intermediate synthesis subband signals is generated by cross-product processing, on the basis of two analysis subband signals using a subband transposition factor Q and a subband stretch factor S, which factors are independent of the first two factors.
61. The method of claim 60, wherein said generating an intermediate synthesis subband signal by cross-product processing is suspended responsive to one of the following conditions being satisfied:
a) the ratio of the magnitude M_S of the direct source term analysis subband yielding the synthesis subband and the least magnitude M_C in an optimal pair of cross source terms yielding the synthesis subband is greater than a predetermined constant q;
b) the synthesis subband has a significant contribution from a direct processing unit;
c) a fundamental frequency Ω_0 is smaller than the analysis filter bank spacing Δf_A.
62. The method of any one of claims 35 to 61, wherein:
Y × Z analysis subband signals are derived;
Y × Z frames of input samples are formed;
Y × Z corresponding frames of input samples are used to generate Z frames of processed samples;
Z synthesis subband signals are determined; and
Z time stretched and/or frequency transposed signals are generated.
63. A data carrier storing computer-readable instructions for performing the method set forth in any one of claims 35 to 62.
PCT/EP2011/065318 2010-09-16 2011-09-05 Cross product enhanced subband block based harmonic transposition WO2012034890A1 (en)

Priority Applications (63)

Application Number Priority Date Filing Date Title
DK11763872.6T DK2617035T3 (en) 2010-09-16 2011-09-05 CROSS-PRODUCT-ENHANCED SUBBOND BLOCK BASED HARMONIC TRANSPOSITION
IL291501A IL291501B2 (en) 2010-09-16 2011-09-05 Method and system for cross product enhanced subband block based harmonic transposition
KR1020237026369A KR102694615B1 (en) 2010-09-16 2011-09-05 Cross product enhanced subband block based harmonic transposition
JP2013528595A JP5951614B2 (en) 2010-09-16 2011-09-05 Signal generation system and signal generation method
KR1020177014269A KR101863035B1 (en) 2010-09-16 2011-09-05 Cross product enhanced subband block based harmonic transposition
IL296448A IL296448A (en) 2010-09-16 2011-09-05 Method and system for cross product enhanced subband block based harmonic transposition
UAA201304657A UA105988C2 (en) 2010-09-16 2011-09-05 Cross product enhanced subband block based harmonic transposition
KR1020227029790A KR102564590B1 (en) 2010-09-16 2011-09-05 Cross product enhanced subband block based harmonic transposition
MX2013002876A MX2013002876A (en) 2010-09-16 2011-09-05 Cross product enhanced subband block based harmonic transposition.
SG2013011804A SG188229A1 (en) 2010-09-16 2011-09-05 Cross product enhanced subband block based harmonic transposition
EP21204206.3A EP3975178B1 (en) 2010-09-16 2011-09-05 Cross product enhanced subband block based harmonic transposition
KR1020197023879A KR102073544B1 (en) 2010-09-16 2011-09-05 Cross product enhanced subband block based harmonic transposition
US13/822,601 US9172342B2 (en) 2010-09-16 2011-09-05 Cross product enhanced subband block based harmonic transposition
KR1020187033935A KR101980070B1 (en) 2010-09-16 2011-09-05 Cross product enhanced subband block based harmonic transposition
IL303921A IL303921B1 (en) 2010-09-16 2011-09-05 Method and system for cross product enhanced subband block based harmonic transposition
KR1020147026155A KR101744621B1 (en) 2010-09-16 2011-09-05 Cross product enhanced subband block based harmonic transposition
IL313284A IL313284A (en) 2010-09-16 2011-09-05 Method and system for cross product enhanced subband block based harmonic transposition
KR1020197013601A KR102014696B1 (en) 2010-09-16 2011-09-05 Cross product enhanced subband block based harmonic transposition
BR122019025115-0A BR122019025115B1 (en) 2010-09-16 2011-09-05 SYSTEM AND METHOD FOR GENERATING AN EXTENDED TIME AND / OR FREQUENCY SIGN TRANSPOSED FROM AN ENTRY SIGNAL AND STORAGE MEDIA LEGIBLE BY NON-TRANSITIONAL COMPUTER
EP11763872.6A EP2617035B1 (en) 2010-09-16 2011-09-05 Cross product enhanced subband block based harmonic transposition
AU2011304113A AU2011304113C1 (en) 2010-09-16 2011-09-05 Cross product enhanced subband block based harmonic transposition
BR122019025142-8A BR122019025142B1 (en) 2010-09-16 2011-09-05 SYSTEM AND METHOD FOR GENERATING AN EXTENDED TIME SIGNAL AND / OR A TRANSPOSED FREQUENCY SIGNAL FROM AN ENTRY SIGNAL AND STORAGE MEDIA LEGIBLE BY NON-TRANSITIONAL COMPUTER
PL11763872T PL2617035T3 (en) 2010-09-16 2011-09-05 Cross product enhanced subband block based harmonic transposition
BR112013005676-2A BR112013005676B1 (en) 2010-09-16 2011-09-05 system and method for generating an elongated time signal and / or a transposed frequency signal from an input and data carrier signal and non-transitory computer-readable storage medium
ES11763872T ES2699750T3 (en) 2010-09-16 2011-09-05 Harmonic transposition based on cross-product improved sub-band block
KR1020247026002A KR20240122593A (en) 2010-09-16 2011-09-05 Cross product enhanced subband block based harmonic transposition
EP22202637.9A EP4148732B1 (en) 2010-09-16 2011-09-05 Cross product enhanced subband block based harmonic transposition
KR1020207002646A KR102312475B1 (en) 2010-09-16 2011-09-05 Cross product enhanced subband block based harmonic transposition
EP22202639.5A EP4145445B1 (en) 2010-09-16 2011-09-05 Cross product enhanced subband block based harmonic transposition
KR1020217032100A KR102439053B1 (en) 2010-09-16 2011-09-05 Cross product enhanced subband block based harmonic transposition
KR1020137009361A KR101610626B1 (en) 2010-09-16 2011-09-05 Cross product enhanced subband block based harmonic transposition
EP21204205.5A EP3975177B1 (en) 2010-09-16 2011-09-05 Cross product enhanced subband block based harmonic transposition
KR1020187014134A KR101924326B1 (en) 2010-09-16 2011-09-05 Cross product enhanced subband block based harmonic transposition
BR122019025121-5A BR122019025121B1 (en) 2010-09-16 2011-09-05 SYSTEM AND METHOD FOR GENERATING AN EXTENDED TIME SIGNAL AND / OR A TRANSPOSED FREQUENCY SIGNAL FROM AN ENTRY SIGNAL AND STORAGE MEDIA LEGIBLE BY NON-TRANSITIONAL COMPUTER
CN201180044307.6A CN103262164B (en) 2010-09-16 2011-09-05 Cross product enhanced subband block based harmonic transposition
EP18198247.1A EP3503100A1 (en) 2010-09-16 2011-09-05 Cross product enhanced subband block based harmonic transposition
RU2013117038/08A RU2551817C2 (en) 2010-09-16 2011-09-05 Cross product-enhanced, subband block-based harmonic transposition
IL298230A IL298230B2 (en) 2010-09-16 2011-09-05 Method and system for cross product enhanced subband block based harmonic transposition
CA2808353A CA2808353C (en) 2010-09-16 2011-09-05 Cross product enhanced subband block based harmonic transposition
IL224785A IL224785A (en) 2010-09-16 2013-02-18 Method and system for cross product enhanced subband block based harmonic transposition
AU2015202647A AU2015202647B2 (en) 2010-09-16 2015-05-15 Cross product enhanced subband block based harmonic transposition
IL240068A IL240068A (en) 2010-09-16 2015-07-21 Method and system for cross product enhanced subband block based harmonic transposition
US14/854,498 US9735750B2 (en) 2010-09-16 2015-09-15 Cross product enhanced subband block based harmonic transposition
US15/480,859 US9940941B2 (en) 2010-09-16 2017-04-06 Cross product enhanced subband block based harmonic transposition
AU2017204074A AU2017204074C1 (en) 2010-09-16 2017-06-16 Cross Product Enhanced Subband Block Based Harmonic Transposition
IL253387A IL253387B (en) 2010-09-16 2017-07-10 Method and system for cross product enhanced subband block based harmonic transposition
US15/904,702 US10192562B2 (en) 2010-09-16 2018-02-26 Cross product enhanced subband block based harmonic transposition
IL259070A IL259070A (en) 2010-09-16 2018-05-01 Method and system for cross product enhanced subband block based harmonic transposition
AU2018241064A AU2018241064B2 (en) 2010-09-16 2018-10-03 Cross Product Enhanced Subband Block Based Harmonic Transposition
US16/211,563 US10446161B2 (en) 2010-09-16 2018-12-06 Cross product enhanced subband block based harmonic transposition
IL265722A IL265722B (en) 2010-09-16 2019-03-31 Method and system for cross product enhanced subband block based harmonic transposition
US16/545,359 US10706863B2 (en) 2010-09-16 2019-08-20 Cross product enhanced subband block based harmonic transposition
AU2020200340A AU2020200340B2 (en) 2010-09-16 2020-01-17 Cross Product Enhanced Subband Block Based Harmonic Transposition
US16/917,171 US11355133B2 (en) 2010-09-16 2020-06-30 Cross product enhanced subband block based harmonic transposition
IL278478A IL278478B (en) 2010-09-16 2020-11-04 Method and system for cross product enhanced subband block based harmonic transposition
AU2021200095A AU2021200095B2 (en) 2010-09-16 2021-01-08 Cross Product Enhanced Subband Block Based Harmonic Transposition
IL285298A IL285298B (en) 2010-09-16 2021-08-02 Method and system for cross product enhanced subband block based harmonic transposition
AU2022201270A AU2022201270B2 (en) 2010-09-16 2022-02-24 Cross Product Enhanced Subband Block Based Harmonic Transposition
US17/829,733 US11817110B2 (en) 2010-09-16 2022-06-01 Cross product enhanced subband block based harmonic transposition
AU2023201183A AU2023201183B2 (en) 2010-09-16 2023-02-28 Cross Product Enhanced Subband Block Based Harmonic Transposition
US18/376,913 US12033645B2 (en) 2010-09-16 2023-10-05 Cross product enhanced subband block based harmonic transposition
US18/675,865 US20240312470A1 (en) 2010-09-16 2024-05-28 Cross product enhanced subband block based harmonic transposition
AU2024204430A AU2024204430A1 (en) 2010-09-16 2024-06-27 Cross Product Enhanced Subband Block Based Harmonic Transposition

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US38344110P 2010-09-16 2010-09-16
US61/383,441 2010-09-16
US41916410P 2010-12-02 2010-12-02
US61/419,164 2010-12-02

Related Child Applications (2)

Application Number Title Priority Date Filing Date
US13/822,601 A-371-Of-International US9172342B2 (en) 2010-09-16 2011-09-05 Cross product enhanced subband block based harmonic transposition
US14/854,498 Continuation US9735750B2 (en) 2010-09-16 2015-09-15 Cross product enhanced subband block based harmonic transposition

Publications (1)

Publication Number Publication Date
WO2012034890A1 true WO2012034890A1 (en) 2012-03-22

Family

ID=44720852

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2011/065318 WO2012034890A1 (en) 2010-09-16 2011-09-05 Cross product enhanced subband block based harmonic transposition

Country Status (18)

Country Link
US (10) US9172342B2 (en)
EP (6) EP3975178B1 (en)
JP (10) JP5951614B2 (en)
KR (12) KR102014696B1 (en)
CN (2) CN104851429B (en)
AU (1) AU2011304113C1 (en)
BR (4) BR112013005676B1 (en)
CA (10) CA3191597C (en)
CL (1) CL2013000717A1 (en)
DK (3) DK3975178T3 (en)
ES (3) ES2938725T3 (en)
IL (12) IL296448A (en)
MX (1) MX2013002876A (en)
MY (2) MY155990A (en)
PL (4) PL2617035T3 (en)
RU (6) RU2671619C2 (en)
SG (3) SG188229A1 (en)
WO (1) WO2012034890A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9514767B2 (en) 2012-07-02 2016-12-06 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Device, method and computer program for freely selectable frequency shifts in the subband domain

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8958510B1 (en) * 2010-06-10 2015-02-17 Fredric J. Harris Selectable bandwidth filter
TWI557727B (en) 2013-04-05 2016-11-11 杜比國際公司 An audio processing system, a multimedia processing system, a method of processing an audio bitstream and a computer program product
KR101782916B1 (en) 2013-09-17 2017-09-28 주식회사 윌러스표준기술연구소 Method and apparatus for processing audio signals
WO2015060654A1 (en) 2013-10-22 2015-04-30 한국전자통신연구원 Method for generating filter for audio signal and parameterizing device therefor
US10163447B2 (en) * 2013-12-16 2018-12-25 Qualcomm Incorporated High-band signal modeling
WO2015099429A1 (en) 2013-12-23 2015-07-02 주식회사 윌러스표준기술연구소 Audio signal processing method, parameterization device for same, and audio signal processing device
CN108600935B (en) 2014-03-19 2020-11-03 韦勒斯标准与技术协会公司 Audio signal processing method and apparatus
KR101856127B1 (en) 2014-04-02 2018-05-09 주식회사 윌러스표준기술연구소 Audio signal processing method and device
US9306606B2 (en) * 2014-06-10 2016-04-05 The Boeing Company Nonlinear filtering using polyphase filter banks
WO2016142002A1 (en) * 2015-03-09 2016-09-15 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Audio encoder, audio decoder, method for encoding an audio signal and method for decoding an encoded audio signal
TWI693594B (en) 2015-03-13 2020-05-11 瑞典商杜比國際公司 Decoding audio bitstreams with enhanced spectral band replication metadata in at least one fill element
EP3171362B1 (en) * 2015-11-19 2019-08-28 Harman Becker Automotive Systems GmbH Bass enhancement and separation of an audio signal into a harmonic and transient signal component
CN110266287B (en) * 2019-05-05 2023-06-23 深圳信息职业技术学院 Method for constructing fractional delay filter of electronic cochlea, storage medium and electronic cochlea
US10938444B2 (en) * 2019-07-12 2021-03-02 Avago Technologies International Sales Pte. Limited Apparatus and method for noise reduction in a full duplex repeater
US11344298B2 (en) 2019-12-06 2022-05-31 Covidien Lp Surgical stapling device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010081892A2 (en) * 2009-01-16 2010-07-22 Dolby Sweden Ab Cross product enhanced harmonic transposition
WO2010086461A1 (en) * 2009-01-28 2010-08-05 Dolby International Ab Improved harmonic transposition

Family Cites Families (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5774837A (en) 1995-09-13 1998-06-30 Voxware, Inc. Speech coding system and method using voicing probability determination
US5956674A (en) * 1995-12-01 1999-09-21 Digital Theater Systems, Inc. Multi-channel predictive subband audio coder using psychoacoustic adaptive bit allocation in frequency, time and over the multiple channels
SE512719C2 (en) 1997-06-10 2000-05-02 Lars Gustaf Liljeryd A method and apparatus for reducing data flow based on harmonic bandwidth expansion
US6266003B1 (en) * 1998-08-28 2001-07-24 Sigma Audio Research Limited Method and apparatus for signal processing for time-scale and/or pitch modification of audio signals
US7272556B1 (en) 1998-09-23 2007-09-18 Lucent Technologies Inc. Scalable and embedded codec for speech and audio signals
US6782360B1 (en) * 1999-09-22 2004-08-24 Mindspeed Technologies, Inc. Gain quantization for a CELP speech coder
JP3518737B2 (en) * 1999-10-25 2004-04-12 日本ビクター株式会社 Audio encoding device, audio encoding method, and audio encoded signal recording medium
SE0004163D0 (en) 2000-11-14 2000-11-14 Coding Technologies Sweden Ab Enhancing perceptual performance or high frequency reconstruction coding methods by adaptive filtering
CA2365203A1 (en) * 2001-12-14 2003-06-14 Voiceage Corporation A signal modification method for efficient coding of speech signals
JP3537807B2 (en) * 2002-02-28 2004-06-14 株式会社神戸製鋼所 Digital data processing apparatus and method
EP1543307B1 (en) * 2002-09-19 2006-02-22 Matsushita Electric Industrial Co., Ltd. Audio decoding apparatus and method
SE0301273D0 (en) 2003-04-30 2003-04-30 Coding Technologies Sweden Ab Advanced processing based on a complex exponential-modulated filter bank and adaptive time signaling methods
RU2374703C2 (en) 2003-10-30 2009-11-27 Конинклейке Филипс Электроникс Н.В. Coding or decoding of audio signal
US7460990B2 (en) 2004-01-23 2008-12-02 Microsoft Corporation Efficient coding of digital media spectral data using wide-sense perceptual similarity
US20090299756A1 (en) * 2004-03-01 2009-12-03 Dolby Laboratories Licensing Corporation Ratio of speech to non-speech audio such as for elderly or hearing-impaired listeners
KR100608062B1 (en) * 2004-08-04 2006-08-02 삼성전자주식회사 Method and apparatus for decoding high frequency of audio data
JP5129117B2 (en) 2005-04-01 2013-01-23 クゥアルコム・インコーポレイテッド Method and apparatus for encoding and decoding a high-band portion of an audio signal
US20070078645A1 (en) * 2005-09-30 2007-04-05 Nokia Corporation Filterbank-based processing of speech signals
EP4178110B1 (en) 2006-01-27 2024-04-24 Dolby International AB Efficient filtering with a complex modulated filterbank
JP2007316254A (en) * 2006-05-24 2007-12-06 Sony Corp Audio signal interpolation method and audio signal interpolation device
EP2054876B1 (en) 2006-08-15 2011-10-26 Broadcom Corporation Packet loss concealment for sub-band predictive coding based on extrapolation of full-band audio waveform
JP4918841B2 (en) * 2006-10-23 2012-04-18 富士通株式会社 Encoding system
EP3296992B1 (en) * 2008-03-20 2021-09-22 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus and method for modifying a parameterized representation
US9142221B2 (en) * 2008-04-07 2015-09-22 Cambridge Silicon Radio Limited Noise reduction
KR101239812B1 (en) * 2008-07-11 2013-03-06 프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝 에. 베. Apparatus and method for generating a bandwidth extended signal
EP2214165A3 (en) 2009-01-30 2010-09-15 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus, method and computer program for manipulating an audio signal comprising a transient event
KR102020334B1 (en) * 2010-01-19 2019-09-10 돌비 인터네셔널 에이비 Improved subband block based harmonic transposition
ES2522171T3 (en) * 2010-03-09 2014-11-13 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus and method for processing an audio signal using patching edge alignment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HUAN ZHOU ET AL: "Finalization of CE on QMF based harmonic transposer", 93. MPEG MEETING; 26-7-2010 - 30-7-2010; GENEVA; (MOTION PICTURE EXPERT GROUP OR ISO/IEC JTC1/SC29/WG11), no. M17807, 22 July 2010 (2010-07-22), XP030046397 *
ZHONG HAISHAN ET AL: "QMF based harmonic spectral band replication", AES 131ST CONVENTION, 23 October 2011 (2011-10-23), New York, NY, USA, pages 1 - 9, XP055013644, Retrieved from the Internet <URL:http://www.aes.org/tmpFiles/elib/20111201/16043.pdf> [retrieved on 20111201] *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9514767B2 (en) 2012-07-02 2016-12-06 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Device, method and computer program for freely selectable frequency shifts in the subband domain

Also Published As

Publication number Publication date
IL291501B (en) 2022-12-01
EP3503100A1 (en) 2019-06-26
IL303921B1 (en) 2024-07-01
KR101924326B1 (en) 2018-12-03
KR20140132370A (en) 2014-11-17
CA3137515A1 (en) 2012-03-22
IL224785A (en) 2015-08-31
PL2617035T3 (en) 2019-02-28
MX2013002876A (en) 2013-04-08
ES2933477T3 (en) 2023-02-09
JP6755426B2 (en) 2020-09-16
CA3043428C (en) 2020-02-18
IL291501B2 (en) 2023-04-01
US20180182404A1 (en) 2018-06-28
EP2617035B1 (en) 2018-10-03
IL240068A0 (en) 2015-08-31
CA2961088A1 (en) 2012-03-22
IL285298B (en) 2022-04-01
DK2617035T3 (en) 2019-01-02
JP2024138185A (en) 2024-10-07
US20170213563A1 (en) 2017-07-27
IL253387B (en) 2018-06-28
JP6429966B2 (en) 2018-11-28
KR101744621B1 (en) 2017-06-09
US20240046940A1 (en) 2024-02-08
PL3975178T3 (en) 2023-03-13
IL303921A (en) 2023-08-01
RU2671619C2 (en) 2018-11-02
US20160006406A1 (en) 2016-01-07
KR102564590B1 (en) 2023-08-09
KR20240122593A (en) 2024-08-12
IL278478B (en) 2021-08-31
US9940941B2 (en) 2018-04-10
KR102312475B1 (en) 2021-10-14
EP4145445B1 (en) 2024-08-28
US10192562B2 (en) 2019-01-29
CL2013000717A1 (en) 2013-07-05
EP4148732A1 (en) 2023-03-15
RU2013117038A (en) 2014-10-27
JP6218889B2 (en) 2017-10-25
CA2808353A1 (en) 2012-03-22
EP2617035A1 (en) 2013-07-24
CA3168514A1 (en) 2012-03-22
RU2694587C1 (en) 2019-07-16
KR20200013092A (en) 2020-02-05
US20240312470A1 (en) 2024-09-19
MY155990A (en) 2015-12-31
MY176574A (en) 2020-08-17
JP2023086885A (en) 2023-06-22
SG188229A1 (en) 2013-04-30
US20200395025A1 (en) 2020-12-17
CN103262164A (en) 2013-08-21
KR102694615B1 (en) 2024-08-14
EP3975177B1 (en) 2022-12-14
PL4148732T3 (en) 2024-08-19
JP6849847B2 (en) 2021-03-31
US11355133B2 (en) 2022-06-07
IL285298A (en) 2021-09-30
KR102014696B1 (en) 2019-08-27
KR102439053B1 (en) 2022-09-02
AU2011304113B2 (en) 2015-02-26
JP7273218B2 (en) 2023-05-12
KR102073544B1 (en) 2020-02-05
CA3067155A1 (en) 2012-03-22
AU2011304113A1 (en) 2013-03-07
BR122019025115B1 (en) 2021-04-13
JP2013537322A (en) 2013-09-30
ES2938725T3 (en) 2023-04-14
RU2720495C1 (en) 2020-04-30
CN104851429B (en) 2018-10-19
KR101980070B1 (en) 2019-05-20
JP7053912B6 (en) 2022-05-16
CN103262164B (en) 2015-06-17
IL259070A (en) 2018-06-28
DK3975178T3 (en) 2022-12-05
US11817110B2 (en) 2023-11-14
IL265722A (en) 2019-05-30
US20190108850A1 (en) 2019-04-11
US9735750B2 (en) 2017-08-15
JP2020106867A (en) 2020-07-09
EP3975177A1 (en) 2022-03-30
IL265722B (en) 2020-11-30
JP7053912B2 (en) 2022-04-12
CA3168514C (en) 2023-04-11
JP2016173603A (en) 2016-09-29
KR20190099092A (en) 2019-08-23
KR20210124538A (en) 2021-10-14
IL296448A (en) 2022-11-01
US10446161B2 (en) 2019-10-15
KR20170060191A (en) 2017-05-31
KR20180058847A (en) 2018-06-01
KR20130081290A (en) 2013-07-16
CA3067155C (en) 2021-01-19
JP7537723B2 (en) 2024-08-21
KR101863035B1 (en) 2018-06-01
EP3975178B1 (en) 2022-11-16
IL291501A (en) 2022-05-01
US20220293113A1 (en) 2022-09-15
AU2011304113C1 (en) 2015-08-06
BR122019025121B1 (en) 2021-04-27
RU2020111638A (en) 2021-09-20
BR122019025142B1 (en) 2021-04-27
CN104851429A (en) 2015-08-19
US12033645B2 (en) 2024-07-09
RU2015105671A (en) 2015-08-20
US9172342B2 (en) 2015-10-27
KR20190053306A (en) 2019-05-17
KR20230119038A (en) 2023-08-14
JP2022088591A (en) 2022-06-14
IL253387A0 (en) 2017-09-28
JP2020190757A (en) 2020-11-26
RU2551817C2 (en) 2015-05-27
CA3220202A1 (en) 2012-03-22
DK3975177T3 (en) 2023-01-30
US20130182870A1 (en) 2013-07-18
CA3239279A1 (en) 2012-03-22
ES2699750T3 (en) 2019-02-12
US20190378525A1 (en) 2019-12-12
RU2682340C1 (en) 2019-03-19
SG10201506914PA (en) 2015-10-29
CA3191597A1 (en) 2012-03-22
EP3975178A1 (en) 2022-03-30
JP5951614B2 (en) 2016-07-13
CA3102325C (en) 2021-12-21
CA2808353C (en) 2017-05-02
CA3102325A1 (en) 2012-03-22
PL3975177T3 (en) 2023-04-11
CA2961088C (en) 2019-07-02
JP6736634B2 (en) 2020-08-05
KR20180128983A (en) 2018-12-04
EP4148732B1 (en) 2024-06-26
EP4148732C0 (en) 2024-06-26
IL298230B1 (en) 2023-07-01
JP2019012295A (en) 2019-01-24
KR101610626B1 (en) 2016-04-20
CA3043428A1 (en) 2012-03-22
JP2018022178A (en) 2018-02-08
BR112013005676B1 (en) 2021-02-09
RU2685993C1 (en) 2019-04-23
RU2015105671A3 (en) 2018-08-27
EP4145445A1 (en) 2023-03-08
KR20220123752A (en) 2022-09-08
IL313284A (en) 2024-08-01
US10706863B2 (en) 2020-07-07
IL298230A (en) 2023-01-01
CA3137515C (en) 2022-09-20
BR112013005676A2 (en) 2016-05-03
IL298230B2 (en) 2023-11-01
JP2021081754A (en) 2021-05-27
CA3191597C (en) 2024-01-02
SG10202103492XA (en) 2021-05-28
IL240068A (en) 2017-08-31

Similar Documents

Publication Publication Date Title
US11817110B2 (en) Cross product enhanced subband block based harmonic transposition
AU2018241064B2 (en) Cross Product Enhanced Subband Block Based Harmonic Transposition

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 11763872

Country of ref document: EP

Kind code of ref document: A1

DPE2 Request for preliminary examination filed before expiration of 19th month from priority date (PCT application filed from 20040101)
ENP Entry into the national phase

Ref document number: 2808353

Country of ref document: CA

WWE WIPO information: entry into national phase

Ref document number: 2011763872

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2011304113

Country of ref document: AU

Date of ref document: 20110905

Kind code of ref document: A

WWE WIPO information: entry into national phase

Ref document number: 13822601

Country of ref document: US

ENP Entry into the national phase

Ref document number: 2013528595

Country of ref document: JP

Kind code of ref document: A

WWE WIPO information: entry into national phase

Ref document number: MX/A/2013/002876

Country of ref document: MX

WWE WIPO information: entry into national phase

Ref document number: 2013000717

Country of ref document: CL

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 20137009361

Country of ref document: KR

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 2013117038

Country of ref document: RU

Kind code of ref document: A

REG Reference to national code

Ref country code: BR

Ref legal event code: B01A

Ref document number: 112013005676

Country of ref document: BR

WWE WIPO information: entry into national phase

Ref document number: 240068

Country of ref document: IL

ENP Entry into the national phase

Ref document number: 112013005676

Country of ref document: BR

Kind code of ref document: A2

Effective date: 20130308

WWE WIPO information: entry into national phase

Ref document number: IDP00201603151

Country of ref document: ID