EP2907324B1 - System and method for reducing latency in transposer-based virtual bass systems - Google Patents

Info

Publication number
EP2907324B1
Authority
EP
European Patent Office
Prior art keywords
frequency
signal
cqmf
virtual bass
audio signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
EP13771123.0A
Other languages
English (en)
French (fr)
Other versions
EP2907324A1 (de
Inventor
Per Ekstrand
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dolby International AB
Original Assignee
Dolby International AB
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US13/652,023 (US8971551B2)
Application filed by Dolby International AB
Publication of EP2907324A1
Application granted
Publication of EP2907324B1
Current legal status: Active
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04R: LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00: Circuits for transducers, loudspeakers or microphones
    • H04R3/04: Circuits for transducers, loudspeakers or microphones for correcting frequency response
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00: Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02: Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/038: Speech enhancement, e.g. noise reduction or echo cancellation using band spreading techniques
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04R: LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R2430/00: Signal processing covered by H04R, not provided for in its groups
    • H04R2430/03: Synergistic effects of band splitting and sub-band processing

Definitions

  • One or more embodiments relate generally to transform-based audio signal processing, and more specifically to reducing latency in transposer-based virtual bass synthesis systems.
  • Bass synthesis refers to methods of adding components to the low frequency range of a signal in order to enhance the perceived bass.
  • a sub-bass synthesis technique creates low frequency components below the existing partials of a signal in order to extend and improve the lowest frequency range present in the subject audio content.
  • Another method uses virtual pitch algorithms that generate audible harmonics from an inaudible bass range (e.g., low pitched bass played through small loudspeakers), hence making the harmonics, and ultimately also the pitch, audible in order to improve the bass response.
  • Virtual bass synthesis is a virtual pitch method that increases the perceived level of bass content in audio when played on small loudspeakers that cannot physically reproduce the low-end bass frequencies.
  • the method is based on the 'missing fundamental' psycho-acoustic observation that low pitches can be inferred by the human auditory system from upper harmonics even when the fundamental and the first harmonics themselves are missing.
  • the basic method of functionality is to analyze the bass frequencies present in the audio and generate audible upper harmonics that aid the perception of the missing lower frequencies.
  • a main feature of virtual bass is that it enhances the perceived bass response on devices with small speakers by synthesizing upper harmonics for frequencies below the low-frequency roll-off of the device (e.g., below 150 Hz).
  • FIG. 1A shows the frequency-amplitude spectrum of an audio signal having an inaudible range 10 of frequency components, and an audible range of frequency components above the inaudible range.
  • Harmonic transposition of frequency components in the inaudible range 10 can generate transposed frequency components in portion 11 of the audible range, which can enhance the perceived level of bass content of the audio signal during playback.
  • Such harmonic transposition may include application of multiple transposition factors to each relevant frequency component of the input audio signal to generate multiple harmonics of the component.
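  • The idea of applying several transposition factors to one inaudible component can be illustrated with a minimal sketch. The function name, the 50 Hz component, and the 150 Hz roll-off used in the usage lines are illustrative assumptions, not values prescribed by the text beyond the "below 150 Hz" example.

```python
def audible_harmonics(f0_hz, rolloff_hz, max_order):
    """List (order, frequency) pairs for harmonics of f0_hz that the
    loudspeaker can reproduce, i.e. those at or above rolloff_hz.

    Orders start at 2 because the first harmonic is the inaudible
    fundamental itself. All names here are hypothetical.
    """
    harmonics = []
    for order in range(2, max_order + 1):
        freq = order * f0_hz
        if freq >= rolloff_hz:
            harmonics.append((order, freq))
    return harmonics


# A 50 Hz component on a device that rolls off below 150 Hz,
# using transposition factors 2, 3 and 4.
print(audible_harmonics(50.0, 150.0, 4))
```

In this sketch the second harmonic (100 Hz) still falls inside the inaudible range, so only the third and fourth harmonics contribute to the perceived pitch.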
  • the delay or latency associated with the frequency transposition function can be excessive for certain applications.
  • A digital audio processing system that has a latency of 1025 samples may use a legacy virtual bass system that adds an additional 3200 samples of delay. This can cause the total delay to exceed 88 milliseconds, given a sampling frequency (f_s) of 48 kHz. This amount of latency is generally problematic and even prohibitive for gaming and telecommunications applications, where a latency of about 100 milliseconds starts to become noticeable in terms of audible signal delay.
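  • The 88 millisecond figure follows directly from the sample counts quoted above. A short check, with hypothetical variable and function names:

```python
def latency_ms(delay_samples, sample_rate_hz):
    """Convert a delay expressed in samples to milliseconds."""
    return 1000.0 * delay_samples / sample_rate_hz


system_delay = 1025        # samples, host processing system (from the text)
virtual_bass_delay = 3200  # samples, legacy virtual bass stage (from the text)

total_ms = latency_ms(system_delay + virtual_bass_delay, 48_000)
print(f"{total_ms:.2f} ms")
```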
  • FIG. 1B illustrates the delay associated with symmetric windows used in legacy virtual bass systems, as known in the prior art.
  • FIG. 1B graphically illustrates the delay imposed by a second-order transposer, i.e., a transposer that generates second-order harmonics.
  • The center of one of the stylized symmetric analysis windows is chosen as the time zero reference, and new input samples 104 can be added from time t_0 in the analysis phase 102, assuming a time stride S_A of the analysis windows.
  • Time plot 110 shows the time stretch duality of the transposer, where t_0 is stretched to 2·t_0 in the synthesis phase 112.
  • The input signal to the CQMF (Complex Quadrature Mirror Filter) analysis stage and the output signal from the CQMF synthesis stage generally both have the same sampling frequency f_s, where f_s is usually set to 44.1 or 48 kHz.
  • The input signal sampling rate to the virtual bass process may be f_s/64 since the system is usually processing the first CQMF signal only from a 64-channel CQMF bank. It should be noted that CQMF sizes other than 64 channels could also be used.
  • The transposed output from the legacy virtual bass processing system has a sampling frequency of 2·f_s/64 because of the combined transposition function using a factor two base transposition factor, resulting in a factor two bandwidth expansion.
  • the base transposition factor is the factor where the source transform bins (or frequency bands) are mapped in a one-to-one relationship to the target transform bins (or frequency bands), i.e., there is no interpolation or decimation involved in the source to target bin mapping.
  • the base transposition factor also governs the relation between the time strides of the analysis and synthesis windows. More specifically, the synthesis time stride equals the analysis time stride multiplied by the base transposition factor.
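  • The relations between the base transposition factor, the sampling rates, and the window strides can be summarized in a small sketch. The function name and the analysis stride of 4 samples are assumptions made for illustration; only the f_s/64 input rate, the factor-two legacy case, and the stride relation come from the text.

```python
def transposer_rates(fs_hz, base_factor, analysis_stride, cqmf_channels=64):
    """Return (input_rate_hz, output_rate_hz, synthesis_stride).

    input_rate_hz:    rate of one CQMF sub-band signal, fs / channels
    output_rate_hz:   input rate expanded by the base transposition factor
    synthesis_stride: analysis stride multiplied by the base factor
    """
    input_rate = fs_hz / cqmf_channels
    output_rate = base_factor * input_rate
    synthesis_stride = base_factor * analysis_stride
    return input_rate, output_rate, synthesis_stride


# Legacy case from the text: base transposition factor 2 at 48 kHz.
print(transposer_rates(48_000, 2, 4))
```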
  • Embodiments include a latency reduction system in a virtual bass processing system that performs harmonic transposition on low frequency components of an audio signal to generate transposed data indicative of harmonics.
  • the harmonic transposition process uses a base transposition factor greater than two, and generates the harmonics in response to frequency-domain values determined by transform and inverse transform stages that use asymmetric analysis and synthesis windows.
  • An enhanced audio signal is generated by combining a virtual bass signal with the delayed audio signal through the use of Nyquist analysis filter banks that comprise truncated prototype filters.
  • the virtual bass signal may be allowed to lag the delayed audio signal by a defined time period when combining with the audio signal to further reduce the latency caused by the harmonic transposition process.
  • Embodiments include a method of reducing latency in a virtual bass generation system by performing harmonic transposition on low frequency components of an input audio signal to generate transposed data indicative of harmonics, wherein the harmonic transposition uses a base transposition factor of an integer value greater than two. It generates the harmonics in response to frequency-domain values determined by a time-to-frequency domain transform stage and a subsequent inverse frequency-to-time domain transform stage through the use of asymmetric analysis and synthesis windows for the time-to-frequency domain transform and inverse frequency-to-time domain transforms.
  • the input audio signal is a sub-banded CQMF (complex-valued quadrature mirror filter) signal and samples of the input audio signal may be pre-processed to generate critically sampled audio indicative of the low frequency components.
  • the method processes the input audio signal through an analysis filter bank or transform to provide a set of analysis sub-band signals or frequency bins from the low frequency components, computes a set of synthesis sub-band signals or frequency bins using the base transposition factor B and transposition factor T, and processes the analysis sub-band signals or frequency bins through a synthesis filter bank or transform to generate a high frequency component from the set of synthesis sub-band signals.
  • the method may further include generating a virtual bass signal in response to the transposed data, and generating an enhanced audio signal by combining the virtual bass signal with the input audio signal by applying one or two analysis filter banks to the virtual bass audio output signal, wherein the analysis filter banks comprise truncated prototype filters that have a defined number of filter coefficients removed.
  • the method may yet further include a lag of the virtual bass signal by a pre-defined time period relative to the input audio signal, by combining the virtual bass signal with the input audio signal delayed a pre-defined time period shorter than the processing delay of the virtual bass system would imply, to generate an enhanced audio signal comprising time lagged virtual bass processed sub-band samples combined with delayed input sub-band samples.
  • the base transposition factor under some embodiments extends the input audio signal in the frequency domain to a degree proportionate to the value of the base transposition factor to produce a transposed audio signal, and this base transposition factor may be an even integer value between 4 and 16.
  • the analysis filter banks operating on the transposer CQMF output sub bands comprise an eight-channel Nyquist filter bank and a four-channel Nyquist filter bank, and the defined number of removed prototype filter coefficients comprises six coefficients.
  • the input CQMF signal is routed directly from a preceding CQMF analysis bank channel 0 output, hence bypassing a subsequent Nyquist filter bank stage and so avoiding the related delay.
  • Embodiments of the method may further include generating the low frequency components by performing a frequency domain oversampled transform on the input audio signal by generating windowed and zero-padded samples at a defined sample frequency (using the analysis time stride).
  • the pre-defined time period when combining the virtual bass signal with the delayed input audio signal may be a value selected from the range of 0 samples to 1000 samples, since the virtual bass signal may be allowed to lag the wide band input audio signal up to 20 ms without noticeable degradation of the enhanced audio signal.
  • The asymmetric analysis and synthesis windows are configured such that a longer portion of each analysis window is stretched toward past input samples, and a longer portion of each synthesis window is stretched toward future output samples.
  • Embodiments are also directed to systems or apparatus elements configured to implement at least some of the methods described above.
  • Embodiments of systems and methods are described for reducing latency and algorithmic delays in transposer-based virtual bass systems.
  • Such systems and methods utilize higher-order base transposition factors, low latency asymmetric transform windows, truncated Nyquist prototype filters, a virtual bass signal that is time lagged with respect to the original audio signal, and a bypassed Nyquist analysis filter bank in a preceding Hybrid filter bank stage.
  • the expression performing an operation "on" a signal or data is used in a broad sense to denote performing the operation directly on the signal or data, or on a processed version of the signal or data (e.g., on a version of the signal that has undergone preliminary filtering or pre-processing prior to performance of the operation thereon).
  • The expression "transposer" is used in a broad sense to denote an algorithmic unit or device that performs pitch-shifting or time-stretching of a real or complex-valued input signal, for parts of, or the entire available input signal spectrum.
  • The expressions "transposer", "harmonic transposer", "phase vocoder", "high frequency generator", and "harmonic generator" may be used interchangeably.
  • The term "system" is used in a broad sense to denote a device, system, or subsystem.
  • A subsystem that implements a decoder may be referred to as a decoder system, and a system including such a subsystem (e.g., a system that generates X output signals in response to multiple inputs, in which the subsystem generates M of the inputs and the other X - M inputs are received from an external source) may also be referred to as a decoder system.
  • The term "processor" is used in a broad sense to denote a system or device programmable or otherwise configurable (e.g., with software or firmware) to perform operations on data (e.g., audio, or video or other image data).
  • Examples of processors include a field-programmable gate array (or other configurable integrated circuit or chip set), a digital signal processor programmed and/or otherwise configured to perform pipelined processing on audio or other sound data, a programmable general purpose processor or computer, and a programmable microprocessor chip or chip set.
  • The expressions "audio processor" and "audio processing unit" are used interchangeably, and in a broad sense, to denote a system configured to process audio data.
  • Examples of audio processing units include, but are not limited to, encoders (e.g., transcoders), decoders, vocoders, codecs, pre-processing systems, post-processing systems, and bitstream processing systems (sometimes referred to as bitstream processing tools).
  • Embodiments are directed to systems and methods of decreasing virtual bass delay without requiring substantial changes to existing virtual bass processing components, such as the harmonic transposer used in a virtual bass processing system.
  • Aspects of the virtual bass latency reduction system and method may be used in conjunction with a harmonic generator (transposer) in audio codecs (e.g., in a decoder).
  • Aspects of the virtual bass latency reduction system and method may also be used in conjunction with other transposer or phase vocoder systems, e.g., traditional phase vocoders used for general time-stretching or pitch-shifting of audio signals.
  • virtual bass generation methods using harmonic transposition involve the transposition of frequency components from an inaudible frequency range to an audible frequency range in order to improve playback of bass content in limited playback equipment, such as through small speakers that cannot physically reproduce the missing lower frequencies.
  • Embodiments of the virtual bass latency reduction system and method improve upon virtual bass generation methods that perform harmonic transposition on low frequency components of an audio signal to generate transposed data indicative of harmonics that are expected to be audible during playback, generate a virtual bass signal in response to the transposed data, and generate an enhanced audio signal by combining the virtual bass signal with the (delayed) input audio signal.
  • the enhanced audio signal provides an increased perceived level of bass content during playback of the enhanced audio signal by one or more loudspeakers that cannot physically reproduce the low frequency components.
  • the harmonic transposition performed by the virtual bass generation method employs combined transposition to generate harmonics using a second-order transposer and at least one higher order transposer (typically, a third-order and a fourth-order, and optionally at least one additional higher order transposer) of each of the low frequency components, such that all of the harmonics are generated in response to frequency-domain values determined by a common time-to-frequency domain transform stage (e.g., by performing phase multiplication or other manipulation of the phase on frequency coefficients resulting from a single time-to-frequency domain transform), followed by a common frequency-to-time domain transform (in practice, the common frequency-to-time domain transform is split up into two smaller transforms in order to adapt to the bandwidths and sampling frequencies of the sub-bands of the CQMF framework).
  • FIG. 2 is a block diagram of a virtual bass processing system that implements or is used in conjunction with certain latency reduction processes under an embodiment.
  • the virtual bass processing system 200 takes as input 201 (input A), a plurality of complex-valued sub-band samples (HQMF samples) from a so-called Hybrid filter bank.
  • a Hybrid filter bank preceding the virtual bass process has separated an original time domain audio input signal into such multiple Hybrid sub-bands 201 (which are described in further detail below), and they may be buffered by input buffers 206.
  • the buffered input is then processed by a Nyquist synthesis filter bank 208 that performs the synthesis function in order to reconstitute a single complex-valued QMF (CQMF) domain signal 202 (signal C) indicative of low frequency audio content (e.g., between 0 and 375 Hz).
  • the virtual bass system includes a latency saving mechanism by bypassing the Nyquist analysis filter bank stage in the preceding Hybrid filter bank. This allows the system to save the delay associated with the Nyquist analysis bank (e.g., 384 samples) by feeding the CQMF channel 0 signal as input 203 (input B) directly to the virtual bass module.
  • One of the two inputs 202 or 203 is chosen by a switch, such as selector 204, and the selected signal comprises a virtual bass input signal 205 (signal D) that is further processed by the transposer 209.
  • A transposer is generally the combination of a time-to-frequency transform or a filter bank followed by a non-linear stage (performing phase multiplication or phase shifting) followed by the frequency-to-time transform or filter bank.
  • Transposer 209 comprises a time-to-frequency transform component 210, a non-linear stage 212, and a frequency-to-time transform 214.
  • the non-linear stage 212 within transposer 209 is a processing block that modifies the phase and applies certain gain (amplitude) control signals to the sub-band or transform components of the signal.
  • the transposed signals are then buffered by output buffers 216 and subsequently processed by Nyquist analysis filter banks 218 that perform the analysis function that decomposes the virtual bass output CQMF signals into sub-bands corresponding to the Hybrid sub-band samples (HQMF) of the input signal 201.
  • a delayed and unprocessed version of the input A signal 220 is mixed with the Nyquist filter bank 218 output to produce an enhanced audio output signal 222 comprising the virtual bass output signal plus the delayed input signal.
  • Although embodiments may be directed to the use of Nyquist filter banks for certain functions, such as synthesis 208 and analysis 218 stage processing, it should be noted that other types of filter banks or frequency splitting or partitioning circuits and techniques may also be used. In other embodiments, the above mentioned filter banks or frequency splitting or partitioning circuits and techniques may not be present at all.
  • FIGS. 3A-C are more detailed diagrams of the virtual bass processing system illustrated in FIG. 2 .
  • FIG. 3A illustrates a pre-processing Hybrid filter bank stage 300, that is, a stage that typically is not part of, but instead precedes the virtual bass system.
  • a Hybrid filter bank may be the combination of a CQMF bank, where a certain number of the lowest CQMF bands are processed by Nyquist filter banks of pre-determined sizes in order to increase the frequency resolution of the low frequency range.
  • the combination of low frequency sub-band samples from the Nyquist analysis stages and the remaining CQMF channels are referred to as Hybrid sub-band samples, or an HQMF (Hybrid QMF) signal.
  • a time domain input signal 302 is input to a 64-channel CQMF analysis filter bank 304.
  • The CQMF channel 0 (denoted signal B) 306 may be fed directly to the virtual bass module 330 of FIG. 3C (this signal corresponds to input B 203 of FIG. 2).
  • the signal B 306 bypasses the Nyquist analysis filter bank 307, and hence avoids the associated delay.
  • CQMF channels 0, 1, and 2 are also input to a number of Nyquist analysis filter banks 307-309. The output from the Nyquist analysis filter banks and the remaining CQMF sub-bands (3 to 63) produce the Hybrid sub-band samples 0-76 (denoted as signal A) 310.
  • a plurality of complex-valued Hybrid sub-band samples (signal A) 322 are input to a Nyquist synthesis filter bank stage 324.
  • the virtual bass module 330 of FIG. 3C is assumed to be one module amongst other modules in a system that operates on Hybrid sub-band samples (HQMF samples).
  • signal A 310 of FIG. 3A may undergo processing by other modules after the pre-processing filter bank stage 300 before becoming input A 322 of FIG. 3B .
  • The first 8 Hybrid sub-bands, i.e., the sub-bands from the low frequency, eight-channel (8-ch) Nyquist filter bank 307 (which produce a signal bandwidth of roughly 344-375 Hz depending on the sampling rate), are processed. Since a Nyquist filter bank is not down-sampled, in contrast to the CQMF bank, the Nyquist filter bank synthesis step is particularly straightforward: it is just a summation of the sub-band samples for each CQMF (or HQMF) time slot. After summation of the eight lowest Hybrid sub-band samples in stage 324, the system has reconstituted the CQMF channel 0 signal C 326, which becomes input 332 to the virtual bass module 330 of FIG. 3C.
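  • Because the Nyquist synthesis step reduces to a per-slot summation, it can be sketched in a few lines. The data layout (one list of 77 complex Hybrid sub-band samples per time slot) and the function names are assumptions for illustration.

```python
def nyquist_synthesis(hybrid_slot, num_bands=8):
    """Sum the lowest `num_bands` Hybrid sub-band samples of one time slot
    to reconstitute a single CQMF channel 0 sample."""
    return sum(hybrid_slot[:num_bands], 0j)


def reconstitute_channel0(hybrid_slots, num_bands=8):
    """Apply the per-slot summation to a sequence of time slots."""
    return [nyquist_synthesis(slot, num_bands) for slot in hybrid_slots]


# One synthetic time slot with 77 Hybrid sub-band samples (indices 0-76).
slot = [complex(k, -k) for k in range(77)]
print(nyquist_synthesis(slot))
```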
  • FIG. 3C illustrates a virtual bass system that implements or is used in conjunction with certain latency reduction processes, under an embodiment.
  • the virtual bass module 330 of FIG. 3C has signal D 332 as input.
  • signal D 332 may be routed from signal B 306 of FIG. 3A .
  • signal D 332 may be fed from signal C 326 of the Nyquist synthesis stage 320 of FIG. 3B.
  • signal D 332, i.e., the input signal to the virtual bass module is a single complex-valued CQMF signal (e.g., the first channel (channel 0) from a set of CQMF sub-band signals).
  • an optional dynamics processing function may be performed by dynamics processor 336 in order to change the dynamics of the virtual bass input signal.
  • The processor 336 may be used to decrease the level of weak bass and maintain or enhance strong bass, i.e., be used as an expander. This scheme is in agreement with the shapes of the Equal Loudness Contours (ELC) in the bass range, where the loudness curves are flatter in frequency for louder signals and steeper for signals of weaker loudness. Weaker bass can hence be attenuated more than stronger bass when generating harmonics in order to maintain the relative loudness between the fundamental component and the generated harmonics.
  • the gain of the dynamics processor 336 may be controlled by a running average energy signal, e.g., the running average energy of a down-mixed (mono) version of the first CQMF band signal 332.
  • a first windowing function using a window size L (including zero-padding up to length N ) 338, forward FFT 340 and modulation function 342 is performed on the (possibly dynamics processed) CQMF signal prior to input to the non-linear processing block 344.
  • the window shape is asymmetric.
  • the transposer (comprising components 338 to 356) represents an improved phase vocoder that uses an interpolation technique referred to as "combined transposition" to generate second, third, fourth, and possibly higher order harmonics (transposition factors), using the same FFT analysis/synthesis chain as for the base transposer.
  • the non-linear processing block 344 uses integer transposition factors, which makes redundant certain phase estimation, phase unwrapping, or phase locking techniques that are generally unstable and inexact as used in many standard phase vocoders.
  • the phase multipliers 344 use a base transposition factor B higher than 2, such as 8, or any other appropriate value.
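  • Why integer transposition factors avoid phase unwrapping can be seen in a minimal sketch of a phase multiplier acting on one complex transform bin. This sketch assumes the magnitude is left unchanged and only the phase is multiplied; the function name is hypothetical and the gain control described in the text is omitted.

```python
import cmath


def phase_multiply(x, factor):
    """Keep |x| and multiply the phase of x by the integer `factor`.

    Raising the unit-magnitude phasor to an integer power multiplies its
    phase without ever computing or unwrapping the angle explicitly.
    """
    magnitude = abs(x)
    if magnitude == 0.0:
        return 0j
    unit = x / magnitude
    return magnitude * unit ** factor


x = cmath.rect(2.0, 0.3)   # magnitude 2, phase 0.3 rad
y = phase_multiply(x, 3)   # magnitude 2, phase 0.9 rad
print(abs(y), cmath.phase(y))
```

For a non-integer factor the same operation would need the unwrapped phase, since the principal value alone does not determine the result uniquely.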
  • the transposer 338-356 uses oversampling in the frequency domain (i.e., zero-padded analysis and synthesis windows in blocks 338 and 356) to improve impulsive (percussive) sounds, which is paramount when used in the bass frequency range. Without such oversampling, percussive drum sounds would likely generate at least some pre- and post-echo artifacts, making the bass blurry and indistinct.
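  • Oversampling in the frequency domain amounts to windowing a frame of length L and zero-padding it to a longer transform length N. The following sketch uses a plain DFT from the standard library for self-containment; appending the zeros at the end of the frame, the rectangular window in the usage lines, and all names are assumptions, since the text does not specify the padding layout or window coefficients.

```python
import cmath
import math


def zero_padded_spectrum(frame, window, n_fft):
    """Window a frame of length L, zero-pad it to n_fft points, and return
    its DFT. The oversampling factor is n_fft / L."""
    if len(frame) != len(window):
        raise ValueError("frame and window must have the same length")
    if n_fft < len(frame):
        raise ValueError("n_fft must be at least the frame length")
    windowed = [s * w for s, w in zip(frame, window)]
    padded = windowed + [0j] * (n_fft - len(windowed))
    return [
        sum(padded[n] * cmath.exp(-2j * math.pi * k * n / n_fft)
            for n in range(n_fft))
        for k in range(n_fft)
    ]


# L = 4 samples padded to N = 8 points, i.e. two times oversampling.
spectrum = zero_padded_spectrum([1 + 0j] * 4, [1.0] * 4, 8)
print(len(spectrum))
```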
  • the transposer includes gain and slope compensation per FFT bin applied by amplifiers 346 following the phase multiplier circuits ( non-linear processing block 344).
  • This allows overall gains for different transposition factors to be set independently. For example, gains can be set to approximate certain equal loudness contours (ELC).
  • the ELC can be adequately modeled by straight lines on a logarithmic scale for frequencies below 400 Hz.
  • odd order harmonics can be attenuated to a greater extent since odd order harmonics (e.g., third, fifth, etc.) can sometimes be perceived as being more harsh than even order harmonics, although being important for the resulting virtual bass effect.
  • Each transposed signal may additionally have a slope gain, i.e., a roll-off attenuation factor, measured in e.g., dB per octave. This attenuation is also applied per bin in the transform domain by amplifiers 346.
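  • A per-bin gain combining an overall gain with a slope in dB per octave might be computed as sketched below. The reference frequency, the sign convention (a negative slope attenuates toward higher frequencies), and the function name are assumptions; the text only states that both gains are applied per bin in the transform domain.

```python
import math


def bin_gain(freq_hz, ref_hz, gain_db, slope_db_per_octave):
    """Linear amplitude gain for one transform bin.

    The slope term grows by slope_db_per_octave for every octave that
    freq_hz lies above ref_hz, so it is a straight line on a logarithmic
    frequency axis.
    """
    if freq_hz <= 0.0 or ref_hz <= 0.0:
        raise ValueError("frequencies must be positive")
    octaves = math.log2(freq_hz / ref_hz)
    total_db = gain_db + slope_db_per_octave * octaves
    return 10.0 ** (total_db / 20.0)


# One octave above the reference with a -6 dB per octave roll-off.
print(bin_gain(200.0, 100.0, 0.0, -6.0))
```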
  • In a non-Hybrid filter bank based system, e.g., a time domain system, taking signal 302 of FIG. 3A as input, the transposer 338-356 would directly operate on a time domain signal of full sampling rate (e.g., 44.1 or 48 kHz), and then employ an FFT size of roughly 4096 lines, in order to provide an adequate resolution in the low frequency (bass) range. In an embodiment, all processing, however, is performed on CQMF channel 0 sub-band samples (signal D 332 of system 330). This provides certain advantages over normal processing practices, such as saving computational complexity by processing only the signal of interest in the transposer, i.e., by processing a critically sampled (or maximally decimated) low-pass signal.
  • the virtual bass system expands the bandwidth of the input signal by a factor of four.
  • a virtual bass system is not required to output a signal with a bandwidth above roughly 500 Hz.
  • the system can process the complex-valued samples using an FFT transform of size 64 (4096/64) instead of 4096, where the decrease by 64 comes from the down-sampling factor of the CQMF bank, which also equals the reduced bandwidth of the first CQMF sub-band signal compared to the time domain input signal.
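  • The reduction from 4096 to 64 transform lines keeps the spacing between frequency bins unchanged, because the sub-band sampling rate drops by the same factor of 64. A quick check with hypothetical helper names:

```python
def subband_fft_size(full_rate_fft_size, cqmf_channels):
    """FFT size needed on one CQMF sub-band to match the bin spacing of a
    full-rate FFT of size full_rate_fft_size."""
    if full_rate_fft_size % cqmf_channels:
        raise ValueError("FFT size must be a multiple of the channel count")
    return full_rate_fft_size // cqmf_channels


def bin_spacing_hz(sample_rate_hz, fft_size):
    """Frequency distance between adjacent transform bins."""
    return sample_rate_hz / fft_size


size = subband_fft_size(4096, 64)
print(size, bin_spacing_hz(48_000, 4096), bin_spacing_hz(48_000 / 64, size))
```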
  • the output from the transposer needs to be transformed to CQMF bands 0 and 1. This may be done approximately by a split of the 64-line FFT into four 16-line FFTs and subsequently employing CQMF prototype filter response compensation in the transform domain before the inverse FFT of the two 16-line FFTs that constitute CQMF band 0 and 1 are calculated.
  • the FFT spectrum may be split in module 348 of the virtual bass module 330 and the CQMF filter response compensation may be done by multipliers 350.
  • the CQMF filter response compensation may be done on the full (e.g., 64-lines in the example above) FFT spectrum before the FFT split module 348.
  • the output from the CQMF filter response compensation blocks 350 is input to modulation steps 352 followed by inverse FFT circuits 354, using transform sizes of N / B points, and subsequent windowing and overlap/add steps 356, using window lengths L / B .
  • the window shapes are asymmetric.
  • the modulation steps 352 may also be applied before the FFT split 348 and CQMF filter response compensation 350 blocks.
  • The output signals from the windowing and overlap/add circuits 356 are two CQMF signals, containing the virtual bass signal to be mixed with the delayed HQMF signal A 364. However, both signals must first be filtered through 8- and 4-channel Nyquist analysis filter banks 360, respectively, to fit in the Hybrid domain.
  • the Nyquist analysis filter banks 360 use truncated prototype filters.
  • the HQMF output from the filter banks 360 may be band pass filtered and mixed with a delayed input component A 364 in module 362 to produce the enhanced audio output HQMF signal 366.
  • the delay of input A 364 to the Hybrid band mix block 362 is less than the virtual bass system delay (minus the Nyquist analysis delay if signal B 306 is used as input) to comprise a time lagged virtual bass signal.
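  • One reading of this delay rule is that the dry path delay equals the virtual bass processing delay, reduced by the bypassed Nyquist analysis delay when signal B is the input, and reduced further by the permitted lag. The sketch below follows that reading; the function name and the sample counts in the usage lines are illustrative assumptions, and only the 0 to 1000 sample lag range is taken from the text.

```python
MAX_LAG_SAMPLES = 1000  # upper end of the lag range quoted in the text


def dry_path_delay(virtual_bass_delay, lag, bypassed_nyquist_delay=0):
    """Delay (in samples) applied to input A before the Hybrid band mix.

    virtual_bass_delay:     processing delay of the virtual bass path
    lag:                    how far the virtual bass may trail input A
    bypassed_nyquist_delay: Nyquist analysis delay saved when signal B
                            is used as input, otherwise 0
    """
    if not 0 <= lag <= MAX_LAG_SAMPLES:
        raise ValueError("lag outside the permitted range")
    delay = virtual_bass_delay - bypassed_nyquist_delay - lag
    if delay < 0:
        raise ValueError("lag and bypass exceed the virtual bass delay")
    return delay


# Illustrative numbers only: 3200-sample path, 384-sample bypass, full lag.
print(dry_path_delay(3200, 1000, 384))
```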
  • System 330 employs phase compensation by an exp(-jπ/2) multiplication 358 on the CQMF channel 1 before the Nyquist analysis blocks 360.
  • the specific argument to the phase compensation function 358 is dependent on the modulation scheme used by the preceding CQMF bank 304 of FIG. 3A and may differ between embodiments. Also, the compensation factor 358 may be moved and absorbed in other processing blocks.
  • the virtual bass processing system introduces certain delays when processing the input signal.
  • the total delay of the transposer and the Nyquist filter bank analysis stage can be in the order of 3200 samples, as described previously.
  • the virtual bass processing system includes components that perform certain steps to reduce the latency associated with virtual bass processed content.
  • FIG. 4 is a block diagram of the principal functional components utilized by a virtual bass latency reduction process and system, under an embodiment.
  • the latency reduction process comprises the use of higher order base transposition factors 402, low-latency asymmetric transform windows 404, truncated Nyquist prototype filters 406, and a time lagged virtual bass signal 408.
  • Each of the functional components of diagram 400 may be used alone or in conjunction with one or more of the other components to help reduce the latency of the virtual bass processed content.
  • Diagram 400 may represent a system, such as when each of the components 402-408 is embodied as a hardware component, such as circuits, processors, and so on.
  • the diagram may also represent a process, such as when each of the components 402-408 is implemented as an act performed by a functional component, such as a computer-implemented process executed by one or more processors.
  • diagram 400 may represent a hybrid system and method wherein certain components may be implemented in hardware circuitry and others may be implemented as performed method steps.
  • the components 402-408 may be implemented as separate stand-alone components, or they may be combined in one or more consolidated latency reduction functions. A detailed description of the composition and operation of each component of system 400 follows below.
  • FIG. 5A is a table illustrating the delay associated with a first hop size
  • FIG. 5B is a table illustrating the delay associated with a second hop size for a virtual bass latency reduction system under an embodiment.
  • L 16 to 128
  • the transposer source ranges are smaller than the transposer target ranges in the analysis transform spectrum.
  • the target bins result from interpolation of the source bins.
  • the source ranges will be larger than the target ranges and the target bins result from decimation of source bins.
  • the increased order of the base transposition factor has certain implications on the virtual bass process.
  • the transposer output inherently covers a frequency range of B CQMF bands (assuming an input of one CQMF band), where only the first two will actually be synthesized, thus saving complexity.
  • B = 8
  • F = 4
  • the quality of the transposed signals is governed by the base transposition factor and gets reduced for higher order transposition orders, but can be improved by using a decreased analysis hop-size (increased oversampling in the time domain). Moreover, to maintain the quality for percussive sounds (transients), the order of frequency domain oversampling needs to increase for higher base transposition factors. However, the increased oversampling in both time and frequency may add to the computational complexity of the transposer.
  • the analysis hop-size is decreased by a factor of two compared to the legacy system.
  • the latency reduction system uses asymmetric analysis and synthesis windows in the forward and inverse transform stages (e.g., windowing stages 338 and 356 of FIG. 3C , respectively). This essentially improves the frequency response of a symmetric window of limited length by extending the "tail" of the window towards samples in the history not contributing to the transform delay.
  • both the length of the analysis window and the size of the forward transform may be different from that of the synthesis window and the inverse transform.
  • FIG. 5C is an example plot of a time response of an asymmetric window compared to legacy symmetric Hanning windows.
  • FIG. 5C illustrates the time response as a function of samples (x-axis) versus signal amplitude (e.g., in volts) for a Hanning window of length 64 shown as plot 514 and a Hanning window of length 41 shown as plot 516 versus the time response plot 512 for an asymmetric window of length 64 and delay 40 (a delay equal to the Hanning window of length 41).
  • FIG. 5D is an example plot of frequency responses of an asymmetric window compared to legacy symmetric Hanning windows.
  • FIG. 5D illustrates the frequency response as a function of normalized frequency (x-axis) versus signal amplitude on a logarithmic scale (e.g., in dB) for the Hanning window of length 64 shown as plot 524 and the Hanning window of length 41 shown as plot 526 versus the frequency response plot 522 for the asymmetric window of length 64 and delay 40 (equal to the Hanning window of length 41).
  • the main lobe of the asymmetric window has a width in between those of the symmetric Hanning windows, indicating a frequency resolution or selectivity in between the two Hanning windows.
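To make the delay comparison concrete, the following Python sketch computes the zero-frequency group delay of the two symmetric Hanning windows mentioned above. For a symmetric window of length Lw this delay is (Lw - 1)/2, so the length-41 window has a group delay of 20 samples and the length-64 window 31.5 samples. The asymmetric window's coefficients are not given in the text and are not reproduced here; the helper names are hypothetical.

```python
import math

def hann(length):
    """Symmetric Hanning (Hann) window with zero-valued end points."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (length - 1))
            for n in range(length)]

def dc_group_delay(window):
    """Group delay at zero frequency of an FIR window: sum(n*w[n]) / sum(w[n])."""
    return sum(n * w for n, w in enumerate(window)) / sum(window)

delay_41 = dc_group_delay(hann(41))  # close to 20.0 samples
delay_64 = dc_group_delay(hann(64))  # close to 31.5 samples
```

The same `dc_group_delay` helper could be applied to any candidate asymmetric window to check how close its group delay comes to that of the shorter symmetric window.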
  • the transposer algorithm needs to be partially changed compared to the legacy implementation, taking into account the reduced transform delay D of the analysis/synthesis chain.
  • $M_S(n) = e^{-i \frac{\pi}{N} D n}, \quad 0 \le n < N$
  • k and n respectively are the transform frequency coefficient indices
  • F is the frequency domain oversampling factor
  • L is the analysis window size
  • D is the transform delay.
  • the modulation of Eq. 5 may also be applied in modulation stages 352 after the FFT split module 348 and response compensation step 350.
  • FIG. 6 illustrates stylistically the use of asymmetric windows and the associated delay imposed by a B -order base transposer, under an embodiment.
  • Time plot 600 shows the time zero reference as the group delay of the analysis window (approximately D/2). New samples 604 are added from time t_0 in the analysis phase 602.
  • Time plot 610 shows that the time stretch duality of the transposer moves t_0 to time B·t_0 in the synthesis phase 612 for the new time-stretched samples 614.
  • the total analysis/synthesis chain delay amounts approximately to D/2 + B·(D/2 − S_A) in the case where asymmetric windows, such as those shown in FIG. 5C (512) or FIG. 6, are used.
  • the calculations of Eqs. 4 and 5 above may likewise be implemented by circular time shifts of N − (D/2 − (L − 1)) (mod N) samples before the analysis transform and N − D/2 samples after a (single) synthesis transform, respectively.
  • B = 8
  • the time shifts after the synthesis transforms will be (N − D/2)/B samples, which may not be an integer value. In this case, a rounded value may be used as an approximation.
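The interchangeability of a frequency-domain modulation and a circular time shift follows from the DFT shift theorem: multiplying bin k by exp(-i·2π·s·k/N) corresponds to circularly delaying the time signal by s samples. The Python sketch below checks this numerically with a naive DFT. The transform size of 8, the test signal and the function names are illustrative assumptions, not values taken from the described system.

```python
import cmath

def dft(x):
    """Naive O(N^2) discrete Fourier transform, adequate for a tiny example."""
    n_points = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_points)
            for n in range(n_points))
        for k in range(n_points)
    ]

def circular_delay(x, shift):
    """Circularly delay x by an integer number of samples."""
    n_points = len(x)
    return [x[(n - shift) % n_points] for n in range(n_points)]

def modulate(spectrum, shift):
    """Multiply bin k by exp(-i*2*pi*shift*k/N)."""
    n_points = len(spectrum)
    return [
        spectrum[k] * cmath.exp(-2j * cmath.pi * shift * k / n_points)
        for k in range(n_points)
    ]

x = [1.0, 2.0, 3.0, 4.0, 0.0, 0.0, 0.0, 0.0]
shift = 3
via_time = dft(circular_delay(x, shift))
via_frequency = modulate(dft(x), shift)
max_error = max(abs(a - b) for a, b in zip(via_time, via_frequency))
```

A sample-domain shift only exists for integer values of the shift, which is why a rounded value is needed when the required shift is fractional, whereas the modulation form accepts a fractional value directly.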
  • the analysis modulation may be combined with the synthesis modulation as a merged synthesis modulation as given by Eq. 6:
  • $M_{ASC}(k) = e^{-i \frac{2\pi}{N} \left( \frac{D}{2}(B+1) - (L-1)B \right) k}, \quad 0 \le k < N$
  • T is the transposition factor
  • Eq. 6 will also be an approximation.
  • g x ( m ) is the time-domain output from one of the synthesis inverse transforms
  • Eq. 7 provides only an approximation of the frequency modulation implemented by Eq. 6 (which in itself may be an approximation) when the argument to the ceil-function ⌈·⌉ (rounding up to the closest integer) is not an exact integer. It should also be noted that Eqs. 5 or 6 above are preferably applied only to the limited part of the coefficients that will be included in the two inverse Fourier transforms.
  • Eq. 8 refers to the delay in output samples using a 64-channel CQMF based framework.
  • FIG. 7A is a table illustrating the total latency values for a first hop size
  • FIG. 7B is a table illustrating the total latency values for a second hop size for a virtual bass latency reduction system that uses asymmetric transform windows, under an embodiment.
  • the amount of asymmetry of the transposition windows may vary depending upon the constraints and requirements of the system.
  • the group delay of the asymmetric window is selected to be close to half of the transform delay in order to maintain adequate transposition quality.
  • G_d ≈ D/2 = 20. This may be accomplished by including a constraint for the group delay during an optimization phase for design of the asymmetric filter.
  • a third latency reduction element comprises using truncated Nyquist prototype filters, 406.
  • 8-channel and 4-channel Nyquist analysis filter banks 360 are applied to the virtual bass output CQMF channels (these filter banks correspond to the Nyquist filter banks 307 and 308 of FIG. 3A ).
  • this entire delay (e.g., 384 samples) may be eliminated.
  • the Nyquist analysis/synthesis chain still provides perfect reconstruction. However, the frequency responses of the Nyquist filter banks using truncated filters may change. Optimization of the remaining filter coefficients may improve the potentially poorer frequency responses of the Nyquist filter banks using truncated filters.
  • a fourth latency reduction element comprises letting the virtual bass signal lag the original signal, 408.
  • the latency of the overall system can be reduced as the wide band signal (i.e., the Hybrid signal A 364 of FIG. 3C ) is delayed a shorter period of time than the virtual bass system delay actually implies.
  • Informal listening tests have shown that a lag below 20 ms does not hamper the virtual bass effect. This lag corresponds to 960 samples for a 48 kHz audio signal.
  • the virtual bass signal is allowed to lag the wide band signal by a total of 352 samples (7.33 ms at 48 kHz).
  • Of these 352 samples, 32 come from the use of the asymmetric transform window, since 1376 is not evenly divisible by the CQMF filter bank size of 64.
  • the delay from the asymmetric window transform can be divided into a wide band latency of 1344 plus a bass lag of 32 samples.
  • the extra lag added on top of the 32 samples is thus 320 samples (5 CQMF samples, corresponding to 6.67 ms at 48 kHz sampling frequency).
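The sample counts quoted above can be re-derived with a few lines of arithmetic. The Python sketch below splits the 1376-sample window delay into whole CQMF blocks of 64 plus a remainder, and converts the lag figures to milliseconds at 48 kHz. The numbers come from the text; the function and variable names are hypothetical.

```python
CQMF_BANDS = 64      # CQMF filter bank size stated in the text
SAMPLE_RATE = 48000  # Hz, the example sampling frequency used in the text

def split_delay(total_samples, block=CQMF_BANDS):
    """Split a delay into a whole-block part and the leftover remainder."""
    remainder = total_samples % block
    return total_samples - remainder, remainder

def samples_to_ms(samples, sample_rate=SAMPLE_RATE):
    """Convert a sample count to milliseconds."""
    return 1000.0 * samples / sample_rate

wide_band, window_lag = split_delay(1376)       # 1344 and 32 samples
extra_lag = 352 - window_lag                    # 320 samples
extra_lag_blocks = extra_lag // CQMF_BANDS      # 5 CQMF samples
lag_limit_samples = 20 * SAMPLE_RATE // 1000    # 20 ms expressed in samples
total_lag_ms = samples_to_ms(352)               # about 7.33 ms
extra_lag_ms = samples_to_ms(extra_lag)         # about 6.67 ms
```

The total lag of 352 samples stays well below the 960-sample (20 ms) bound reported from the informal listening tests.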
  • the different latency reduction elements 402-408 of FIG. 4 may be used in any practical number of combinations to achieve a reduction in virtual bass system latency. Furthermore, the appropriate variables of each latency reduction method may be adjusted to trade increased latency against any perceived decrease in virtual bass signal quality.
  • the delay of 640 samples in this example case is significantly less than the nominal delay of 3200 samples in the legacy virtual bass system described previously. This delay can be reduced even further by adding more virtual bass lag, by increasing the hop-size S A to 4 instead of 2, or by designing an asymmetric transform window with a resulting analysis/synthesis delay shorter than 40. However, the change of any such values may result in slightly poorer virtual bass quality, though the latency may be further reduced.
  • FIG. 8 is a block diagram illustrating an audio processing system that includes a virtual bass generation system and a latency reduction system, under an embodiment.
  • system 800 comprises a virtual bass system 330 as illustrated in FIG. 3C .
  • Virtual bass system 330 receives input audio signals 801 and performs certain frequency transposition functions to produce enhanced audio content for playback through speakers 806 that may be of limited frequency response capability. Certain latencies may be associated with the transposition functions performed by the virtual bass system 330.
  • a virtual bass latency reduction system 400 (as illustrated in FIG.
  • the reduced latency audio signals from the virtual bass systems 330 and 400 are then sent to a rendering subsystem 802 that is configured to generate speaker feeds that may be fed through amplifier 804 for left and right (or multi-channel) speakers 806.
  • Although the virtual bass latency reduction system 400 is shown to be a separate post-process element in system 800, it should be noted that such a latency reduction system may be implemented as part of the virtual bass system 330 (as indicated earlier), or as part of any other appropriate element of system 800, such as a functional component within rendering subsystem 802.
  • the virtual bass system 330 may be a legacy virtual bass generation system as outlined in the background, or it may be any other virtual bass generation and processing system that uses harmonic transposition to enhance input audio signals 801 to increase the perceived level of bass content for playback through speakers 806.
  • Embodiments of the virtual bass latency reduction system can be used in any audio processing system that renders and plays back digital audio through a variety of different playback devices and audio speakers (transducers).
  • These speakers may be embodied in any of a variety of different listening devices or items of playback equipment, such as computers, televisions, stereo systems (home or cinema), mobile phones, tablets, and other portable playback devices.
  • the speakers may be of any appropriate size and power rating, and may be provided in the form of free-standing drivers, speaker enclosures, surround-sound systems, soundbars, headphones, earbuds, and so on.
  • the speakers may be configured in any appropriate array, and may include monophonic drivers, binaural speakers, surround-sound speaker arrays, or any other appropriate array of audio drivers.
  • aspects of one or more embodiments described herein may be implemented in an audio system that processes audio signals for transmission across a network that includes one or more computers or processing devices executing software instructions. Any of the described embodiments may be used alone or together with one another in any combination. Although various embodiments may have been motivated by various deficiencies with the prior art, which may be discussed or alluded to in one or more places in the specification, the embodiments do not necessarily address any of these deficiencies. In other words, different embodiments may address different deficiencies that may be discussed in the specification. Some embodiments may only partially address some deficiencies or just one deficiency that may be discussed in the specification, and some embodiments may not address any of these deficiencies.
  • Portions of the adaptive audio system may include one or more networks that comprise any desired number of individual machines, including one or more routers (not shown) that serve to buffer and route the data transmitted among the computers.
  • Such a network may be built on various different network protocols, and may be the Internet, a Wide Area Network (WAN), a Local Area Network (LAN), or any combination thereof.
  • One or more of the components, blocks, processes or other functional components may be implemented through a computer program that controls execution of a processor-based computing device of the system. It should also be noted that the various functions disclosed herein may be described using any number of combinations of hardware, firmware, and/or as data and/or instructions embodied in various machine-readable or computer-readable media, in terms of their behavioral, register transfer, logic component, and/or other characteristics.
  • Computer-readable media in which such formatted data and/or instructions may be embodied include, but are not limited to, physical (non-transitory), non-volatile storage media in various forms, such as optical, magnetic or semiconductor storage media.


Claims (15)

  1. A method for generating virtual bass with low latency, comprising:
    receiving an input audio signal;
    performing harmonic transposition on low frequency components of the input audio signal to generate transposed data indicative of harmonics of the input audio signal;
    generating a virtual bass signal in response to the transposed data; and
    generating an enhanced audio signal by combining the virtual bass signal with a delayed version of the input audio signal, wherein the harmonic transposition employs combined transposition using a base transposition of an order B higher than 2, such that the harmonics include a second order harmonic and at least one higher order harmonic of each of the low frequency components, and characterized in that all of the harmonics are generated in response to frequency-domain values determined by a common time-to-frequency domain transform stage using an asymmetric analysis window, and a subsequent inverse transform determined by a common frequency-to-time domain transform stage using an asymmetric synthesis window.
  2. The method of claim 1, wherein the input audio signal is a sub-band complex-valued quadrature mirror filter (CQMF) signal indicative of critically sampled or close to critically sampled low frequency audio from a set of CQMF sub-band signals.
  3. The method of claim 2, wherein the critically sampled or close to critically sampled low frequency input audio is a CQMF channel 0 signal indicative of the lowest frequency band of a set of CQMF sub-band signals.
  4. The method of claim 3, further comprising:
    generating transposed data from low frequency components by performing an oversampled frequency domain transform on the input audio signal, by generating asymmetrically windowed, zero-padded samples and performing a time-to-frequency domain transform on the asymmetrically windowed, zero-padded samples, and subsequently performing a nonlinear operation on the output of the time-to-frequency domain transform to generate the transposed data from the low frequency components;
    generating two sets of frequency components from the frequency components processed by the nonlinear operation, by splitting them into a first set of frequency components in a first frequency band and a second set of frequency components in a second frequency band; and
    further performing a first frequency-to-time domain transform on the first set of frequency components and a second frequency-to-time domain transform on the second set of frequency components, wherein the first frequency-to-time domain transform and the second frequency-to-time domain transform each have transform sizes that are B times smaller than the time-to-frequency domain transform; and
    further applying asymmetric, zero-padded windows to the samples from the frequency-to-time domain transforms, wherein the asymmetric, zero-padded windows are B times shorter than the asymmetrically windowed, zero-padded samples generated from the input audio signal, thereby forming two sets of transposed data.
  5. The method of claim 4, wherein the first frequency band is the frequency band of CQMF channel 0 and the second frequency band is the frequency band of CQMF channel 1 of a set of CQMF sub-band signals,
    wherein generating a virtual bass signal in response to the transposed data comprises applying an analysis filter bank to one or both of the two sets of transposed data, the analysis filter bank comprising a truncated version of a symmetric filter.
  6. The method of claim 1, wherein the delayed version of the input audio signal is delayed by a predefined amount of time shorter than the latency of the virtual bass signal, and the enhanced audio signal is indicative of a time lagged virtual bass signal.
  7. The method of claim 3, wherein the input audio CQMF channel 0 is received directly from the analysis CQMF bank output of a pre-processing Hybrid filter bank stage, bypassing the Nyquist analysis filter bank of the pre-processing Hybrid filter bank stage.
  8. An apparatus for generating virtual bass with low latency, comprising:
    a first component configured to receive an input audio signal and to perform harmonic transposition on low frequency components of the input audio signal to generate transposed data indicative of harmonics of the input audio signal;
    a second component configured to generate a virtual bass signal in response to the transposed data and to combine the virtual bass signal with a delayed version of the input audio signal to generate an enhanced audio signal, wherein the harmonic transposition employs combined transposition using a base transposition of an order B higher than 2, such that the harmonics include a second order harmonic and at least one higher order harmonic of each of the low frequency components, and characterized in that all of the harmonics are generated in response to frequency-domain values determined by a common time-to-frequency domain transform stage using an asymmetric analysis window, and a subsequent inverse transform determined by a common frequency-to-time domain transform stage using an asymmetric synthesis window.
  9. The apparatus of claim 8, wherein the input audio signal is a sub-band complex-valued quadrature mirror filter (CQMF) signal indicative of critically sampled or close to critically sampled low frequency audio from a set of CQMF sub-band signals.
  10. The apparatus of claim 9, wherein the critically sampled or close to critically sampled low frequency input audio is a CQMF channel 0 signal indicative of the lowest frequency band of a set of CQMF sub-band signals.
  11. The apparatus of claim 10, further comprising:
    a third component configured to generate transposed data from low frequency components by performing an oversampled frequency domain transform on the input audio signal, by generating asymmetrically windowed, zero-padded samples and performing a time-to-frequency domain transform on the asymmetrically windowed, zero-padded samples, and to subsequently perform a nonlinear operation on the output of the time-to-frequency domain transform to generate the transposed data from the low frequency components;
    a fourth component configured to generate two sets of frequency components from the frequency components processed by the nonlinear operation, by splitting them into a first set of frequency components in a first frequency band and a second set of frequency components in a second frequency band; and
    a fifth component further configured to perform a first frequency-to-time domain transform on the first set of frequency components and a second frequency-to-time domain transform on the second set of frequency components, wherein the first frequency-to-time domain transform and the second frequency-to-time domain transform each have transform sizes that are B times smaller than the time-to-frequency domain transform; and
    a sixth component further configured to apply asymmetric, zero-padded windows to the samples from the frequency-to-time domain transforms, wherein the asymmetric, zero-padded windows are B times shorter than the asymmetrically windowed, zero-padded samples generated from the input audio signal, thereby forming two sets of transposed data.
  12. The apparatus of claim 11, wherein the first frequency band is the frequency band of CQMF channel 0 and the second frequency band is the frequency band of CQMF channel 1 of a set of CQMF sub-band signals, wherein generating a virtual bass signal in response to the transposed data comprises applying an analysis filter bank to one or both of the two sets of transposed data, the analysis filter bank comprising a truncated version of a symmetric filter.
  13. The apparatus of claim 8, further comprising:
    a timing component configured to generate a version of the input audio signal that is delayed by a predefined amount of time shorter than the latency of the virtual bass signal; and
    a mixing component configured to combine the virtual bass signal with the delayed input audio signal to generate an enhanced audio signal indicative of a time lagged virtual bass signal.
  14. The apparatus of claim 10, further comprising an interface component configured to receive the input audio CQMF channel 0 directly from the analysis CQMF bank output of a pre-processing Hybrid filter bank stage, bypassing the Nyquist analysis filter bank of the pre-processing Hybrid filter bank stage.
  15. A computer-readable storage medium storing executable computer program instructions for performing a method according to any one of claims 1-7 when executed on a computer.
EP13771123.0A 2012-10-15 2013-09-27 System and method for reducing latency in transposer-based virtual bass systems Active EP2907324B1 (de)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US13/652,023 US8971551B2 (en) 2009-09-18 2012-10-15 Virtual bass synthesis using harmonic transposition
PCT/EP2013/070262 WO2014060204A1 (en) 2012-10-15 2013-09-27 System and method for reducing latency in transposer-based virtual bass systems

Publications (2)

Publication Number Publication Date
EP2907324A1 EP2907324A1 (de) 2015-08-19
EP2907324B1 true EP2907324B1 (de) 2016-11-09

Family

ID=49293633

Family Applications (2)

Application Number Title Priority Date Filing Date
EP13771123.0A Active EP2907324B1 (de) 2012-10-15 2013-09-27 System and method for reducing latency in transposer-based virtual bass systems
EP13188415.7A Active EP2720477B1 (de) 2012-10-15 2013-10-14 Virtual bass synthesis using harmonic transposition

Family Applications After (1)

Application Number Title Priority Date Filing Date
EP13188415.7A Active EP2720477B1 (de) 2012-10-15 2013-10-14 Virtual bass synthesis using harmonic transposition

Country Status (4)

Country Link
EP (2) EP2907324B1 (de)
JP (1) JP5894347B2 (de)
CN (1) CN104704855B (de)
WO (1) WO2014060204A1 (de)


Also Published As

Publication number Publication date
JP5894347B2 (ja) 2016-03-30
JP2015531575A (ja) 2015-11-02
WO2014060204A1 (en) 2014-04-24
EP2720477B1 (de) 2016-03-02
CN104704855A (zh) 2015-06-10
CN104704855B (zh) 2016-08-24
EP2720477A1 (de) 2014-04-16
EP2907324A1 (de) 2015-08-19


Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20150515

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

AX Request for extension of the european patent

Extension state: BA ME

DAX Request for extension of the european patent (deleted)
RIC1 Information provided on ipc code assigned before grant

Ipc: H04R 3/04 20060101AFI20160428BHEP

Ipc: G10L 21/038 20130101ALI20160428BHEP

GRAP Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOSNIGR1

INTG Intention to grant announced

Effective date: 20160607

GRAS Grant fee paid

Free format text: ORIGINAL CODE: EPIDOSNIGR3

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

REG Reference to a national code

Ref country code: GB

Ref legal event code: FG4D

REG Reference to a national code

Ref country code: AT

Ref legal event code: REF

Ref document number: 844866

Country of ref document: AT

Kind code of ref document: T

Effective date: 20161115

Ref country code: CH

Ref legal event code: EP

REG Reference to a national code

Ref country code: IE

Ref legal event code: FG4D

REG Reference to a national code

Ref country code: DE

Ref legal event code: R096

Ref document number: 602013013882

Country of ref document: DE

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: LV

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20161109

REG Reference to a national code

Ref country code: LT

Ref legal event code: MG4D

REG Reference to a national code

Ref country code: NL

Ref legal event code: MP

Effective date: 20161109

REG Reference to a national code

Ref country code: AT

Ref legal event code: MK05

Ref document number: 844866

Country of ref document: AT

Kind code of ref document: T

Effective date: 20161109

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: NO

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20170209

Ref country code: LT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20161109

Ref country code: NL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20161109

Ref country code: GR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20170210

Ref country code: SE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20161109

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: FI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20161109

Ref country code: PL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20161109

Ref country code: AT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20161109

Ref country code: HR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20161109

Ref country code: RS

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20161109

Ref country code: ES

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20161109

Ref country code: PT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20170309

Ref country code: IS

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20170309

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: SK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20161109

Ref country code: DK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20161109

Ref country code: EE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20161109

Ref country code: RO

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20161109

Ref country code: CZ

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20161109

REG Reference to a national code

Ref country code: DE

Ref legal event code: R097

Ref document number: 602013013882

Country of ref document: DE

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: SM

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20161109

Ref country code: BE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20161109

Ref country code: IT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20161109

Ref country code: BG

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20170209

PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an EP patent application or granted EP patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

REG Reference to a national code

Ref country code: FR

Ref legal event code: PLFP

Year of fee payment: 5

26N No opposition filed

Effective date: 20170810

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: SI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20161109

REG Reference to a national code

Ref country code: CH

Ref legal event code: PL

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: MC

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20161109

REG Reference to a national code

Ref country code: IE

Ref legal event code: MM4A

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: LU

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20170927

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: IE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20170927

Ref country code: LI

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20170930

Ref country code: CH

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20170930

REG Reference to a national code

Ref country code: FR

Ref legal event code: PLFP

Year of fee payment: 6

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: MT

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20170927

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: HU

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; INVALID AB INITIO

Effective date: 20130927

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: CY

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20161109

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: MK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20161109

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: TR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20161109

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: AL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20161109

REG Reference to a national code

Ref country code: FR

Ref legal event code: PLFP

Year of fee payment: 10

REG Reference to a national code

Ref country code: DE

Ref legal event code: R081

Ref document number: 602013013882

Country of ref document: DE

Owner name: DOLBY INTERNATIONAL AB, IE

Free format text: FORMER OWNER: DOLBY INTERNATIONAL AB, AMSTERDAM, NL

Ref country code: DE

Ref legal event code: R081

Ref document number: 602013013882

Country of ref document: DE

Owner name: DOLBY INTERNATIONAL AB, NL

Free format text: FORMER OWNER: DOLBY INTERNATIONAL AB, AMSTERDAM, NL

REG Reference to a national code

Ref country code: DE

Ref legal event code: R081

Ref document number: 602013013882

Country of ref document: DE

Owner name: DOLBY INTERNATIONAL AB, IE

Free format text: FORMER OWNER: DOLBY INTERNATIONAL AB, DP AMSTERDAM, NL

P01 Opt-out of the competence of the Unified Patent Court (UPC) registered

Effective date: 20230512

PGFP Annual fee paid to national office [announced via postgrant information from national office to EPO]

Ref country code: GB

Payment date: 20230823

Year of fee payment: 11

PGFP Annual fee paid to national office [announced via postgrant information from national office to EPO]

Ref country code: FR

Payment date: 20230822

Year of fee payment: 11

Ref country code: DE

Payment date: 20230822

Year of fee payment: 11