WO2011039413A1 - Apparatus - Google Patents

Apparatus

Info

Publication number: WO2011039413A1
Application number: PCT/FI2010/050709
Authority: WIPO (PCT)
Prior art keywords: difference, audio, group, audio signals, threshold
Other languages: English (en)
Inventor: Ravi Rangnath Shenoy
Original Assignee: Nokia Corporation
Application filed by Nokia Corporation
Priority to EP10819956.3A (EP2484127B1)
Priority to CN201080044113.1A (CN102550048B)
Publication of WO2011039413A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S 5/00 Pseudo-stereo systems, e.g. in which additional channel signals are derived from monophonic signals by means of phase shifting, time delay or reverberation
    • H04S 2400/00 Details of stereophonic systems covered by H04S but not provided for in its groups
    • H04S 2400/05 Generation or adaptation of centre channel in multi-channel audio systems
    • H04S 2420/00 Techniques used in stereophonic systems covered by H04S but not provided for in its groups
    • H04S 2420/01 Enhancing the perception of the sound image or of the spatial distribution using head related transfer functions [HRTF's] or equivalents thereof, e.g. interaural time difference [ITD] or interaural level difference [ILD]
    • H04S 2420/03 Application of parametric coding in stereophonic audio systems
    • H04S 2420/07 Synergistic effects of band splitting and sub-band processing

Definitions

  • the present invention relates to apparatus for processing of audio signals.
  • the invention further relates to, but is not limited to, apparatus for processing audio and speech signals in audio playback devices.
  • Audio rendering and sound virtualization have been growing areas in recent years. There are several different playback techniques, such as mono, stereo, 5.1 surround and ambisonics.
  • Apparatus, signal processing integrated within apparatus, or signal processing performed prior to the final playback apparatus has been designed to allow a virtual sound image to be created in many applications such as music playback, movie sound tracks, 3D audio, and gaming applications.
  • The standard for commercial audio content until recently, for music or movies, was stereo audio signal generation. Signals from different musical instruments, speech or voice, and other audio sources creating the sound scene were combined to form a stereo signal.
  • Commercially available playback devices would typically have two loudspeakers placed at a suitable distance in front of the listener. The goal of stereo rendering was limited to creating phantom images at a position between the two speakers; this is known as panned stereo.
  • the same content could be played on portable playback devices as well, as playback relied on headphones or earplugs, which use two channels.
  • stereo widening and 3D audio applications have recently become more popular, especially for portable devices with audio playback capabilities. There are various techniques for these applications that provide the user with a spatial feeling and 3D audio content. The techniques employ various signal processing algorithms and filters. It is known that the effectiveness of spatial audio is stronger over headphone playback.
  • An example of a 5.1 multichannel system is shown in Figure 2 where the user 211 is surrounded by a front left channel speaker 251, a front right channel speaker 253, a centre channel speaker 255, a left surround channel speaker 257 and a right surround channel speaker 259. Phantom images can be created using this type of setup lying anywhere on the circle 271 as shown in Figure 2.
  • a channel in multichannel audio is not necessarily unique. Audio signals for one channel after frequency dependent phase shifts and magnitude modifications can become the audio signal for a different channel. This in a way helps to create phantom audio sources around the listener leading to a surround sound experience. However such equipment is expensive and many end users do not have the multi-loudspeaker equipment for replaying the multichannel audio content.
  • the multichannel audio signals are matrix downmixed.
  • the original multi-channel content is no longer available in its component form (each component being one channel of, say, 5.1). All of the channels from 5.1 are present in the down-mixed stereo.
  • the phantom images lie on an imaginary line joining the left and right ears. This line is known as the interaural axis and the experience is often called inside-the-head feeling or lateralization.
  • a user would not experience an audio source that is localized inside their head.
  • prolonged listening to this form of stereo audio over headphones leads to listener fatigue.
  • users perceive a limited surround feeling.
  • Such re-synthesis typically involves an upmixing of the stereo-signal to extract additional channel audio signals.
  • centre channel extraction is important as the centre channel could be speech/vocal audio signals, specific musical instruments or both.
  • Each of these extracted audio signals may be then virtualized to different virtual locations.
  • a virtualizer typically introduces frequency dependent relative delays and amplification or attenuation to the signals before the signals are sent to the headphone speakers. The introduction of typical virtualization would pan certain sources away from the mid plane, where the user does not have any control over how loud or quiet these sources are.
  • the user may be interested in a vocalist located centre stage rather than the audience located off-centre stage, and in the stereo audio signals the background noise from the audience may easily mask the key sections of the vocalist.
  • the sources that appear to be originating from the centre can often be at higher or lower audio levels relative to the rest of the sources in the audio scene. Listeners typically do not have any control over this level and often want to amplify or attenuate these central sources depending on their perceptual preference. Lack of this feature often results in a poor audio experience.
  • This invention proceeds from the consideration that prior art solutions for centre channel extraction do not produce good quality centre channel audio signals. Thus listening to centre channel audio signals produces a poor listening experience. Furthermore the poor quality centre channel audio signals produce poor quality listening experiences when virtualized.
  • Embodiments of the present invention aim to address the above problem.
  • a method comprising: filtering at least two audio signals to generate at least two groups of audio components per audio signal; determining a difference between the at least two audio signals for each group of audio components; and generating a further audio signal by selectively combining the at least two audio signals for each group of audio components dependent on the difference between the at least two audio signals for each group of audio components.
  • Filtering the at least two audio signals may comprise filtering the at least two audio signals into at least one of: overlapping frequency range groups; abutting frequency range groups; linear interval frequency range groups; and non-linear interval frequency range groups.
  • Determining the difference between the at least two audio signals may comprise: determining a first difference for a first group with a frequency range below a frequency threshold; and determining a second difference for a second group with a frequency range above the frequency threshold.
  • the difference may comprise at least one of: Interaural level difference value; Interaural phase difference value; and Interaural time difference value.
  • Selectively combining the at least two audio signals for each group of audio components may further comprise: associating a gain function for each group of audio components by comparing the difference between the at least two audio signals for the group of audio components against at least one difference threshold for the group; multiplying each audio signal for the group by the associated gain value for the group; and combining the products of the audio signal for the group and the associated gain value for the group.
  • Associating a gain function may further comprise: associating a first gain function for each group of audio components with a difference less than a first difference threshold; associating a second gain function for each group of audio components with a difference greater than or equal to the first difference threshold and less than a second difference threshold; and associating a third gain function for each group of audio components with a difference greater than or equal to the second difference threshold.
  • the method may further comprise determining the at least one difference threshold dependent on at least one of: a measured head related transfer function; a measured head related impulse response; a chosen head related transfer function; a chosen head related impulse response; a modified head related transfer function; and a modified head related impulse response.
  • an apparatus comprising at least one processor and at least one memory including computer program code the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus at least to perform: filtering at least two audio signals to generate at least two groups of audio components per audio signal; determining a difference between the at least two audio signals for each group of audio components; and generating a further audio signal by selectively combining the at least two audio signals for each group of audio components dependent on the difference between the at least two audio signals for each group of audio components.
  • Filtering the at least two audio signals may cause the apparatus at least to perform filtering the at least two audio signals into at least one of: overlapping frequency range groups; abutting frequency range groups; linear interval frequency range groups; and non-linear interval frequency range groups.
  • Determining the difference between the at least two audio signals may cause the apparatus at least to perform: determining a first difference for a first group with a frequency range below a frequency threshold; and determining a second difference for a second group with a frequency range above the frequency threshold.
  • the difference may comprise at least one of: Interaural level difference value; Interaural phase difference value; and Interaural time difference value.
  • Selectively combining the at least two audio signals for each group of audio components may cause the apparatus at least to perform: associating a gain function for each group of audio components by comparing the difference between the at least two audio signals for the group of audio components against at least one difference threshold for the group; multiplying each audio signal for the group by the associated gain value for the group; and combining the products of the audio signal for the group and the associated gain value for the group.
  • Associating a gain function may further cause the apparatus at least to perform: associating a first gain function for each group of audio components with a difference less than a first difference threshold; associating a second gain function for each group of audio components with a difference greater than or equal to the first difference threshold and less than a second difference threshold; and associating a third gain function for each group of audio components with a difference greater than or equal to the second difference threshold.
  • the at least one processor and at least one memory may further cause the apparatus at least to perform determining the at least one difference threshold dependent on at least one of: a measured head related transfer function; a measured head related impulse response; a chosen head related transfer function; a chosen head related impulse response; a modified head related transfer function; and a modified head related impulse response.
  • an apparatus comprising: at least one filter configured to filter at least two audio signals to generate at least two groups of audio components per audio signal; a comparator configured to determine a difference between the at least two audio signals for each group of audio components; and a signal combiner configured to generate a further audio signal by selectively combining the at least two audio signals for each group of audio components dependent on the difference between the at least two audio signals for each group of audio components.
  • the at least one filter may further be configured to filter the at least two audio signals into at least one of: overlapping frequency range groups; abutting frequency range groups; linear interval frequency range groups; and non-linear interval frequency range groups.
  • the comparator may further be configured to determine a first difference for a first group with a frequency range below a frequency threshold; and determine a second difference for a second group with a frequency range above the frequency threshold.
  • the difference may comprise at least one of: Interaural level difference value; Interaural phase difference value; and Interaural time difference value.
  • the signal combiner may further comprise: a gain determiner configured to determine a gain for each group of audio components by comparing the difference between the at least two audio signals for the group of audio components against at least one difference threshold for the group; at least one amplifier configured to multiply each audio signal for the group by the associated gain value for the group; and at least one adder configured to combine the products of the audio signal for the group and the associated gain value for the group.
  • the gain determiner may further be configured to: associate a first gain function for each group of audio components with a difference less than a first difference threshold; associate a second gain function for each group of audio components with a difference greater than or equal to the first difference threshold and less than a second difference threshold; and associate a third gain function for each group of audio components with a difference greater than or equal to the second difference threshold.
  • the apparatus may further comprise a threshold determiner configured to determine the at least one difference threshold dependent on at least one of: a measured head related transfer function; a measured head related impulse response; a chosen head related transfer function; a chosen head related impulse response; a modified head related transfer function; and a modified head related impulse response.
  • a computer-readable medium encoded with instructions that, when executed by a computer, perform: filtering at least two audio signals to generate at least two groups of audio components per audio signal; determining a difference between the at least two audio signals for each group of audio components; and generating a further audio signal by selectively combining the at least two audio signals for each group of audio components dependent on the difference between the at least two audio signals for each group of audio components.
  • an apparatus comprising: filtering means for filtering at least two audio signals to generate at least two groups of audio components per audio signal; comparator means for determining a difference between the at least two audio signals for each group of audio components; and combining means for generating a further audio signal by selectively combining the at least two audio signals for each group of audio components dependent on the difference between the at least two audio signals for each group of audio components.
  • An electronic device may comprise apparatus as described above.
  • a chipset may comprise apparatus as described above.
  • Figure 1 shows schematically an electronic device employing embodiments of the application
  • Figure 2 shows schematically a 5 channel audio system configuration
  • Figure 3 shows schematically a stereo to multichannel up-mixer according to some embodiments of the application
  • Figure 4 shows schematically a centre channel extractor as shown in Figure 3 according to some embodiments of the application
  • Figure 5 shows schematically the centre channel extractor as shown in Figures 3 and 4 in further detail
  • Figure 6 shows a flow diagram illustrating the operation of the centre channel extractor according to embodiments of the application
  • Figure 7 shows schematically a Euclidean difference distance showing the first and second threshold distances as used in embodiments of the application
  • Figures 8a and 8b show graphically head related transfer functions across frequencies for specific azimuth angles for use in determining first and second threshold values according to some embodiments
  • Figures 9a and 9b show graphically head related transfer functions across azimuth positions for specific frequencies for use in determining first and second threshold values according to some embodiments
  • Figures 10a and 10b show graphically perception beam determinations for frequencies according to some embodiments
  • Figure 11 shows schematically a pre-processing stage for the left channel audio signal according to some embodiments of the application
  • Figure 12 shows schematically a section of the centre channel extractor for some further embodiments.
  • Figure 1 shows a schematic block diagram of an exemplary electronic device 10 or apparatus, which may incorporate a centre channel extractor.
  • the centre channel extracted by the centre channel extractor in some embodiments is suitable for an up-mixer.
  • the electronic device 10 may for example be a mobile terminal or user equipment for a wireless communication system.
  • the electronic device may be a Television (TV) receiver, portable digital versatile disc (DVD) player, or audio player such as an iPod.
  • the electronic device 10 comprises a processor 21 which may be linked via a digital-to-analogue converter 32 to a headphone connector for receiving a headphone or headset 33.
  • the processor 21 is further linked to a transceiver (TX/RX) 13, to a user interface (UI) 15 and to a memory 22.
  • the processor 21 may be configured to execute various program codes.
  • the implemented program codes comprise a channel extractor for extracting a centre channel audio signal from a stereo audio signal.
  • the implemented program codes 23 may be stored for example in the memory 22 for retrieval by the processor 21 whenever needed.
  • the memory 22 could further provide a section 24 for storing data, for example data that has been processed in accordance with the embodiments.
  • the channel extracting code may in embodiments be implemented in hardware or firmware.
  • the user interface 15 enables a user to input commands to the electronic device 10, for example via a keypad, and/or to obtain information from the electronic device 10, for example via a display.
  • the transceiver 13 enables a communication with other electronic devices, for example via a wireless communication network.
  • the apparatus 10 may in some embodiments further comprise at least two microphones for inputting audio or speech that is to be processed according to embodiments of the application or transmitted to some other electronic device or stored in the data section 24 of the memory 22.
  • a corresponding application to capture stereo audio signals using the at least two microphones may be activated to this end by the user via the user interface 15.
  • the apparatus 10 in such embodiments may further comprise an analogue-to-digital converter configured to convert the input analogue audio signal into a digital audio signal and provide the digital audio signal to the processor 21.
  • the apparatus 10 may in some embodiments also receive a bit stream with correspondingly encoded stereo audio data from another electronic device via the transceiver 13.
  • the processor 21 may execute the channel extraction program code stored in the memory 22.
  • the processor 21 in these embodiments may process the received stereo audio signal data, and output the extracted channel data.
  • the headphone connector 33 may be configured to communicate to a headphone set or earplugs wirelessly, for example by a Bluetooth profile, or using a conventional wired connection.
  • the received stereo audio data may in some embodiments also be stored, instead of being processed immediately, in the data section 24 of the memory 22, for instance for enabling a later processing and presentation or forwarding to still another electronic device.
  • Figure 3 shows in further detail an up-mixer 106 suitable for the implementation of some embodiments of the application.
  • the up-mixer is configured to receive a stereo audio signal and generate a left channel audio signal L", a centre channel audio signal C and a right channel audio signal R".
  • the up-mixer 106 is configured to receive the left channel audio signal at the left input 451 and the right channel audio signal at the right input 453.
  • the up-mixer 106 furthermore comprises a centre channel extractor 455 which receives the left channel audio signal L and the right channel audio signal R and generates a centre channel audio signal C.
  • the centre channel audio signal C is in some embodiments furthermore passed to a first amplifier 461 which applies a gain A1 to the signal and outputs the amplified signal to the left channel modifier 465.
  • the left channel audio signal L is further passed to a left channel filter 454 which applies a delay to the audio signal substantially equal to the time required to generate the centre channel audio signal C.
  • the left channel filter 454 in some embodiments may be implemented by an all pass filter.
  • the filtered left channel audio signal is passed to the left channel modifier 465.
  • the left channel modifier 465 is configured to subtract the amplified centre channel audio signal A1C from the filtered left channel audio signal to generate a modified left channel audio signal L'.
  • the modified left channel audio signal in some embodiments is passed to the left channel amplifier 487.
  • the centre channel audio signal C is furthermore in some embodiments passed to a second amplifier 463 which applies a gain A2 to the signal and outputs the amplified signal to the right channel modifier 467.
  • the right channel audio signal R is further passed to a right channel filter 456 which applies a delay to the audio signal substantially equal to the time required to generate the centre channel audio signal C.
  • the right channel filter 456 in some embodiments may be implemented by an all pass filter.
  • the filtered right channel audio signal is passed to the right channel modifier 467.
  • the right channel modifier 467 is configured to subtract the amplified centre channel audio signal A2C from the filtered right channel audio signal to generate a modified right channel audio signal R'.
  • the modified right channel audio signal in some embodiments is passed to the right channel amplifier 491.
  • the left channel amplifier 487 in some embodiments is configured to receive the modified left channel audio signal L', amplify the modified left channel audio signal and output the amplified left channel signal L".
  • the up-mixer 106 furthermore is configured in some embodiments to comprise a centre channel amplifier 489 configured to receive the centre channel audio signal C, amplify the centre channel audio signal and output an amplified centre channel signal C'.
  • the up-mixer 106 in the same embodiments comprises a right channel amplifier 491 configured to receive the modified right channel audio signal R', amplify the modified right channel audio signal and output the amplified right channel signal R".
  • the gain of the left channel amplifier 487, centre channel amplifier 489 and right channel amplifier 491 in some embodiments may be determined by the user for example using the user interface 15 so as to control the importance of the 'centre' stage audio components with respect to the 'left' and 'right' stage audio components.
  • the user may control the gain of the 'centre' over the 'left' and 'right' components so that the user may emphasise the vocalist over the instruments or audience audio components according to the earlier examples.
  • the gains may be controlled or determined automatically or semi-automatically. For example such embodiments may be implemented for applications such as Karaoke.
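  • The up-mixer arithmetic described above can be sketched directly in code. The following Python fragment is a minimal illustration only, not the patent's implementation; the function and parameter names (upmix, extract_centre, a1, a2, the g_* gains and the delay argument) are assumptions, and a simple sample shift stands in for the all pass filters 454 and 456:

        import numpy as np

        def upmix(left, right, extract_centre, a1=0.5, a2=0.5,
                  g_left=1.0, g_centre=1.0, g_right=1.0, delay=0):
            """Sketch of the Figure 3 up-mixer: derive C, then L'', C' and R''."""
            c = extract_centre(left, right)    # centre channel extractor 455
            left_d = np.roll(left, delay)      # delayed L (stands in for filter 454)
            right_d = np.roll(right, delay)    # delayed R (stands in for filter 456)
            l_mod = left_d - a1 * c            # left channel modifier 465: L' = L - A1*C
            r_mod = right_d - a2 * c           # right channel modifier 467: R' = R - A2*C
            # amplifiers 487, 489 and 491 apply the user controlled gains
            return g_left * l_mod, g_centre * c, g_right * r_mod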
  • the extraction of the centre channel uses both magnitude and phase information for lower frequency components and magnitude information only for higher frequencies. More specifically the centre channel extractor 455 in some embodiments uses frequency dependent magnitude and phase difference information between the stereo signals and compares this information against the user's interaural level difference (ILD) and interaural phase difference (IPD) to decide whether the signal is located at the centre, i.e. in the median plane (the vertical plane passing through the midpoint between the two ears and the nose).
  • the proposed method can in some embodiments be customized according to the user's own head related transfer function. It can in some other embodiments be used to extract sources in the median plane for a binaurally recorded signal.
  • the methods and apparatus as described hereafter may extract the centre channel using at least one of the interaural level difference (ILD), the interaural phase difference (IPD) and the interaural time difference (ITD).
  • the selection of the at least one difference used may differ according to the frequency being analysed. In the example described above and in the following, there is a first selection for a first frequency range, where the interaural level difference and interaural phase difference are used at a low frequency range, and a second selection where only the interaural level difference is used at a higher frequency range.
  • the centre channel audio signal components are present in both the left and right stereo audio signals where the components have the same intensity and zero delay i.e. no phase difference.
  • When listening over headphones, a listener would perceive this sound to be on the median plane (the vertical plane passing through the midpoint of the two ears and the nose). The absence of finer frequency specific cues would mean that a listener would often perceive this signal at the centre of the head. In other words the listener may not be able to determine whether the signal is at the front or back or up or down on that plane.
  • the stereo right channel audio signal does not contain any significant components of the front left channel audio signal. As a result, the user perceives this signal to be at the left ear.
  • the principle of how to identify the centre channel components for extraction from such a downmixed stereo audio signal is to determine a selection of at least one of the ITD, IPD and ILD in the stereo signal and compare the ITD, IPD and ILD values to accustomed ILD, IPD and ITD values in order to evaluate the direction.
  • This approach may be termed as Perceptual Basis from here on.
  • For sources perceived at the centre, the overall level difference is minimal, there should be minimal interaural time delay (in other words the ITD is small), and furthermore minimal interaural phase delay (in other words the IPD is small).
  • the analysis may be carried out on a time domain basis for example where ITD is the selected difference and in some other embodiments on a spectral domain basis.
  • the spatial analysis may in some embodiments be done on a frequency sub-band basis.
  • the analysis may employ time domain analysis such that in these other embodiments, instead of calculating the relative phase, the time difference between the envelopes of signal pairs in the time domain is calculated.
  • the frequency sub-band based analysis is in some embodiments based on the superimposition of signals from all the sources in that given frequency band.
  • the extraction in some embodiments uses the differences in different frequency sub-bands (such as level, time or phase differences or a selection or combination of differences) to estimate the direction of the source in that frequency sub band.
  • the net differences are compared to the differences (ILD, IPD and ITD cues) that are unique to that particular listener. These values may be obtained from the Head Related Transfer Function (HRTF) for that particular listener.
  • more than one of the cues may be used to estimate the source direction in the lower frequency ranges (<1.5 kHz) but a single cue (for example the ILD, or in other embodiments the ITD) may be the dominant cue at a higher frequency range (>1.5 kHz).
  • the determination of the use of a dominant cue such as the use of the ILD for higher frequency ranges in some embodiments is because a high frequency source signal may see multiple phase wraparounds before reaching the contralateral ear.
  • a crude or basic estimator for the centre channel is 0.5 * (L(n)+R(n)). This average of samples in time domain may perfectly preserve the original centre channel, but all of the remaining channels may also leak into the extracted centre channel. This leakage may be controlled in some embodiments by applying frequency specific gating or gains.
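  • As an illustration of the difference between the crude estimate and the gated estimate described above, consider the following Python sketch (all names are illustrative; the per sub-band gains are assumed to come from the comparison described later):

        import numpy as np

        def crude_centre(left, right):
            # Crude estimator 0.5*(L(n)+R(n)): preserves the true centre channel
            # but lets every other channel leak in at half amplitude.
            return 0.5 * (left + right)

        def gated_centre(left_bands, right_bands, gains):
            # Frequency specific gating: weight each sub-band combination by a
            # gain in [0, 1] before summing, suppressing non-centre leakage.
            return sum(g * 0.5 * (l + r)
                       for g, l, r in zip(gains, left_bands, right_bands))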
  • a weighting may be applied to the components of the band or sub-band to prevent leakage of non-centre components into the extracted centre channel audio signal.
  • a beam pattern may be formed to gate or filter unwanted leakage from other channels. This may be considered to be forming a perceptual beam pattern to allow signals that are located in the median plane.
  • the centre channel extractor 455 may receive the left channel audio signal L and the right channel audio signal R.
  • the audio signals may be described with respect to a time and thus at time n the left channel audio signal may be labelled as L(n) and the right channel audio signal may be labelled as R(n).
  • This operation of receiving the left and right channel audio signals may be shown with respect to Figure 6 in step 651.
  • the centre channel extractor 455 may comprise a sub-band generator 601 which is configured to receive the left and right channel audio signals and output for each channel a number of frequency sub-band signals.
  • the number of sub-bands may be N+1 and thus the output of the sub-band generator 601 comprises N+1 left channel sub-band audio signals L0(n),...,LN(n) and N+1 right channel sub-band audio signals R0(n),...,RN(n).
  • the frequency range for each sub-band may be any suitable frequency division design.
  • the sub-bands in some embodiments may be regular whilst in some other embodiments the sub-bands may be determined according to psychoacoustical principles.
  • In some embodiments the sub-bands may have overlapping frequency ranges; in some other embodiments at least some sub-bands may have abutting or separated frequency ranges.
  • the sub-band generator is shown as a filterbank comprising a pair of first filters 603 (one left channel low pass filter 603L and one right channel low pass filter 603R) with a cut-off frequency of 150 Hz, a pair of second filters 605 (one left channel band pass filter 605L and one right channel band pass filter 605R) with a centre frequency of 200 Hz and a bandwidth of 150 Hz, a pair of third filters 607 (one left channel band pass filter 607L and one right channel band pass filter 607R) with a centre frequency of 400 Hz and a bandwidth of 200 Hz, and so on to a pair of N+1th filters 609 (one left channel band pass filter 609L and one right channel band pass filter 609R) with a centre frequency of 2500 Hz and a bandwidth of 500 Hz.
  • Any suitable filter design may be used in embodiments of the application to implement the filters.
  • gammatone or gammachirp filterbank models, which are particularly suitable filterbank models of the human hearing system, may be used.
  • a suitable finite impulse response (FIR) filter design may be used to generate the sub-bands.
  • the filtering process may be configured in some embodiments to be carried out in the frequency domain, and thus the sub-band generator 601 may in these embodiments comprise a time to frequency domain converter, frequency domain filtering, and a frequency to time domain converter. The operation of generating sub-bands is shown in Figure 6 by step 653.
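  • A minimal sub-band generator along the lines of the filterbank described above might look as follows in Python. The Butterworth designs, the filter order, the example band edges and the scipy usage are illustrative assumptions; the description equally allows gammatone, gammachirp or FIR designs:

        import numpy as np
        from scipy.signal import butter, sosfilt

        FS = 44100  # assumed sampling rate in Hz

        # (low_edge, high_edge) pairs approximating the example bands: a low
        # pass to 150 Hz, 200 Hz centre / 150 Hz bandwidth, 400 Hz centre /
        # 200 Hz bandwidth, ..., 2500 Hz centre / 500 Hz bandwidth.
        BANDS = [(None, 150.0), (125.0, 275.0), (300.0, 500.0), (2250.0, 2750.0)]

        def split_subbands(x, fs=FS, bands=BANDS, order=4):
            """Return a list of band limited versions of x, one per sub-band."""
            out = []
            for lo, hi in bands:
                if lo is None:  # the first band is a low pass filter
                    sos = butter(order, hi / (fs / 2), btype="low", output="sos")
                else:
                    sos = butter(order, [lo / (fs / 2), hi / (fs / 2)],
                                 btype="band", output="sos")
                out.append(sosfilt(sos, x))
            return out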
  • the centre channel extractor 455 in some embodiments may further comprise a gain determiner 604.
  • the gain determiner 604 is configured in some embodiments to receive the left and right channel sub-band audio signals from the sub-band generator 601 and determine a gain function value to be passed to a combined signal amplifier 610.
  • the gain determiner 604 for clarity reasons is partially shown as separate gain determiner apparatus for the first sub-band (a first sub-band gain determiner 604_0) and the N+1th sub-band (an N+1th sub-band gain determiner 604_N).
  • This separation of the gain determination into sub-band apparatus allows the gain determination to be carried out in parallel or substantially in parallel.
  • the same operation may be carried out serially for each sub-band in some embodiments of the application and as such may employ a number of separate sub-band gain determiner apparatus fewer than the number of sub-bands.
  • the gain determiner 604 in some embodiments may comprise a gain estimator 633 and a threshold determiner 614.
  • the gain estimator 633 in some embodiments receives the left and right channel sub-band audio signal values, and the threshold values for each sub-band from the threshold determiner 614 and determines the gain function value for each sub-band.
  • the threshold determiner 614 is configured in some embodiments to generate the threshold values for each sub-band. In some embodiments the threshold determiner generates or stores two thresholds for each sub-band, a lower threshold value threshold1 and a higher threshold value threshold2. The thresholds generated for each sub-band, such as threshold1 and threshold2, may be generated based on the listener's head related transfer function (HRTF). In some embodiments the HRTF for the specific listener may be determined using any suitable method for determining the HRTF. For example in some embodiments the HRTF may be generated by selecting a suitable HRTF from the Centre for Image Processing and Integrated Computing (CIPIC) database or any suitable HRTF database.
  • a suitable HRTF may be retrieved from an earlier determined HRTF for a user, determined using an HRTF measuring device.
  • the threshold determiner 614 generates sub-band threshold values dependent on an idealized or modelled HRTF function such as a dummy head model HRTF.
  • Figure 8a shows a sample HRTF for the left and right ears for frequencies from 20 Hz to 20 kHz for an azimuth of 0 degrees, in other words with a source directly in front of the listener. From this plot, it can be seen that the interaural level differences (ILD) for most of the frequencies up to about 5 kHz are less than 6 dB. This would be true for sources that are directly in front of the listener.
  • Figure 8b shows for the same listener a sample HRTF for the left and right ears for frequencies from 20 Hz to 20 kHz for a source azimuth of -65 degrees. The level differences in this example are now much greater at higher frequencies.
  • Figures 9a and 9b show the signal level HRTF for the left and right ears for 200 Hz and 2 kHz signals for a sample listener, for different azimuth angles all around the listener.
  • In order to determine threshold values such that the centre channel extractor may perceive a signal to be in the median plane (0 or 180 degrees azimuth), the threshold determiner 614 may have to determine threshold values within which the left and right levels of a stereo signal (in other words the difference between the two traces for that azimuth angle) are very close, at lower as well as at higher frequencies.
  • This closeness metric is a function of frequency and tolerance around the intended azimuth angle (e.g. +/- 15 degrees from 0 degree azimuth).
  • phase differences may in some embodiments also be checked at lower frequencies and limits can be established.
  • the threshold values generated by the threshold determiner thus specify the differences allowed between the left and right channels to enable the extraction of the centre channel for each frequency band.
  • the selected or generated HRTF may be associated with a number of predetermined threshold values for each sub-band.
  • the thresholds may be determined by determining the ILD between the left and right HRTF for the user at +/- 15 degree range from the centre.
  • the thresholds may be determined by examining the total power in a frequency band or sub-band (for example in some embodiments this may be an indicated or selected critical band). Similarly in some embodiments a band filtered Head Related Impulse Response (HRIR) may be cross correlated to determine the difference between the left and right ear responses in terms of phase/time differences. The threshold determiner 614 in these embodiments may then use these Interaural Level Difference (ILD), Interaural Time Difference (ITD) and/or Interaural Phase Difference (IPD) values to set the threshold values for each band/sub-band accordingly.
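  • A sketch of how such thresholds might be derived from band filtered head related impulse responses is given below; the +/-15 degree tolerance, the margin and all names are assumptions, with the HRIR pairs standing for measured or database (for example CIPIC) responses:

        import numpy as np

        def band_ild_db(hrir_left_band, hrir_right_band):
            # Interaural level difference from the total power of the band
            # filtered left and right head related impulse responses.
            p_l = np.sum(hrir_left_band ** 2)
            p_r = np.sum(hrir_right_band ** 2)
            return 10.0 * np.log10(p_l / p_r)

        def ild_threshold(hrirs_by_azimuth, band_filter, margin_db=1.0):
            # Worst case |ILD| over a +/-15 degree tolerance around 0 degrees
            # azimuth, plus a margin, used as the band's level threshold.
            ilds = [abs(band_ild_db(band_filter(l), band_filter(r)))
                    for az, (l, r) in hrirs_by_azimuth.items() if abs(az) <= 15]
            return max(ilds) + margin_db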
  • In embodiments where the differences selected for the lower frequency range are based on a selection of the Interaural Level Difference (ILD) values and Interaural Phase Difference (IPD) values, the HRTF or HRIR values for the ILD and IPD may be used to set the threshold values for the lower frequency range.
  • Where the difference selected for the higher frequency range is based on the Interaural Level Difference (ILD) values only, the HRTF or HRIR values for the ILD may be used to set the threshold values for the higher frequency ranges.
  • the threshold is set based on the selected difference or differences shown in the HRTF or HRIR.
  • the operation of determining threshold values for sub-bands is shown in Figure 6 by step 656.
  • the gain estimator 633 in some embodiments, and as shown in Figure 4, comprises a Discrete Fourier Transformer (DFT) computation block 606 and a coefficient comparator 608.
  • the DFT computation block 606 in some embodiments receives the left and right channel sub-band audio signal values.
  • the DFT computation block 606 generates complex frequency domain values for each sub-band for both the left and right channel.
  • any suitable time to frequency domain transformer may be used to generate complex frequency domain values, such as a discrete cosine transform (DCT), Fast Fourier Transform (FFT), or wavelet transform.
  • the DFT computation block 606 may generate the complex coefficients for each sub-band using the Goertzel algorithm. The DFT computation block 606 may thus in these embodiments compute, for each new input sample x(n),

        v_k(n) = 2*cos(2*pi*k/M) * v_k(n-1) - v_k(n-2) + x(n)

  • the DFT computation block 606 calculates the DFT coefficient by evaluating the left hand side of the following equation once:

        y_k(n) = v_k(n) - W_M^k * v_k(n-1), where W_M^k = exp(-j*2*pi*k/M)

  • Values of M and k may in some embodiments be chosen for each sub-band independently to approximately capture the frequency range of the given sub-band filter.
  • W_M^k and cos(2*pi*k/M) are constants.
  • the DFT computation block 606 in these embodiments sets the values of v_k(n-2) and v_k(n-1) to zero initially, and also resets them after every M samples. After doing the above processing for M samples, y_k(n) is the required DFT coefficient. The DFT computation block 606 computes these coefficients for all the sub-bands for both left and right channel signals.
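  • The per-sample recursion above maps directly onto code. The following Python sketch follows the standard Goertzel formulation, consistent with the equations as reconstructed here; the streaming structure and names are illustrative:

        import cmath
        import math

        def goertzel(samples, k, M):
            """Return the DFT coefficient y_k after M input samples.

            Runs v_k(n) = 2*cos(2*pi*k/M)*v_k(n-1) - v_k(n-2) + x(n) once per
            sample, then evaluates y_k(n) = v_k(n) - W_M^k * v_k(n-1) once.
            """
            coeff = 2.0 * math.cos(2.0 * math.pi * k / M)  # constant per band
            w = cmath.exp(-2j * math.pi * k / M)           # W_M^k, constant
            v1 = v2 = 0.0                                  # v_k(n-1), v_k(n-2)
            for x in samples[:M]:
                v1, v2 = coeff * v1 - v2 + x, v1
            return v1 - w * v2                             # y_k evaluated once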
  • the DFT coefficients determined by the DFT computation block 606 are complex numbers.
  • the left channel DFT coefficients are represented as HL(k), and the right channel DFT coefficients are represented as HR(k), where k represents the sub-band number.
  • the DFT coefficients are passed to the coefficient comparator 608.
  • the operation of generating the DFT coefficients is shown in Figure 6 by step 655.
  • the coefficient comparator 608 receives the DFT coefficients from the DFT computation block 606 and the threshold values for each sub-band from the threshold determiner 614 to determine the gain function value for each sub-band.
  • the coefficient comparator 608 is configured in some embodiments to determine how close the sub-band Interaural difference (for example at least one of the Interaural level difference - ILD, the Interaural time difference ITD, and the Interaural phase difference - IPD) values are with respect to the ILD, IPD and ITD values for the centre of head (front or back) localization.
  • the coefficient comparator 608 attempts to find closeness in the HL(k) and HR(k) values.
  • this 'closeness' can be measured by determining the Euclidean distance between the HL(k) and HR(k) points on the complex plane. In other embodiments other distance metrics may be applied.
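  • In code, this closeness test might look like the following sketch; normalizing by the larger magnitude is an assumption made here so that the distance reflects the interaural level and phase differences rather than the absolute signal level:

        def coefficient_distance(h_l, h_r):
            """Euclidean distance between normalized complex coefficients.

            h_l, h_r: complex sub-band DFT coefficients HL(k) and HR(k).
            """
            scale = max(abs(h_l), abs(h_r)) or 1.0  # avoid dividing by zero
            return abs(h_l / scale - h_r / scale)   # distance on the complex plane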
  • the pure phase difference value, the IPD, may be determined by calculating the minimum phase impulse response for the sub-band. For example, if a head related impulse response for the left and right channel signals is determined and converted to minimum phase impulse response form, the difference between the phase responses of the minimum phase impulse responses may be treated as the IPD value.
  • With respect to Figure 7 a graphical representation of the selection of the difference and the thresholds, for some embodiments which have selected the differences as level and phase differences, may be shown, in which an example of the normalized HL(k) value and the normalized HR(k) value are plotted on the complex plane.
  • the coefficient comparator 608 may in some embodiments determine the distance of the difference vector (or scalar) 705 for the sub-band and also compare the distance against the defined/generated threshold values for the sub-band. For example, in the embodiments described above where the differences selected for the lower frequency range are based on a selection of the Interaural Level Difference (ILD) values and Interaural Phase Difference (IPD) values, the difference is the vector difference, which is compared against a vector threshold, which may be represented by the circle from the end of one of the vectors in Figure 7.
  • Where the difference selected for the higher frequency range is based on the Interaural Level Difference (ILD) values only, the difference is the scalar difference produced by rotating one of the left or right normalized vectors onto the other vector.
  • the threshold or thresholds themselves may further be vector in nature (in other words the level difference may be weighted as more significant than the phase difference).
  • two threshold values are determined/generated and passed to the coefficient comparator 608 to be checked against the sub-band difference vector distance.
  • In some embodiments only one threshold value is determined/generated and checked against, or in some other embodiments more than two threshold values may be used.
  • the coefficient comparator 608 may determine that if the two DFT vectors HL(k) and HR(k) for a specific sub-band k are close, in other words the distance between them is less than the smaller threshold (threshold1) value, or mathematically |HL(k) - HR(k)| < threshold1, then a gain gk of 1 (0 dB) is assigned to that sub-band.
  • This is represented by a first region 721 in Figure 7.
  • the comparator 608 has determined that, as the difference values between the two channels (such as a selection of one of the ILD, IPD and ITD; for example based on a selection of the ILD and IPD values for the lower frequency range, and on the ILD values only for the higher frequency range) are small, this sub-band comprises audio information which with a high confidence level was originally centre channel audio signal.
  • The comparison operation against the first threshold value is shown in Figure 6 by step 657. Furthermore the operation of the assignment of the gain gk of 1 where the difference is less than the threshold is shown in step 659. Following step 659 the method progresses to the operation of combining the left and right channel audio signals.
  • the coefficient comparator 608 furthermore determines that if the difference between the vectors (for the IPD and ILD lower frequency range) or scalars (for the ILD only higher frequency range), shown as the two DFT vectors HL(k) and HR(k) in Figure 7, for a specific sub-band k is greater than the lower threshold (threshold1) value but less than a higher threshold (threshold2), then a gain gk which is less than 1 but greater than 0 is assigned to that sub-band.
  • This area is represented in Figure 7 by a second region 723.
  • the comparator 608 has determined that, as the difference values (such as a selection of at least one of the ILD, IPD and ITD, as seen from the vector or scalar distances between the left and right channel sub-band vector values HL and HR) between the two channels are moderate, this sub-band comprises audio information which with a moderate confidence level was originally part of the centre channel audio signal.
  • the assigned gain is a function of the difference distance and the threshold values.
  • the assigned gain may be an interpolation of a value between 0 and 1 where the assigned gain is higher the nearer the difference value is to the lower threshold value. This interpolation may in some embodiments be a linear interpolation and may in some other embodiments be a nonlinear interpolation.
  • the coefficient comparator 608 furthermore determines that if the distance of the vector (for the IPD and ILD lower frequency range) or of the scalar (for the ILD only higher frequency range) is greater than the higher threshold (threshold2) value, then the gain gk assigned for the sub-band is 0. This is represented in Figure 7 by a third region 725.
  • the comparator 608 has determined that as the difference values (such as at least one of the ILD, IPD and ITD) between the two channels are large, this sub-band comprises audio information which with a low or no confidence level was originally centre channel audio signal.
  • the comparison operation against the second, or higher, threshold value (threshold2) is shown in Figure 6 by step 661.
  • the operation of the assignment of the gain of between 1 and 0 where the difference is less than the higher threshold is shown in step 665. Following step 665 the method progresses to the operation of combining left and right channel audio signals.
  • Furthermore the operation of the assignment of the gain of 0 where the difference is greater than the higher threshold (and implicitly greater than the lower threshold) is shown in step 663. Following step 663 the method progresses to the operation of combining the left and right channel audio signals.
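  • Putting the three regions of Figure 7 together, the gain assignment might be sketched as follows; the linear interpolation in the middle region is one of the interpolation options the description allows, and the names are illustrative:

        def assign_gain(distance, threshold1, threshold2):
            """Map a sub-band difference distance to a gain g_k.

            Region 721: distance < threshold1 gives g_k = 1 (0 dB).
            Region 723: threshold1 <= distance < threshold2 gives 0 < g_k < 1.
            Region 725: distance >= threshold2 gives g_k = 0.
            """
            if distance < threshold1:
                return 1.0                # confident centre channel content
            if distance < threshold2:
                # Linear interpolation: nearer threshold1 gives a higher gain.
                return (threshold2 - distance) / (threshold2 - threshold1)
            return 0.0                    # confident non-centre content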
  • the coefficient comparator 608 may for some sub-bands compare a non-vector (scalar) difference distance against the threshold value or values.
  • In such embodiments the non-vector difference is the difference between the magnitudes of the left and right channel coefficients.
  • the magnitude or level (ILD) difference is compared against the threshold values in the same way as described above.
  • the coefficient comparator 608 determines both vector and scalar differences and selects the result dependent on the sub-band being analysed.
  • the magnitude (scalar) difference may be determined and compared for the higher frequency sub-bands and the vector (phase and level) difference values may be determined for the lower frequency sub-bands.
  • the coefficient comparator 608 may in some embodiments compare the magnitude difference against the threshold values for sub-bands in the frequency range >1500 Hz and the vector difference against the threshold values for the sub-bands in the frequency range <1500 Hz.
  • Although the examples above describe difference thresholds or 'cue' values defined by the IPD and ILD, it would be appreciated that other cues, such as the phase difference or the inter-aural time difference (ITD), where the relative time difference between the right and left signals is determined and compared against one or more time threshold values, may be used in some other embodiments.
  • the ILD and ITD differences, which would describe a vector difference, may be employed in lower frequency ranges or sub-bands, and ILD differences only, which would describe a scalar difference, in higher frequency ranges or sub-bands.
  • the differences selected may be all three of the differences IPD, ILD and ITD which define a three dimensional vector. The distance between the left and right channels may then define a three dimensional space and be tested against at least one three dimensional threshold.
  • the ILD may be employed for the whole frequency range being analysed and the IPD and ITD being selected dependent on the frequency range being analysed.
  • With respect to Figure 12 a schematic view of a gain determiner 604 configured to determine a gain based on the selection of the ILD and ITD is shown.
  • the sub-band signals for the left and right channels are passed to the cross correlator 1201 and the level difference calculator 1203.
  • the cross correlator 1201 may determine a cross correlation between the filterbank pairs, for example the cross correlation for the first band or sub-band may be determined between the output of the first band or sub-band of the left channel audio signal with the output of the first band or sub-band of the right signal.
  • the cross correlation would in these embodiments reveal a maximum peak which would occur at the time delay between the two signals, or in other words generating a result similar to the ITD which is passed to the coefficient comparator 608.
  • the group delays of each of the filtered signals may be calculated and the ITD between the right and left signals after the filterbanks be determined from these group delay values.
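  • A minimal version of the cross correlation based ITD estimate described above, under the assumption that the two sub-band signals are frames of equal length taken from the filterbank outputs:

        import numpy as np

        def itd_from_xcorr(left_band, right_band):
            """Estimate the ITD in samples for one filterbank pair.

            The lag of the maximum cross correlation peak between the left
            and right sub-band outputs approximates the time delay between
            the two signals.
            """
            xcorr = np.correlate(left_band, right_band, mode="full")
            return int(np.argmax(xcorr)) - (len(right_band) - 1)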
  • the level difference calculator 1203 may determine the magnitudes of the sub-band components, may further determine the difference between the magnitudes of the components, and may furthermore pass these values to the coefficient comparator 608.
  • the threshold determiner 614 in these embodiments may determine at least one threshold value for each of the ILD value and the ITD value. In other words two sets of thresholds are determined, received or generated, one for level and one for timing.
  • the coefficient comparator 608 may then compare the determined ITD and ILD values against the associated set of threshold values to generate the associated gain or pass value.
  • the coefficient comparator 608 in some embodiments may generate the gain values by using a look-up table. For example in the embodiments where the difference is the selection of the ITD and ILD values, a two dimensional look-up table is used with delay on one axis and level difference on the other axis. The gain is then read from the look-up table based on the input delay and level difference values for that sub-band.
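  • The two dimensional look-up described here could be sketched as follows; the bin edges and the gain table contents are purely illustrative values, not taken from the patent:

        import numpy as np

        ITD_EDGES = np.array([1.0, 3.0])   # illustrative bin edges in samples
        ILD_EDGES = np.array([1.0, 3.0])   # illustrative bin edges in dB
        GAIN_TABLE = np.array([
            [1.0, 0.5, 0.0],               # small ITD row
            [0.5, 0.25, 0.0],              # moderate ITD row
            [0.0, 0.0, 0.0],               # large ITD row
        ])

        def lookup_gain(itd, ild):
            """Read the sub-band gain from the delay/level difference table."""
            row = int(np.searchsorted(ITD_EDGES, abs(itd)))
            col = int(np.searchsorted(ILD_EDGES, abs(ild)))
            return GAIN_TABLE[row, col]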
  • one difference or cue may be used for one frequency range (or sub-band) and a second difference or cue for a different frequency range (or sub-band).
  • the ITD cue may be used for higher frequency signals because ITD is effective at higher frequencies whereas the IPD is used at lower frequencies.
  • the ITD can be thought as the time difference between the envelopes of signal pairs whereas IPD is the difference between the signal contents (in other words inside the envelope).
  • the IPD and ITD may be determined at lower frequencies.
  • any suitable combination of the IPD, ITD, and/or ILD cues may be used to determine or identify the sub-band components which may be used to generate the centre channel audio signal, by comparing the difference value against one or more threshold values.
  • the ILD may be used on sub-bands analysed above 1500 Hz
  • IPD may be used on sub-bands analysed below 1500 Hz
  • the ITD may be used for sub-bands analysed from 0 to 5000 Hz (from the viewpoint of frequency ranges this could for example be seen as a lower frequency range <1500 Hz with the IPD and ITD differences being selected and a higher frequency range >1500 Hz with the ILD and ITD being selected).
  • each of the differences may be used for different analysis ranges which may overlap or abut or be separated.
  • the IPD is selected for a first frequency range from 0 Hz to 500 Hz
  • the ITD is selected for a second frequency range from 501 Hz to 1500 Hz
  • the ILD is selected for a third frequency range from 1501 Hz to 5000 Hz.
  • two regions may be defined with different gain values.
  • two regions may be defined with one region being a pass region (i.e. the switch is on or gain is equal to one) where the cue value is less than the threshold, and the second region being a block region (i.e. the switch is off or gain is zero) where the cue value is greater than the threshold.
  • more than two thresholds would produce more than three regions.
  • the comparator 608 applies an additional 1st order low pass smoothing function to reduce any perceptible distortion because of the time varying nature of the gain, for example of the form gk(n) = alpha*gk(n-1) + (1-alpha)*gk with a smoothing coefficient 0 < alpha < 1, where the gk(n-1) value is the previous time instant output gain value for the k'th sub-band, gk is the value determined by the comparator 608, and gk(n) is the output gain value for the current time instant for the k'th sub-band.
  • the comparator 608 may apply a higher order smoothing function or any suitable smoothing function to the output gain values in order to attempt to reduce perceptible distortion.
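  • The first order smoothing described above is a one pole low pass filter on the gain trajectory; a minimal sketch, with the coefficient value being an illustrative assumption:

        def smooth_gain(g_prev, g_raw, alpha=0.9):
            """First order low pass smoothing of the time varying gain.

            Implements g_k(n) = alpha*g_k(n-1) + (1 - alpha)*g_k, damping
            abrupt gain changes that could cause audible distortion.
            """
            return alpha * g_prev + (1.0 - alpha) * g_raw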
  • the gain values are in some embodiments output to the amplifier 610.
  • the centre channel extractor 455 in some embodiments comprises a sub-band combiner 602 which receives the left and right channel sub-band audio signals and outputs combined left and right sub-band audio signals.
  • the sub-band combiner 602 is shown to comprise an array of adders. Each adder in some embodiments receives one sub-band left channel audio signal and the same sub-band right channel audio signal and outputs a combined signal for that sub-band.
  • first adder 623 adding the left and right channel audio signals for the sub-band 0
  • second adder 625 adding the left and right channel audio signals for the sub-band 1
  • third adder 627 adding the left and right channel audio signals for the sub-band 2
  • (N+1)th adder 629 adding the left and right channel audio signals for the sub-band N.
  • the fourth to Nth adders are not shown in Figure 5 for clarity reasons.
  • the combination is an averaging of the left and right channel audio signals for a specific sub-band.
  • the sub-band combiner may in these embodiments produce the following result for each sub-band k: B_k(n) = (l_k(n) + r_k(n)) / 2.
  • The process of combining the sub-band left and right channel audio signals is shown in Figure 6 by step 667.
  • the centre channel extractor 455 may further in some embodiments comprise an amplifier 610 for amplifying the combined left and right channel audio signals for each sub-band by the assigned gain value for the sub-band and outputting an amplified value of the combined audio signal to the sub-band combiner 612.
  • the amplifier 610 in some embodiments may comprise, as shown in Figure 5, an array of variable gain amplifiers where the gain is set from a control signal from the gain determiner 604.
  • a second variable gain amplifier 635 amplifying the sub-band 1 combined audio signal B_1 by the sub-band 1 assigned gain value g_1
  • a third variable gain amplifier 637 amplifying the sub-band 2 combined audio signal B_2 by the sub-band 2 assigned gain value g_2
  • an (N+1)th variable gain amplifier 639 amplifying the sub-band N combined audio signal B_N by the sub-band N assigned gain value g_N.
  • the fourth to Nth variable gain amplifiers are not shown in Figure 5 for clarity reasons.
  • the centre channel extractor 455 may further in some embodiments comprise a sub-band combiner 612.
  • the sub-band combiner 612 in some embodiments receives for each sub-band the amplified combined sub-band audio signal value and combines them to generate an extracted centre channel audio signal.
  • the sub-band combiner 612 comprises an adder 651 for summing the amplified combined sub-band audio signals. This combination may be expressed as the following equation: C(n) = Σ_{k=0}^{N} g_k(n) · B_k(n).
  • The operation of combining the sub-bands is shown in Figure 6 by step 673.
  • The difference between the basic averaging of the combined left and right channel signals and a centre channel signal extracted according to some of the embodiments is shown in Figures 10a and 10b.
  • the basic averaging of the left and the right channel signals does not reject the audio components where the signal is clearly in only the left or the right channel, and thus audio sources which originated in the right or left sound stage 'bleed' into the extracted centre channel signal.
  • the example embodiment where the allocated gain is applied to the combination of the left and right channel signals produces an extracted centre channel signal which is far less susceptible to audio signals originating from other channels.
  • centre channel extraction may in further embodiments provide further use cases so that the user may control the centre channel depending on user preferences.
  • although the centre channel extractor has been described with respect to an upmixing and virtualisation process for headphones, the centre channel extraction apparatus and method are suitable for many different audio signal processing operations.
  • the apparatus may be employed to extract, from pairs of channels, audio signals arriving from various directions relative to those channels.
  • the same centre channel extraction process may be used to extract so called unknown sources.
  • a device such as a camera, with microphones mounted on opposite sides to record stereo sound, may generate a pair of audio signals which, using the channel extraction apparatus or methods, may then produce a centre channel audio signal for presentation.
  • a centre channel signal may be determined in order to isolate an audio source located in the 'centre'.
  • the vocalist audio components may be extracted from the signal containing the instruments and audience component signals.
  • the extracted centre channel may be subtracted from the left L' and right R' channel audio signals to generate a modified left L" and right R" channel audio signal.
  • This output stereo signal would thus have the vocalist removed, rendering the resultant stereo audio signal suitable for karaoke.
  • the process and apparatus may be implemented by an electronic device (such as a mobile phone) or through a server/database.
  • the centre channel extractor 455 further comprises a pre-processor.
  • a left channel audio signal pre-processor part 1151 is shown. It would be appreciated that the pre-processor would further comprise a mirror-image right channel audio signal pre-processor part, which is not shown in order to clarify the figure.
  • the pre-processor is implemented prior to the sub-band generator 601 and the output of the pre-processor in such embodiments is input to the sub-band generator 601.
  • the pre-processor is configured to apply a pre-processing to the signal to remove some of the uncorrelated signal in the left and right channels. Therefore in some embodiments the pre-processor attempts to remove these uncorrelated signals from the left and right channel audio signals before the generation of the sub-band audio signals.
  • the left channel audio signal may be expressed as a combination of two components. These two components are a component S(n) that is coherent with the right channel audio signal and an uncorrelated component N_1(n). Similarly the right channel signal may also be expressed as a combination of two components, the coherent component S(n) and an uncorrelated component N_2(n).
  • the left channel audio signal pre-processor part 1151 comprises a Least Mean Square (LMS) processor 1109 to estimate the uncorrelated component.
  • the left channel audio signal is input into a delay 1101 with a length of T+1 and then passed to a first pre-processor combiner 1105 and a second pre-processor combiner 1107.
  • the right channel audio signal is in these embodiments input to a filter W 1103 with a length of 2T+1, whose filter parameters are controlled by the LMS processor 1109.
  • the output of the filter is furthermore in these embodiments passed to the first pre-processing combiner 1105 to generate an estimate of the uncorrelated component N_1' which is passed to the second pre-processing combiner 1107 to be subtracted from the delayed left channel audio signal in an attempt to remove the uncorrelated information.
  • the LMS processor 1109 in embodiments such as these receives both the N_1' estimate of the uncorrelated information and the right channel audio signal to choose the filter parameters such that the correlated information is output to be subtracted at the first pre-processing combiner 1105.
  • embodiments of the application perform a method comprising: filtering at least two audio signals to generate at least two groups of audio components per audio signal; determining a difference between the at least two audio signals for each group of audio components; and generating a further audio signal by selectively combining the at least two audio signals for each group of audio components dependent on the difference between the at least two audio signals for each group of audio components.
  • embodiments of the invention may operate within an electronic device 10 or apparatus.
  • the invention as described below may be implemented as part of any audio processor.
  • embodiments of the invention may be implemented in an audio processor which may implement audio processing over fixed or wired communication paths.
  • user equipment may comprise an audio processor such as those described in embodiments of the invention above.
  • the term electronic device and user equipment is intended to cover any suitable type of wireless user equipment, such as mobile telephones, portable data processing devices or portable web browsers.
  • the various embodiments of the invention may be implemented in hardware or special purpose circuits, software, logic or any combination thereof.
  • some aspects may be implemented in hardware, while other aspects may be implemented in firmware or software which may be executed by a controller, microprocessor or other computing device, although the invention is not limited thereto.
  • While various aspects of the invention may be illustrated and described as block diagrams, flow charts, or using some other pictorial representation, it is well understood that these blocks, apparatus, systems, techniques or methods described herein may be implemented in, as non-limiting examples, hardware, software, firmware, special purpose circuits or logic, general purpose hardware or controller or other computing devices, or some combination thereof.
  • At least some embodiments may be apparatus comprising: at least one filter configured to filter at least two audio signals to generate at least two groups of audio components per audio signal; a comparator configured to determine a difference between the at least two audio signals for each group of audio components; and a signal combiner configured to generate a further audio signal by selectively combining the at least two audio signals for each group of audio components dependent on the difference between the at least two audio signals for each group of audio components.
  • the embodiments of this invention may be implemented by computer software executable by a data processor of the mobile device, such as in the processor entity, or by hardware, or by a combination of software and hardware.
  • any blocks of the logic flow as in the Figures may represent program steps, or interconnected logic circuits, blocks and functions, or a combination of program steps and logic circuits, blocks and functions.
  • the software may be stored on such physical media as memory chips or memory blocks implemented within the processor, magnetic media such as hard disks or floppy disks, and optical media such as, for example, DVD and the data variants thereof, and CD.
  • At least some embodiments may be a computer-readable medium encoded with instructions that, when executed by a computer, perform: filtering at least two audio signals to generate at least two groups of audio components per audio signal; determining a difference between the at least two audio signals for each group of audio components; and generating a further audio signal by selectively combining the at least two audio signals for each group of audio components dependent on the difference between the at least two audio signals for each group of audio components.
  • the memory may be of any type suitable to the local technical environment and may be implemented using any suitable data storage technology, such as semiconductor-based memory devices, magnetic memory devices and systems, optical memory devices and systems, fixed memory and removable memory.
  • the data processors may be of any type suitable to the local technical environment, and may include one or more of general purpose computers, special purpose computers, microprocessors, digital signal processors (DSPs), application specific integrated circuits (ASIC), gate level circuits and processors based on multi-core processor architecture, as non-limiting examples.
  • Embodiments of the inventions may be practiced in various components such as integrated circuit modules.
  • the design of integrated circuits is by and large a highly automated process.
  • Complex and powerful software tools are available for converting a logic level design into a semiconductor circuit design ready to be etched and formed on a semiconductor substrate.
  • As used in this application, the term 'circuitry' refers to: (a) hardware-only circuit implementations (such as implementations in only analog and/or digital circuitry); (b) combinations of circuits and software (and/or firmware), such as (i) a combination of processor(s) or (ii) portions of processor(s)/software (including digital signal processor(s)), software, and memory(ies) that work together to cause an apparatus, such as a mobile phone or server, to perform various functions; and (c) circuits, such as a microprocessor(s) or a portion of a microprocessor(s), that require software or firmware for operation, even if the software or firmware is not physically present.
  • This definition of 'circuitry' applies to all uses of this term in this application, including any claims.
  • the term 'circuitry' would also cover an implementation of merely a processor (or multiple processors) or portion of a processor and its (or their) accompanying software and/or firmware.
  • the term 'circuitry' would also cover, for example and if applicable to the particular claim element, a baseband integrated circuit or applications processor integrated circuit for a mobile phone, or a similar integrated circuit in a server, a cellular network device, or other network device.

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Stereophonic System (AREA)

Abstract

The present invention relates to an apparatus comprising at least one processor and at least one memory including computer program code. The at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus to perform at least: filtering at least two audio signals in order to generate at least two groups of audio components per audio signal; determining a difference between the at least two audio signals for each group of audio components; and generating a further audio signal by selectively combining the at least two audio signals for each group of audio components depending on the difference between the at least two audio signals for each group of audio components.
PCT/FI2010/050709 2009-09-30 2010-09-15 Appareil WO2011039413A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP10819956.3A EP2484127B1 (fr) 2009-09-30 2010-09-15 Procédé, logiciel, et appareil pour traitement de signaux audio
CN201080044113.1A CN102550048B (zh) 2009-09-30 2010-09-15 一种用于处理音频信号的方法和装置

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN2055/DEL/2009 2009-09-30
IN2055DE2009 2009-09-30

Publications (1)

Publication Number Publication Date
WO2011039413A1 true WO2011039413A1 (fr) 2011-04-07

Family

ID=43825606

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/FI2010/050709 WO2011039413A1 (fr) 2009-09-30 2010-09-15 Appareil

Country Status (3)

Country Link
EP (1) EP2484127B1 (fr)
CN (1) CN102550048B (fr)
WO (1) WO2011039413A1 (fr)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014164361A1 (fr) * 2013-03-13 2014-10-09 Dts Llc Système et procédés pour traiter un contenu audio stéréoscopique
CN104468991A (zh) * 2014-11-24 2015-03-25 广东欧珀移动通信有限公司 一种移动终端及其音频收发方法
CN105247892A (zh) * 2013-05-31 2016-01-13 弗兰霍菲尔运输应用研究公司 用于空间选择性音频播放的设备和方法
CN108989688A (zh) * 2018-09-14 2018-12-11 成都数字天空科技有限公司 虚拟相机防抖方法、装置、电子设备及可读存储介质
EP3499915A3 (fr) * 2017-12-13 2019-10-30 Oticon A/s Dispositif auditif et système auditif binauriculaire comprenant un système de réduction de bruit binaural
US10560794B2 (en) 2017-03-07 2020-02-11 Interdigital Ce Patent Holdings Home cinema system devices
US20230292079A1 (en) * 2018-08-10 2023-09-14 Samsung Electronics Co., Ltd. Audio device and control method thereor
KR20230139847A (ko) * 2022-03-23 2023-10-06 주식회사 알머스 위치보정 기능의 이어폰 및 이를 이용하는 녹음방법

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0593128A1 (fr) * 1992-10-15 1994-04-20 Koninklijke Philips Electronics N.V. Système de dérivation pour obtenir un signal de canal central à partir d'un signal audio stéréophonique
WO2002100128A1 (fr) * 2001-06-01 2002-12-12 Sonics Associates, Inc. Amelioration de la voie centrale des images sonores virtuelles
US20050169482A1 (en) 2004-01-12 2005-08-04 Robert Reams Audio spatial environment engine
EP1784048A2 (fr) 2005-11-02 2007-05-09 Sony Corporation Appareil et procédé de traitement de signal
WO2007106324A1 (fr) * 2006-03-13 2007-09-20 Dolby Laboratories Licensing Corporation Rendu de données audios de canal central
US20080298597A1 (en) * 2007-05-30 2008-12-04 Nokia Corporation Spatial Sound Zooming

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3670562B2 (ja) * 2000-09-05 2005-07-13 日本電信電話株式会社 ステレオ音響信号処理方法及び装置並びにステレオ音響信号処理プログラムを記録した記録媒体
WO2006050112A2 (fr) * 2004-10-28 2006-05-11 Neural Audio Corp. Moteur configure pour un environnement audio-spatial
EP2064915B1 (fr) * 2006-09-14 2014-08-27 LG Electronics Inc. Dispositif de commande et interface utilisateur pour des techniques d'amélioration de dialogue
JP2013524562A (ja) * 2010-03-26 2013-06-17 バン アンド オルフセン アクティー ゼルスカブ マルチチャンネル音響再生方法及び装置

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0593128A1 (fr) * 1992-10-15 1994-04-20 Koninklijke Philips Electronics N.V. Système de dérivation pour obtenir un signal de canal central à partir d'un signal audio stéréophonique
WO2002100128A1 (fr) * 2001-06-01 2002-12-12 Sonics Associates, Inc. Amelioration de la voie centrale des images sonores virtuelles
US20050169482A1 (en) 2004-01-12 2005-08-04 Robert Reams Audio spatial environment engine
EP1784048A2 (fr) 2005-11-02 2007-05-09 Sony Corporation Appareil et procédé de traitement de signal
WO2007106324A1 (fr) * 2006-03-13 2007-09-20 Dolby Laboratories Licensing Corporation Rendu de données audios de canal central
US20080298597A1 (en) * 2007-05-30 2008-12-04 Nokia Corporation Spatial Sound Zooming

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
BAI M.R.: "Upmixing and downmixing two-channel stereo audio for consumer electronics", IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, vol. 53, no. 3, August 2007 (2007-08-01), pages 1011 - 1019, XP011193644 *
See also references of EP2484127A4 *

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9794715B2 (en) 2013-03-13 2017-10-17 Dts Llc System and methods for processing stereo audio content
WO2014164361A1 (fr) * 2013-03-13 2014-10-09 Dts Llc Système et procédés pour traiter un contenu audio stéréoscopique
CN105247892A (zh) * 2013-05-31 2016-01-13 弗兰霍菲尔运输应用研究公司 用于空间选择性音频播放的设备和方法
CN105247892B (zh) * 2013-05-31 2019-02-22 弗劳恩霍夫应用研究促进协会 用于空间选择性音频播放的设备和方法以及数字存储介质
CN104468991A (zh) * 2014-11-24 2015-03-25 广东欧珀移动通信有限公司 一种移动终端及其音频收发方法
US10560794B2 (en) 2017-03-07 2020-02-11 Interdigital Ce Patent Holdings Home cinema system devices
US10834515B2 (en) 2017-03-07 2020-11-10 Interdigital Ce Patent Holdings, Sas Home cinema system devices
EP3499915A3 (fr) * 2017-12-13 2019-10-30 Oticon A/s Dispositif auditif et système auditif binauriculaire comprenant un système de réduction de bruit binaural
EP4236359A3 (fr) * 2017-12-13 2023-10-25 Oticon A/s Dispositif auditif et système auditif binauriculaire comprenant un système de réduction de bruit binaural
US20230292079A1 (en) * 2018-08-10 2023-09-14 Samsung Electronics Co., Ltd. Audio device and control method thereor
US11968521B2 (en) * 2018-08-10 2024-04-23 Samsung Electronics Co., Ltd. Audio apparatus and method of controlling the same
CN108989688A (zh) * 2018-09-14 2018-12-11 成都数字天空科技有限公司 虚拟相机防抖方法、装置、电子设备及可读存储介质
KR20230139847A (ko) * 2022-03-23 2023-10-06 주식회사 알머스 위치보정 기능의 이어폰 및 이를 이용하는 녹음방법
KR102613035B1 (ko) * 2022-03-23 2023-12-18 주식회사 알머스 위치보정 기능의 이어폰 및 이를 이용하는 녹음방법

Also Published As

Publication number Publication date
CN102550048A (zh) 2012-07-04
EP2484127A1 (fr) 2012-08-08
CN102550048B (zh) 2015-03-25
EP2484127A4 (fr) 2013-06-19
EP2484127B1 (fr) 2020-02-12

Similar Documents

Publication Publication Date Title
KR101567461B1 (ko) 다채널 사운드 신호 생성 장치
CN107925815B (zh) 空间音频处理装置
US10757529B2 (en) Binaural audio reproduction
EP2484127B1 (fr) Procédé, logiciel, et appareil pour traitement de signaux audio
US10034113B2 (en) Immersive audio rendering system
US9232319B2 (en) Systems and methods for audio processing
JP6620235B2 (ja) サウンドステージ拡張のための機器及び方法
KR101368859B1 (ko) 개인 청각 특성을 고려한 2채널 입체 음향 재생 방법 및장치
US9015051B2 (en) Reconstruction of audio channels with direction parameters indicating direction of origin
CN111418220B (zh) 串扰处理b链
US20080232601A1 (en) Method and apparatus for enhancement of audio reconstruction
KR101989062B1 (ko) 오디오 신호를 향상시키기 위한 장치 및 방법 및 음향 향상 시스템
KR20130116271A (ko) 다중 마이크에 의한 3차원 사운드 포착 및 재생
US9743215B2 (en) Apparatus and method for center signal scaling and stereophonic enhancement based on a signal-to-downmix ratio
EP2792168A1 (fr) Procédé de traitement audio et appareil de traitement audio
JPWO2010076850A1 (ja) 音場制御装置及び音場制御方法
CN112019993B (zh) 用于音频处理的设备和方法
KR20130130547A (ko) 잡음을 제거하는 장치 및 이를 수행하는 방법
JP2008311718A (ja) 音像定位制御装置及び音像定位制御プログラム
CN112313970B (zh) 增强具有左输入通道和右输入通道的音频信号的方法和系统
JP2017028525A (ja) 頭外定位処理装置、頭外定位処理方法、及びプログラム
JP2020508590A (ja) マルチチャネル・オーディオ信号をダウンミックスするための装置及び方法
US20200059750A1 (en) Sound spatialization method
WO2018150766A1 (fr) Dispositif, procédé et programme de traitement de localisation hors tête
US9794717B2 (en) Audio signal processing apparatus and audio signal processing method

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 201080044113.1

Country of ref document: CN

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 10819956

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 2010819956

Country of ref document: EP