EP0959644A2 - Method of modifying a filter for implementing a head-related transfer function - Google Patents

Info

Publication number
EP0959644A2
EP0959644A2 EP19990303966 EP99303966A EP0959644A2 EP 0959644 A2 EP0959644 A2 EP 0959644A2 EP 19990303966 EP19990303966 EP 19990303966 EP 99303966 A EP99303966 A EP 99303966A EP 0959644 A2 EP0959644 A2 EP 0959644A2
Authority
EP
European Patent Office
Prior art keywords
transfer function
filter
ear
ear transfer
amplitude
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP19990303966
Other languages
German (de)
French (fr)
Inventor
Alastair Sibbald
Fawad Nackvi
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Creative Technology Ltd
Original Assignee
Central Research Laboratories Ltd
Application filed by Central Research Laboratories Ltd

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04S: STEREOPHONIC SYSTEMS
    • H04S5/00: Pseudo-stereo systems, e.g. in which additional channel signals are derived from monophonic signals by means of phase shifting, time delay or reverberation
    • H04S5/005: Pseudo-stereo systems, e.g. in which additional channel signals are derived from monophonic signals by means of phase shifting, time delay or reverberation, of the pseudo five- or more-channel type, e.g. virtual surround
    • H04S2420/00: Techniques used in stereophonic systems covered by H04S but not provided for in its groups
    • H04S2420/01: Enhancing the perception of the sound image or of the spatial distribution using head related transfer functions [HRTF's] or equivalents thereof, e.g. interaural time difference [ITD] or interaural level difference [ILD]
    • H04S3/00: Systems employing more than two channels, e.g. quadraphonic

Definitions

  • With the modified filter, the virtual sound source appears to be located at +120°, and not at +60° as can occur with the unmodified filter which implements the original 120° HRTF.
  • The size of the increase in magnitude of the amplitude of the near-ear function may be varied. For example, if the near-ear transfer function is convolved with itself, the amplitude values of the transfer function are squared at a given frequency. If, however, the amplitudes of the transfer function are raised to the power 3, the resulting modified function will have more exaggerated features, and the 3D effects will be enhanced further. This may be appropriate for use in computer games, for example. Alternatively, the amplitude values of the transfer function may be raised to the power 1.5. This results in more subtle effects, and may be used advantageously, for example, for classical music recordings.
  • The high-frequency components of the exaggerated near-ear function can be limited, typically by appropriate design of the filters used for the signal processing. In this example, frequencies of more than 10 kHz are limited. This is shown in Figure 8, plot B. However, the point at which the high frequencies are limited may vary from 10 kHz. For example, it may be desirable to reduce high frequency components above 6 kHz, or above 20 kHz.
  • Modified filters which implement the exaggerated HRTFs may be used in many applications. Examples of these applications will now be described.
  • In the AC-3 surround sound listening format there is provision for 6 loudspeakers: front left, centre, front right, surround left (rear), surround right (rear), and a non-directional sub-woofer.
  • A sound engineer can "pan" sounds from one position to another by varying the relative loudness of the sound being fed to the various loudspeakers. For example, a sound source may be panned from the front right speaker to the rear left speaker, and the sound would appear to the listener to move from the front right speaker to the rear left speaker through him or herself. However, it may be required for some applications that a sound is panned over the head of the listener, or underneath the listener.
  • An exaggerated HRTF may be produced via the method described in the first embodiment of the invention, and used as a "height" filter for surround sound mastering (or encoding) applications. This would enable panning from the front of a listener, to behind the listener, passing over the top of the listener's head.
  • An exaggerated HRTF may also be produced to make a "depression" filter, and could be used to enable panning from a position in front of a listener, passing underneath the listener, to a position behind the listener. This approach enables the conventional sound format to extend into the third dimension without any changes in the user's hardware, and without any change in format, bandwidth and the like.
  • The method of the invention may also be used in conjunction with vertical balance adjustment.
  • Vertical balance adjustment is described in published International Patent Application No. WO-A1-9517799.
  • A set of digital filters may be produced which implement an entire exaggerated HRTF library. This may be appropriate for applications such as PC games, where 3D effects with great spectral impact are more important than optimal tonal quality.
  • A sound recording or a transmission such as, for example, via wire based or wireless telegraphy, may be made by using modified filters which implement the exaggerated HRTFs.
  • The method of the invention may be applied to the far-ear transfer function (16b,18b), or to both the near-ear transfer function (16a,18a) and the far-ear transfer function (16b,18b).
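  • The choice of exponent described in the list above (1.5, 2 or 3) can be illustrated with a minimal sketch. It assumes the amplitude response is already available as a list of dB values relative to a 0 dB reference; raising a linear amplitude to the power p is then the same as multiplying its dB value by p. The function name and the sample values are hypothetical and are not taken from the patent.

```python
def exaggerate_db(response_db, power=2.0):
    """Scale a dB amplitude response by `power`.

    In linear terms this raises each amplitude to `power`; power=2
    corresponds to convolving the transfer function with itself.
    """
    return [power * value for value in response_db]


# Hypothetical near-ear response: flat at low frequency, one trough, one peak.
near_ear_db = [0.0, -6.0, 3.0]

subtle = exaggerate_db(near_ear_db, 1.5)   # e.g. for music recordings
squared = exaggerate_db(near_ear_db, 2.0)  # self-convolution
strong = exaggerate_db(near_ear_db, 3.0)   # e.g. for games
```

  • Values at 0 dB are unchanged by any exponent, while peaks and troughs move further from 0 dB as the exponent grows, which is the sense in which the spectral features are exaggerated.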

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Stereophonic System (AREA)

Abstract

A head-related transfer function (HRTF) is used to place a virtual sound source at a particular position in 3D space. The absence of modelled sound reflections from the virtual sound source, and the presence of unwanted real reflections from real sound sources, can impair the effectiveness of the positioning of the virtual sound source. The invention describes a method of modifying a filter for implementing an HRTF (13), whereby the spectral profile of the HRTF is exaggerated by convolving the near-ear (16a,18a) and/or far-ear (16b,18b) transfer function with itself. This results in more effective placement of virtual sound images in 3D space, giving improved realism of 3D effects.
The invention is of particular use in the virtualisation of multi-channel surround sound systems.

Description

  • This invention relates to a method of modifying a filter for implementing a head-related transfer function (HRTF) for use in the reproduction of three-dimensional (3D) sound.
  • The processing of binaural (two channel or stereo) audio signals to produce highly realistic 3D sound images is well known. One method is described in International Patent Application No. WO-A1-9422278, and is known as the Sensaura™ system. This system is based on recordings made using a so-called "artificial head" microphone system, and the recordings are subsequently processed digitally. The use of the artificial head ensures that natural 3D sound cues - which the brain uses to determine the position of sound sources in 3D space - are incorporated into the stereo recordings. 3D sound cues are introduced naturally by the head and ears when we listen to sounds in real life, and they include the following characteristics: inter-aural amplitude difference (IAD), inter-aural time delay (ITD), and spectral shaping by the outer ear.
  • By electronically synthesising these natural acoustic processes, it is possible to create "virtual" sound sources for headphone and loudspeaker reproduction. To set the position of a single channel virtual sound source in a plural channel system, separate audio filters for the left and right channels of the audio signal, together with a relative time delay, introduce the above mentioned characteristics. The filters used, and the time delay introduced, depend on the desired position of the virtual sound. The characteristics themselves are initially determined by measurement of an appropriate head-related transfer function (HRTF). The HRTF characterises the modifications which an audio signal undergoes on its path from a point in space, at a defined direction and distance from a listener, to the eardrums of the listener. An HRTF comprises a left-ear transfer function, a right-ear transfer function, and an inter-aural time delay. A block diagram of the synthesis of a virtual sound source is shown in Figure 3.
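  • The synthesis described above (one filter per ear plus a relative time delay) can be sketched as follows. This is a minimal illustration only: the tap values are hypothetical, the delay is rounded to whole samples, and a 44.1 kHz sampling rate is assumed so that 23 samples approximates a delay of about 522 µs.

```python
def fir_filter(signal, taps):
    """Direct-form FIR filtering: full linear convolution of signal with taps."""
    out = [0.0] * (len(signal) + len(taps) - 1)
    for i, x in enumerate(signal):
        for j, h in enumerate(taps):
            out[i + j] += x * h
    return out


def place_virtual_source(mono, near_taps, far_taps, itd_samples, source_on_right=True):
    """Return (left, right) channels for a mono input.

    The near-ear channel is filtered only; the far-ear channel is filtered
    and additionally delayed by the inter-aural time delay (whole samples).
    """
    near = fir_filter(mono, near_taps)
    far = [0.0] * itd_samples + fir_filter(mono, far_taps)
    # Pad both channels to a common length.
    length = max(len(near), len(far))
    near += [0.0] * (length - len(near))
    far += [0.0] * (length - len(far))
    return (far, near) if source_on_right else (near, far)


# Hypothetical 3-tap filters; source on the listener's right, so the left ear is the far ear.
left, right = place_virtual_source(
    [1.0, 0.0, 0.0], [1.0, 0.5, 0.25], [0.5, 0.25, 0.125], itd_samples=23
)
```

  • Real HRTF filters are far longer than three taps and are derived from measurement; the sketch only shows how the two filters and the delay combine.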
  • When a pair of audio signals incorporating such 3D sound cues are introduced efficiently into the ears of the listener, by headphones for example, then he or she perceives a virtual sound source to be located at the associated position in 3D space. However, if the processed signals are not conveyed directly and efficiently into the ears of the listener, then the full 3D effects will not be perceived. For example, when listening to sounds via conventional stereo loudspeakers, the left ear hears a little of the right loudspeaker signal, and vice versa. This is known as transaural crosstalk. By cancelling out transaural crosstalk, full 3D effects can be enjoyed via loudspeakers remote from the listener. Transaural crosstalk from each of the loudspeakers may be cancelled by creating appropriate crosstalk cancellation signals from the opposite loudspeaker. Crosstalk cancellation signals are equal in magnitude and inverted (opposite in polarity) with respect to the transaural crosstalk signals. A system for performing transaural crosstalk cancellation is discussed in the published International Patent Application No. WO-A1-9515069.
  • When listening to a real sound source in an ordinary environment (e.g. a living room), the first sound that the listener hears is termed the "direct" sound (so called because it travels directly to the ears). The direct sound is soon followed by the first reflections from the floor, ceiling and walls, some milliseconds later (or tens of milliseconds, depending on the dimensions of the room). The first reflections are themselves reflected back again to the listener from other boundaries, and these sound waves are termed secondary reflections, or second-order reflections. This process continues until the sound energy has been totally absorbed by the boundaries of the environment, and by the air itself. The reflections which follow the first few reflections soon begin to overlap each other, becoming complex and scattered, and are termed the reverberant sound.
  • Because the placing of a virtual sound source using HRTF filters uses a considerable amount of computational effort, it is common to simulate only the direct sound, and not the reflections. Consequently, the resulting virtual sound is anechoic, that is, it lacks the reflected components. This can be a disadvantage, as such reflected components can help the brain determine distance and reinforce spatial effects.
  • A further limitation in conventional 3D sound reproduction is that when reproducing virtual sounds via loudspeakers, the sounds originating from the loudspeakers themselves may be reflected from surfaces such as walls, floor, ceiling, and furniture. These sound reflections may conflict with the virtual sound image, especially if the virtual sound image is placed behind the listener. This is because sound reflections from room boundaries close to the loudspeaker "overwhelm" the 3D cue arising from spectral shaping by the outer ear, and so the inter-aural time delay (ITD) cue predominates. This causes the virtual sound source to flip from the required rearward position to a position in front of the listener which shares the same ITD value.
  • It can be concluded that the absence of synthesised sound reflections in the virtual image, in addition to the presence of real reflections from room boundaries, can impair the effectiveness of positioning the virtual sound source.
  • An example illustrating this point is the virtualisation of rear surround speakers for the Dolby AC-3 5.1 system. Dolby and AC-3 are trademarks of Dolby Laboratories Inc. An audio system incorporating the AC-3 compression standard provides for multi-channel digital surround sound. AC-3 5.1 gives separate audio channels for left, right, and centre speakers in front of the listening position, two rear surround speakers, and a sub-woofer positioned according to the listener's preference. A typical loudspeaker configuration for the AC-3 system is shown in Figure 4.
  • Figures 1 and 2 show a co-ordinate system used for the following description. The convention chosen here for referring to azimuth angles is that they are measured from the frontal pole P towards the rear pole P', with positive values of azimuth on the right-hand side of the listener and negative values on the left-hand side. Rear pole P' is at an azimuth of +180° (and -180°). Angles of elevation are measured directly upwards (or downwards, for negative angles) from the origin at the centre of the head of the listener relative to the horizontal plane.
  • The preferred positions of the rear surround speakers in the AC-3 system are ±120° azimuth and 0° elevation. Therefore, the use of a +120°, and a -120°, HRTF is required. However, the characteristics of the +120° and -120° HRTF are very similar to those of the +60° and -60° HRTF: the inter-aural time delays for both HRTFs are identical (522 µs). Consequently, when attempts are made to create a virtual sound source at +120° (or -120°), the presence of unwanted reflections from room boundaries adjacent to the loudspeakers, in addition to the absence of virtual reflections from the virtual sound source, causes the image to flip to the +60° (or -60°) position. Thus sounds placed at an azimuth of +120° (or -120°) appear to be in front of the listener at +60° (or -60°), and the illusion of the surround sound effect is disturbed.
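  • The front-back ambiguity can be illustrated with a simple spherical-head approximation of the inter-aural time delay. This is a sketch under stated assumptions, not the measurement behind the 522 µs figure: the Woodworth-style formula, the head radius and the speed of sound are all assumed here, so the absolute value differs from the measured one. What the sketch shows is that azimuths mirrored about the inter-aural axis (60° and 120°) yield the same delay.

```python
import math

HEAD_RADIUS_M = 0.0875    # assumed head radius, not from the patent
SPEED_OF_SOUND = 343.0    # assumed speed of sound in air, m/s


def itd_seconds(azimuth_deg):
    """Approximate ITD for a source in the horizontal plane.

    Azimuth is measured from the frontal pole; rear azimuths are mirrored
    onto the front quadrant because the path difference is the same.
    """
    az = abs(azimuth_deg) % 360.0
    if az > 180.0:
        az = 360.0 - az
    if az > 90.0:
        az = 180.0 - az
    theta = math.radians(az)
    return HEAD_RADIUS_M / SPEED_OF_SOUND * (theta + math.sin(theta))


front = itd_seconds(60.0)
rear = itd_seconds(120.0)
```

  • Because the time delay cannot separate the two positions, the listener must rely on spectral cues, which is why the method concentrates on the amplitude responses.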
  • An aim of the present invention is to provide more effective virtual sound source placement in three dimensions, particularly, but not exclusively, for virtual sound sources placed behind a listener, by modification of the characteristics of a filter for implementing a head-related transfer function.
  • According to a first aspect of the invention there is provided a method of modifying the characteristics of a filter for implementing a head-related transfer function (HRTF), the HRTF including a near-ear transfer function and a far-ear transfer function, the method comprising increasing the magnitude of the amplitude of the near-ear transfer function and/or far-ear transfer function over a range of frequencies to give an exaggerated near-ear transfer function and/or an exaggerated far-ear transfer function, the amount of the increase at a given frequency being a function of the amplitude of the corresponding transfer function or functions at the given frequency, thereby forming a filter which implements an HRTF having an exaggerated near-ear transfer function and/or an exaggerated far-ear transfer function.
  • Preferably the magnitude of the amplitude of the near-ear transfer function, and/or the far-ear transfer function, is increased by convolving the transfer function with itself.
  • The amplitude of the exaggerated near-ear transfer function and/or the amplitude of the exaggerated far-ear transfer function may be limited over a range of frequencies above a threshold value. The threshold value may be, for example, 6 kHz.
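  • One possible reading of this limiting step is sketched below. The text does not say how the limiting is carried out, so the rule used here (falling back to the unmodified response above the threshold) is purely an assumption, as are the function name and the sample values.

```python
def limit_above_threshold(freqs_hz, exaggerated_db, original_db, threshold_hz=6000.0):
    """Keep the exaggerated response up to the threshold frequency and
    use the unmodified response above it (assumed limiting rule)."""
    return [
        orig if freq > threshold_hz else exag
        for freq, exag, orig in zip(freqs_hz, exaggerated_db, original_db)
    ]


# Hypothetical response sampled at four frequencies.
freqs = [1000.0, 4000.0, 8000.0, 12000.0]
original = [1.0, -10.0, 4.0, -6.0]
exaggerated = [2.0 * value for value in original]  # squared amplitude, in dB

limited = limit_above_threshold(freqs, exaggerated, original)
```

  • A practical design would more likely blend smoothly between the two curves around the threshold rather than switch abruptly.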
  • The amplitude of the exaggerated near-ear transfer function and/or the amplitude of the exaggerated far-ear transfer function may be adjusted so that the amplitude of the exaggerated near-ear transfer function and the amplitude of the exaggerated far-ear transfer function tend to the same value at frequencies below, for example, 100 Hz.
  • According to another aspect of the invention, there is provided a filter modified using the aforedescribed method. Preferably the modified filter is used for implementing an HRTF, the HRTF having an amplitude response characteristic curve substantially as shown in plot B of Figure 8.
  • The filter may also include crosstalk cancellation means. The filter may be used in a multi-channel surround sound system, or a multi-channel encoding system.
  • Preferably the modified filter for implementing an HRTF places a virtual sound source at positions behind a listener. For AC-3, or other, surround sound systems, preferably the virtual sound sources are placed at azimuths of ±120° and elevations of 0° relative to a listener. For different applications such as AC-3, or other, mastering (or encoding) applications, preferably the virtual sound source is placed at an elevation of ±90° relative to a listener. Preferably the modified filter is a finite impulse response filter.
  • According to another aspect of the invention, there is provided a sound recording or transmission made using a modified filter implementing an HRTF.
  • According to a further aspect of the invention, there is provided a signal processed using a modified filter implementing an HRTF.
  • The invention will now be described, by way of example only, with reference to the accompanying Figures, in which:
  • Figure 1 shows the head of a listener within a reference sphere, and a co-ordinate system;
  • Figure 2 shows the position of a sound-source on the reference sphere with respect to the listener;
  • Figure 3 shows a schematic representation of the conventional method for creating a virtual sound source;
  • Figure 4 shows a schematic representation of a typical Dolby AC-3 surround sound system configuration;
  • Figure 5 shows a graph of 120° near-ear and far-ear transfer functions;
  • Figure 6 shows a graph of 60° near-ear and far-ear transfer functions;
  • Figure 7 shows a graph of a 120° near-ear transfer function, and the 120° near-ear transfer function convolved with itself, according to the invention;
  • Figure 8 shows a graph of a 120° near-ear transfer function convolved with itself and a high frequency limited version of the same, according to the invention;
  • Figure 9 shows a graph of near-ear transfer functions for positions directly above the listener and directly below the listener; and
  • Figure 10 shows a graph of modified near-ear transfer functions for positions directly above the listener and directly below the listener, according to the invention.
  • In a first embodiment, a filter implementing an HRTF (12), shown in Figure 3, is modified to provide improved positioning of a virtual sound source. In particular, an HRTF (12) placing a virtual sound source at an azimuth of +120° and elevation of 0° is described. Similarly, an HRTF of azimuth angle 60° and elevation 0° will be referred to as a 60° HRTF. The method described may also be applied to the -120°, or indeed any, HRTF.
  • Figure 5 shows the near-ear amplitude response (16a) of a 120° HRTF, and the far-ear amplitude response (16b) of the same function. Here, near-ear corresponds to the ear of a listener which is nearest to the virtual sound source, and far-ear is the ear furthest away from the virtual sound source. At positions where the sound source is located at identical distances from the left and right ears, the near-ear (16a) and far-ear responses (16b) are identical. The HRTF (12) therefore comprises a near-ear transfer function (16a), a far-ear transfer function (16b), and an inter-aural time delay.
  • Figure 6 shows the near-ear amplitude response (18a), and the far-ear amplitude response (18b), of a 60° HRTF. It can be seen that the general form of the far-ear data (16b and 18b) for both plots is similar. However, the near-ear data (16a) of Figure 5 exhibits some differences from the near-ear data (18a) of Figure 6. It should be noted that, in this example, differences in the far-ear responses (16b, 18b) are not as obvious to the brain as differences in the near-ear responses (16a, 18a). This is because the far-ear response (16b, 18b) is generally associated with less energy than the near-ear response (16a, 18a).
  • By inspection of the graphs of Figures 5 and 6, it can be seen that the prime difference between the 120° HRTF and the 60° HRTF appears to be the near-ear amplitude responses (16a, 18a). However, this difference is not large enough for the brain to be able to distinguish the 120° near-ear response (16a) from the 60° near-ear response (18a) in the presence of real reflections, and the absence of virtual reflections. The invention overcomes this deficiency by exaggerating the spectral features of the near-ear amplitude response (16a, 18a) to provide more spectral information to the listener's brain.
  • However, the best means of providing more spectral information is not immediately apparent. One may, for example, select a particular spectral feature of the HRTF data (a peak, or a trough, say), and increase its magnitude. Unfortunately, there is no way of knowing whether any particular spectral feature (or combination thereof) is important or not to the brain for the purpose of identifying the location of a sound. Also, there is the difficulty of merging such an exaggerated feature with the remainder of the spectral response. Finally, it would not be possible to automate such a process for application to an entire library of HRTFs (12), as such a library may contain more than a thousand HRTF pairs.
  • Accordingly, the first embodiment of the present invention provides a method of creating more pronounced spectral data by increasing the magnitude of the amplitude of the near-ear function (16a, 18a) over a range of frequencies. The amount of the increase at a given frequency is a function of the amplitude of the near-ear function (16a,18a) at the given frequency. In this particular example, for the 120° HRTF, the near-ear function (16a) is convolved with itself. This results in an exaggerated near-ear function (26a), as shown in Figure 7, with an increase in the magnitude of peaks and troughs, at all frequencies. In particular, it can be seen from Figure 7 that the magnitude of the trough at 4 kHz in the unmodified function has been increased. A filter may then be designed to implement an HRTF having an exaggerated near-ear function (26a). Hereinafter, a near-ear function and a far-ear function which have undergone any one of a number of processing steps according to the method described herein, are known as exaggerated near-ear and far-ear functions, respectively.
  • It is required that the magnitudes of the near-ear and far-ear amplitudes at low frequencies are similar. Therefore, it is necessary to set the overall gain factor of the modified function so as to align its low frequency response to match that of the corresponding unmodified function. Figure 7 shows the near-ear transfer function (16a) of the 120° HRTF (12a), convolved with itself (26a), and its overall gain adjusted for low frequency alignment of the modified and unmodified functions.
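  • The self-convolution and gain alignment described above can be sketched as follows. This is a minimal sketch under stated assumptions: the near-ear function is taken to be available as FIR taps, the "low frequency response" is approximated by the DC gain (the sum of the taps), and the helper names and tap values are hypothetical. Convolving an impulse response with itself squares its amplitude response at every frequency, so a single overall gain factor is enough to bring the low-frequency level back to that of the unmodified function.

```python
import cmath
from typing import List


def convolve(x: List[float], h: List[float]) -> List[float]:
    """Direct-form linear convolution of two finite sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def magnitude(h: List[float], omega: float) -> float:
    """Amplitude response |H(e^{j*omega})|, omega in radians per sample."""
    return abs(sum(v * cmath.exp(-1j * omega * n) for n, v in enumerate(h)))


def exaggerate_by_self_convolution(h: List[float]) -> List[float]:
    """Convolve h with itself, then rescale so the DC gain matches that of h."""
    squared = convolve(h, h)
    reference_gain = sum(h)        # DC gain of the unmodified function
    modified_gain = sum(squared)   # DC gain after self-convolution
    if modified_gain == 0.0:
        raise ValueError("DC gain is zero; choose another alignment frequency")
    scale = reference_gain / modified_gain
    return [scale * v for v in squared]


# Made-up near-ear taps purely for demonstration.
near_ir = [0.6, 0.3, -0.1]
exaggerated = exaggerate_by_self_convolution(near_ir)
```

  • After alignment, the amplitude of the exaggerated function at any frequency equals the square of the original amplitude divided by the original DC gain, so features that sit above the low-frequency level are raised and features below it are deepened.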
  • When an audio signal is processed by a modified filter which implements the exaggerated 120° HRTF, the virtual sound source appears to be located at +120°, and not at +60° as can occur with the unmodified filter which implements the original 120° HRTF.
  • In order to vary the subtlety of the 3D effects, the size of the increase in magnitude of the amplitude of the near-ear function may be varied. For example, if the near-ear transfer function is convolved with itself, the amplitude values of the transfer function are squared at a given frequency. If, however, the amplitudes of the transfer function are raised to the power 3, the resulting modified function will have more exaggerated features, and the 3D effects will be enhanced further. This may be appropriate for use in computer games, for example. Alternatively, the amplitude values of the transfer function may be raised to the power 1.5. This results in more subtle effects, and may be used advantageously, for example, for classical music recordings.
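  • The effect of the exponent can be illustrated with a short sketch that operates directly on sampled amplitude values. It assumes linear (not decibel) amplitudes and normalises to the first sample as a stand-in for the low-frequency level, so that the low-frequency alignment is preserved for any exponent. The function name, the `ref_index` parameter and the sample values are hypothetical.

```python
from typing import List


def exaggerate_magnitudes(mags: List[float], power: float,
                          ref_index: int = 0) -> List[float]:
    """Raise amplitudes to `power`, relative to a low-frequency reference.

    The reference sample is left unchanged; values above it grow and
    values below it shrink when power > 1.
    """
    ref = mags[ref_index]
    if ref <= 0.0:
        raise ValueError("reference amplitude must be positive")
    return [ref * (m / ref) ** power for m in mags]


# Made-up amplitude samples: a flat low end, one peak, and two troughs.
amplitudes = [1.0, 2.0, 0.5, 0.25]

subtle = exaggerate_magnitudes(amplitudes, 1.5)   # e.g. classical music
squared = exaggerate_magnitudes(amplitudes, 2.0)  # same as self-convolution
strong = exaggerate_magnitudes(amplitudes, 3.0)   # e.g. computer games
```

  • With these sample values the peak grows from 2.0 to about 2.83, 4.0 and 8.0 for exponents 1.5, 2 and 3 respectively, while the reference sample stays at 1.0.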
  • The high-frequency components of the exaggerated near-ear function can be limited, typically by appropriate design of the filters used for the signal processing. In this example, frequencies of more than 10 kHz are limited. This is shown in Figure 8, plot B. However, the point at which the high frequencies are limited may vary from 10 kHz. For example, it may be desirable to reduce high frequency components above 6 kHz, or above 20 kHz.
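  • One possible way to limit the high-frequency components is sketched below. The specification does not say how the limiting is realised beyond "appropriate design of the filters", so the roll-off used here (a fixed slope in decibels per octave above the threshold) and the default slope of 12 dB per octave are assumptions chosen only for illustration, as are the function and parameter names.

```python
import math
from typing import List


def limit_high_frequencies(freqs_hz: List[float], mags: List[float],
                           cutoff_hz: float = 10000.0,
                           slope_db_per_octave: float = 12.0) -> List[float]:
    """Attenuate amplitudes above cutoff_hz with a constant dB/octave slope."""
    limited = []
    for f, m in zip(freqs_hz, mags):
        if f > cutoff_hz:
            octaves_above = math.log2(f / cutoff_hz)
            attenuation_db = slope_db_per_octave * octaves_above
            m = m * 10.0 ** (-attenuation_db / 20.0)
        limited.append(m)
    return limited


# Made-up flat response sampled at three frequencies.
freqs = [1000.0, 10000.0, 20000.0]
flat = [1.0, 1.0, 1.0]
limited = limit_high_frequencies(freqs, flat)            # 10 kHz threshold
limited_6k = limit_high_frequencies(freqs, flat, 6000.0)  # 6 kHz threshold
```

  • Frequencies at or below the threshold pass unchanged; in this sketch the 20 kHz sample is one octave above the 10 kHz threshold and is therefore reduced by 12 dB.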
  • Limitation, or attenuation, of high frequencies may be carried out for the following reasons: For 3D sound conveyed via loudspeakers remote from the listener's ears, high-frequency information cannot, in practice, be crosstalk cancelled effectively. We can therefore attenuate the high frequencies with little effect on the apparent placement of the virtual sounds. This is discussed in our co-pending UK Patent Application No. GB 9805534.6.
  • When listening to sounds via loudspeakers, high frequencies are attenuated more than low frequencies along the pathway from the loudspeakers to the listener's head. However, when listening to sounds via headphones (where crosstalk cancellation is not required), high frequencies are not attenuated along the pathway from the headphones to the ears of a listener, due to the proximity of the headphones to the ears. Thus more high-frequency sound is presented to the ears than would be the case via loudspeakers. This may result in the virtual sound image appearing to be close to the listener's head. For this reason, a reduction in high frequencies is desirable for headphone reproduction to enable the virtual sound image to appear "out-of-the-head".
  • Modified filters which implement the exaggerated HRTFs may be used in many applications. Examples of these applications will now be described.
  • In the AC-3 surround sound listening format, there is provision for 6 loudspeakers: front left, centre, front right, surround left (rear), surround right (rear), and a non-directional sub-woofer. During the sound mixing process (wherein the sound is encoded for the AC-3 format), a sound engineer can "pan" sounds from one position to another by varying the relative loudness of the sound being fed to the various loudspeakers. For example, a sound source may be panned from the front right speaker to the rear left speaker, and the sound would appear to the listener to move from the front right speaker to the rear left speaker through him or herself. However, it may be required for some applications that a sound is panned over the head of the listener, or underneath the listener. For example, it might be required to move the sound of a helicopter from the front right speaker over the head of the listener, and then to the front left speaker. With present panning systems this would not be possible as the apparent positions of virtual sounds are restricted to the horizontal plane. By the use of an exaggerated "height" filter, it is possible to introduce height elements into the system.
  • For example, an exaggerated "overhead" (that is, where elevation=90°) HRTF may be produced via the method described in the first embodiment of the invention, and used as a "height" filter for surround sound mastering (or encoding) applications. This would enable panning from the front of a listener, to behind the listener, passing over the top of the listener's head. An exaggerated "below" (for example, elevation=-90°) HRTF may also be produced to make a "depression" filter, and could be used to enable panning from a position in front of a listener, passing underneath the listener, to a position behind the listener. This approach enables the conventional sound format to extend into the third dimension without any changes in the user's hardware, and without any change in format, bandwidth and the like.
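  • A pan that passes over the listener's head could, for instance, be driven by gains such as those sketched below. The specification does not define a pan law for the "height" filter, so the constant-power crossfade used here is a swapped-in, standard mixing technique, and the function name and the 0 to 1 position scale (0 = front, 0.5 = overhead, 1 = rear) are hypothetical.

```python
import math
from typing import Tuple


def overhead_pan_gains(position: float) -> Tuple[float, float, float]:
    """Return (front, overhead, rear) gains for a pan position in [0, 1].

    A constant-power crossfade is applied between adjacent positions,
    so the sum of the squared gains is always 1.
    """
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must lie in [0, 1]")
    if position <= 0.5:
        t = position / 0.5           # 0 at front, 1 at overhead
        return (math.cos(t * math.pi / 2), math.sin(t * math.pi / 2), 0.0)
    t = (position - 0.5) / 0.5       # 0 at overhead, 1 at rear
    return (0.0, math.cos(t * math.pi / 2), math.sin(t * math.pi / 2))


front_gains = overhead_pan_gains(0.0)
midway_gains = overhead_pan_gains(0.25)
overhead_gains = overhead_pan_gains(0.5)
rear_gains = overhead_pan_gains(1.0)
```

  • In use, the overhead gain would scale the output of the exaggerated "height" filter, while the front and rear gains would scale the signals sent to the conventional front and surround channels. A "depression" filter could be substituted for the height filter to pan underneath the listener.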
  • The method of the invention may also be used in conjunction with vertical balance adjustment. Vertical balance adjustment is described in published International Patent Application, No. WO-A1-9517799.
  • A set of digital filters may be produced which implement an entire exaggerated HRTF library. This may be appropriate for applications such as PC games, where 3D effects with great spectral impact are more important than optimal tonal quality.
  • A sound recording or a transmission such as, for example, via wire based or wireless telegraphy, may be made by using modified filters which implement the exaggerated HRTFs.
  • Variation may be made to the aforementioned embodiments without departing from the scope of the invention. For example, the method of the invention may be applied to the far-ear transfer function (16b,18b), or to both the near-ear transfer function (16a,18a) and the far-ear transfer function (16b,18b).

Claims (18)

  1. A method of modifying the characteristics of a filter for implementing a head-related transfer function (HRTF), the HRTF (12) including a near-ear transfer function (16a) and a far-ear transfer function (16b), the method comprising increasing the magnitude of the amplitude of the near-ear transfer function and/or far-ear transfer function over a range of frequencies to give an exaggerated near-ear transfer function (26a) and/or an exaggerated far-ear transfer function, the amount of the increase at a given frequency being a function of the amplitude of the corresponding transfer function or functions at the given frequency, thereby forming a filter which implements an HRTF having an exaggerated near-ear transfer function (26a) and/or an exaggerated far-ear transfer function.
  2. A method according to claim 1 wherein the magnitude of the amplitude of the near-ear transfer function is increased by convolving the near-ear transfer function (16a) with itself.
  3. A method according to claims 1 or 2 wherein the magnitude of the amplitude of the far-ear transfer function is increased by convolving the far-ear transfer function (16b) with itself.
  4. A method according to any preceding claim wherein the amplitude of the exaggerated near-ear transfer function (26a) and/or the amplitude of the exaggerated far-ear transfer function is limited over a range of frequencies above a threshold value.
  5. A method according to claim 4 wherein the threshold value is 6 kHz.
  6. A method according to any preceding claim wherein the amplitude of the exaggerated near-ear transfer function (26a) and/or the amplitude of the exaggerated far-ear transfer function is adjusted so that the amplitude of the exaggerated near-ear transfer function (26a) and the amplitude of the exaggerated far-ear transfer function tend to the same value at frequencies below 100 Hz.
  7. A filter modified using the method as claimed in any of claims 1 to 6.
  8. A filter according to claim 7 for implementing an HRTF, wherein the HRTF has an amplitude response characteristic curve substantially as shown in plot B of Figure 8.
  9. A filter according to claim 7 including transaural crosstalk cancellation means.
  10. A filter according to claim 7 wherein the filter places a virtual sound source at positions behind the preferred position of a listener in use.
  11. A filter according to claim 7 wherein the filter places a virtual sound source at an azimuth of ±120° and an elevation of 0° relative to the preferred position of a listener in use.
  12. A filter according to claim 7 wherein the filter places a virtual sound source at an elevation of ±90° relative to the preferred position of a listener in use.
  13. A filter according to claim 11 for use in a multi-channel surround sound system.
  14. A filter according to claim 13 wherein a multi-channel audio signal is converted to a binaural signal.
  15. A filter according to claim 12 for use in a multi-channel encoding system.
  16. A filter according to claims 7 to 15 wherein the filter is a finite impulse response filter.
  17. A sound recording or transmission made using the filter as claimed in any of claims 7 to 16.
  18. A signal processed using the filter claimed in any of claims 7 to 16.
EP19990303966 1998-05-22 1999-05-21 Method of modifying a filter for implementing a head-related transfer function Withdrawn EP0959644A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB9811054 1998-05-22
GB9811054A GB2337676B (en) 1998-05-22 1998-05-22 Method of modifying a filter for implementing a head-related transfer function

Publications (1)

Publication Number Publication Date
EP0959644A2 true EP0959644A2 (en) 1999-11-24

Family

ID=10832550

Family Applications (1)

Application Number Title Priority Date Filing Date
EP19990303966 Withdrawn EP0959644A2 (en) 1998-05-22 1999-05-21 Method of modifying a filter for implementing a head-related transfer function

Country Status (2)

Country Link
EP (1) EP0959644A2 (en)
GB (1) GB2337676B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2366975A (en) * 2000-09-19 2002-03-20 Central Research Lab Ltd A method of audio signal processing for a loudspeaker located close to an ear
US6738479B1 (en) 2000-11-13 2004-05-18 Creative Technology Ltd. Method of audio signal processing for a loudspeaker located close to an ear
US6741711B1 (en) 2000-11-14 2004-05-25 Creative Technology Ltd. Method of synthesizing an approximate impulse response function

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0689756B1 (en) * 1993-03-18 1999-10-27 Central Research Laboratories Limited Plural-channel sound processing

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1542502A2 (en) * 2003-12-10 2005-06-15 Ultrasone AG Headphone with surround-sound effect
EP1542502A3 (en) * 2003-12-10 2009-04-22 Ultrasone AG Headphone with surround-sound effect
US20120008789A1 (en) * 2010-07-07 2012-01-12 Korea Advanced Institute Of Science And Technology 3d sound reproducing method and apparatus
KR20230019809A (en) * 2010-07-07 2023-02-09 삼성전자주식회사 Method and apparatus for 3D sound reproducing
US10531215B2 (en) * 2010-07-07 2020-01-07 Samsung Electronics Co., Ltd. 3D sound reproducing method and apparatus
AU2018211314B2 (en) * 2010-07-07 2019-08-22 Korea Advanced Institute Of Science And Technology 3d sound reproducing method and apparatus
AU2017200552B2 (en) * 2010-07-07 2018-05-10 Korea Advanced Institute Of Science And Technology 3d sound reproducing method and apparatus
US10244343B2 (en) 2011-07-01 2019-03-26 Dolby Laboratories Licensing Corporation System and tools for enhanced 3D audio authoring and rendering
US9838826B2 (en) 2011-07-01 2017-12-05 Dolby Laboratories Licensing Corporation System and tools for enhanced 3D audio authoring and rendering
US9549275B2 (en) 2011-07-01 2017-01-17 Dolby Laboratories Licensing Corporation System and tools for enhanced 3D audio authoring and rendering
US10609506B2 (en) 2011-07-01 2020-03-31 Dolby Laboratories Licensing Corporation System and tools for enhanced 3D audio authoring and rendering
US11057731B2 (en) 2011-07-01 2021-07-06 Dolby Laboratories Licensing Corporation System and tools for enhanced 3D audio authoring and rendering
US9204236B2 (en) 2011-07-01 2015-12-01 Dolby Laboratories Licensing Corporation System and tools for enhanced 3D audio authoring and rendering
US11641562B2 (en) 2011-07-01 2023-05-02 Dolby Laboratories Licensing Corporation System and tools for enhanced 3D audio authoring and rendering
US12047768B2 (en) 2011-07-01 2024-07-23 Dolby Laboratories Licensing Corporation System and tools for enhanced 3D audio authoring and rendering
US10142761B2 (en) 2014-03-06 2018-11-27 Dolby Laboratories Licensing Corporation Structural modeling of the head related impulse response
CN112188358A (en) * 2019-07-04 2021-01-05 歌拉利旺株式会社 Audio signal processing apparatus, audio signal processing method, and non-volatile computer-readable recording medium
EP3761674A1 (en) * 2019-07-04 2021-01-06 Clarion Co., Ltd. Audio signal processing apparatus, audio signal processing method, and audio signal processing program
JP2021013063A (en) * 2019-07-04 2021-02-04 クラリオン株式会社 Audio signal processing device, audio signal processing method and audio signal processing program

Also Published As

Publication number Publication date
GB2337676A (en) 1999-11-24
GB9811054D0 (en) 1998-07-22
GB2337676B (en) 2003-02-26

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE

AX Request for extension of the european patent

Free format text: AL;LT;LV;MK;RO;SI

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: CREATIVE TECHNOLOGY LTD.

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20041201