EP0968624A2 - Telephonic transmission of three-dimensional sound - Google Patents

Telephonic transmission of three-dimensional sound

Info

Publication number
EP0968624A2
EP0968624A2 EP98909666A EP98909666A EP0968624A2 EP 0968624 A2 EP0968624 A2 EP 0968624A2 EP 98909666 A EP98909666 A EP 98909666A EP 98909666 A EP98909666 A EP 98909666A EP 0968624 A2 EP0968624 A2 EP 0968624A2
Authority
EP
European Patent Office
Prior art keywords
signal
signals
output signals
channel
khz
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP98909666A
Other languages
English (en)
French (fr)
Inventor
David Monteith
Alastair Sibbald
Martin Peter Todd
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Creative Technology Ltd
Original Assignee
Central Research Laboratories Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from GBGB9705565.1A (GB9705565D0)
Priority claimed from GBGB9707962.8A (GB9707962D0)
Application filed by Central Research Laboratories Ltd
Publication of EP0968624A2
Legal status: Withdrawn

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04S: STEREOPHONIC SYSTEMS
    • H04S1/00: Two-channel systems

Definitions

  • This invention relates to telephonic transmission of three dimensional (3D) sounds and more particularly to an apparatus for communicating three dimensional sounds between two or more remote locations by telephone transmission.
  • the present invention is concerned with telephone conference systems irrespective of whether or not a visual image is transmitted at the same time as the audio transmission.
  • Video telephone conference systems which employ large, expensive equipment to enable a large group of people in one location to communicate with another group at another location are well known.
  • Video telephones which incorporate a camera and video screen at each location are also well known.
  • office technologies such as fax, telephone and video such systems are becoming more readily available.
  • An object of the present invention is to overcome, or reduce, these undesirable effects by reproducing a three dimensional sound field of the transmitting station at the receiving location.
  • Binaural technology is based on using a so-called "artificial head" microphone system to receive sound from a sound source and convert the acoustic energy into an electrical signal which is subsequently processed digitally.
  • the use of an artificial head ensures that the natural three dimensional sound cues, which the brain of a listener uses to determine the position of sound sources in three dimensional space, are incorporated into the audio signal.
  • the artificial head is preferably constructed to resemble as closely as possible an actual human head and upper torso, and has silicone rubber ears which precisely resemble human ears, but in some applications good (but less precise) results can be achieved using two spaced microphones with a block or sheet of wood between the microphones.
  • "binaural signals" is intended to mean two-channel or stereophonic signals which include one or more components representing audio diffraction effects created by an artificial head means positioned between a pair of microphones.
  • "artificial head" is intended to cover not only a precise model of a human head but also imprecise models (such as, for example, a block of wood between microphones) and electrical synthesis of the audio diffraction signals.
  • a further problem with artificial-head microphone systems is that, when listening to the reproduced sound through loudspeakers, interaural cross-talk occurs: an audio signal intended for one ear of a listener is also received by the other ear. In order to compensate for this effect it is well known to employ cross-talk cancellation circuits. See for example International Patent Application WO-A-9515069.
  • a further object of the present invention is to provide apparatus which enables binaural processing of the audio signals of a telephone conference system.
  • apparatus for communicating three dimensional sounds via a telephone link comprising an input device consisting of two spaced microphones operable to produce left and right channel monophonic microphone output signals, signal processing means for each channel comprising filter means for receiving the microphone output signals and modifying the signals to compensate for head related air-to-ear transfer functions and equalise the spectral response of the microphone output signals, cross-talk cancellation means for cancelling out interaural cross-talk between the channels, and data compression means operable to receive an output signal from each channel, combine them to produce a binaural signal and compress said binaural signal to produce a compressed binaural signal for transmission over the telephone link, said compression means using a first compression algorithm to compress frequencies below 1 kHz whilst preserving relative phase differences between the channel output signals, a second algorithm to compress frequencies above 2 kHz whilst preserving relative differences between amplitudes of the channel output signals and a third algorithm to compress frequencies between 1 kHz and 2 kHz whilst preserving the IAD and ITD
  • the apparatus further includes a receiving means for receiving a compressed binaural signal transmitted over a telephone link and converting said compressed signal into left and right channel audio output signals, and spaced left and right channel sound reproduction means, each of which is operable to receive a respective channel audio output signal from said receiving means and reproduce sound corresponding to said respective channel audio output signal.
  • the sound reproduction means may comprise a pair of loudspeakers, or a pair of headphones.
  • the apparatus may be provided with a video signal means comprising a camera operable to produce a video output signal, and the compression means is operable to receive the video output signal and to combine said video output signal with said compressed binaural signal to produce a combined output signal for transmission via the telephone link.
  • receiving means further includes means for receiving a video signal transmitted over a telephone link and converting the video signal into a video output signal, and display means operable to receive said video output signal and display a visual image.
  • Figure 1 illustrates schematically apparatus incorporating the present invention for telephone conference connection between two conference centres.
  • Figure 2 shows in block diagram form apparatus incorporating signal processors in accordance with the present invention.
  • Figure 3 shows schematically a human head, and
  • Figure 4 shows a further embodiment of the present invention.
  • each conference station 10, 11 is provided with a personal computer (PC) 12 which includes a monitor, two spaced microphones 13, 14 mounted in silicone rubber moulded ears 15 (which model precisely human outer ears) and two spaced loudspeakers 16.
  • PC personal computer
  • the microphones 13, 14 should ideally be placed about 15 cm apart (the approximate width of a human head) and although it is preferable that the microphones are mounted in moulded ears 15 on an artificial head, the microphones could be mounted in moulded ears mounted on structure 17 (such as a block or sheet of wood). Alternatively the microphones 13, 14 and moulded ears could simply be mounted on the sides of the computer case 12, but this would give less precise detail to the three-dimensional sound field.
  • Each of the stations 10, 11 is connected to the other by means of the public telephone system 27 in the usual way.
  • both microphones 13, 14 are positioned to receive sound generated at their respective station 10, 11, where they are located.
  • Each microphone converts the pressure variations associated with the sound waves that it receives into an analogue electrical signal at inputs 18a, 18b of each channel (representing left and right ears 13, 14) of a digital signal processor 19.
  • the processor 19 comprises an HRTF filter 20 and an equalisation filter 21 for each channel.
  • HRTF or "Head Related Transfer Function" is intended to mean a function representing the transfer function of a path between a source of sound and the ear of the listener, either the ear nearer the sound (near HRTF) or the ear further from the sound (far HRTF).
  • HRTFs may be obtained by measurements on a real human head equipped with suitable microphones; alternatively, they may be obtained using an artificial head means, which may be, as is common, a precise model of a human head or torso with microphones in the ear structures; alternatively it may be something far less precise, for example a block or sheet of wood positioned between a pair of spaced apart microphones; it might even be an electrical synthesis circuit or system which creates such functions.
  • Filters 21 correct the spectral response to compensate for the mid-range gain associated with the concha-related resonance, as explained in International Patent Applications WO-A-9422278 and WO-A-9515069.
  • the outputs 21a, 21b of the filters 21 are fed to cross-talk cancellation circuits 22 which cancel out the interaural crosstalk as explained in International Patent Applications WO-A-9422278 and WO-A-9515069.
  • the output signal at each channel output 23 comprises a monophonic digital audio signal.
  • the normal signals transmitted over internationally acceptable telephone networks are typically a monophonic signal covering a range of frequencies from about 200 Hz to 3.4 kHz.
  • the output signals 23 of each channel are combined and compressed by a signal compression means 25 to produce a stereophonic output signal 24.
  • the compression algorithms used by the compression means 25 are designed to preserve the three dimensional cues in the audio output signals 23 from each channel.
  • a second key aspect is to preserve the time relationship between the signals in the two channels.
  • the manner in which the head and outer ears of a listener modify soundwaves before they are registered by the inner ears is complex, with several contributing factors playing a part.
  • each pinna (outer ear flap)
  • each pinna together with its auditory canal
  • the sound source is moved to one side of the head of the listener, then the more distant ear lies in the shadow of the head, and the ear closer to the sound source is aligned more on-axis with the source.
  • When sound waves encounter the listener's head, they diffract around it. In general, the average width of a human head is 15 cm, with an interaural path length of about 20 cm when the circumference effect is taken into account. Sound waves of greater wavelength than 15 cm (corresponding to frequencies below about 1.7 kHz) can diffract efficiently around a human head, whereas at higher frequencies the sound wave cannot diffract efficiently around the head. This effect, known as "head-shadowing", creates differences in amplitudes of the sound signals arriving at each ear of the listener. This interaural amplitude difference (IAD) is one of the primary 3D cues which need to be preserved.
  • IAD interaural amplitude difference
  • the effects of diffraction on the intensity of the sound are noticeable in the range of between 700 Hz and 8 kHz and are more noticeable at higher frequencies (say above 2 kHz), where the head-shadowing creates noticeable differences in the intensities of the sound waves reaching the ears.
  • the listener's brain uses these differences in intensity as cues to locate the direction of the source of high frequency sounds. Therefore it is important to retain the relationship between the intensities (or amplitudes) of the high frequency sounds.
  • phase difference is approximately proportional to frequency.
  • the listener's brain therefore uses the phase differences of the low frequencies as an important cue to determine the direction of the source of low frequency sounds. It is therefore important to retain the phase relationships between the output signals of the left and right channels for the low frequency sounds.
  • time-of-arrival differences between the left and right ears of the listener, unless the sound source is exactly in front, behind, above or below the head of the listener.
  • ITD interaural time delay
  • Figure 3 shows a plan view of a conceptual head with a left ear (LE) and a right ear (RE) receiving a sound signal from a distant source at azimuth angle θ (about +45° as shown in the drawing).
  • the wave front (W - W 1 ) arrives at the right ear (RE)
  • the path distance a represents a proportion of the circumference subtended by θ.
  • the path length (a+b) is given by.
  • ITDs are measured to be slightly greater than this, possibly because of the non-spherical nature of human heads, the complex diffractive situation and surface effects. Hence ITDs lying in the range of 0 to 0.8 ms are also important primary 3D cues.
  • the mid-range gain due to the concha related resonance and the resonance in the auditory canal of the outer ear occurs at about 3 kHz or slightly higher and this is at the extreme end of the normal bandwidth of conventional telephone transmission lines.
  • the fossa (a cavity at the uppermost region of the pinna of the outer ear)
  • the brain of the listener makes use of the higher frequency sounds at 13 kHz or above to assist in determining whether the source of sound is in front of or behind the listener. It is therefore important to retain the detail of high frequency sounds above 13 kHz, if front and back cues are necessary.
  • the compression means 25 uses a first algorithm which allows compression of frequencies below 1 kHz, whilst preserving phase differences between the channel output signal, and uses a second algorithm to compress frequencies above 2 kHz, whilst preserving relative differences in the amplitudes of the channel output signals 23.
  • the compression means 25 also employs algorithms that allow the compression of the mid range frequencies, whilst preserving the IAD and ITD information over the whole frequency band.
  • the compression means 25 thus preserves the phase and amplitude relationships up to 8 kHz for reproducing three dimensional sound fields without front and back cues, or up to 13 kHz, or above, when front and back cues are wanted.
  • the output signal 24 of the compression means 25 is a compressed binaural signal which is transmitted over a conventional public telephone link 27 to another receiving station 10, 11.
  • Each station 10, 11 further includes a receiving means 28 for receiving an incoming compressed combined binaural signal transmitted via the telephone link 27.
  • the receiving means 28 (see Figure 2) comprises a signal processor which operates to re-expand the incoming compressed signal 26 and produce two channel input signals 30.
  • Each channel input signal 30 is supplied to a sound reproduction device 16 which may be the pair of loudspeakers 16 or a pair of headphones 32.
  • the apparatus of Figures 1, 2, and 3 further includes means for transmitting and receiving video signals over a telephone link 27 as shown in Figure 4.
  • In Figure 4 the same reference numbers are given to the components that are common to the Figure 2 embodiment.
  • each station 10, 11 is provided with a video camera 32 and video processor 33 which is operable to produce a video output signal 34.
  • the video output signal 34 from the camera 32 is supplied to the compression means 25 of the signal processor 19 (see Figure 4).
  • the compression means 25 includes circuits for combining the binaural output signal 24 with the video output signal 34 to produce a combined video and binaural output signal 36 for transmission over the telephone link.
  • the apparatus is also provided with a receiving means 37 for receiving an incoming combined video and binaural signal 38 transmitted over the telephone link 27 from another remote conference centre 10 or 11.
  • the receiving means 37 includes a decompression means 39 for expanding the received video and binaural signal 38, and operates to supply a video signal 40 to a video processor 41 and two audio output signals 30 to the speakers 16 or headphones 32.
  • the output of the video processor 41 drives the monitor 12 to produce a visual image.
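
The numbers quoted in the description (head width of 15 cm, interaural path of about 20 cm, ITDs of 0 to 0.8 ms, and the 1 kHz and 2 kHz band edges used by the compression means 25) can be explored with a short Python sketch. It is only an illustration under stated assumptions: the expression for the path length (a+b) is not reproduced in the text above, so the sketch assumes the common rigid-sphere approximation (arc a = r·θ plus straight segment b = r·sin θ), a speed of sound of 343 m/s, and hypothetical function names that do not come from the patent.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumption: air at roughly 20 degrees C
HEAD_WIDTH_M = 0.15         # average head width quoted in the description
INTERAURAL_PATH_M = 0.20    # interaural path length quoted in the description


def frequency_for_wavelength(wavelength_m, c=SPEED_OF_SOUND_M_S):
    """Return the frequency in Hz whose wavelength is wavelength_m."""
    return c / wavelength_m


def spherical_head_itd(azimuth_deg, head_width_m=HEAD_WIDTH_M,
                       c=SPEED_OF_SOUND_M_S):
    """Interaural time delay in seconds for a distant source.

    Assumed model (not taken from the patent text): a rigid spherical
    head of radius r, extra path = arc a (r * theta) + straight
    segment b (r * sin(theta)).
    """
    theta = math.radians(azimuth_deg)
    r = head_width_m / 2.0
    return r * (theta + math.sin(theta)) / c


def cue_to_preserve(freq_hz):
    """Name the cue each compression band is described as preserving."""
    if freq_hz < 1000.0:
        return "relative phase"        # first algorithm, below 1 kHz
    if freq_hz > 2000.0:
        return "relative amplitude"    # second algorithm, above 2 kHz
    return "IAD and ITD"               # third algorithm, 1 kHz to 2 kHz


if __name__ == "__main__":
    print("20 cm wavelength in Hz:", frequency_for_wavelength(INTERAURAL_PATH_M))
    for azimuth in (0, 45, 90):
        itd_ms = spherical_head_itd(azimuth) * 1000.0
        print(f"azimuth {azimuth:>2} deg: ITD {itd_ms:.2f} ms")
    for freq in (500, 1500, 3000):
        print(freq, "Hz ->", cue_to_preserve(freq))
```

Under these assumptions a 20 cm wavelength corresponds to roughly 1.7 kHz, and a source directly to one side (90°) gives an ITD of roughly 0.56 ms, which sits below the 0.8 ms upper bound mentioned in the description; that is in line with the remark that measured ITDs are slightly greater than the geometric estimate.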

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Stereophonic System (AREA)
EP98909666A 1997-03-18 1998-03-18 Telephonic transmission of three-dimensional sound Withdrawn EP0968624A2 (de)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
GB9705565 1997-03-18
GBGB9705565.1A GB9705565D0 (en) 1997-03-18 1997-03-18 Telephone transmission of 3d sound
GBGB9707962.8A GB9707962D0 (en) 1997-04-19 1997-04-19 Telephonic transmission of 3D sound
GB9707962 1997-04-19
PCT/GB1998/000813 WO1998042161A2 (en) 1997-03-18 1998-03-18 Telephonic transmission of three-dimensional sound

Publications (1)

Publication Number Publication Date
EP0968624A2 (de) 2000-01-05

Family

ID=26311214

Family Applications (1)

Application Number Title Priority Date Filing Date
EP98909666A Withdrawn EP0968624A2 (de) 1997-03-18 1998-03-18 Telephonic transmission of three-dimensional sound

Country Status (2)

Country Link
EP (1) EP0968624A2 (de)
WO (1) WO1998042161A2 (de)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE102006048295B4 (de) * 2006-10-12 2008-06-12 Andreas Max Pavel Method and device for recording, transmitting and reproducing sound events for communication applications
CN102809742B (zh) 2011-06-01 2015-03-18 Dolby Laboratories Licensing Corporation Sound source localization apparatus and method

Family Cites Families (4)

Publication number Priority date Publication date Assignee Title
EP0689756B1 (de) * 1993-03-18 1999-10-27 Central Research Laboratories Limited Multi-channel sound processing
US5434913A (en) * 1993-11-24 1995-07-18 Intel Corporation Audio subsystem for computer-based conferencing system
GB9324240D0 (en) * 1993-11-25 1994-01-12 Central Research Lab Ltd Method and apparatus for processing a binaural pair of signals
US5596644A (en) * 1994-10-27 1997-01-21 Aureal Semiconductor Inc. Method and apparatus for efficient presentation of high-quality three-dimensional audio

Non-Patent Citations (1)

Title
See references of WO9842161A3 *

Also Published As

Publication number Publication date
WO1998042161A2 (en) 1998-09-24
WO1998042161A3 (en) 1998-12-17

Similar Documents

Publication Publication Date Title
AU2008362920B2 (en) Method of rendering binaural stereo in a hearing aid system and a hearing aid system
US7012630B2 (en) Spatial sound conference system and apparatus
US4118599A (en) Stereophonic sound reproduction system
JP4166435B2 (ja) Teleconference system
JP3435156B2 (ja) Sound image localization device
CN109640235B (zh) Binaural hearing system using localization of sound sources
US20160140947A1 (en) Apparatus, Method, and Computer Program for Adjustable Noise Cancellation
US8340315B2 (en) Assembly, system and method for acoustic transducers
US8488820B2 (en) Spatial audio processing method, program product, electronic device and system
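CN101208989A (zh) Apparatus, system and method for acoustic signals
JP7070910B2 (ja) Video conference system
KR20090077934A (ko) Process and apparatus for acquiring, transmitting and reproducing sound events for communication applications
EP0968624A2 (de) Telephonic transmission of three-dimensional sound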
West et al. Teleconferencing system using head-related signals
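JP6972858B2 (ja) Sound processing device, program and method
KR102613033B1 (ko) Earphone based on head-related transfer function, telephone device including the same, and calling method using the same
JP2662825B2 (ja) Conference call terminal device
JP2662824B2 (ja) Conference call terminal device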
Horiuchi et al. Adaptive estimation of transfer functions for sound localization using stereo earphone-microphone combination
JPH07107599A (ja) Headphone listening device
WO2005069680A1 (en) Sound receiving arrangement comprising sound receiving means and sound receiving method

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 19991007

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): CH DE FR GB LI NL SE

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: CREATIVE TECHNOLOGY LTD.

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20041001