WO2019045622A1 - Headset and method of operating headset - Google Patents

Headset and method of operating headset

Info

Publication number
WO2019045622A1
Authority
WO
WIPO (PCT)
Prior art keywords
headset
current position
voice signal
sound
speaker elements
Prior art date
Application number
PCT/SE2018/050861
Other languages
French (fr)
Inventor
Christer HOBERG
Original Assignee
Terranet Ab
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Terranet Ab filed Critical Terranet Ab
Publication of WO2019045622A1 publication Critical patent/WO2019045622A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F 3/012 Head tracking input arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/16 Sound input; Sound output
    • G06F 3/165 Management of the audio stream, e.g. setting of volume, audio stream path
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S 7/00 Indicating arrangements; Control arrangements, e.g. balance control
    • H04S 7/30 Control circuits for electronic adaptation of the sound field
    • H04S 7/302 Electronic adaptation of stereophonic sound system to listener position or orientation
    • H04S 7/303 Tracking of listener position or orientation
    • H04S 7/304 For headphones
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R 2460/00 Details of hearing devices, i.e. of ear- or headphones covered by H04R1/10 or H04R5/033 but not provided for in any of their subgroups, or of hearing aids covered by H04R25/00 but not provided for in any of its subgroups
    • H04R 2460/07 Use of position data from wide-area or local-area positioning systems in hearing devices, e.g. program or information selection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R 2499/00 Aspects covered by H04R or H04S not otherwise provided for in their subgroups
    • H04R 2499/10 General applications
    • H04R 2499/15 Transducers incorporated in visual displaying devices, e.g. televisions, computer displays, laptops
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R 5/00 Stereophonic arrangements
    • H04R 5/033 Headphones for stereophonic communication
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S 2400/00 Details of stereophonic systems covered by H04S but not provided for in its groups
    • H04S 2400/11 Positioning of individual sound objects, e.g. moving airplane, within a sound field
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S 2400/00 Details of stereophonic systems covered by H04S but not provided for in its groups
    • H04S 2400/15 Aspects of sound capture and related signal processing for recording or reproduction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S 2420/00 Techniques used in stereophonic systems covered by H04S but not provided for in its groups
    • H04S 2420/01 Enhancing the perception of the sound image or of the spatial distribution using head related transfer functions [HRTF's] or equivalents thereof, e.g. interaural time difference [ITD] or interaural level difference [ILD]

Definitions

  • the controller 310 is configured to implement a positioning unit 370 which computes momentary values of the above-mentioned set of position parameters.
  • the controller 310 is also configured to implement an audio spatialization unit, ASU 380, which generates electrical audio signals xL, xR for the speaker elements 330, 340 by use of any known 3D audio spatialization technique and based on the set of position parameters provided by the positioning unit 370.
  • the ASU 380 may e.g. implement the HRTF technique as described in relation to FIG. 2.
  • the HRTF engine 20 and the convolution units 21, 22 of FIG. 2 may be included in the ASU 380, and the above-mentioned HRTF database may be stored in memory 320.
  • the HRTF technique is primarily useful for achieving directional spatialization, e.g. to reproduce one or more of the angles θ, φ in the sound.
  • the ASU 380 may be configured to reproduce the distance r in the sound by any well-known technique, e.g. by generating the audio signals xL, xR with a selected loudness and/or initial time delay and/or mix of direct and reverberant sound. For example, the loudness in xL, xR may be reduced as a function of the distance r.
  • the ASU 380 may be further configured to reproduce the effect of the second azimuth angle θ' and/or the second elevation angle φ', e.g. by selectively modifying one or more of the loudness, the initial time delay and the mix of direct and reverberant sound.
  • positioning unit 370 and the ASU 380 are illustrated in FIG. 3 as implemented by the controller 310, it is conceivable that either of these units 370, 380 are at least partly implemented by dedicated hardware which is structurally and functionally separate from the controller 310.
  • FIG. 4 is a flowchart of data processing steps executed by a headset in an embodiment of the invention.
  • the headset, designated “headset i” in FIG. 4 and denoted “first headset” below, may be included among the headsets 1 in FIG. 1 and communicates with at least one other headset, designated “headset j” in FIG. 4 and denoted “second headset” below.
  • the first headset receives the incoming voice signal xin from the second headset.
  • the first headset estimates the current position of the second headset, e.g. by computing or otherwise obtaining one or more of the above-mentioned position parameters.
  • the first headset modifies the incoming voice signal xin by 3D audio spatialization as a function of the current position, to generate audio signals xL, xR that are tailored to cause the speaker elements to jointly produce sound that is perceived by the user to originate from the current position.
  • the audio signals xL, xR are provided to the respective speaker element.
  • FIG. 5 exemplifies the flow of data/signals during the data processing (FIG. 4) in a first headset configured in accordance with FIG. 3.
  • FIG. 5 does not imply that all data/signals need to be generated or processed concurrently.
  • the communication unit 350 receives an incoming voice signal xin which is transmitted from a second headset and comprises speech sensed by the microphone in the second headset.
  • the communication unit 350 also transmits an outgoing voice signal xout, which comprises speech sensed by the microphone 5 (FIG. 3) in the first headset.
  • the outgoing voice signal xout may be generated by the control unit 310 (FIG. 3) or by dedicated hardware (not shown) in the first headset.
  • in step 401, the ASU 380 receives the incoming voice signal xin from the communication unit 350.
  • in step 402, the positioning unit 370 estimates the current position of the second headset and outputs the corresponding set of position parameters.
  • in step 403, the ASU 380 operates on the incoming voice signal xin and the set of position parameters to generate the audio signals xL, xR.
  • in step 404, the ASU 380 provides the audio signals xL, xR to the speaker elements 330, 340.
  • the first headset may receive more than one incoming voice signal, generated by more than one headset within the group.
  • each of the voice signals may be separately processed as described in the foregoing based on the current position of the associated headset.
  • the resulting left and right audio signals for each voice signal may then be merged into composite left and right audio signals, which are provided to the left speaker element and the right speaker element, respectively.
  • the current position of the second headset may be determined in different ways.
  • the current position is at least partly determined by an external absolute positioning system, which is configured to determine an absolute position for each headset in the group of headsets (FIG. 1).
  • the absolute position may be given in a predefined coordinate system which is common and known to all headsets.
  • the respective headset may receive, e.g. by the communication unit 350, a position signal generated by the positioning system to indicate the current absolute positions of all headsets in the group, and the positioning unit 370 may compute one or more position parameters based on the absolute positions.
  • the positioning unit 370 may also retrieve orientation data (OD in FIG. 5) from the orientation sensor 360 to determine the reference direction D1.
  • the positioning system may implement any known technique for positional tracking, e.g. optical tracking, electromagnetic tracking, or acoustic tracking.
  • the positioning system may comprise a number of signal receivers which have a fixed and known location in the predefined coordinate system and are configured to receive a signal from the respective headset, e.g. transmitted by its communication unit 350.
  • the positioning system may determine, based on the received signals, the distance and/or angle of arrival at each signal receiver and use trilateration and/or triangulation to determine the absolute position of the respective headset.
  • the current position is at least partly determined by an internal absolute positioning system, which may be implemented by the positioning unit 370 in the respective headset and configured to determine an absolute position of the headset in a predefined coordinate system which is common and known to all headsets in the group of headsets (FIG. 1).
  • Each headset may communicate its absolute position to all other headsets in the group, e.g. by transmitting a respective position signal by the communication unit 350.
  • the respective headset may receive, by the communication unit 350, one or more position signals (REF in FIG. 5) that indicate the current absolute position(s) of the other headsets in the group, and the positioning unit 370 may compute one or more position parameters based on the absolute positions.
  • the positioning unit 370 may also retrieve orientation data (OD in FIG. 5) from the orientation sensor 360 to determine the reference direction D1.
  • the positioning system may implement any known technique for positional tracking, e.g. optical tracking, electromagnetic tracking, inertial tracking, or acoustic tracking.
  • the respective headset may receive, e.g. by the communication unit 350, a reference signal from a number of external signal transmitters with a fixed and known location in the predefined coordinate system.
  • the positioning unit 370 may determine, based on the received signals, the distance and/or angle of arrival at one or more antennas of the communication unit 350 and use trilateration and/or triangulation to determine its absolute position.
  • the positioning unit 370 comprises a GPS receiver which is configured to determine an absolute position in the form of a GPS position for the headset.
  • the absolute position may then be handled in the same way as in the second embodiment, i.e. communicated to the other headsets in the group, thereby enabling the positioning unit 370 of the respective headset to compute one or more position parameters based on the absolute positions of the headsets.
  • the positioning unit 370 may also retrieve orientation data (OD in FIG. 5) from the orientation sensor 360 to determine the reference direction D1.
  • the data processing in FIG. 4 is preferably executed in real time, i.e. with minimum time delay between sensing and reproduction of speech. It may therefore be desirable to reduce the need to communicate absolute positions to and from the headsets.
  • This objective may be achieved by a fourth positioning embodiment, in which each headset 1 is configured to independently determine the relative locations of the other headsets. Each relative location may be given by at least one of the position parameters r, θ, φ.
  • the positioning unit 370 comprises one or more head-mounted cameras and is configured to process images from the camera(s) to detect markers on the other headsets and to compute the relative position of the respective headset based on the detected markers in the images.
  • each headset is configured to transmit an electromagnetic reference signal, e.g. by the communication unit 350.
  • the positioning unit 370 in the respective headset receives the reference signal via the communication unit 350 (cf. REF in FIG. 5) and computes the relative position as a function of the reference signal.
  • the reference signal may e.g. be transmitted in connection with or even be included in the incoming voice signal xin. Such a transmission of the reference signal may greatly facilitate the design and operation of the headset.
  • the positioning unit 370 may be configured to estimate the distance r based on the signal strength of the incoming reference signal (e.g. RSSI).
  • the positioning unit 370 may be configured to determine the time of flight (ToF) of the reference signal and estimate the distance r and/or at least one of the angles θ, φ based on this information, as is known in the art.
  • the communication unit 350 comprises at least two antenna elements, and the positioning unit 370 may be configured to compute at least one of the angles θ, φ by comparing the reference signal as received at the respective antenna element (a simple numerical sketch is given after this list).
  • the angle(s) may be determined as a function of the time difference of arrival (TDOA) or the phase-shift at the antenna elements, as is known in the art.
  • the antenna elements may be incorporated in an antenna array.
  • the antenna elements may be configured as physically separated antennas which are arranged on different parts of the headset, e.g. one antenna element on each earpiece (3 in FIG. 1) and/or distributed along a headband (4 in FIG. 1).
  • the fourth positioning embodiment also makes it possible to determine either of the angles θ, φ without the need for an orientation sensor 360, provided that the camera(s) or antenna element(s) are mounted to the head of the user and thus have a fixed and known relation to the reference direction D1 (FIG. 3).
  • the second azimuth angle θ' and/or the second elevation angle φ' may be given by orientation data OD from an orientation sensor 360 in the respective headset, and each headset in the group of headsets is suitably configured to communicate these angles to the other headsets in the group.
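
By way of illustration only, the following Python sketch shows how a distance estimate from received signal strength and an azimuth estimate from the time difference of arrival at two antenna elements could be computed (this is the sketch referred to in the list above). The log-distance path-loss model, the far-field plane-wave geometry, the antenna spacing and all numeric constants and names are assumptions made for the example and are not specified by the present disclosure.

    import numpy as np

    SPEED_OF_LIGHT = 3.0e8  # m/s; the reference signal is an electromagnetic wave

    def distance_from_rssi(rssi_dbm, rssi_at_1m_dbm=-45.0, path_loss_exponent=2.0):
        # Log-distance path-loss model: rssi = rssi_at_1m - 10*n*log10(r).
        # The 1 m reference level and the exponent would need per-device calibration.
        return 10.0 ** ((rssi_at_1m_dbm - rssi_dbm) / (10.0 * path_loss_exponent))

    def azimuth_from_tdoa(delta_t_s, antenna_spacing_m=0.18):
        # Far-field assumption: delta_t = d * sin(theta) / c for two antenna
        # elements spaced d apart (e.g. one element per earpiece).
        s = np.clip(SPEED_OF_LIGHT * delta_t_s / antenna_spacing_m, -1.0, 1.0)
        return float(np.degrees(np.arcsin(s)))

    # Example: -65 dBm received power and a 0.3 ns arrival difference between the
    # two antenna elements give roughly r = 10 m and theta = 30 degrees.
    r_est = distance_from_rssi(-65.0)
    theta_est = azimuth_from_tdoa(0.3e-9)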

Abstract

A headset (1) is configured to be worn by an individual (U1) and comprises a microphone and speaker elements. A communication unit in the headset (1) is configured to transmit a first voice signal generated by the microphone and to receive a second voice signal from a second headset worn by another individual (U2). The headset (1) is operated, e.g. by program instructions provided on a computer-readable medium, to determine a current position of the second headset (1) and to operate the speaker elements to reproduce sound based on the second voice signal such that the sound represents the current position in relation to the headset (1). For example, the sound may be reproduced to represent a distance (r) and/or an angle (θ, φ) between the headset (1) and the current position, to give the individual (U1) an intuitive understanding of the location of the individual (U2) wearing the second headset (1).

Description

HEADSET AND METHOD OF OPERATING HEADSET
Technical Field
The present invention relates generally to headsets and in particular to headsets for use in a collaborative setting in which users communicate by headsets while being located in proximity of each other.
Background Art
Headsets are commonly used for communication between individuals. Generally, a headset is a device worn on or around the head of an individual and comprises a microphone and one or more loudspeakers. The microphone is operated to sense sound waves when the individual speaks and to generate a corresponding electric signal which is output from the headset. Each loudspeaker is arranged adjacent to one of the individual's ears, or in the ear canal, and is operated to generate sound waves that reproduce an incoming electric audio signal that comprises speech sensed by the microphone of another headset. Such headsets are e.g. commonly used for telephone communication. Headsets are sometimes also referred to as headphones or earphones, although these terms normally designate a sound-reproducing device without a microphone.
Headsets have also found use in collaborative settings, in which users are located in each other's physical proximity and communicate by headsets to jointly complete a task. The headsets guarantee that the collaborating users can properly communicate with each other even under adverse conditions, e.g. in noisy environments and/or when the distance between the users or the presence of obstacles makes communication difficult or increases the risk of misunderstandings. Such collaborative settings are found in industrial environments, such as factories, smelting plants, construction sites, etc., as well as at pit stops for racing vehicles.
Headsets are also used for collaborative applications in virtual reality (VR) environments. In VR, each user wears a VR headset that provides virtual reality for the user, via images shown to the user on a stereoscopic head-mounted display. The VR headset also typically includes a microphone and a pair of loudspeakers providing stereo sound. VR headsets are widely used with computer games but they are also used in other applications, including simulators and trainers. Headsets may also be designed for augmented reality (AR), optionally in combination with VR.
In all types of scenarios where two or more users communicate by headsets, the users typically identify who is talking by recognizing the voice of the talker, which can be difficult when voices are unfamiliar or sound alike. This may lead to misunderstandings that disrupt the collaborative effort and may even lead to accidents and injuries, e.g. in the above-mentioned industrial environments and pit stops.
The prior art comprises US2011/0164188, which discloses a remote control device for use with an audio visual (AV) system capable of displaying 3D images. The remote control device enables the AV system to track the position of viewers that hold a respective remote control device relative to a display, thereby allowing the AV system to deliver optimized 2D or 3D content to the respective viewer on the display. Further, the AV system may deliver audio to the respective viewer, via stand-alone loudspeakers or a headset, with a volume and distribution which is adapted to the viewer position. The headset may include a microphone to enable each viewer to interact with the AV system through voice commands.
Brief Summary
It is an objective of the invention to at least partly overcome one or more limitations of the prior art.
Another objective is to facilitate collaboration between users that communicate by headsets.
A further objective is to facilitate identification of the person talking by users that communicate by headsets.
One or more of these objectives, as well as further objectives that may appear from the description below, are at least partly achieved by a headset, a method of operating a headset and a computer-readable medium according to the independent claims, embodiments thereof being defined by the dependent claims.
In a first aspect, there is provided a headset comprising: a microphone; first and second speaker elements configured to be mounted over, on or in a respective ear of an individual; a communication unit configured to transmit a first voice signal generated by the microphone and to receive a second voice signal from a second headset; a positioning unit configured to determine a current position of the second headset; and a control unit configured to operate the first and second speaker elements to reproduce sound based on the second voice signal, wherein the sound is reproduced to represent the current position of the second headset in relation to the headset.
In some embodiments, the second voice signal comprises speech sensed by a microphone in the second headset.
In some embodiments, the positioning unit is configured to determine at least one position parameter representing the current position, and the control unit is configured to generate audio signals for the first and second speaker elements as a function of the second voice signal and the at least one position parameter, and provide the audio signals to the speaker elements so as to cause the speaker elements to reproduce the sound with a spatial origin given by the at least one position parameter.
In some embodiments, said at least one position parameter comprises one or more of: a distance from the headset to the current position of the second headset, and an angle between a reference direction of the headset and the current position of the second headset.
In some embodiments, the reference direction is predefined with respect to the speaker elements of the headset so as to follow head movement of the individual.
In some embodiments, said at least one position parameter further comprises an orientation of the second headset.
In some embodiments, the communication unit is configured to receive a reference signal from the second headset, and the positioning unit is configured to compute at least one of the distance and the angle as a function of the reference signal.
In some embodiments, the communication unit comprises at least two antenna elements, and the positioning unit is configured to compute the angle by comparing the reference signal as received at the at least two antenna elements.
In some embodiments, the reference signal is included in or received together with the second voice signal.
In some embodiments, the positioning unit is configured to at least partly determine the current position as a function of position data received by the communication unit.
In some embodiments, the position data is received from at least one of an external positioning system and the second headset.
In some embodiments, the position data comprises an absolute position of the second headset in a predefined coordinate system, and the positioning unit is configured to determine an absolute position of the headset in the predefined coordinate system, and determine the current position as a function of the absolute position of the headset and the absolute position of the second headset.
In some embodiments, the absolute position of the headset and the absolute position of the second headset are GPS positions.
In some embodiments, the headset further comprises an orientation sensor configured to generate orientation data for the headset, wherein the positioning unit is further configured to determine the current position as a function of the orientation data.
In some embodiments, the control unit is configured to obtain, based on the at least one position parameter, a head-related transfer function for the first speaker element and the second speaker element, respectively, and operate the respective head- related transfer function on the second voice signal to generate the audio signals for the first and second speaker elements.
In some embodiments, the control unit is configured to operate the first and second speaker elements to reproduce the sound with a spatial origin at the current position.
In some embodiments, the communication unit is configured for wireless short-range communication with the second headset.
In a second aspect, there is provided a method of operating a headset comprising first and second speaker elements configured to be mounted over, on or in a respective ear of an individual. The method comprises: receiving a voice signal generated by a second headset; obtaining a current position of the second headset; and operating the first and second speaker elements to reproduce sound based on the voice signal, such that the sound is reproduced to represent the current position of the second headset in relation to the headset.
In a third aspect, there is provided a computer-readable medium comprising program instructions which, when executed by a control unit, cause the control unit to perform the method of the second aspect. The computer-readable medium may be a tangible (non-transitory) product (e.g. magnetic medium, optical disk, read-only memory, flash memory, etc) or a propagating signal.
Any one of the embodiments of the first aspect may be adapted as an embodiment of the second and third aspects.
Other objectives, features and aspects of the present invention, as well as further advantages, will appear from the following detailed description, from the attached claims as well as from the drawings.
Brief Description of Drawings
Embodiments of the invention will now be described in more detail with reference to the accompanying schematic drawings.
FIG. 1 is a top plan view of headsets operated in a collaborative setting.
FIG. 2 is a top plan view of a headset configured to modify an incoming voice signal from another headset based on the relative positioning of the headsets.
FIG. 3 is a block diagram of a headset according to an embodiment.
FIG. 4 is a flow chart of the data processing in the headset of FIG. 3.
FIG. 5 is a schematic view of signal paths in the headset of FIG. 3 when performing the data processing of FIG. 4.
Detailed Description of Example Embodiments
Embodiments of the present invention will now be described more fully hereinafter with reference to the accompanying drawings, in which some, but not all, embodiments of the invention are shown. Like reference signs refer to like elements throughout.
Also, it will be understood that, where possible, any of the advantages, features, functions, devices or operational aspects of any of the embodiments of the present invention described or contemplated herein may be included in any of the other embodiments of the present invention described or contemplated herein. In addition, where possible, any terms expressed in the singular form herein are meant to also include the plural form, and vice versa, unless explicitly stated otherwise. As used herein, "at least one" shall mean "one or more" and these phrases are intended to be interchangeable. Accordingly, the terms "a" and "an" shall mean "at least one" or "one or more," even though the phrase "one or more" or "at least one" is also used herein. As used herein, except where the context requires otherwise owing to express language or necessary implication, the word "comprise" or variations such as "comprises" or "comprising" is used in an inclusive sense, that is, to specify the presence of the stated features but not to preclude the presence or addition of further features in various embodiments of the invention.
It will furthermore be understood that, although the terms first, second, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first element could be termed a second element, and, similarly, a second element could be termed a first element, without departing from the scope of the present invention. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
Well-known functions or constructions may not be described in detail for brevity and/or clarity. Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
Before describing embodiments of the invention in more detail, a few definitions will be given.
As used herein, a "headset" refers to any system that comprises a microphone which is arranged to sense sound generated by a user talking, and at least one speaker element for each ear of the user. The headset may be an integral device to be worn on or around the head of the user. Alternatively, the headset may comprise a part to be worn on or around the head, and a part located elsewhere. At least the speaker elements are included in the head- worn part.
As used herein, a "speaker element" or "speaker" refers to an electroacoustic transducer or "driver" that converts an electrical signal into a corresponding air- conducted signal. This driver is also commonly referred to as a "speaker capsule" in the art. The speakers may be of any kind, including but not limited to electrostatic speakers, magnetostrictive speakers, magnetostatic speakers, ribbon speakers, bending wave speakers, flat panel speakers, and thermo acoustic speakers.
As used herein, a "microphone" refers to an airborne sound sensing transducer that converts an air-conducted signal into an electrical signal. The microphone may be of any kind, including but not limited to electret microphone, magneto-dynamic microphone, condenser microphone, piezoelectric microphone, fiber optic microphone, and MEMS microphone.
As used herein, "sound" or "sound wave" is a vibration that propagates as an audible mechanical wave of pressure and displacement through a medium.
As used herein, "wireless short-range communication" refers to any form of wireless communication by use of a wireless communication transceiver that has a limited range, typically less than approximately 500 m. A wireless short-range transceiver thus implements any available wireless local area network (WLAN) technology and/or wireless personal area network (WPAN) technology, such as communication technology based on the IEEE 802.11 standards, denoted Wi-Fi herein, and Bluetooth communication technology based on any standard specified by Bluetooth SIG.
Embodiments of the invention are particularly, but not exclusively, suited to enable improved collaboration between individuals that communicate by use of headsets and are located in proximity of each other, typically within range for wireless short-range communication. FIG. 1 illustrates such a situation, in which three individuals or users U1, U2, U3 with a respective headset 1 may collaborate with respect to a physical or virtual object 2. In the illustrated example, each headset 1 comprises a pair of earpieces 3 which are joined by a headband 4 and are located over the ears of the individual. Each earpiece 3 comprises a speaker element (not shown). A microphone 5 is attached to one of the earpieces 3 by a microphone arm 6 and is located at the mouth of the wearer to sense speech.
The headsets 1 define a group and communicate wirelessly with each other within the group, via one or more antennas 7 (one shown) on the respective headset 1.
Specifically, each headset 1 is configured to wirelessly transmit outgoing voice signals comprising speech sensed by the microphone 5 in the headset 1 and to wirelessly receive incoming voice signals comprising speech sensed by the microphones 5 in the other headsets 1. The incoming voice signals are processed and transmitted to the speaker elements of the respective headset 1 to reproduce corresponding sound to the wearer. In a conventional headset, stereo sound is reproduced by the speaker elements to have an undefined origin somewhere inside the head of the wearer.
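Purely by way of illustration, the exchange of voice signals within the group could be organized as frames that tag each block of microphone samples with the sending headset and, optionally, carry position data as discussed further below. The Python sketch below is an assumption made for the example; the field names, the sample format and the transport are not specified by the present disclosure.

    from dataclasses import dataclass
    from typing import Optional, Tuple
    import numpy as np

    @dataclass
    class VoiceFrame:
        # One block of speech from microphone 5, tagged with the sending headset.
        sender_id: int                                         # identifies the sending headset 1
        samples: np.ndarray                                     # PCM block from the microphone
        position: Optional[Tuple[float, float, float]] = None  # optional absolute position
        heading_deg: Optional[float] = None                    # optional orientation (cf. D2)

    # Example: 10 ms of speech at 48 kHz from headset 2, carrying its current position.
    frame = VoiceFrame(sender_id=2,
                       samples=np.zeros(480, dtype=np.int16),
                       position=(3.1, 0.4, 0.0),
                       heading_deg=270.0)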
Embodiments of the invention are based on the insight that significant technical advantages may be achieved by configuring the headset to instead reproduce the sound with an origin outside of the wearer, and specifically to represent the physical location of the individual that generates the sound. For example, the speech may be reproduced so as to appear to originate in the direction of the individual that is currently speaking. In another example, the speech may be reproduced to reflect the distance from the wearer to the speaking individual. In certain embodiments, the speech may be reproduced to reflect the direction of the head, and thus the mouth, of the speaking individual.
Embodiments of the invention may thereby to some extent create, for the individuals that communicate by the headsets, an experience that resembles a real-life conversation without headsets. The participants may, by simply listening to each other speaking, gain an intuitive understanding of where the respective participant is located in the collaborative space. Such an understanding will facilitate collaboration between the participants, e.g. by reducing the risk for collisions and misunderstandings. Further, the intuitive understanding of the location will also make it easier for the participants to identify who is talking at each time point. This is in contrast to a conventional headset, in which the listener has to be able to discriminate between the voices of the participants to be able to identify the individual that is talking.
FIG. 2 shows a more detailed example of the structure for sound reproduction in a headset 1 worn by user U1 when receiving an incoming voice signal xin that comprises speech sensed by a microphone 5 in a headset 1 worn by user U2. The voice signal xin is typically received by the headset 1 of user U1 in real time. As understood from the foregoing, the receiving headset 1 is configured to apply a sound effect to the incoming voice signal xin that manipulates the sound produced by the speaker elements of the receiving headset 1, so as to achieve a virtual placement of the origin of the speech that represents the location of user U2 in relation to user U1. Such signal manipulation is also known in the art as "3D audio spatialization". The current location of user U2 relative to user U1 is given by a set of position parameters, represented by (r, θ, φ, θ', φ') in FIG. 2.
In the illustrated example, the receiving headset 1 is configured to operate so-called head-related transfer functions (HRTFs) on the voice signal xin to achieve the desired signal manipulation. Specifically, the receiving headset 1 comprises an HRTF engine 20 which is configured to obtain a suitable pair of HRTFs based on the current values of the position parameters. The HRTF engine 20 may retrieve the pair of HRTFs from an HRTF database, which associates pairs of predefined HRTFs with position parameter values. Such a database may be generic for all users or tailored to user U1. The HRTF engine 20 may be configured to select the best match of HRTFs from the database given the current position parameter values, or perform an interpolation among the HRTFs in the database based on the current position parameter values. The selected pair of HRTFs is then operated on the voice signal xin to produce left and right audio signals xL, xR for the left and right speaker elements, respectively. This operation may be done in the time domain, as shown in FIG. 2, in which the HRTFs are supplied in the form of head-related impulse responses hL, hR to a respective convolution unit 21, 22 that performs a time-convolution between hL, hR and xin to produce the audio signals xL, xR. Alternatively, the operation may be performed in the frequency domain. The use of HRTFs for 3D audio spatialization in either the time or frequency domain is well-known in the art.
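As a non-limiting illustration of the time-domain processing described above, the following Python sketch selects the pair of head-related impulse responses closest to the current direction from a small database and convolves them with the incoming voice signal to produce the audio signals xL, xR. The database layout, the nearest-neighbour selection (rather than interpolation) and all names are assumptions made for the example.

    import numpy as np

    def spatialize(x_in, azimuth_deg, elevation_deg, hrir_db):
        # hrir_db maps (azimuth, elevation) in degrees to a pair (h_L, h_R) of
        # head-related impulse responses (cf. HRTF engine 20 and its database).
        key = min(hrir_db,
                  key=lambda k: (k[0] - azimuth_deg) ** 2 + (k[1] - elevation_deg) ** 2)
        h_L, h_R = hrir_db[key]
        # Time-domain convolution (cf. convolution units 21 and 22).
        return np.convolve(x_in, h_L), np.convolve(x_in, h_R)

    # Toy database: impulse responses that differ only in delay, which already
    # yields an interaural time difference that shifts the perceived direction.
    hrir_db = {(az, 0): (np.eye(1, 64, 8 + az // 30).ravel(),
                         np.eye(1, 64, 8 - az // 30).ravel())
               for az in range(-90, 91, 30)}
    x_in = np.random.randn(480)   # placeholder for 10 ms of speech at 48 kHz
    x_L, x_R = spatialize(x_in, azimuth_deg=45.0, elevation_deg=0.0, hrir_db=hrir_db)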
As indicated in FIG. 2, the set of position parameters that represent the current location of user U2 may include a distance r, a first azimuth angle θ, a first elevation angle φ, a second azimuth angle θ', and a second elevation angle φ'. The distance r is defined to extend from a first reference position P1 of the headset 1 on user U1 to a second reference position P2 of the headset 1 on user U2. The first azimuth angle θ is defined in the plane of the drawing (i.e. the horizontal plane of the users) between a line extending from P1 to P2 and a first reference direction D1, which coincides with the viewpoint direction of user U1 and thus follows head movement of user U1. The first reference direction D1 may, e.g., be predefined with respect to the speaker elements of the headset so as to follow head movement of the individual. The first elevation angle φ (not indicated) corresponds to the first azimuth angle θ but is defined in a plane perpendicular to the plane of the drawing (i.e. in the vertical plane of the users). The second azimuth angle θ' is similar to the first azimuth angle θ but is defined in relation to a second reference direction D2, which coincides with the viewpoint direction of user U2 and thus follows head movement of user U2. The second elevation angle φ' (not indicated) is similar to the first elevation angle φ but is defined in the vertical plane of user U2.
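As a purely illustrative sketch of this geometry, the distance r, the first azimuth angle θ and the first elevation angle φ may be computed from the reference positions P1, P2 and the reference direction D1 as follows; the Cartesian coordinate convention (z vertical) and all function names are assumptions for illustration only.

```python
import numpy as np

def position_parameters(p1, p2, d1):
    """Compute the distance r, the first azimuth angle theta and the first
    elevation angle phi from reference position p1 to p2, relative to the
    reference direction d1. Vectors are (x, y, z) with z as the vertical axis."""
    v = np.asarray(p2, float) - np.asarray(p1, float)
    r = float(np.linalg.norm(v))                      # distance r
    d1 = np.asarray(d1, float)
    # Azimuth: angle between D1 and the P1->P2 line, measured in the horizontal plane.
    theta = np.arctan2(v[1], v[0]) - np.arctan2(d1[1], d1[0])
    theta = (theta + np.pi) % (2 * np.pi) - np.pi     # wrap to (-pi, pi]
    # Elevation: angle of the P1->P2 line above the horizontal plane.
    phi = float(np.arcsin(v[2] / r)) if r > 0 else 0.0
    return r, theta, phi
```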
In certain embodiments, the reproduction of sound may be based only on a subset of the illustrated position parameters. In one example, the reproduction of sound may be based on only the first azimuth angle θ to mimic the direction to the talker in the horizontal plane. In another example, the reproduction of sound may be based on the combination of the first azimuth angle θ and the first elevation angle φ to mimic the direction to the talker in three dimensions. In yet another example, the reproduction of sound may be based on the combination of the first azimuth angle θ and the distance r, and optionally the first elevation angle φ. The second azimuth angle θ' and/or the second elevation angle φ' may be added to any of these examples to even further reproduce a real-life listening experience (i.e. without headset). It should be understood that different combinations of position parameters may be suitable for different types of scenarios with or without collaboration between the wearers of the headsets.
FIG. 3 illustrates a general structure of a headset 1 in accordance with embodiments of the invention. The headset 1 comprises a controller or control unit 310 which is responsible for the overall operation of the headset 1 and may be implemented by any commercially available CPU ("Central Processing Unit"), DSP ("Digital Signal Processor"), microprocessor or other electronic programmable logic device. The controller 310 may be implemented using instructions that enable hardware functionality, for example, by using executable computer program instructions that may be stored on a computer memory 320. The controller 310 may be configured to read the instructions from the memory 320 and execute these instructions to control the operation of the headset 1. The memory 320 may be implemented using any commonly known technology for computer-readable memories such as ROM, RAM, SRAM, DRAM, CMOS, FLASH, DDR, SDRAM or some other memory technology. The headset 1 further comprises left and right speaker elements 330, 340 which are arranged to register with or be inserted into a user's left ear and right ear, respectively. The speaker elements 330, 340 may thus be located around, on or in the ears of the user. Further, a microphone 5 is arranged to sense sound waves generated by the user when speaking. The headset 1 further comprises a communication unit or module 350 which is configured as a transceiver for short-range wireless communication, e.g. by Wi-Fi or Bluetooth, or a combination thereof. It is understood that the communication unit 350 includes one or more antenna elements (cf. 7 in FIG. 1), which may be physically separated or arranged as an antenna array. The headset 1 may further comprise an orientation sensor 360 which is configured to sense the momentary orientation of the head of the user, and thus the reference direction D2 (FIG. 2). The orientation sensor 360 may comprise one or more gyroscopes, magnetometers or acceleration sensors, or any combination thereof.
In the example of FIG. 3, the controller 310 is configured to implement a positioning unit 370 which computes momentary values of the above-mentioned set of position parameters. Various implementations of the positioning unit 370 will be described further below.
In FIG. 3, the controller 310 is also configured to implement an audio spatialization unit, ASU 380, which generates electrical audio signals xL, xR for the speaker elements 330, 340 by use of any known 3D audio spatialization technique and based on the set of position parameters provided by the positioning unit 370. The ASU 380 may e.g. implement the HRTF technique as described in relation to FIG. 2. Thus, the HRTF engine 20 and the convolution units 21, 22 of FIG. 2 may be included in the ASU 380, and the above-mentioned HRTF database may be stored in memory 320. As is known to the person skilled in the art, the HRTF technique is primarily useful for achieving directional spatialization, e.g. based on the first azimuth angle θ and/or the first elevation angle φ (FIG. 2). To the extent that the set of position parameters includes the distance r (FIG. 2), the ASU 380 may be configured to reproduce the distance r in the sound by any well-known technique, e.g. by generating the audio signals xL, xR with a selected loudness and/or initial time delay and/or mix of direct and reverberant sound. For example, the loudness in xL, xR may be reduced as a function of the distance r. The ASU 380 may be further configured to reproduce the effect of the second azimuth angle θ' and/or the second elevation angle φ', e.g. by selectively modifying one or more of the loudness, the initial time delay and the mix of direct and reverberant sound.
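A minimal sketch of how the distance r might be reproduced in the sound by loudness reduction and an initial time delay is given below; the 1/r gain law, the sample rate and all names are assumptions for illustration only, and other distance cues (e.g. reverberation) are omitted.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s
SAMPLE_RATE = 48000     # Hz (assumed)

def apply_distance_cues(xL, xR, r, r_ref=1.0):
    """Reduce loudness as a function of the distance r and prepend an initial
    time delay proportional to r to both audio signals."""
    gain = r_ref / max(r, r_ref)                          # loudness falls off with distance
    delay = int(round(r / SPEED_OF_SOUND * SAMPLE_RATE))  # initial time delay in samples
    pad = np.zeros(delay)
    return gain * np.concatenate([pad, xL]), gain * np.concatenate([pad, xR])
```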
Although the positioning unit 370 and the ASU 380 are illustrated in FIG. 3 as implemented by the controller 310, it is conceivable that either of these units 370, 380 is at least partly implemented by dedicated hardware which is structurally and functionally separate from the controller 310.
FIG. 4 is a flowchart of data processing steps executed by a headset in an embodiment of the invention. The headset, designated "headset i" in FIG. 4 and denoted "first headset" below, may be included among the headsets 1 in FIG. 1 and communicates with at least one other headset, designated "headset j" in FIG. 4 and denoted "second headset" below. In step 401, the first headset receives the incoming voice signal xin from the second headset. In step 402, the first headset estimates the current position of the second headset, e.g. by computing or otherwise obtaining one or more of the above-mentioned position parameters. In step 403, the first headset modifies the incoming voice signal xin by 3D audio spatialization as a function of the current position, to generate audio signals xL, xR that are tailored to cause the speaker elements to jointly produce sound that is perceived by the user to originate from the current position. In step 404, the audio signals xL, xR are provided to the respective speaker element.
FIG. 5 exemplifies the flow of data/signals during the data processing (FIG. 4) in a first headset configured in accordance with FIG. 3. FIG. 5 does not imply that all data/signals need to be generated or processed concurrently. In the illustrated example, the communication unit 350 receives an incoming voice signal xin which is transmitted from a second headset and comprises speech sensed by the microphone in the second headset. The communication unit 350 also transmits an outgoing voice signal xout, which comprises speech sensed by the microphone 5 (FIG. 3) in the first headset. The outgoing voice signal xout may be generated by the control unit 310 (FIG. 3) or by dedicated hardware (not shown) in the first headset. In step 401, the ASU 380 receives the incoming voice signal xin from the communication unit 350. In step 402, the positioning unit 370 estimates the current position of the second headset and outputs the corresponding set of position parameters. In step 403, the ASU 380 operates on the incoming voice signal xin and the set of position parameters to generate the audio signals xL, xR. In step 404, the ASU 380 provides the audio signals xL, xR to the speaker elements 330, 340.
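Tying steps 401-404 to the units of FIG. 3, a minimal processing sketch could look as follows; the headset interface (positioning_unit, left_speaker, right_speaker) and the reuse of the helper functions sketched above are assumptions for illustration, not the disclosed implementation.

```python
import numpy as np

def process_incoming_voice(headset, x_in):
    """One pass of steps 401-404 for an incoming voice signal from a second headset."""
    # Step 401: the incoming voice signal x_in has been received from the second headset.
    # Step 402: estimate the current position of the second headset (position parameters).
    r, theta, phi = headset.positioning_unit.estimate_position()
    # Step 403: 3D audio spatialization as a function of the current position.
    xL, xR = spatialize(x_in, np.degrees(theta), np.degrees(phi))
    xL, xR = apply_distance_cues(xL, xR, r)
    # Step 404: provide the audio signals to the respective speaker element.
    headset.left_speaker.play(xL)
    headset.right_speaker.play(xR)
```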
It is realized that the first headset may receive more than one incoming voice signal, generated by more than one headset within the group. To the extent that the incoming voice signals overlap in time, each of the voice signals may be separately processed as described in the foregoing based on the current position of the associated headset. The resulting left and right audio signals for each voice signal may then be merged into composite left and right audio signals, which are provided to the left speaker element and the right speaker element, respectively.
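The merging of several separately spatialized voice signals into composite left and right audio signals could, as a simple sketch, be an element-wise summation; the padding of signals to equal length is an assumption for illustration.

```python
import numpy as np

def mix_voice_signals(spatialized):
    """Merge a list of separately spatialized (xL, xR) pairs into composite
    left and right audio signals by summation."""
    n = max(max(len(xL), len(xR)) for xL, xR in spatialized)
    left, right = np.zeros(n), np.zeros(n)
    for xL, xR in spatialized:
        left[:len(xL)] += xL
        right[:len(xR)] += xR
    return left, right
```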
The current position of the second headset may be determined in different ways. In a first positioning embodiment, the current position is at least partly determined by an external absolute positioning system, which is configured to determine an absolute position for each headset in the group of headsets (FIG. 1). The absolute position may be given in a predefined coordinate system which is common and known to all headsets. The respective headset may receive, e.g. by the communication unit 350, a position signal generated by the positioning system to indicate the current absolute positions of all headsets in the group, and the positioning unit 370 may compute one or more position parameters based on the absolute positions. To compute either of the angles θ, φ, the positioning unit 370 may also retrieve orientation data (OD in FIG. 5) from the orientation sensor 360 to determine the reference direction D1. The positioning system may implement any known technique for positional tracking, e.g. optical tracking, electromagnetic tracking, or acoustic tracking. In the example of electromagnetic tracking, the positioning system may comprise a number of signal receivers which have a fixed and known location in the predefined coordinate system and are configured to receive a signal from the respective headset, e.g. transmitted by its communication unit 350. The positioning system may determine, based on the received signals, the distance and/or angle of arrival at each signal receiver and use trilateration and/or triangulation to determine the absolute position of the respective headset.
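By way of example only, trilateration from distances measured at fixed signal receivers may be sketched as a linearised least-squares problem; the restriction to two dimensions and all names are assumptions for illustration.

```python
import numpy as np

def trilaterate(receivers, distances):
    """Estimate a position from known receiver positions and measured distances.
    The quadratic range equations are linearised by subtracting the equation of
    the first receiver, and solved in the least-squares sense."""
    receivers = np.asarray(receivers, float)   # shape (N, 2), N >= 3
    distances = np.asarray(distances, float)   # shape (N,)
    x0, d0 = receivers[0], distances[0]
    A = 2.0 * (receivers[1:] - x0)
    b = (d0 ** 2 - distances[1:] ** 2
         + np.sum(receivers[1:] ** 2, axis=1) - np.sum(x0 ** 2))
    position, *_ = np.linalg.lstsq(A, b, rcond=None)
    return position

# Example: three receivers at known locations and equal measured distances
# place the headset near (5, 5):
# print(trilaterate([(0, 0), (10, 0), (0, 10)], [7.07, 7.07, 7.07]))
```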
In a second positioning embodiment, the current position is at least partly determined by an internal absolute positioning system, which may be implemented by the positioning unit 370 in the respective headset and configured to determine an absolute position of the headset in a predefined coordinate system which is common and known to all headsets in the group of headsets (FIG. 1). Each headset may communicate its absolute position to all other headsets in the group, e.g. by transmitting a respective position signal by the communication unit 350. Thereby, the respective headset may receive, by the communication unit 350, one or more position signals (REF in FIG. 5) that indicate the current absolute position(s) of the other headsets in the group, and the positioning unit 370 may compute one or more position parameters based on the absolute positions. To compute either of the angles θ, φ, the positioning unit 370 may also retrieve orientation data (OD in FIG. 5) from the orientation sensor 360 to determine the reference direction D1. The positioning system may implement any known technique for positional tracking, e.g. optical tracking, electromagnetic tracking, inertial tracking, or acoustic tracking. In the example of electromagnetic tracking, the respective headset may receive, e.g. by the communication unit 350, a reference signal from a number of external signal transmitters with a fixed and known location in the predefined coordinate system. The positioning unit 370 may determine, based on the received signals, the distance and/or angle of arrival at one or more antennas of the communication unit 350 and use trilateration and/or triangulation to determine its absolute position.
It may be desirable to dispense with the need for external devices, to reduce cost and facilitate deployment. This objective may be achieved by a third positioning embodiment, in which the positioning unit 370 comprises a GPS receiver which is configured to determine an absolute position in the form of a GPS position for the headset. The absolute position may then be handled in the same way as in the second embodiment, i.e. communicated to the other headsets in the group, thereby enabling the positioning unit 370 of the respective headset to compute one or more position parameters based on the absolute positions of the headsets. Like in the first and second positioning embodiments, the positioning unit 370 may also retrieve orientation data (OD in FIG. 5) from the orientation sensor 360 to determine the reference direction D1.
The data processing in FIG. 4 is preferably executed in real time, i.e. with minimum time delay between sensing and reproduction of speech. It may therefore be desirable to reduce the need to communicate absolute positions to and from the headsets. This objective may be achieved by a fourth positioning embodiment, in which each headset 1 is configured to independently determine the relative locations of the other headsets. Each relative location may be given by at least one of the position parameters r, θ, φ. In one implementation, the positioning unit 370 comprises one or more head-mounted cameras and is configured to process images from the camera(s) to detect markers on the other headsets and to compute the relative position of the respective headset based on the detected markers in the images. In another implementation, which is currently believed to be more accurate and robust, less costly and simpler to deploy, each headset is configured to transmit an electromagnetic reference signal, e.g. by the communication unit 350. In such an implementation, the positioning unit 370 in the respective headset receives the reference signal via the communication unit 350 (cf. REF in FIG. 5) and computes the relative position as a function of the reference signal. The reference signal may e.g. be transmitted in connection with or even be included in the incoming voice signal xin. Such a transmission of the reference signal may greatly facilitate the design and operation of the headset. In one example, the positioning unit 370 may be configured to estimate the distance r based on the signal strength of the incoming reference signal (e.g. RSSI). In another example, the positioning unit 370 may be configured to determine the time of flight (ToF) of the reference signal and estimate the distance r and/or at least one of the angles θ, φ based on this information, as is known in the art. In yet another example, the communication unit 350 comprises at least two antenna elements, and the positioning unit 370 may be configured to compute at least one of the angles θ, φ by comparing the reference signal as received at the respective antenna element. For example, the angle(s) may be determined as a function of the time difference of arrival (TDOA) or the phase-shift at the antenna elements, as is known in the art. The antenna elements may be incorporated in an antenna array.
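Two of these estimates can be sketched compactly: a log-distance path-loss model for RSSI-based distance estimation, and a far-field TDOA model for the azimuth angle at two antenna elements. The calibration constants (RSSI at 1 m, path-loss exponent) and the plane-wave assumption are illustrative assumptions, not values from the disclosure.

```python
import numpy as np

SPEED_OF_LIGHT = 3.0e8  # m/s

def distance_from_rssi(rssi_dbm, rssi_at_1m_dbm=-40.0, path_loss_exponent=2.0):
    """Estimate the distance r from the received signal strength of the
    reference signal using a log-distance path-loss model."""
    return 10 ** ((rssi_at_1m_dbm - rssi_dbm) / (10 * path_loss_exponent))

def azimuth_from_tdoa(tdoa_s, antenna_spacing_m):
    """Estimate the azimuth angle from the time difference of arrival of the
    reference signal at two antenna elements, assuming far-field geometry."""
    sin_theta = np.clip(SPEED_OF_LIGHT * tdoa_s / antenna_spacing_m, -1.0, 1.0)
    return np.arcsin(sin_theta)
```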
Alternatively, the antenna elements may be configured as physically separated antennas which are arranged on different parts of the headset, e.g. one antenna element on each earpiece (3 in FIG. 1) and/or distributed along a headband (4 in FIG. 1).
The fourth positioning embodiment also makes it possible to determine either of the angles θ, φ without the need for an orientation sensor 360, provided that the camera(s) or antenna element(s) are mounted to the head of the user and thus have a fixed and known relation to the reference direction D1 (FIG. 3). To the extent that either of the angles θ', φ' are used by step 403, these angles may be given by orientation data OD from an orientation sensor 360 in the respective headset, and each headset in the group of headsets is suitably configured to communicate these angles to the other headsets in the group.
The description given above relates to various general and specific embodiments, but the scope of the invention is limited only by the appended claims.

Claims

1. A headset, comprising:
a microphone (5),
first and second speaker elements (330, 340) configured to be mounted over, on or in a respective ear of an individual,
a communication unit (350) configured to transmit a first voice signal (xout) generated by the microphone (5) and to receive a second voice signal (xin) from a second headset (1),
a positioning unit (370) configured to determine a current position of the second headset (1), and
a control unit (310) configured to operate the first and second speaker elements (330, 340) to reproduce sound based on the second voice signal (xin), wherein the sound is reproduced to represent the current position of the second headset (1) in relation to the headset.
2. The headset of claim 1, wherein the second voice signal (xin) comprises speech sensed by a microphone (5) in the second headset (1).
3. The headset of claim 1 or 2, wherein the positioning unit (370) is configured to determine at least one position parameter (r, θ, φ, θ', φ') representing the current position, and wherein the control unit (310) is configured to generate audio signals (xL, xR) for the first and second speaker elements (330, 340) as a function of the second voice signal (xin) and the at least one position parameter (r, θ, φ, θ', φ'), and provide the audio signals (xL, xR) to the speaker elements (330, 340) so as to cause the speaker elements (330, 340) to reproduce the sound with a spatial origin given by the at least one position parameter (r, θ, φ, θ', φ').
4. The headset of claim 3, wherein said at least one position parameter comprises one or more of: a distance (r) from the headset to the current position of the second headset (1), and an angle (θ, φ) between a reference direction (D1) of the headset and the current position of the second headset (1).
5. The headset of claim 4, wherein the reference direction (D1) is predefined with respect to the speaker elements (330, 340) of the headset so as to follow head movement of the individual.
6. The headset of claim 4 or 5, wherein said at least one position parameter further comprises an orientation (θ', φ') of the second headset (1).
7. The headset of any one of claims 4-6, wherein the communication unit (350) is configured to receive a reference signal (REF) from the second headset (1), and wherein the positioning unit (370) is configured to compute at least one of the distance (r) and the angle (θ, φ) as a function of the reference signal (REF).
8. The headset of claim 7, wherein the communication unit (350) comprises at least two antenna elements (7), and wherein the positioning unit (370) is configured to compute the angle (θ, φ) by comparing the reference signal (REF) as received at the at least two antenna elements (7).
9. The headset of claim 7 or 8, wherein the reference signal (REF) is included in or received together with the second voice signal (xin).
10. The headset of any one of claims 1-3, wherein the positioning unit (370) is configured to at least partly determine the current position as a function of position data received by the communication unit (350).
11. The headset of claim 10, wherein the position data is received from at least one of an external positioning system and the second headset (1).
12. The headset of claim 10 or 11, wherein the position data comprises an absolute position of the second headset (1) in a predefined coordinate system, wherein the positioning unit (370) is configured to determine an absolute position of the headset in the predefined coordinate system, and determine the current position as a function of the absolute position of the headset and the absolute position of the second headset (1).
13. The headset of claim 12, wherein the absolute position of the headset and the absolute position of the second headset (1) are GPS positions.
14. The headset of claim 12 or 13, which further comprises an orientation sensor (360) configured to generate orientation data (OD) for the headset, wherein the positioning unit (370) is further configured to determine the current position as a function of the orientation data (OD).
15. The headset of claim 3, wherein the control unit (310) is configured to obtain, based on the at least one position parameter, a head-related transfer function for the first speaker element (330) and the second speaker element (340), respectively, and operate the respective head-related transfer function on the second voice signal (xin) to generate the audio signals (xL, xR) for the first and second speaker elements (330, 340).
16. The headset of any preceding claim, wherein the control unit (310) is configured to operate the first and second speaker elements (330, 340) to reproduce the sound with a spatial origin at the current position.
17. The headset of any preceding claim, wherein the communication unit (350) is configured for wireless short-range communication with the second headset (1).
18. A method of operating a headset (1) comprising first and second speaker elements (330, 340) configured to be mounted over, on or in a respective ear of an individual, said method comprising:
receiving a voice signal (xin) generated by a second headset (1),
obtaining a current position of the second headset (1), and
operating the first and second speaker elements (330, 340) to reproduce sound based on the voice signal (xin), such that the sound is reproduced to represent the current position of the second headset (1) in relation to the headset.
19. A computer-readable medium comprising program instructions which, when executed by a control unit (310), cause the control unit (310) to perform the method of claim 18.
PCT/SE2018/050861 2017-08-31 2018-08-28 Headset and method of operating headset WO2019045622A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
SE1751049-6 2017-08-31
SE1751049 2017-08-31

Publications (1)

Publication Number Publication Date
WO2019045622A1 true WO2019045622A1 (en) 2019-03-07

Family

ID=65527740

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/SE2018/050861 WO2019045622A1 (en) 2017-08-31 2018-08-28 Headset and method of operating headset

Country Status (1)

Country Link
WO (1) WO2019045622A1 (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5426706A (en) * 1991-03-28 1995-06-20 Wood; William H. Remote simultaneous interpretation system
US20030130016A1 (en) * 2002-01-07 2003-07-10 Kabushiki Kaisha Toshiba Headset with radio communication function and communication recording system using time information
US20110164188A1 (en) * 2009-12-31 2011-07-07 Broadcom Corporation Remote control with integrated position, viewer identification and optical and audio test
WO2011139772A1 (en) * 2010-04-27 2011-11-10 James Fairey Sound wave modification
US20140269212A1 (en) * 2013-03-15 2014-09-18 Qualcomm Incorporated Ultrasound mesh localization for interactive systems
US20160123745A1 (en) * 2014-10-31 2016-05-05 Microsoft Technology Licensing, Llc Use of Beacons for Assistance to Users in Interacting with their Environments
GB2545222A (en) * 2015-12-09 2017-06-14 Nokia Technologies Oy An apparatus, method and computer program for rendering a spatial audio output signal

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP4132013A1 (en) * 2021-08-06 2023-02-08 Beijing Xiaomi Mobile Software Co., Ltd. Audio signal processing method, electronic apparatus, and storage medium
US11950087B2 (en) 2021-08-06 2024-04-02 Beijing Xiaomi Mobile Software Co., Ltd. Audio signal processing method, electronic apparatus, and storage medium
WO2023070778A1 (en) * 2021-10-29 2023-05-04 歌尔股份有限公司 Audio output control method and system, and related assemblies
CN116033304A (en) * 2022-08-31 2023-04-28 荣耀终端有限公司 Audio output method, electronic equipment and readable storage medium
CN116033304B (en) * 2022-08-31 2023-09-29 荣耀终端有限公司 Audio output method, electronic equipment and readable storage medium

Similar Documents

Publication Publication Date Title
EP3253078B1 (en) Wearable electronic device and virtual reality system
EP3424229B1 (en) Systems and methods for spatial audio adjustment
US9930456B2 (en) Method and apparatus for localization of streaming sources in hearing assistance system
US10257637B2 (en) Shoulder-mounted robotic speakers
US9307331B2 (en) Hearing device with selectable perceived spatial positioning of sound sources
US20090052703A1 (en) System and Method Tracking the Position of a Listener and Transmitting Binaural Audio Data to the Listener
US20150326963A1 (en) Real-time Control Of An Acoustic Environment
US20150189455A1 (en) Transformation of multiple sound fields to generate a transformed reproduced sound field including modified reproductions of the multiple sound fields
JP6193844B2 (en) Hearing device with selectable perceptual spatial sound source positioning
WO2012005894A1 (en) Facilitating communications using a portable communication device and directed sound output
CN104185130A (en) Hearing aid with spatial signal enhancement
CN112544089B (en) Microphone device providing audio with spatial background
US11758347B1 (en) Dynamic speech directivity reproduction
WO2019045622A1 (en) Headset and method of operating headset
CN104853283A (en) Audio signal processing method and apparatus
JP2008160397A (en) Voice communication device and voice communication system
US11805364B2 (en) Hearing device providing virtual sound
EP2887695B1 (en) A hearing device with selectable perceived spatial positioning of sound sources
KR20150142925A (en) Stereo audio input apparatus
WO2008119122A1 (en) An acoustically transparent earphone
JP2022543121A (en) Bilateral hearing aid system and method for enhancing speech of one or more desired speakers

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18852298

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18852298

Country of ref document: EP

Kind code of ref document: A1