EP3588986A1 - Apparatus and associated methods for presentation of audio - Google Patents

Apparatus and associated methods for presentation of audio

Info

Publication number
EP3588986A1
EP3588986A1 (application EP18180734.8A)
Authority
EP
European Patent Office
Prior art keywords
audio
user
loudspeakers
selection
audio content
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP18180734.8A
Other languages
German (de)
English (en)
Inventor
Antti Eronen
Arto Lehtiniemi
Sujeet Shyamsundar Mate
Jussi LEPPÄNEN
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nokia Technologies Oy
Original Assignee
Nokia Technologies Oy
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nokia Technologies Oy filed Critical Nokia Technologies Oy
Priority to EP18180734.8A (patent EP3588986A1)
Priority to PCT/EP2019/066783 (patent WO2020002302A1)
Publication of EP3588986A1
Legal status: Withdrawn

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04R: LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R27/00: Public address systems
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04S: STEREOPHONIC SYSTEMS
    • H04S7/00: Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30: Control circuits for electronic adaptation of the sound field
    • H04S7/302: Electronic adaptation of stereophonic sound system to listener position or orientation
    • H04S7/303: Tracking of listener position or orientation
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04R: LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R2227/00: Details of public address [PA] systems covered by H04R27/00 but not provided for in any of its subgroups
    • H04R2227/003: Digital PA systems using, e.g. LAN or internet
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04S: STEREOPHONIC SYSTEMS
    • H04S2400/00: Details of stereophonic systems covered by H04S but not provided for in its groups
    • H04S2400/11: Positioning of individual sound objects, e.g. moving airplane, within a sound field
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04S: STEREOPHONIC SYSTEMS
    • H04S2400/00: Details of stereophonic systems covered by H04S but not provided for in its groups
    • H04S2400/15: Aspects of sound capture and related signal processing for recording or reproduction
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04S: STEREOPHONIC SYSTEMS
    • H04S2420/00: Techniques used in stereophonic systems covered by H04S but not provided for in its groups
    • H04S2420/01: Enhancing the perception of the sound image or of the spatial distribution using head related transfer functions [HRTF's] or equivalents thereof, e.g. interaural time difference [ITD] or interaural level difference [ILD]
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04S: STEREOPHONIC SYSTEMS
    • H04S2420/00: Techniques used in stereophonic systems covered by H04S but not provided for in its groups
    • H04S2420/11: Application of ambisonics in stereophonic audio systems
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04S: STEREOPHONIC SYSTEMS
    • H04S7/00: Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30: Control circuits for electronic adaptation of the sound field
    • H04S7/305: Electronic adaptation of stereophonic audio signals to reverberation of the listening space

Definitions

  • Telecommunication or telephony systems are being developed that provide for more than monophonic capture and presentation of audio.
  • the presentation of such audio may require careful consideration to ensure the telecommunication is clear and effective.
  • an apparatus comprising means configured to:
  • the primary audio comprises voice audio comprising audio determined to be generated by a voice of one or more remote users, such as for telecommunication with the first/second user
  • the secondary audio comprises ambient audio comprising audio other than that determined to be generated by the voice of the one or more remote users.
  • the primary audio comprises spatial audio that includes directional information such that, when presented, it is to be perceived as originating from a direction or range of directions in accordance with the directional information and the secondary audio comprises at least one of audio without said directional information and spatial audio with said directional information that defines a range of directions from which the audio should be perceived greater than a threshold range of directions.
  • the first and/or second audio content comprises telecommunication audio content comprising audio content provided for the purpose of telecommunication, which may be via a traditional telecommunication network or provided by a voice over IP or any other packet-based or circuit switched telephony service.
  • the apparatus includes means configured to provide for presentation of the primary audio of the first audio content and the primary audio of the second audio content from at least the loudspeaker of the at least two loudspeakers that satisfies a criterion, the criterion including at least in part the following qualifiers: the loudspeaker that is the closest loudspeaker to the respective user that is closer to the respective user than it is to the other respective user; and the primary audio of the first audio content and the primary audio of the second audio content are presented from different loudspeakers.
  • the second selection is determined based, at least in part, on which of the at least two loudspeakers are closer to the first user than the second user and the third selection is determined based, at least in part, on which of the at least two loudspeakers are closer to the second user than the first user, wherein the second and third selections are non-overlapping.
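The partitioning rule described above, in which each loudspeaker joins the selection of whichever user it is closer to, can be sketched as follows. This is an illustrative sketch only: the coordinate layout, function name and tie-breaking rule are assumptions, not taken from the claims.

```python
import math

def partition_speakers(speakers, user1, user2):
    """Assign each loudspeaker to the selection of the nearer user.

    `speakers` maps a speaker id to its (x, y) position in the
    presentation space; `user1`/`user2` are (x, y) locations taken
    from the position information. Returns two non-overlapping
    selections (the "second" and "third" selections of the text).
    """
    second_selection, third_selection = [], []
    for sid, pos in speakers.items():
        d1 = math.dist(pos, user1)
        d2 = math.dist(pos, user2)
        # Ties are broken in favour of the first user here; the text
        # suggests a real system might instead favour the user who
        # first initiated the telecommunication.
        (second_selection if d1 <= d2 else third_selection).append(sid)
    return second_selection, third_selection
```

With four corner loudspeakers and the two users standing left and right of centre, the left pair goes to the first user and the right pair to the second, giving non-overlapping selections as required.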
  • the first position information is also indicative of the orientation of the first user in the presentation space and the second position information is also indicative of the orientation of the second user in the presentation space;
  • the apparatus includes means configured to change the loudspeakers that form the second selection and the loudspeakers that form the third selection based on the respective first position information and the respective second position information being indicative of movement of the location of one or both of the first user and the second user in the presentation space.
  • said change in the loudspeakers that form the second selection and the loudspeakers that form the third selection is further based on the elapse of a predetermined amount of time since any previous change in the loudspeakers that form the second selection and the loudspeakers that form the third selection.
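The elapsed-time condition above acts like a debounce: a proposed reassignment of the selections is only committed once a hold-off period has passed since the previous change. A minimal sketch, in which the class name, hold-off value and injected clock are illustrative assumptions:

```python
import time

class SelectionUpdater:
    """Commit a new loudspeaker selection only after a minimum time
    has elapsed since the previous change (a sketch of the
    time-based condition, not the claimed method)."""

    def __init__(self, hold_off_s=5.0, clock=time.monotonic):
        self.hold_off_s = hold_off_s
        self.clock = clock          # injectable for testing
        self.last_change = None
        self.current = None

    def propose(self, new_selection):
        """Return the selection actually in effect after proposing
        `new_selection` at the current time."""
        now = self.clock()
        if self.current is None:
            # First assignment is always accepted.
            self.current, self.last_change = new_selection, now
        elif (new_selection != self.current
                and now - self.last_change >= self.hold_off_s):
            self.current, self.last_change = new_selection, now
        return self.current
```

This avoids the annoyance, noted later in the text, of borderline speaker assignments flipping back and forth as users make small movements.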
  • the means are configured to cease presentation of the secondary audio of one or both of the first audio content and the second audio content and maintain the presentation of the primary audio of both the first audio content and the second audio content, in response to the first position information and the second position information being indicative of the first user and the second user having moved to within a predetermined distance of one another.
  • said means are configured to, in response to the first position information and the second position information being indicative of the first user and the second user having moved to being within a predetermined distance of one another,
  • the first audio content and the second audio content comprise audio from respective first and second remote persons in telecommunication with the first user and second user respectively, and said means are configured to provide for presentation of the secondary audio of the first audio content using a fourth selection of the loudspeakers and presentation of the secondary audio of the second audio content using a fifth selection of the loudspeakers, the fourth and fifth selections comprising overlapping selections of the loudspeakers, in response to user input received from one or both of the first user and the second user.
  • the fourth and fifth selections are formed of the same selection of loudspeakers. In one or more examples, the fourth and fifth selections comprise all of the at least two loudspeakers in the presentation space.
  • the means are configured to provide for presentation of one of the first audio content and second audio content with a sixth selection of the loudspeakers, the sixth selection comprising a greater number of loudspeakers than the second selection and a greater number of loudspeakers than the third selection, in response to an end of the other of the first audio content and second audio content being reached such that there is no further audio content to present.
  • the primary audio is provided for presentation using a selection of the at least two loudspeakers based on primary-audio-criteria, wherein the primary-audio-criteria is different to criterion used to determine one or more of the first selection, the second selection and the third selection of loudspeakers of the at least two loudspeakers.
  • one or both of the first audio content and the second audio content comprises audio-type information which designates which audio of the respective first and second audio content comprises the primary audio and which audio of the respective first and second audio content comprises the secondary audio.
  • the apparatus is configured to provide for determination of which audio of one or both of the first audio content and second audio content is the primary audio, and which is the secondary audio based on audio analysis.
  • a computer readable medium comprising computer program code stored thereon, the computer readable medium and computer program code being configured to, when run on at least one processor, perform the method of:
  • an apparatus comprising:
  • the present disclosure includes one or more corresponding aspects, examples or features in isolation or in various combinations whether or not specifically stated (including claimed) in that combination or in isolation.
  • Corresponding means and corresponding functional units (e.g., function enabler, speaker selector, amplifier, display device) for performing one or more of the discussed functions are also within the present disclosure.
  • Telecommunication or telephony systems are being developed that provide for more than monophonic capture and monophonic presentation of audio.
  • Immersive telephony systems are being developed, such as by the 3rd Generation Partnership Project (3GPP), that will enable a new level of immersion in telephony services.
  • 3GPP 3rd Generation Partnership Project
  • Such systems may provide for transmission of and presentation of immersive, spatial audio content. This may enable receiving and sending of an enveloping sound scene from/to the telecommunication call participants.
  • the first user can experience the audio environment around the remote user as if he/she was physically located at the location of the remote user and vice versa.
  • the audio provided as part of said telecommunication may be categorised as primary audio and secondary audio. It will be appreciated that while the examples herein relate to the provision of audio in the field of telecommunication, the principles may be applied to other fields of audio presentation.
  • the primary audio may comprise voice audio comprising audio determined to be generated by a voice of one or more remote users in telecommunication with a first user.
  • the "voice" primary audio may be categorised at the point of capture or at the point of play back using audio analysis techniques, or by a server or any other entity involved in said telecommunication.
  • the secondary audio may, in one or more examples, comprise ambient audio comprising audio other than that determined to be generated by the voice of one or more remote persons.
  • the primary and secondary audio may be presented differently.
  • the primary audio may comprise spatial audio content that includes directional information such that, when presented, it is perceived as originating from one or more directions in accordance with the directional information.
  • the secondary audio may comprise ambient audio comprising audio without said directional information or without a direction of arrival distinguishable above a threshold level. Again, the primary and secondary audio may be presented differently.
  • the direction from which audio was received at the location of the remote user may be reproduced when presenting the audio to the first user (or any other user) by use of spatial audio presentation.
  • Spatial audio comprises audio presented in such a way to a user that it is perceived to originate from a particular location, as if the source of the audio was located at that particular location.
  • Spatial audio content comprises audio for presentation as spatial audio and, as such, typically comprises audio having directional information (either explicitly specified as, for example, metadata or inherently present in the way the audio is captured), such that the spatial audio content can be presented such that its component audio is perceived to originate from one or more points or one or more directions in accordance with the directional information.
  • the spatial positioning of the spatial audio may be provided by 3D audio effects, such as those that utilise a head related transfer function to create a spatial audio space (aligned with a real-world space in the case of augmented reality) in which audio can be positioned for presentation to a user.
  • Spatial audio may be presented by headphones by using head-related-transfer-function (HRTF) filtering techniques or, for loudspeakers, by using vector-base-amplitude panning techniques to position the perceived aural origin of the audio content.
  • HRTF head-related-transfer-function
  • Ambisonics audio presentation may be used to present spatial audio.
  • an apparatus 100 may be provided that is configured to provide for presentation of secondary audio to one or more users.
  • FIG. 1 shows an example overview of a system 110 in which such an apparatus 100 may be used.
  • the system may include means to receive audio content, such as a first audio input 101 and second audio input 102. It will be appreciated that the system 110 may include further audio inputs.
  • the audio inputs 101, 102 may receive audio from one or more telecommunication devices (not shown).
  • the telecommunication devices may comprise mobile telephones or any other telecommunication equipment.
  • the audio inputs 101, 102 receive audio from other sources, in the field of telecommunication or not.
  • the apparatus 100 may be configured to provide for presentation of the audio content from the first audio input 101 and the second audio input 102 to a plurality of loudspeakers 103, 104, 105, 106.
  • the speakers 103-106 may be arranged in a presentation space 107, such as a room or part of a room.
  • the apparatus 100 may comprise or be connected to a processor 108 and a memory 109 and may be configured to execute computer program code.
  • the apparatus 100 may have only one processor 108 and one memory 109 but it will be appreciated that other embodiments may utilise more than one processor and/or more than one memory (e.g. same or different processor/memory types). Further, the apparatus 100 may be an Application Specific Integrated Circuit (ASIC).
  • ASIC Application Specific Integrated Circuit
  • the processor may be a general purpose processor dedicated to executing/processing information received from other components, such as received from the audio inputs 101, 102, or a user position information sensor 111 in accordance with instructions stored in the form of computer program code in the memory.
  • the output signalling generated by such operations of the processor is provided onwards to further components, such as to the speakers 103-106 or an amplifier or other audio presentation equipment (not shown) to provide the audio to the speakers 103-106.
  • the user position information sensor 111 may comprise one or more cameras for tracking the position of the one or more users in the presentation space 107. In one or more other examples, the sensor 111 may be configured to track the position of tags, such as RFID tags, worn by the one or more users. In one or more examples, the sensor 111 may be embodied as a plurality of sensors arranged at least within the presentation space 107. It will be appreciated that there are many ways the position and, optionally, the orientation, of one or more users may be tracked. In one or more embodiments, the sensor 111 may be associated with circuitry for performing the tracking of users and providing position information to the apparatus 100. In one or more examples, the apparatus 100 may use data from the sensor to track the one or more users and generate the position information.
  • tags such as RFID tags
  • the memory 109 (not necessarily a single memory unit) is a computer readable medium (solid state memory in this example, but may be other types of memory such as a hard drive, ROM, RAM, Flash or the like) that stores computer program code.
  • This computer program code stores instructions that are executable by the processor, when the program code is run on the processor.
  • the internal connections between the memory and the processor can be understood, in one or more example embodiments, to provide an active coupling between the processor and the memory to allow the processor to access the computer program code stored on the memory.
  • the respective processors and memories are electrically connected to one another internally to allow for electrical communication between the respective components.
  • the components are all located proximate to one another so as to be formed together as an ASIC, in other words, so as to be integrated together as a single chip/circuit that can be installed into an electronic device.
  • one or more or all of the components may be located separately from one another.
  • Example figures 2 to 8 below show various positions of a first user 201 and a second user 202 inside the presentation space 107 (and outside the presentation space in the case of example figure 2 ).
  • the apparatus 100 may be configured to receive first audio content, such as at first audio input 101, intended for presentation to the first user 201 and, at the same or later time, second audio content, such as at second audio input 102, for presentation to the second user 202 (shown in figures 3 to 8 ).
  • the first audio content may comprise audio content from a remote user with which the first user 201 is in telecommunication.
  • the second audio content may comprise audio content from a different remote user with which the second user 202 is in telecommunication.
  • the first and second audio content may comprise audio content outside the field of telecommunication, such as an audiobook.
  • the apparatus 100 may not receive the audio content itself but receives information indicative that it is available for presentation by audio presentation equipment. The apparatus 100 may then provide for control of the audio presentation equipment.
  • the first audio content and the second audio content may be categorised into primary audio content and secondary audio content.
  • the example apparatus 100 is configured to receive first position information indicative of at least the location of the first user 201 with respect to the presentation space 107 and, optionally, the orientation of the first user 201.
  • the first position information may be received from the sensor 111 or from a different device.
  • the example apparatus 100 is configured to receive second position information indicative of at least the location of the second user 202 with respect to the presentation space 107 and, optionally, the orientation of the second user 202.
  • the second position information may be received from the sensor 111 or from a different device.
  • the apparatus 100 may be provided with information regarding the location of the first through fourth loudspeakers 103-106 in the presentation space 107. This known location information may be pre-set or may be determined by the apparatus 100 prior to or at the time of determining selection(s) of loudspeakers to which to present the audio content. It will be appreciated that the locations of the loudspeakers are known to the apparatus at the time the second and third speaker selections are determined, but it may be inconsequential how far in advance the speaker locations are determined or provided to the apparatus.
  • the criteria for selecting which loudspeakers comprise the first selection is based, at least in part, on the first position information being indicative of the first user 201 being in the presentation space 107 and the second position information being indicative of the second user (not present in figure 2 ) being outside the presentation space 107.
  • the second position information may be considered to be indicative of the second user being outside the presentation space 107 by one or more of specifying a location for the second user that is outside the presentation space, not specifying a location at all, or the absence of second position information at the time of presentation of the first audio content.
  • the criteria may comprise using all available speakers when only one user in the presentation space 107 requires secondary audio to be presented to them. It will be appreciated that in one or more other examples, the criteria may be different. For example, where the presentation space 107 is large and many speakers are provided therein, the criteria for the first selection may comprise presenting the secondary audio from all speakers within a predetermined distance of the first user 201. Other criteria may be used to determine the speakers of the first selection.
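The first-selection criterion, all speakers by default, or only those within a predetermined distance of the lone user in a large space, can be sketched as follows. The function name and parameters are illustrative assumptions.

```python
import math

def first_selection(speakers, user, max_dist=None):
    """Select loudspeakers for the single-user case.

    When only one user is in the presentation space, every
    loudspeaker is selected; for a large space an assumed
    predetermined distance `max_dist` limits the selection to
    loudspeakers near the user.
    """
    if max_dist is None:
        return list(speakers)
    return [sid for sid, pos in speakers.items()
            if math.dist(pos, user) <= max_dist]
```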
  • Example figure 3 shows the presentation space 107 including both the first user 201 and the second user 202.
  • the second user 202 may start a telecommunication call and therefore require the presentation of the associated second audio content by the speakers 103-106 or may be conducting a telecommunication call and enter the presentation space 107.
  • the apparatus 100 is configured to select which of the loudspeakers 103-106 to use to render the ambient secondary audio of the first audio content to the first user 201 and which of the speakers 103-106 to use to render the ambient secondary audio of the second audio content to the second user 202.
  • the rendering of the secondary audio for both the first and second users 201, 202 from all of the speakers may be confusing and distracting.
  • the selection of speakers should preferably provide an impression of the ambient scene created by the secondary audio without being distracting for the other of the first and second users 201, 202.
  • the apparatus 100 is configured to provide for presentation of the secondary audio of the first audio content for the first user 201 using a second selection of the speakers 103-106 rather than the first selection shown in figure 2.
  • the apparatus 100 is also configured to provide for presentation of the secondary audio of the second audio content for the second user 202 using a third selection of the speakers 103-106.
  • the second selection comprises fewer loudspeakers (e.g. two) than the first selection (e.g. four).
  • because the presentation space 107 is shared by a plurality of users who require the presentation of audio content, the ambient secondary audio of the first audio content can no longer be presented from all speakers, and a compromise may be required.
  • the second and third selections may be based on the location of the first and second users 201, 202 in the presentation space 107 determined from the first position information and the second position information.
  • the criterion for selecting the loudspeakers for the second and third selections may comprise determining which non-overlapping subset of speakers are closer to the first user 201 and which speakers are closer to the second user 202.
  • the number of speakers assigned to each of the second and third selections may be equal, or close to equal in the event that an odd number of speakers is used.
  • the second selection may be determined prior to the third selection or vice versa.
  • the third selection may comprise the remaining speakers after determining the speakers that make up the second selection or vice versa.
  • the user who first initiated the telecommunication may have their speaker selection determined first.
  • the criteria for determining the second and third selections may comprise determining, for each speaker, which user 201, 202 is closer thereto and assigning the speaker to the second selection if the first user 201 is closer and assigning the speaker to the third selection if the second user 202 is closer.
  • An example strategy for determination of speaker selections is provided below with reference to figures 9 to 13 .
  • various qualifiers or conditions may be used to determine which speakers are assigned to which speaker selection based on the location of the users.
  • a first qualifier may be to identify the closest speaker or speakers to the respective users. If this does not yield an independent or partially independent selection of speakers then further qualifiers may be applied.
  • the speaker(s) that is closer to one of the users than to the other is assigned to the speaker selection for that user, and the other user is assigned the closest or next-closest speaker(s) that is not part of the other selection.
  • the apparatus 100 may continue to assess each of the other speakers to determine which selection it should be part of.
  • the apparatus 100 may include means configured to provide for the presentation of the primary audio of the first audio content and the primary audio of the second audio content when the first and second position information is indicative of both the first and second users being within the presentation space.
  • the criteria for selecting the one or more speakers 103-106 from which to present the primary audio of the first audio content are different to the criteria used for selecting the one or more speakers 103-106 from which to present the secondary audio of the first audio content.
  • the criteria for selecting the one or more speakers 103-106 from which to present the primary audio of the second audio content are different to the criteria used for selecting the one or more speakers 103-106 from which to present the secondary audio of the second audio content.
  • the same primary-audio-criteria as described for Figure 2 may be used.
  • the apparatus 100 may be configured to provide for presentation of the primary audio from the speaker or speakers that are closest to the respective user or the speaker or speakers that most directly face the respective user.
  • the primary audio may comprise spatial audio and therefore the direction from which the users 201, 202 perceive the audio may be based on the directional information of the primary audio of the first audio content and second audio content respectively.
  • Example figure 4 shows the first user 201 having moved within the presentation space 107.
  • Example figure 4 also shows the second user 202 having moved within the presentation space 107.
  • the apparatus 100 may include means configured to, based on movement of the location of one or both of the first user 201 and the second user 202 in the presentation space 107 as indicated in the respective first position information and second position information, provide for a change in the speakers 103-106 that comprise the second selection and the speakers 103-106 that comprise the third selection.
  • one or both of the second selection and the third selection comprise a dynamic selection of speakers based at least in part on the current position of the first user and second user respectively.
  • the same criteria may be used to assign speakers to the second and third selections when updating the selections, as mentioned above.
  • different criteria may be used to update the second and/or third selections after an initial assignment of speakers to the second and/or third selections. For example, it may be confusing if the criteria used to assign speakers to the second and third selections resulted in the borderline assignment of a particular speaker to either the second or third selection due to the relative positions of the first and second user 201, 202. In such a circumstance, small movements of the one or both users may result in updating of the second and third selections which may be confusing or annoying.
  • Example figure 5 shows the first user 201 and the second user 202 having moved to positions in the presentation space 107 that are within a predetermined distance 500 of one another. If the first and second users 201, 202 move close to each other, they may have a higher likelihood of hearing each other's audio content. To make the first and second audio content easier to listen to for the respective users, the apparatus 100 may be configured to stop playing the secondary audio or "ambient audio" of the first and/or second audio content entirely. In one or more examples, the apparatus 100 may be configured to provide for presentation of the primary audio of the first and/or second audio content from just one or more (e.g. not all) of the nearest speakers to the corresponding users 201, 202.
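The proximity behaviour above reduces to a simple decision rule: keep both layers while the users are apart, and drop the ambient secondary audio once they come within the predetermined distance. A sketch, in which the function name, return structure and threshold parameter are illustrative assumptions:

```python
import math

def presentation_plan(user1, user2, threshold):
    """Decide which audio layers to present as the users move.

    `user1`/`user2` are (x, y) locations from the position
    information; `threshold` stands in for the predetermined
    distance 500 of the figure.
    """
    if math.dist(user1, user2) <= threshold:
        # Users are close: cease the ambient secondary audio and
        # keep only the voice primary audio of both contents.
        return {"primary": True, "secondary": False}
    return {"primary": True, "secondary": True}
```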
  • the selection of which speakers 103-106 are used may change based on the first user 201 and the second user 202 having moved to being within a predetermined distance 500 of one another.
  • the same primary-audio-criteria as described for Figure 2 may be used or a different criterion.
  • Example figure 6 shows an example of an alternative action by apparatus 100 in response to the first user 201 and the second user 202 having moved to positions in the presentation space 107 that are within a predetermined distance 500 of one another.
  • the apparatus 100 is configured to, in response to the first position information and the second position information being indicative of the first user 201 and the second user 202 having moved to being within a predetermined distance 500 of one another, change from presentation of the secondary audio using the second and third selections.
  • the apparatus 100 may be configured to provide for presentation of the secondary audio of the first audio content from the speaker that is closest to the first user 201 and closer to the first user 201 than the second user 202, which in this example comprises the third speaker 105, as shown by selection circle 601. Further, the apparatus 100 may be configured to provide for presentation of the secondary audio of the second audio content from the speaker that is closest to the second user 202 and closer to the second user 202 than the first user 201, which in this example comprises the fourth speaker 106, as shown by selection circle 602.
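  • The "closest to one user and closer to that user than to the other" rule above can be sketched as follows. This is an illustrative sketch with assumed function and parameter names, not a definitive implementation:

```python
import math

def closest_exclusive_speaker(speakers, user, other_user):
    """Return the speaker position that is closest to `user` and also
    closer to `user` than to `other_user`, or None if no speaker
    qualifies. All positions are (x, y) tuples."""
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    # Keep only speakers closer to this user than to the other user.
    candidates = [s for s in speakers if dist(s, user) < dist(s, other_user)]
    if not candidates:
        return None
    return min(candidates, key=lambda s: dist(s, user))
```

Applied to the example of the text, the speaker nearest each user that is also on that user's "side" of the room would be returned for each user in turn.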
  • the ambient, secondary audio may be important for the telecommunication call because, for example, it carries the sound of children and the call is with the grandparents.
  • the apparatus 100 may provide the secondary audio of the first audio content for presentation additionally based on first-user-input indicative of them wanting to hear the secondary audio.
  • the apparatus 100 may provide the secondary audio of the first audio content for presentation additionally based on remote-user-input indicative of them wanting the first user 201 to hear the secondary audio.
  • the second user may provide corresponding second-user-input to hear the secondary audio of the second audio content and/or the remote user with which the second user is telecommunicating may provide said remote-user-input indicative of them wanting the second user 202 to hear the secondary audio.
  • the apparatus may thus be configured to act on said second-user-input and/or remote-user-input by presenting the secondary audio from at least one of the loudspeakers.
  • Example figure 7 shows how the apparatus 100 may provide for presentation of the secondary audio based on a user input received from one or both of the first user 201 and the second user 202 indicative of a desire to join the telecommunication to provide for telecommunication between any two of the first and second user 201, 202 and the first and second remote persons (not shown).
  • the apparatus 100 is configured to provide for presentation of the secondary audio of the first audio content using a fourth selection 701 of the speakers and presentation of the secondary audio of the second audio content using a fifth selection 702 of the speakers 103-106.
  • the fourth and fifth selections may comprise overlapping selections of the speakers 103-106.
  • the fourth 701 and fifth 702 selections are the same selection of speakers 103-106.
  • the fourth and fifth selections comprise all of the speakers 103-106 in the presentation space 107.
  • the primary audio of the first audio content and the second audio content may also be presented from the speakers 103-106 that are selected independently using different criteria than the criteria used to determine the fourth and fifth selections for the secondary audio.
  • Example figure 8 shows how the apparatus 100 may provide for presentation of the secondary audio based on a user input received from one or both of the first user 201 and the second user 202 indicative of a desire to join the telecommunication to provide for telecommunication between any two of the first and second user 201, 202 and more than two remote persons (not shown).
  • the first user may be in communication with a first remote user and the second user may be in communication with two remote users at the time the telecommunication is combined.
  • the third remote user may join the telecommunication after the first and second users have joined their calls.
  • there may be a third audio content, or further audio content, due to the number of remote users.
  • the presentation of the secondary audio from many, e.g. more than two, remote users may be confusing.
  • the secondary audio from the first audio content and the secondary audio from the second audio content are grouped and presented from the first and third speakers 103, 105 as shown by selection 801.
  • the secondary audio from the third audio content forms the other group and is presented from the second and fourth speakers 104, 106 as shown by selection 802.
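  • The grouping of secondary audio streams across speaker subsets can be sketched as a contiguous partition, which reproduces the two-group example above (streams one and two in the first group, stream three in the second). The function name and chunking strategy are assumptions for illustration:

```python
import math

def group_secondary_streams(streams, n_groups):
    """Partition secondary audio streams into up to n_groups contiguous
    groups; each group is then presented from its own subset of the
    loudspeakers so that no more than n_groups ambiences play at once."""
    size = math.ceil(len(streams) / n_groups)
    return [streams[i:i + size] for i in range(0, len(streams), size)]
```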
  • the number of speakers 103-106 may be greater or fewer than the four shown in the examples, but may comprise at least two or at least three speakers.
  • one or both of the first audio content and the second audio content comprises audio-type information which designates which audio of the respective first and second audio content comprises primary audio and which audio comprises secondary audio.
  • the apparatus 100 is configured to provide for determination of which audio of one or both of the first audio content and second audio content is the primary audio and which is the secondary audio based on audio analysis.
  • the audio of the first audio content may be labelled with a category comprising either secondary "ambient" audio or primary audio.
  • the labelling of the audio may be performed at a device of the remote person at the time of capture.
  • the first audio content may be separated into a primary audio stream and a secondary audio stream.
  • a server may perform the labelling of the audio and/or creation of said different component audio streams.
  • the apparatus 100 may be configured to determine which audio is primary user audio and which is secondary audio.
  • the determination of primary audio from secondary audio may be performed using any appropriate audio analysis technique and may be based on the frequency range of the audio at a particular time, the degree of reverberation, the volume or any other audiological factor.
  • the determination of primary audio from secondary audio may be performed based on predetermined speech samples of the remote person.
  • the first/second audio content may be recorded by a plurality of microphones and may be determined to be primary audio or secondary audio based on differences between how the audio was captured by the plurality of microphones. For example, audio deemed to be received at a greater volume or with a particular degree of reverberation by a front-facing microphone relative to a rear-facing microphone may be deemed to be primary audio (possibly once subject to audio processing).
  • the remaining audio, or audio received at a greater volume or with a particular degree of reverberation by the rear-facing microphone relative to the front-facing microphone, may be determined to be secondary audio (possibly once subject to audio processing). It will be appreciated that various audio processing techniques may be used to separate the first user audio and the ambient audio.
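  • A coarse sketch of the volume-based front/rear classification described above is given below. It compares per-frame RMS levels between the two microphones; the frame length and 6 dB margin are assumed tuning values, and a practical implementation would use more sophisticated analysis (e.g. reverberation estimation) as the text notes:

```python
import math

def classify_frames(front_mic, rear_mic, frame_len=1024, margin_db=6.0):
    """Label each frame of a two-microphone capture 'primary' when the
    front-facing microphone is at least `margin_db` louder (RMS) than
    the rear-facing one, and 'secondary' otherwise."""
    def rms(samples):
        # Small offset avoids log/divide-by-zero on silent frames.
        return math.sqrt(sum(s * s for s in samples) / len(samples)) + 1e-12

    labels = []
    n = min(len(front_mic), len(rear_mic))
    for i in range(0, n - frame_len + 1, frame_len):
        ratio_db = 20 * math.log10(rms(front_mic[i:i + frame_len]) /
                                   rms(rear_mic[i:i + frame_len]))
        labels.append("primary" if ratio_db >= margin_db else "secondary")
    return labels
```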
  • the first audio content and the second audio content may comprise telecommunication audio content or any other audio content that may be categorized.
  • an audiobook may provide the reader of the story as primary audio and supplementing sounds as secondary audio.
  • Figure 9 shows a first stage of an example process for determining loudspeaker selections, such as the second and third selections.
  • Figure 9 shows an arrangement of six speakers 901-906 (first through sixth) in a presentation space 907 in which the first user 201 and the second user 202 are present.
  • it is a predetermined requirement of the speaker selection process for each speaker selection to include two speakers. It will be appreciated that other requirements are possible.
  • a first stage of an example speaker selection determination process comprises forming all possible speaker pairs and connecting the pairs with straight lines, shown by arrows 908.
  • An example second stage comprises removal from consideration of loudspeaker pairs which have associated connecting lines 908 that cross one another.
  • the following speaker pairs are discounted: the first-third speaker pair, first-fourth speaker pair, fifth-second speaker pair, fifth-third speaker pair and fourth-sixth speaker pair.
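  • The first two stages (forming all speaker pairs and discarding pairs whose connecting lines cross) can be sketched with a standard segment-intersection test. Function names are assumptions; pairs that merely share a speaker are not treated as crossing:

```python
from itertools import combinations

def _ccw(a, b, c):
    """Signed area test: positive if a->b->c turns counter-clockwise."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def _properly_cross(p1, p2, p3, p4):
    """True if segments p1-p2 and p3-p4 cross at an interior point."""
    d1, d2 = _ccw(p3, p4, p1), _ccw(p3, p4, p2)
    d3, d4 = _ccw(p1, p2, p3), _ccw(p1, p2, p4)
    return d1 * d2 < 0 and d3 * d4 < 0

def non_crossing_pairs(speakers):
    """Stages one and two of the selection process: form every speaker
    pair, then discard any pair whose connecting line crosses the
    connecting line of another pair."""
    pairs = list(combinations(range(len(speakers)), 2))
    return [
        (a, b) for a, b in pairs
        if not any(
            _properly_cross(speakers[a], speakers[b], speakers[c], speakers[d])
            for c, d in pairs if len({a, b, c, d}) == 4)
    ]
```

For four speakers at the corners of a square, the two crossing diagonals are removed and the four edges remain.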
  • Example figure 10 shows only the non-intersecting connecting lines and thus the remaining pairs of speakers 901-906.
  • Example figure 10 also illustrates the determination of a mid-point between the remaining speaker pairs along their associated connecting lines, marked by a cross in Figure 10 along each connecting line.
  • Example figure 11 illustrates that, for each user 201, 202, the apparatus is configured to provide for the determination of the distance between the respective user 201, 202 and each of the determined mid-points.
  • the second selection for the first user 201 is then determined based on which mid-point the first user 201 is closest to.
  • the closest mid-point is associated with a speaker pair and the speakers of the speaker pair are determined to form the second selection.
  • the second selection for the first user 201 is determined to comprise the first and sixth speakers as shown by dashed oval 921.
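  • The mid-point stage above can be sketched as follows, selecting the speaker pair whose connecting-line mid-point is nearest to a given user. The function name and coordinate conventions are assumed for illustration:

```python
import math

def select_pair_for_user(speakers, pairs, user_pos):
    """Return the speaker pair whose connecting-line mid-point is
    closest to the user's position."""
    def midpoint(pair):
        a, b = speakers[pair[0]], speakers[pair[1]]
        return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)

    return min(pairs, key=lambda p: math.dist(midpoint(p), user_pos))
```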
  • the third selection may also be determined based on which mid-point the second user 202 is closest to, which in this example is the mid-point of the first-second speaker pair on connecting line 910.
  • Dashed oval 922 represents the (possible) third selection.
  • a further qualifier for determination of the speaker selection comprises that the second and third speaker selections must be mutually exclusive in terms of the speakers that form the selections. That is, there must not be any overlap between the speaker selections.
  • Dashed ovals 921 and 922 are shown to overlap and therefore the apparatus may be configured to identify a different selection of speakers for the third selection.
  • Example figure 12 shows the apparatus 100 providing for determination of which is the second closest mid-point to the second user 202 for determination of the third selection.
  • the mid-point of the connecting line 911 is the second closest to the second user 202, which thereby identifies the second and third speakers 902, 903 as the speakers that form the third selection, shown by dashed oval 923. Accordingly, the process has identified the first and sixth speakers 901, 906 as forming the second selection and the second and third speakers 902, 903 as forming the third selection.
  • the apparatus 100 may be configured to identify the speaker or speakers that are closest to each of the users, wherein if said selected speakers are not mutually exclusive then the speaker or speakers that are second closest to one of the users are determined and wherein the speakers determined based on the location of the first user form the second selection and the speakers determined based on the location of the second user form the third selection. It will be appreciated that the process may be followed for determining the speakers of selections disclosed herein other than the second and third selections. In one or more examples, only once the speaker selections have been determined is the apparatus 100 configured to begin to use those speakers for presentation of audio.
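  • The mutual-exclusivity requirement with the second-closest fallback can be sketched in one function. This is an illustrative sketch under the assumption that at least one disjoint pair exists for the second user; names are assumed:

```python
import math

def select_exclusive_pairs(speakers, pairs, user1_pos, user2_pos):
    """Assign a speaker pair to each user by closest mid-point; if the
    second user's closest pair shares a speaker with the first user's
    pair, fall back to the next-closest mid-point whose pair is
    disjoint, so the two selections are mutually exclusive."""
    def midpoint(p):
        a, b = speakers[p[0]], speakers[p[1]]
        return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)

    def ranked(user_pos):
        return sorted(pairs, key=lambda p: math.dist(midpoint(p), user_pos))

    first = ranked(user1_pos)[0]
    # Walk the second user's ranking until a non-overlapping pair is found.
    second = next(p for p in ranked(user2_pos) if not set(p) & set(first))
    return first, second
```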
  • the apparatus may be configured to provide for the following:
</gr-replace>
  • the apparatus is configured to, if the speakers are not all different, for at least one of the users, determine the second closest mid-point and provided that the speakers associated with the connecting line having the mid-point that is second closest to the at least one of the users are all different speakers, provide for determination of one of the second and third selection based on the speakers associated with the connecting line having the mid-point second closest to the respective user and provide for determination of the other of the second and third selection based on the speakers associated with the connecting line having the mid-point closest to the respective user.
  • the second and third speaker selections may be expanded to include one or more additional speakers, which may form part of the example selection process.
  • Example figure 13 shows the apparatus 100 configured to, for each of the second and third selections, provide for determination of the nearest mid-point to the mid-point of the speaker pair that currently forms part of the respective selection and thereby determine at least one further speaker from the speaker pairs associated with said nearest mid-point, and wherein, provided that said at least one additional speaker is not part of the other of the second and third selection, expand said second or third selection to include said at least one additional speaker.
  • the nearest mid-point to the mid-point of the first-sixth speaker pair 916 of the second selection 921 is the mid-point of the fifth-first speaker pair 914. Accordingly, the second selection 921 may be expanded to include the fifth speaker 905. In this example, the nearest mid-point to the mid-point of the second-third speaker pair 911 of the third selection is the mid-point of the third-fourth speaker pair 912. Accordingly, the third selection 923 may be expanded to include the fourth speaker 904.
  • the expanded selections 921, 923 do not overlap in terms of the speakers of which they are formed and therefore the apparatus 100 may be configured to provide for audio presentation from the determined second and third selections 921, 923 respectively.
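  • The expansion step described above (growing a selection by the pair whose mid-point is nearest to the selection's own mid-point, provided no overlap with the other selection results) can be sketched as follows; function and parameter names are assumptions for illustration:

```python
import math

def expand_selection(speakers, pairs, selection, other_selection):
    """Expand a user's speaker-pair selection with the speaker(s) of
    the pair whose mid-point is nearest to the selection's own
    mid-point, unless that would overlap the other user's selection."""
    def midpoint(p):
        a, b = speakers[p[0]], speakers[p[1]]
        return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)

    base = midpoint(selection)
    neighbours = [p for p in pairs if p != selection]
    nearest = min(neighbours, key=lambda p: math.dist(midpoint(p), base))
    extra = set(nearest) - set(selection)
    if extra & set(other_selection):
        return set(selection)  # expansion would overlap; keep selection as-is
    return set(selection) | extra
```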
  • Figure 14 shows a flow diagram illustrating the steps of a method according to an example.
  • Figure 15 illustrates schematically a computer/processor readable medium 1500 providing a program according to an example.
  • the computer/processor readable medium is a disc such as a digital versatile disc (DVD) or a compact disc (CD).
  • the computer readable medium may be any medium that has been programmed in such a way as to carry out an inventive function.
  • the computer program code may be distributed between the multiple memories of the same type, or multiple memories of a different type, such as ROM, RAM, flash, hard disk, solid state, etc.
  • User inputs may be gestures which comprise one or more of a tap, a swipe, a slide, a press, a hold, a rotate gesture, a static hover gesture proximal to the user interface of the device, a moving hover gesture proximal to the device, bending at least part of the device, squeezing at least part of the device, a multi-finger gesture, tilting the device, or flipping a control device.
  • the gestures may be any free space user gesture using the user's body, such as their arms, or a stylus or other element suitable for performing free space user gestures.
  • the apparatus shown in the above examples may be a portable electronic device, a laptop computer, a mobile phone, a Smartphone, a tablet computer, a personal digital assistant, a digital camera, a smartwatch, smart eyewear, a pen based computer, a non-portable electronic device, a desktop computer, a monitor, a smart TV, a server, a wearable apparatus, a virtual reality apparatus, or a module/circuitry for one or more of the same.
  • any mentioned apparatus and/or other features of particular mentioned apparatus may be provided by apparatus arranged such that they become configured to carry out the desired operations only when enabled, e.g. switched on, or the like. In such cases, they may not necessarily have the appropriate software loaded into the active memory in the non-enabled (e.g. switched off state) and only load the appropriate software in the enabled (e.g. on state).
  • the apparatus may comprise hardware circuitry and/or firmware.
  • the apparatus may comprise software loaded onto memory.
  • Such software/computer programs may be recorded on the same memory/processor/functional units and/or on one or more memories/processors/functional units.
  • a particular mentioned apparatus may be pre-programmed with the appropriate software to carry out desired operations, and wherein the appropriate software can be enabled for use by a user downloading a "key", for example, to unlock/enable the software and its associated functionality.
  • Advantages associated with such examples can include a reduced requirement to download data when further functionality is required for a device, and this can be useful in examples where a device is perceived to have sufficient capacity to store such pre-programmed software for functionality that may not be enabled by a user.
  • Any mentioned apparatus/circuitry/elements/processor may have other functions in addition to the mentioned functions, and that these functions may be performed by the same apparatus/circuitry/elements/processor.
  • One or more disclosed aspects may encompass the electronic distribution of associated computer programs and computer programs (which may be source/transport encoded) recorded on an appropriate carrier (e.g. memory, signal).
  • Any "computer" described herein can comprise a collection of one or more individual processors/processing elements that may or may not be located on the same circuit board, or the same region/position of a circuit board or even the same device. In some examples one or more of any mentioned processors may be distributed over a plurality of devices. The same or different processor/processing elements may perform one or more functions described herein.
  • signal may refer to one or more signals transmitted as a series of transmitted and/or received electrical/optical signals.
  • the series of signals may comprise one, two, three, four or even more individual signal components or distinct signals to make up said signalling. Some or all of these individual signals may be transmitted/received by wireless or wired communication simultaneously, in sequence, and/or such that they temporally overlap one another.
  • processors and memory may comprise a computer processor, Application Specific Integrated Circuit (ASIC), field-programmable gate array (FPGA), and/or other hardware components that have been programmed in such a way to carry out the inventive function.

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Stereophonic System (AREA)
  • Circuit For Audible Band Transducer (AREA)
EP18180734.8A 2018-06-29 2018-06-29 Appareil et procédés associés de présentation d'audio Withdrawn EP3588986A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP18180734.8A EP3588986A1 (fr) 2018-06-29 2018-06-29 Appareil et procédés associés de présentation d'audio
PCT/EP2019/066783 WO2020002302A1 (fr) 2018-06-29 2019-06-25 Appareil et procédés associés de présentation d'audio

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
EP18180734.8A EP3588986A1 (fr) 2018-06-29 2018-06-29 Appareil et procédés associés de présentation d'audio

Publications (1)

Publication Number Publication Date
EP3588986A1 true EP3588986A1 (fr) 2020-01-01

Family

ID=62916419

Family Applications (1)

Application Number Title Priority Date Filing Date
EP18180734.8A Withdrawn EP3588986A1 (fr) 2018-06-29 2018-06-29 Appareil et procédés associés de présentation d'audio

Country Status (2)

Country Link
EP (1) EP3588986A1 (fr)
WO (1) WO2020002302A1 (fr)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130191753A1 (en) * 2012-01-25 2013-07-25 Nobukazu Sugiyama Balancing Loudspeakers for Multiple Display Users
US20150358756A1 (en) * 2013-02-05 2015-12-10 Koninklijke Philips N.V. An audio apparatus and method therefor
US20170070820A1 (en) * 2015-09-04 2017-03-09 MUSIC Group IP Ltd. Method of relating a physical location of a loudspeaker of a loudspeaker system to a loudspeaker identifier
US20170188170A1 (en) * 2015-12-29 2017-06-29 Koninklijke Kpn N.V. Automated Audio Roaming
US20170353811A1 (en) * 2016-06-03 2017-12-07 Nureva, Inc. Method, apparatus and computer-readable media for virtual positioning of a remote participant in a sound space


Also Published As

Publication number Publication date
WO2020002302A1 (fr) 2020-01-02

Similar Documents

Publication Publication Date Title
US11055057B2 (en) Apparatus and associated methods in the field of virtual reality
US10798518B2 (en) Apparatus and associated methods
US11477598B2 (en) Apparatuses and associated methods for spatial presentation of audio
US11399254B2 (en) Apparatus and associated methods for telecommunications
EP2956941A1 (fr) Génération de données audio multivoies assistée par analyse vidéo
US10993066B2 (en) Apparatus and associated methods for presentation of first and second virtual-or-augmented reality content
EP3422744B1 (fr) Appareil et procédés associés
JP2022533755A (ja) 空間オーディオをキャプチャする装置および関連する方法
US11223925B2 (en) Apparatus and associated methods for presentation of captured spatial audio content
US20220095047A1 (en) Apparatus and associated methods for presentation of audio
US10993064B2 (en) Apparatus and associated methods for presentation of audio content
US20230370801A1 (en) Information processing device, information processing terminal, information processing method, and program
JP2021508193A5 (fr)
EP3588986A1 (fr) Appareil et procédés associés de présentation d'audio
EP3734966A1 (fr) Appareil et procédés associés de présentation d'audio
CN116057927A (zh) 信息处理装置、信息处理终端、信息处理方法和程序

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

AX Request for extension of the european patent

Extension state: BA ME

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20200702