EP3706432A1 - Processing multiple spatial audio signals which have a spatial overlap - Google Patents
- Publication number
- EP3706432A1 EP3706432A1 EP19160886.8A EP19160886A EP3706432A1 EP 3706432 A1 EP3706432 A1 EP 3706432A1 EP 19160886 A EP19160886 A EP 19160886A EP 3706432 A1 EP3706432 A1 EP 3706432A1
- Authority
- EP
- European Patent Office
- Prior art keywords
- spatial
- region
- spatial audio
- audio
- audio signals
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
- H04S7/30—Control circuits for electronic adaptation of the sound field
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R3/00—Circuits for transducers, loudspeakers or microphones
- H04R3/005—Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
- G10L2021/02166—Microphone arrays; Beamforming
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R1/00—Details of transducers, loudspeakers or microphones
- H04R1/20—Arrangements for obtaining desired frequency or directional characteristics
- H04R1/32—Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
- H04R1/40—Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
- H04R1/406—Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers microphones
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R2499/00—Aspects covered by H04R or H04S not otherwise provided for in their subgroups
- H04R2499/10—General applications
- H04R2499/11—Transducers incorporated or for use in hand-held devices, e.g. mobile phones, PDA's, camera's
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2400/00—Details of stereophonic systems covered by H04S but not provided for in its groups
- H04S2400/11—Positioning of individual sound objects, e.g. moving airplane, within a sound field
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2400/00—Details of stereophonic systems covered by H04S but not provided for in its groups
- H04S2400/15—Aspects of sound capture and related signal processing for recording or reproduction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
- H04S7/40—Visual indication of stereophonic sound image
Definitions
- This specification relates to spatial audio, for example to processing multiple spatial audio signals, wherein a spatial overlap may potentially occur between some of said audio signals.
- this specification describes an apparatus comprising: means for receiving a first spatial audio signal and a second spatial audio signal (and optionally first and/or second spatial video signals), wherein the first and second spatial audio signals are focussed (e.g. using beamforming) in first and second directions respectively; means for applying a first audio effect to the first spatial audio signal and/or a second audio effect to the second spatial audio signal; means for identifying a first region, in which a spatial overlap exists between the first and second directions; means for obtaining instructions (e.g. user instructions) for processing said first and/or second spatial audio signals within said first region; and means for processing said first and/or second spatial audio signals within said first region in accordance with said instructions.
- Example audio effects include reverberation, modulations, filtering, chorus, flanger etc.
- the said spatial overlap may be a partial spatial overlap.
- the first and/or second spatial audio signals within said first region may be processed to provide a modified audio effect (e.g. a user-defined modified audio effect) to one or more of said audio signals when a spatial overlap of said audio signals is identified.
- the first direction may have a first spatial width and/or the second direction may have a second spatial width.
- the first and second spatial widths may be the same or may be different.
- the first and/or second direction may have a spatial width, such that the spatial overlap may be a partial overlap.
- the instructions may be pre-set instructions and said means for obtaining instructions may retrieve said pre-set instructions.
- Some embodiments further comprise means (such as a video or audio editor) for generating a user interface output related to said first and/or second spatial audio signals within said first region, wherein said means for obtaining instructions receives instructions in response to said user interface output.
- the user interface output may comprise a time-domain visualisation and/or a frequency-domain visualisation of the first and/or second spatial audio signals within said first region.
- the said visualisation(s) may be provided to highlight the first region.
- the said user interface output may depict potential echoes of at least some of said first and/or second spatial audio signals.
- This embodiment may further comprise means for receiving a user indication of which of said potential echoes are related to the first and/or the second spatial audio signals.
- the means for identifying the first region may comprise one or more of: identifying if the first and second audio effects are identical and, if so, said first region is not identified; and identifying if the first and/or second spatial audio signals is/are below an audio threshold level (e.g. silent) and, if so, said first region is not identified.
- the means for processing said first and/or second spatial audio signals may disable audio effects associated with said first and/or second spatial audio signals in the absence of user instructions to the contrary when said first region is identified.
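The region-identification checks described above (no region is identified when both audio effects are identical, or when a signal is below an audio threshold level) might be sketched as follows. This is an illustrative sketch only; all function and parameter names are assumptions, not taken from the specification:

```python
def overlap_needs_processing(first_effect, second_effect,
                             first_level_db, second_level_db,
                             silence_threshold_db=-60.0):
    """Decide whether a spatial overlap region needs special handling.

    Mirrors the two checks described above: if both sources carry the
    same effect, or if either source is effectively silent, the region
    need not be flagged for further processing.
    """
    if first_effect == second_effect:
        return False                      # identical effects: no conflict
    if first_level_db < silence_threshold_db:
        return False                      # first source is (near) silent
    if second_level_db < silence_threshold_db:
        return False                      # second source is (near) silent
    return True                           # overlap region should be flagged
```

A caller would typically run this per overlap candidate and only then ask the user (or preset rules) how to resolve it.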
- Some embodiments further comprise means for providing a preview output including said processed first and/or second spatial audio signals within said first region.
- a user may preview how the overlap region may be rendered in a particular configuration.
- the means for processing said first and/or second spatial audio signals within said first region may modify audio settings in said first region.
- Some embodiments further comprise means for setting said first and/or second audio effects.
- a user input may be provided to enable the user to set the first and/or second audio effects. This may, for example, be implemented using a user interface.
- Some embodiments further comprise means (e.g. a mobile phone, such as a multi-microphone mobile phone, or a similar user device) for capturing the first and/or second spatial audio signals.
- Some embodiments further comprise uploading at least some audio content of said first region for further audio processing.
- the said further audio processing may include sound separation.
- the said means may comprise: at least one processor; and at least one memory including computer program code, the at least one memory and the computer program code configured, with the at least one processor, to cause the performance of the apparatus.
- this specification describes a method comprising: receiving a first spatial audio signal and a second spatial audio signal (and optionally first and/or second spatial video signals), wherein the first and second spatial audio signals are focussed (e.g. using beamforming) in first and second directions respectively; applying a first audio effect to the first spatial audio signal and/or a second audio effect to the second spatial audio signal; identifying a first region, in which a spatial overlap exists between the first and second directions; obtaining instructions (e.g. user instructions) for processing said first and/or second spatial audio signals within said first region; and processing said first and/or second spatial audio signals within said first region in accordance with said instructions.
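The five method steps above can be sketched as a single pipeline. All callables and names below are hypothetical stand-ins for the claimed means, purely for illustration; they are not taken from the specification:

```python
def process_spatial_audio(first_signal, second_signal,
                          first_direction, second_direction,
                          first_effect, second_effect,
                          find_overlap, get_instructions, process_region):
    """Orchestrate the method steps: apply per-source effects, identify
    an overlap region between the two focus directions, obtain handling
    instructions, and process the region accordingly.

    `find_overlap` returns an overlap region or None; `get_instructions`
    asks the user (or a preset) how to handle it; `process_region`
    applies those instructions to the signals.
    """
    first_signal = first_effect(first_signal)
    second_signal = second_effect(second_signal)
    region = find_overlap(first_direction, second_direction)
    if region is not None:
        instructions = get_instructions(region)
        first_signal, second_signal = process_region(
            first_signal, second_signal, region, instructions)
    return first_signal, second_signal
```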
- Example audio effects include reverberation, modulations, filtering, chorus, flanger etc.
- the said spatial overlap may be a partial spatial overlap.
- the first direction may have a first spatial width and/or the second direction may have a second spatial width.
- the first and second spatial widths may be the same or may be different.
- Some embodiments further comprise generating a user interface output related to said first and/or second spatial audio signals within said first region, wherein obtaining said instructions comprises receiving instructions in response to said user interface output.
- the user interface output may comprise a time-domain visualisation and/or a frequency-domain visualisation of the first and/or second spatial audio signals within said first region.
- the said visualisation(s) may be provided to highlight the first region.
- Identifying the first region may comprise one or more of: identifying if the first and second audio effects are identical and, if so, said first region is not identified; and identifying if the first and/or second spatial audio signals is/are below an audio threshold level (e.g. silent) and, if so, said first region is not identified.
- Some embodiments further comprise providing a preview output including said processed first and/or second spatial audio signals within said first region.
- a user may preview how the overlap region may be rendered in a particular configuration.
- Some embodiments further comprise setting said first and/or second audio effects.
- a user input may be provided to enable the user to set the first and/or second audio effects. This may, for example, be implemented using a user interface.
- Some embodiments further comprise capturing the first and/or second spatial audio signals.
- Some embodiments further comprise uploading at least some audio content of said first region for further audio processing (such as sound separation).
- this specification describes any apparatus configured to perform any method as described with reference to the second aspect.
- this specification describes computer-readable instructions which, when executed by computing apparatus, cause the computing apparatus to perform any method as described with reference to the second aspect.
- this specification describes a computer program comprising instructions for causing an apparatus to perform at least the following: receive a first spatial audio signal and a second spatial audio signal (and optionally first and/or second spatial video signals), wherein the first and second spatial audio signals are focussed in first and second directions respectively; apply a first audio effect to the first spatial audio signal and/or a second audio effect to the second spatial audio signal; identify a first region, in which a spatial overlap exists between the first and second directions; obtain instructions for processing said first and/or second spatial audio signals within said first region; and process said first and/or second spatial audio signals within said first region in accordance with said instructions.
- this specification describes a computer-readable medium (such as a non-transitory computer readable medium) comprising program instructions stored thereon for performing at least the following: receiving a first spatial audio signal and a second spatial audio signal (and optionally first and/or second spatial video signals), wherein the first and second spatial audio signals are focussed in first and second directions respectively; applying a first audio effect to the first spatial audio signal and/or a second audio effect to the second spatial audio signal; identifying a first region, in which a spatial overlap exists between the first and second directions; obtaining instructions for processing said first and/or second spatial audio signals within said first region; and processing said first and/or second spatial audio signals within said first region in accordance with said instructions.
- this specification describes an apparatus comprising: at least one processor; and at least one memory including computer program code which, when executed by the at least one processor, causes the apparatus to: receive a first spatial audio signal and a second spatial audio signal (and optionally first and/or second spatial video signals), wherein the first and second spatial audio signals are focussed in first and second directions respectively; apply a first audio effect to the first spatial audio signal and/or a second audio effect to the second spatial audio signal; identify a first region, in which a spatial overlap exists between the first and second directions; obtain instructions for processing said first and/or second spatial audio signals within said first region; and process said first and/or second spatial audio signals within said first region in accordance with said instructions.
- this specification describes an apparatus comprising: a first input for receiving a first spatial audio signal and a second spatial audio signal (and optionally first and/or second spatial video signals), wherein the first and second spatial audio signals are focussed (e.g. using beamforming) in first and second directions respectively; a first audio processing module for applying a first audio effect to the first spatial audio signal and/or a second audio effect to the second spatial audio signal; a spatial processing module for identifying a first region, in which a spatial overlap exists between the first and second directions; a first control module for obtaining instructions (e.g. user instructions) for processing said first and/or second spatial audio signals within said first region; and a second audio processing module (which may be the same module as the first audio processing module) for processing said first and/or second spatial audio signals within said first region in accordance with said instructions.
- FIG. 1 is a plan view of a system, indicated generally by the reference numeral 10, in which embodiments described herein may be used.
- the system 10 comprises a user device 12, such as a mobile phone.
- the user device 12 may include one or more microphones for capturing spatial audio signals.
- the user device 12 may also include one or more cameras for capturing images and/or any related data such as depth maps.
- the spatial audio data captured by the user device 12 may include audio content that includes at least some indication of sound directions. This may, for example, be by means of a parametric representation or otherwise perceivable by a listener. For example, an omnidirectional microphone may be used that captures audio from all directions around the user device 12.
- FIG. 2 is a block diagram of a system, indicated generally by the reference numeral 20, in accordance with an example embodiment.
- the system 20 comprises the user device 12 described above and additionally includes a first audio source 22.
- the first audio source 22 may, for example, be a person talking (although other audio sources are possible in example embodiments).
- the user device 12 receives a first spatial audio signal from the first audio source 22.
- the user device 12 may include a first audio focus arrangement 24, such as a beamforming arrangement, in the direction of the first audio source 22 in order to obtain, for example, a monophonic audio signal enhancing the specified direction.
- Such audio focussing may be provided, for example, to extract audio data relating to the first audio source 22, or to focus on the source 22 in the spatial audio field.
- the first spatial audio signal is focussed in a first direction (i.e. in the direction of the user device 12).
- the beam can be narrow or wide.
- the beamforming may amplify the beamed direction only slightly, or may substantially amplify the beamed direction and, optionally, attenuate or cancel audio signals from other directions.
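The beamforming described above can be illustrated with a toy delay-and-sum beamformer for a linear microphone array: samples arriving from the steered direction are time-aligned and averaged, so that direction is amplified relative to others. This is a minimal sketch (far-field assumption, integer-sample delays only) and not the arrangement claimed in the specification; all names are illustrative:

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second, approximate at room temperature

def delay_and_sum(signals, mic_positions_m, angle_rad, fs):
    """Toy delay-and-sum beamformer.

    `signals` is a list of equal-length sample lists, one per microphone
    of a linear array whose positions along the array axis are given in
    `mic_positions_m`; `angle_rad` is the steering angle relative to
    broadside. Samples from the steered direction are aligned and
    averaged, enhancing that direction.
    """
    n = len(signals[0])
    out = [0.0] * n
    for sig, pos in zip(signals, mic_positions_m):
        # Far-field plane-wave delay for this microphone, in samples.
        delay = round(pos * math.sin(angle_rad) / SPEED_OF_SOUND * fs)
        for i in range(n):
            j = i - delay
            if 0 <= j < n:
                out[i] += sig[j]
    return [v / len(signals) for v in out]
```

A narrow or wide beam, and stronger or weaker amplification, would correspond to using more microphones, different weights, or post-filtering.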
- FIG. 3 is a block diagram of a system, indicated generally by the reference numeral 30, in accordance with an example embodiment.
- the system 30 comprises the user device 12 and the first audio source 22 described above and additionally includes a second audio source 32.
- the first and second audio sources 22 and 32 may be two people talking, although this is not essential to all embodiments.
- the audio objects may include musical instruments (e.g. guitars and drums), stationary loudspeakers, movable loudspeakers, multimedia devices such as TVs, or may include any other sound sources.
- the user device 12 receives a first spatial audio signal from the first audio source 22 and a second spatial audio signal from the second audio source 32.
- the user device 12 includes a first audio focus arrangement 24, such as a first beamforming arrangement, in the direction of the first audio source 22 in order to obtain, for example, a monophonic audio signal enhancing the specified direction.
- the user device 12 of the system 30 includes a second audio focus arrangement 34, such as a second beamforming arrangement, in the direction of the second audio source.
- two beamforming arrangements may be operated in parallel, one implementing the first audio focus arrangement 24 and the other implementing the second audio focus arrangement 34.
- tracking may be employed to guide at least one audio focus or beamforming direction.
- visual tracking based on a camera input (or any other suitable tracking) may be used, whereby an audio source, such as the audio sources 22 and 32, that moves relative to the capture device is maintained under audio focus or beamforming.
- a moving audio source may thus be maintained under the audio focus arrangement.
- FIG. 4 is a flow chart, indicated generally by the reference numeral 40, showing an example algorithm.
- the algorithm 40 starts at operation 41, where the first and second audio focus arrangements 24 and 34 are arranged to enable the user device 12 to receive a first spatial audio signal focussed in the direction of the first audio source 22 and a second spatial audio signal focussed in the direction of the second audio source 32.
- the operation 41 may implement suitable beamforming arrangements.
- the first and second audio focus arrangements 24 and 34 may have a first and second spatial width respectively.
- a user selects a first audio effect to be applied to the first spatial audio signal and/or a second audio effect to be applied to the second spatial audio signal.
- Example audio effects include reverberations, modulations, filtering etc.
- the system 30 processes the spatial audio signals (e.g., audio objects) according to the effects selected in the operation 42.
- first and/or second spatial audio signals captured by the user device 12 can be modified according to the effects selected in the operation 42.
- the algorithm 40 can be applied to the system 30 in order to process the first and second audio sources separately. Indeed, entirely different effects may be selected in operation 42 and applied to the first and second audio sources in operation 43. This is relatively simple to implement in the event that the separation between the audio sources is much greater than the width of the audio focusing arrangements 24 and 34. However, if the audio sources are relatively close together (e.g. such that they overlap, at least to some degree), it may become difficult to separate the audio sources and to process the relevant audio signals separately.
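The separate processing described above can be illustrated with two toy effects applied independently to the two focused signals; this is straightforward while the sources are spatially well separated. The sketch below is purely illustrative (the effect implementations and names are assumptions, not the claimed processing):

```python
def apply_gain(samples, gain):
    """Toy 'effect': scale the focused signal by a constant gain."""
    return [s * gain for s in samples]

def apply_echo(samples, delay_samples, decay):
    """Toy single-tap echo effect: add a delayed, decayed copy."""
    out = list(samples)
    for i in range(delay_samples, len(out)):
        out[i] += samples[i - delay_samples] * decay
    return out

# Each focused (beamformed) source gets its own, entirely different,
# effect chain, as selected in operation 42 and applied in operation 43.
first_source = [1.0, 0.0, 0.0, 0.0]
second_source = [0.0, 1.0, 0.0, 0.0]
first_processed = apply_gain(first_source, 0.5)
second_processed = apply_echo(second_source, 2, 0.5)
```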
- FIG. 5 is a block diagram of a system, indicated generally by the reference numeral 50, in accordance with an example embodiment.
- the system 50 comprises the user device 12, the first audio source 22 and the second audio source 32 described above.
- the user device 12 includes the first audio focus arrangement 24, in the direction of the first audio source 22, and the second audio focus arrangement 34, in the direction of the second audio source 32, described above.
- the first and second audio sources 22 and 32 are much closer together in the system 50 than in the system 30. Accordingly, the first and second audio focus arrangements 24 and 34 are much closer together in the system 50 than in the system 30; indeed, in the system 50, the first and second audio focus arrangements 24 and 34 overlap to a significant degree. Of course, the degree of overlap is to some extent determined by the spatial width of the audio focus arrangements 24 and 34.
- FIG. 6 is a flow chart showing an algorithm, indicated generally by the reference numeral 60, in accordance with an example embodiment.
- the algorithm 60 may be implemented using the systems 30 or 50.
- the algorithm 60 starts at operation 41, where, as discussed above, audio focus arrangements 24 and 34 are arranged to enable the user device 12 to receive a first spatial audio signal focussed in the direction of the first audio source 22 (a "first direction") and a second spatial audio signal focussed in the direction of the second audio source 32 (a "second direction").
- the operation 41 may be implemented at the user device 12 by providing means for receiving the first spatial audio signal and the second spatial audio signal at the user device.
- the first and second audio focus arrangements 24 and 34 may have first and second spatial widths respectively.
- a user selects a first audio effect to be applied to the first spatial audio signal and/or a second audio effect to be applied to the second spatial audio signal.
- example audio effects include reverberations, modulations, filtering etc.
- the algorithm 60 then moves to operation 61, where it is determined whether a first region is identified, in which a spatial overlap exists between the first and second directions. For example, in the system 50, a substantial overlap exists, whereas in the system 30, no significant overlap exists. If an overlap is identified, the algorithm 60 moves to operation 62; otherwise, the algorithm 60 moves to operation 63. It should be noted that the extent of the overlap necessary for an overlap to be identified may vary in different embodiments (and may be user-definable). Moreover, as noted above, the extent to which an overlap exists may be dependent on the spatial widths of the respective audio focus arrangements.
- overlap regions are processed (as discussed further below), before the algorithm 60 moves to operation 63.
- the audio objects are processed.
- the operation 63 may, for example, be similar to (or identical to) the operation 43 discussed above.
- the first and/or second spatial audio signals within said first region may be processed to modify audio settings in said first region.
- the operation 63 may disable the audio effects associated with said first and/or second spatial audio signals in the absence of instructions (e.g. user instructions) to the contrary when said first region is identified (i.e. when an overlap is detected).
- a preview output, including said processed first and/or second spatial audio signals within said first region, may be provided to a user.
- a user may preview how the overlap region may be rendered in a particular configuration. This may, for example, enable a user (such as an editor) to trial a number of different approaches to dealing with overlap conditions.
- the algorithm 60 can therefore be used to identify areas of overlap between the spatial positions of audio sources (or between focus arrangements relating to said audio sources). This may be detected, for example, by calculating the absolute difference between directions-of-arrival (DOA) of audio signals, differences in azimuth and/or elevation angle, differences in the directions of audio focussing arrangements etc.
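The DOA-difference detection described above might be sketched as follows: two focus directions overlap when their angular separation (wrapped around the circle) is smaller than the sum of their half-widths. This is an illustrative sketch in azimuth only; names and thresholds are assumptions, not taken from the specification:

```python
def angular_difference_deg(azimuth_a, azimuth_b):
    """Absolute azimuth difference, wrapped into [0, 180] degrees."""
    diff = abs(azimuth_a - azimuth_b) % 360.0
    return 360.0 - diff if diff > 180.0 else diff

def beams_overlap(azimuth_a, width_a, azimuth_b, width_b):
    """Two focus directions overlap when their angular separation is
    smaller than the sum of their half (spatial) widths."""
    separation = angular_difference_deg(azimuth_a, azimuth_b)
    return separation < (width_a + width_b) / 2.0
```

A fuller implementation might also compare elevation angles, or make the required extent of overlap user-definable, as noted above.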
- a problem due to spatial overlapping of the form described above may not arise.
- the presence of an overlap may not present a problem.
- the operation 61 may proceed to the operation 63 regardless of whether an overlap is detected.
- the operation 61 may determine whether a relevant overlap exists.
- FIG. 7 is a flow chart showing an algorithm, indicated generally by the reference numeral 70, in accordance with an example embodiment.
- the algorithm 70 is an example implementation of the operation 62 of the algorithm 60 described above. Other example implementations of the operation 62 are possible.
- instructions for processing audio signals within an overlap region may be pre-set.
- a first effect may take precedence over a second effect. For example, if a first audio object having a chorus effect applied to it overlaps a second audio object having a reverb effect applied to it, a rule may be applied to modify the reverb effect of the second audio object.
- Different effects may be provided in a prioritized order in user settings, device settings, or learnt by the system 50.
- the system may learn over the course of time when a user is making changes in overlapping situations and propose to apply the change the user has most often selected when an overlapping has occurred.
- the system may also change the effect of the first and/or second audio object according to the learnt behaviour without asking for the user's confirmation.
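The learning behaviour described above might be sketched as a simple frequency counter over the user's past choices: propose the most common handling, and only bypass confirmation once that preference is well established. This sketch, including the class name and the confirmation threshold, is an illustrative assumption, not the claimed mechanism:

```python
from collections import Counter

class OverlapPreferenceLearner:
    """Remember which overlap handling the user picks most often and
    propose (or silently apply) it the next time an overlap occurs."""

    def __init__(self, auto_apply_after=5):
        self.choices = Counter()
        self.auto_apply_after = auto_apply_after

    def record_choice(self, handling):
        """Record the handling the user selected for an overlap."""
        self.choices[handling] += 1

    def proposal(self):
        """Return the most frequently chosen handling, or None."""
        if not self.choices:
            return None
        return self.choices.most_common(1)[0][0]

    def can_auto_apply(self):
        """Only bypass confirmation once the preference is well established."""
        best = self.proposal()
        return best is not None and self.choices[best] >= self.auto_apply_after
```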
- Different ways to handle the overlapping situation exist, as explained earlier. For example, one or more of the effects may be removed, one or more of the effects may be changed, or one or more effects may be applied to one or more audio objects.
- a surprise (random or semi-random) effect may also be applied; e.g. when a certain first effect and a certain second effect overlap, a certain third effect may be applied to both.
- the algorithm 70 starts at operation 71, where a user interface output is generated related to said first and/or second spatial audio signals within said overlap region.
- the user interface may, for example, be an audio editor (or a video and audio editor). Example user interface outputs are described below.
- an indication is given, for example by a user, indicating how the overlap (e.g. the overlap identified in the operation 61) is to be handled.
- the operation 72 may involve receiving user instructions in response to said user interface output.
- the identified overlap is processed in accordance with the user indication provided in the operation 72.
- the first and/or second spatial audio signals within the overlap region may be processed in accordance with said user instructions.
- FIG. 8 shows an output, indicated generally by the reference numeral 80, in accordance with an example embodiment.
- the output 80 is an example implementation of the operation 72 described above and shows a time-domain visualisation of a first audio signal 81 and a second audio signal 82.
- the time-domain visualisation may correspond to a video (or audio and video) editing software timeline such that it provides a linear view of the audio or audio-visual content across a device screen or a software window.
- the visualisation may be provided to highlight the first region.
- the output 80 is therefore an example user interface output.
- the time-domain visualisation of the first audio signal 81 includes three parts: labelled 81a, 81b and 81c respectively.
- the three parts of the first audio signal (which are provided at different time periods) are provided at different spatial positions.
- the time-domain visualisation of the second audio signal 82 includes three parts: labelled 82a, 82b and 82c respectively. Again, the three parts of the second audio signal (which are provided at different time periods) are provided at different spatial positions.
- the output 80 includes a visualisation of an overlap period 84, during which the first and second audio signals are at the same spatial position.
- the output 80 is therefore an example of a time-domain user interface that can be used to highlight, to a user (such as an editor), a region in which a spatial overlap exists between the first and second audio signals.
- Of course, many alternative time-domain visualisations are possible.
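An overlap period such as the overlap period 84 could be computed as sketched below, under the assumption (not stated in the specification) that each audio object has a per-frame azimuth track in degrees; the 10-degree tolerance is likewise an illustrative assumption.

```python
def overlap_periods(pos_a, pos_b, tolerance_deg=10.0):
    """Return (start, end) frame-index pairs during which two azimuth
    tracks (one value per frame, in degrees) are within a tolerance,
    i.e. the two audio objects occupy the same spatial position."""
    periods = []
    start = None
    for i, (a, b) in enumerate(zip(pos_a, pos_b)):
        if abs(a - b) <= tolerance_deg:
            if start is None:
                start = i  # an overlap period begins at this frame
        elif start is not None:
            periods.append((start, i))  # the period ends before frame i
            start = None
    if start is not None:
        periods.append((start, len(pos_a)))  # period runs to the end
    return periods
```

The resulting index pairs could then be rendered as highlighted spans on the editing-software timeline described above.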
- FIG. 9 shows an output, indicated generally by the reference numeral 90, in accordance with an example embodiment.
- the output 90 is an example implementation of the operation 72 described above and shows a frequency-domain visualisation of a first audio signal 91 and a second audio signal 92. The visualisation may be provided to highlight the first region.
- the output 90 is therefore an example user interface output.
- the first audio signal 91 has generally lower frequency components and the second audio signal 92 has generally higher frequency components.
- a frequency 94 (which may be user defined) indicates a frequency at which a filter may be provided to separate the first and second audio signals. Such filtering is an example of processing that may be performed in the operations 62 and 73 described above. Of course, many alternative frequency-domain visualisations are possible.
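The split at the user-defined frequency 94 can be sketched with an ideal (brick-wall) FFT mask. This is an illustrative simplification: a practical implementation would use a properly designed filter rather than a hard spectral mask.

```python
import numpy as np

def split_at_frequency(mix, cutoff_hz, sample_rate):
    """Separate a mixed signal into components below and above a
    user-defined cutoff frequency using an ideal FFT mask."""
    spectrum = np.fft.rfft(mix)
    freqs = np.fft.rfftfreq(len(mix), d=1.0 / sample_rate)
    low = np.fft.irfft(spectrum * (freqs < cutoff_hz), n=len(mix))
    high = np.fft.irfft(spectrum * (freqs >= cutoff_hz), n=len(mix))
    return low, high
```

With a 100 Hz tone standing in for the lower-frequency first audio signal 91 and a 1000 Hz tone for the second audio signal 92, splitting at 500 Hz recovers each tone from the mixture.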
- FIG. 10 is a block diagram of a system, indicated generally by the reference numeral 100, in accordance with an example embodiment.
- the system 100 comprises the user device 12, the first audio source 22 and the second audio source 32 described above.
- the user device 12 receives a first spatial audio signal from the first audio source 22 and a second spatial audio signal from the second audio source 32, with the user device 12 including the first audio focus arrangement 24 and the second audio focus arrangement 34 described above.
- the user device 12 and the first and second audio sources 22, 32 are provided within a space 102, such that echoes occur.
- a first echo 104 may be provided between the first audio source 22 and the user device 12 and a second echo 105 may be provided between the second audio source 32 and the user device 12.
- the system 100 is a simple example; more complicated systems, including multiple echoes for each of multiple audio sources, are possible.
- FIG. 11 is a flow chart showing an algorithm, indicated generally by the reference numeral 110, in accordance with an example embodiment.
- the algorithm 110 may be implemented using the system 100.
- the algorithm 110 may be implemented as part of the operation 62 of the algorithm 60 described above, although this is not essential to all embodiments.
- the algorithm 110 starts at operation 111, where echoes (as received at the user device 12) are determined.
- a visualisation of potential echoes of at least some of the first and second spatial audio signals (originating at the first and second audio sources respectively) is presented to a user (such as an editor).
- a user identifies echoes, for example by using a user interface to indicate which echo relates to which audio source.
- a user indication of which of said potential echoes are related to the first and/or the second spatial audio signals is received.
- the beamforming arrangement is enhanced accordingly at operation 114 of the algorithm 110.
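A delay-and-sum sketch of how the user-assigned echoes might be combined in the operation 114 is given below. The per-echo delays and gains are illustrative assumptions; a real acoustic rake receiver would derive them from estimated room acoustics.

```python
def rake_combine(direct, echoes):
    """Combine a direct-path signal with user-assigned echoes by
    delay-and-sum: each echo is a (delay_samples, gain, signal)
    tuple whose samples are advanced by the echo's delay so that
    they add coherently with the direct path."""
    out = list(direct)
    for delay, gain, sig in echoes:
        for i in range(len(out)):
            j = i + delay  # echo sample aligned with direct[i]
            if 0 <= j < len(sig):
                out[i] += gain * sig[j]
    return out
```

Consistent with the observation below about reverberance, each additional echo entry raises the energy of the combined output.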
- FIG. 12 shows a user interface, indicated generally by the reference numeral 120, in accordance with an example embodiment.
- the user interface 120 provides an example visualisation of echoes and is therefore an example implementation of the operation 112 of the algorithm 110.
- the user interface 120 shows a representation of the first audio source 22 and the second audio source 32 described above.
- the user interface 120 also shows first to sixth echoes (labelled 121 to 126 respectively) that could be related to either the first audio source 22 or the second audio source 32.
- the user device 12 may not be able to determine which echoes relate to which sound source (particularly if the sound sources are spatially close, e.g. overlapping, and the separation of the original audio source signals is therefore difficult).
- the user interface 120 may enable a user (such as an editor) to listen to individual echoes 121 to 126 and to indicate which sound source some or all of the echoes relate to (thereby implementing the operation 113 described above).
- FIG. 13 shows a user interface, indicated generally by the reference numeral 130, in accordance with an example embodiment.
- the user interface 130 shows a representation of the first audio source 22 and the second audio source 32 described above.
- the user interface 130 also shows a first echo 122a, a second echo 122b and a third echo 122c that relate to the first audio source 22, a fourth echo 132a that relates to the second audio source 32, and a fifth echo 134 and a sixth echo 135 that have not (or, perhaps, have not yet) been assigned to either the first or the second audio sources.
- the operation 114 may be implemented using an acoustic rake receiver beamformer.
- a user (such as an editor) may be able to preview how the audio sources sound when the acoustic rake receiver is applied to enhance an output using the selected echoes.
- the user may also be able to preview an audio output following rake receiver beamforming after the application of the selected effects (as discussed above).
- the adding of rake receiver beamformed echoes to a sound also affects the sound so that an effect is modified; for example, a sound may become more reverberant as more echoes are added to the sound.
- FIG. 14 is a flow chart showing an algorithm, indicated generally by the reference numeral 140, in accordance with an example embodiment.
- the algorithm 140 is an example implementation of the operation 62 of the algorithm 60 described above.
- the algorithm 140 starts at operation 141, where a determination is made (e.g. in the form of a user input) that the audio objects should be uploaded to a remote server (or elsewhere) for further processing. Then, at operation 142, the audio objects are uploaded to a server, which server processes the data in some way. In this way, an overlap region can be uploaded for further audio processing.
- the server may run a separation method to separate the audio signals within the overlap region.
- the algorithm 140 may be useful, for example, in the event that the user device 12 has insufficient resources (e.g. insufficient processing power) to perform the required processing, such as an audio separation method.
- a process in the server may process the audio signals to apply effects to the overlapping region or modify the effect of at least one of the audio signals within the overlapping region.
- the process in the server may for example identify the audio objects and their audio signals and identify the effects applied to them.
- the process in the server may modify the effects of the audio signals as described for example in relation to FIG. 7 .
- the process in the server may then create a modified audio stream and store it at the server and/or provide it for user devices.
- audio signals could be recorded and uploaded to a server, which applies effects to the audio signals and sends modified audio signals to an editor process.
- a user operating an audio software on a device could be arranged to have a server inspect for overlapping audio effects and modify the recorded audio signals, and the user device could be arranged to receive modified audio signals which the user can then further edit if needed.
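The upload of an overlap region for server-side processing (operations 141 and 142) could package the region as sketched below. The schema and field names are purely illustrative assumptions, as the specification does not define a wire format or service endpoint.

```python
import json

def build_overlap_upload(start_s, end_s, samples, effects):
    """Package an identified overlap region for server-side audio
    processing (e.g. a source-separation method). All field names
    here are hypothetical; a real service would define its own
    schema."""
    payload = {
        "region": {"start_s": start_s, "end_s": end_s},
        "samples": samples,   # raw samples of the overlap region
        "effects": effects,   # effects applied per audio object
    }
    return json.dumps(payload)
```

The server would separate or re-process the signals in the region and return a modified audio stream for further editing, as described above.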
- FIG. 15 is a block diagram of a system, indicated generally by the reference numeral 150, in accordance with an example embodiment.
- the system 150 comprises an apparatus 151, which may, for example, form part of the user device 12 described above. In any particular implementation, some of the elements of the system 150 may be omitted and/or some other elements added.
- the apparatus 151 comprises an input module 152, a controller 153, an effects module 154, an overlap identifier module 155, a user interface controller 156, and a processor 157.
- the input module 152 receives one or more audio and/or visual inputs.
- the user interface controller 156 receives user input.
- the processor 157 is in two-way communication with one or more external modules 158.
- the inputs received at the input module 152 may, for example, be the audio signals received from the first and second audio sources 22 and 32.
- the audio signals may be focussed in first and second directions respectively, as discussed above.
- audio effects may be applied to the audio sources (by the effects module 154), spatial overlaps between the first and second directions may be identified by the overlap identifier module 155, and a user interface related to the first and/or second audio signals may be provided by the user interface controller 156.
- a user/editor may provide user instructions to the user interface controller 156.
- the processor 157 may then process the first and/or second spatial audio signals within said first region in accordance with said user instructions.
- the processor 157 may also communicate with the external module(s) 158, for example to implement the algorithm 140 described above.
- system 150 is highly schematic and is provided by way of example only. Many alternatives to the configuration of the system 150 will be apparent to those skilled in the art.
- FIG. 16 is a schematic diagram of components of one or more of the example embodiments described previously, which hereafter are referred to generically as processing systems 300.
- a processing system 300 may have a processor 302, a memory 304 closely coupled to the processor and comprised of a RAM 314 and ROM 312, and, optionally, user input 310 and a display 318.
- the processing system 300 may comprise one or more network/apparatus interfaces 308 for connection to a network/apparatus, e.g. a modem which may be wired or wireless. The interface 308 may also operate as a connection to other apparatus, such as a device/apparatus which is not a network-side apparatus. Thus, a direct connection between devices/apparatus without network participation is possible.
- the processor 302 is connected to each of the other components in order to control operation thereof.
- the memory 304 may comprise a non-volatile memory, such as a hard disk drive (HDD) or a solid-state drive (SSD).
- the ROM 312 of the memory 304 stores, amongst other things, an operating system 315 and may store software applications 316.
- the RAM 314 of the memory 304 is used by the processor 302 for the temporary storage of data.
- the operating system 315 may contain code which, when executed by the processor 302, implements aspects of the algorithms 40, 60, 70, 110 and 140 described above. Note that, in the case of a small device/apparatus, a memory suited to small-size usage may be provided; i.e. a hard disk drive (HDD) or a solid-state drive (SSD) is not always used.
- the processor 302 may take any suitable form. For instance, it may be a microcontroller, a plurality of microcontrollers, a processor, or a plurality of processors.
- the processing system 300 may be a standalone computer, a server, a console, or a network thereof.
- the processing system 300 and any needed structural parts may all be provided inside a device/apparatus, such as an IoT device/apparatus, i.e. embedded in a very small form factor.
- the processing system 300 may also be associated with external software applications. These may be applications stored on a remote server device/apparatus and may run partly or exclusively on the remote server device/apparatus. These applications may be termed cloud-hosted applications.
- the processing system 300 may be in communication with the remote server device/ apparatus in order to utilize the software application stored there.
- FIGS. 17A and 17B show tangible media, respectively a removable memory unit 365 and a compact disc (CD) 368, storing computer-readable code which when run by a computer may perform methods according to example embodiments described above.
- the removable memory unit 365 may be a memory stick, e.g. a USB memory stick, having internal memory 366 storing the computer-readable code.
- the memory 366 may be accessed by a computer system via a connector 367.
- the CD 368 may be a CD-ROM or a DVD or similar. Other forms of tangible storage media may be used.
- Tangible media can be any device/apparatus capable of storing data/information which data/information can be exchanged between devices/apparatus/network.
- Embodiments of the present invention may be implemented in software, hardware, application logic or a combination of software, hardware and application logic.
- the software, application logic and/or hardware may reside on memory, or any computer media.
- the application logic, software or an instruction set is maintained on any one of various conventional computer-readable media.
- a "memory" or “computer-readable medium” may be any non-transitory media or means that can contain, store, communicate, propagate or transport the instructions for use by or in connection with an instruction execution system, apparatus, or device, such as a computer.
- references to, where relevant, "computer-readable storage medium", "computer program product", "tangibly embodied computer program" etc., or a "processor" or "processing circuitry" etc. should be understood to encompass not only computers having differing architectures such as single/multi-processor architectures and sequencers/parallel architectures, but also specialised circuits such as field-programmable gate arrays (FPGA), application-specific integrated circuits (ASIC), signal processing devices/apparatus and other devices/apparatus.
- References to computer program, instructions, code etc. should be understood to encompass software for a programmable processor, or firmware such as the programmable content of a hardware device/apparatus, whether instructions for a processor, or configuration settings for a fixed-function device/apparatus, gate array, programmable logic device/apparatus, etc.
- circuitry refers to all of the following: (a) hardware-only circuit implementations (such as implementations in only analogue and/or digital circuitry); (b) combinations of circuits and software (and/or firmware), such as (as applicable): (i) a combination of processor(s), or (ii) portions of processor(s)/software (including digital signal processor(s)), software and memory(ies) that work together to cause an apparatus, such as a server, to perform various functions; and (c) circuits, such as a microprocessor(s) or a portion of a microprocessor(s), that require software or firmware for operation, even if the software or firmware is not physically present.
Abstract
An apparatus, method and computer program are described comprising: receiving a first spatial audio signal and a second spatial audio signal, wherein the first and second spatial audio signals are focussed in first and second directions respectively; applying a first audio effect to the first spatial audio signal and/or a second audio effect to the second spatial audio signal; identifying a first region, in which a spatial overlap exists between the first and second directions; obtaining instructions for processing said first and/or second spatial audio signals within said first region; and processing said first and/or second spatial audio signals within said first region in accordance with said instructions.
Description
- This specification relates to spatial audio, for example to processing multiple spatial audio signals, wherein a spatial overlap may potentially occur between some of said audio signals.
- Systems exist in which multiple audio signals can be captured by user devices, such as mobile phones. However, there remains a need for further developments and improvements in this field.
- In a first aspect, this specification describes an apparatus comprising: means for receiving a first spatial audio signal and a second spatial audio signal (and optionally first and/or second spatial video signals), wherein the first and second spatial audio signals are focussed (e.g. using beamforming) in first and second directions respectively; means for applying a first audio effect to the first spatial audio signal and/or a second audio effect to the second spatial audio signal; means for identifying a first region, in which a spatial overlap exists between the first and second directions; means for obtaining instructions (e.g. user instructions) for processing said first and/or second spatial audio signals within said first region; and means for processing said first and/or second spatial audio signals within said first region in accordance with said instructions. Example audio effects include reverberation, modulations, filtering, chorus, flanger etc. The said spatial overlap may be a partial spatial overlap.
- The first and/or second spatial audio signals within said first region may be processed to provide a modified audio effect (e.g. a user-defined modified audio effect) to one or more of said audio signals when a spatial overlap of said audio signals is identified.
- The first direction may have a first spatial width and/or the second direction may have a second spatial width. The first and second spatial widths may be the same or may be different.
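If each focus direction is treated as an azimuth interval with its own spatial width, the spatial overlap can be sketched as an interval intersection. This is an illustrative simplification: azimuth wrap-around at ±180 degrees is ignored here.

```python
def beam_overlap(dir_a_deg, width_a_deg, dir_b_deg, width_b_deg):
    """Return the overlapping azimuth interval (in degrees) of two
    focus beams, or None when they do not overlap. Each beam is
    taken to cover [direction - width/2, direction + width/2]."""
    lo = max(dir_a_deg - width_a_deg / 2, dir_b_deg - width_b_deg / 2)
    hi = min(dir_a_deg + width_a_deg / 2, dir_b_deg + width_b_deg / 2)
    return (lo, hi) if lo < hi else None
```

For example, beams at 0 and 20 degrees, each 30 degrees wide, partially overlap over (5, 15) degrees, whereas narrow beams at 0 and 90 degrees do not overlap at all.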
- The first and/or second direction may have a spectral width, such that the spatial overlap may be a partial overlap.
- In some embodiments, the instructions may be pre-set instructions and said means for obtaining instructions may retrieve said pre-set instructions.
- Some embodiments further comprise means (such as a video or audio editor) for generating a user interface output related to said first and/or second spatial audio signals within said first region, wherein said means for obtaining instructions receives instructions in response to said user interface output. The user interface output may comprise a time-domain visualisation and/or a frequency-domain visualisation of the first and/or second spatial audio signals within said first region. The said visualisation(s) may be provided to highlight the first region. In one embodiment, the said user interface output may depict potential echoes of at least some of said first and/or second spatial audio signals. This embodiment may further comprise means for receiving a user indication of which of said potential echoes are related to the first and/or the second spatial audio signals.
- The means for identifying the first region may comprise one or more of: identifying if the first and second audio effects are identical and, if so, said first region is not identified; and identifying if the first and/or second spatial audio signals is/are below an audio threshold level (e.g. silent) and, if so, said first region is not identified.
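Combining an overlap test with the two exceptions above can be sketched as follows; the -60 dBFS threshold value is an illustrative assumption, not defined by the specification.

```python
def identify_first_region(overlap, effect_a, effect_b,
                          level_a_db, level_b_db, threshold_db=-60.0):
    """Identify the first region, applying the two exceptions
    described above: no region is identified when the two audio
    effects are identical, or when either signal is below the
    audio threshold level (e.g. silent). Levels are in dBFS."""
    if overlap is None:
        return None  # the directions do not overlap at all
    if effect_a == effect_b:
        return None  # identical effects: no clash to resolve
    if level_a_db < threshold_db or level_b_db < threshold_db:
        return None  # an effectively silent signal cannot clash
    return overlap
```

The returned interval (or None) would then drive the downstream processing of the first and/or second spatial audio signals.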
- The means for processing said first and/or second spatial audio signals may disable audio effects associated with said first and/or second spatial audio signals in the absence of user instructions to the contrary when said first region is identified.
- Some embodiments further comprise means for providing a preview output including said processed first and/or second spatial audio signals within said first region. Thus, a user may preview how the overlap region may be rendered in a particular configuration.
- The means for processing said first and/or second spatial audio signals within said first region may modify audio settings in said first region.
- Some embodiments further comprise means for setting said first and/or second audio effects. For example, a user input may be provided to enable the user to set the first and/or second audio effects. This may, for example, be implemented using a user interface.
- Some embodiments further comprise means (e.g. a mobile phone, such as a multi-microphone mobile phone, or a similar user device) for capturing the first and/or second spatial audio signals.
- Some embodiments further comprise uploading at least some audio content of said first region for further audio processing. The said further audio processing may include sound separation.
- The said means may comprise: at least one processor; and at least one memory including computer program code, the at least one memory and the computer program code configured, with the at least one processor, to cause the performance of the apparatus.
- In a second aspect, this specification describes a method comprising: receiving a first spatial audio signal and a second spatial audio signal (and optionally first and/or second spatial video signals), wherein the first and second spatial audio signals are focussed (e.g. using beamforming) in first and second directions respectively; applying a first audio effect to the first spatial audio signal and/or a second audio effect to the second spatial audio signal; identifying a first region, in which a spatial overlap exists between the first and second directions; obtaining instructions (e.g. user instructions) for processing said first and/or second spatial audio signals within said first region; and processing said first and/or second spatial audio signals within said first region in accordance with said instructions. Example audio effects include reverberation, modulations, filtering, chorus, flanger etc. The said spatial overlap may be a partial spatial overlap.
- The first direction may have a first spatial width and/or the second direction may have a second spatial width. The first and second spatial widths may be the same or may be different.
- Some embodiments further comprise generating a user interface output related to said first and/or second spatial audio signals within said first region, wherein obtaining said instructions comprises receiving instructions in response to said user interface output. The user interface output may comprise a time-domain visualisation and/or a frequency-domain visualisation of the first and/or second spatial audio signals within said first region. The said visualisation(s) may be provided to highlight the first region.
- Identifying the first region may comprise one or more of: identifying if the first and second audio effects are identical and, if so, said first region is not identified; and identifying if the first and/or second spatial audio signals is/are below an audio threshold level (e.g. silent) and, if so, said first region is not identified.
- Some embodiments further comprise providing a preview output including said processed first and/or second spatial audio signals within said first region. Thus, a user may preview how the overlap region may be rendered in a particular configuration.
- Some embodiments further comprise setting said first and/or second audio effects. For example, a user input may be provided to enable the user to set the first and/or second audio effects. This may, for example, be implemented using a user interface.
- Some embodiments further comprise capturing the first and/or second spatial audio signals.
- Some embodiments further comprise uploading at least some audio content of said first region for further audio processing (such as sound separation).
- In a third aspect, this specification describes any apparatus configured to perform any method as described with reference to the second aspect.
- In a fourth aspect, this specification describes computer-readable instructions which, when executed by computing apparatus, cause the computing apparatus to perform any method as described with reference to the second aspect.
- In a fifth aspect, this specification describes a computer program comprising instructions for causing an apparatus to perform at least the following: receive a first spatial audio signal and a second spatial audio signal (and optionally first and/or second spatial video signals), wherein the first and second spatial audio signals are focussed in first and second directions respectively; apply a first audio effect to the first spatial audio signal and/or a second audio effect to the second spatial audio signal; identify a first region, in which a spatial overlap exists between the first and second directions; obtain instructions for processing said first and/or second spatial audio signals within said first region; and process said first and/or second spatial audio signals within said first region in accordance with said instructions.
- In a sixth aspect, this specification describes a computer-readable medium (such as a non-transitory computer readable medium) comprising program instructions stored thereon for performing at least the following: receiving a first spatial audio signal and a second spatial audio signal (and optionally first and/or second spatial video signals), wherein the first and second spatial audio signals are focussed in first and second directions respectively; applying a first audio effect to the first spatial audio signal and/or a second audio effect to the second spatial audio signal; identifying a first region, in which a spatial overlap exists between the first and second directions; obtaining instructions for processing said first and/or second spatial audio signals within said first region; and processing said first and/or second spatial audio signals within said first region in accordance with said instructions.
- In a seventh aspect, this specification describes an apparatus comprising: at least one processor; and at least one memory including computer program code which, when executed by the at least one processor, causes the apparatus to: receive a first spatial audio signal and a second spatial audio signal (and optionally first and/or second spatial video signals), wherein the first and second spatial audio signals are focussed in first and second directions respectively; apply a first audio effect to the first spatial audio signal and/or a second audio effect to the second spatial audio signal; identify a first region, in which a spatial overlap exists between the first and second directions; obtain instructions for processing said first and/or second spatial audio signals within said first region; and process said first and/or second spatial audio signals within said first region in accordance with said instructions.
- In an eighth aspect, this specification describes an apparatus comprising: a first input for receiving a first spatial audio signal and a second spatial audio signal (and optionally first and/or second spatial video signals), wherein the first and second spatial audio signals are focussed (e.g. using beamforming) in first and second directions respectively; a first audio processing module for applying a first audio effect to the first spatial audio signal and/or a second audio effect to the second spatial audio signal; a spatial processing module for identifying a first region, in which a spatial overlap exists between the first and second directions; a first control module for obtaining instructions (e.g. user instructions) for processing said first and/or second spatial audio signals within said first region; and a second audio processing module (which may be the same module as the first audio processing module) for processing said first and/or second spatial audio signals within said first region in accordance with said instructions.
- Example embodiments will now be described, by way of non-limiting examples, with reference to the following schematic drawings:
-
FIG. 1 is a plan view of a system in which embodiments described herein may be used; -
FIG. 2 is a block diagram of a system in accordance with an example embodiment; -
FIG. 3 is a block diagram of a system in accordance with an example embodiment; -
FIG. 4 is a flow chart showing an example algorithm in accordance with an example embodiment; -
FIG. 5 is a block diagram of a system in accordance with an example embodiment; -
FIG. 6 is a flow chart showing an algorithm in accordance with an example embodiment; -
FIG. 7 is a flow chart showing an algorithm in accordance with an example embodiment; -
FIG. 8 shows an output in accordance with an example embodiment; -
FIG. 9 shows an output in accordance with an example embodiment; -
FIG. 10 is a block diagram of a system in accordance with an example embodiment; -
FIG. 11 is a flow chart showing an algorithm in accordance with an example embodiment; -
FIG. 12 shows a user interface in accordance with an example embodiment; -
FIG. 13 shows a user interface in accordance with an example embodiment; -
FIG. 14 is a flow chart showing an algorithm in accordance with an example embodiment; and -
FIG. 15 is a block diagram of a system in accordance with an example embodiment; -
FIG. 16 is a block diagram of components of a system in accordance with an example embodiment; and -
FIGS. 17A and 17B show tangible media, respectively a removable memory unit and a compact disc (CD) storing computer-readable code which when run by a computer perform operations according to example embodiments. - In the description, like reference numerals relate to like elements throughout.
-
FIG. 1 is a plan view of a system, indicated generally by the reference numeral 10, in which embodiments described herein may be used. The system 10 comprises a user device 12, such as a mobile phone. The user device 12 may include one or more microphones for capturing spatial audio signals. The user device 12 may also include one or more cameras for capturing images and/or any related data such as depth maps. - The spatial audio data captured by the
user device 12 may include audio content that includes at least some indication of sound directions. This may, for example, be by means of a parametric representation or otherwise perceivable by a listener. For example, an omnidirectional microphone may be used that captures audio from all directions around the user device 12. -
FIG. 2 is a block diagram of a system, indicated generally by the reference numeral 20, in accordance with an example embodiment. The system 20 comprises the user device 12 described above and additionally includes a first audio source 22. The first audio source 22 may, for example, be a person talking (although other audio sources are possible in example embodiments). - In the
system 20, the user device 12 receives a first spatial audio signal from the first audio source 22. The user device 12 may include a first audio focus arrangement 24, such as a beamforming arrangement, in the direction of the first audio source 22 in order to obtain, for example, a monophonic audio signal enhancing the specified direction. Such audio focussing may be provided, for example, to extract audio data relating to the first audio source 22, or to focus on the source 22 in the spatial audio field. The first spatial audio signal is focussed in a first direction (i.e. in the direction of the first audio source 22). Depending on how the first audio focus arrangement 24 is implemented, the beam can be narrow or wide. Furthermore, the beamforming may amplify the beamed direction only slightly, or may substantially amplify the beamed direction and, optionally, attenuate or cancel audio signals from other directions. -
FIG. 3 is a block diagram of a system, indicated generally by the reference numeral 30, in accordance with an example embodiment. The system 30 comprises the user device 12 and the first audio source 22 described above and additionally includes a second audio source 32. The first and second audio sources - In the
system 30, the user device 12 receives a first spatial audio signal from the first audio source 22 and a second spatial audio signal from the second audio source 32. As described above, the user device 12 includes a first audio focus arrangement 24, such as a first beamforming arrangement, in the direction of the first audio source 22 in order to obtain, for example, a monophonic audio signal enhancing the specified direction. Similarly, the user device 12 of the system 30 includes a second audio focus arrangement 34, such as a second beamforming arrangement, in the direction of the second audio source. - By way of example, two beamforming arrangements may be operated in parallel, one implementing the first
audio focus arrangement 24 and the other implementing the secondaudio focus arrangement 34. - In various embodiments, there may be a tracking employed to guide at least one audio focus or beamforming direction. For example, there may be utilized a visual tracking based on a camera input or any other suitable tracking, where an audio source, such as
audio sources -
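The tracking-guided steering described above might be sketched as follows; the two-dimensional coordinate convention and the function name are assumptions for illustration, not taken from the embodiment.

```python
import math

def azimuth_to_source(device_xy, source_xy):
    # Azimuth from the device to the tracked source position, in degrees,
    # with 0 degrees straight ahead (+y) and +90 degrees to the right.
    # A camera tracker could feed source_xy each frame to re-steer a beam.
    dx = source_xy[0] - device_xy[0]
    dy = source_xy[1] - device_xy[1]
    return math.degrees(math.atan2(dx, dy))
```

As the tracked source moves, the returned azimuth would be handed to the audio focus arrangement as its new beam direction.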
FIG. 4 is a flow chart, indicated generally by the reference numeral 40, showing an example algorithm.
- The algorithm 40 starts at operation 41, where the first and second audio focus arrangements 24, 34 are used by the user device 12 to receive a first spatial audio signal focussed in the direction of the first audio source 22 and a second spatial audio signal focussed in the direction of the second audio source 32. Thus, the operation 41 may implement suitable beamforming arrangements. The first and second audio focus arrangements 24, 34 may be operated in parallel.
- At
operation 42, a user selects a first audio effect to be applied to the first spatial audio signal and/or a second audio effect to be applied to the second spatial audio signal. Example audio effects include reverberations, modulations, filtering etc.
- At operation 43, the system 30 processes the spatial audio signals (e.g. audio objects) according to the effects selected in the operation 42. Thus, the first and/or second spatial audio signals captured by the user device 12 can be modified according to the effects selected in the operation 42.
- The algorithm 40 can be applied to the system 30 in order to process the first and second audio sources separately. Indeed, entirely different effects may be selected in operation 42 and applied to the first and second audio sources in operation 43. This is relatively simple to implement in the event that the separation between the audio sources is much greater than the width of the audio focusing arrangements 24, 34.
FIG. 5 is a block diagram of a system, indicated generally by the reference numeral 50, in accordance with an example embodiment. The system 50 comprises the user device 12, the first audio source 22 and the second audio source 32 described above. The user device 12 includes the first audio focus arrangement 24, in the direction of the first audio source 22, and the second audio focus arrangement 34, in the direction of the second audio source 32, described above.
- The first and second audio sources 22, 32 are closer together in the system 50 than in the system 30. Accordingly, the first and second audio focus arrangements 24, 34 are closer together in the system 50 than in the system 30; indeed, in the system 50, the first and second audio focus arrangements 24, 34 overlap. When the spatial extents (or widths) of the audio focus arrangements 24, 34 overlap, it may be difficult to process the audio sources separately.
- When applying the algorithm 40 to the system 50, it may be difficult to implement the operation 43, since it may be difficult for the user device 12 to distinguish between the first and second audio sources 22, 32.
FIG. 6 is a flow chart showing an algorithm, indicated generally by the reference numeral 60, in accordance with an example embodiment. The algorithm 60 may be implemented using the systems 30 and 50 described above.
- The algorithm 60 starts at operation 41, where, as discussed above, the audio focus arrangements 24, 34 are used by the user device 12 to receive a first spatial audio signal focussed in the direction of the first audio source 22 (a "first direction") and a second spatial audio signal focussed in the direction of the second audio source 32 (a "second direction"). The operation 41 may be implemented at the user device 12 by providing means for receiving the first spatial audio signal and the second spatial audio signal at the user device. As noted above, the first and second audio focus arrangements 24, 34 may be operated in parallel.
- At
operation 42, a user selects a first audio effect to be applied to the first spatial audio signal and/or a second audio effect to be applied to the second spatial audio signal. As indicated above, example audio effects include reverberations, modulations, filtering etc.
- The algorithm 60 then moves to operation 61, where it is determined whether a first region is identified, in which a spatial overlap exists between the first and second directions. For example, in the system 50, a substantial overlap exists, whereas in the system 30, no significant overlap exists. If an overlap is identified, the algorithm 60 moves to operation 62; otherwise, the algorithm 60 moves to operation 63. It should be noted that the extent of the overlap necessary for an overlap to be identified may vary in different embodiments (and may be user-definable). Moreover, as noted above, the extent to which an overlap exists may be dependent on the spatial widths of the respective audio focus arrangements.
- At operation 62, overlap regions are processed (as discussed further below), before the algorithm 60 moves to operation 63.
- At operation 63, the audio objects are processed. The operation 63 may, for example, be similar to (or identical to) the operation 43 discussed above. In the operation 63, the first and/or second spatial audio signals within said first region may be processed to modify audio settings in said first region. By way of example, the operation 63 may disable the audio effects associated with said first and/or second spatial audio signals in the absence of instructions (e.g. user instructions) to the contrary when said first region is identified (i.e. when an overlap is detected).
- At operation 64, a preview output including said processed first and/or second spatial audio signals within said first region may be provided to a user. In this way, a user may preview how the overlap region may be rendered in a particular configuration. This may, for example, enable a user (such as an editor) to trial a number of different approaches to dealing with overlap conditions.
- The
algorithm 60 can therefore be used to identify areas of overlap between the spatial positions of audio sources (or between focus arrangements relating to said audio sources). This may be detected, for example, by calculating the absolute difference between directions-of-arrival (DOA) of audio signals, differences in azimuth and/or elevation angle, differences in the directions of audio focussing arrangements etc.
- In some circumstances, a problem due to spatial overlapping of the form described above may not arise, regardless of whether an overlap exists. For example, in the event that the audio effects applied to potentially overlapping audio sources are identical, then the presence of an overlap may not present a problem. Alternatively, or in addition, in the event that one or more of the potentially overlapping audio sources have audio levels below an audio threshold level (e.g. are silent), then the presence of an overlap may not present a problem. In such circumstances, the operation 61 may proceed to the operation 63 regardless of whether an overlap is detected. Thus, the operation 61 may determine whether a relevant overlap exists.
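A determination of the kind made in operation 61 might be sketched as below, combining the DOA-difference test with the relevance checks (identical effects, sources below a threshold level). The beam model, the silence threshold and all names are illustrative assumptions rather than details of the embodiment.

```python
def angular_difference(a_deg, b_deg):
    # Smallest absolute azimuth difference, handling wrap-around at 360.
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)

def beams_overlap(doa1, width1, doa2, width2):
    # Each beam is modelled as doa +/- width/2; the beams overlap when
    # the centres are closer than the sum of the half-widths.
    return angular_difference(doa1, doa2) < (width1 + width2) / 2.0

def relevant_overlap(doa1, width1, effect1, level1_db,
                     doa2, width2, effect2, level2_db,
                     silence_db=-60.0):
    # A spatial overlap only needs special handling when the applied
    # effects differ and both sources are audible.
    if not beams_overlap(doa1, width1, doa2, width2):
        return False
    if effect1 == effect2:
        return False
    if level1_db < silence_db or level2_db < silence_db:
        return False
    return True
```

The spatial-width parameters correspond to the observation above that the extent of overlap depends on the widths of the respective audio focus arrangements, and the threshold could be made user-definable.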
FIG. 7 is a flow chart showing an algorithm, indicated generally by the reference numeral 70, in accordance with an example embodiment. The algorithm 70 is an example implementation of the operation 62 of the algorithm 60 described above. Other example implementations of the operation 62 are possible. For example, instructions for processing audio signals within an overlap region may be pre-set. In an example embodiment, a first effect may overtake a second effect. For example, if a first audio object to which a chorus effect is applied overlaps with a second audio object to which a reverb effect is applied, then a rule may be applied to modify the reverb effect of the second audio object. Different effects may be provided in a prioritized order in user settings, in device settings, or learnt by the system 50. For example, the system may learn, over the course of time, the changes a user makes in overlapping situations, and propose to apply the change the user has most often selected when an overlap has occurred. The system may also change the effect of the first and/or second audio object according to the learnt behaviour without asking for the user's confirmation. Different ways to handle the overlapping situation exist, as explained earlier. For example, one or more of the effects may be removed, one or more of the effects may be changed, or one or more effects may be applied to one or more audio objects. In the last option, a surprise random or semi-random effect may be applied when audio objects with different effects overlap (e.g. when a certain first effect and a certain second effect overlap, a certain third effect may be applied to both). This could indicate to the user that the audio signals are overlapping, and the user may decide to further edit the audio objects and their effects.
- The algorithm 70 starts at operation 71, where a user interface output is generated related to said first and/or second spatial audio signals within said overlap region. The user interface may, for example, be an audio editor (or a video and audio editor). Example user interface outputs are described below.
- At operation 72, an indication is given, for example by a user, indicating how the overlap (e.g. the overlap identified in the operation 61) is to be handled. For example, the operation 72 may involve receiving user instructions in response to said user interface output.
- Finally, at operation 73, the identified overlap is processed in accordance with the user indication provided in the operation 72. For example, the first and/or second spatial audio signals within the overlap region may be processed in accordance with said user instructions.
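The pre-set and learnt rules discussed in relation to FIG. 7 might be sketched as a resolver that prefers the user's most frequent past choice and otherwise falls back to a priority list. The class, the effect names and the learning scheme are assumptions for illustration, not the embodiment's method.

```python
from collections import Counter

class OverlapEffectResolver:
    def __init__(self, priority):
        self.priority = list(priority)  # pre-set order, e.g. from settings
        self.history = Counter()        # user choices in past overlaps

    def record_user_choice(self, effect):
        # Learn which effect the user keeps when overlaps occur.
        self.history[effect] += 1

    def resolve(self, effect1, effect2):
        # Prefer the most frequently chosen effect if it applies here,
        # otherwise the first matching entry of the pre-set priority.
        for effect, _count in self.history.most_common():
            if effect in (effect1, effect2):
                return effect
        for effect in self.priority:
            if effect in (effect1, effect2):
                return effect
        return effect1  # no rule matched: keep the first effect
```

Applying the resolved effect automatically, without asking for confirmation, would correspond to the learnt-behaviour variant described above.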
FIG. 8 shows an output, indicated generally by the reference numeral 80, in accordance with an example embodiment. The output 80 is an example implementation of the operation 72 described above and shows a time-domain visualisation of a first audio signal 81 and a second audio signal 82. For example, the time-domain visualisation may correspond to a video (or audio and video) editing software timeline, such that it provides a linear view of the audio or audio-visual content across a device screen or a software window. The visualisation may be provided to highlight the first region. The output 80 is therefore an example user interface output.
- The time-domain visualisation of the first audio signal 81 includes three parts, labelled 81a, 81b and 81c respectively. The three parts of the first audio signal (which are provided at different time periods) are provided at different spatial positions. Similarly, the time-domain visualisation of the second audio signal 82 includes three parts, labelled 82a, 82b and 82c respectively. Again, the three parts of the second audio signal (which are provided at different time periods) are provided at different spatial positions.
- The output 80 includes a visualisation of an overlap period 84, during which the first and second audio signals are at the same spatial position. The output 80 is therefore an example of a time-domain user interface that can be used to highlight, to a user (such as an editor), a region in which a spatial overlap exists between the first and second audio signals. Of course, many alternative time-domain visualisations are possible.
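An overlap period such as the period 84 might be located with a sketch like the following, where each signal's timeline is a list of (start, end, azimuth) segments. The segment representation and the angular tolerance are assumptions for illustration.

```python
def spatial_overlap_periods(track1, track2, angle_tol_deg=10.0):
    # Return (start, end) intervals during which both sources occupy
    # nearly the same spatial position at the same time, i.e. the
    # intervals a timeline view would highlight.
    periods = []
    for s1, e1, a1 in track1:
        for s2, e2, a2 in track2:
            start, end = max(s1, s2), min(e1, e2)
            if start < end and abs(a1 - a2) <= angle_tol_deg:
                periods.append((start, end))
    return sorted(periods)
```

Each returned interval could be drawn as a highlighted band across both signal lanes of the timeline.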
FIG. 9 shows an output, indicated generally by the reference numeral 90, in accordance with an example embodiment. The output 90 is an example implementation of the operation 72 described above and shows a frequency-domain visualisation of a first audio signal 91 and a second audio signal 92. The visualisation may be provided to highlight the first region. The output 90 is therefore an example user interface output.
- The first audio signal 91 has generally lower frequency components and the second audio signal 92 has generally higher frequency components. A frequency 94 (which may be user defined) indicates a frequency at which a filter may be provided to separate the first and second audio signals. Such filtering is an example of processing that may be performed in the operations 62 and 63 described above.
FIG. 10 is a block diagram of a system, indicated generally by the reference numeral 100, in accordance with an example embodiment. The system 100 comprises the user device 12, the first audio source 22 and the second audio source 32 described above. As described above, the user device 12 receives a first spatial audio signal from the first audio source 22 and a second spatial audio signal from the second audio source 32, with the user device 12 including the first audio focus arrangement 24 and the second audio focus arrangement 34 described above.
- In the system 100, the user device 12 and the first and second audio sources 22, 32 are within a space 102, such that echoes occur. By way of example, a first echo 104 may be provided between the first audio source 22 and the user device 12, and a second echo 105 may be provided between the second audio source 32 and the user device 12. Of course, the system 100 is a simple example; more complicated systems, including multiple echoes for each of multiple audio sources, are possible.
FIG. 11 is a flow chart showing an algorithm, indicated generally by the reference numeral 110, in accordance with an example embodiment. The algorithm 110 may be implemented using the system 100. The algorithm 110 may be implemented as part of the operation 62 of the algorithm 60 described above, although this is not essential to all embodiments.
- The algorithm 110 starts at operation 111, where echoes (as received at the user device 12) are determined. Next, at operation 112, a visualisation of potential echoes of at least some of the first and second spatial audio signals (originating at the first and second audio sources respectively) is presented to a user (such as an editor).
- At operation 113, a user identifies echoes, for example by using a user interface to indicate which echo relates to which audio source. Thus, a user indication of which of said potential echoes are related to the first and/or the second spatial audio signals is received.
- Finally, on the basis of the information provided in the operation 113, the beamforming arrangement is enhanced accordingly at operation 114 of the algorithm 110.
FIG. 12 shows a user interface, indicated generally by the reference numeral 120, in accordance with an example embodiment. The user interface 120 provides an example visualisation of echoes and is therefore an example implementation of the operation 112 of the algorithm 110.
- The user interface 120 shows a representation of the first audio source 22 and the second audio source 32 described above. The user interface 120 also shows first to sixth echoes (labelled 121 to 126 respectively) that could be related to either the first audio source 22 or the second audio source 32. By way of example, the user device 12 may not be able to determine which echoes relate to which sound source (particularly if the sound sources are spatially close, e.g. overlapping, and the separation of the original audio source signals is therefore difficult).
- The user interface 120 may enable a user (such as an editor) to listen to individual echoes 121 to 126 and to indicate which sound source some or all of the echoes relate to (thereby implementing the operation 113 described above).
- By way of example, FIG. 13 shows a user interface, indicated generally by the reference numeral 130, in accordance with an example embodiment. The user interface 130 shows a representation of the first audio source 22 and the second audio source 32 described above. The user interface 130 also shows a first echo 122a, a second echo 122b and a third echo 122c that relate to the first audio source 22, a fourth echo 132a that relates to the second audio source 32, and a fifth echo 134 and a sixth echo 135 that have not (or, perhaps, have not yet) been assigned to either the first or the second audio source.
- The operation 114 may be implemented using an acoustic rake receiver beamformer. A user (such as an editor) may be able to preview how the audio sources sound when the acoustic rake receiver is applied to enhance an output using the selected echoes. The user may also be able to preview an audio output following rake receiver beamforming, following the application of the selected effects (as discussed above). In an example embodiment, the addition of rake receiver beamformed echoes to a sound also affects the sound, so that an effect is modified; for example, a sound may become more reverberant as more echoes are added to the sound.
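An acoustic rake receiver exploits echoes rather than suppressing them. A toy sketch of the combining step (the delays, gains and names are assumptions; a real rake beamformer operates on multichannel room responses) aligns user-assigned echoes with the direct path so that their energy reinforces the source:

```python
def rake_combine(received, echo_delays, echo_gains):
    # Shift each assigned echo back onto the direct path and add it,
    # so echo energy strengthens (and colours) the enhanced output.
    n = len(received)
    out = list(received)
    for delay, gain in zip(echo_delays, echo_gains):
        for i in range(n):
            j = i + delay
            if 0 <= j < n:
                out[i] += gain * received[j]
    return out
```

Adding more echo taps raises the combined peak, which mirrors the remark above that adding rake-beamformed echoes also makes the sound more reverberant.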
FIG. 14 is a flow chart showing an algorithm, indicated generally by the reference numeral 140, in accordance with an example embodiment. The algorithm 140 is an example implementation of the operation 62 of the algorithm 60 described above.
- The algorithm 140 starts at operation 141, where a determination is made (e.g. in the form of a user input) that the audio objects should be uploaded to a remote server (or elsewhere) for further processing. Then, at operation 142, the audio objects are uploaded to a server, which processes the data in some way. In this way, an overlap region can be uploaded for further audio processing. By way of example, the server may run a separation method to separate the audio signals within the overlap region. The algorithm 140 may be useful, for example, in the event that the user device 12 has insufficient resources (e.g. insufficient processing power) to perform the required processing, such as an audio separation method. By way of example, a process in the server may process the audio signals to apply effects to the overlapping region, or to modify the effect of at least one of the audio signals within the overlapping region. The process in the server may, for example, identify the audio objects and their audio signals and identify the effects applied to them. In the event that the audio signals overlap, the process in the server may modify the effects of the audio signals as described, for example, in relation to FIG. 7. The process in the server may then create a modified audio stream and store it at the server and/or provide it to user devices. By way of example, audio signals could be recorded and uploaded to a server, which applies effects to the audio signals and sends modified audio signals to an editor process. A user operating audio software on a device could be arranged to have a server inspect for overlapping audio effects and modify the recorded audio signals, and the user device could be arranged to receive the modified audio signals, which the user can then further edit if needed.
FIG. 15 is a block diagram of a system, indicated generally by the reference numeral 150, in accordance with an example embodiment. The system 150 comprises an apparatus 151, which may, for example, form part of the user device 12 described above. In any particular implementation, some of the elements of the system 150 may be omitted and/or some other elements added.
- The apparatus 151 comprises an input module 152, a controller 153, an effects module 154, an overlap identifier module 155, a user interface controller 156, and a processor 157. As shown in FIG. 15, the input module 152 receives one or more audio and/or visual inputs, the user interface controller 156 receives user input, and the processor 157 is in two-way communication with one or more external modules 158.
- The inputs received at the input module 152 may, for example, be the audio signals received from the first and second audio sources 22, 32 described above. Under the control of the controller 153, audio effects may be applied to the audio sources (by the effects module 154), spatial overlaps between the first and second directions may be identified by the overlap identifier module 155, and a user interface related to the first and/or second audio signals may be provided by the user interface controller 156. A user/editor may provide user instructions to the user interface controller 156. The processor 157 may then process the first and/or second spatial audio signals within said first region in accordance with said user instructions. The processor 157 may also communicate with the external module(s) 158, for example to implement the algorithm 140 described above.
- Of course, the system 150 is highly schematic and is provided by way of example only. Many alternatives to the configuration of the system 150 will be apparent to those skilled in the art.
- For completeness, FIG. 16 is a schematic diagram of components of one or more of the example embodiments described previously, which are hereafter referred to generically as processing systems 300. A processing system 300 may have a processor 302, a memory 304 closely coupled to the processor and comprised of a RAM 314 and a ROM 312, and, optionally, a user input 310 and a display 318. The processing system 300 may comprise one or more network/apparatus interfaces 308 for connection to a network/apparatus, e.g. a modem, which may be wired or wireless. The interface 308 may also operate as a connection to other apparatus, such as a device/apparatus which is not network-side apparatus. Thus, direct connection between devices/apparatus without network participation is possible.
- The
processor 302 is connected to each of the other components in order to control operation thereof.
- The memory 304 may comprise a non-volatile memory, such as a hard disk drive (HDD) or a solid-state drive (SSD). The ROM 312 of the memory 304 stores, amongst other things, an operating system 315 and may store software applications 316. The RAM 314 of the memory 304 is used by the processor 302 for the temporary storage of data. The operating system 315 may contain code which, when executed by the processor, implements aspects of the algorithms 40, 60, 70, 110 and 140 described above.
- The processor 302 may take any suitable form. For instance, it may be a microcontroller, a plurality of microcontrollers, a processor, or a plurality of processors.
- The processing system 300 may be a standalone computer, a server, a console, or a network thereof. The processing system 300 and any needed structural parts may all be inside a device/apparatus, such as an IoT device/apparatus, i.e. embedded in a very small size.
- In some example embodiments, the processing system 300 may also be associated with external software applications. These may be applications stored on a remote server device/apparatus and may run partly or exclusively on the remote server device/apparatus. These applications may be termed cloud-hosted applications. The processing system 300 may be in communication with the remote server device/apparatus in order to utilize the software application stored there.
- FIGS. 17A and 17B show tangible media, respectively a removable memory unit 365 and a compact disc (CD) 368, storing computer-readable code which, when run by a computer, may perform methods according to the example embodiments described above. The removable memory unit 365 may be a memory stick, e.g. a USB memory stick, having internal memory 366 storing the computer-readable code. The memory 366 may be accessed by a computer system via a connector 367. The CD 368 may be a CD-ROM or a DVD or similar. Other forms of tangible storage media may be used. Tangible media can be any device/apparatus capable of storing data/information, which data/information can be exchanged between devices/apparatus/networks.
- Embodiments of the present invention may be implemented in software, hardware, application logic or a combination of software, hardware and application logic. The software, application logic and/or hardware may reside on memory, or any computer media. In an example embodiment, the application logic, software or an instruction set is maintained on any one of various conventional computer-readable media. In the context of this document, a "memory" or "computer-readable medium" may be any non-transitory media or means that can contain, store, communicate, propagate or transport the instructions for use by or in connection with an instruction execution system, apparatus, or device, such as a computer.
- Reference to, where relevant, "computer-readable storage medium", "computer program product", "tangibly embodied computer program" etc., or a "processor" or "processing circuitry" etc. should be understood to encompass not only computers having differing architectures, such as single/multi-processor architectures and sequencers/parallel architectures, but also specialised circuits such as field programmable gate arrays (FPGA), application specific integrated circuits (ASIC), signal processing devices/apparatus and other devices/apparatus. References to computer program, instructions, code etc. should be understood to encompass software for a programmable processor, or firmware such as the programmable content of a hardware device/apparatus, whether instructions for a processor, or configured or configuration settings for a fixed function device/apparatus, gate array, programmable logic device/apparatus, etc.
- As used in this application, the term "circuitry" refers to all of the following: (a) hardware-only circuit implementations (such as implementations in only analogue and/or digital circuitry), (b) combinations of circuits and software (and/or firmware), such as (as applicable): (i) a combination of processor(s) or (ii) portions of processor(s)/software (including digital signal processor(s)), software, and memory(ies) that work together to cause an apparatus, such as a server, to perform various functions, and (c) circuits, such as a microprocessor(s) or a portion of a microprocessor(s), that require software or firmware for operation, even if the software or firmware is not physically present.
- If desired, the different functions discussed herein may be performed in a different order and/or concurrently with each other. Furthermore, if desired, one or more of the above-described functions may be optional or may be combined. Similarly, it will also be appreciated that the flow charts of Figures 4, 6, 7, 11 and 14 are examples only and that various operations depicted therein may be omitted, reordered and/or combined.
- It will be appreciated that the above described example embodiments are purely illustrative and are not limiting on the scope of the invention. Other variations and modifications will be apparent to persons skilled in the art upon reading the present specification.
- Moreover, the disclosure of the present application should be understood to include any novel features or any novel combination of features either explicitly or implicitly disclosed herein or any generalization thereof and during the prosecution of the present application or of any application derived therefrom, new claims may be formulated to cover any such features and/or combination of such features.
Claims (15)
- An apparatus comprising:
means for receiving a first spatial audio signal and a second spatial audio signal, wherein the first and second spatial audio signals are focussed in first and second directions respectively;
means for applying a first audio effect to the first spatial audio signal and/or a second audio effect to the second spatial audio signal;
means for identifying a first region, in which a spatial overlap exists between the first and second directions;
means for obtaining instructions for processing said first and/or second spatial audio signals within said first region; and
means for processing said first and/or second spatial audio signals within said first region in accordance with said instructions.
- An apparatus as claimed in claim 1, wherein the first direction has a first spatial width and/or the second direction has a second spatial width.
- An apparatus as claimed in claim 1 or claim 2, wherein said instructions are pre-set instructions and wherein said means for obtaining instructions retrieves said pre-set instructions.
- An apparatus as claimed in any one of claims 1 to 3, further comprising means for generating a user interface output related to said first and/or second spatial audio signals within said first region, wherein said means for obtaining instructions receives instructions in response to said user interface output.
- An apparatus as claimed in claim 4, wherein the user interface output comprises a time-domain visualisation and/or a frequency-domain visualisation of the first and/or second spatial audio signals within said first region.
- An apparatus as claimed in claim 4 or claim 5, wherein said user interface output depicts potential echoes of at least some of said first and/or second spatial audio signals.
- An apparatus as claimed in claim 6, further comprising means for receiving a user indication of which of said potential echoes are related to the first and/or the second spatial audio signals.
- An apparatus as claimed in any one of the preceding claims, wherein the means for identifying the first region comprises one or more of:
identifying if the first and second audio effects are identical and, if so, said first region is not identified; and
identifying if the first and/or second spatial audio signals is/are below an audio threshold level and, if so, said first region is not identified.
- An apparatus as claimed in any one of the preceding claims, wherein the means for processing said first and/or second spatial audio signals disables audio effects associated with said first and/or second spatial audio signals in the absence of user instructions to the contrary when said first region is identified.
- An apparatus as claimed in any one of the preceding claims, further comprising means for providing a preview output including said processed first and/or second spatial audio signals within said first region.
- An apparatus as claimed in any one of the preceding claims, wherein said means for processing said first and/or second spatial audio signals within said first region modifies audio settings in said first region.
- An apparatus as claimed in any one of the preceding claims, further comprising means for uploading at least some audio content of said first region for further audio processing.
- An apparatus as claimed in any one of the preceding claims, wherein the means comprise:
at least one processor; and
at least one memory including computer program code, the at least one memory and the computer program code configured, with the at least one processor, to cause the performance of the apparatus.
- A method comprising:
receiving a first spatial audio signal and a second spatial audio signal, wherein the first and second spatial audio signals are focussed in first and second directions respectively;
applying a first audio effect to the first spatial audio signal and/or a second audio effect to the second spatial audio signal;
identifying a first region, in which a spatial overlap exists between the first and second directions;
obtaining instructions for processing said first and/or second spatial audio signals within said first region; and
processing said first and/or second spatial audio signals within said first region in accordance with said instructions.
- A computer readable medium comprising program instructions stored thereon for performing at least the following:
receiving a first spatial audio signal and a second spatial audio signal, wherein the first and second spatial audio signals are focussed in first and second directions respectively;
applying a first audio effect to the first spatial audio signal and/or a second audio effect to the second spatial audio signal;
identifying a first region, in which a spatial overlap exists between the first and second directions;
obtaining instructions for processing said first and/or second spatial audio signals within said first region; and
processing said first and/or second spatial audio signals within said first region in accordance with said instructions.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP19160886.8A EP3706432A1 (en) | 2019-03-05 | 2019-03-05 | Processing multiple spatial audio signals which have a spatial overlap |
Publications (1)
Publication Number | Publication Date |
---|---|
EP3706432A1 true EP3706432A1 (en) | 2020-09-09 |
Family
ID=65724172
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP19160886.8A Withdrawn EP3706432A1 (en) | 2019-03-05 | 2019-03-05 | Processing multiple spatial audio signals which have a spatial overlap |
Country Status (1)
Country | Link |
---|---|
EP (1) | EP3706432A1 (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP2030660A1 (en) * | 2006-06-16 | 2009-03-04 | Konami Digital Entertainment Co., Ltd. | Game sound output device, game sound control method, information recording medium, and program |
JP6182169B2 (en) * | 2015-01-15 | 2017-08-16 | 日本電信電話株式会社 | Sound collecting apparatus, method and program thereof |
EP3312718A1 (en) * | 2016-10-20 | 2018-04-25 | Nokia Technologies OY | Changing spatial audio fields |
- 2019-03-05 EP EP19160886.8A patent/EP3706432A1/en not_active Withdrawn
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106157986B (en) | Information processing method and device and electronic equipment | |
JP4449987B2 (en) | Audio processing apparatus, audio processing method and program | |
EP3189521B1 (en) | Method and apparatus for enhancing sound sources | |
US20140192997A1 (en) | Sound Collection Method And Electronic Device | |
US20140241702A1 (en) | Dynamic audio perspective change during video playback | |
US11521591B2 (en) | Apparatus and method for processing volumetric audio | |
RU2018119087A (en) | DEVICE AND METHOD FOR FORMING A FILTERED AUDIO SIGNAL REALIZING AN ANGLE RENDERIZATION | |
RU2020116581A (en) | PROGRAM, METHOD AND DEVICE FOR SIGNAL PROCESSING | |
US11631422B2 (en) | Methods, apparatuses and computer programs relating to spatial audio | |
US20190289418A1 (en) | Method and apparatus for reproducing audio signal based on movement of user in virtual space | |
US11792512B2 (en) | Panoramas | |
CN108781310A (en) | The audio stream for the video to be enhanced is selected using the image of video | |
JP6329679B1 (en) | Audio controller, ultrasonic speaker, audio system, and program | |
JP6742216B2 (en) | Sound processing system, sound processing method, program | |
EP3706432A1 (en) | Processing multiple spatial audio signals which have a spatial overlap | |
US20210227150A1 (en) | Multi-camera device | |
US11290812B2 (en) | Audio data arrangement | |
US11937071B2 (en) | Augmented reality system | |
US11979732B2 (en) | Generating audio output signals | |
EP3029671A1 (en) | Method and apparatus for enhancing sound sources | |
WO2021209683A1 (en) | Audio processing |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PUAI | Public reference made under article 153(3) EPC to a published international application that has entered the European phase | Free format text: ORIGINAL CODE: 0009012 |
STAA | Information on the status of an EP patent application or granted EP patent | Free format text: STATUS: THE APPLICATION HAS BEEN PUBLISHED |
AK | Designated contracting states | Kind code of ref document: A1; Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
AX | Request for extension of the European patent | Extension state: BA ME |
STAA | Information on the status of an EP patent application or granted EP patent | Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN |
18D | Application deemed to be withdrawn | Effective date: 20210310 |