EP3249956A1 - Control of audio rendering - Google Patents

Control of audio rendering

Info

Publication number
EP3249956A1
Authority
EP
European Patent Office
Prior art keywords
sound object
time
sound
mode
microphone
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP16171383.9A
Other languages
German (de)
English (en)
Inventor
Antti Johannes Eronen
Arto Juhani Lehtiniemi
Jussi Artturi LEPPÄNEN
Francesco Cricri
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nokia Technologies Oy
Original Assignee
Nokia Technologies Oy
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nokia Technologies Oy filed Critical Nokia Technologies Oy
Priority to EP16171383.9A
Publication of EP3249956A1
Legal status: Withdrawn

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S7/00 Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30 Control circuits for electronic adaptation of the sound field
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00 Circuits for transducers, loudspeakers or microphones
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S2400/00 Details of stereophonic systems covered by H04S but not provided for in its groups
    • H04S2400/11 Positioning of individual sound objects, e.g. moving airplane, within a sound field

Definitions

  • Embodiments of the present invention relate to control of audio rendering.
  • In particular, they relate to control of audio rendering of a sound scene comprising at least one sound object.
  • A 'sound scene' in this document refers to the arrangement of one or more sound sources in a three-dimensional space.
  • If a sound source changes position, the sound scene changes.
  • If a sound source changes its audio properties, such as its audio output, then the sound scene changes.
  • A sound scene may be defined in relation to recording sounds (a recorded sound scene) and in relation to rendering sounds (a rendered sound scene).
  • Some current technology focuses on accurately reproducing a recorded sound scene as a rendered sound scene either in real time or at a distance in time and/or space from the recorded sound scene.
  • the recorded sound scene is encoded for storage and/or transmission and/or rendering.
  • a sound object within a sound scene may be a source sound object that represents a sound source within the sound scene or may be a recorded sound object which represents sounds recorded at a particular microphone.
  • reference to a sound object refers to both a recorded sound object and a source sound object.
  • the sound object(s) may be only source sound objects and in other examples the sound object(s) may be only recorded sound objects.
  • Some microphones such as Lavalier microphones, or other portable microphones, may be attached to or may follow a sound source in the sound scene.
  • Other microphones may be static in the sound scene.
  • a method comprising: detecting a change in position of rendering a sound object from a first position at a first time to a second position, different to the first position, at a second time immediately after the first time; and at the second time, generating a visual distraction.
  • a computer program that, when run on a processor, causes performance of: detecting a change in position of rendering a sound object from a first position at a first time to a second position, different to the first position, at a second time immediately after the first time; and at the second time, generating a visual distraction.
  • a system or apparatus comprising means for performing: detecting a change in position of rendering a sound object from a first position at a first time to a second position, different to the first position, at a second time immediately after the first time; and at the second time, generating a visual distraction.
  • an apparatus comprising: at least one processor; and at least one memory including computer program code, the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus at least to perform: detecting a change in position of rendering a sound object from a first position at a first time to a second position, different to the first position, at a second time immediately after the first time; and at the second time, generating a visual distraction.
  • Fig. 1 illustrates an example of a system 100 and also an example of a method 200.
  • the system 100 and method 200 record a sound scene 10 and process the recorded sound scene to enable an accurate rendering of the recorded sound scene as a rendered sound scene for a listener at a particular position (the origin) within the recorded sound scene 10.
  • the system 100 comprises one or more portable microphones 110 and may comprise one or more static microphones 120.
  • the origin of the sound scene is at a microphone.
  • the microphone at the origin is a static microphone 120. It may record one or more channels, for example it may be a microphone array.
  • In this example, only a single static microphone 120 is illustrated. However, in other examples multiple static microphones 120 may be used independently. In such circumstances the origin may be at any one of these static microphones 120 and it may be desirable, in some circumstances, to switch the origin between static microphones 120 or to position the origin at an arbitrary position within the sound scene.
  • the system 100 comprises one or more portable microphones 110.
  • the portable microphone 110 may, for example, move with a sound source within the recorded sound scene 10.
  • the portable microphone may, for example, be an 'up-close' microphone that remains close to a sound source. This may be achieved, for example, using a boom microphone or, for example, by attaching the microphone to the sound source, for example, by using a Lavalier microphone.
  • the portable microphone 110 may record one or more recording channels.
  • Fig. 2 schematically illustrates the relative positions of the portable microphone (PM) 110 and the static microphone (SM) 120 (if present) relative to an arbitrary reference point (REF).
  • the position of the static microphone 120 relative to the reference point REF is represented by the vector x .
  • the position of the portable microphone PM relative to the reference point REF is represented by the vector y .
  • the relative position of the portable microphone PM 110 from the static microphone SM 120 is represented by the vector z .
  • z = y − x .
  • the vector z gives the relative position of the portable microphone 110 relative to the static microphone 120 which, in this example, is the origin of the sound scene 10.
  • the vector z therefore positions the portable microphone 110 relative to a notional listener of the recorded sound scene 10.
  • the vector x is constant. Therefore, if one has knowledge of x and tracks variations in y , it is possible to also track variations in z , the relative position of the portable microphone 110 relative to the origin of the sound scene 10.
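The vector relationships above can be sketched directly in code. A minimal illustration (the function and variable names are ours, not the patent's); all positions are 3-D vectors relative to the reference point REF:

```python
def relative_position(y, x):
    """Position z of the portable microphone PM relative to the static
    microphone SM (the origin of the sound scene): z = y - x."""
    return tuple(yc - xc for yc, xc in zip(y, x))

x = (1.0, 0.0, 0.0)          # static microphone SM: constant
y = (3.0, 4.0, 0.0)          # portable microphone PM: tracked over time
z = relative_position(y, x)  # position of PM relative to the origin
```

Because x is constant, tracking variations in y is enough to track variations in z.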
  • the sound scene 10 as recorded is rendered to a user (listener) by the system 100 in Fig. 1 , it is rendered to the listener as if the listener is positioned at the origin of the recorded sound scene 10. It is therefore important that, as the portable microphone 110 moves in the recorded sound scene 10, its position z relative to the origin of the recorded sound scene 10 is tracked and is correctly represented in the rendered sound scene.
  • the system 100 is configured to achieve this.
  • the audio signals 122 output from the static microphone 120 are coded by audio coder 130 into a multichannel audio signal 132. If multiple static microphones were present, the output of each would be separately coded by an audio coder into a multichannel audio signal.
  • the audio coder 130 may be a spatial audio coder such that the multichannels 132 represent the sound scene 10 as recorded by the static microphone 120 and can be rendered giving a spatial audio effect.
  • the audio coder 130 may be configured to produce multichannel audio signals 132 according to a defined standard such as, for example, binaural coding, 5.1 surround sound coding, 7.1 surround sound coding etc. If multiple static microphones were present, the multichannel signal of each static microphone would be produced according to the same defined standard such as, for example, binaural coding, 5.1 surround sound coding, and 7.1 surround sound coding and in relation to the same common rendered sound scene.
  • the multichannel audio signals 132 from the one or more static microphones 120 are mixed by mixer 102 with the multichannel audio signals 142 from the one or more portable microphones 110 to produce a multi-microphone multichannel audio signal 103 that represents the recorded sound scene 10 relative to the origin. This signal can be rendered by an audio decoder, corresponding to the audio coder 130, to reproduce to a listener a rendered sound scene that corresponds to the recorded sound scene when the listener is at the origin.
  • the multichannel audio signal 142 from the, or each, portable microphone 110 is processed before mixing to take account of any movement of the portable microphone 110 relative to the origin at the static microphone 120.
  • the audio signals 112 output from the portable microphone 110 are processed by the positioning block 140 to adjust for movement of the portable microphone 110 relative to the origin.
  • the positioning block 140 takes as an input the vector z or some parameter or parameters dependent upon the vector z .
  • the vector z represents the relative position of the portable microphone 110 relative to the origin at the static microphone 120 in this example.
  • the positioning block 140 may be configured to adjust for any time misalignment between the audio signals 112 recorded by the portable microphone 110 and the audio signals 122 recorded by the static microphone 120 so that they share a common time reference frame. This may be achieved, for example, by correlating naturally occurring or artificially introduced (non-audible) audio signals that are present within the audio signals 112 from the portable microphone 110 with those within the audio signals 122 from the static microphone 120. Any timing offset identified by the correlation may be used to delay/advance the audio signals 112 from the portable microphone 110 before processing by the positioning block 140.
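The correlation-based time alignment described above can be sketched as follows. This is a minimal illustration using NumPy with assumed names, not the system's actual implementation:

```python
import numpy as np

def timing_offset(portable, static, sample_rate):
    """Estimate, in seconds, how much `portable` is delayed relative to
    `static` by locating the peak of their cross-correlation. The offset
    can then be used to delay/advance the portable microphone signals so
    both recordings share a common time reference frame."""
    corr = np.correlate(portable, static, mode="full")
    lag = int(np.argmax(corr)) - (len(static) - 1)   # lag in samples
    return lag / sample_rate

# Example: a noise-like signal delayed by 7 samples at 1 kHz.
rng = np.random.default_rng(0)
static = rng.standard_normal(500)
portable = np.concatenate([np.zeros(7), static])[:500]
offset = timing_offset(portable, static, sample_rate=1000)
```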
  • the positioning block 140 processes the audio signals 112 from the portable microphone 110, taking into account the relative orientation (Arg( z )) of that portable microphone 110 relative to the origin at the static microphone 120.
  • the audio coding of the static microphone audio signals 122 to produce the multichannel audio signal 132 assumes a particular orientation of the rendered sound scene relative to an orientation of the recorded sound scene and the audio signals 122 are encoded to the multichannel audio signals 132 accordingly.
  • the relative orientation Arg ( z ) of the portable microphone 110 in the recorded sound scene 10 is determined and the audio signals 112 representing the sound object are coded to the multichannels defined by the audio coding 130 such that the sound object is correctly oriented within the rendered sound scene at a relative orientation Arg ( z ) from the listener.
  • the audio signals 112 may first be mixed or encoded into the multichannel signals 142 and then a transformation T may be used to rotate the multichannel audio signals 142, representing the moving sound object, within the space defined by those multiple channels by Arg ( z ).
  • The sound scene may be rendered to a listener through a head-mounted audio output device 300, for example headphones using binaural audio coding.
  • The relative orientation between the listener and the rendered sound scene 310 is represented by an angle θ.
  • The sound scene is rendered by the audio output device 300, which physically rotates in the space 320 with the listener's head 330; because the device is head-mounted, the relative orientation between the audio output device 300 and the rendered sound scene 310 is also represented by the angle θ.
  • If the user turns their head clockwise, θ increases by a magnitude Δ.
  • The rendered sound scene is then rotated relative to the audio output device in an anticlockwise direction by the magnitude Δ so that the rendered sound scene 310 remains fixed in space.
  • The orientation of the rendered sound scene 310 thus tracks the rotation of the listener's head: it remains fixed in space 320 and does not move with the listener's head 330.
  • Fig. 3 illustrates a system 100 as illustrated in Fig. 1 , modified to rotate the rendered sound scene 310 relative to the recorded sound scene 10. This will rotate the rendered sound scene 310 relative to the audio output device 300, which has a fixed relationship with the recorded sound scene 10.
  • An orientation block 150 is used to rotate the multichannel audio signals 142 by ⁇ , determined by rotation of the user's head.
  • an orientation block 150 is used to rotate the multichannel audio signals 132 by ⁇ , determined by rotation of the user's head.
  • The functionality of the orientation block 150 is very similar to that of the orientation function of the positioning block 140.
  • the audio coding of the static microphone signals 122 to produce the multichannel audio signals 132 assumes a particular orientation of the rendered sound scene relative to the recorded sound scene. This orientation is offset by Δ. Accordingly, the audio signals 122 are encoded to the multichannel audio signals 132, and the audio signals 112 are encoded to the multichannel audio signals 142.
  • the transformation T may be used to rotate the multichannel audio signals 132 within the space defined by those multiple channels by ⁇ .
  • An additional transformation T may be used to rotate the multichannel audio signals 142 within the space defined by those multiple channels by ⁇ .
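The head-tracking compensation above can be sketched as a simple azimuth adjustment. A hedged illustration (the sign convention and names are ours, not the patent's):

```python
import math

def rendered_azimuth(world_azimuth, head_orientation):
    """Azimuth of a world-fixed sound object relative to the head-mounted
    audio output device. When the listener's head turns by Delta, the
    rendered scene is counter-rotated by the same magnitude, which is
    equivalent to subtracting the head orientation from the object's
    world azimuth (angles in radians, wrapped to [-pi, pi))."""
    a = world_azimuth - head_orientation
    return (a + math.pi) % (2 * math.pi) - math.pi
```

Applying this per sound object keeps each rendered source fixed in space as the head rotates.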
  • the portable microphone signals 112 are additionally processed to control the perception of the distance D of the sound object from the listener in the rendered sound scene, for example, to match the distance of the sound object from the origin in the recorded sound scene 10.
  • the distance block 160 processes the multichannel audio signal 142 to modify the perception of distance.
  • Fig. 5 illustrates a module 170 which may be used, for example, to perform the functions of the positioning block 140, orientation block 150 and distance block 160 in Fig. 3 .
  • the module 170 may be implemented using circuitry and/or programmed processors.
  • the Figure illustrates the processing of a single channel of the multichannel audio signal 142 before it is mixed with the multichannel audio signal 132 to form the multi-microphone multichannel audio signal 103.
  • a single input channel of the multichannel signal 142 is input as signal 187.
  • the input signal 187 passes in parallel through a "direct” path and one or more "indirect” paths before the outputs from the paths are mixed together, as multichannel signals, by mixer 196 to produce the output multichannel signal 197.
  • the output multichannel signals 197, one for each of the input channels, are mixed to form the multichannel audio signal 142 that is mixed with the multichannel audio signal 132.
  • the direct path represents audio signals that appear, to a listener, to have been received directly from an audio source and an indirect path represents audio signals that appear to a listener to have been received from an audio source via an indirect path such as a multipath or a reflected path or a refracted path.
  • the distance block 160 by modifying the relative gain between the direct path and the indirect paths, changes the perception of the distance D of the sound object from the listener in the rendered sound scene 310.
  • Each of the parallel paths comprises a variable gain device 181, 191 which is controlled by the distance block 160.
  • the perception of distance can be controlled by controlling relative gain between the direct path and the indirect (decorrelated) paths. Increasing the indirect path gain relative to the direct path gain increases the perception of distance.
  • the input signal 187 is amplified by variable gain device 181, under the control of the distance block 160, to produce a gain-adjusted signal 183.
  • the gain-adjusted signal 183 is processed by a direct processing module 182 to produce a direct multichannel audio signal 185.
  • the input signal 187 is amplified by variable gain device 191, under the control of the distance block 160, to produce a gain-adjusted signal 193.
  • the gain-adjusted signal 193 is processed by an indirect processing module 192 to produce an indirect multichannel audio signal 195.
  • the direct multichannel audio signal 185 and the one or more indirect multichannel audio signals 195 are mixed in the mixer 196 to produce the output multichannel audio signal 197.
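The effect of the distance block 160 can be sketched as a gain split between the direct and indirect paths. The specific mapping below is an assumed illustration, not the patent's formula:

```python
def distance_gains(distance_d, reference_distance=1.0):
    """Illustrative direct/indirect gain split for a target perceived
    distance D: as D grows, the direct-path gain falls and the indirect
    (decorrelated) share grows, which increases the perception of
    distance."""
    d = max(distance_d, reference_distance)
    direct_gain = reference_distance / d   # direct sound falls off with distance
    indirect_gain = 1.0 - direct_gain      # reverberant share grows
    return direct_gain, indirect_gain
```

These two gains would drive the variable gain devices 181 and 191 respectively.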
  • the direct processing block 182 and the indirect processing block 192 both receive direction of arrival signals 188.
  • the direction of arrival signal 188 gives the orientation Arg( z ) of the portable microphone 110 (moving sound object) in the recorded sound scene 10 and the orientation ⁇ of the rendered sound scene 310 relative to the audio output device 300.
  • the position of the moving sound object changes as the portable microphone 110 moves in the recorded sound scene 10, and the orientation of the rendered sound scene 310 changes as the head-mounted audio output device rendering the sound scene rotates.
  • the direct processing block 182 may, for example, include a system 184 similar to that illustrated in Figure 6A that rotates the single channel audio signal, gain-adjusted input signal 183, in the appropriate multichannel space producing the direct multichannel audio signal 185.
  • the system 184 uses a transfer function to perform a transformation T that rotates multichannel signals within the space defined for those multiple channels by Arg( z ) and by Δ, as defined by the direction of arrival signal 188.
  • Examples of appropriate transfer functions include a head related transfer function (HRTF) interpolator for binaural audio and Vector Base Amplitude Panning (VBAP) for a loudspeaker format (e.g. 5.1) audio output.
  • the indirect processing block 192 may, for example, be implemented as illustrated in Fig. 6B .
  • the direction of arrival signal 188 controls the gain of the single channel audio signal, the gain-adjusted input signal 193, using a variable gain device 194.
  • the amplified signal is then processed using a static decorrelator 196 and then a system 198 that applies a static transformation T to produce the indirect multichannel audio signal 195.
  • the static decorrelator in this example uses a pre-delay of at least 2 ms.
  • the transformation T rotates multichannel signals within the space defined for those multiple channels in a manner similar to the system 184 but by a fixed amount.
  • For binaural coding, the static transformation T may, for example, use a static head related transfer function (HRTF).
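The indirect path's static decorrelator can be sketched as a simple pre-delay. Real decorrelators typically add all-pass filtering as well, so this is only an assumed minimal illustration:

```python
import numpy as np

def static_decorrelator(mono_signal, sample_rate, pre_delay_s=0.002):
    """Decorrelate the indirect-path signal from the direct-path signal
    by applying a static pre-delay of at least 2 ms, as described above.
    Only the pre-delay stage is shown."""
    delay_samples = int(round(pre_delay_s * sample_rate))
    return np.concatenate([np.zeros(delay_samples), mono_signal])
```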
  • The module 170 can be used to process the portable microphone signals 112 and perform the functions of the positioning block 140, the orientation block 150 and the distance block 160.
  • the module 170 may also be used for performing the function of the orientation block 150 only, when processing the audio signals 122 provided by the static microphone 120.
  • the direction of arrival signal will include only ⁇ and will not include Arg( z ).
  • The gain of the variable gain devices 191 that modify the gain of the indirect paths may be set to zero, and the gain of the variable gain device 181 for the direct path may be fixed.
  • the module 170 reduces to the system 184 illustrated in Fig. 6A that rotates the recorded sound scene to produce the rendered sound scene according to a direction of arrival signal that includes only ⁇ and does not include Arg( z ).
  • Fig. 7 illustrates an example of the system 100 implemented using an apparatus 400.
  • the apparatus 400 may, for example, be a static electronic device, a portable electronic device or a hand-portable electronic device that has a size that makes it suitable to be carried on a palm of a user or in an inside jacket pocket of the user.
  • In this example, the apparatus 400 comprises the static microphone 120 as an integrated microphone but does not comprise the one or more portable microphones 110, which are remote.
  • In this example, but not necessarily all examples, the static microphone 120 is a microphone array.
  • In other examples, however, the apparatus 400 does not comprise the static microphone 120.
  • the apparatus 400 comprises an external communication interface 402 for communicating externally with external microphones, for example, the remote portable microphone(s) 110.
  • This may, for example, comprise a radio transceiver.
  • a positioning system 450 is illustrated as part of the system 100. This positioning system 450 is used to position the portable microphone(s) 110 relative to the origin of the sound scene e.g. the static microphone 120. In this example, the positioning system 450 is illustrated as external to both the portable microphone 110 and the apparatus 400. It provides information dependent on the position z of the portable microphone 110 relative to the origin of the sound scene to the apparatus 400. In this example, the information is provided via the external communication interface 402, however, in other examples a different interface may be used. Also, in other examples, the positioning system may be wholly or partially located within the portable microphone 110 and/or within the apparatus 400.
  • the position system 450 provides an update of the position of the portable microphone 110 with a particular frequency, and the terms 'accurate' and 'inaccurate' positioning of the sound object should be understood to mean accurate or inaccurate within the constraints imposed by the frequency of the positional updates. That is, 'accurate' and 'inaccurate' are relative terms rather than absolute terms.
  • the apparatus 400 wholly or partially operates the system 100 and method 200 described above to produce a multi-microphone multichannel audio signal 103.
  • the apparatus 400 provides the multi-microphone multichannel audio signal 103 via an output communications interface 404 to an audio output device 300 for rendering.
  • the audio output device 300 may use binaural coding. Alternatively or additionally, in some but not necessarily all examples, the audio output device 300 may be a head-mounted audio output device.
  • the apparatus 400 comprises a controller 410 configured to process the signals provided by the static microphone 120 and the portable microphone 110 and the positioning system 450.
  • the controller 410 may be required to perform analogue to digital conversion of signals received from microphones 110, 120 and/or perform digital to analogue conversion of signals to the audio output device 300 depending upon the functionality at the microphones 110, 120 and audio output device 300.
  • For clarity of presentation, no converters are illustrated in Fig. 7 .
  • Implementation of the controller 410 may be as controller circuitry.
  • The controller 410 may be implemented in hardware alone, have certain aspects in software (including firmware) alone, or be a combination of hardware and software (including firmware).
  • The controller 410 may be implemented using instructions that enable hardware functionality, for example, by using executable instructions of a computer program 416 in a general-purpose or special-purpose processor 412; the instructions may be stored on a computer readable storage medium (disk, memory, etc.) to be executed by such a processor 412.
  • the processor 412 is configured to read from and write to the memory 414.
  • the processor 412 may also comprise an output interface via which data and/or commands are output by the processor 412 and an input interface via which data and/or commands are input to the processor 412.
  • the memory 414 stores a computer program 416 comprising computer program instructions (computer program code) that controls the operation of the apparatus 400 when loaded into the processor 412.
  • the computer program instructions of the computer program 416 provide the logic and routines that enable the apparatus to perform the methods illustrated in Figs. 1-12 .
  • the processor 412 by reading the memory 414 is able to load and execute the computer program 416.
  • the computer program 416 may arrive at the apparatus 400 via any suitable delivery mechanism 430.
  • the delivery mechanism 430 may be, for example, a non-transitory computer-readable storage medium, a computer program product, a memory device, a record medium such as a compact disc read-only memory (CD-ROM) or digital versatile disc (DVD), or an article of manufacture that tangibly embodies the computer program 416.
  • the delivery mechanism may be a signal configured to reliably transfer the computer program 416.
  • the apparatus 400 may propagate or transmit the computer program 416 as a computer data signal.
  • Although the memory 414 is illustrated as a single component/circuitry, it may be implemented as one or more separate components/circuitry, some or all of which may be integrated/removable and/or may provide permanent/semi-permanent/dynamic/cached storage.
  • Although the processor 412 is illustrated as a single component/circuitry, it may be implemented as one or more separate components/circuitry, some or all of which may be integrated/removable.
  • the processor 412 may be a single core or multi-core processor.
  • the position system 450 enables a position of the portable microphone 110 to be determined.
  • the position system 450 may receive positioning signals and determine a position which is provided to the processor 412 or it may provide positioning signals or data dependent upon positioning signals so that the processor 412 may determine the position of the portable microphone 110.
  • Many different technologies may be used by the position system 450 to position an object, including passive systems, where the positioned object is passive and does not produce a positioning signal, and active systems, where the positioned object produces one or more positioning signals.
  • An example of such a system, used in the Kinect™ device, is when an object is painted with a non-homogenous pattern of symbols using infrared light and the reflected light is measured using multiple cameras and then processed, using the parallax effect, to determine a position of the object.
  • An example of an active radio positioning system is when an object has a transmitter that transmits a radio positioning signal to multiple receivers to enable the object to be positioned by, for example, trilateration or triangulation.
  • An example of a passive radio positioning system is when an object has a receiver or receivers that receive a radio positioning signal from multiple transmitters to enable the object to be positioned by, for example, trilateration or triangulation.
  • Trilateration requires an estimation of a distance of the object from multiple, non-aligned, transmitter/receiver locations at known positions. A distance may, for example, be estimated using time of flight or signal attenuation.
  • Triangulation requires an estimation of a bearing of the object from multiple, non-aligned, transmitter/receiver locations at known positions.
  • a bearing may, for example, be estimated using a transmitter that transmits with a variable narrow aperture, a receiver that receives with a variable narrow aperture, or by detecting phase differences at a diversity receiver.
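Trilateration as described above can be sketched with a linearized least-squares solve. A hedged 2-D illustration using NumPy with our own names, not the patent's method:

```python
import numpy as np

def trilaterate(anchors, distances):
    """Estimate a 2-D position from distances to known, non-aligned
    transmitter/receiver locations. Subtracting the first range equation
    |p - a_0|^2 = d_0^2 from the others removes the quadratic term in p,
    leaving a linear system solved by least squares."""
    a = np.asarray(anchors, dtype=float)
    d = np.asarray(distances, dtype=float)
    A = 2.0 * (a[1:] - a[0])
    b = (d[0] ** 2 - d[1:] ** 2
         + np.sum(a[1:] ** 2, axis=1) - np.sum(a[0] ** 2))
    position, *_ = np.linalg.lstsq(A, b, rcond=None)
    return position
```

With noisy range estimates (from time of flight or signal attenuation), the least-squares solve averages out some of the error across anchors.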
  • Other positioning systems may use dead reckoning and inertial movement or magnetic positioning.
  • the object that is positioned may be the portable microphone 110, or it may be an object worn or carried by a person associated with the portable microphone 110, or it may be the person associated with the portable microphone 110.
  • a problem can arise in relation to positioning using transmission and/or reception of radio signals, particularly indoors, because of multi-path effects arising from reflections. While a high accuracy indoor positioning (HAIP) system is a radio positioning system that addresses such problems, problems can still arise in consistently and accurately positioning a portable microphone 110.
  • one or more positioning signals received or transmitted by the position system 450 can be subject to noise, resulting in an incorrect position of a portable microphone 110 (incorrect y and incorrect z). In a rendered sound scene, this would result in an incorrect positioning of the rendered sound object associated with the portable microphone 110 which can be disconcerting to a listener.
  • Figs. 8A and 9A both illustrate a plot of how a determined position p i (t i ) of the portable microphone 110 varies with time t i .
  • the determined position p i (t i ) of the portable microphone 110 is the position that is determined or which would be determined based on the original positioning signals by the position system 450. It is the position of the portable microphone 110 based on the measurements made.
  • the determined position p i (t i ) of the portable microphone 110 suffers from noise (deviations from a true position value). There is, in the examples illustrated, frequent low intensity deviation of the determined position p i (t i ) from a position p 1 . Some of the deviation may arise from small variations in the actual position of the portable microphone 110, but some of the deviation may be modeled as an unpredictable small amplitude noise n i (t i ).
  • the signal processing may, for example, use filtering to remove noise from the determined positions p i (t i ).
  • If a change Δp i in the determined position occurs over a time interval Δt i = t i − t i-m , then the speed v i is given by v i = Δp i / Δt i .
  • One filter may, for example, remove changes in the determined position that imply a speed v i above a threshold T i ; the threshold T i may be variable. This filter may be used to remove the unpredictable errors E i (t i ).
  • Another filter may remove deviations of the determined position whose magnitude exceeds a threshold X i ; the threshold X i may be variable. This filter may be used to remove the unpredictable errors E i (t i ).
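The first-mode filtering described above can be sketched as follows (1-D for simplicity; the rejection rule and names are our assumptions, not the patent's):

```python
def filter_positions(times, positions, speed_threshold):
    """Reject a determined position p_i as an unpredictable error E_i(t_i)
    if reaching it from the last accepted position within the elapsed time
    would imply a speed v_i above the threshold T_i; otherwise accept it.
    While a sample is rejected, the last accepted position is held."""
    accepted = [positions[0]]
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        speed = abs(positions[i] - accepted[-1]) / dt
        accepted.append(accepted[-1] if speed > speed_threshold else positions[i])
    return accepted
```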
  • Figs. 9A and 9B will be used to explain a second mode of operation of the system 100.
  • In the second mode, some of the deviation (noise) illustrated in Fig. 8A is modeled, using a second model, as arising from variations in the actual position of the portable microphone 110, and some of the deviation is modeled as arising from unpredictable small amplitude noise n i (t i ).
  • Unlike the first model, the second model does not treat large intensity deviations as errors to be removed, and large intensity deviations in the determined position are therefore more likely to cause a large intensity deviation of the processed position.
  • the system 100 in the second mode, may use this second model to attempt to remove the deviation arising from unpredictable small amplitude noise n i (t i ) in a manner similar to that described for the first model using a filter.
  • the resultant processed position P i (t i ) of the portable microphone 110 should according to the second model correspond to the actual position of the portable microphone 110.
  • Fig. 9B illustrates a plot of how a processed position P i (t i ) of the portable microphone 110 varies with time t i .
  • This processed position P i (t i ) is used to control rendering of the sound object associated with the portable microphone 110.
  • the processed position P i (t i ), not the determined position p i (t i ) is used to provide z used by the positioning block 140 to adjust for movement of the sound object associated with the portable microphone 110 relative to the origin.
  • the signal processing used in the first mode to remove the errors E i (t i ) is not used on the determined positions p i (t i ). Consequently, in a rendered sound scene, a change in the processed position P i (t i ) used to position the rendered sound object associated with the portable microphone 110 occurs whenever there is a change in the determined position of the portable microphone 110.
  • a change in the processed positions P i (t i ) from P 1 to P 2 occurs when the determined position p i (t i ) of the portable microphone 110 changes from p 1 to p 2 .
  • P 1 is the same as or very similar to p 1 .
  • P 2 is the same as or very similar to p 2 .
  • a change in the processed positions P i (t i ) from P 1 to P 3 occurs when the determined position p i (t i ) of the portable microphone 110 changes from p 1 to p 3 .
  • P 1 is the same as or very similar to p 1 and/or P 3 is the same as or very similar to p 3 .
  • Fig. 10 illustrates an example of a method 600 suitable for use during the second mode or on transition from the first mode to the second mode.
  • the method generates a distraction, for example a visual distraction, when there is a discontinuous or abrupt change in the processed position P i (t) of the rendered sound object.
  • the method 600 comprises, at block 602, providing a process for detecting a change in position P i (t) of rendering a sound object from a first position at a first time to a second, different position at a second time immediately after the first time. If block 602 detects such a change, the method moves to block 604.
  • the method 600 comprises, at block 604, providing a process for generating a visual distraction at the second time.
  • a visual distraction 504 is generated on each transition between P 1 (t) and P 3 (t) in a first sense only (away from P 1 (t)). That is, a visual distraction 504 is generated on each transition from P 1 (t) to P 3 (t).
  • a visual distraction 504 is generated on each transition between P 1 (t) and P 3 (t) in a first sense and a second opposite sense (away from and towards P 1 (t)). That is, a visual distraction 504 is generated on each transition from P 1 (t) to P 3 (t) and a visual distraction 504 is generated on each transition to P 1 (t) from P 3 (t).
  • a visual distraction 504 is generated on every qualifying transition between two processed positions P i (t i ) in a first sense only and, in some examples (not illustrated), a visual distraction 504 is generated on every qualifying transition between two processed positions P i (t i ) irrespective of the sense of transition, that is, in the first sense and in the opposite second sense.
  • a qualifying transition may be a transition in processed position that satisfies a qualifying criterion or criteria.
  • the threshold Y i may be variable.
  • the threshold Z i may be variable.
  • the classification of a change in position as a qualifying transition may identify the change as an anomalous change in a processed position of the portable microphone. That is, a change in position that cannot physically occur because, for example, the speed of position change is too great.
  • a qualifying condition for classifying a change in processed position as a qualifying transition is that the change in position occurs to a second position at which there is not stage lighting or some other stage effect.
  • the classification of a change in position as a qualifying transition may identify the change as an incorrect change in a processed position of the portable microphone.
  • the generated visual distraction may provide the stage lighting or stage effect at the second position, previously determined to be absent.
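The qualifying-transition tests described above (an anomalous, physically impossible jump, or a jump to a position with no stage lighting or other stage effect) might be sketched as a single classification step; the thresholds, names and 2-D position tuples here are illustrative assumptions, not taken from the patent.

```python
import math

def classify_transition(p_prev, p_curr, dt, speed_limit, lit_positions):
    """Sketch of blocks 602/604: classify a change in processed position
    so that a matching visual distraction can be generated, or return
    None when no qualifying transition has occurred."""
    dist = math.hypot(p_curr[0] - p_prev[0], p_curr[1] - p_prev[1])
    if dist / dt > speed_limit:
        # change in position that cannot physically occur
        return "anomalous"
    if dist > 0 and p_curr not in lit_positions:
        # change to a second position with no stage lighting or effect
        return "incorrect"
    return None
```

A non-None result would trigger block 604, generating a visual distraction at the second position (and, for an "incorrect" transition, the distraction itself can provide the missing stage lighting there).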
  • a visual distraction may be generated with each qualifying transition, until the second mode is exited.
  • the visual distractions generated may be generated in real time within an audio scene of the sound object.
  • a visual distraction generated may comprise a stage effect, for example a lighting effect, a smoke effect, pyrotechnics and/or moving stage objects.
  • a visual distraction generated may comprise a lighting effect.
  • a lighting effect may for example comprise a change in a lighting property or lighting properties.
  • a change in a lighting effect may comprise one or more of: changing a position of a spotlight, changing a beam width of a spotlight, adding a spot light, removing a spotlight, changing a color, intensity and/or number of spotlights, and changing a lighting pattern.
  • a visual distraction generated may be dependent upon a classification of a qualifying transition e.g. as anomalous or incorrect.
  • a visual distraction generated may be dependent upon a property of a qualifying transition, for example, a size of a change in processed position of the rendered sound object.
  • a visual distraction generated may be dependent upon a history of previous stage effects and/or visual distractions.
  • Figs. 11A, 11B and 11C illustrate one example of a changing stage effect 610, in this example a lighting effect, in accordance with the rendering of a sound object based on the processed position of the portable microphone illustrated in Fig. 9B .
  • a visual distraction is generated by moving the spotlight 612 so that it is no longer trained on p 1 but is trained on p 3 , while the processed position P is p 3 .
  • a visual distraction is generated by moving the spotlight 612 so that it is no longer trained on p 1 but is trained on p 2 , while the processed position P is p 2 .
  • the spotlight 612 follows the processed position P.
  • the distraction generation corresponds to a gross change in position of the spotlight.
  • Figs. 12A, 12B and 12C illustrate an example of a changing stage effect 610, in this example a lighting effect, in accordance with the rendering of a sound object based on the processed position of the portable microphone illustrated in Fig. 9B .
  • a spotlight 612 is trained on the position p 1 .
  • a visual distraction is generated by training an additional spotlight 612' on the position p 3 , while processed position P is position p 3 .
  • a visual distraction is generated by training an additional spotlight 612' on position p 2 , while processed position P is position p 2 .
  • the distraction generation corresponds to a new additional spotlight.
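The two lighting-effect strategies of Figs. 11 and 12 — retraining the existing spotlight 612 on the new processed position (a gross change in spotlight position) versus training an additional spotlight 612' on it — can be contrasted in a short sketch; the function name, list representation and strategy labels are illustrative assumptions.

```python
def apply_lighting_distraction(spotlights, target, strategy="move"):
    """Generate a lighting-effect distraction at `target`.
    'move': retrain the current spotlight (Figs. 11A-11C).
    'add' : keep it and train an additional spotlight (Figs. 12A-12C)."""
    if strategy == "move":
        return spotlights[:-1] + [target]   # gross change of spotlight position
    return spotlights + [target]            # new additional spotlight 612'
```

Either strategy leaves the spotlight(s) following the processed position P, masking the abrupt change in the rendered sound object's position.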
  • the modes differ in how large intensity deviations in the determined position of a portable microphone (processed positions of a sound object) are handled.
  • a position of the sound object (portable microphone 110) is compensated to prevent rapid changes in a position of the rendered sound object.
  • the first mode rejects gross variances in position of the sound object. This compensation removes the unpredictable error E i (t i ).
  • the unpredictable error E i (t i ) is not modeled or removed, a position of the sound object (portable microphone 110) is not compensated, and there may be rapid changes in a position of the rendered sound object.
  • the second mode accepts gross variances in position of the sound object.
  • the first mode may be suitable when it is possible to discriminate between a correct and an incorrect position and to correct the incorrect position. That is, it is possible to confidently identify an error and confidently remove it. There is a high level of confidence as to what is an accurate position.
  • the second mode may be suitable when it is not possible to discriminate between a correct and an incorrect position and/or it is not possible to correct the incorrect position. That is, it is not possible to confidently identify an error and confidently remove it. There is a low level of confidence as to what is an accurate position. There may, for example, be a high degree of confidence that at least some positions are not possible, anomalous, incorrect and arise from error, but a lower degree of confidence as to which positions those are; that is, it is difficult to discriminate between a correct and an incorrect position.
  • the transition between the first mode and the second mode may be based on processing the determined positions of the portable microphone.
  • the first mode is used.
  • the second mode is used.
  • the system 100 may therefore automatically transition between the first mode and the second mode.
  • the system changes from the first mode to the second mode and generates visual distractions.
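The automatic transition between modes, based on processing the determined positions of the portable microphone, might be sketched as follows: when only a small fraction of samples imply physically impossible speeds, errors can be confidently identified and removed (first mode); otherwise the system accepts gross variances and relies on visual distractions instead (second mode). The thresholds and decision rule are illustrative assumptions, not the patent's criterion.

```python
import math

def choose_mode(positions, dt, speed_limit, max_error_fraction):
    """Pick 'first' (filter errors) or 'second' (accept gross variances)
    from the fraction of determined-position steps implying impossible
    speeds. `positions` is a list of (x, y) samples, `dt` the interval."""
    steps = list(zip(positions, positions[1:]))
    if not steps:
        return "first"
    anomalies = sum(
        1 for (x0, y0), (x1, y1) in steps
        if math.hypot(x1 - x0, y1 - y0) / dt > speed_limit
    )
    return "first" if anomalies / len(steps) <= max_error_fraction else "second"
```

With mostly stable positions the filter of the first mode is trusted; once too many samples look anomalous to discriminate correct from incorrect positions, the second mode (with visual distractions) is used.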
  • the processing of positioning signals described above should be understood to also encompass processing of data dependent upon the positioning signals.
  • the processing of positioning signals to determine a processed position at which a sound object is rendered may or may not occur, wholly or partially, within the position system 450.
  • the processing of positioning signals to determine a processed position at which a sound object is rendered may or may not occur, wholly or partially, within the processor 412 of the apparatus 400.
  • the processing of positioning signals to determine a mode for rendering a sound object may or may not occur, wholly or partially, within the position system 450.
  • the processing of positioning signals to determine a mode for rendering a sound object may or may not occur, wholly or partially, within the processor 412 of the apparatus 400.
  • the processing to cause a visual distraction to accompany rendering of a sound object may or may not occur, wholly or partially, within the position system 450.
  • the processing to cause a visual distraction to accompany rendering of a sound object may or may not occur, wholly or partially, within the processor 412 of the apparatus 400.
  • the method 600 may, for example, be performed by the system 100, for example, using the controller 410 of the apparatus 400.
  • the electronic apparatus 400 may in some examples be a part of an audio output device 300 such as a head-mounted audio output device or a module for such an audio output device 300.
  • the electronic apparatus 400 may in some examples additionally or alternatively be a part of a head-mounted apparatus 800 comprising the display 420 that displays images to a user.
  • an apparatus 400 may comprise:
  • references to 'computer-readable storage medium', 'computer program product', 'tangibly embodied computer program' etc. or a 'controller', 'computer', 'processor' etc. should be understood to encompass not only computers having different architectures such as single /multi- processor architectures and sequential (Von Neumann)/parallel architectures but also specialized circuits such as field-programmable gate arrays (FPGA), application specific circuits (ASIC), signal processing devices and other processing circuitry.
  • References to computer program, instructions, code etc. should be understood to encompass software for a programmable processor or firmware such as, for example, the programmable content of a hardware device whether instructions for a processor, or configuration settings for a fixed-function device, gate array or programmable logic device etc.
  • circuitry refers to all of the following:
  • the blocks illustrated in the Figs. 1-12 may represent steps in a method and/or sections of code in the computer program 416.
  • the illustration of a particular order to the blocks does not necessarily imply that there is a required or preferred order for the blocks, and the order and arrangement of the blocks may be varied. Furthermore, it may be possible for some blocks to be omitted.
  • 'module' refers to a unit or apparatus that excludes certain parts/components that would be added by an end manufacturer or a user.

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Stereophonic System (AREA)
EP16171383.9A 2016-05-25 2016-05-25 Commande de rendu audio Withdrawn EP3249956A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP16171383.9A EP3249956A1 (fr) 2016-05-25 2016-05-25 Commande de rendu audio

Publications (1)

Publication Number Publication Date
EP3249956A1 true EP3249956A1 (fr) 2017-11-29

Family

ID=56108497

Family Applications (1)

Application Number Title Priority Date Filing Date
EP16171383.9A Withdrawn EP3249956A1 (fr) 2016-05-25 2016-05-25 Commande de rendu audio

Country Status (1)

Country Link
EP (1) EP3249956A1 (fr)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011063830A1 (fr) * 2009-11-24 2011-06-03 Nokia Corporation Appareil
US20130279706A1 (en) * 2012-04-23 2013-10-24 Stefan J. Marti Controlling individual audio output devices based on detected inputs
US20150110285A1 (en) * 2013-10-21 2015-04-23 Harman International Industries, Inc. Modifying an audio panorama to indicate the presence of danger or other events of interest

Similar Documents

Publication Publication Date Title
US10638247B2 (en) Audio processing
EP3209034A1 (fr) Contrôle de rendu audio
US10524076B2 (en) Control of audio rendering
US10542368B2 (en) Audio content modification for playback audio
GB2543276A (en) Distributed audio capture and mixing
JP6764490B2 (ja) 媒介現実
US20190113598A1 (en) Methods, Apparatus, System and Computer Program for Controlling a Positioning Module and/or an Audio Capture
EP3503592B1 (fr) Procédés, appareils et programmes informatiques relatifs à un audio spatial
US10536794B2 (en) Intelligent audio rendering
US20210092545A1 (en) Audio processing
US11514108B2 (en) Content search
US9832587B1 (en) Assisted near-distance communication using binaural cues
CN105163209A (zh) 一种接收声音的处理方法及装置
US10524074B2 (en) Intelligent audio rendering
US10051403B2 (en) Controlling audio rendering
WO2013084056A1 (fr) Dispositifs électroniques, procédés et produits programmes d'ordinateur pour déterminer des écarts de position dans un dispositif électronique et générer un signal audio binaural sur la base des écarts de position
US10667073B1 (en) Audio navigation to a point of interest
EP3249956A1 (fr) Commande de rendu audio
EP3293987B1 (fr) Traitement audio
EP4164256A1 (fr) Appareil, procédés et programmes informatiques pour traiter un contenu audio spatial
Sunder et al. An HRTF based approach towards binaural sound source localization

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

AX Request for extension of the european patent

Extension state: BA ME

17P Request for examination filed

Effective date: 20180529

RBV Designated contracting states (corrected)

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

17Q First examination report despatched

Effective date: 20180829

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: NOKIA TECHNOLOGIES OY

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20200603