EP1563485B1 - Method for processing audio data and sound acquisition device therefor - Google Patents

Method for processing audio data and sound acquisition device therefor

Info

Publication number
EP1563485B1
Authority
EP
European Patent Office
Prior art keywords
distance
sound
components
point
signals
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
EP03782553A
Other languages
German (de)
French (fr)
Other versions
EP1563485A1 (en)
Inventor
Jérôme DANIEL
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Orange SA
Original Assignee
France Telecom SA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Family has litigation
First worldwide family litigation filed litigation Critical https://patents.darts-ip.com/?family=32187712&utm_source=google_patent&utm_medium=platform_link&utm_campaign=public_patent_search&patent=EP1563485(B1) "Global patent litigation dataset" by Darts-ip is licensed under a Creative Commons Attribution 4.0 International License.
Application filed by France Telecom SA filed Critical France Telecom SA
Publication of EP1563485A1 publication Critical patent/EP1563485A1/en
Application granted granted Critical
Publication of EP1563485B1 publication Critical patent/EP1563485B1/en
Anticipated expiration legal-status Critical
Expired - Lifetime legal-status Critical Current

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00 Details of electrophonic musical instruments
    • G10H1/0091 Means for obtaining special acoustic effects
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00 Details of electrophonic musical instruments
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S2400/00 Details of stereophonic systems covered by H04S but not provided for in its groups
    • H04S2400/15 Aspects of sound capture and related signal processing for recording or reproduction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S2420/00 Techniques used in stereophonic systems covered by H04S but not provided for in its groups
    • H04S2420/11 Application of ambisonics in stereophonic audio systems

Definitions

  • the present invention relates to the processing of sound data.
  • Techniques relating to the propagation of a sound wave in three-dimensional space implement audio signal processing methods applied to the simulation of acoustic and psychoacoustic phenomena.
  • Such processing methods provide spatial encoding of the acoustic field, its transmission and its spatialized reproduction over a set of loudspeakers or over the earphones of a stereo headset.
  • a first category of processing concerns methods for synthesizing a room effect or, more generally, environmental effects. From a description of one or more sound sources (transmitted signal, position, orientation, directivity, or other) and on the basis of a room-effect model (involving a room geometry, or a desired acoustic perception), a set of elementary acoustic phenomena (direct, reflected or diffracted waves), or a macroscopic acoustic phenomenon (reverberated and diffuse field), is calculated and described, making it possible to convey the spatial effect at the level of a listener located at a selected point of auditory perception in three-dimensional space.
  • a set of signals is then calculated, typically associated with reflections ("secondary" sources, active by re-emission of a received main wave and having a spatial position attribute) and/or associated with a late reverberation (decorrelated signals for a diffuse field).
  • a second category of methods concerns the positional or directional rendering of sound sources. These methods are applied to signals determined by a method of the first category described above (involving primary and secondary sources), as a function of the spatial description (source position) associated with them.
  • these methods of the second category make it possible to obtain signals to be played over loudspeakers or earphones, so as finally to give a listener the auditory impression of sound sources placed at predetermined respective positions around him.
  • methods of this second category are called "creators of three-dimensional sound images", because of the way the perceived positions of the sources are distributed in three-dimensional space for the listener.
  • methods of the second category generally comprise a first stage of spatial encoding of the elementary acoustic events, which produces a representation of the sound field in three-dimensional space.
  • in a second stage, this representation is transmitted or stored for deferred use.
  • in a third, decoding stage, the decoded signals are delivered over the loudspeakers or earphones of a playback device.
  • the present invention falls rather within the second category mentioned above. It concerns in particular the spatial encoding of sound sources and a specification of the three-dimensional sound representation of these sources. It applies both to the encoding of "virtual" sound sources (applications where sound sources are simulated, such as games, a spatialized conference, or others) and to an "acoustic" encoding of a natural sound field during sound pickup by one or more three-dimensional microphone arrays.
  • a similar acoustic encoding method is presented by J. Chen et al.: "Synthesis of 3D virtual auditory space via a spatial feature extraction and regularization model", Proceedings of the Virtual Reality Annual International Symposium, Seattle, Sept. 18-22, 1993, IEEE, New York, US, pages 188-193.
  • ambisonic encoding, which will be described in detail below, consists in representing signals relating to one or more sound waves in a basis of spherical harmonics (in spherical coordinates involving in particular an elevation angle and an azimuth angle characterizing a direction of the sound or sounds).
  • the components representing these signals and expressed in this basis of spherical harmonics are also a function, for waves emitted in the near field, of a distance between the sound source emitting this field and a point corresponding to the origin of the basis of spherical harmonics. More particularly, this distance dependence is expressed as a function of the sound frequency, as will be seen below.
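
To make the directional part of this encoding concrete, here is a minimal first-order sketch; the weighting of the components is an assumption of this sketch (normalization conventions differ), and the distance/frequency dependence of the near field is addressed further below.

```python
import numpy as np

def encode_first_order(signal, azimuth, elevation):
    """Project a mono signal onto the four lowest-order real spherical harmonics
    (order 0 plus the three order-1 components); the exact normalization used
    here is an assumption of this sketch."""
    x = np.cos(elevation) * np.cos(azimuth)
    y = np.cos(elevation) * np.sin(azimuth)
    z = np.sin(elevation)
    return np.stack([signal,        # B_00^(+1) (omnidirectional component)
                     signal * x,    # B_11^(+1)
                     signal * y,    # B_11^(-1)
                     signal * z])   # B_10^(+1)

# example: a 440 Hz tone encoded at azimuth 45 degrees, elevation 0
# t = np.arange(48000) / 48000.0
# components = encode_first_order(np.sin(2 * np.pi * 440 * t), np.pi / 4, 0.0)
```
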
  • this document presents a horizontal network of sensors, which assumes that the acoustic phenomena considered here propagate only in horizontal directions, which excludes any other direction of propagation and which, therefore, does not represent the physical reality of an ordinary acoustic field.
  • An object of the present invention is to provide a method for processing, by encoding, transmission and reproduction, any type of sound field, in particular the effect of a sound source in the near field.
  • Another object of the present invention is to provide a method for encoding virtual sources, not only in direction, but also in distance, and to define a decoding adaptable to any rendering device.
  • Another object of the present invention is to provide a robust processing method for sounds of all sound frequencies (including low frequencies), especially for sound recording of natural acoustic fields using three-dimensional microphone networks.
  • the data encoded and filtered in steps a) and b) are transmitted to the rendering device with a parameter representative of said second distance.
  • the rendering device comprising means for reading a memory medium
  • the data encoded and filtered in steps a) and b) are stored, with a parameter representative of said second distance, on a memory medium intended to be read by the rendering device.
  • an adaptation filter, whose coefficients are a function of said second and third distances, is applied to the coded and filtered data.
  • the coefficients of a digital audio filter are defined from the numerical values of the roots of said polynomials of power m.
  • the aforementioned polynomials are Bessel polynomials.
  • a microphone comprising an array of acoustic transducers arranged substantially on the surface of a sphere whose centre corresponds substantially to said reference point is used to obtain said signals representative of at least one sound propagating in three-dimensional space.
  • in step b), a global filter is applied in order, on the one hand, to compensate for a near-field effect as a function of said second distance and, on the other hand, to equalize the signals from the transducers so as to compensate for a directivity weighting of said transducers.
  • a number of transducers is chosen as a function of a selected total number of components used to represent the sound in said basis of spherical harmonics.
  • a total number of components in the basis of spherical harmonics is chosen so as to obtain, on reproduction, a region of space around the perception point in which the sound reproduction is faithful and whose dimensions increase with the total number of components.
  • a playback device having a number of speakers at least equal to said total number of components.
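
As a small illustration of the counts mentioned in the bullets above, the usual relation between the order and the number of components is sketched below; the (M + 1)^2 formula is a standard 3D count assumed here, not stated explicitly in this text.

```python
def component_count(order_M):
    """Number of spherical-harmonic components up to order M for a full 3D
    (periphonic) representation: (M + 1) ** 2 -- a standard count, assumed here;
    a horizontal-only system would use 2M + 1. The bullets above require at least
    this many transducers at acquisition and at least this many loudspeakers on
    reproduction."""
    return (order_M + 1) ** 2

# component_count(1) -> 4   (the four first-order components)
# component_count(4) -> 25  (higher order: larger faithful-reproduction region)
```
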
  • the filtering performed by the processing unit consists, on the one hand, in equalizing, as a function of the radius of the sphere, the signals coming from the transducers so as to compensate for a directivity weighting of said transducers and, on the other hand, in compensating for a near-field effect as a function of said reference distance.
  • FIG. 1 represents by way of illustration a global system of sound spatialization.
  • a simulation module of a virtual scene defines a sound object as a virtual source emitting a signal, for example a monophonic one, whose position is chosen in three-dimensional space and defines a direction of the sound.
  • specifications of the geometry of a virtual room can be provided to simulate reverberation of the sound.
  • a processing module 11 manages one or more of these sources with respect to a listener (definition of a virtual position of the sources relative to this listener). It implements a room-effect processor to simulate reverberation or the like, by applying delays and/or standard filtering.
  • the signals thus constructed are transmitted to a spatial encoding module 2a of the elementary contributions of the sources.
  • a natural sound pickup can also be performed, by one or more microphones arranged in a chosen manner with respect to real sources (module 1b).
  • the signals picked up by the microphones are encoded by a module 2b.
  • the acquired and encoded signals can be transformed into an intermediate representation format (module 3b) before being mixed, by the module 3, with the signals generated by the module 1a and encoded by the module 2a (coming from the virtual sources).
  • the mixed signals are then transmitted, or stored on a medium for later reproduction (arrow TR), and are then applied to a decoding module 5 with a view to reproduction on a playback device 6 comprising loudspeakers.
  • the decoding step 5 may be preceded by a step of manipulation of the sound field, for example by rotation, by means of a processing module 4 provided upstream of the decoding module 5.
  • the reproduction device may take the form of a multiplicity of loudspeakers arranged, for example, on the surface of a sphere in a three-dimensional ("periphonic") configuration, so as to provide, on reproduction, in particular a sense of sound direction in three-dimensional space.
  • a listener is generally placed at the centre of the sphere formed by the loudspeaker array, this centre corresponding to the auditory perception point mentioned above.
  • the loudspeakers of the playback device can be arranged in a plane (two-dimensional panoramic configuration), the loudspeakers being arranged in particular on a circle with the listener usually placed at the centre of this circle.
  • the rendering device may be in the form of a "surround" type device (5.1).
  • the rendering device can take the form of a headset with two earphones for binaural synthesis of the reproduced sound, which allows the listener to perceive a direction of the sources in three-dimensional space, as will be discussed in more detail below.
  • a two-loudspeaker reproduction device giving a three-dimensional impression can also take the form of a transaural reproduction device, with two loudspeakers placed at a selected distance from the listener.
  • a signal from a source 1 to N is transmitted to a spatial encoding module 2, together with its (real or virtual) position. Its position may be defined both in terms of incidence (direction of the source as seen by the listener) and in terms of distance between this source and the listener.
  • the plurality of signals thus encoded makes it possible to obtain a multi-channel representation of a global sound field.
  • the encoded signals are transmitted (arrow TR) to a sound reproduction device 6 for sound reproduction in three-dimensional space, as indicated above with reference to FIG. 1.
  • the set of weighting factors B_mn^σ, which are implicitly a function of the frequency, thus describes the pressure field in the zone considered. For this reason, these factors are called "spherical harmonic components" and represent a frequency-domain expression of the sound (or of the pressure field) in the basis of spherical harmonics Y_mn^σ.
  • the spherical harmonics form an orthonormal basis in which the scalar products between harmonic components and, more generally, between two functions F and G are respectively defined by: ⟨Y_mn^σ | Y_m'n'^σ'⟩_4π = δ_mm' δ_nn' δ_σσ' and ⟨F | G⟩_4π = (1/4π) ∫_4π F(θ, δ) G(θ, δ) dΩ(θ, δ).
  • spherical harmonics are bounded real functions, as shown in FIG. 4, as a function of the order m and the indices n and σ.
  • the dark and light parts correspond respectively to the positive and negative values of the spherical harmonic functions.
  • the radial functions j_m(kr) are spherical Bessel functions, whose modulus is illustrated for some values of the order m in FIG.
  • ambisonic representation can be given by a base of spherical harmonics as follows.
  • the ambisonic components of the same order m finally express “derivatives” or “moments” of order m of the pressure field in the vicinity of the origin O (center of the sphere shown in FIG. 3).
  • B_11^{+1} = X, B_11^{-1} = Y
  • an ambisonic system takes into account a subset of spherical harmonic components, as described above.
  • a system is said to be of order M when it takes into account ambisonic components of order m ≤ M.
  • when the rendering device comprises loudspeakers arranged on the surface of a sphere ("periphony"), it is possible in principle to use as many harmonics as there are loudspeakers.
  • the reference S designates the pressure signal carried by a plane wave and picked up at the point O corresponding to the center of the sphere of FIG. 3 (origin of the base in spherical coordinates).
  • the incidence of the wave is described by the azimuth ⁇ and the elevation ⁇ .
  • a filter F_m(ρ/c) is applied in order to "curve" the shape of the wave fronts, considering that a near-field source emits, to a first approximation, a spherical wave.
  • this additional filter is of the "integrator" type, with an amplifying effect that increases and diverges (is unbounded) as the sound frequency decreases towards zero.
  • a pre-compensation of the near field is therefore introduced at the encoding stage itself, this compensation involving filters of the analytic form 1/F_m(R/c)(ω).
  • the amplification F_m(ρ/c)(ω), whose effect appears in FIG. 6, is compensated by the attenuation of the filter 1/F_m(R/c)(ω) applied at the encoding stage.
  • the coefficients of this compensation filter 1/F_m(R/c)(ω) increase with the sound frequency and, in particular, tend towards zero at low frequencies.
  • this pre-compensation, performed at the encoding stage, ensures that the transmitted data do not diverge at low frequencies.
  • a pre-compensation is applied at the encoding, involving a filter of the type 1/F_m(R/c)(ω) as indicated above, which makes it possible, on the one hand, to transmit bounded signals and, on the other hand, to choose, from the encoding onward, the distance R at which the sound will be reproduced by the loudspeakers HP_i, as shown in FIG. 7.
  • a virtual source placed at the distance ρ from the origin O was simulated at the time of acquisition (FIG.
  • the pre-compensation of the near field of the loudspeakers (placed at the distance R), at the encoding stage, can thus be combined with a simulated near-field effect of a virtual source placed at a distance ρ.
  • a total filter ultimately comes into play, resulting on the one hand from the simulation of the near field and on the other hand from the compensation of the near field; the coefficients of this filter can be expressed analytically as:
  • H_m^{NFC}(ρ/c, R/c)(ω) = F_m(ρ/c)(ω) / F_m(R/c)(ω)
  • the total filter given by relation [A11] is stable and constitutes the "distance encoding" part of the spatial ambisonic encoding according to the invention, as represented in FIG. 8.
  • the coefficients of these filters correspond to monotonic transfer functions of the frequency (illustrated in FIG.), which tend towards the value 1 at high frequencies and towards the value (R/ρ)^m at low frequencies.
  • the distance R between an auditory perception point and the speakers HP i is actually of the order of one or a few meters.
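
As an illustration of these distance-encoding filters, here is a minimal frequency-domain sketch. The analytic form of F_m used below is an assumption chosen to reproduce the behaviour stated in the bullets above (F_m tends to 1 at high frequencies and diverges at low frequencies, and the ratio tends to (R/ρ)^m at low frequencies); it is not a verbatim transcription of the patent's relation [A11].

```python
from math import pi, factorial

def F_m(omega, dist, m, c=340.0):
    """Near-field transfer function F_m(dist/c)(omega), valid for omega > 0.
    Assumed form: sum_{n=0..m} (m+n)!/((m-n)! n!) * (2x)**(-n), x = j*omega*dist/c."""
    x = 1j * omega * dist / c
    return sum(factorial(m + n) / (factorial(m - n) * factorial(n)) * (2 * x) ** (-n)
               for n in range(m + 1))

def H_m_NFC(omega, rho, R, m, c=340.0):
    """Total "distance encoding" filter: near-field simulation for a source at rho,
    pre-compensated for loudspeakers at the reference distance R. Bounded: it tends
    to 1 at high frequencies and to (R/rho)**m at low frequencies."""
    return F_m(omega, rho, m, c) / F_m(omega, R, m, c)

# example: magnitude of the order-2 filter at 100 Hz, for rho = 1 m and R = 1.5 m
# abs(H_m_NFC(2 * pi * 100.0, 1.0, 1.5, 2))
```
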
  • steps a) and b) above can be brought together in one and the same global step, or even be interchanged (with distance encoding and compensation filtering, followed by direction encoding).
  • the method according to the invention is therefore not limited to a successive implementation over time of steps a) and b).
  • FIG. 11B shows the propagation of the initial sound wave from a near-field source situated at a distance ρ from a point of the acquisition space which corresponds, in the restitution space, to the point P of auditory perception of FIG. 7. It will be noted from FIG. 11A that the listeners (symbolized by schematic heads) can locate the virtual source at the same geographical location, at the distance ρ from the perception point P of FIG. 11B.
  • H_m^{NFC}(ρ/c, R/c)(ω) = F_m(ρ/c)(ω) / F_m(R/c)(ω)
  • Table 1: values Re[X_{m,q}].
  • the digital filters are thus implemented from the values of Table 1, by providing cascades of second-order cells (for even m), plus one additional cell (for odd m), according to the relations [A14] given above.
  • the digital filters are thus produced in an infinite impulse response form, which is easily parameterizable as shown above. It should be noted that an implementation in finite impulse response form can also be envisaged: it consists in computing the complex spectrum of the transfer function from the analytic formula, then deducing a finite impulse response by inverse Fourier transform; a convolution operation is then applied for the filtering.
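
A sketch of the finite impulse response variant just described (sampling the complex spectrum, then an inverse Fourier transform), reusing the H_m_NFC helper from the earlier sketch; the FFT length and the handling of the DC bin are arbitrary choices of this sketch.

```python
import numpy as np

def nfc_fir(m, rho, R, fs=48000.0, n_fft=4096, c=340.0):
    """Sample the complex spectrum of H_m^NFC on a frequency grid, then deduce a
    finite impulse response by inverse Fourier transform (reuses H_m_NFC above)."""
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    spectrum = np.empty(freqs.size, dtype=complex)
    spectrum[0] = (R / rho) ** m     # known low-frequency limit of H_m^NFC
    spectrum[1:] = [H_m_NFC(2.0 * np.pi * f, rho, R, m, c) for f in freqs[1:]]
    return np.fft.irfft(spectrum, n=n_fft)

# a convolution is then applied for the filtering, e.g.:
# filtered = np.convolve(component, nfc_fir(2, rho=1.0, R=1.5))[:len(component)]
```
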
  • R is a reference distance with which a compensated near-field effect is associated and c is the speed of sound (typically 340 m / s in air).
  • this modified ambisonic representation has the same scalability properties (schematically represented by the transmitted data "circled" near the arrow TR of FIG. 1) and obeys the same field rotation transformations (module 4 of FIG. 1) as the usual ambisonic representation.
  • the decoding operation is adaptable to any rendering device, of radius R 2 , different from the reference distance R above.
  • filters of the type H_m^{NFC}(R/c, R_2/c)(ω), as described above, are applied, but with the distance parameters R and R_2 instead of ρ and R.
  • the parameter R/c therefore needs to be stored (and/or transmitted) between encoding and decoding.
  • the filtering module represented therein is provided, for example, in a processing unit of a rendering device.
  • the ambisonic components received were pre-compensated at the encoding for a reference distance R_1 as the second distance.
  • the rendering device comprises a plurality of loudspeakers arranged at a third distance R_2 from an auditory perception point P, this third distance R_2 being different from the aforementioned second distance R_1.
  • the filtering module of FIG. 12, of the form H_m^{NFC}(R_1/c, R_2/c)(ω), then adapts, on reception of the data, the pre-compensation at the distance R_1 for a reproduction at the distance R_2.
  • to this end, the rendering device also receives the parameter R_1/c.
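
As a short usage note (again reusing the H_m_NFC helper sketched earlier, so the same assumptions apply), the adaptation described here simply swaps the distance parameters:

```python
def adapt_precompensation(omega, R1, R2, m, c=340.0):
    """Adaptation filter of FIG. 12: the same total filter as before, evaluated with
    the distance parameters (R1, R2) in place of (rho, R)."""
    return H_m_NFC(omega, R1, R2, m, c)
```
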
  • the invention also makes it possible to mix several ambisonic representations of sound fields (real and / or virtual sources), whose reference distances R are different (where appropriate with infinite reference distances and corresponding to distant sources).
  • a pre-compensation of all these sources is then advantageously filtered with the smallest reference distance before the ambisonic signals are mixed, which makes it possible, on reproduction, to obtain a correct definition of the sound relief.
  • the distance encoding with near-field pre-compensation is advantageously applied in combination with the focus processing.
  • the wave to be transmitted by each loudspeaker is defined by a processing, prior to reproduction, which "re-encodes" the ambisonic field at the centre of the rendering device, as follows.
  • the loudspeaker of index i and incidence (θ_i, δ_i) is fed with a signal S_i.
  • this loudspeaker participates in the reconstruction of the component B_mn^σ by its contribution S_i.
  • this contribution is weighted by the value Y_mn^σ(θ_i, δ_i) of the spherical harmonic in the direction of that loudspeaker.
  • these weights can be gathered in a column vector c_i = [Y_00^{+1}(θ_i, δ_i)  Y_11^{+1}(θ_i, δ_i)  Y_11^{-1}(θ_i, δ_i)  ...  Y_mn^σ(θ_i, δ_i)  ...]^T.
  • the relation [B4] thus defines a re-encoding operation prior to reproduction.
  • a decoding satisfying different criteria per frequency band is also possible, which makes it possible to offer a reproduction optimized according to the listening conditions, in particular with regard to the constraint of positioning at the centre O of the sphere of FIG. 3 during reproduction.
  • the matrixing operation is preceded by a filtering operation which compensates for the near field on each ambisonic component B_mn^σ, and which can be implemented in digital form as described above with reference to relation [A14].
  • the "re-encoding" matrix C is specific to the rendering device. Its coefficients can be determined initially by parameterization and by acoustic characterization of the rendering device responding to a predetermined excitation.
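
A minimal sketch of this re-encoding matrix and of a decoding matrix derived from it, restricted to first order for brevity. The use of a pseudo-inverse is an assumption of this sketch (a common mode-matching choice), not the patent's own optimized decoders.

```python
import numpy as np

def sh_first_order(theta, delta):
    """First-order real spherical harmonics at azimuth theta, elevation delta
    (same convention as the encoding sketch given earlier)."""
    return np.array([1.0,
                     np.cos(delta) * np.cos(theta),
                     np.cos(delta) * np.sin(theta),
                     np.sin(delta)])

def reencoding_and_decoding(speaker_angles):
    """Build the re-encoding matrix C (one column c_i per loudspeaker) and derive a
    decoding matrix D such that the loudspeaker signals are S = D @ B, where B holds
    the (near-field pre-compensated) ambisonic components."""
    C = np.column_stack([sh_first_order(t, d) for (t, d) in speaker_angles])
    D = np.linalg.pinv(C)
    return C, D

# example: a square of four loudspeakers in the horizontal plane
# C, D = reencoding_and_decoding([(0.0, 0.0), (np.pi/2, 0.0), (np.pi, 0.0), (3*np.pi/2, 0.0)])
```
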
  • a listener equipped with a headset having the two earphones of a binaural synthesis rendering device is shown.
  • the two ears of the listener are located at respective points O_L (left ear) and O_R (right ear) of space.
  • the centre of the listener's head is located at the point O and the radius of the listener's head has the value a.
  • a sound source is to be perceived auditorily at a point M of space, at a distance r from the centre of the listener's head (and respectively at distances r_R from the right ear and r_L from the left ear).
  • the direction of the source located at the point M is defined by the vectors r⃗, r⃗_R and r⃗_L.
  • binaural synthesis is defined as follows.
  • each listener has an ear shape of his own.
  • the perception of a sound in space by this listener is learned, from birth, according to the shape of the ears (in particular the shape of the pinnae and the dimensions of the head) specific to this listener.
  • the perception of sound in space is manifested, inter alia, by the fact that the sound reaches one ear before the other ear, which results in a delay τ between the signals to be emitted by each earphone of the reproduction device applying the binaural synthesis.
  • the playback device is initially set, for one and the same listener, by scanning a sound source around his head at a constant distance R from the centre of the head. It will be understood that this distance R can be regarded as a distance between a "reproduction point" as stated above and a point of auditory perception (here the centre O of the listener's head).
  • the index L is associated with the signal to be reproduced by the earphone attached to the left ear and the index R with the signal to be reproduced by the earphone attached to the right ear.
  • a delay is applied to the initial signal S on each channel, to produce the signal intended for each separate earphone.
  • these delays τ_L and τ_R are a function of a maximum delay τ_MAX which corresponds here to the ratio a/c, where a, as indicated previously, is the radius of the listener's head and c the speed of sound.
  • these delays are defined as a function of the difference between the distance from the point O (centre of the head) to the point M (position of the source whose sound is to be reproduced, in FIG. 13A) and the distance from each ear to this point M.
  • respective gains g_L and g_R, which are a function of a ratio of the distances from the point O to the point M and from each ear to the point M, are also applied to each channel.
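
The text gives only the dependencies of these delays and gains (distance differences bounded by τ_MAX = a/c, and distance ratios), not their exact expressions; the formulas below are therefore a plausible reading offered for illustration only.

```python
def binaural_delays_gains(r, r_L, r_R, a, c=340.0):
    """Per-channel delay and gain for a source at distance r from the head centre O
    and at distances r_L, r_R from the left and right ears (head radius a). Assumed
    forms: distance-difference delays clipped to tau_MAX = a / c, and distance-ratio
    gains."""
    tau_max = a / c
    tau_L = min(max((r_L - r) / c, 0.0), tau_max)
    tau_R = min(max((r_R - r) / c, 0.0), tau_max)
    g_L, g_R = r / r_L, r / r_R
    return (tau_L, g_L), (tau_R, g_R)

# example: source 1 m in front, slightly to the right, head radius 8.75 cm
# left, right = binaural_delays_gains(r=1.0, r_L=1.06, r_R=0.95, a=0.0875)
```
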
  • the respective modules 2_L and 2_R, applied to each channel, encode the signals of each channel in an ambisonic representation, with near-field pre-compensation NFC (for "Near Field Compensation") within the meaning of the present invention.
  • the signals coming from the source M are transmitted to the reproduction device, which comprises ambisonic decoding modules 5_L and 5_R for each channel.
  • an ambisonic encoding/decoding with near-field compensation is thus applied for each channel (left, right) of the binaural synthesis reproduction (here of "B-FORMAT" type), in a split, two-channel form.
  • the near-field compensation is performed, for each channel, with, as the first distance ρ, the distance r_L or r_R between each ear and the position M of the sound source to be reproduced.
  • a microphone 141 comprises a plurality of transducer capsules capable of picking up acoustic pressures and producing electrical signals S_1, ..., S_N.
  • the capsules CAP_i are arranged on a sphere of predetermined radius r (here a rigid sphere, such as a ping-pong ball for example). The capsules are spaced at a regular pitch over the sphere. In practice, the number N of capsules is chosen as a function of the order M desired for the ambisonic representation.
  • the near-field pre-compensation can thus be applied not only to the simulation of virtual sources, as indicated above, but also at acquisition and, more generally, by combining the near-field pre-compensation with all types of processing involving an ambisonic representation.
  • EQ_m is an equalization filter which compensates for a weighting W_m related to the directivity of the capsules and which further includes the diffraction by the rigid sphere.
  • taken on its own, this equalization filter is not stable, and an infinite gain is obtained at very low frequencies.
  • moreover, the spherical harmonic components themselves are not of finite amplitude when the sound field is not limited to the propagation of plane waves, that is to say to waves coming from distant sources, as seen earlier.
  • the signals S_1 to S_N are picked up by the microphone 141. If necessary, a pre-equalization of these signals is applied by a processing module 142.
  • the module 143 makes it possible to express these signals in the ambisonic context, in matrix form.
  • the module 144 applies the filter of relation [C7] to the ambisonic components, expressed as a function of the radius r of the sphere of the microphone 141.
  • the near-field compensation is performed for a reference distance R as a second distance.
  • the signals encoded and thus filtered by the module 144 can be transmitted, if necessary, with the parameter representative of the reference distance R / c.
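
A sketch of the acquisition chain of modules 143 and 144 described above, restricted to first order. The least-squares projection used for module 143 is an assumption (the text only states that the signals are expressed "in matrix form" in the ambisonic context), and the per-order filter of relation [C7] is not reproduced in the text, so it is only referred to in a comment.

```python
import numpy as np

def sh_row(theta, delta):
    """First-order real spherical harmonics (same convention as the earlier sketches)."""
    return [1.0,
            np.cos(delta) * np.cos(theta),
            np.cos(delta) * np.sin(theta),
            np.sin(delta)]

def capsules_to_components(capsule_signals, capsule_angles):
    """Module 143 read as a least-squares projection: capsule_signals has shape
    (N_capsules, n_samples); the result has shape (n_components, n_samples).
    Module 144 (relation [C7]: per-order equalization EQ_m combined with the
    near-field compensation at the reference distance R) would then be applied
    to each returned component."""
    Y = np.array([sh_row(t, d) for (t, d) in capsule_angles])   # (N_capsules, n_components)
    return np.linalg.pinv(Y) @ np.asarray(capsule_signals)
```
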
  • near-field compensation within the meaning of the present invention can be applied to all types of processing involving an ambisonic representation.
  • This near-field compensation makes it possible to apply the ambisonic representation to a multiplicity of sound contexts where the direction of a source and advantageously its distance must be taken into account.
  • the possibility of representing sound phenomena of all types (near or far fields) in the ambisonic context is ensured by this pre-compensation, because it limits the ambisonic components to finite real values.
  • the near-field pre-compensation can be integrated, at the encoding, as much for a near source as for a distant source.
  • the distance ρ expressed above will then be considered infinite, without substantially modifying the expression of the filters H_m given above.
  • processing using room effect processors that typically provide decorrelated signals that can be used to model the late diffuse field (late reverberation) can be combined with near field pre-compensation.
  • the various spherical harmonic components (up to a chosen order M) can then be constructed by applying, for each ambisonic component, a gain correction and a near-field compensation of the loudspeakers (with a reference distance R separating the loudspeakers from the point of auditory perception, as shown in FIG. 7).
  • the encoding principle in the sense of the present invention is generalizable to radiation models other than monopolar sources (real or virtual) and / or speakers.
  • any form of radiation can be expressed by integration of a continuous distribution of point elementary sources.
  • a decoding method has been described above in which a matrix system involving the ambisonic components is applied.
  • a generalized processing by fast Fourier transforms (circular or spherical) may be provided in order to limit the computation time and the computing resources (in terms of memory) required for the decoding processing.
  • the pre-compensation encoding method may be coupled to a digital audio compression for quantizing and adjusting the gain for each frequency subband.
  • the present invention applies to all types of sound spatialization systems, especially for "virtual reality" applications (navigation through virtual scenes in three-dimensional space, chat-type conversations over the Internet), for the sonification of interfaces, for audio editing software for recording, mixing and restoring music, but also for acquisition based on the use of three-dimensional microphones for musical or cinematographic sound pickup, or for the transmission of sound environments over the Internet, for example for "webcams" with sound.

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Stereophonic System (AREA)
  • Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)
  • Measurement Of Velocity Or Position Using Acoustic Or Ultrasonic Waves (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The digital sound processing method encodes signals representative of sounds propagating in three-dimensional space and emitted by a source at a set distance from a reference point, as components expressed on a spherical harmonic basis. A near-field compensation is then applied to these components by a filtering which is a function of a second distance (R) substantially defining, for reproduction, the distance between the loudspeakers and the hearing position.

Description

The present invention relates to the processing of sound data.

Techniques relating to the propagation of a sound wave in three-dimensional space, notably involving specialized simulation and/or sound reproduction, implement audio signal processing methods applied to the simulation of acoustic and psychoacoustic phenomena. Such processing methods provide spatial encoding of the acoustic field, its transmission and its spatialized reproduction over a set of loudspeakers or over the earphones of a stereo headset.

Among spatialized sound techniques, two categories of processing can be distinguished; they are complementary to one another but are generally both implemented within the same system.

On the one hand, a first category of processing concerns methods for synthesizing a room effect or, more generally, environmental effects. From a description of one or more sound sources (transmitted signal, position, orientation, directivity, or other) and on the basis of a room-effect model (involving a room geometry, or a desired acoustic perception), a set of elementary acoustic phenomena (direct, reflected or diffracted waves), or a macroscopic acoustic phenomenon (reverberated and diffuse field), is calculated and described, making it possible to convey the spatial effect at the level of a listener located at a selected point of auditory perception in three-dimensional space. A set of signals is then calculated, typically associated with reflections ("secondary" sources, active by re-emission of a received main wave and having a spatial position attribute) and/or associated with a late reverberation (decorrelated signals for a diffuse field).

On the other hand, a second category of methods concerns the positional or directional rendering of sound sources. These methods are applied to signals determined by a method of the first category described above (involving primary and secondary sources), as a function of the spatial description (source position) associated with them. In particular, such methods of this second category make it possible to obtain signals to be played over loudspeakers or earphones, so as finally to give a listener the auditory impression of sound sources placed at predetermined respective positions around him. Methods of this second category are called "creators of three-dimensional sound images", because of the way the perceived positions of the sources are distributed in three-dimensional space for the listener. Methods of the second category generally comprise a first stage of spatial encoding of the elementary acoustic events, which produces a representation of the sound field in three-dimensional space. In a second stage, this representation is transmitted or stored for deferred use. In a third, decoding stage, the decoded signals are delivered over the loudspeakers or earphones of a playback device.

The present invention falls rather within the second category mentioned above. It concerns in particular the spatial encoding of sound sources and a specification of the three-dimensional sound representation of these sources. It applies both to the encoding of "virtual" sound sources (applications where sound sources are simulated, such as games, a spatialized conference, or others) and to an "acoustic" encoding of a natural sound field during sound pickup by one or more three-dimensional microphone arrays. A similar acoustic encoding method is presented by J. Chen et al.: "Synthesis of 3D virtual auditory space via a spatial feature extraction and regularization model", Proceedings of the Virtual Reality Annual International Symposium, Seattle, Sept. 18-22, 1993, IEEE, New York, US, pages 188-193.

Among the conceivable sound spatialization techniques, the "ambisonic" approach is preferred. Ambisonic encoding, which will be described in detail below, consists in representing signals relating to one or more sound waves in a basis of spherical harmonics (in spherical coordinates involving in particular an elevation angle and an azimuth angle characterizing a direction of the sound or sounds). The components representing these signals and expressed in this basis of spherical harmonics are also a function, for waves emitted in the near field, of a distance between the sound source emitting this field and a point corresponding to the origin of the basis of spherical harmonics. More particularly, this distance dependence is expressed as a function of the sound frequency, as will be seen below.

This ambisonic approach offers a large number of possible functionalities, especially in terms of virtual source simulation, and in general has the following advantages:

  • it translates, in a rational way, the reality of acoustic phenomena and provides a realistic, convincing and immersive spatial auditory rendering;
  • the representation of the acoustic phenomena is scalable: it offers a spatial resolution that can be adapted to different situations. Indeed, this representation can be transmitted and exploited as a function of bit-rate constraints during the transmission of the encoded signals and/or of limitations of the rendering device;
  • the ambisonic representation is flexible, and it is possible to simulate a rotation of the sound field or, on reproduction, to adapt the decoding of the ambisonic signals to any rendering device of various geometries.

In the known ambisonic approach, the encoding of virtual sources is essentially directional. The encoding functions amount to calculating gains that depend on the incidence of the sound wave, expressed by the spherical harmonic functions, which depend on the elevation angle and the azimuth angle in spherical coordinates. In particular, at decoding, it is assumed that the loudspeakers, on reproduction, are far away. This results in a distortion (or curvature) of the shape of the reconstructed wave fronts. Indeed, as indicated above, the components of the sound signal in the basis of spherical harmonics, for a near field, in fact also depend on the distance of the source and on the sound frequency. More precisely, these components can be expressed mathematically in the form of a polynomial whose variable is inversely proportional to the aforementioned distance and to the sound frequency. Thus, the ambisonic components, in the sense of their theoretical expression, are divergent at low frequencies and, in particular, tend towards infinity as the sound frequency decreases towards zero, when they represent a near-field sound emitted by a source located at a finite distance. This mathematical phenomenon is known in the field of ambisonic representation, already for order 1, by the term "bass boost", in particular from:

  • M. A. GERZON, "General Metatheory of Auditory Localisation", preprint 3306 of the 92nd AES Convention, 1992, page 52.

This phenomenon becomes particularly critical for high spherical harmonic orders, which involve polynomials of high power.

The following document is known:

  • SONTACCHI and HÖLDRICH, "Further Investigations on 3D Sound Fields using Distance Coding", Proceedings of the COST G-6 Conference on Digital Audio Effects (DAFX-01), Limerick, Ireland, 6-8 December 2001,
    which describes a technique for taking into account a curvature of the wave fronts within a representation close to an ambisonic representation, the principle of which consists in:
    • applying a (high-order) ambisonic encoding to the signals coming from a virtual (simulated) sound pickup of the WFS ("Wave Field Synthesis") type;
    • and reconstructing the acoustic field over a zone from its values on the boundary of that zone, thus relying on the HUYGENS-FRESNEL principle.

However, the technique presented in this document, although promising in that it uses an ambisonic representation of high order, poses a number of problems:

  • the computing resources needed to calculate all the surfaces required to apply the HUYGENS-FRESNEL principle, as well as the necessary computation times, are excessive;
  • processing artifacts known as "spatial aliasing" appear because of the spacing between the microphones, unless a virtual microphone mesh that is tight in space is chosen, which makes the processing more cumbersome;
  • this technique is difficult to transpose to a real case of sensors arranged as an array, in the presence of a real source, at acquisition;
  • on reproduction, the three-dimensional sound representation is implicitly tied to a fixed radius of the rendering device, because the ambisonic decoding must be performed here over a loudspeaker array of the same dimensions as the initial microphone array, this document proposing no means of adapting the encoding or the decoding to other sizes of rendering devices.

Above all, this document presents a horizontal array of sensors, which assumes that the acoustic phenomena taken into account here propagate only in horizontal directions; this excludes any other direction of propagation and therefore does not represent the physical reality of an ordinary acoustic field.

More generally, current techniques do not make it possible to deal satisfactorily with all types of sound sources, in particular in the near field, but rather with distant sound sources (plane waves), which corresponds to a restrictive and artificial situation in many applications.

An object of the present invention is to provide a method for processing, by encoding, transmission and reproduction, any type of sound field, in particular the effect of a sound source in the near field.

Another object of the present invention is to provide a method allowing the encoding of virtual sources, not only in direction but also in distance, and to define a decoding that can be adapted to any rendering device.

Another object of the present invention is to provide a processing method that is robust for sounds of all frequencies (including low frequencies), in particular for the pickup of natural acoustic fields using three-dimensional microphone arrays.

To this end, the present invention proposes a method of processing sound data, in which:

  • a) signals representative of at least one sound propagating in three-dimensional space and coming from a source located at a first distance from a reference point are coded, to obtain a representation of the sound by components expressed in a basis of spherical harmonics having its origin at said reference point, and
  • b) a compensation of a near-field effect is applied to said components by a filtering which is a function of a second distance substantially defining, for a reproduction of the sound by a rendering device, a distance between a reproduction point and a point of auditory perception.

In a first embodiment, said source being distant from the reference point:

  • components of successive orders m are obtained for the representation of the sound in said basis of spherical harmonics, and
  • a filter is applied whose coefficients, each applied to a component of order m, are expressed analytically in the form of the inverse of a polynomial of power m, whose variable is inversely proportional to the sound frequency and to said second distance, in order to compensate for a near-field effect at the rendering device.

In a second embodiment, said source being a virtual source provided at said first distance:

  • components of successive orders m are obtained for the representation of the sound in said basis of spherical harmonics, and
  • a global filter is applied whose coefficients, each applied to a component of order m, are expressed analytically in the form of a fraction, in which:
    • the numerator is a polynomial of power m, whose variable is inversely proportional to the sound frequency and to said first distance, in order to simulate a near-field effect of the virtual source, and
    • the denominator is a polynomial of power m, whose variable is inversely proportional to the sound frequency and to said second distance, in order to compensate for the effect of the near field of the virtual source at low sound frequencies.

Preferably, the data encoded and filtered in steps a) and b) are transmitted to the rendering device together with a parameter representative of said second distance.

In addition or as a variant, the rendering device comprising means for reading a memory medium, the data encoded and filtered in steps a) and b) are stored, together with a parameter representative of said second distance, on a memory medium intended to be read by the rendering device.

Advantageously, prior to a sound reproduction by a rendering device comprising a plurality of loudspeakers arranged at a third distance from said point of auditory perception, an adaptation filter whose coefficients are a function of said second and third distances is applied to the encoded and filtered data.

In a particular embodiment, the coefficients of this adaptation filter, each applied to a component of order m, are expressed analytically as a fraction (sketched symbolically after this list) in which:

  • the numerator is a polynomial of degree m, whose variable is inversely proportional to the sound frequency and to said second distance,
  • and the denominator is a polynomial of degree m, whose variable is inversely proportional to the sound frequency and to said third distance.
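Using the notation F_m of relation [A5] given later in the description, and writing R for the second distance and R2 for the third distance (the notation of FIG. 12), the adaptation filter described above can be sketched as the ratio:

$$H_{m}^{(R/c,\;R_{2}/c)}(\omega)\;=\;\frac{F_{m}^{(R/c)}(\omega)}{F_{m}^{(R_{2}/c)}(\omega)}$$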

Advantageously, for the implementation of step b), provision is made for:

  • for components of even order m, digital audio filters in the form of a cascade of second-order cells; and
  • for components of odd order m, digital audio filters in the form of a cascade of second-order cells plus an additional first-order cell.

In this embodiment, the coefficients of a digital audio filter, for a component of order m, are defined from the numerical values of the roots of said polynomials of degree m.
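A minimal sketch of the structure described above (function names and the root-grouping tolerance are illustrative assumptions; the patent's actual digital filters would further require mapping these roots into the digital domain, for example by a bilinear transform): the roots of the degree-m polynomial of relation [A5] split into complex-conjugate pairs, each yielding a second-order cell, plus one real root yielding the additional first-order cell when m is odd.

```python
import numpy as np
from math import factorial

def poly_coeffs(m):
    """Coefficients (m+n)!/((m-n)! n!) of the degree-m polynomial of relation [A5],
    ordered from highest to lowest degree as numpy.roots expects."""
    c = [factorial(m + n) / (factorial(m - n) * factorial(n)) for n in range(m + 1)]
    return np.array(c[::-1])

def cells_for_order(m):
    """Group the polynomial roots into second-order cells (conjugate pairs)
    plus, for odd m, one first-order cell (single real root)."""
    roots = np.roots(poly_coeffs(m))
    pairs = [r for r in roots if r.imag > 1e-9]            # one representative per conjugate pair
    reals = [r.real for r in roots if abs(r.imag) <= 1e-9]
    return [("second-order cell", r) for r in pairs] + [("first-order cell", r) for r in reals]

for m in (2, 3, 4):
    print(m, cells_for_order(m))   # even m: only second-order cells; odd m: one extra first-order cell
```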

In a particular embodiment, the aforementioned polynomials are Bessel polynomials.

For the acquisition of the sound signals, a microphone is advantageously provided comprising an array of acoustic transducers arranged substantially on the surface of a sphere whose centre corresponds substantially to said reference point, in order to obtain said signals representative of at least one sound propagating in three-dimensional space.

In this embodiment, a global filter is applied in step b) in order, on the one hand, to compensate for a near-field effect as a function of said second distance and, on the other hand, to equalize the signals from the transducers so as to compensate for a directivity weighting of said transducers.

Preferably, the number of transducers provided is a function of a chosen total number of components used to represent the sound in said basis of spherical harmonics.

According to an advantageous characteristic, a total number of components in the basis of spherical harmonics is chosen in step a) so as to obtain, on reproduction, a region of space around the perception point in which the reproduction of the sound is faithful and whose dimensions increase with the total number of components.

Preferably, a rendering device is further provided comprising a number of loudspeakers at least equal to said total number of components.

As a variant, in the context of a reproduction with binaural or transaural synthesis:

  • a rendering device is provided comprising at least a first and a second loudspeaker arranged at a chosen distance from a listener,
  • for this listener, information is obtained on the expected perception of the position in space of sound sources located at a predetermined reference distance from the listener, for the application of a so-called "binaural synthesis" or "transaural" technique, and
  • the compensation of step b) is applied with said reference distance substantially as the second distance.

In a variant in which an adaptation to the two-earphone rendering device is introduced:

  • a rendering device is provided comprising at least a first and a second loudspeaker arranged at a chosen distance from a listener,
  • for this listener, information is obtained on the perception of the position in space of sound sources located at a predetermined reference distance from the listener, and
  • prior to a sound reproduction by the rendering device, an adaptation filter whose coefficients are a function of the second distance and substantially of the reference distance is applied to the data encoded and filtered in steps a) and b).

In particular, in the context of a reproduction with binaural synthesis:

  • the rendering device comprises a headset with two earphones for the respective ears of the listener,
  • and preferably, separately for each earphone, the encoding and the filtering of steps a) and b) are applied to the respective signals intended to feed each earphone, with, as first distance, the distance separating each ear from a position of a source to be reproduced in the reproduction space.

Preferably, in steps a) and b), a matrix system is formed comprising at least:

  • a matrix comprising said components in the basis of spherical harmonics, and
  • a diagonal matrix whose coefficients correspond to filtering coefficients of step b),
and said matrices are multiplied to obtain a result matrix of compensated components.

Preferably, on reproduction:

  • the rendering device comprises a plurality of loudspeakers arranged substantially at the same distance from the point of auditory perception, and
  • in order to decode said data encoded and filtered in steps a) and b) and to form signals suitable for feeding said loudspeakers:
    • a matrix system is formed comprising said result matrix of compensated components and a predetermined decoding matrix specific to the rendering device, and
    • a matrix comprising coefficients representative of the loudspeaker feed signals is obtained by multiplying the result matrix by said decoding matrix (a numerical sketch of this matrix chain follows this list).
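A minimal numpy sketch of the matrix chain described above, at a single frequency bin; all names, sizes and numerical values are illustrative assumptions and are not taken from the text.

```python
import numpy as np

K = 4          # number of spherical-harmonic components kept (e.g. W, X, Y, Z)
N_SPK = 8      # number of loudspeakers of the rendering device
rng = np.random.default_rng(0)

B = rng.standard_normal((K, 1)) + 0j                     # matrix of components (one source, one bin)
H = np.diag([1.0, 0.8 + 0.1j, 0.8 + 0.1j, 0.8 + 0.1j])   # diagonal matrix of step-b) filter coefficients
B_comp = H @ B                                           # result matrix of compensated components

D = rng.standard_normal((N_SPK, K))                      # predetermined decoding matrix of the device
S = D @ B_comp                                           # coefficients of the loudspeaker feed signals
print(S.shape)                                           # (8, 1): one coefficient per loudspeaker
```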

The present invention also relates to a sound acquisition device comprising a microphone provided with an array of acoustic transducers arranged substantially on the surface of a sphere. According to the invention, the device further comprises a processing unit arranged to:

  • receive signals each coming from a transducer,
  • apply to said signals an encoding in order to obtain a representation of the sound by components expressed in a basis of spherical harmonics whose origin corresponds to the centre of said sphere,
  • and apply to said components a filtering which is a function, on the one hand, of a distance corresponding to the radius of the sphere and, on the other hand, of a reference distance.

Preferably, the filtering performed by the processing unit consists, on the one hand, in equalizing, as a function of the radius of the sphere, the signals from the transducers so as to compensate for a directivity weighting of said transducers and, on the other hand, in compensating for a near-field effect as a function of said reference distance.

Other advantages and characteristics of the invention will become apparent on reading the detailed description below and on examining the accompanying figures, in which:

  • FIG. 1 schematically illustrates a system for acquiring and creating sound signals, by simulation of virtual sources, with encoding, transmission, decoding and reproduction by a spatialized rendering device;
  • FIG. 2 represents more precisely an encoding of signals defined both in intensity and with respect to the position of the source from which they originate;
  • FIG. 3 illustrates the parameters involved in the ambisonic representation, in spherical coordinates;
  • FIG. 4 illustrates a representation, by a three-dimensional metric in a spherical coordinate system, of spherical harmonics Y_mn^σ of various orders;
  • FIG. 5 is a diagram of the variations of the modulus of the radial functions j_m(kr), which are spherical Bessel functions, for successive values of the order m, these radial functions appearing in the ambisonic representation of an acoustic pressure field;
  • FIG. 6 represents the amplification due to the near-field effect for different successive orders m, in particular at low frequencies;
  • FIG. 7 schematically represents a rendering device comprising a plurality of loudspeakers HPi, with the aforementioned point of auditory perception (referenced P), the aforementioned first distance (referenced ρ) and the aforementioned second distance (referenced R);
  • FIG. 8 schematically represents the parameters involved in the ambisonic encoding, with a directional encoding as well as a distance encoding according to the invention;
  • FIG. 9 represents energy spectra of the simulated near-field and compensation filters for a first distance of a virtual source ρ = 1 m and a pre-compensation for loudspeakers located at a second distance R = 1.5 m;
  • FIG. 10 represents energy spectra of the simulated near-field and compensation filters for a first distance of the virtual source ρ = 3 m and a pre-compensation for loudspeakers located at a distance R = 1.5 m;
  • FIG. 11A represents a reconstruction of the near field with compensation, within the meaning of the present invention, for a spherical wave in the horizontal plane;
  • FIG. 11B, to be compared with FIG. 11A, represents the initial wavefront originating from a source S;
  • FIG. 12 schematically represents a filtering module for adapting the received ambisonic components, pre-compensated at the encoding for a reference distance R as second distance, to a rendering device comprising a plurality of loudspeakers arranged at a third distance R2 from a point of auditory perception;
  • FIG. 13A schematically represents the arrangement of a sound source M, on reproduction, for a listener using a rendering device applying a binaural synthesis, with a source emitting in the near field;
  • FIG. 13B schematically represents the encoding and decoding steps with near-field effect in the context of the binaural synthesis of FIG. 13A, combined with an ambisonic encoding/decoding;
  • FIG. 14 schematically represents the processing of the signals from a microphone comprising a plurality of pressure sensors arranged on a sphere, by way of illustration, by ambisonic encoding, equalization and near-field compensation within the meaning of the invention.

Reference is first made to FIG. 1, which represents, by way of illustration, a global sound spatialization system. A module 1a for simulating a virtual scene defines a sound object as a virtual source of a signal, for example a monophonic signal, with a position chosen in three-dimensional space, which defines a direction of the sound. Specifications of the geometry of a virtual room may further be provided in order to simulate a reverberation of the sound. A processing module 11 manages one or more of these sources with respect to a listener (definition of a virtual position of the sources with respect to this listener). It implements a room-effect processor to simulate reverberations or the like by applying the usual delays and/or filterings. The signals thus constructed are transmitted to a module 2a for spatial encoding of the elementary contributions of the sources.

In parallel, a natural sound pickup can be performed, in the context of a sound recording, by one or more microphones arranged in a chosen manner with respect to the real sources (module 1b). The signals picked up by the microphones are encoded by a module 2b. The acquired and encoded signals can be converted into an intermediate representation format (module 3b), before being mixed by the module 3 with the signals generated by the module 1a and encoded by the module 2a (originating from the virtual sources). The mixed signals are then transmitted, or stored on a medium, with a view to a later reproduction (arrow TR). They are then applied to a decoding module 5, for reproduction on a rendering device 6 comprising loudspeakers. Where appropriate, the decoding step 5 may be preceded by a step of manipulating the sound field, for example by rotation, by means of a processing module 4 provided upstream of the decoding module 5.

The rendering device may take the form of a multiplicity of loudspeakers, arranged for example on the surface of a sphere in a three-dimensional ("periphonic") configuration so as to provide, on reproduction, in particular a perception of the direction of the sound in three-dimensional space. To this end, a listener is generally placed at the centre of the sphere formed by the loudspeaker array, this centre corresponding to the point of auditory perception mentioned above. As a variant, the loudspeakers of the rendering device may be arranged in a plane (two-dimensional panoramic configuration), the loudspeakers being arranged in particular on a circle, the listener usually being placed at the centre of this circle. In another variant, the rendering device may take the form of a "surround" (5.1) type device. Finally, in an advantageous variant, the rendering device may take the form of a headset with two earphones for a binaural synthesis of the reproduced sound, which allows the listener to perceive a direction of the sources in three-dimensional space, as will be seen in detail below. Such a two-loudspeaker rendering device, for a perception in three-dimensional space, may also take the form of a transaural rendering device, with two loudspeakers arranged at a chosen distance from a listener.

Reference is now made to FIG. 2 in order to describe a spatial encoding and a decoding for a three-dimensional sound reproduction of elementary sound sources. The signal from each of the sources 1 to N is transmitted to a spatial encoding module 2, together with its position (real or virtual). The position may be defined both in terms of incidence (direction of the source as seen from the listener) and in terms of distance between this source and the listener. The plurality of signals thus encoded provides a multi-channel representation of a global sound field. The encoded signals are transmitted (arrow TR) to a sound rendering device 6, for a sound reproduction in three-dimensional space, as indicated above with reference to FIG. 1.

Reference is now made to FIG. 3 in order to describe below the ambisonic representation of an acoustic field by spherical harmonics in three-dimensional space. A region around an origin O (a sphere of radius R) free of any acoustic source is considered. A spherical coordinate system is adopted, in which each position vector from the origin O to a point of the sphere is described by an azimuth θ_r, an elevation δ_r and a radius r (corresponding to the distance to the origin O).

The pressure field p inside this sphere (at points of radius r < R, where R is the radius of the sphere) can be written in the frequency domain as a series whose terms are the weighted products of angular functions Y_mn^σ(θ, δ) and of radial functions j_m(kr), which thus depend on a propagation term in which k = 2πf/c, where f is the sound frequency and c is the speed of sound in the propagation medium.

The pressure field is then expressed by:

$$p(\vec r)\;=\;\sum_{m=0}^{\infty} j^{m}\, j_m(kr)\sum_{0\le n\le m,\;\sigma=\pm 1} B_{mn}^{\sigma}\, Y_{mn}^{\sigma\,(\mathrm{N3D})}(\theta_r,\delta_r)$$

The set of weighting factors B_mn^σ, which are implicitly functions of frequency, thus describes the pressure field in the region considered. For this reason, these factors are called "spherical harmonic components" and represent a frequency-domain expression of the sound (or of the pressure field) in the basis of spherical harmonics Y_mn^σ.

The angular functions are called "spherical harmonics" and are defined by:

$$Y_{mn}^{\sigma}(\theta,\delta)\;=\;\sqrt{(2m+1)\,(2-\delta_{0,n})\,\frac{(m-n)!}{(m+n)!}}\;\,P_{mn}(\sin\delta)\times\begin{cases}\cos n\theta & \text{if }\sigma=+1\\ \sin n\theta & \text{if }\sigma=-1\end{cases}$$

where P_mn(sin δ) are the associated Legendre functions of degree m and order n, and δ_{p,q} is the Kronecker symbol (equal to 1 if p = q and 0 otherwise).

The spherical harmonics form an orthonormal basis in which the scalar products between harmonic components and, more generally, between two functions F and G are respectively defined by:

$$\langle Y_{mn}^{\sigma}\,|\,Y_{m'n'}^{\sigma'}\rangle_{4\pi}=\delta_{mm'}\,\delta_{nn'}\,\delta_{\sigma\sigma'},\qquad\langle F\,|\,G\rangle_{4\pi}=\frac{1}{4\pi}\int_{4\pi}F(\theta,\delta)\,G(\theta,\delta)\,d\Omega(\theta,\delta)$$

The spherical harmonics are bounded real functions, as shown in FIG. 4, as a function of the order m and of the indices n and σ. The dark and light areas correspond respectively to the positive and negative values of the spherical harmonic functions. The higher the order m, the higher the angular frequency (and therefore the discrimination between functions). The radial functions j_m(kr) are spherical Bessel functions, whose modulus is illustrated for a few values of the order m in FIG. 5.

The ambisonic representation in a basis of spherical harmonics may be interpreted as follows: the ambisonic components of a same order m ultimately express "derivatives" or "moments" of order m of the pressure field in the vicinity of the origin O (the centre of the sphere shown in FIG. 3).

In particular, B_00^+1 = W describes the scalar magnitude of the pressure, while B_11^+1 = X, B_11^-1 = Y and B_10^+1 = Z are related to the pressure gradients (or to the particle velocity) at the origin O. These first four components W, X, Y and Z are obtained, in a natural sound pickup, with omnidirectional microphones (for the order-0 component W) and bidirectional microphones (for the three following components). By using a larger number of acoustic transducers, an appropriate processing, in particular by equalization, makes it possible to obtain further ambisonic components (higher orders m, greater than 1).

By taking into account additional components of higher order (greater than 1), and therefore by increasing the angular resolution of the ambisonic description, an approximation of the pressure field is obtained over a wider neighbourhood, relative to the wavelength of the sound wave, around the origin O. It will thus be understood that there is a close relationship between the angular resolution (the order of the spherical harmonics) and the radial range (radius r) that can be represented. In short, as one moves spatially away from the point of origin O of FIG. 3, the higher the number of ambisonic components (high order M), the better the representation of the sound by the set of these components; conversely, the ambisonic representation of the sound becomes less satisfactory as one moves away from the origin O, an effect which becomes critical in particular for high sound frequencies (of short wavelength). It is therefore advantageous to obtain as large a number of ambisonic components as possible, which makes it possible to create a region of space around the perception point in which the reproduction of the sound is faithful and whose dimensions increase with the total number of components.

An application to a system for encoding, transmitting and reproducing a spatialized sound is described below.

In practice, an ambisonic system takes into account a subset of the spherical harmonic components, as described above. A system is said to be of order M when it takes into account ambisonic components up to the index m = M. In the case of a reproduction by a loudspeaker rendering device, it will be understood that if these loudspeakers are arranged in a horizontal plane, only the harmonics of index m = n are exploited. On the other hand, when the rendering device comprises loudspeakers arranged on the surface of a sphere ("periphony"), as many harmonics can in principle be exploited as there are loudspeakers.
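As a worked count (it follows directly from the index ranges 0 ≤ n ≤ m and σ = ±1 used above, the σ = −1 harmonic vanishing for n = 0 since sin(0·θ) = 0; it is not a figure quoted from the text), an order-M periphonic system uses

$$\sum_{m=0}^{M}(2m+1)=(M+1)^{2}$$

components, while its horizontal-only subset (m = n) uses 2M + 1 components; for M = 1 this gives the four components W, X, Y and Z introduced above.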

The reference S denotes the pressure signal carried by a plane wave and picked up at the point O corresponding to the centre of the sphere of FIG. 3 (the origin of the spherical coordinate system). The incidence of the wave is described by the azimuth θ and the elevation δ. The expression of the components of the field associated with this plane wave is given by the relation:

$$B_{mn}^{\sigma}=S\cdot Y_{mn}^{\sigma}(\theta,\delta)\qquad\text{[A3]}$$

To encode (simulate) a near-field source at a distance ρ from the origin O, a filter F_m^(ρ/c) is applied in order to "curve" the shape of the wavefronts, considering that a near-field source emits, to a first approximation, a spherical wave. The encoded components of the field become:

$$B_{mn}^{\sigma}=S\cdot F_{m}^{(\rho/c)}(\omega)\,Y_{mn}^{\sigma}(\theta,\delta)\qquad\text{[A4]}$$

and the expression of the aforementioned filter F_m^(ρ/c) is given by the relation:

$$F_{m}^{(\rho/c)}(\omega)\;=\;\sum_{n=0}^{m}\frac{(m+n)!}{(m-n)!\,n!}\left(\frac{2j\omega\rho}{c}\right)^{-n}\qquad\text{[A5]}$$

where ω = 2πf is the angular frequency of the wave, f being the frequency of the sound.
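A small numerical sketch of relation [A5] (the function name, the distance ρ = 1 m and the speed of sound c = 340 m/s are illustrative choices, not values taken from the text), evaluating the filter directly from the polynomial in c/(2jωρ):

```python
import numpy as np
from math import factorial

def F_m(m, rho, omega, c=340.0):
    """Near-field filter of relation [A5]: degree-m polynomial in c/(2j*omega*rho)."""
    x = c / (2j * omega * rho)
    return sum(factorial(m + n) / (factorial(m - n) * factorial(n)) * x**n
               for n in range(m + 1))

rho = 1.0                          # first distance (m), illustrative
for f in (20.0, 200.0, 2000.0):    # sound frequencies (Hz)
    omega = 2 * np.pi * f
    print(f, [round(abs(F_m(m, rho, omega)), 2) for m in range(4)])
# The magnitudes grow without bound as f decreases, and faster for higher m:
# this is the low-frequency divergence discussed in the following paragraphs.
```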

These last two relations, [A4] and [A5], finally show that, both for a (simulated) virtual source and for a real near-field source, the components of the sound in the ambisonic representation are expressed mathematically (in particular analytically) in the form of a polynomial, here a Bessel polynomial, of degree m whose variable c/(2jωρ) is inversely proportional to the sound frequency.

Thus, it will be understood that:

  • in the case of a plane wave, the encoding produces signals which differ from the original signal only by a real, finite gain, which corresponds to a purely directional encoding (relation [A3]);
  • in the case of a spherical wave (near-field source), the additional filter F_m^(ρ/c)(ω) encodes the distance information by introducing, into the expression of the ambisonic components, complex amplitude ratios which depend on frequency, as expressed in relation [A5].

It should be noted that this additional filter is of the "integrator" type, with an amplification effect which increases and diverges (is unbounded) as the sound frequencies decrease towards zero. FIG. 6 shows, for each order m, an increase of the gain at low frequencies (here for the first distance ρ = 1 m). These filters are therefore unstable and divergent when one attempts to apply them to arbitrary audio signals, and this divergence is all the more critical for high values of the order m.

It will be understood in particular, from relations [A3], [A4] and [A5], that the modelling of a virtual source in the near field yields ambisonic components which diverge at low frequencies, in a particularly critical manner for high orders m, as shown in FIG. 6. This divergence, at low frequencies, corresponds to the "bass boost" phenomenon mentioned above. It also appears in sound acquisition, for real sources.

For this reason in particular, the ambisonic approach, especially for high orders m, has not found, in the state of the art, any concrete application (other than theoretical) in sound processing.

It will be understood in particular that a compensation of the near field is necessary in order to respect, on reproduction, the shape of the wavefronts encoded in the ambisonic representation. Referring to FIG. 7, a rendering device comprises a plurality of loudspeakers HPi, arranged, in the example described, at a same distance R from a point of auditory perception P. In FIG. 7:

  • each point where a loudspeaker HPi is located corresponds to a reproduction point as mentioned above,
  • the point P is the point of auditory perception mentioned above,
  • these points are separated by the second distance R mentioned above,
while in FIG. 3 described above:
  • the point O corresponds to the reference point, mentioned above, which forms the origin of the basis of spherical harmonics,
  • the point M corresponds to the position of a source (real or virtual) located at the first distance ρ, mentioned above, from the reference point O.

According to the invention, a pre-compensation of the near field is introduced at the encoding stage itself, this compensation involving filters of the analytical form 1/F_m^(R/c)(ω), which are applied to the aforementioned ambisonic components B_mn^σ.

According to one of the advantages provided by the invention, the amplification F_m^(ρ/c)(ω), whose effect appears in FIG. 6, is compensated by the attenuation of the filter 1/F_m^(R/c)(ω) applied as early as the encoding. In particular, the coefficients of this compensation filter 1/F_m^(R/c)(ω) increase with the frequency of the sound and, in particular, tend towards zero at low frequencies. Advantageously, this pre-compensation, performed as early as the encoding, ensures that the transmitted data do not diverge at low frequencies.

To indicate the physical meaning of the distance R involved in the compensation filter, consider, by way of illustration, an initial, real plane wave at the acquisition of the sound signals. To simulate a near-field effect for this distant source, the first filter of relation [A5] is applied, as indicated in relation [A4]. The distance ρ then represents a distance between a nearby virtual source M and the point O representing the origin of the spherical basis of FIG. 3; a first near-field simulation filter is thus applied in order to simulate the presence of a virtual source at the distance ρ described above. Nevertheless, on the one hand, as indicated above, the coefficients of this filter diverge at low frequencies (FIG. 6) and, on the other hand, the aforementioned distance ρ will not necessarily represent the distance between the loudspeakers of a rendering device and a perception point P (FIG. 7). According to the invention, a pre-compensation involving a filter of the type 1/F_m^(R/c)(ω) is applied at the encoding, as indicated above, which makes it possible, on the one hand, to transmit bounded signals and, on the other hand, to choose the distance R, as early as the encoding, for the reproduction of the sound by the loudspeakers HPi, as represented in FIG. 7. In particular, it will be understood that if a virtual source placed at the distance ρ from the origin O has been simulated at the acquisition, then, on reproduction (FIG. 7), a listener placed at the point P of auditory perception (at a distance R from the loudspeakers HPi) will perceive the presence of a sound source S placed at the distance ρ from the perception point P, which corresponds to the virtual source simulated at the acquisition.

Thus, the pre-compensation of the near field of the loudspeakers (placed at the distance R), at the encoding stage, can be combined with a simulated near-field effect of a virtual source placed at a distance ρ. The encoding finally involves a total filter resulting, on the one hand, from the near-field simulation and, on the other hand, from the near-field compensation, the coefficients of this filter being expressible analytically by the relation:

H_m^{NFC}(ρ/c, R/c)(ω) = F_m^{(ρ/c)}(ω) / F_m^{(R/c)}(ω)      [A11]

The total filter given by relation [A11] is stable and constitutes the "distance encoding" part of the spatial ambisonic encoding according to the invention, as represented in figure 8. The coefficients of these filters correspond to monotonic transfer functions of frequency, which tend towards the value 1 at high frequencies and towards the value (R/ρ)^m at low frequencies. Referring to figure 9, the energy spectra of the filters H_m^{NFC}(ρ/c, R/c)(ω) reflect the amplification of the encoded components due to the near-field effect of the virtual source (placed here at a distance ρ = 1 m), with pre-compensation of the field of the loudspeakers (placed at a distance R = 1.5 m). The amplification in decibels is therefore positive when ρ < R (case of figure 9) and negative when ρ > R (case of figure 10, where ρ = 3 m and R = 1.5 m). In a spatialized rendering device, the distance R between an auditory perception point and the loudspeakers HP_i is indeed of the order of one or a few metres.
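As a numerical cross-check of figures 9 and 10 (an illustration added here, not part of the patent text), the magnitude of the total filter can be evaluated directly from the series expression of F_m (given further below as relation [A13]) with p = jω. A minimal sketch in Python, assuming the same parameters ρ = 1 m and R = 1.5 m:

```python
import numpy as np
from math import factorial

def F(m, tau, w):
    """F_m^(tau)(omega), evaluated from the series of relation [A13] with p = j*omega."""
    p = 1j * w
    return sum(factorial(m + n) / (factorial(m - n) * factorial(n)) * (2 * tau * p) ** (-n)
               for n in range(m + 1))

c = 340.0                                   # speed of sound (m/s)
m, rho, R = 2, 1.0, 1.5                     # order, virtual-source distance, loudspeaker distance
w = 2 * np.pi * np.logspace(1, 4, 200)      # 10 Hz to 10 kHz
H = F(m, rho / c, w) / F(m, R / c, w)       # total filter of relation [A11]
gain_db = 20 * np.log10(np.abs(H))

print(round(gain_db[0], 2))     # close to 20*m*log10(R/rho), i.e. about 7.04 dB, at low frequency
print(round(gain_db[-1], 2))    # close to 0 dB at high frequency
print(bool(np.all(gain_db >= 0)))   # True: rho < R, so the amplification is positive (figure 9)
```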

Referring again to figure 8, it will be understood that, in addition to the usual directional parameters θ and δ, information on the distances involved at encoding is transmitted. Thus, the angular functions corresponding to the spherical harmonics Y_mn^σ(θ, δ) are retained for the directional encoding.

However, within the meaning of the present invention, total filters (near-field compensation and, where appropriate, simulation of a near field) H_m^{NFC}(ρ/c, R/c)(ω) are furthermore provided and applied to the ambisonic components, according to their order m, in order to perform the distance encoding, as shown in figure 8. An embodiment of these filters in the digital audio domain is described in detail further below.

It will be noted in particular that these filters can be applied as early as the distance encoding (r) and even before the directional encoding (θ, δ). It will thus be understood that steps a) and b) above can be combined into one and the same global step, or even interchanged (with distance encoding and compensation filtering followed by directional encoding). The method according to the invention is therefore not limited to an implementation of steps a) and b) that is successive in time.

Figure 11A shows a visualization (viewed from above) of the reconstruction, with compensation, of the near field of a spherical wave in the horizontal plane (with the same distance parameters as in figure 9), for a system of total order M = 15 and restitution over 32 loudspeakers. Figure 11B shows the propagation of the initial sound wave from a near-field source located at a distance ρ from a point of the acquisition space which corresponds, in the restitution space, to the auditory perception point P of figure 7. It can be seen in figure 11A that the listeners (symbolized by schematic heads) can locate the virtual source at one and the same geographical location, situated at the distance ρ from the perception point P of figure 11B.

It is thus verified that the shape of the encoded wavefront is preserved after decoding and restitution. However, appreciable interference can be observed to the right of the point P in figure 11A, due to the fact that the number of loudspeakers (and hence of ambisonic components taken into account) is not sufficient to reconstruct the wavefront perfectly over the whole area bounded by the loudspeakers.

In what follows, the derivation of a digital audio filter for implementing the method within the meaning of the invention is described by way of example.

As indicated above, if a near-field effect compensated from the encoding stage is to be simulated, a filter of the following form is applied to the ambisonic components of the sound:

H_m^{NFC}(ρ/c, R/c)(ω) = F_m^{(ρ/c)}(ω) / F_m^{(R/c)}(ω)      [A11]

From the expression of the near-field simulation given by relation [A5], it appears that, for distant sources (ρ = ∞), relation [A11] simply becomes:

1 / F_m^{(R/c)}(ω) = H_m^{NFC}(∞, R/c)(ω)      [A12]

It thus appears from this last relation [A12] that the case where the source to be simulated emits in the far field (distant source) is merely a particular case of the general expression of the filter formulated in relation [A11].

In the field of digital audio processing, an advantageous method for defining a digital filter from the analytical expression of that filter in the continuous-time analogue domain is the "bilinear transform".

Relation [A5] is first expressed in the form of a Laplace transform, which corresponds to:

F_m^{(τ)}(p) = Σ_{n=0..m} [ (m+n)! / ((m-n)! n!) ] (2τp)^{-n}      [A13]

where τ = ρ/c (c being the speed of sound in the medium, typically 340 m/s in air).

The bilinear transform consists in expressing relation [A11], for a sampling frequency f_s, in the form:

H_m(z) = Π_{q=1..⌊m/2⌋} [ (b_0^q + b_1^q z^{-1} + b_2^q z^{-2}) / (a_0^q + a_1^q z^{-1} + a_2^q z^{-2}) ] × (b_0^{(m+1)/2} + b_1^{(m+1)/2} z^{-1}) / (a_0^{(m+1)/2} + a_1^{(m+1)/2} z^{-1})   if m is odd, and

H_m(z) = Π_{q=1..m/2} [ (b_0^q + b_1^q z^{-1} + b_2^q z^{-2}) / (a_0^q + a_1^q z^{-1} + a_2^q z^{-2}) ]   if m is even,      [A14]

where z is defined, with respect to the previous relation [A13], by p = 2 f_s (1 - z^{-1}) / (1 + z^{-1}), and with:

x_0^q = 1 - 2 Re(X_{m,q})/α + |X_{m,q}|²/α² ,   x_1^q = -2 (1 - |X_{m,q}|²/α²) ,   x_2^q = 1 + 2 Re(X_{m,q})/α + |X_{m,q}|²/α² ,

x_0^{(m+1)/2} = 1 - X_{m,q}/α   and   x_1^{(m+1)/2} = -(1 + X_{m,q}/α) ,

where α = 4 f_s R/c for x = a and α = 4 f_s ρ/c for x = b.

X_{m,q} are the successive roots (q = 1, ..., m) of the Bessel polynomial:

F_m(X) = Σ_{n=0..m} [ (m+n)! / ((m-n)! n!) ] X^{m-n} = Π_{q=1..m} (X - X_{m,q})

They are given in Table 1 below, for different orders m, in the form of their real part and their modulus (separated by a comma), and of their (real) value when m is odd.

Table 1: values Re[X_{m,q}], |X_{m,q}| (and Re[X_{m,m}] when m is odd) of a Bessel polynomial, computed with the MATLAB© calculation software.
m=1: -2.0000000000
m=2: -3.0000000000, 3.4641016151
m=3: -3.6778146454, 5.0830828022 ; -4.6443707093
m=4: -4.2075787944, 6.7787315854 ; -5.7924212056, 6.0465298776
m=5: -4.6493486064, 8.5220456027 ; -6.7039127983, 7.5557873219 ; -7.2934771907
m=6: -5.0318644956, 10.2983543043 ; -7.4714167127, 9.1329783045 ; -8.4967187917, 8.6720541026
m=7: -5.3713537579, 12.0990553610 ; -8.1402783273, 10.7585400670 ; -9.5165810563, 10.1324122997 ; -9.9435737171
m=8: -5.6779678978, 13.9186233016 ; -8.7365784344, 12.4208298072 ; -10.4096815813, 11.6507064310 ; -11.1757720865, 11.3096817388
m=9: -5.9585215964, 15.7532774523 ; -9.2768797744, 14.1121936859 ; -11.2088436390, 13.2131216226 ; -12.2587358086, 12.7419414392 ; -12.5940383634
m=10: -6.2178324673, 17.6003068759 ; -9.7724391337, 15.8272658299 ; -11.9350566572, 14.8106929213 ; -13.2305819310, 14.2242555605 ; -13.8440898109, 13.9524261065
m=11: -6.4594441798, 19.4576958063 ; -10.2312965678, 17.5621095176 ; -12.6026749098, 16.4371594915 ; -14.1157847751, 15.7463731900 ; -14.9684597220, 15.3663558234 ; -15.2446796908
m=12: -6.6860466156, 21.3239012076 ; -10.6594171817, 19.3137363168 ; -13.2220085001, 18.0879209819 ; -14.9311424804, 17.3012295772 ; -15.9945411996, 16.8242165032 ; -16.5068440226, 16.5978151615
m=13: -6.8997344413, 23.1977134580 ; -11.0613619668, 21.0798161546 ; -13.8007456514, 19.7594692366 ; -15.6887605582, 18.8836767359 ; -16.9411835315, 18.3181073534 ; -17.6605041890, 17.9988179873 ; -17.8954193236
m=14: -7.1021737668, 25.0781652657 ; -11.4407047669, 22.8584924996 ; -14.3447919297, 21.4490520815 ; -16.3976939224, 20.4898067617 ; -17.8220011429, 19.8423306934 ; -18.7262916698, 19.4389130000 ; -19.1663428016, 19.2447495545
m=15: -7.2947137247, 26.9644699653 ; -11.8003034312, 24.6482592959 ; -14.8587939669, 23.1544615283 ; -17.0649181370, 22.1165594535 ; -18.6471986915, 21.3925954403 ; -19.7191341042, 20.9118275261 ; -20.3418287818, 20.6361378957 ; -20.5462183256
m=16: -7.4784635949, 28.8559784487 ; -12.1424827551, 26.4478760957 ; -15.3464816324, 24.8738935490 ; -17.6959363478, 23.7614799683 ; -19.4246523327, 22.9655586516 ; -20.6502404436, 22.4128776078 ; -21.4379698156, 22.0627133056 ; -21.8237730778, 21.8926662470
m=17: -7.6543475694, 30.7521483222 ; -12.4691619784, 28.2563077987 ; -15.8108990691, 26.6058519104 ; -18.2951775164, 25.4225585034 ; -20.1605894729, 24.5585534450 ; -21.5282660840, 23.9384287933 ; -22.4668764601, 23.5193877036 ; -23.0161527444, 23.2766166711 ; -23.1970582109
m=18: -7.8231445835, 32.6525213363 ; -12.7819455282, 30.0726807554 ; -16.2545681590, 28.3490792784 ; -18.8662638563, 27.0981271991 ; -20.8600257104, 26.1693913642 ; -22.3600808236, 25.4856138632 ; -23.4378933084, 25.0022244227 ; -24.1362741870, 24.6925542646 ; -24.4798038436, 24.5412441597
m=19: -7.9855178345, 34.5567065132 ; -13.0821901901, 31.8962504142 ; -16.6796008200, 30.1025072510 ; -19.4122071436, 28.7867778706 ; -21.5270719955, 27.7962699865 ; -23.1512112785, 27.0520753105 ; -24.3584393996, 26.5081174988 ; -25.1941793616, 26.1363057951 ; -25.6855663388, 25.9191817486 ; -25.8480312755
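The values of Table 1 can be reproduced numerically. A small sketch (added for illustration, not part of the patent text) builds the coefficients of the Bessel polynomial F_m(X) defined above and extracts its roots with NumPy; their real parts and moduli can be compared with the row m = 4 of Table 1:

```python
import numpy as np
from math import factorial

m = 4
# Coefficients of F_m(X) = sum_n (m+n)!/((m-n)! n!) X^(m-n), highest degree first.
coeffs = [factorial(m + n) // (factorial(m - n) * factorial(n)) for n in range(m + 1)]
for X in np.roots(coeffs):
    if X.imag >= 0:   # one representative per conjugate pair
        print(round(X.real, 10), round(abs(X), 10))
# Expected from Table 1 (m = 4): -4.2075787944, 6.7787315854 and -5.7924212056, 6.0465298776
```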

The digital filters are thus implemented, from the values of Table 1, by providing cascades of second-order cells (for even m), plus one additional first-order cell (for odd m), using the relations [A14] given above.
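For illustration (this sketch is not part of the patent text), the cascade of relation [A14] can be assembled programmatically: the roots X_{m,q} are recomputed rather than read from Table 1, each conjugate pair yields one second-order cell and, for odd m, the remaining real root yields the additional first-order cell. The helper names below are hypothetical, and scipy.signal.sosfilt is assumed for running the cascade on a signal:

```python
import numpy as np
from math import factorial
from scipy.signal import sosfilt

def bessel_roots(m):
    # Roots X_{m,q} of F_m(X) = sum_n (m+n)!/((m-n)! n!) X^(m-n)  (cf. Table 1).
    coeffs = [factorial(m + n) / (factorial(m - n) * factorial(n)) for n in range(m + 1)]
    return np.roots(coeffs)

def nfc_filter_sos(m, rho, R, fs, c=340.0):
    """Second-order sections of H_m^NFC(rho/c, R/c) obtained via the bilinear transform [A14]."""
    alpha_b = 4.0 * fs * rho / c      # alpha used for the numerator coefficients (x = b)
    alpha_a = 4.0 * fs * R / c        # alpha used for the denominator coefficients (x = a)

    def cell2(X, alpha):              # second-order cell built from one conjugate pair X, conj(X)
        return [1 - 2 * X.real / alpha + abs(X) ** 2 / alpha ** 2,
                -2 * (1 - abs(X) ** 2 / alpha ** 2),
                1 + 2 * X.real / alpha + abs(X) ** 2 / alpha ** 2]

    def cell1(X, alpha):              # first-order cell (real root, odd m), padded to biquad form
        return [1 - X / alpha, -(1 + X / alpha), 0.0]

    sections = []
    for X in bessel_roots(m):
        if X.imag > 1e-8:             # keep one root per conjugate pair
            sections.append(cell2(X, alpha_b) + cell2(X, alpha_a))
        elif abs(X.imag) <= 1e-8:     # real root, present only when m is odd
            sections.append(cell1(X.real, alpha_b) + cell1(X.real, alpha_a))
    sos = np.array(sections)
    return sos / sos[:, 3:4]          # normalize each section so that a0 = 1

# Example: impulse response of the order-2 filter, virtual source at 1 m, loudspeakers at 1.5 m.
x = np.zeros(64); x[0] = 1.0
y = sosfilt(nfc_filter_sos(2, rho=1.0, R=1.5, fs=48000.0), x)
```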

Digital filters are thus produced in an infinite impulse response (IIR) form, which is easily parameterizable as shown above. It should be noted that an implementation in a finite impulse response (FIR) form can also be envisaged: the complex spectrum of the transfer function is computed from the analytical formula, a finite impulse response is then deduced from it by inverse Fourier transform, and a convolution is finally applied for the filtering.
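A minimal sketch of this finite impulse response alternative (again an illustration outside the patent text, assuming NumPy): the complex spectrum of H_m^{NFC} is sampled from the analytical formula on an FFT grid and an inverse Fourier transform yields a finite impulse response to be used by convolution:

```python
import numpy as np
from math import factorial

def F(m, tau, w):
    p = 1j * w
    return sum(factorial(m + n) / (factorial(m - n) * factorial(n)) * (2 * tau * p) ** (-n)
               for n in range(m + 1))

fs, N, c = 48000.0, 1024, 340.0
m, rho, R = 2, 1.0, 1.5
f = np.fft.rfftfreq(N, d=1.0 / fs)
w = 2 * np.pi * np.maximum(f, f[1])       # avoid the singular point at 0 Hz
H = F(m, rho / c, w) / F(m, R / c, w)     # sampled complex spectrum of H_m^NFC
h = np.fft.irfft(H, n=N)                  # finite impulse response of length N
# y = np.convolve(x, h)                   # filtering of a component x by convolution
```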

Thus, by introducing this near-field pre-compensation at encoding, a modified ambisonic representation is defined (figure 8), adopting as the transmissible representation the signals expressed in the frequency domain in the form:

B_mn^{σ(R/c)} = [ 1 / F_m^{(R/c)}(ω) ] B_mn^σ

As indicated above, R is a reference distance with which a compensated near-field effect is associated, and c is the speed of sound (typically 340 m/s in air). This modified ambisonic representation has the same scalability properties (schematically represented by the transmitted data shown circled near the arrow TR of figure 1) and obeys the same field rotation transformations (module 4 of figure 1) as the usual ambisonic representation.

The operations to be carried out for decoding the received ambisonic signals are indicated below.

It is first pointed out that the decoding operation can be adapted to any rendering device of radius R_2 different from the reference distance R above. To this end, filters of the type H_m^{NFC}(ρ/c, R/c)(ω), as described above, are applied, but with the distance parameters R and R_2 instead of ρ and R. In particular, it should be noted that only the parameter R/c needs to be stored (and/or transmitted) between encoding and decoding.

Referring to figure 12, the filtering module shown there is provided, for example, in a processing unit of a rendering device. The ambisonic components received have been pre-compensated at encoding for a reference distance R_1 as second distance. However, the rendering device comprises a plurality of loudspeakers arranged at a third distance R_2 from an auditory perception point P, this third distance R_2 being different from the aforementioned second distance R_1. The filtering module of figure 12, of the form H_m^{NFC}(R_1/c, R_2/c)(ω), then adapts, on reception of the data, the pre-compensation made for the distance R_1 to a restitution at the distance R_2. Of course, as indicated above, the rendering device also receives the parameter R_1/c.

It should be noted that the invention also makes it possible to mix several ambisonic representations of sound fields (real and/or virtual sources) whose reference distances R differ (where appropriate with infinite reference distances, corresponding to distant sources). Preferably, all these sources are pre-compensation filtered to the smallest reference distance before the ambisonic signals are mixed, which allows a correct definition of the sound relief to be obtained at restitution.

In the context of a so-called "sound focusing" processing with, at restitution, a sound enrichment effect in a chosen direction of space (in the manner of a light projector illuminating a chosen direction in optics), involving a matrix sound-focusing processing (with weighting of the ambisonic components), the distance encoding with near-field pre-compensation is advantageously applied in combination with the focusing processing.

In what follows, an ambisonic decoding method with compensation, at restitution, of the near field of the loudspeakers is described.

To reconstruct an acoustic field encoded according to the ambisonic formalism from the components B_mn^σ, using the loudspeakers of a rendering device which provides an "ideal" listener location corresponding to the restitution point P of figure 7, the wave emitted by each loudspeaker is defined by a prior processing of "re-encoding" of the ambisonic field at the centre of the rendering device, as follows.

In this "re-encoding" context, it is first considered, for simplification, that the sources emit in the far field.

Referring again to figure 7, the loudspeaker of index i and of incidence (θ_i, δ_i) is fed by a signal S_i. This loudspeaker participates in the reconstruction of the component B'_mn^σ through its contribution S_i · Y_mn^σ(θ_i, δ_i).

The vector c_i of the encoding coefficients associated with the loudspeaker of index i is expressed by the relation:

c_i = [ Y_00^{+1}(θ_i, δ_i)   Y_11^{+1}(θ_i, δ_i)   Y_11^{-1}(θ_i, δ_i)   ...   Y_mn^σ(θ_i, δ_i) ]^T      [B1]

The vector S of the signals feeding the set of N loudspeakers is given by the expression:

S = [ S_1   S_2   ...   S_N ]^T      [B2]

The encoding matrix of these N loudspeakers (which ultimately corresponds to a "re-encoding" matrix) is expressed by the relation:

C = [ c_1   c_2   ...   c_N ]      [B3]

where each term c_i represents a vector according to relation [B1] above.

Thus, the reconstruction of the ambisonic field B' is defined by the relation:

B' = [ B'_00^{+1}   B'_11^{+1}   B'_11^{-1}   ...   B'_mn^σ   ... ]^T = C · S      [B4]

Relation [B4] thus defines a re-encoding operation, prior to restitution. Finally, the decoding as such consists in comparing the original ambisonic signals, received by the rendering device in the form:

B = [ B_00^{+1}   B_11^{+1}   B_11^{-1}   ...   B_mn^σ   ... ]^T

with the re-encoded signals B', so as to define the general relation:

B' = B

It is a matter, in particular, of determining the coefficients of a decoding matrix D which satisfies the relation:

S = D · B      [B7]

Preferably, the number of loudspeakers is greater than or equal to the number of ambisonic components to be decoded, and the decoding matrix D is then expressed, as a function of the re-encoding matrix C, in the form:

D = C^T · (C · C^T)^{-1}      [B8]

where the notation C^T denotes the transpose of the matrix C.
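Purely as an illustration of relation [B8] (not part of the patent text), the sketch below builds a re-encoding matrix C for a regular horizontal ring of loudspeakers, restricted for simplicity to the first-order horizontal components W, X and Y with an assumed normalization convention (the exact Y_mn^σ definitions are those chosen earlier in the description), and derives the decoding matrix D as the corresponding pseudo-inverse:

```python
import numpy as np

def encoding_vector(theta):
    # First-order horizontal directivities (W, X, Y) for an incidence theta;
    # the normalization convention is assumed here for illustration only.
    return np.array([1.0, np.cos(theta), np.sin(theta)])

# Re-encoding matrix C = [c_1 ... c_N] for N = 8 regularly spaced loudspeakers, relation [B3].
azimuths = 2 * np.pi * np.arange(8) / 8
C = np.column_stack([encoding_vector(t) for t in azimuths])

# Decoding matrix D = C^T (C C^T)^(-1), relation [B8].
D = C.T @ np.linalg.inv(C @ C.T)

# Loudspeaker feeds are then obtained as S = D @ B, where B gathers the
# (possibly near-field pre-compensated) ambisonic components W, X, Y.
```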

It should be noted that a decoding satisfying different criteria in different frequency bands can also be defined, which makes it possible to offer a restitution optimized according to the listening conditions, in particular as regards the constraint of positioning at the centre O of the sphere of figure 3 during restitution. To this end, a simple filtering, as a stepwise frequency equalization, is advantageously applied to each ambisonic component.

However, to obtain a reconstruction of an originally encoded wave, the far-field hypothesis must be corrected for the loudspeakers, that is to say the effect of their near field must be expressed in the re-encoding matrix C above, and this new system must be inverted in order to define the decoder. To this end, assuming concentricity of the loudspeakers (arranged at one and the same distance R from the point P of figure 7), all the loudspeakers have the same near-field effect F_m^{(R/c)}(ω) on each ambisonic component of the type B'_mn^σ. By introducing the near-field terms in the form of a diagonal matrix, relation [B4] above becomes:

B' = Diag( [ 1   F_1^{(R/c)}(ω)   F_1^{(R/c)}(ω)   ...   F_m^{(R/c)}(ω)   F_m^{(R/c)}(ω)   ... ] ) · C · S

Relation [B7] above then becomes:

S = D · Diag( [ 1   1/F_1^{(R/c)}(ω)   1/F_1^{(R/c)}(ω)   ...   1/F_m^{(R/c)}(ω)   1/F_m^{(R/c)}(ω)   ... ] ) · B

Thus, the matrixing operation is preceded by a filtering operation which compensates for the near field on each component B_mn^σ, and which can be implemented in digital form as described above with reference to relation [A14].

It will be noted that, in practice, the "re-encoding" matrix C is specific to the rendering device. Its coefficients can be determined initially by parameterization and by sound characterization of the rendering device reacting to a predetermined excitation. The decoding matrix D is likewise specific to the rendering device, and its coefficients can be determined from relation [B8]. Using the preceding notation, where B̃ is the matrix of the pre-compensated ambisonic components, these components can be transmitted to the rendering device in matrix form with:

B̃ = Diag( [ 1   1/F_1^{(R/c)}(ω)   1/F_1^{(R/c)}(ω)   ...   1/F_m^{(R/c)}(ω)   1/F_m^{(R/c)}(ω)   ... ] ) · B

The rendering device then decodes the data received in matrix form B̃ (column vector of the transmitted components) by applying the decoding matrix D to the pre-compensated ambisonic components, so as to form the signals S_i intended to feed the loudspeakers HP_i, with:

S = [ S_1   ...   S_i   ...   S_N ]^T = D · B̃      [B11]

Referring again to figure 12, if a decoding operation has to be adapted to a rendering device of radius R_2 different from the reference distance R_1, an adaptation module upstream of the decoding proper, as described above, makes it possible to filter each ambisonic component B̃_mn^σ so as to adapt it to a rendering device of radius R_2. The decoding operation itself is then carried out as described above, with reference to relation [B11].

An application of the invention to binaural synthesis is described below.

Reference is made to figure 13A, which shows a listener wearing the two-earpiece headset of a binaural synthesis device. The listener's two ears are located at respective points O_L (left ear) and O_R (right ear) of space. The centre of the listener's head is located at the point O and the radius of the listener's head has the value a. A sound source is to be perceived auditorily at a point M of space, located at a distance r from the centre of the listener's head (and respectively at distances r_R from the right ear and r_L from the left ear). The direction of the source placed at the point M is moreover defined by the vectors r, r_R and r_L.

In general terms, binaural synthesis is defined as follows.

Each listener has an ear shape of his or her own. The perception of a sound in space by a given listener is learnt, from birth, as a function of the shape of the ears (in particular the shape of the pinnae and the dimensions of the head) specific to that listener. This perception of a sound in space is manifested, among other things, by the fact that the sound reaches one ear before the other, which results in a delay τ between the signals to be emitted by the two earpieces of the rendering device applying the binaural synthesis.

The rendering device is parameterized initially, for one and the same listener, by sweeping a sound source around the listener's head at a constant distance R from the centre of the head. It will thus be understood that this distance R can be regarded as a distance between a "restitution point", as stated above, and an auditory perception point (here the centre O of the listener's head).

In what follows, the index L is associated with the signal to be reproduced by the earpiece against the left ear, and the index R with the signal to be reproduced by the earpiece against the right ear. Referring to figure 13B, a delay is applied to the initial signal S on each channel intended to produce a signal for a separate earpiece. These delays τ_L and τ_R are a function of a maximum delay τ_MAX, which corresponds here to the ratio a/c where a, as indicated previously, is the radius of the listener's head and c the speed of sound. In particular, these delays are defined as a function of the difference between the distance from the point O (centre of the head) to the point M (position of the source whose sound is to be rendered, in figure 13A) and the distance from each ear to this point M. Advantageously, respective gains g_L and g_R are moreover applied to each channel; these are a function of the ratio between the distance from the point O to the point M and the distance from each ear to the point M. Respective modules 2_L and 2_R then encode the signals of each channel in an ambisonic representation, with near-field pre-compensation NFC ("Near Field Compensation") within the meaning of the present invention.
It will thus be understood that, by implementing the method within the meaning of the present invention, the signals originating from the source M can be defined not only by their direction (azimuth angles θ_L and θ_R and elevation angles δ_L and δ_R), but also as a function of the distances r_L and r_R separating each ear from the source M. The signals thus encoded are transmitted to the rendering device, which comprises ambisonic decoding modules 5_L and 5_R for each channel. Ambisonic encoding/decoding with near-field compensation is thus applied, in duplicated form, to each channel (left earpiece, right earpiece) of the restitution with binaural synthesis (here of "B-FORMAT" type). The near-field compensation is carried out, for each channel, with, as first distance ρ, the distance r_L or r_R between each ear and the position M of the sound source to be rendered.
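The per-channel geometry of figure 13B can be sketched as follows (an illustration outside the patent text; the exact delay and gain laws are only described qualitatively above, so delay = (r_ear - min(r_L, r_R))/c and gain = r/r_ear are one plausible reading of the stated "difference" and "ratio" of distances, and the function name is hypothetical):

```python
import numpy as np

def binaural_parameters(source, head_radius=0.0875, c=340.0):
    """Per-ear distance, delay and gain for the pre-processing of figure 13B (illustrative)."""
    M = np.asarray(source, dtype=float)                # source position M in a head-centred frame (metres)
    ears = {"L": np.array([0.0, +head_radius, 0.0]),   # y axis towards the left ear
            "R": np.array([0.0, -head_radius, 0.0])}
    r = np.linalg.norm(M)                                    # distance O-M
    d = {k: np.linalg.norm(M - e) for k, e in ears.items()}  # distances r_L and r_R
    nearest = min(d.values())
    return {k: {"rho": d[k],                     # first distance fed to the NFC encoding filter
                "delay": (d[k] - nearest) / c,   # relative delay between the two channels
                "gain": r / d[k]}                # ratio of the distances O-M and ear-M
            for k in d}

# Example: source 1 m away, 40 degrees towards the left of the listener.
params = binaural_parameters([np.cos(np.radians(40)), np.sin(np.radians(40)), 0.0])
```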

An application of the compensation within the meaning of the invention to the context of sound acquisition in ambisonic representation is described below.

Reference is made to figure 14, in which a microphone 141 comprises a plurality of transducer capsules capable of picking up acoustic pressures and delivering electrical signals S_1, ..., S_N. The capsules CAP_i are arranged on a sphere of predetermined radius r (here a rigid sphere, such as a table-tennis ball, for example). The capsules are spaced at a regular pitch over the sphere. In practice, the number N of capsules is chosen as a function of the order M desired for the ambisonic representation.

It is indicated below, in the context of a microphone comprising capsules arranged on a rigid sphere, how the near-field effect can be compensated from the encoding stage in the ambisonic context. It will thus be shown that the near-field pre-compensation can be applied not only to the simulation of a virtual source, as indicated above, but also to acquisition and, more generally, by combining the near-field pre-compensation with all types of processing involving an ambisonic representation.

In the presence of a rigid sphere (liable to introduce diffraction of the received sound waves), relation [A1] given above becomes:

p_r(u_i) = Σ_{m=0..∞} [ j^{m-1} / ( (kr)² h_m^{-'}(kr) ) ] Σ_{0≤n≤m, σ=±1} B_mn^σ Y_mn^σ(u_i)

The derivatives of the spherical Hankel functions h_m^- obey the recurrence law:

(2m+1) h_m^{-'}(x) = m h_{m-1}^-(x) - (m+1) h_{m+1}^-(x)

The ambisonic components B_mn^σ of the initial field are deduced from the pressure field at the surface of the sphere by carrying out projection and equalization operations given by the relation:

B_mn^σ = EQ_m ⟨p_r | Y_mn^σ⟩_{4π}

In this expression, EQ_m is an equalization filter which compensates for a weighting W_m that is related to the directivity of the capsules and that furthermore includes the diffraction by the rigid sphere.

The expression of this filter EQ_m is given by the following relation:

EQ_m = 1 / W_m = (kr)² h_m^{-'}(kr) j^{-m+1}

The coefficients of this equalization filter are not stable, and an infinite gain is obtained at very low frequencies. It should moreover be noted that the spherical harmonic components themselves are not of finite amplitude when the sound field is not limited to the propagation of plane waves, that is to say waves originating from distant sources, as was seen previously.

Moreover, rather than capsules embedded in a solid sphere, capsules of cardioid type may be provided, with a far-field directivity given by the expression:

G(θ) = α + (1 - α) cos θ

Considering these capsules mounted on an "acoustically transparent" support, the weighting term to be compensated becomes:

W_m = j^m ( α j_m(kr) - j (1 - α) j_m'(kr) )      [C6]

It again appears that the coefficients of an equalization filter corresponding to the analytical inverse of this weighting, given by relation [C6], diverge at very low frequencies.

In general terms, for any type of sensor directivity, the gain of the filter EQ_m compensating for the weighting W_m related to the directivity of the sensors is infinite at low sound frequencies. Referring to figure 14, a near-field pre-compensation is therefore advantageously applied within the very expression of the equalization filter EQ_m, giving:

EQ_m^{NFC}(R/c)(ω) = EQ_m(r, ω) / F_m^{(R/c)}(ω)      [C7]

Thus, the signals S_1 to S_N are picked up by the microphone 141. Where appropriate, a pre-equalization of these signals is applied by a processing module 142. A module 143 expresses these signals in the ambisonic context, in matrix form. A module 144 applies the filter of relation [C7] to the ambisonic components expressed as a function of the radius r of the sphere of the microphone 141. The near-field compensation is carried out for a reference distance R as second distance. The signals thus encoded and filtered by the module 144 can be transmitted, where appropriate, together with the parameter representative of the reference distance R/c.
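For completeness (an illustration outside the patent text), the rigid-sphere equalization of relation [C7] can be sampled on a frequency grid. The sketch below assumes that h_m^- denotes the spherical Hankel function of the second kind (a convention choice) and reuses the series form of F_m from relation [A13]; it simply checks that the combined filter EQ_m / F_m remains bounded at both ends of the band, unlike EQ_m alone:

```python
import numpy as np
from math import factorial
from scipy.special import spherical_jn, spherical_yn

def F(m, tau, w):
    p = 1j * w
    return sum(factorial(m + n) / (factorial(m - n) * factorial(n)) * (2 * tau * p) ** (-n)
               for n in range(m + 1))

def eq_rigid_sphere(m, kr):
    # EQ_m = (kr)^2 h_m^-'(kr) j^(-m+1); h_m^- is taken here as the spherical Hankel
    # function of the second kind (an assumption on the sign convention of the document).
    h_prime = spherical_jn(m, kr, derivative=True) - 1j * spherical_yn(m, kr, derivative=True)
    return kr ** 2 * h_prime * (1j) ** (-m + 1)

c, r, R, m = 340.0, 0.02, 1.5, 2            # sphere radius 2 cm, reference distance 1.5 m
w = 2 * np.pi * np.logspace(1, 4, 200)      # 10 Hz to 10 kHz
eq_nfc = eq_rigid_sphere(m, w * r / c) / F(m, R / c, w)      # relation [C7]
print(np.round(20 * np.log10(np.abs(eq_nfc[[0, -1]])), 1))   # finite gains at both band edges
```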

Thus, it appears from the various embodiments, relating respectively to the creation of a virtual source in the near field, to the acquisition of sound signals originating from real sources, and even to restitution (to compensate for a near-field effect of the loudspeakers), that the near-field compensation within the meaning of the present invention can be applied to all types of processing involving an ambisonic representation. This near-field compensation makes it possible to apply the ambisonic representation to a multiplicity of sound contexts in which the direction of a source, and advantageously its distance, must be taken into account. Moreover, the possibility of representing sound phenomena of all types (near or far fields) in the ambisonic context is ensured by this pre-compensation, owing to the limitation of the ambisonic components to finite real values.

Of course, the present invention is not limited to the embodiment described above by way of example; it extends to other variants.

Thus, it will be understood that the near-field pre-compensation can be integrated, at the encoding stage, both for a near source and for a distant source. In the latter case (distant source and plane-wave reception), the distance ρ expressed above is considered to be infinite, without substantially modifying the expression of the filters Hm given above. Likewise, processing using room-effect processors, which generally provide decorrelated signals usable for modelling the late diffuse field (late reverberation), can be combined with a near-field pre-compensation. These signals can be considered to be of equal energy and to correspond to a diffuse-field part associated with the omnidirectional component W = B00+1 (Figure 4). The various spherical harmonic components (up to a chosen order M) can then be constructed by applying a gain correction to each ambisonic component, and a near-field compensation of the loudspeakers is applied (with a reference distance R separating the loudspeakers from the point of auditory perception, as shown in Figure 7).
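Purely as an illustration of this combination, the sketch below maps equal-energy decorrelated reverberation signals onto ambisonic components with a per-component gain and then applies the loudspeaker near-field compensation 1/Fm(R/c) offline in the frequency domain; the polynomial form, the variable names and the FFT-based filtering are assumptions of the sketch (a practical system would rather use recursive filter cells).

```python
import numpy as np
from math import factorial

def F(m, f, d, c=340.0):
    # Same assumed power-m polynomial as in the earlier sketch.
    x = c / (1j * 2 * np.pi * f * d)
    return sum(factorial(m + n) / (factorial(m - n) * factorial(n) * 2 ** n) * x ** n
               for n in range(m + 1))

def diffuse_components(decorrelated, orders, gains, R, fs=48000.0, c=340.0):
    """decorrelated: (K, N) equal-energy reverb signals, one per ambisonic component;
    orders[k]: order m of component k; gains[k]: its gain correction.
    Returns the components after loudspeaker near-field compensation 1/F_m(R/c)."""
    K, N = decorrelated.shape
    freqs = np.fft.rfftfreq(N, d=1.0 / fs)
    freqs[0] = freqs[1]                    # avoid the singular DC bin in this offline sketch
    out = np.empty_like(decorrelated)
    for k in range(K):
        spectrum = np.fft.rfft(gains[k] * decorrelated[k])
        out[k] = np.fft.irfft(spectrum / F(orders[k], freqs, R, c), N)
    return out
```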

Of course, the encoding principle within the meaning of the present invention can be generalized to radiation models other than monopole sources (real or virtual) and/or loudspeakers. Indeed, any form of radiation (in particular a source spread out in space) can be expressed by integration of a continuous distribution of elementary point sources.
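As a toy illustration of this remark, an extended source can be approximated by a discrete sum of elementary point sources; the sketch below uses a first-order, horizontal-only encoding with an assumed convention (W = s, X = s·cos θ, Y = s·sin θ) and a uniform angular sampling, both of which are assumptions of the example rather than the patent's encoding.

```python
import numpy as np

def encode_point(signal, azimuth):
    # Toy first-order horizontal encoding of one elementary point contribution
    # (assumed convention: W = s, X = s*cos(az), Y = s*sin(az)).
    return np.vstack([signal, np.cos(azimuth) * signal, np.sin(azimuth) * signal])

def encode_extended_source(signal, az_min, az_max, n_points=16):
    # Approximate a source spread between az_min and az_max by a discrete sum
    # of elementary point sources (a crude quadrature of the continuous distribution).
    azimuths = np.linspace(az_min, az_max, n_points)
    weight = 1.0 / n_points                      # equal-weight quadrature
    return sum(encode_point(weight * signal, az) for az in azimuths)  # shape (3, N)
```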

Furthermore, in the context of playback, it is possible to adapt the near-field compensation to any playback context. To this end, provision may be made to calculate transfer functions (re-encoding of the near-field spherical harmonic components for each loudspeaker, taking account of the actual propagation in the room where the sound is played back), as well as an inversion of this re-encoding so as to redefine the decoding.
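One possible reading of this adaptation, sketched below under assumed conditions (horizontal first-order components, loudspeakers modelled as free-field plane waves, and a Moore-Penrose pseudo-inverse), is to build a re-encoding matrix whose columns are the components produced by each loudspeaker at the listening point, and to take the decoder as its inverse:

```python
import numpy as np

def reencoding_matrix(speaker_azimuths):
    # Column i = first-order components (W, X, Y) produced by loudspeaker i,
    # here modelled as a free-field plane wave; in practice each column would be
    # derived from transfer functions measured in the actual playback room.
    az = np.asarray(speaker_azimuths, dtype=float)
    return np.vstack([np.ones_like(az), np.cos(az), np.sin(az)])

def decoding_matrix(speaker_azimuths):
    # Invert the re-encoding (pseudo-inverse) to redefine the decoding.
    return np.linalg.pinv(reencoding_matrix(speaker_azimuths))

speakers = np.deg2rad([0, 90, 180, 270])
D = decoding_matrix(speakers)        # shape (4, 3): maps (W, X, Y) to 4 speaker feeds
```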

A decoding method has been described above in which a matrix system involving the ambisonic components is applied. In a variant, a generalized processing by fast Fourier transforms (circular or spherical) may be provided in order to limit the computation time and the computing resources (in terms of memory) required for the decoding processing.
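For example, for a regular circular loudspeaker layout, decoding circular (2D) harmonic components amounts to an inverse discrete Fourier transform over the azimuthal order, so it can be carried out with an FFT instead of an explicit matrix product; the packing below and the basic "sampling" decoder it implements are assumptions of the sketch.

```python
import numpy as np

def decode_circular_fft(B0, Bcos, Bsin, n_speakers):
    """B0: (T,) omnidirectional component; Bcos, Bsin: (M, T) circular harmonic
    components of orders 1..M. Returns (n_speakers, T) feeds for loudspeakers
    placed regularly at azimuths 2*pi*k/n_speakers; requires n_speakers > 2*M."""
    M, T = Bcos.shape
    spectrum = np.zeros((n_speakers // 2 + 1, T), dtype=complex)
    spectrum[0] = B0
    spectrum[1:M + 1] = Bcos - 1j * Bsin   # order m packed into the m-th FFT bin
    return np.fft.irfft(spectrum, n=n_speakers, axis=0)
```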

As indicated above with reference to Figures 9 and 10, it can be seen that the choice of a reference distance R relative to the distance ρ of the near-field source introduces a gain difference for different values of the sound frequency. It is indicated that the encoding method with pre-compensation can be coupled with digital audio compression making it possible to quantize and adjust the gain for each frequency sub-band.
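Schematically (the band split, the 1.5 dB quantization step and the filter model are all assumptions of this sketch), the gain introduced by the choice of R relative to ρ could be summarized and quantized per frequency sub-band as follows:

```python
import numpy as np
from math import factorial

def F(m, f, d, c=340.0):
    # Same assumed power-m polynomial as in the earlier sketches.
    x = c / (1j * 2 * np.pi * f * d)
    return sum(factorial(m + n) / (factorial(m - n) * factorial(n) * 2 ** n) * x ** n
               for n in range(m + 1))

def subband_gains_db(m, rho, R, band_edges_hz, step_db=1.5):
    """Average gain (in dB) of the order-m pre-compensated filter F_m(rho)/F_m(R)
    in each sub-band, quantized to a fixed step so it can travel with coded audio."""
    gains = []
    for lo, hi in zip(band_edges_hz[:-1], band_edges_hz[1:]):
        f = np.linspace(lo, hi, 64)
        g = 20 * np.log10(np.mean(np.abs(F(m, f, rho) / F(m, f, R))))
        gains.append(step_db * round(g / step_db))
    return gains

print(subband_gains_db(2, rho=1.0, R=2.0, band_edges_hz=[50, 200, 800, 3200, 12800]))
```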

Advantageously, the present invention applies to all types of sound spatialization systems, in particular for "virtual reality" applications (navigation through virtual scenes in three-dimensional space, sound-enabled "chat" conversations over the Internet), for the sonification of interfaces, for audio editing software for recording, mixing and playing back music, but also for acquisition, using three-dimensional microphones, for musical or cinematographic sound recording, or for the transmission of sound ambiences over the Internet, for example for sound-enabled webcams.

Claims (22)

  1. A method of processing sound data, in which:
    a) signals representative of at least one sound propagating in a three-dimensional space and arising from a source situated at a first distance (ρ) from a reference point (O) are coded so as to obtain a representation of the sound by components (Bmn σ) expressed in a base of spherical harmonics, of origin corresponding to said reference point (O),
    b) and a compensation of a near field effect is applied to said components (Bmn σ) by a filtering which is dependent on a second distance (R) defining substantially, for a playback of the sound by a playback device, a distance between a playback point (HPi) and a point (P) of auditory perception.
  2. The method as claimed in claim 1, in which, said source being far removed from the reference point (O),
    - components of successive orders m are obtained for the representation of the sound in said base of spherical harmonics, and
    - a filter (1/Fm) is applied, the coefficients of which, each applied to a component of order m, are expressed analytically in the form of the inverse of a polynomial of power m, whose variable is inversely proportional to the sound frequency and to said second distance (R), so as to compensate for a near field effect at the level of the playback device.
  3. The method as claimed in claim 1, in which, said source being a virtual source envisaged at said first distance (ρ),
    - components of successive orders m are obtained for the representation of the sound in said base of spherical harmonics, and
    - a global filter (Hm) is applied, the coefficients of which, each applied to a component of order m, are expressed analytically in the form of a fraction; in which:
    - the numerator is a polynomial of power m, whose variable is inversely proportional to the sound frequency and to said first distance (ρ), so as to simulate a near field effect of the virtual source, and
    - the denominator is a polynomial of power m, whose variable is inversely proportional to the sound frequency and to said second distance (R), so as to compensate for the effect of the near field of the virtual source in the low sound frequencies.
  4. The method as claimed in one of the preceding claims, in which the data coded and filtered in steps a) and b) are transmitted to the playback device with a parameter representative of said second distance (R/c).
  5. The method as claimed in one of claims 1 to 3, in which, the playback device comprising means for reading a memory medium, the data coded and filtered in steps a) and b) are stored with a parameter representative of said second distance (R/c) on a memory medium intended to be read by the playback device.
  6. The method as claimed in one of claims 4 and 5, in which, prior to a sound playback by a playback device comprising a plurality of loudspeakers disposed at a third distance (R2) from said point of auditory perception (P), an adaptation filter (Hm (R1/c, R2/c)) whose coefficients are dependent on said second distance (R1) and said third distance (R2) is applied to the coded and filtered data.
  7. The method as claimed in claim 6, in which the coefficients of said adaptation filter (Hm (R1/c,R2/c)), each applied to a component of order m, are expressed analytically in the form of a fraction, in which:
    - the numerator is a polynomial of power m, whose variable is inversely proportional to the sound frequency and to said second distance (R),
    - and the denominator is a polynomial of power m, whose variable is inversely proportional to the sound frequency and to said third distance (R2).
  8. The method as claimed in one of claims 2, 3 and 7, in which, for the implementation of step b), there is provided:
    - in respect of the components of even order m, audiodigital filters in the form of a cascade of cells of order two; and
    - in respect of the components of odd order m, audiodigital filters in the form of a cascade of cells of order two and an additional cell of order one.
  9. The method as claimed in claim 8, in which the coefficients of an audiodigital filter, for a component of order m, are defined from the numerical values of the roots of said polynomials of power m.
  10. The method as claimed in one of claims 2, 3, 7, 8 and 9, in which said polynomials are Bessel polynomials.
  11. The method as claimed in one of claims 1, 2 and 4 to 10, in which there is provided a microphone comprising an array of acoustic transducers arranged substantially on the surface of a sphere whose center corresponds substantially to said reference point (O), so as to obtain said signals representative of at least one sound propagating in the three-dimensional space.
  12. The method as claimed in claim 11, in which a global filter is applied in step b) so as, on the one hand, to compensate for a near field effect as a function of said second distance (R) and, on the other hand, to equalize the signals arising from the transducers so as to compensate for a weighting of directivity of said transducers.
  13. The method as claimed in one of claims 11 and 12, in which there is provided a number of transducers that depends on a total number of components chosen to represent the sound in said base of spherical harmonics.
  14. The method as claimed in one of the preceding claims, in which in step a) a total number of components is chosen from the base of spherical harmonics so as to obtain, on playback, a region of the space around the point of perception (P) in which the playback of the sound is faithful and whose dimensions are increasing with the total number of components.
  15. The method as claimed in claim 14, in which there is provided a playback device comprising a number of loudspeakers at least equal to said total number of components.
  16. The method as claimed in one of claims 1 to 5 and 8 to 13, in which:
    - there is provided a playback device comprising at least a first and a second loudspeaker disposed at a chosen distance from a listener,
    - a cue of awareness of the position in space of sound sources situated at a predetermined reference distance (R) from the listener is obtained for this listener, and
    - the compensation of step b) is applied with said reference distance substantially as second distance.
  17. The method as claimed in one of claims 1 to 3 and 8 to 13, taken in combination with one of claims 4 and 5, in which:
    - there is provided a playback device comprising at least a first and a second loudspeaker disposed at a chosen distance from a listener,
    - a cue of awareness of the position in space of sound sources situated at a predetermined reference distance (R2) from the listener is obtained for this listener, and
    - prior to a sound playback by the playback device, an adaptation filter (Hm (R/c, R2/c)), whose coefficients are dependent on the second distance (R) and substantially on the reference distance (R2), is applied to the data coded and filtered in steps a) and b).
  18. The method as claimed in one of claims 16 and 17, in which:
    - the playback device comprises a headset with two headphones for the respective ears of the listener, and
    - separately for each headphone, the coding and the filtering of steps a) and b) are applied with regard to respective signals intended to be fed to each headphone, with, as first distance (ρ), respectively a distance (rR, rL) separating each ear from a position (M) of a source to be played back.
  19. The method as claimed in one of the preceding claims, in which a matrix system is fashioned, in steps a) and b), said system comprising at least:
    - a matrix (B) comprising said components in the base of spherical harmonics, and
    - a diagonal matrix (Diag (1/Fm)) whose coefficients correspond to filtering coefficients of step b), and said matrices are multiplied to obtain a result matrix of compensated components (B̃).
  20. The method as claimed in claim 19, in which:
    - the playback device comprises a plurality of loudspeakers disposed substantially at one and the same distance (R) from the point of auditory perception (P), and
    - to decode said data coded and filtered in steps a) and b) and to form signals suitable for feeding said loudspeakers:
    * a matrix system is formed comprising said result matrix (B̃) and a predetermined decoding matrix (D), specific to the playback device, and
    * a matrix (S) is obtained comprising coefficients representative of the loudspeaker feed signals by multiplication of the matrix of the compensated components (B̃) by said decoding matrix (D).
  21. A sound acquisition device, comprising a microphone furnished with an array of acoustic transducers disposed substantially on the surface of a sphere, characterized in that it furthermore comprises a processing unit arranged so as to:
    - receive signals each emanating from a transducer,
    - apply a coding to said signals so as to obtain a representation of the sound by components (Bmn σ) expressed in a base of spherical harmonics, of origin corresponding to the center of said sphere (O),
    - and apply a filtering to said components (Bmn σ), which filtering is dependent, on the one hand, on a distance corresponding to the radius of the sphere (r) and, on the other hand, on a reference distance (R).
  22. The device as claimed in claim 21, characterized in that said filtering consists, on the one hand, in equalizing, as a function of the radius of the sphere, the signals arising from the transducers so as to compensate for a weighting of directivity of said transducers and, on the other hand, in compensating for a near field effect as a function of a chosen reference distance (R), defining substantially, for a playback of the sound, a distance between a playback point (HPi) and a point (P) of auditory perception.
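Purely as an illustrative reading of claims 8 to 10, the sketch below realizes the order-m filter as a cascade of second-order digital cells (one cell degenerating to first order when m is odd) built from the roots of power-m polynomials with assumed Bessel-type coefficients; the coefficient formula, the bilinear transform and the scipy-based realization are assumptions of the sketch, not the patented implementation.

```python
import numpy as np
from math import factorial
from scipy import signal

def nfc_sos(m, rho, R, fs, c=340.0):
    """Digital realization of the order-m filter F_m(rho/c)/F_m(R/c) as
    second-order sections (one section degenerates to first order when m is odd)."""
    # Assumed Bessel-type coefficients a_n = (m+n)! / ((m-n)! n! 2^n).
    a = [factorial(m + n) / (factorial(m - n) * factorial(n) * 2 ** n)
         for n in range(m + 1)]
    roots = np.roots(a)                 # roots of Q(s) = sum_k a[m-k] s^k
    zeros = roots * c / rho             # analog zeros: near field of the source
    poles = roots * c / R               # analog poles: compensation at distance R
    zd, pd, kd = signal.bilinear_zpk(zeros, poles, 1.0, fs)
    return signal.zpk2sos(zd, pd, kd)   # cascade of order-two (and one order-one) cells

# Example: filter one component of order 3 for a source at 1 m, reference 1.5 m.
sos = nfc_sos(3, rho=1.0, R=1.5, fs=48000)
y = signal.sosfilt(sos, np.random.randn(48000))
```

With this placement of zeros and poles, the filter tends to unity at high frequency and to the finite value (R/ρ)^m at low frequency, which is consistent with the boundedness discussed in the description.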
EP03782553A 2002-11-19 2003-11-13 Method for processing audio data and sound acquisition device therefor Expired - Lifetime EP1563485B1 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
FR0214444A FR2847376B1 (en) 2002-11-19 2002-11-19 METHOD FOR PROCESSING SOUND DATA AND SOUND ACQUISITION DEVICE USING THE SAME
FR0214444 2002-11-19
PCT/FR2003/003367 WO2004049299A1 (en) 2002-11-19 2003-11-13 Method for processing audio data and sound acquisition device therefor

Publications (2)

Publication Number Publication Date
EP1563485A1 EP1563485A1 (en) 2005-08-17
EP1563485B1 true EP1563485B1 (en) 2006-03-29

Family

ID=32187712

Family Applications (1)

Application Number Title Priority Date Filing Date
EP03782553A Expired - Lifetime EP1563485B1 (en) 2002-11-19 2003-11-13 Method for processing audio data and sound acquisition device therefor

Country Status (12)

Country Link
US (1) US7706543B2 (en)
EP (1) EP1563485B1 (en)
JP (1) JP4343845B2 (en)
KR (1) KR100964353B1 (en)
CN (1) CN1735922B (en)
AT (1) ATE322065T1 (en)
AU (1) AU2003290190A1 (en)
DE (1) DE60304358T2 (en)
ES (1) ES2261994T3 (en)
FR (1) FR2847376B1 (en)
WO (1) WO2004049299A1 (en)
ZA (1) ZA200503969B (en)

Families Citing this family (77)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE10328335B4 * 2003-06-24 2005-07-21 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Wave field synthesis device and method for driving an array of loudspeakers
US20050271216A1 (en) * 2004-06-04 2005-12-08 Khosrow Lashkari Method and apparatus for loudspeaker equalization
JP4927848B2 (en) * 2005-09-13 2012-05-09 エスアールエス・ラブス・インコーポレーテッド System and method for audio processing
PL1994526T3 (en) * 2006-03-13 2010-03-31 France Telecom Joint sound synthesis and spatialization
FR2899424A1 * 2006-03-28 2007-10-05 France Telecom Audio channel multi-channel/binaural e.g. transaural, three-dimensional spatialization method for e.g. ear phone, involves breaking down filter into delay and amplitude values for samples, and extracting filter's spectral module on samples
US8180067B2 (en) * 2006-04-28 2012-05-15 Harman International Industries, Incorporated System for selectively extracting components of an audio input signal
US7876903B2 (en) * 2006-07-07 2011-01-25 Harris Corporation Method and apparatus for creating a multi-dimensional communication space for use in a binaural audio system
US8036767B2 (en) * 2006-09-20 2011-10-11 Harman International Industries, Incorporated System for extracting and changing the reverberant content of an audio input signal
US8103006B2 (en) * 2006-09-25 2012-01-24 Dolby Laboratories Licensing Corporation Spatial resolution of the sound field for multi-channel audio playback systems by deriving signals with high order angular terms
DE102006053919A1 (en) * 2006-10-11 2008-04-17 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus and method for generating a number of speaker signals for a speaker array defining a playback space
JP2008118559A (en) * 2006-11-07 2008-05-22 Advanced Telecommunication Research Institute International Three-dimensional sound field reproducing apparatus
JP4873316B2 (en) * 2007-03-09 2012-02-08 株式会社国際電気通信基礎技術研究所 Acoustic space sharing device
EP2094032A1 (en) * 2008-02-19 2009-08-26 Deutsche Thomson OHG Audio signal, method and apparatus for encoding or transmitting the same and method and apparatus for processing the same
KR20100131467A (en) * 2008-03-03 2010-12-15 노키아 코포레이션 Apparatus for capturing and rendering a plurality of audio channels
EP2154910A1 (en) * 2008-08-13 2010-02-17 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus for merging spatial audio streams
PL2154677T3 (en) * 2008-08-13 2013-12-31 Fraunhofer Ges Forschung An apparatus for determining a converted spatial audio signal
GB0815362D0 (en) 2008-08-22 2008-10-01 Queen Mary & Westfield College Music collection navigation
US8819554B2 (en) * 2008-12-23 2014-08-26 At&T Intellectual Property I, L.P. System and method for playing media
EP2205007B1 (en) 2008-12-30 2019-01-09 Dolby International AB Method and apparatus for three-dimensional acoustic field encoding and optimal reconstruction
GB2476747B (en) * 2009-02-04 2011-12-21 Richard Furse Sound system
US8718285B2 (en) 2009-03-26 2014-05-06 Panasonic Corporation Decoding device, coding and decoding device, and decoding method
US9372251B2 (en) * 2009-10-05 2016-06-21 Harman International Industries, Incorporated System for spatial extraction of audio signals
CN102823277B (en) 2010-03-26 2015-07-15 汤姆森特许公司 Method and device for decoding an audio soundfield representation for audio playback
JP5672741B2 (en) * 2010-03-31 2015-02-18 ソニー株式会社 Signal processing apparatus and method, and program
US20110317522A1 (en) * 2010-06-28 2011-12-29 Microsoft Corporation Sound source localization based on reflections and room estimation
US9313599B2 (en) 2010-11-19 2016-04-12 Nokia Technologies Oy Apparatus and method for multi-channel signal playback
US9055371B2 (en) * 2010-11-19 2015-06-09 Nokia Technologies Oy Controllable playback system offering hierarchical playback options
US9456289B2 (en) 2010-11-19 2016-09-27 Nokia Technologies Oy Converting multi-microphone captured signals to shifted signals useful for binaural signal processing and use thereof
EP2541547A1 (en) * 2011-06-30 2013-01-02 Thomson Licensing Method and apparatus for changing the relative positions of sound objects contained within a higher-order ambisonics representation
WO2013068402A1 (en) * 2011-11-10 2013-05-16 Sonicemotion Ag Method for practical implementations of sound field reproduction based on surface integrals in three dimensions
KR101282673B1 (en) 2011-12-09 2013-07-05 현대자동차주식회사 Method for Sound Source Localization
US8996296B2 (en) * 2011-12-15 2015-03-31 Qualcomm Incorporated Navigational soundscaping
KR102068186B1 (en) 2012-02-29 2020-02-11 어플라이드 머티어리얼스, 인코포레이티드 Abatement and strip process chamber in a load lock configuration
EP2645748A1 (en) * 2012-03-28 2013-10-02 Thomson Licensing Method and apparatus for decoding stereo loudspeaker signals from a higher-order Ambisonics audio signal
CN104335599A (en) 2012-04-05 2015-02-04 诺基亚公司 Flexible spatial audio capture apparatus
US9288603B2 (en) * 2012-07-15 2016-03-15 Qualcomm Incorporated Systems, methods, apparatus, and computer-readable media for backward-compatible audio coding
US9473870B2 (en) 2012-07-16 2016-10-18 Qualcomm Incorporated Loudspeaker position compensation with 3D-audio hierarchical coding
EP2688066A1 (en) * 2012-07-16 2014-01-22 Thomson Licensing Method and apparatus for encoding multi-channel HOA audio signals for noise reduction, and method and apparatus for decoding multi-channel HOA audio signals for noise reduction
US9479886B2 (en) 2012-07-20 2016-10-25 Qualcomm Incorporated Scalable downmix design with feedback for object-based surround codec
US9761229B2 (en) * 2012-07-20 2017-09-12 Qualcomm Incorporated Systems, methods, apparatus, and computer-readable media for audio object clustering
BR112015004288B1 (en) 2012-08-31 2021-05-04 Dolby Laboratories Licensing Corporation system for rendering sound using reflected sound elements
US9301069B2 (en) * 2012-12-27 2016-03-29 Avaya Inc. Immersive 3D sound space for searching audio
US10203839B2 (en) * 2012-12-27 2019-02-12 Avaya Inc. Three-dimensional generalized space
US9892743B2 (en) 2012-12-27 2018-02-13 Avaya Inc. Security surveillance via three-dimensional audio space presentation
US9838824B2 (en) 2012-12-27 2017-12-05 Avaya Inc. Social media processing with three-dimensional audio
US9736609B2 (en) * 2013-02-07 2017-08-15 Qualcomm Incorporated Determining renderers for spherical harmonic coefficients
US9959875B2 (en) * 2013-03-01 2018-05-01 Qualcomm Incorporated Specifying spherical harmonic and/or higher order ambisonics coefficients in bitstreams
US10635383B2 (en) 2013-04-04 2020-04-28 Nokia Technologies Oy Visual audio processing apparatus
US9706324B2 (en) 2013-05-17 2017-07-11 Nokia Technologies Oy Spatial object oriented audio apparatus
US9369818B2 (en) 2013-05-29 2016-06-14 Qualcomm Incorporated Filtering with binaural room impulse responses with content analysis and weighting
US9883312B2 (en) 2013-05-29 2018-01-30 Qualcomm Incorporated Transformed higher order ambisonics audio data
EP2824661A1 (en) * 2013-07-11 2015-01-14 Thomson Licensing Method and Apparatus for generating from a coefficient domain representation of HOA signals a mixed spatial/coefficient domain representation of said HOA signals
DE102013013378A1 (en) * 2013-08-10 2015-02-12 Advanced Acoustic Sf Gmbh Distribution of virtual sound sources
EP3056025B1 (en) 2013-10-07 2018-04-25 Dolby Laboratories Licensing Corporation Spatial audio processing system and method
EP2866475A1 (en) * 2013-10-23 2015-04-29 Thomson Licensing Method for and apparatus for decoding an audio soundfield representation for audio playback using 2D setups
US9922656B2 (en) 2014-01-30 2018-03-20 Qualcomm Incorporated Transitioning of ambient higher-order ambisonic coefficients
EP2930958A1 (en) * 2014-04-07 2015-10-14 Harman Becker Automotive Systems GmbH Sound wave field generation
US10770087B2 (en) 2014-05-16 2020-09-08 Qualcomm Incorporated Selecting codebooks for coding vectors decomposed from higher-order ambisonic audio signals
JP6388551B2 (en) * 2015-02-27 2018-09-12 アルパイン株式会社 Multi-region sound field reproduction system and method
DE102015008000A1 (en) * 2015-06-24 2016-12-29 Saalakustik.De Gmbh Method for reproducing sound in reflection environments, in particular in listening rooms
US10582329B2 (en) 2016-01-08 2020-03-03 Sony Corporation Audio processing device and method
BR112018013526A2 (en) 2016-01-08 2018-12-04 Sony Corporation apparatus and method for audio processing, and, program
KR102197544B1 (en) 2016-08-01 2020-12-31 매직 립, 인코포레이티드 Mixed reality system with spatialized audio
WO2018064528A1 (en) * 2016-09-29 2018-04-05 The Trustees Of Princeton University Ambisonic navigation of sound fields from an array of microphones
EP3497944A1 (en) * 2016-10-31 2019-06-19 Google LLC Projection-based audio coding
FR3060830A1 (en) * 2016-12-21 2018-06-22 Orange SUB-BAND PROCESSING OF REAL AMBASSIC CONTENT FOR PERFECTIONAL DECODING
US10182303B1 (en) * 2017-07-12 2019-01-15 Google Llc Ambisonics sound field navigation using directional decomposition and path distance estimation
US10764684B1 (en) * 2017-09-29 2020-09-01 Katherine A. Franco Binaural audio using an arbitrarily shaped microphone array
EP3525482B1 (en) 2018-02-09 2023-07-12 Dolby Laboratories Licensing Corporation Microphone array for capturing audio sound field
WO2019166988A2 (en) * 2018-03-02 2019-09-06 Wilfred Edwin Booij Acoustic positioning transmitter and receiver system and method
WO2019217808A1 (en) * 2018-05-11 2019-11-14 Dts, Inc. Determining sound locations in multi-channel audio
CN110740404B (en) * 2019-09-27 2020-12-25 广州励丰文化科技股份有限公司 Audio correlation processing method and audio processing device
CN110740416B (en) * 2019-09-27 2021-04-06 广州励丰文化科技股份有限公司 Audio signal processing method and device
US11363402B2 (en) 2019-12-30 2022-06-14 Comhear Inc. Method for providing a spatialized soundfield
CN111537058B (en) * 2020-04-16 2022-04-29 哈尔滨工程大学 Sound field separation method based on Helmholtz equation least square method
US11743670B2 (en) 2020-12-18 2023-08-29 Qualcomm Incorporated Correlation-based rendering with multiple distributed streams accounting for an occlusion for six degree of freedom applications
CN113791385A (en) * 2021-09-15 2021-12-14 张维翔 Three-dimensional positioning method and system

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS53114201U (en) * 1977-02-18 1978-09-11
US4731848A (en) * 1984-10-22 1988-03-15 Northwestern University Spatial reverberator
JP2569872B2 (en) * 1990-03-02 1997-01-08 ヤマハ株式会社 Sound field control device
JP3578783B2 (en) * 1993-09-24 2004-10-20 ヤマハ株式会社 Sound image localization device for electronic musical instruments
US5745584A (en) * 1993-12-14 1998-04-28 Taylor Group Of Companies, Inc. Sound bubble structures for sound reproducing arrays
GB9726338D0 (en) * 1997-12-13 1998-02-11 Central Research Lab Ltd A method of processing an audio signal
US7231054B1 (en) * 1999-09-24 2007-06-12 Creative Technology Ltd Method and apparatus for three-dimensional audio display
US7340062B2 (en) * 2000-03-14 2008-03-04 Revit Lawrence J Sound reproduction method and apparatus for assessing real-world performance of hearing and hearing aids
AU2000280030A1 (en) * 2000-04-19 2001-11-07 Sonic Solutions Multi-channel surround sound mastering and reproduction techniques that preservespatial harmonics in three dimensions

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10595148B2 (en) 2016-01-08 2020-03-17 Sony Corporation Sound processing apparatus and method, and program

Also Published As

Publication number Publication date
JP4343845B2 (en) 2009-10-14
EP1563485A1 (en) 2005-08-17
FR2847376A1 (en) 2004-05-21
WO2004049299A1 (en) 2004-06-10
KR100964353B1 (en) 2010-06-17
FR2847376B1 (en) 2005-02-04
JP2006506918A (en) 2006-02-23
ATE322065T1 (en) 2006-04-15
CN1735922A (en) 2006-02-15
ES2261994T3 (en) 2006-11-16
DE60304358T2 (en) 2006-12-07
DE60304358D1 (en) 2006-05-18
KR20050083928A (en) 2005-08-26
BR0316718A (en) 2005-10-18
US20060045275A1 (en) 2006-03-02
CN1735922B (en) 2010-05-12
US7706543B2 (en) 2010-04-27
AU2003290190A1 (en) 2004-06-18
ZA200503969B (en) 2006-09-27

Similar Documents

Publication Publication Date Title
EP1563485B1 (en) Method for processing audio data and sound acquisition device therefor
EP1836876B1 (en) Method and device for individualizing hrtfs by modeling
EP1992198B1 (en) Optimization of binaural sound spatialization based on multichannel encoding
EP1586220B1 (en) Method and device for controlling a reproduction unit using a multi-channel signal
EP3475943B1 (en) Method for conversion and stereophonic encoding of a three-dimensional audio signal
Ben-Hur et al. Binaural reproduction based on bilateral ambisonics and ear-aligned HRTFs
EP1479266B1 (en) Method and device for control of a unit for reproduction of an acoustic field
EP3400599B1 (en) Improved ambisonic encoder for a sound source having a plurality of reflections
EP3025514B1 (en) Sound spatialization with room effect
FR3065137A1 (en) SOUND SPATIALIZATION METHOD
EP3384688B1 (en) Successive decompositions of audio filters
EP4184505A1 (en) Complexity optimized sound spatialization with room effect
Paulo et al. Perceptual Comparative Tests Between the Multichannel 3D Capturing Systems Artificial Ears and the Ambisonic Concept
FR2866974A1 (en) Audio data processing method for e.g. documentary recording, involves encoding sound signals, and applying spatial component amplitude attenuation in frequency range defined by component order and distance between source and reference point
EP3449643B1 (en) Method and system of broadcasting a 360° audio signal
FR3040253B1 (en) METHOD FOR MEASURING PHRTF FILTERS OF AN AUDITOR, CABIN FOR IMPLEMENTING THE METHOD, AND METHODS FOR RESULTING IN RESTITUTION OF A PERSONALIZED MULTICANAL AUDIO BAND
US20200186952A1 (en) Method and system for processing an audio signal including ambisonic encoding
EP3484185A1 (en) Modelling of a set of acoustic transfer functions suitable for an individual, three-dimensional sound card and system for three-dimensional sound reproduction
CN116261086A (en) Sound signal processing method, device, equipment and storage medium
BRPI0316718B1 (en) SOUND DATA PROCESSING AND SOUND ACQUISITION DEVICE, APPLYING THIS PROCESS

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20050530

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LI LU MC NL PT RO SE SI SK TR

AX Request for extension of the european patent

Extension state: AL LT LV MK

GRAP Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOSNIGR1

GRAS Grant fee paid

Free format text: ORIGINAL CODE: EPIDOSNIGR3

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

DAX Request for extension of the european patent (deleted)
AK Designated contracting states

Kind code of ref document: B1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LI LU MC NL PT RO SE SI SK TR

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: RO

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20060329

Ref country code: IT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT;WARNING: LAPSES OF ITALIAN PATENTS WITH EFFECTIVE DATE BEFORE 2007 MAY HAVE OCCURRED AT ANY TIME BEFORE 2007. THE CORRECT EFFECTIVE DATE MAY BE DIFFERENT FROM THE ONE RECORDED.

Effective date: 20060329

Ref country code: SK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20060329

Ref country code: AT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20060329

Ref country code: SI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20060329

Ref country code: IE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20060329

Ref country code: NL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20060329

REG Reference to a national code

Ref country code: GB

Ref legal event code: FG4D

Free format text: NOT ENGLISH

REG Reference to a national code

Ref country code: CH

Ref legal event code: EP

REG Reference to a national code

Ref country code: IE

Ref legal event code: FG4D

Free format text: LANGUAGE OF EP DOCUMENT: FRENCH

REF Corresponds to:

Ref document number: 60304358

Country of ref document: DE

Date of ref document: 20060518

Kind code of ref document: P

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: DK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20060629

Ref country code: SE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20060629

Ref country code: BG

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20060629

GBT Gb: translation of ep patent filed (gb section 77(6)(a)/1977)

Effective date: 20060712

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: PT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20060829

NLV1 Nl: lapsed or annulled due to failure to fulfill the requirements of art. 29p and 29m of the patents act
REG Reference to a national code

Ref country code: IE

Ref legal event code: FD4D

REG Reference to a national code

Ref country code: ES

Ref legal event code: FG2A

Ref document number: 2261994

Country of ref document: ES

Kind code of ref document: T3

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: MC

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20061130

Ref country code: BE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20061130

PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

26N No opposition filed

Effective date: 20070102

BERE Be: lapsed

Owner name: FRANCE TELECOM

Effective date: 20061130

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: GR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20060630

Ref country code: CZ

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20060329

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: EE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20060329

Ref country code: FI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20060329

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: CH

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20071130

Ref country code: LU

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20061113

Ref country code: LI

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20071130

Ref country code: TR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20060329

Ref country code: HU

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20060930

REG Reference to a national code

Ref country code: CH

Ref legal event code: PL

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: CY

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20060329

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: FR

Payment date: 20101217

Year of fee payment: 8

REG Reference to a national code

Ref country code: FR

Ref legal event code: ST

Effective date: 20120731

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: FR

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20111130

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: IT

Payment date: 20221020

Year of fee payment: 20

Ref country code: GB

Payment date: 20221021

Year of fee payment: 20

Ref country code: ES

Payment date: 20221201

Year of fee payment: 20

Ref country code: DE

Payment date: 20221020

Year of fee payment: 20

REG Reference to a national code

Ref country code: DE

Ref legal event code: R071

Ref document number: 60304358

Country of ref document: DE

REG Reference to a national code

Ref country code: ES

Ref legal event code: FD2A

Effective date: 20231124

REG Reference to a national code

Ref country code: GB

Ref legal event code: PE20

Expiry date: 20231112

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: GB

Free format text: LAPSE BECAUSE OF EXPIRATION OF PROTECTION

Effective date: 20231112

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: ES

Free format text: LAPSE BECAUSE OF EXPIRATION OF PROTECTION

Effective date: 20231114
