EP3400599B1

EP3400599B1 - Improved ambisonic encoder for a sound source having a plurality of reflections

Info

Publication number: EP3400599B1
Application number: EP16808645.2A
Authority: EP
Inventors: Pierre Berthet
Original assignee: Mimi Hearing Technologies GmbH
Current assignee: Mimi Hearing Technologies GmbH
Priority date: 2016-01-05
Filing date: 2016-12-08
Publication date: 2021-06-16
Anticipated expiration: 2036-12-08
Also published as: CN108701461B; US11062714B2; WO2017118519A1; CN108701461A; US10475458B2; FR3046489A1; US20190019520A1; US20200058312A1; EP3400599A1; FR3046489B1

Description

FIELD OF THE INVENTION

La présente invention concerne l'encodage ambisonique de sources sonores. Il concerne plus spécifiquement l'amélioration de l'efficacité de ce codage, dans le cas où une source sonore est affectée de réflexions dans une scène sonore.The present invention relates to the ambisonic encoding of sound sources. It relates more specifically to improving the efficiency of this coding, in the case where a sound source is affected by reflections in a sound scene.

PREVIOUS STATE OF THE ART

Les représentations spatialisées du son regroupent des techniques de capture de synthèse et de reproduction d'environnement sonore permettant une immersion de l'auditeur beaucoup plus importante dans un environnement sonore. Elles permettent notamment à un utilisateur de discerner un nombre de sources sonores supérieures au nombre de haut-parleurs dont il dispose, et localiser précisément en 3D ces sources sonores, même lorsque leur direction n'est pas celle d'un haut-parleur. Les applications des représentations spatialisées du son sont nombreuses, et incluent la localisation précise de source sonores en 3 dimensions par un utilisateur à partir d'un son issu d'un casque stéréo, ou la localisation de sources sonores en 3 dimensions par des utilisateurs dans une pièce, le son étant émis par des enceintes, par exemple des enceintes 5.1. De plus, les représentations spatialisées du son permettent la réalisation d'effets sonores nouveaux. Par exemple, elles permettent la rotation d'une scène sonore ou l'application de réflexion d'une source sonore pour simuler le rendu d'un environnement sonore donné, par exemple une salle de cinéma ou une salle de concert.Spatialized representations of sound bring together techniques for synthesizing and reproducing a sound environment, allowing the listener to be much more immersed in a sound environment. They allow a user in particular to discern a number of sound sources greater than the number of loudspeakers at his disposal, and to locate these sound sources precisely in 3D, even when their direction is not that of a loudspeaker. The applications of spatialized representations of sound are numerous, and include the precise localization of sound sources in 3 dimensions by a user from sound from a stereo headset, or the localization of sound sources in 3 dimensions by users in a room, the sound being produced by speakers, for example 5.1 speakers. In addition, the spatialized representations of sound allow the creation of new sound effects. For example, they allow the rotation of a sound scene or the application of reflection of a sound source to simulate the rendering of a given sound environment, for example a cinema hall or a concert hall.

Les représentations spatialisées s'effectuent en deux étapes principales: un encodage ambisonique, et un décodage ambisonique. Pour bénéficier d'une représentation spatialisée du son, un décodage ambisonique en temps réel est toujours nécessaire. Une production ou traitement du son en temps-réel peut impliquer en plus un encodage ambisonique en temps réel de celui-ci. L'encodage ambisonique étant une tâche complexe, les capacités d'encodage ambisonique en temps réel peuvent être limitées. Par exemple, une capacité de calcul donnée ne pourra être capable d'encoder en temps réel qu'un nombre de sources sonores limitées.Spatialized representations are carried out in two main stages: ambisonic encoding, and ambisonic decoding. To benefit from a spatialized representation of sound, real-time ambisonic decoding is always necessary. Real-time sound production or processing may also involve real-time ambisonic encoding thereof. Ambisonic encoding being a complex task, real-time ambisonic encoding capabilities may be limited. For example, a given computation capacity will only be able to encode in real time a limited number of sound sources.

Les techniques de représentation spatialisées du son sont notamment décrites par J. Daniel, Représentations de champs acoustiques, application à la transmission et à la reproduction de scènes sonores dans un contexte multimédia, INIST-CNRS, Cote INIST : T 139957 . L'encodage ambisonique d'un champ sonore consiste en la décomposition du champ de pression sonore en un point, correspondant par exemple à la position d'un utilisateur, sous forme de coordonnées sphériques, exprimées sous la forme suivante : $p (\vec{r}, t) = \sum_{m = 0}^{\infty} j^{m} j_{m} (kr) \sum_{n = - m}^{+ m} B_{mn} (t) Y_{mn} (θ, φ)$

Dans laquelle p ( r , t ) représente la pression sonore, à un instant t, dans la direction r par rapport au point auquel le champ sonore est calculé. j^m représente la fonction sphérique de Bessel d'ordre m.The techniques of spatialized representation of sound are described in particular by J. Daniel, Representations of acoustic fields, application to the transmission and reproduction of sound scenes in a multimedia context, INIST-CNRS, INIST call number: T 139957 . Ambisonic encoding of a sound field consists of the decomposition of the sound pressure field at a point, corresponding for example to the position of a user, in the form of spherical coordinates, expressed in the following form:

p (\vec{r}, t) = \sum_{m = 0}^{\infty} j^{m} j_{m} (kr) \sum_{not = - m}^{+ m} B_{mn} (t) Y_{mn} (θ, φ)

In which p ( r , t ) represents the sound pressure, at a time t, in the direction r relative to the point at which the sound field is calculated. j ^m represents the spherical Bessel function of order m.

Y_mn ( θ,ϕ ) représente l'harmonique sphérique d'ordre mn dans les directions ( θ,ϕ ). définies par la direction r . Le symbole B_mn ( t ) définit les coefficients ambisoniques correspondant aux différentes harmoniques sphériques, à un instant t. Y _mn ( θ , ϕ ) represents the spherical harmonic of order mn in the directions ( θ , ϕ ) . defined by management r . The symbol B _mn ( t ) defines the ambisonic coefficients corresponding to the various spherical harmonics, at an instant t.

Les coefficients ambisoniques définissent donc, à chaque instant, l'ensemble du champ sonore entourant un point. Le traitement des champs sonores dans le domaine ambisonique possède des propriétés particulièrement intéressantes. En particulier, il est très aisé de procéder à des rotations de l'ensemble du champ sonore. Il est de plus possible de diffuser sur des haut-parleurs, à partir d'un ensemble de coefficients ambisoniques, du son comportant des informations de direction. Il est par exemple possible de diffuser du son sur des enceintes de types 5.1. Il est également possible de restituer, dans un casque ne disposant que d'un haut-parleur gauche et d'un haut-parleur droit, du son comportant des informations de directions, en utilisant des fonctions de transfert connues sous le nom de HRTF (Head-Related Transfer Functions, ou Fonctions de Transfert Relatives à la Tête). Ces fonctions permettent de restituer un signal directionnel sur deux haut-parleurs, en ajoutant à au moins un canal d'un signal stéréo un délai et/ou une atténuation, qui seront interprétés par le cerveau comme définissant la direction de la source sonore.The ambisonic coefficients therefore define, at each instant, the entire sound field surrounding a point. The processing of sound fields in the ambisonic domain has particularly interesting properties. In particular, it is very easy to rotate the entire sound field. It is also possible to broadcast on loudspeakers, from a set of ambisonic coefficients, sound comprising direction information. For example, it is possible to broadcast sound to 5.1-type speakers. It is also possible to play back, in a headset having only one left speaker and one right speaker, sound including direction information, using transfer functions known as HRTF ( Head-Related Transfer Functions). These functions allow a directional signal to be output to two speakers, adding to at least one channel of a stereo signal a delay and / or attenuation, which will be interpreted by the brain as defining the direction of the sound source.

La décomposition dite HOA (de l'acronyme anglais Higher Order Ambisonics, ou Ambisonie de Plus Haut Ordre) consiste à tronquer cette somme infinie à un ordre M, supérieur ou égal à 1 : $p (\vec{r}, t) = \sum_{m = 0}^{M} j^{m} j_{m} (kr) \sum_{n = - m}^{+ m} B_{mn} (t) Y_{mn} (θ, φ)$

The so-called HOA decomposition (from the English acronym Higher Order Ambisonics, or Higher Order Ambisonia) consists of truncating this infinite sum to an order M, greater than or equal to 1:

p (\vec{r}, t) = \sum_{m = 0}^{M} j^{m} j_{m} (kr) \sum_{not = - m}^{+ m} B_{mn} (t) Y_{mn} (θ, φ)

D'une manière générale, une source suffisamment distante est considérée comme propageant une onde sonore de manière sphérique. Il est alors possible de considérer que la valeur à un instant t d'un coefficient ambisonique B_mn ( t ) lié à cette source dépend, d'une part, de la pression sonore S ( t ) de la source à cet instant t, et d'autre part de l'harmonique sphérique liée à l'orientation ( θ_s , ϕ_s ) de cette source sonore. On peut donc écrire, pour une source sonore unique : $B_{mn} (t) = S (t) Y_{mn} (θ_{s}, φ_{s})$

In general, a sufficiently distant source is considered to propagate a sound wave in a spherical manner. It is then possible to consider that the value at an instant t of an ambisonic coefficient B _mn ( t ) linked to this source depends, on the one hand, on the sound pressure S ( t ) of the source at this instant t, and on the other hand of the spherical harmonic linked to the orientation ( θ _s , ϕ _s ) of this sound source. We can therefore write, for a single sound source:

B_{mn} (t) = S (t) Y_{mn} (θ_{s}, φ_{s})

Dans le cas d'un ensemble de N_s sources sonores lointaines, les coefficients ambisoniques décrivant la scène sonore sont calculés comme la somme des coefficients ambisoniques de chacune des sources, chaque source i ayant une orientation ( θ_si , ϕ_si ): $B_{mn} (t) = \sum_{i = 0}^{N_{s} - 1} S_{i} (t) Y_{mn} (θ_{s_{i}}, φ_{s_{i}})$

In the case of a set of N _s distant sound sources, the ambisonic coefficients describing the sound scene are calculated as the sum of the ambisonic coefficients of each of the sources, each source i having an orientation ( θ _si , ϕ _si ):

B_{mn} (t) = \sum_{i = 0}^{{NOT}_{s} - 1} S_{i} (t) Y_{mn} (θ_{s_{i}}, φ_{s_{i}})

On peut également représenter ce calcul sous forme de vecteur : $(\begin{matrix} B_{00} (t) \\ B_{1 - 1} (t) \\ B_{10} (t) \\ B_{11} (t) \\ ⋮ \\ B_{MM} (t) \end{matrix}) = \sum_{i = 0}^{N_{s} - 1} S_{i} (t) (\begin{matrix} Y_{00} (θ_{s_{i}}, φ_{s_{i}}) \\ Y_{1 - 1} (θ_{s_{i}}, φ_{s_{i}}) \\ Y_{10} (θ_{s_{i}}, φ_{s_{i}}) \\ Y_{11} (θ_{s_{i}}, φ_{s_{i}}) \\ ⋮ \\ Y_{MM} (θ_{s_{i}}, φ_{s_{i}}) \end{matrix})$

Les coefficients ambisoniques gardant la forme B_mn , avec, à l'ordre M, m allant de 0 à M, et n allant de -m à m.We can also represent this calculation in the form of a vector:

(\begin{matrix} B_{00} (t) \\ B_{1 - 1} (t) \\ B_{10} (t) \\ B_{11} (t) \\ ⋮ \\ B_{MM} (t) \end{matrix}) = \sum_{i = 0}^{{NOT}_{s} - 1} S_{i} (t) (\begin{matrix} Y_{00} (θ_{s_{i}}, φ_{s_{i}}) \\ Y_{1 - 1} (θ_{s_{i}}, φ_{s_{i}}) \\ Y_{10} (θ_{s_{i}}, φ_{s_{i}}) \\ Y_{11} (θ_{s_{i}}, φ_{s_{i}}) \\ ⋮ \\ Y_{MM} (θ_{s_{i}}, φ_{s_{i}}) \end{matrix})

The ambisonic coefficients keeping the form B _mn , with, at order M, m ranging from 0 to M, and n ranging from -m to m.

Un appareil comprenant un encodage ambisonique d'au moins une source peut donc définir un champ sonore complet, en calculant les coefficients ambisoniques à un ordre M. En fonction de l'ordre M, et du nombre de sources, ce calcul peut être long et gourmand en ressource. En effet, à un ordre M, (M + 1)² coefficients ambisoniques sont calculés à chaque instant t. Pour chaque coefficient, la contribution B_mn ( t ) = S ( t ) Y_mn ( θ_s ,ϕ_s ) de chacune des N_s sources doit être calculée. Si une source S est fixe, l'harmonique sphérique Y_mn ( θ_s , ϕ_s ) peut être pré-calculée. Dans le cas contraire, elle doit être recalculée à chaque instant.An apparatus comprising an ambisonic encoding of at least one source can therefore define a complete sound field, by calculating the ambisonic coefficients at an order M. Depending on the order M, and the number of sources, this calculation can be long and resource-hungry. Indeed, at an order M, (M + 1) ² ambisonic coefficients are calculated at each instant t. For each coefficient, the contribution B _mn ( t ) = S ( t ) Y _mn ( θ _s , ϕ _s ) of each of the N _s sources must be calculated. If a source S is fixed, the spherical harmonic Y _mn ( θ _s , ϕ _s ) can be pre-calculated. Otherwise, it must be recalculated at each moment.

Une augmentation de l'ordre du coefficient ambisonique permet une meilleure qualité du rendu auditif. Il peut donc être difficile d'obtenir une bonne qualité sonore, tout en préservant une charge, un temps de calcul raisonnable, une consommation électrique et un usage de batterie raisonnables. Ceci est d'autant plus vrai que les calculs de coefficients ambisoniques s'effectuent souvent en temps-réel sur des dispositifs mobiles. C'est par exemple le cas d'un smartphone, pour écouter de la musique en temps réel, avec des informations directionnelles calculées à l'aide de coefficients ambisoniques.An increase of the order of the ambisonic coefficient allows a better quality of the auditory rendering. Therefore, it can be difficult to achieve good sound quality, while preserving reasonable charge, computing time, power consumption, and battery usage. This is all the more true since the calculations of ambisonic coefficients are often carried out in real time on mobile devices. This is for example the case with a smartphone, for listening to music in real time, with directional information calculated using ambisonic coefficients.

Cette problématique est encore plus forte lorsque des réflexions sont calculées dans une scène sonore.This problem is even stronger when reflections are calculated in a sound scene.

Le calcul de réflexions permet de simuler une scène sonore dans une pièce, par exemple une salle de cinéma ou de concert. Dans ces conditions, le son se réfléchit sur les murs de la salle, donnant une « ambiance » caractéristique, les réflexions étant définies par les positions respectives des sources sonores, de l'auditeur, mais aussi par les matériaux sur lesquels les ondes sonores se diffusent, par exemple le matériau des murs. La création d'effets de salle à l'aide d'un codage audio ambisonique est notamment décrite par J. Daniel, Représentations de champs acoustiques, application à la transmission et à la reproduction de scènes sonores dans un contexte multimédia, INIST-CNRS, Cote INIST : T 139957, pp. 283-287 .The calculation of reflections makes it possible to simulate a sound scene in a room, for example a cinema or concert hall. Under these conditions, the sound is reflected on the walls of the room, giving a characteristic "atmosphere", the reflections being defined by the respective positions of the sound sources, of the listener, but also by the materials on which the sound waves are. diffuse, for example the material of the walls. The creation of room effects using ambisonic audio encoding is described in particular by J. Daniel, Representations of acoustic fields, application to the transmission and reproduction of sound scenes in a multimedia context, INIST-CNRS, INIST code: T 139957, pp. 283-287 .

Il est possible de simuler l'effet des réflexions et de donner une « ambiance » en ambisonie, en ajoutant, pour chaque source sonore, un ensemble de sources sonores secondaires, dont l'intensité et la direction sont calculées à partir des réflexions des sources sonores sur les murs et obstacles d'une scène sonore. Quelques sources sonores sont nécessaires, pour chaque source sonore initiale, afin de simuler de manière satisfaisante une scène sonore. Cependant, ceci rend le problème de capacité de calcul et de batterie précité encore plus critique, puisque la complexité de calcul des coefficients ambisoniques est encore multipliée par le nombre de sources sonores secondaires. La complexité du calcul des coefficients ambisoniques pour un rendu sonore satisfaisant peut alors rendre cette solution impraticable, par exemple parce qu'il devient impossible de calculer les coefficients ambisoniques en temps réel, parce que la charge de calcul des coefficients ambisoniques devient trop importante, ou parce la consommation électrique et/ou de batterie sur un appareil mobile devient rédhibitoire.It is possible to simulate the effect of the reflections and to give an “atmosphere” in ambisonics, by adding, for each sound source, a set of secondary sound sources, the intensity and direction of which are calculated from the reflections of the sources. sound on the walls and obstacles of a soundstage. A few sound sources are necessary, for each initial sound source, in order to satisfactorily simulate a sound scene. However, this makes the aforementioned calculation and battery capacity problem even more critical, since the complexity of calculating the ambisonic coefficients is further multiplied by the number of secondary sound sources. The complexity of the calculation of the ambisonic coefficients for a satisfactory sound rendering can then make this solution impractical, for example because it becomes impossible to calculate the ambisonic coefficients in real time, because the computation load of the ambisonic coefficients becomes too important, or because the power and / or battery consumption on a mobile device becomes prohibitive.

N. Tsingos et al. Perceptual Audio Rendering of Complex Virtual Environment, ACM Transactions on Graphics (TOG) - Proceedings of ACM SIGGRAPH 2004, Volume 23 Issue 3, August 200, pp. 249-258 divulgue une méthode de traitement binaural pour pallier ce problème. La solution proposée par Tsingos consiste à réduire le nombre de sources sonores en:

Evaluant la puissance de chaque source sonore ;
Classant les sources sonores, de la plus à la moins puissante ;
Supprimant les sources sonores les moins puissantes ;
Groupant les sources sonores restantes par grappes de sources sonores proches les unes des autres, et les fusionnant pour obtenir, pour chaque grappe, une unique source sonore virtuelle.

N. Tsingos et al. Perceptual Audio Rendering of Complex Virtual Environment, ACM Transactions on Graphics (TOG) - Proceedings of ACM SIGGRAPH 2004, Volume 23 Issue 3, August 200, pp. 249-258 discloses a binaural processing method to overcome this problem. The solution proposed by Tsingos consists in reducing the number of sound sources by:

Evaluating the power of each sound source;
Classifying the sound sources, from the most to the least powerful;
Eliminating the less powerful sound sources;
Grouping the remaining sound sources by clusters of sound sources close to each other, and merging them to obtain, for each cluster, a single virtual sound source.

La méthode divulguée par Tsingos permet de réduire le nombre de sources sonores, et donc la complexité du traitement global lorsque des réverbérations sont utilisées. Cependant, cette technique présente plusieurs inconvénients. Elle n'améliore pas la complexité du traitement des réverbérations elles-mêmes. Le problème rencontré se poserait donc à nouveau, si, avec un nombre réduit de sources, on souhaite augmenter le nombre de réverbérations. De plus, les traitements pour déterminer la puissance sonore de chaque source, et fusionner les sources par grappes présentent eux-mêmes une charge de calcul importante. Les expériences décrites se limitent à des cas ou les sources sonores sont connues à l'avance, et leurs puissances respectives pré-calculées. Dans des cas de scènes sonores pour lesquelles plusieurs sources d'intensités variables sont présentes, et dont les puissances doivent être recalculées, la charge de calcul associée viendrait, au moins partiellement, annuler le gain de calcul obtenu en limitant le nombre de sources.The method disclosed by Tsingos makes it possible to reduce the number of sound sources, and therefore the complexity of the overall processing when reverbs are used. However, this technique has several drawbacks. It does not improve the complexity of processing the reverbs themselves. The problem encountered would therefore arise again if, with a reduced number of sources, one wishes to increase the number of reverberations. In addition, the processing for determining the sound power of each source, and merging the sources by clusters themselves have a significant computational load. The experiments described are limited to cases where the sound sources are known in advance, and their respective powers are pre-calculated. In the case of sound scenes for which several sources of variable intensities are present, and whose powers must be recalculated, the associated calculation load would, at least partially, cancel out the calculation gain obtained by limiting the number of sources.

Enfin, les tests conduits par Tsingos donnent des résultats satisfaisants lorsque les sources sonores sont assimilables à du bruit, par exemple dans le cas d'une foule dans le métro. Sur d'autres types de sources sonores, une telle méthode pourrait s'avérer dommageable. Par exemple, lors de l'enregistrement d'un concert donné par un orchestre symphonique, il est fréquent que plusieurs instruments, bien qu'ayant une puissance sonore faible, contribuent de manière importante à l'harmonie d'ensemble. Supprimer purement et simplement les sources sonores associées, car elles sont relativement peu puissantes, nuirait alors gravement à la qualité de l'enregistrement.Finally, the tests conducted by Tsingos give satisfactory results when the sound sources can be assimilated to noise, for example in the case of a crowd in the metro. On other types of sound sources, such a method could prove to be harmful. For example, when recording a concert given by an orchestra symphonic, it is common for several instruments, although having low sound power, to contribute significantly to the overall harmony. Simply removing the associated sound sources, because they are relatively weak, would seriously damage the quality of the recording.

Le document US 6021206 divulgue un filtrage de sources sonores virtuelles correspondant aux réflexions comprenant un délai et une atténuation.The document US 6021206 discloses filtering of virtual sound sources corresponding to reflections including delay and attenuation.

Le document Markus Noistering et al. « A 3D AMBISONIC BASED BINAURAL SOUND REPRODUCTION SYSTEM » divulgue la création de sources sonores virtuelles correspondant aux réflexions et l'application d'un gain et d'un délai à chacune de ces sourcesThe document Markus Noistering et al. "A 3D AMBISONIC BASED BINAURAL SOUND REPRODUCTION SYSTEM" discloses the creation of virtual sound sources corresponding to the reflections and the application of gain and delay to each of these sources

Le document US 2007/160216 divulgue, de manière générale, le calcul de gains en fonction de la position d'une source sonore pour la binauralisation.The document US 2007/160216 generally discloses the calculation of gains as a function of the position of a sound source for binauralization.

Le document US 2005/069143 divulgue l'application de fonctions HRTF au son dans le domaine fréquentiel.The document US 2005/069143 discloses the application of HRTF functions to sound in the frequency domain.

Le document US 2011/305344 divulgue méthode de transformations des pistes sonores avant encodage binaural, afin de minimiser le besoin de « sweet spot », notamment en convertissant certaines pistes en mono.The document US 2011/305344 discloses a method of transforming sound tracks before binaural encoding, in order to minimize the need for a “sweet spot”, in particular by converting certain tracks into mono.

Il y a donc besoin d'un appareil et d'une méthode pour le calcul des coefficients ambisoniques, qui permette de calculer en temps réel un ensemble de coefficients ambisoniques représentatifs d'au moins une source sonore et une ou plusieurs réflexions de celle-ci dans une scène sonore, tout en limitant la complexité de calcul additionnelle liée à la ou aux réflexions de la source sonore, sans réduire à priori le nombre de sources sonores.There is therefore a need for an apparatus and a method for the calculation of the ambisonic coefficients, which makes it possible to calculate in real time a set of ambisonic coefficients representative of at least one sound source and one or more reflections thereof. in a sound scene, while limiting the additional complexity of calculation linked to the reflection (s) from the sound source, without reducing the number of sound sources a priori.

SUMMARY OF THE INVENTION

A cet effet, l'invention concerne un encodeur ambisonique d'onde sonore à pluralité de réflexions, comprenant : une logique de transformation fréquentielle de l'onde sonore ; une logique de calcul d'harmoniques sphériques de l'onde sonore et de la pluralité de réflexions à partir d'une position d'une source de l'onde sonore et de positions d'obstacles à une propagation de l'onde sonore ; une pluralité de logiques de filtrage dans le domaine fréquentiel recevant en entrée des harmoniques sphériques de la pluralité de réflexions, chaque logique de filtrage consistant en une atténuation et un délai d'une réflexion, et étant paramétrée par un coefficient acoustique et un délai de ladite réflexion ; une logique d'addition d'harmoniques sphériques de l'onde sonore et des sorties des logiques de filtrage, en un ensemble d'harmoniques sphériques représentatives à la fois de l'onde sonore et de la pluralité de réflexions dans le domaine fréquentiel ; une logique de multiplication dudit ensemble d'harmoniques sphériques représentatives à la fois de l'onde sonore et de la pluralité de réflexions dans le domaine fréquentiel par des valeurs d'intensité sonores de l'onde en sortie de la transformation fréquentielle, afin d'obtenir un ensemble de coefficients ambisoniques représentatifs à la fois de l'onde sonore et de la pluralité de réflexions.To this end, the invention relates to an ambisonic encoder for a sound wave with a plurality of reflections, comprising: a logic for frequency transformation of the sound wave; logic for calculating spherical harmonics of the sound wave and the plurality of reflections from a position of a source of the sound wave and positions of obstacles to propagation of the sound wave; a plurality of filtering logics in the frequency domain receiving as input spherical harmonics of the plurality of reflections, each filtering logic consisting of an attenuation and a delay of a reflection, and being parameterized by an acoustic coefficient and a delay of said reflection; a logic of adding spherical harmonics of the sound wave and of the outputs of the filtering logics, into a set of spherical harmonics representative both of the sound wave and of the plurality of reflections in the frequency domain; a logic for multiplying said set of spherical harmonics representative both of the sound wave and of the plurality of reflections in the frequency domain by sound intensity values of the wave at the output of the frequency transformation, in order to obtaining a set of ambisonic coefficients representative of both the sound wave and the plurality of reflections.

Avantageusement, la logique de calcul d'harmoniques sphériques de l'onde sonore est configurée pour calculer les harmoniques sphériques de l'onde sonore et de la pluralité de réflexions à partir d'une position fixe de la source de l'onde sonore.Advantageously, the logic for calculating spherical harmonics of the sound wave is configured to calculate the spherical harmonics of the sound wave and of the plurality of reflections from a fixed position of the source of the sound wave.

Avantageusement, la logique de calcul d'harmoniques sphériques de l'onde sonore est configurée pour calculer de manière itérative les harmoniques sphériques de l'onde sonore et de la pluralité de réflexions à partir de positions successives de la source de l'onde sonore.Advantageously, the logic for calculating spherical harmonics of the sound wave is configured to iteratively calculate the spherical harmonics of the sound wave and of the plurality of reflections from successive positions of the source of the sound wave.

Avantageusement, chaque réflexion est caractérisée par un unique coefficient acoustique.Advantageously, each reflection is characterized by a single acoustic coefficient.

Avantageusement, chaque réflexion est caractérisée par un coefficient acoustique pour chaque fréquence dudit échantillonnage fréquentiel.Advantageously, each reflection is characterized by an acoustic coefficient for each frequency of said frequency sampling.

Avantageusement, les réflexions sont représentées par des sources sonores virtuelles.Advantageously, the reflections are represented by virtual sound sources.

Avantageusement, l'encodeur ambisonique comprend en outre une logique de calcul des coefficients acoustiques, des délais et de la position de des sources sonores virtuelles des réflexions, ladite logique de calcul étant configurée pour calculer les coefficients acoustiques et les délais des réflexions en fonction d'estimations d'une différence de distance parcourue par le son entre la position de la source de l'onde sonore et une position estimée d'un utilisateur d'une part, et d'une distance parcourue par le son entre les positions des sources sonores virtuelles des réflexions et la position estimée de l'utilisateur d'autre part.Advantageously, the ambisonic encoder further comprises a logic for calculating the acoustic coefficients, the delays and the position of the virtual sound sources of the reflections, said calculation logic being configured to calculate the acoustic coefficients and the delays of the reflections as a function of '' estimates of a difference in the distance traveled by sound between the position of the source of the sound wave and an estimated position of a user on the one hand, and of a distance traveled by the sound between the positions of the virtual sound sources of the reflections and the estimated position of the user on the other hand.

Avantageusement, la logique de calcul des coefficients acoustiques, des délais et des positions des sources sonores virtuelles des réflexions est en outre configurée pour calculer les coefficients acoustiques des réflexions en fonction d'au moins un coefficient acoustique d'au moins un obstacle à la propagation d'ondes sonores, sur lequel le son est réfléchi.Advantageously, the logic for calculating the acoustic coefficients, the delays and the positions of the virtual sound sources of the reflections is further configured to calculate the acoustic coefficients of the reflections as a function of at least one acoustic coefficient of at least one obstacle to propagation. of sound waves, on which the sound is reflected.

Avantageusement, la logique de calcul des coefficients acoustiques, des délais et des positions des sources sonores virtuelles des réflexions est en outre configurée pour calculer les coefficients acoustiques des réflexions en fonction d'un coefficient acoustique d'au moins un obstacle à la propagation d'ondes sonores, sur lequel le son est réfléchi.Advantageously, the logic for calculating the acoustic coefficients, the delays and the positions of the virtual sound sources of the reflections is further configured to calculate the acoustic coefficients of the reflections as a function of an acoustic coefficient of at least one obstacle to the propagation of sound waves, on which sound is reflected.

Avantageusement, la logique de calcul d'harmoniques sphériques de l'onde sonore et de la pluralité de réflexions est en outre configurée pour calculer des harmoniques sphériques de l'onde sonore et de la pluralité de réflexions à chaque fréquence de sortie du circuit de transformation fréquentielle, ledit encodeur ambisonique comprenant en outre une logique de calcul de coefficients binauraux de l'onde sonore, configurée pour calculer des coefficient binauraux de l'onde sonore en multipliant à chaque fréquence de sortie du circuit de transformation fréquentielle de l'onde sonore le signal de l'onde sonore par les harmoniques sphériques de l'onde sonore et de la pluralité de réflexions à cette fréquence.Advantageously, the logic for calculating spherical harmonics of the sound wave and of the plurality of reflections is further configured to calculate spherical harmonics of the sound wave and of the plurality of reflections at each output frequency of the transformation circuit. frequency, said ambisonic encoder further comprising a logic for calculating binaural coefficients of the sound wave, configured to calculate binaural coefficients of the sound wave by multiplying at each output frequency of the frequency transformation circuit of the sound wave the signal of the sound wave by the spherical harmonics of the sound wave and the plurality of reflections at this frequency.

Avantageusement, la logique de calcul des coefficients acoustiques, des délais et des positions des sources sonores virtuelles des réflexions est configurée pour calculer des coefficients acoustiques et des délais d'une pluralité de réflexions tardives.Advantageously, the logic for calculating the acoustic coefficients, the delays and the positions of the virtual sound sources of the reflections is configured to calculate the acoustic coefficients and the delays of a plurality of late reflections.

L'invention concerne également une méthode d'encodage ambisonique d'onde sonore à pluralité de réflexions, comme définie par la revendication 12.The invention also relates to a method of ambisonically encoding a sound wave with a plurality of reflections, as defined by claim 12.

L'invention concerne également un programme d'ordinateur pour l'encodage ambisonique d'onde sonore à pluralité de réflexions, comme définie par la revendication 13.The invention also relates to a computer program for ambisonic encoding of a plurality of reflections sound wave, as defined by claim 13.

L'encodeur ambisonique selon l'invention permet d'améliorer la sensation d'immersion dans une scène audio 3D.The ambisonic encoder according to the invention makes it possible to improve the feeling of immersion in a 3D audio scene.

La complexité d'encodage des réflexions de sources sonores d'un encodeur ambisonique selon l'invention est moindre que la complexité d'encodage des réflexions de sources sonores d'un encodeur ambisonique selon l'état de l'art.The complexity of encoding the reflections of sound sources from an ambisonic encoder according to the invention is less than the complexity of encoding the reflections of sound sources from an ambisonic encoder according to the state of the art.

L'encodeur ambisonique selon l'invention permet d'encoder un plus grand nombre de réflexions d'une source sonore en temps réel.The ambisonic encoder according to the invention makes it possible to encode a greater number of reflections from a sound source in real time.

L'encodeur ambisonique selon l'invention permet de diminuer la consommation électrique liée à l'encodage ambisonique, et d'augmenter la durée de vie d'une batterie d'un appareil mobile utilisé pour cette application.The ambisonic encoder according to the invention makes it possible to reduce the power consumption associated with ambisonic encoding, and to increase the life of a battery of a mobile device used for this application.

LIST OF FIGURES

D'autres caractéristiques apparaîtront à la lecture de la description détaillée donnée à titre d'exemple et non limitative qui suit faite au regard de dessins annexés qui représentent:

les figures 1a et 1b, deux exemples de systèmes d'écoute d'onde sonore, selon deux modes de réalisation de l'invention
la figure 2, un exemple d'un système de binauralisation comprenant un moteur de binauralisation par source sonore d'une scène audio selon l'état de l'art ;
les figures 3a et 3b, deux exemples de moteurs de binauralisation d'une scène 3D, respectivement dans le domaine temporel et le domaine fréquentiel selon l'état de l'art
la figure 4, un exemple d'encodeur ambisonique d'une onde sonore à une pluralité de réflexions, dans un ensemble de modes de mise en œuvre de l'invention ;
la figure 5, un exemple de calcul d'une source sonore secondaire, dans un mode de mise en œuvre de l'invention ;
la figure 6, un exemple de calcul de réflexions précoces et de réflexions tardives, dans un mode de réalisation de l'invention ;
la figure 7, une méthode d'encodage d'une onde sonore à une pluralité de réflexions dans un ensemble de modes de mise en œuvre de l'invention.

Other characteristics will appear on reading the detailed description given by way of example and not limiting which follows, given with reference to the appended drawings which represent:

the figures 1a and 1b , two examples of sound wave listening systems, according to two embodiments of the invention
the figure 2 , an example of a binauralization system comprising a binauralization engine by sound source of an audio scene according to the state of the art;
the figures 3a and 3b , two examples of binauralization engines of a 3D scene, respectively in the time domain and the frequency domain according to the state of the art
the figure 4 , an example of an ambisonic encoder of a sound wave with a plurality of reflections, in a set of embodiments of the invention;
the figure 5 , an example of calculation of a secondary sound source, in one embodiment of the invention;
the figure 6 , an example of calculation of early reflections and late reflections, in one embodiment of the invention;
the figure 7 , a method of encoding a sound wave with a plurality of reflections in a set of embodiments of the invention.

DETAILED DESCRIPTION

Les figures 1a et 1b représentent deux exemples de systèmes d'écoute d'onde sonore, selon deux modes de réalisation de l'invention.The figures 1a and 1b show two examples of sound wave listening systems, according to two embodiments of the invention.

La figure 1a représente un exemple de système d'écoute d'onde sonore, selon un mode de réalisation de l'invention.The figure 1a represents an example of a sound wave listening system, according to one embodiment of the invention.

Le système 100a comprend une tablette tactile 110a, un casque 120a pour permettre à un utilisateur 130a d'écouter une onde sonore. Le système 100a, comprend, à titre d'exemple uniquement, une tablette tactile. Cependant, cet exemple est également applicable à un smartphone, ou à tout autre appareil mobile possédant des capacités d'affichage et de diffusion sonore. L'onde sonore peut par exemple être issue de la lecture d'un film ou d'un jeu. Selon plusieurs modes de réalisation de l'invention, le système 100a peut être configuré pour écouter plusieurs ondes sonores. Par exemple, lorsque le système 100a est configuré pour la lecture d'un film comprenant une piste sonore multicanal 5.1, 6 ondes sonores sont écoutées simultanément. De la même manière, lorsque le système 100a est configuré pour jouer à un jeu, de nombreuses ondes sonores peuvent être écoutées simultanément. Par exemple, dans le cas d'un jeu faisant intervenir plusieurs personnages, une onde sonore peut être créée pour chaque personnage.The system 100a comprises a touch pad 110a, a headset 120a to allow a user 130a to listen to a sound wave. The system 100a comprises, by way of example only, a touchscreen tablet. However, this example is also applicable to a smartphone, or to any other mobile device having display and sound broadcasting capabilities. The sound wave can for example come from playing a movie or a game. According to several embodiments of the invention, the system 100a can be configured to listen to several sound waves. For example, when the system 100a is configured for playing a movie comprising a 5.1 multichannel sound track, 6 sound waves are listened to simultaneously. Likewise, when system 100a is configured to play a game, many sound waves can be heard simultaneously. For example, in the case of a game involving several characters, a sound wave can be created for each character.

Chacune des ondes sonores est associée à une source sonore, dont la position est connue.Each of the sound waves is associated with a sound source, the position of which is known.

La tablette tactile 110a comprend un encodeur ambisonique 111a selon l'invention, un circuit de transformation 112a, et un décodeur ambisonique 113a.The touchscreen tablet 110a comprises an ambisonic encoder 111a according to the invention, a transformation circuit 112a, and an ambisonic decoder 113a.

Selon un ensemble de modes de réalisation de l'invention, l'encodeur ambisonique 111a, le circuit de transformation 112a et le décodeur ambisonique 113a sont constitués d'instructions de code d'ordinateur exécutées sur un processeur de la tablette tactile. Ils peuvent par exemple avoir été obtenus en installant une application ou un logiciel spécifique sur la tablette. Dans d'autres modes de réalisation de l'invention, l'un au moins parmi l'encodeur ambisonique 111a, le circuit de transformation 112a et le décodeur ambisonique 113a est un circuit intégré spécialisé, par exemple un ASIC (acronyme de l'anglais « Application-Specific Integrated Circuit, littéralement « circuit intégré propre à une application »), un FPGA (acronyme de l'anglais Field-Programmable Gate Array, Réseau de portes programmable).According to a set of embodiments of the invention, the ambisonic encoder 111a, the transformation circuit 112a and the ambisonic decoder 113a consist of computer code instructions executed on a processor of the touch pad. They may for example have been obtained by installing a specific application or software on the tablet. In other embodiments of the invention, at least one of the ambisonic encoder 111a, the transformation circuit 112a and the ambisonic decoder 113a is a specialized integrated circuit, for example an ASIC (acronym for English “Application-Specific Integrated Circuit, literally“ application-specific integrated circuit ”), an FPGA (acronym for English Field-Programmable Gate Array).

L'encodeur ambisonique 111a est configuré pour calculer, dans le domaine fréquentiel, un ensemble de coefficients ambisoniques représentatifs de l'ensemble d'une scène sonore, à partir d'au moins une onde sonore. Il est de plus configuré pour appliquer des réflexions à au moins une onde sonore, afin de simuler un environnement d'écoute, par exemple une salle de cinéma d'une certaine taille, ou une salle de concert.The ambisonic encoder 111a is configured to calculate, in the frequency domain, a set of ambisonic coefficients representative of the whole of a sound scene, from at least one sound wave. It is further configured to apply reflections to at least one sound wave, in order to simulate a listening environment, for example a movie theater of a certain size, or a concert hall.

Le circuit de transformation 112a est configuré pour effectuer des rotations de la scène sonore en modifiant les coefficients ambisoniques, afin de simuler la rotation de la tête de l'utilisateur, de sorte que, quelle que soit l'orientation de son visage, les différentes ondes sonores lui paraissent parvenir d'une même position. Par exemple, si l'utilisateur tourne la tête vers la gauche d'un angle a, une rotation de la scène sonore vers la droite d'un même angle α permet de continuer à lui faire parvenir le son toujours de la même direction. Selon un ensemble de modes de réalisation de l'invention, le casque 120a est équipé d'au moins un capteur de mouvement 121a, par exemple un gyromètre, permettant d'obtenir un angle ou une dérivée d'un angle de rotation de la tête de l'utilisateur 130a. Un signal représentatif d'un angle de rotation, ou d'une dérivée d'un angle de rotation, est alors envoyé par le casque 121a à la tablette 120a, afin que le circuit de transformation 112a effectue la rotation de la scène sonore correspondante.The transformation circuit 112a is configured to perform rotations of the soundstage by modifying the ambisonic coefficients, in order to simulate the rotation of the user's head, so that, whatever the orientation of his face, the different sound waves seem to come from the same position. For example, if the user turns his head to the left by an angle α, a rotation of the sound stage to the right by the same angle α makes it possible to continue sending the sound to him always from the same direction. According to a set of embodiments of the invention, the helmet 120a is equipped with at least one movement sensor 121a, for example a gyrometer, making it possible to obtain an angle or a derivative of an angle of rotation of the head. user 130a. A signal representative of an angle of rotation, or of a derivative of an angle of rotation, is then sent by the headphones 121a to the tablet 120a, so that the transformation circuit 112a performs the rotation of the corresponding sound scene.

Le décodeur ambisonique 113a est configuré pour restituer la scène sonore sur les deux canaux stéréo du casque 120a, en convertissant les coefficients ambisoniques transformés en deux signaux stéréo, l'un pour le canal gauche et l'autre pour le canal droit. Dans un ensemble de modes de réalisation de l'invention, le décodage ambisonique s'effectue à l'aide de fonctions dites HRTF (acronyme de l'anglais « Head Related Transfer Functions », littéralement Fonctions de Transfer Liées à la Tête) permettant de restituer, sur deux canaux stéréo les directions des différentes sources sonores. La demande de brevet français n° 1558279 , déposée par le demandeur, décrit une méthode pour créer des fonctions HRTF optimisées pour un utilisateur en fonction d'une banque de fonctions HRTF, et des caractéristiques du visage dudit utilisateur.The ambisonic decoder 113a is configured to reproduce the sound scene on the two stereo channels of the headphones 120a, by converting the transformed ambisonic coefficients into two stereo signals, one for the left channel and the other for the right channel. In a set of embodiments of the invention, the ambisonic decoding is carried out using functions called HRTF (acronym for the English “Head Related Transfer Functions”, literally Head Related Transfer Functions) making it possible to reproduce the directions of the different sound sources on two stereo channels. French patent application no. 1558279 , filed by the applicant, describes a method for creating HRTF functions optimized for a user as a function of a bank of HRTF functions, and of the facial characteristics of said user.

Le système 100a permet ainsi à son utilisateur de bénéficier d'une expérience particulièrement immersive : lors d'un jeu ou d'une lecture d'un contenu multimédia, en plus de l'image, ce système lui permet de bénéficier d'une impression d'immersion dans une scène sonore. Cette impression est amplifiée à la fois par le suivi des orientations des différentes sources sonores lorsque l'utilisateur tourne la tête, et par l'application de réflexions donnant une impression d'immersion dans un environnement d'écoute particulier. Ce système permet par exemple de regarder un film ou un concert avec un casque audio, en ayant une impression d'immersion dans une salle de cinéma ou une salle de concert. L'ensemble de ces opérations est effectué en temps réel, ce qui permet d'adapter en permanence le son perçu par l'utilisateur à l'orientation de sa tête.The system 100a thus allows its user to benefit from a particularly immersive experience: during a game or a multimedia content reading, in addition to the image, this system allows him to benefit from an impression. immersion in a sound scene. This impression is amplified both by following the orientations of the different sound sources when the user turns his head, and by the application of reflections giving an impression of immersion in a particular listening environment. This system makes it possible, for example, to watch a film or a concert with an audio headset, while having an impression of immersion in a cinema hall or a concert hall. All of these operations are carried out in real time, which makes it possible to constantly adapt the sound perceived by the user to the orientation of his head.

L'encodeur ambisonique 111a selon l'invention permet d'encoder un plus grand nombre de réflexions des sources sonores, avec une complexité moindre par rapport à un encodeur ambisonique de l'art antérieur. Il permet donc d'effectuer tous les calculs ambisoniques en temps réel, tout en augmentant le nombre de réflexions des sources sonores. Cette augmentation du nombre de réflexions permet de modéliser de manière plus fine l'environnement d'écoute simulé (salle de concert, de cinéma...) et donc d'améliorer la sensation d'immersion dans la scène sonore. La réduction de la complexité de l'encodage ambisonique permet également, en considérant un nombre identique de source sonores, de réduire la consommation électrique de l'encodeur par rapport à un encodeur de l'état de l'art, et donc d'augmenter la durée de déchargement de la batterie de la tablette tactile 110a. Cela permet donc à l'utilisateur de profiter d'un contenu multimédia pendant une durée plus longue.The ambisonic encoder 111a according to the invention makes it possible to encode a greater number of reflections from sound sources, with less complexity compared to an ambisonic encoder of the prior art. It therefore makes it possible to perform all the ambisonic calculations in real time, while increasing the number of reflections from sound sources. This increase in the number of reflections makes it possible to model more precisely the simulated listening environment (concert hall, cinema, etc.) and therefore improve the feeling of immersion in the sound scene. The reduction in the complexity of the ambisonic encoding also makes it possible, by considering an identical number of sound sources, to reduce the electrical consumption of the encoder compared to an encoder of the state of the art, and therefore to increase the battery discharge time of the touch pad 110a. This therefore allows the user to enjoy multimedia content for a longer period of time.

La figure 1b représente un second exemple de système d'écoute d'onde sonore, selon un mode de réalisation de l'invention.The figure 1b represents a second example of a sound wave listening system, according to one embodiment of the invention.

Le système 100b comprend une unité centrale 110b connectée à un écran 114b, une souris 115b et un clavier 116b et un casque 120b et est utilisé par un utilisateur 130b. L'unité centrale comprend un encodeur ambisonique 111b selon l'invention, un circuit de transformation 112b, et un décodeur ambisonique 113b, respectivement semblables à l'encodeur ambisonique 111a, circuit de transformation 112a, et décodeur ambisonique 113a du système 100a. De manière similaire au système 100a, l'encodeur ambisonique 111b est configuré pour encoder au moins une onde représentative d'une scène sonore en y ajoutant des réflexions, le casque 120a comprend au moins un capteur de mouvement 120b, le circuit de transformation 120b est configuré pour effectuer des rotations de la scène sonore afin de suivre l'orientation de la tête de l'utilisateur, et le décodeur ambisonique 113b est configuré pour restituer le son sur les deux canaux stéréo du casque 120b, de manière à ce que l'utilisateur 130b ait une impression d'immersion dans une scène sonore.The system 100b includes a central unit 110b connected to a screen 114b, a mouse 115b and a keyboard 116b and a headset 120b and is used by a user 130b. The central unit comprises an ambisonic encoder 111b according to the invention, a transformation circuit 112b, and an ambisonic decoder 113b, respectively similar to the ambisonic encoder 111a, transformation circuit 112a, and ambisonic decoder 113a of the system 100a. Similarly to the system 100a, the ambisonic encoder 111b is configured to encode at least one wave representative of a sound scene by adding reflections thereto, the headphones 120a include at least one movement sensor 120b, the transformation circuit 120b is configured to perform soundstage rotations to follow the orientation of the user's head, and the ambisonic decoder 113b is configured to output sound on the two stereo channels of the headphones 120b, so that the user 130b has an impression of immersion in a sound scene.

Le système 100b est adapté pour la visualisation de contenu multimédia, mais également pour le jeu vidéo. En effet, dans un jeu vidéo, de très nombreuses ondes sonores, issues de différentes sources, peuvent survenir. C'est par exemple le cas dans un jeu de stratégie ou de guerre, dans lequel de nombreux personnages peuvent émettre des sons différents (bruits de pas, de course, tirs...) pour diverses sources sonores. Un encodeur ambisonique 111b permet d'encoder toutes ces sources, tout en leur ajoutant de nombreuses réflexions rendant la scène plus réaliste et immersive, en temps réel. Ainsi, le système 100b comprenant un encodeur ambisonique 111b selon l'invention permet une expérience immersive dans un jeu vidéo, avec un grand nombre de sources sonores et de réflexions.The 100b system is suitable for viewing multimedia content, but also for video games. Indeed, in a video game, very many sound waves, coming from different sources, can occur. This is for example the case in a strategy or war game, in which many characters can emit different sounds (noises of footsteps, running, shots, etc.) for various sound sources. A 111b ambisonic encoder can encode all these sources, while adding many reflections to them making the scene more realistic and immersive, in real time. Thus, the system 100b comprising an ambisonic encoder 111b according to the invention allows an immersive experience in a video game, with a large number of sound sources and reflections.

La figure 2 représente un exemple d'un système de binauralisation comprenant un moteur de binauralisation par source sonore d'une scène audio selon l'état de l'art.The figure 2 represents an example of a binauralization system comprising a binauralization engine by sound source of an audio scene according to the state of the art.

Le système de binauralisation 200 est configuré pour transformer un ensemble 210 de sources sonores d'une scène sonore en un canal gauche 240 et un canal droit 241 d'un système d'écoute stéréo, et comprend un ensemble de moteurs binauraux 220, comprenant un moteur binaural par source sonore.The binauralization system 200 is configured to transform a set 210 of sound sources of a soundstage into a left channel 240 and a right channel 241 of a stereo listening system, and includes a set of binaural motors 220, including a binaural motor by sound source.

Les sources peuvent être de tout type de sources sonores (mono, stéréo, 5.1, sources sonores multiples dans le cas d'un jeu vidéo par exemple). Chaque source sonore est associée à une orientation dans l'espace par exemple définie par des angles ( θ,ϕ ) dans un référentiel, et par une onde sonore, elle-même représentée par un ensemble d'échantillons temporels.The sources can be from any type of sound source (mono, stereo, 5.1, multiple sound sources in the case of a video game for example). Each sound source is associated with an orientation in space, for example defined by angles ( θ , ϕ ) in a frame of reference, and by a sound wave, itself represented by a set of temporal samples.

Chacun des moteurs de binauralisation de l'ensemble 220 est configuré pour, pour une source sonore et à chaque instant t correspondant à un échantillon de la source sonore :

effectuer un encodage HOA de la source sonore à un ordre M ;
effectuer une transformation sur les coefficients binauraux, par exemple une rotation ;
calculer une intensité sonore p( r ,t) à des instants t pour un ensemble de canaux de sortie, dans laquelle r représente l'orientation du canal de sortie.

Each of the binauralization engines of the set 220 is configured for, for a sound source and at each instant t corresponding to a sample of the sound source:

performing HOA encoding of the sound source at an M order;
performing a transformation on the binaural coefficients, for example a rotation;
calculate a sound intensity p ( r , t ) at times t for a set of output channels, in which r represents the orientation of the output channel.

Les canaux de sortie possibles correspondent aux différents canaux d'écoute, on peut par exemple avoir deux canaux de sortie dans un système d'écoute stéréo, 6 canaux de sortie dans un système d'écoute 5.1, etc...The possible output channels correspond to the different listening channels, for example we can have two output channels in a stereo listening system, 6 output channels in a 5.1 listening system, etc ...

Chaque moteur de binauralisation produit deux sorties (une sortie gauche et une sortie droite), et le système 200 comprend un circuit d'addition 230 de toutes les sorties gauches et un circuit d'addition 231 de toutes les sorties droites de l'ensemble 220 de moteurs de binauralisation. Les sorties des logiques d'addition 230 et 231 sont respectivement l'onde sonore du canal gauche 240 et l'onde sonore du canal droit 241 d'un système d'écoute stéréo.Each binauralization motor produces two outputs (one left and one right output), and the system 200 includes an addition circuit 230 of all the left outputs and an addition circuit 231 of all the right outputs of the set 220. binauralization engines. The outputs of the addition logic 230 and 231 are respectively the sound wave of the left channel 240 and the sound wave of the right channel 241 of a stereo listening system.

Le système 200 permet de transformer l'ensemble de sources sonores 210 en deux canaux stéréo, tout en pouvant appliquer toutes les transformations permises par l'ambisonie, telles que des rotations.The system 200 makes it possible to transform the set of sound sources 210 into two stereo channels, while being able to apply all the transformations allowed by the ambisonia, such as rotations.

Cependant, le système 200 présente un inconvénient majeur en termes de temps de calcul : il nécessite des calculs pour calculer les coefficients ambisoniques de chaque source sonore, des calculs pour les transformations de chaque source sonore, et des calculs pour les sorties associées à chaque source sonore. La charge de calcul pour le traitement d'une source sonore par le système 200 est donc proportionnelle au nombre de sources sonores, et peut, pour un grand nombre de sources sonores, devenir prohibitive.However, the system 200 has a major drawback in terms of calculation time: it requires calculations to calculate the ambisonic coefficients of each sound source, calculations for the transformations of each sound source, and calculations for the outputs associated with each source. sound. The computational load for the processing of a sound source by the system 200 is therefore proportional to the number of sound sources, and can, for a large number of sound sources, become prohibitive.

Les figures 3a et 3b représentent deux exemples de moteurs de binauralisation d'une scène 3D, respectivement dans le domaine temporel et le domaine fréquentiel selon l'état de l'art.The figures 3a and 3b represent two examples of engines for binauralization of a 3D scene, respectively in the time domain and the frequency domain according to the state of the art.

La figure 3a représente un exemple de moteur de binauralisation d'une scène 3D, dans le domaine temporel selon l'état de l'art.The figure 3a represents an example of a binauralization engine of a 3D scene, in the time domain according to the state of the art.

Afin de limiter la complexité du traitement binaural dans le cas d'un grand nombre de sources, le moteur de binauralisation 300a comprend un unique moteur d'encodage HOA 320a pour l'ensemble des sources 310 de la scène sonore. Ce moteur d'encodage 320a est configuré pour calculer, à chaque pas de temps, les coefficients binauraux de chaque source sonore en fonction de l'intensité et de la position de la source sonore audit pas de temps, puis à sommer les coefficients binauraux des différentes sources sonores. Ceci permet d'obtenir un unique ensemble 321a de coefficients binauraux représentatifs de l'ensemble de la scène sonore.In order to limit the complexity of the binaural processing in the case of a large number of sources, the binauralization engine 300a comprises a single HOA encoding engine 320a for all of the sources 310 of the sound scene. This encoding engine 320a is configured to calculate, at each time step, the binaural coefficients of each sound source as a function of the intensity and the position of the sound source at said time step, then to sum the binaural coefficients of the different sound sources. This makes it possible to obtain a single set 321a of binaural coefficients representative of the whole of the sound scene.

Le moteur de binauralisation 320a comprend ensuite un circuit de transformation 330a des coefficients, configuré pour transformer l'ensemble de coefficients 321a représentatifs de la scène sonore en un ensemble de coefficients transformés 331a représentatifs de l'ensemble de la scène sonore. Ceci permet par exemple d'effectuer une rotation de l'ensemble de la scène sonore.The binauralization engine 320a then comprises a coefficient transformation circuit 330a, configured to transform the set of coefficients 321a representative of the sound scene into a set of transformed coefficients 331a representative of the whole of the sound scene. This makes it possible for example to perform a rotation of the whole of the sound scene.

Le moteur de binauralisation 300a comprend enfin un décodeur binaural 340a, configuré pour restituer les coefficients transformés 331a en un ensemble de canaux de sortie, par exemple un canal gauche 341a et un canal droit 342a d'un système stéréo.The binauralization engine 300a finally comprises a binaural decoder 340a, configured to restore the transformed coefficients 331a into a set of output channels, for example a left channel 341a and a right channel 342a of a stereo system.

Le moteur de binauralisation 300a permet donc de réduire la complexité de calcul nécessaire au traitement binaural d'une scène sonore par rapport au système 200, en appliquant les étapes de transformation et décodage à l'ensemble de la scène sonore, plutôt qu'à chaque source sonore prise individuellement.The binauralization engine 300a therefore makes it possible to reduce the computational complexity necessary for the binaural processing of a sound scene compared to the system 200, by applying the transformation and decoding steps to the whole of the sound scene, rather than to each sound source taken individually.

figure 3b représente un exemple de moteur de binauralisation d'une scène 3D, dans le domaine fréquentiel selon l'état de l'art. figure 3b represents an example of a binauralization engine of a 3D scene, in the frequency domain according to the state of the art.

Le moteur de binauralisation 300b est assez semblable au moteur de binauralisation 300a. Il comprend un ensemble 311b de logiques de transformation fréquentielle, l'ensemble 311b comprenant une logique de transformation fréquentielle pour chaque source sonore. Les logiques de transformation fréquentielle peuvent par exemple être configurées pour appliquer une transformée de Fourier rapide (ou FFT, de l'acronyme anglais Fast Fourier Transform), afin d'obtenir un ensemble 312b de sources dans le domaine fréquentiel. L'application de transformées fréquentielles est bien connue de l'homme de l'art, et est par exemple décrite par A. Mertins, Signal Analysis : Wavelets, Filter banks, Time-Frequency Transforms and Applications, English (revised edition). ISBN : 9780470841839 . Elle consiste par exemple à transformer, par fenêtres temporelles, les échantillons sonores en intensité fréquentielles, selon un échantillonnage fréquentiel. L'opération inverse, ou transformation fréquentielle inverse (dite FFT^-1 ou transformation de Fourier rapide inverse dans le cas d'une transformée de Fourier rapide) permet de restituer, à partir d'un échantillonnage de fréquences, des intensités d'échantillons sonores.The 300b binauralization engine is quite similar to the 300a binauralization engine. It comprises a set 311b of frequency transformation logics, the set 311b comprising a frequency transformation logic for each sound source. The frequency transformation logics can for example be configured to apply a fast Fourier transform (or FFT, from the acronym Fast Fourier Transform), in order to obtain a set 312b of sources in the frequency domain. The application of frequency transforms is well known to those skilled in the art, and is for example described by A. Mertins, Signal Analysis: Wavelets, Filter banks, Time-Frequency Transforms and Applications, English (revised edition). ISBN: 9780470841839 . It consists for example in transforming, by time windows, the sound samples into frequency intensity, according to frequency sampling. The inverse operation, or inverse frequency transformation (known as FFT ^-1 or inverse fast Fourier transformation in the case of a fast Fourier transform) makes it possible to restore, from a sampling of frequencies, the intensities of sound samples .

Le moteur de binauralisation 300b comprend ensuite un encodeur HOA 320b dans le domaine fréquentiel. L'encodeur 320b est configuré pour calculer, pour chaque source et à chaque fréquence de l'échantillonnage fréquentiel, les coefficients ambisoniques correspondants, puis à additionner les coefficients ambisoniques des différentes sources, afin d'obtenir un ensemble 321b d'échantillons ambisoniques représentatifs de l'ensemble de la scène sonore, aux différentes fréquences. Un coefficient ambisonique à une fréquence f de l'échantillonnage en fréquence s'obtient, de manière similaire à un coefficient ambisonique à l'instant t, par la formule: B_mn ( f ) = S ( f ) Y_mn ( θ_s , ϕ_s ).The binauralization engine 300b then includes an HOA encoder 320b in the frequency domain. The encoder 320b is configured to calculate, for each source and at each frequency of the frequency sampling, the corresponding ambisonic coefficients, then to add the ambisonic coefficients of the different sources, in order to obtain a set 321b of ambisonic samples representative of the whole soundstage, at different frequencies. An ambisonic coefficient at a frequency f of the frequency sampling is obtained, so similar to an ambisonic coefficient at time t, by the formula: B _mn ( f ) = S ( f ) Y _mn ( θ _s , ϕ _s ).

Le moteur de binauralisation 300b comprend ensuite un circuit de transformation 330b, similaire au circuit de transformation 330a, permettant d'obtenir un ensemble 331b de coefficients ambisoniques transformés représentatifs de l'ensemble de la scène sonore, et un décodeur binaural 340b, configuré pour restituer deux canaux stéréo 341b et 342b. Le décodeur binaural 340b comprend un circuit de transformation fréquentielle inverse, afin de restituer les canaux stéréo dans le domaine temporel.The binauralization engine 300b then comprises a transformation circuit 330b, similar to the transformation circuit 330a, making it possible to obtain a set 331b of transformed ambisonic coefficients representative of the whole of the sound scene, and a binaural decoder 340b, configured to restore two stereo channels 341b and 342b. The binaural decoder 340b comprises an inverse frequency transformation circuit, in order to restore the stereo channels in the time domain.

Les propriétés du moteur de binauralisation 300b sont assez semblables à celles du moteur de binauralisation 300a. Il permet également d'effectuer un traitement binaural d'une scène sonore, avec une complexité réduite par rapport au système 200.The properties of the binauralization engine 300b are quite similar to those of the binauralization engine 300a. It also makes it possible to perform binaural processing of a sound scene, with reduced complexity compared to the system 200.

En cas d'augmentation importante du nombre de sources, la complexité du traitement binaural des moteurs de binauralisation 300a et 300b est principalement due au calcul des coefficients HOA par les encodeurs 320a et 320b. En effet, le nombre de coefficients à calculer est proportionnel au nombre de sources. Au contraire, les circuits de transformation 330a et 330b, ainsi que les décodeurs binauraux 340a et 340b traitent des ensembles de coefficients binauraux représentatifs de l'ensemble de la scène sonore, dont le nombre ne varie pas en fonction du nombre de sources.In the event of a significant increase in the number of sources, the complexity of the binaural processing of the binauralization engines 300a and 300b is mainly due to the calculation of the HOA coefficients by the encoders 320a and 320b. Indeed, the number of coefficients to be calculated is proportional to the number of sources. On the contrary, the transformation circuits 330a and 330b, as well as the binaural decoders 340a and 340b process sets of binaural coefficients representative of the whole of the sound scene, the number of which does not vary according to the number of sources.

Pour le traitement des réflexions, la complexité des encodeurs binauraux 320a et 320b peut augmenter de manière importante. En effet, la solution de l'état de l'art pour traiter les réflexions consiste à ajouter une source sonore virtuelle pour chaque réflexion. La complexité de l'encodage HOA de ces encodeurs selon l'état de l'art augmente donc proportionnellement en fonction du nombre de réflexions par source, et peut devenir problématique lorsque le nombre de réflexions devient trop important.For the processing of reflections, the complexity of binaural encoders 320a and 320b can increase significantly. Indeed, the state of the art solution for processing reflections consists of adding a virtual sound source for each reflection. The complexity of the HOA encoding of these encoders according to the state of the art therefore increases proportionally as a function of the number of reflections per source, and can become problematic when the number of reflections becomes too large.

La figure 4 représente un exemple d'encodeur ambisonique d'une onde sonore à une pluralité de réflexions, dans un ensemble de modes de mise en œuvre de l'invention.The figure 4 represents an example of an ambisonic encoder of a sound wave with a plurality of reflections, in a set of embodiments of the invention.

L'encodeur ambisonique 400 est configuré pour encoder une onde sonore 410 avec une pluralité de réflexions, en un ensemble de coefficients ambisoniques à un ordre M. Pour ce faire, l'encodeur ambisonique est configuré pour calculer une ensemble 460 d'harmoniques sphériques représentatives de l'onde sonore et de la pluralité de réflexions. L'encodeur ambisonique 400 sera décrit, à titre d'exemple, pour l'encodage d'une onde sonore unique. Cependant un encodeur ambisonique 400 selon l'invention peut également encoder une pluralité d'ondes sonores, les éléments de l'encodeur ambisonique étant utilisé de la même manière pour chaque onde sonore additionnelle. L'onde sonore 410 peut correspondre par exemple à un canal d'une piste audio, ou à une onde sonore créée dynamiquement, par exemple une onde sonore correspondant à un objet d'un jeu vidéo. Dans un ensemble de modes de réalisation de l'invention, les ondes sonores sont définies par des échantillons successifs d'intensité sonore. Selon différents modes de réalisation de l'invention, les ondes sonores peuvent par exemple être échantillonnées à une fréquence de 22500Hz, 12000Hz, 44100 Hz, 48000 Hz, 88200 Hz, ou 96000 Hz, et chacun des échantillons d'intensité codé sur 8, 12, 16, 24 ou 32 bits. En cas de pluralité d'onde sonores, celles-ci peuvent être échantillonnées à des fréquences différentes, et les échantillons peuvent être codés sur des nombres de bits différents.The ambisonic encoder 400 is configured to encode a sound wave 410 with a plurality of reflections, into a set of one-order ambisonic coefficients. To do this, the ambisonic encoder is configured to calculate a set 460 of representative spherical harmonics. the sound wave and the plurality of reflections. The ambisonic encoder 400 will be described, by way of example, for the encoding of a single sound wave. However, an ambisonic encoder 400 according to the invention can also encode a plurality of sound waves, the elements of the ambisonic encoder being used in the same way for each additional sound wave. The sound wave 410 can correspond for example to a channel of an audio track, or to a dynamically created sound wave, for example a sound wave corresponding to an object of a video game. In one set of embodiments of the invention, the sound waves are defined by successive sound intensity samples. According to different embodiments of the invention, the sound waves can for example be sampled at a frequency of 22500Hz, 12000Hz, 44100 Hz, 48000 Hz, 88200 Hz, or 96000 Hz, and each of the intensity samples coded to 8, 12, 16, 24 or 32 bits. In the event of a plurality of sound waves, these can be sampled at different frequencies, and the samples can be encoded on different numbers of bits.

L'encodeur ambisonique 400 comprend une logique 420 de transformation fréquentielle de l'onde sonore. Celle-ci est similaire aux logiques 311b de transformation fréquentielle des ondes sonores du système de binauralisation 300b selon l'état de l'art. Dans des modes de réalisation à une pluralité d'ondes sonores, l'encodeur 400 comprend une logique de transformation fréquentielle pour chaque onde sonore. En sortie de la logique de transformation fréquentielle, une onde sonore est définie 421, pour une fenêtre temporelle, par un ensemble d'intensités à différentes fréquences d'un échantillonnage en fréquence. Dans un ensemble de modes de réalisation de l'invention, la logique 420 de transformation fréquentielle est une logique d'application d'une FFT.The ambisonic encoder 400 includes logic 420 for frequency transformation of the sound wave. This is similar to the logic 311b of frequency transformation of the sound waves of the binauralization system 300b according to the state of the art. In embodiments with a plurality of sound waves, the encoder 400 includes frequency transformation logic for each sound wave. At the output of the frequency transformation logic, a sound wave is defined 421, for a time window, by a set of intensities at different frequencies of a frequency sampling. In a set of embodiments of the invention, the frequency transformation logic 420 is an application logic of an FFT.

L'encodeur 400a comprend également une logique 430 de calcul d'harmoniques sphériques de l'onde sonore et de la pluralité de réflexions à partir d'une position d'une source de l'onde sonore et de positions d'obstacles à la propagation de l'onde sonore. Dans un ensemble de modes de réalisation de l'invention, la position de la source de l'onde sonore est définie par des angles ( θ_si ,ϕ_si ) et une distance par rapport à une position d'écoute de l'utilisateur. Le calcul des harmoniques sphériques Y ₀₀ ( θ_si,ϕ_si ), Y _1-1 ( θ_si , ϕ_si ) Y ₁₀ ( θ_si , ϕ_si ), Y ₁₁ ( θ_si , ϕ_si ), ..., Y_MM ( θ_si , ϕ_si ), de l'onde sonore à l'ordre M peut s'effectuer selon les méthodes connues de l'état de l'art, à partir des angles ( θ_si , ϕ_si ) définissant l'orientation de la source de l'onde sonore.The encoder 400a also includes logic 430 for calculating spherical harmonics of the sound wave and the plurality of reflections from a position of a source of the sound wave and from positions of obstacles to propagation. of the sound wave. In a set of modes embodiment of the invention, the position of the source of the sound wave is defined by angles ( θ _s _i , ϕ _s _i ) and a distance from a listening position of the user. The calculation of spherical harmonics Y ₀₀ ( θ _s _i , ϕ _s _i ), Y _1-1 ( θ _s _i , ϕ _s _i ) Y ₁₀ ( θ _s _i , ϕ _s _i ), Y ₁₁ ( θ _s _i , ϕ _s _i ), ..., Y _MM ( θ _s _i , ϕ _s _i ), the sound wave at the order M can be carried out according to the methods known in the state of the art, from the angles ( θ _s _i , ϕ _s _i ) defining the orientation of the source of the sound wave.

La logique 430 est également configurée pour calculer, à partir de la position de la source de l'onde sonore, un ensemble d'harmoniques sphériques de la pluralité de réflexions. Dans un ensemble de modes de réalisation de l'invention, la logique 430 est configurée pour calculer, à partir de la position de la source de l'onde sonore, et de positions d'obstacles à la propagation de l'onde sonore, une orientation d'une source virtuelle d'une réflexion, définie par des angles (θ_s,r , ϕ_s,r ) puis, à partir de ces angles, des harmoniques sphériques Y ₀₀ ( θ_s,r , ϕ_s,r ), Y _1-1 ( θ_s,r , ϕ_s,r ), Y ₁₀ ( θ_s,r , ϕ_s,r ), Y ₁₁ ( θ_s,r , ϕ_s,r ), ..., Y_MM ( θ_s,r , ϕ_s,r ) de la réflexion de l'onde sonore. Ceci permet d'obtenir, pour chaque réflexion, les harmoniques sphériques correspondant à la direction de l'onde réfléchie sur les obstacles à la propagation de l'onde sonore.Logic 430 is also configured to calculate, from the position of the source of the sound wave, a set of spherical harmonics of the plurality of reflections. In one set of embodiments of the invention, logic 430 is configured to calculate, from the position of the source of the sound wave, and positions of obstacles to the propagation of the sound wave, a orientation of a virtual source of a reflection, defined by angles ( θ _{s, r} , ϕ _{s, r} ) then, from these angles, spherical harmonics Y ₀₀ ( θ _{s, r} , ϕ _{s, r} ) , Y _1-1 ( θ _{s, r} , ϕ _{s, r} ), Y ₁₀ ( θ _{s, r} , ϕ _{s, r} ), Y ₁₁ ( θ _{s, r} , ϕ _{s, r} ), ..., Y _MM ( θ _{s, r} , ϕ _{s, r} ) of the sound wave reflection. This makes it possible to obtain, for each reflection, the spherical harmonics corresponding to the direction of the wave reflected on the obstacles to the propagation of the sound wave.

L'encodeur ambisonique 400 comprend également une pluralité 440 de logiques de filtrage dans le domaine fréquentiel recevant en entrée des harmoniques sphériques de la pluralité de réflexions, chaque logique de filtrage étant paramétrée par des coefficients acoustiques et de délai des réflexions. Dans la suite de la description, on appellera α_r un coefficient acoustique d'une réflexion et δ_r un délai d'une réflexion. Selon différents modes de réalisation de l'invention, le coefficient acoustique peut être un coefficient de α_r réverbération, représentatif d'un rapport des intensités d'une réflexion sur des intensités de la source sonore et défini entre 0 et 1. Selon d'autres modes de réalisation de l'invention, le coefficient acoustique est un coefficient α_a dit d'atténuation ou d'absorption, soit un coefficient défini entre 0 et 1 tel que α_a = α_r - 1. Ces logiques de filtrage permettent d'appliquer aux coefficients ambisoniques d'une réflexion un délai et une atténuation. Ainsi, la combinaison de l'orientation de la source virtuelle de la réflexion, du délai et de l'atténuation de la réflexion permet de modéliser chaque réflexion comme une réplique de la source sonore, venant d'une direction différente, affectée d'un délai et atténuée, suite au parcours et aux réflexions de l'onde sonore. Cette modélisation permet, avec plusieurs réflexions de simuler la propagation d'une onde sonore dans une scène de manière simple et efficace.The ambisonic encoder 400 also includes a plurality 440 of filter logic in the frequency domain receiving as input spherical harmonics of the plurality of reflections, each filter logic being parameterized by acoustic and delay coefficients of the reflections. In the remainder of the description, α _r will be called an acoustic coefficient of a reflection and δ _r a delay of a reflection. According to various embodiments of the invention, the acoustic coefficient can be a coefficient of α _r reverberation, representative of a ratio of the intensities of a reflection to the intensities of the sound source and defined between 0 and 1. According to d ' other embodiments of the invention, the acoustic coefficient is a coefficient α _a said of attenuation or absorption, that is to say a coefficient defined between 0 and 1 such that α _a = α _r - 1. These filtering logics allow d 'apply a delay and an attenuation to the ambisonic coefficients of a reflection. Thus, the combination of the orientation of the virtual source of the reflection, the delay and the attenuation of the reflection makes it possible to model each reflection as a replica of the sound source, coming from a different direction, assigned a delay and attenuated, following the path and reflections of the sound wave. This modeling allows, with several reflections, to simulate the propagation of a sound wave in a scene in a simple and efficient way.

De manière générale, le filtrage, à une fréquence f, d'une harmonique sphérique d'une réflexion peut s'écrire : H_r(f) Y_ij ( θ_s,r , ϕ_s,r ). Dans un mode de réalisation de l'invention une logique de filtrage 440 est configurée pour filtrer les harmoniques sphériques en appliquant: α_re ^-j2 ^πδrY_ij ( θ_s,r , ϕ_s,r ). Dans ce mode de réalisation, le coefficient α_r est traité comme un coefficient de réverbération. Dans d'autres modes de réalisation, un coefficient α_a peut être traité comme un coefficient d'atténuation, et le filtrage des harmoniques sphériques peut par exemple s'effectuer en appliquant: ( 1 - α_a ) e ^-j2 ^πfδrY_ij ( θ_s,r , ϕ_s,r ). Dans la suite de la description, on considèrera sauf mention contraire le coefficient α_r comme un coefficient de réverbération. Un homme de l'art pourra cependant aisément mettre en œuvre les différents modes de réalisation de l'invention avec un coefficient d'atténuation plutôt qu'un coefficient de réverbération.In general, the filtering, at a frequency f, of a spherical harmonic of a reflection can be written: H _r (f) Y _ij ( θ _{s, r} , ϕ _{s, r} ). In one embodiment of the invention, a filter logic 440 is configured to filter the spherical harmonics by applying: α _r e ^{- j 2} ^πδ ^r Y _ij ( θ _{s, r} , ϕ _{s, r} ). In this embodiment, the coefficient α _r is treated as a reverberation coefficient. In other embodiments, a coefficient α _a can be treated as an attenuation coefficient, and the filtering of spherical harmonics can for example be performed by applying: ( 1 - α _a ) e ^{-j 2} ^πfδ ^r Y _ij ( θ _{s, r} , ϕ _{s, r} ). In the remainder of the description, unless otherwise specified, the coefficient α _{r will be considered} as a reverberation coefficient. A person skilled in the art could however easily implement the various embodiments of the invention with an attenuation coefficient rather than a reverberation coefficient.

L'encodeur ambisonique 400 comprend également une logique 450 d'addition des harmoniques sphériques de l'onde sonore et des sorties des logiques de filtrage. Cette logique permet d'obtenir un ensemble Y'₀₀ , Y'_1-1 , Y'₁₀, Y'₁₁, ... Y'_MM d'harmoniques sphériques à l'ordre M, représentatives à la fois de l'onde sonore, et des réflexions de l'onde sonore, dans le domaine fréquentiel. Un harmonique sphérique Y'_ij (avec 0 ≤ i ≤ M, et -i ≤ j ≤ i) représentative à la fois de l'onde sonore, et des réflexions de l'onde sonore, est donc égale, en sortie de la logique d'addition 450, à la valeur $Y_{ij} =$

Y_{ij} (θ_{s_{i}}, φ_{s_{i}}) + \sum_{r = 0}^{N_{r}} H_{r} (f) Y_{ij} (θ_{s, r}, φ_{s, r}),

dans laquelle Y_ij ( θ_si , ϕ_si ) est une harmonique sphérique de la source de l'onde sonore, N_r est le nombre de réflexions de l'onde sonore, Y_ij ( θ_s,r , ϕ_s,r ) sont les harmoniques sphériques des positions des sources sonores virtuelles des réflexions, et les termes H_r(f) sont les logiques de filtrage des harmoniques sphériques pour la réflexion r à une fréquence f. Dans un ensemble de modes de réalisation de l'invention, les logiques de filtrage H_r(f) sont telles que H_r(f) = α_re ^-j2πfδ_r , et les harmoniques sphériques Y_ij à l'ordre M, représentatives à la fois de l'onde sonore, et des réflexions de l'onde sonore sont égales, en sortie de la logique 450 d'addition, à:

Y'_{ij} = Y_{ij} (θ_{s_{i}}, φ_{s_{i}}) + \sum_{r = 0}^{N_{r}} α_{r} e^{- j 2 π f δ_{r}} Y_{ij} (θ_{s, r}, φ_{s, r}) .

The ambisonic encoder 400 also includes logic 450 for adding the spherical harmonics of the sound wave and the outputs of the filtering logic. This logic makes it possible to obtain a set Y _'00 , Y' _1-1 , Y _'10 , Y' ₁₁ , ... Y ' _MM of spherical harmonics at order M, representative of both the wave sound, and reflections of the sound wave, in the frequency domain. A spherical harmonic Y ' _ij (with 0 ≤ i ≤ M, and -i ≤ j ≤ i) representative both of the sound wave, and of the reflections of the sound wave, is therefore equal, at the output of the logic addition 450, to the value

Y_{ij} =

Y_{ij} (θ_{s_{i}}, φ_{s_{i}}) + \sum_{r = 0}^{{NOT}_{r}} H_{r} (f) Y_{ij} (θ_{s, r}, φ_{s, r}),

where Y _ij ( θ _si , ϕ _si ) is a spherical harmonic of the source of the sound wave, N _r is the number of reflections of the sound wave, Y _ij ( θ _{s, r} , ϕ _{s, r} ) are the spherical harmonics of the positions of the virtual sound sources of the reflections, and the terms H _r (f) are the filtering logic of the spherical harmonics for the reflection r at a frequency f. In a set of embodiments of the invention, the filtering logics H _r (f) are such that H _r (f) = α _r e ^{-j 2 πfδ _r} , and the spherical harmonics Y _ij to the order M , representative of both the sound wave, and the reflections of the sound wave are equal, at the output of addition logic 450, to:

Y'_{ij} = Y_{ij} (θ_{s_{i}}, φ_{s_{i}}) + \sum_{r = 0}^{{NOT}_{r}} α_{r} e^{- j 2 π f δ_{r}} Y_{ij} (θ_{s, r}, φ_{s, r}) .

Selon différents modes de réalisation de l'invention, le nombre N_r de réflexions peut être prédéfini. Selon d'autres modes de réalisation de l'invention, les réflexions de l'onde sonore sont conservées selon leur coefficient acoustique, le nombre Nr de réflexions dépendant alors de la position de la source sonore, de la position de l'utilisateur, et des obstacles à la propagation du son. Dans l'exemple ci-dessus, le coefficient acoustique est défini comme un ratio de l'intensité de la réflexion sur l'intensité de la source sonore, soit un coefficient de réverbération. Dans un mode de réalisation de l'invention, les réflexions de l'onde sonore ayant un coefficient acoustique supérieur ou égal à un seuil prédéfini sont conservées. Dans d'autres modes de réalisation, le coefficient acoustique est défini comme un coefficient d'atténuation, soit un ratio entre l'intensité sonore absorbée par les obstacles à la propagation d'ondes sonores et le trajet dans l'air et l'intensité de la source sonore. Dans ce mode de réalisation, les réflexions de l'onde sonore ayant un coefficient acoustique inférieur ou égal à un seuil prédéfini sont conservéesAccording to different embodiments of the invention, the number N _r of reflections can be predefined. According to other embodiments of the invention, the reflections of the sound wave are preserved according to their acoustic coefficient, the number Nr of reflections then depending on the position of the sound source, on the position of the user, and obstacles to the propagation of sound. In the example above, the acoustic coefficient is defined as a ratio of the intensity of the reflection to the intensity of the sound source, or a reverberation coefficient. In one embodiment of the invention, the reflections of the sound wave having an acoustic coefficient greater than or equal to a predefined threshold are preserved. In other embodiments, the acoustic coefficient is defined as an attenuation coefficient, i.e. a ratio between the sound intensity absorbed by the obstacles to the propagation of sound waves and the air path and intensity. sound source. In this embodiment, the reflections of the sound wave having an acoustic coefficient less than or equal to a predefined threshold are preserved

Ainsi, l'encodeur ambisonique 400 permet de calculer un ensemble d'harmoniques sphériques Y'_ij représentatives à la fois de l'onde sonore et de ses réflexions. Une fois ces harmoniques sphériques calculées, l'encodeur peut comprendre une logique de multiplication des harmoniques sphériques par les valeurs d'intensités sonores de la source aux différentes fréquences, afin d'obtenir des coefficients ambisoniques représentatifs à la fois de l'onde sonore et des réflexions. Dans des modes de réalisation à plusieurs sources sonores, l'encodeur 400 comprend une logique d'addition des coefficients ambisoniques des différentes sources sonores et de leurs réflexions, permettant d'obtenir en sortie des coefficients ambisoniques représentatifs de l'ensemble de la scène sonore.Thus, the ambisonic encoder 400 makes it possible to calculate a set of spherical harmonics Y ' _ij representative both of the sound wave and of its reflections. Once these spherical harmonics have been calculated, the encoder can include a logic of multiplying the spherical harmonics by the sound intensity values of the source at the different frequencies, in order to obtain ambisonic coefficients representative of both the sound wave and reflections. In embodiments with several sound sources, the encoder 400 includes logic for adding the ambisonic coefficients of the different sound sources and their reflections, making it possible to obtain at the output ambisonic coefficients representative of the entire sound scene. .

Dans un ensemble de modes de réalisation de l'invention, les coefficients ambisoniques à l'ordre M représentatifs de la scène sonore sont alors égaux, en sortie de la logique d'addition des coefficients ambisoniques des différentes sources sonores et de leurs réflexions, pour Ns sources sonores et pour une fréquence f, à : $(\begin{matrix} B_{00} (f) \\ B_{1 - 1} (f) \\ B_{10} (f) \\ B_{11} (f) \\ ⋮ \\ B_{MM} (f) \end{matrix}) = \sum_{i = 0}^{N_{s} - 1} S_{i} (f) (\begin{matrix} Y_{00} (θ_{s_{i}}, φ_{s_{i}}) + \sum_{r = 0}^{N_{r}} H_{r} (f) Y_{00} (θ_{s, r}, φ_{s, r}) \\ Y_{1 - 1} (θ_{s_{i}}, φ_{s_{i}}) + \sum_{r = 0}^{N_{r}} H_{r} (f) Y_{1 - 1} (θ_{s, r}, φ_{s, r}) \\ Y_{10} (θ_{s_{i}}, φ_{s_{i}}) + \sum_{r = 0}^{N_{r}} H_{r} (f) Y_{10} (θ_{s, r}, φ_{s, r}) \\ Y_{11} (θ_{s_{i}}, φ_{s_{i}}) + \sum_{r = 0}^{N_{r}} H_{r} (f) Y_{11} (θ_{s, r}, φ_{s, r}) \\ ⋮ \\ Y_{MM} (θ_{s_{i}}, φ_{s_{i}}) + \sum_{r = 0}^{N_{r}} H_{r} (f) Y_{MM} (θ_{s, r}, φ_{s, r}) \end{matrix})$

In a set of embodiments of the invention, the M-order ambisonic coefficients representative of the sound scene are then equal, at the output of the logic for adding the ambisonic coefficients of the different sound sources and their reflections, for Ns sound sources and for a frequency f, at:

(\begin{matrix} B_{00} (f) \\ B_{1 - 1} (f) \\ B_{10} (f) \\ B_{11} (f) \\ ⋮ \\ B_{MM} (f) \end{matrix}) = \sum_{i = 0}^{{NOT}_{s} - 1} S_{i} (f) (\begin{matrix} Y_{00} (θ_{s_{i}}, φ_{s_{i}}) + \sum_{r = 0}^{{NOT}_{r}} H_{r} (f) Y_{00} (θ_{s, r}, φ_{s, r}) \\ Y_{1 - 1} (θ_{s_{i}}, φ_{s_{i}}) + \sum_{r = 0}^{{NOT}_{r}} H_{r} (f) Y_{1 - 1} (θ_{s, r}, φ_{s, r}) \\ Y_{10} (θ_{s_{i}}, φ_{s_{i}}) + \sum_{r = 0}^{{NOT}_{r}} H_{r} (f) Y_{10} (θ_{s, r}, φ_{s, r}) \\ Y_{11} (θ_{s_{i}}, φ_{s_{i}}) + \sum_{r = 0}^{{NOT}_{r}} H_{r} (f) Y_{11} (θ_{s, r}, φ_{s, r}) \\ ⋮ \\ Y_{MM} (θ_{s_{i}}, φ_{s_{i}}) + \sum_{r = 0}^{{NOT}_{r}} H_{r} (f) Y_{MM} (θ_{s, r}, φ_{s, r}) \end{matrix})

L'utilisation d'un unique coefficient ambisonique Y'_ij représentatif à la fois de l'onde sonore et de ses réflexions permet de réduire de manière importante les opérations de calcul permettant d'obtenir les coefficients ambisoniques, surtout lorsque le nombre de réflexions est élevé. En effet, ceci permet de réduire le nombre de multiplications, puisqu'il n'est plus nécessaire de multiplier chacune des intensités S_i ( f ) d'une source pour chaque fréquence par chacune des harmoniques sphériques Y_ij ( θ_s,r ,ϕ_s,r ), pour chaque valeur de i telle que 0 ≤ i ≤ M, chaque valeur de j telle que -i ≤ j ≤ i, et chaque réflexion. Cette réduction du nombre de multiplications permet une réduction importante de la complexité de calcul, particulièrement dans le cas d'un nombre de réflexions élevé.The use of a single ambisonic coefficient Y ' _ij representative of both the sound wave and its reflections makes it possible to significantly reduce the calculation operations making it possible to obtain the ambisonic coefficients, especially when the number of reflections is Student. Indeed, this makes it possible to reduce the number of multiplications, since it is no longer necessary to multiply each of the intensities S _i ( f ) of a source for each frequency by each of the spherical harmonics Y _ij ( θ _{s, r} , ϕ _{s, r} ), for each value of i such that 0 ≤ i ≤ M, each value of j such that -i ≤ j ≤ i, and each reflection. This reduction in the number of multiplications allows a significant reduction in the complexity of the computation, particularly in the case of a high number of reflections.

Dans un ensemble de modes de réalisation de l'invention, la logique 430 de calcul d'harmoniques sphériques de l'onde sonore est configurée pour calculer les harmoniques sphériques de l'onde sonore et de la pluralité de réflexions à partir d'une position fixe de la source de l'onde sonore. Dans ce cas, les orientations ( θ_si ,ϕ_si ) de la source sonore, et les orientations ( θ_s,r ,ϕ_s,r ) de chacune des harmoniques sont constantes. Les harmoniques sphériques de l'onde sonore et de la pluralité de réflexions ont alors également une valeur constante, et peuvent être calculées une unique fois pour l'onde sonore.In one set of embodiments of the invention, the sound wave spherical harmonics calculation logic 430 is configured to calculate the spherical harmonics of the sound wave and the plurality of reflections from a position fixed source of the sound wave. In this case, the orientations ( θ _si , ϕ _si ) of the sound source, and the orientations ( θ _{s, r} , ϕ _{s, r} ) of each of the harmonics are constant. The spherical harmonics of the sound wave and of the plurality of reflections then also have a constant value, and can be calculated only once for the sound wave.

Dans d'autres modes de réalisation de l'invention, la logique 430 de calcul d'harmoniques sphériques de l'onde sonore est configurée pour calculer de manière itérative les harmoniques sphériques de l'onde sonore et de la pluralité de réflexions à partir de positions successives de la source de l'onde sonore. Selon différents modes de réalisation de l'invention, différentes possibilités existent pour définir les itérations de calcul. Dans un mode de réalisation de l'invention, la logique 430 est configurée pour recalculer les valeurs des harmoniques sphériques de l'onde sonore et de la pluralité de réflexions à chaque fois qu'un changement de la position de la source de l'onde sonore ou de la position de l'utilisateur est détecté. Dans un autre mode de réalisation de l'invention, la logique 430 est configurée pour recalculer les valeurs des harmoniques sphériques de l'onde sonore et de la pluralité de réflexions à intervalles réguliers, par exemple toutes les 10 ms. Dans un autre mode de réalisation de l'invention, la logique 430 est configurée pour recalculer les valeurs des harmoniques sphériques de l'onde sonore et de la pluralité de réflexions à chacune des fenêtres temporelles utilisées par la logique 420 de transformation fréquentielle de l'onde sonore pour convertir les échantillons temporels de l'onde sonore en échantillons fréquentiels.In other embodiments of the invention, the sound wave spherical harmonics calculation logic 430 is configured to iteratively calculate the spherical harmonics of the sound wave and the plurality of reflections from successive positions of the source of the sound wave. According to different embodiments of the invention, different possibilities exist for defining the iterations of calculation. In one embodiment of the invention, logic 430 is configured to recalculate the values of the spherical harmonics of the sound wave and of the plurality of reflections each time a change in the position of the source of the wave. sound or user's position is detected. In another embodiment of the invention, logic 430 is configured to recalculate the values of the spherical harmonics of the sound wave and of the plurality of reflections at regular intervals, for example every 10 ms. In another embodiment of the invention, logic 430 is configured to recalculate the values of the spherical harmonics of the sound wave and of the plurality of reflections at each of the time windows used by the frequency transformation logic 420 of the sound wave. sound wave to convert the temporal samples of the sound wave into frequency samples.

Dans un ensemble de modes de réalisation de l'invention, chaque réflexion est caractérisée par un unique coefficient acoustique α_r .In a set of embodiments of the invention, each reflection is characterized by a single acoustic coefficient α _r .

Dans d'autres modes de réalisation de l'invention, chaque réflexion est caractérisée par un coefficient acoustique pour chaque fréquence dudit échantillonnage fréquentiel. Ceci permet d'obtenir des coefficients acoustiques différents pour les différentes fréquences, et d'améliorer le rendu de certains effets. Par exemple, il est connu que les matériaux épais absorbent de manière plus importante les basses fréquences. De même certains types de matériaux absorbent et réfléchissent de manière différente les hautes fréquences. Ainsi, la définition de coefficients acoustiques différents pour une même réflexion et différentes fréquences permet de caractériser les matériaux rencontrés par les réflexions, permettant un meilleur rendu de différents types de salle, en fonction des matériaux des murs de celle-ci.In other embodiments of the invention, each reflection is characterized by an acoustic coefficient for each frequency of said frequency sampling. This makes it possible to obtain different acoustic coefficients for the different frequencies, and to improve the rendering of certain effects. For example, it is known that thick materials absorb low frequencies more significantly. Likewise, certain types of materials absorb and reflect high frequencies in different ways. Thus, the definition of different acoustic coefficients for the same reflection and different frequencies makes it possible to characterize the materials encountered by the reflections, allowing a better rendering of different types of room, according to the materials of the walls of this one.

Dans un ensemble de modes de réalisation de l'invention, une réflexion à une fréquence peut être considérée comme nulle, en fonction d'une comparaison entre le coefficient acoustique α_r pour cette fréquence et un seuil prédéfini. Par exemple, si le coefficient α_r représente un coefficient de réverbération, la fréquence est considérée comme nulle s'il est inférieur à un seuil prédéfini. Au contraire, s'il s'agit d'un coefficient d'atténuation, la fréquence est considérée comme nulle s'il est supérieur ou égal à un seuil prédéfini. Ceci permet de limiter encore le nombre de multiplications, et donc la complexité de l'encodage ambisonique, tout en ayant un impact minime sur le rendu binaural.In a set of embodiments of the invention, a reflection at a frequency can be considered as zero, as a function of a comparison between the acoustic coefficient α _r for this frequency and a predefined threshold. For example, if the coefficient α _r represents a reverberation coefficient, the frequency is considered to be zero if it is less than a predefined threshold. On the contrary, if it is an attenuation coefficient, the frequency is considered zero if it is greater than or equal to a predefined threshold. This makes it possible to further limit the number of multiplications, and therefore the complexity of the ambisonic encoding, while having a minimal impact on the binaural rendering.

Dans un ensemble de modes de réalisation de l'invention, l'encodeur ambisonique 400 comprend une logique de calcul des coefficients acoustiques et des délais, et de la position de la source sonore virtuelle des réflexions. Cette logique de calcul peut par exemple être configurée pour calculer les coefficients acoustiques et les délais des réflexions en fonction d'estimations d'une différence de distance parcourue par le son entre la position de la source de l'onde sonore et une position estimée d'un utilisateur d'une part, et la distance parcourue par le son entre les positions des sources sonores virtuelles des réflexions et la position estimée de l'utilisateur d'autre part. Il est en effet aisé, connaissant la différence de distance parcourue par le l'onde sonore par parvenir à l'utilisateur, en ligne droite depuis la source sonore d'une part, et par le biais d'une réflexion d'autre part, et connaissant la célérité du son, de déduire le délai ressenti par l'utilisateur entre le son issu de la source sonore en ligne droite d'une part, et le son ayant été affecté par la réflexion d'autre part.In a set of embodiments of the invention, the ambisonic encoder 400 includes logic for calculating the acoustic coefficients and the delays, and the position of the virtual sound source of the reflections. This calculation logic can for example be configured to calculate the acoustic coefficients and the delays of the reflections as a function of estimates of a difference in distance traveled by the sound between the position of the source of the sound wave and an estimated position d 'a user on the one hand, and the distance traveled by the sound between the positions of the virtual sound sources of the reflections and the estimated position of the user on the other hand. It is indeed easy, knowing the difference in distance traveled by the sound wave to reach the user, in a straight line from the sound source on the one hand, and through a reflection on the other hand, and knowing the speed of the sound, to deduce the delay felt by the user between the sound coming from the sound source in a straight line on the one hand, and the sound having been affected by the reflection on the other hand.

De la même manière, il est connu que l'intensité d'une onde sonore diminue au fur et à mesure de son parcours dans l'air. La logique de calcul des coefficients acoustiques et des délais, et de la position de la source sonore virtuelle des réflexions peut donc être configurée pour calculer un coefficient acoustique d'une réflexion de l'onde sonore en fonction de la différence de distance parcourue entre le son issu de la source sonore en ligne droite d'une part, et le son ayant été affecté par la réflexion d'autre part.Likewise, it is known that the intensity of a sound wave decreases as it travels through the air. The logic for calculating the acoustic coefficients and the delays, and the position of the virtual sound source of the reflections can therefore be configured to calculate an acoustic coefficient of a reflection of the sound wave as a function of the difference in distance traveled between the sound from the sound source in a straight line on the one hand, and sound having been affected by reflection on the other hand.

Dans d'autres modes de réalisation de l'invention, la logique de calcul des coefficients acoustiques et des délais, et de la position de la source sonore virtuelle des réflexions est également configurée pour calculer les coefficients acoustiques des réflexions en fonction d'un coefficient acoustique d'au moins un obstacle à la propagation d'ondes sonores, sur lequel le son est réfléchi. Ceci permet de mieux modéliser l'absorption par les matériaux d'une salle, et le coefficient acoustique de l'obstacle peut être variable selon les différentes fréquences. Le coefficient acoustique de l'obstacle peut être un coefficient de réverbération ou un coefficient d'atténuation.In other embodiments of the invention, the logic for calculating the acoustic coefficients and delays, and the position of the virtual sound source of the reflections is also configured to calculate the acoustic coefficients of the reflections as a function of a coefficient acoustics of at least one obstacle to the propagation of sound waves, on which the sound is reflected. This makes it possible to better model the absorption by the materials of a room, and the acoustic coefficient of the obstacle can be variable according to the different frequencies. The acoustic coefficient of the obstacle can be a reverberation coefficient or an attenuation coefficient.

La figure 5 représente un exemple de calcul d'une source sonore secondaire, dans un mode de mise en œuvre de l'invention.The figure 5 represents an example of calculation of a secondary sound source, in one embodiment of the invention.

Dans cet exemple une source de l'onde sonore a une position 520 dans une pièce 510, et l'utilisateur a une position 540. La pièce 510 est constituée de 4 murs 511, 512, 513 et 514.In this example a source of the sound wave has a position 520 in a room 510, and the user has a position 540. The room 510 consists of 4 walls 511, 512, 513 and 514.

Dans un ensemble de modes de mise en œuvre de l'invention, la logique de calcul des coefficients acoustiques et des délais, et de la position de la source sonore virtuelle des réflexions est configurée pour calculer les position, délai et atténuation des sources sonores virtuelles des réflexions de la manière suivante : pour chacun des murs 511, 512, 513, 514, la logique est configurée pour calculer une position d'une source sonore virtuelle d'une réflexion comme le symétrique de la position de la source sonore par rapport à un mur. La logique de calcul est ainsi configurée pour calculer les positions 521, 522, 523 et 524 de quatre sources sonores virtuelles des réflexions, respectivement par rapport aux murs 511, 512, 513 et 514.In a set of embodiments of the invention, the logic for calculating the acoustic coefficients and the delays, and the position of the virtual sound source of the reflections is configured to calculate the position, delay and attenuation of the virtual sound sources. reflections as follows: for each of the walls 511, 512, 513, 514, the logic is configured to calculate a position of a virtual sound source of a reflection as the symmetrical of the position of the sound source with respect to a wall. The calculation logic is thus configured to calculate the positions 521, 522, 523 and 524 of four virtual sound sources of the reflections, respectively with respect to the walls 511, 512, 513 and 514.

Pour chacune de ces sources sonores virtuelles, la logique de calcul est configurée pour calculer un chemin de parcours de l'onde sonore, et en déduire le coefficient acoustique et le délai correspondants. Par exemple, dans le cas de la source sonore virtuelle 511, l'onde sonore suit le trajet 530 jusqu'au point 531 du mur 512, puis le chemin 532 jusqu'à la position de l'utilisateur 540. La distance parcourue par le son selon le chemin 530, 532 permet de calculer un coefficient acoustique et un délai de la réflexion. Dans un ensemble de modes de réalisation de l'invention la logique de calcul est également configurée pour appliquer un coefficient acoustique correspondant à l'absorption du mur 512 au point 531. Dans un ensemble de modes de réalisation de l'invention, ce coefficient dépend des différentes fréquences, et peut par exemple être déterminé, pour chaque fréquence, en fonction du matériau et/ou de l'épaisseur du mur 512.For each of these virtual sound sources, the calculation logic is configured to calculate a path of travel of the sound wave, and to deduce therefrom the corresponding acoustic coefficient and the corresponding delay. For example, in the case of the virtual sound source 511, the sound wave follows the path 530 to point 531 of the wall 512, then the path 532 to the position of the user 540. The distance traveled by the user. sound along the path 530, 532 makes it possible to calculate an acoustic coefficient and a delay of the reflection. In a set of embodiments of the invention the calculation logic is also configured to apply an acoustic coefficient corresponding to the absorption of the wall 512 at point 531. In a set of embodiments of the invention, this coefficient depends different frequencies, and can for example be determined, for each frequency, as a function of the material and / or the thickness of the wall 512.

Dans un ensemble de modes de réalisation de l'invention, les sources sonores virtuelles 521, 522, 523, 524 sont utilisées pour calculer des sources sonores virtuelles secondaires, correspondant à des réflexions multiples. Par exemple, une source virtuelle secondaire 533 peut être calculée comme le symétrique de la source virtuelle 521 par rapport au mur 514. Le chemin de l'onde sonore correspondant comprend alors les segments 530 jusqu'au point 531 ; 534 entre les points 531 et 535 ; 536 entre le point 535 et la position 540 de l'utilisateur. Les coefficients acoustiques et les délais peuvent alors être calculés à partir de la distance parcourue par le son sur les segments 531, 535 et 536, et de l'absorption des murs aux points 531 et 535.In one set of embodiments of the invention, the virtual sound sources 521, 522, 523, 524 are used to calculate secondary virtual sound sources, corresponding to multiple reflections. For example, a secondary virtual source 533 can be calculated as the symmetrical of the virtual source 521 with respect to the wall 514. The corresponding sound wave path then comprises the segments 530 up to point 531; 534 between points 531 and 535; 536 between point 535 and position 540 of the user. The acoustic coefficients and the delays can then be calculated from the distance traveled by the sound on segments 531, 535 and 536, and the absorption of the walls at points 531 and 535.

Selon différents modes de réalisation de l'invention, des sources sonores virtuelles correspondant à des réflexions peuvent être calculées jusqu'à un ordre n prédéfini. Différents modes de réalisation sont possibles pour déterminer les réflexions à conserver. Dans un mode de réalisation de l'invention, la logique de calcul est configurée pour calculer, pour chaque source sonore virtuelle, une source sonore virtuelle d'ordre supérieur pour chacun des murs, jusqu'à un ordre prédéfini n. Dans un mode de réalisation, l'encodeur ambisonique est configuré pour traiter un nombre Nr prédéfini de réflexions par source sonore, et conserve les Nr réflexions ayant l'atténuation la plus faible. Dans un autre mode de réalisation de l'invention, les sources sonores virtuelles sont conservées sur la base d'une comparaison d'un coefficient acoustique avec un seuil prédéfini.According to different embodiments of the invention, virtual sound sources corresponding to reflections can be calculated up to a predefined order n. Different embodiments are possible to determine which reflections to keep. In one embodiment of the invention, the calculation logic is configured to calculate, for each virtual sound source, a higher order virtual sound source for each of the walls, up to a predefined order n. In one embodiment, the ambisonic encoder is configured to process a predefined number Nr of reflections per sound source, and keeps the Nr reflections having the lowest attenuation. In another embodiment of the invention, the virtual sound sources are kept on the basis of a comparison of an acoustic coefficient with a predefined threshold.

La figure 6 représente un exemple de calcul de réflexions précoces et de réflexions tardives, dans un mode de réalisation de l'invention.The figure 6 represents an example of calculation of early reflections and late reflections, in one embodiment of the invention.

Le diagramme 600 représente l'intensité de plusieurs réflexions de l'onde sonore, par rapport au temps. L'axe 601 représente l'intensité d'une réflexion, et l'axe 602 le délai entre l'émission de l'onde sonore par la source de l'onde sonore, et la perception d'une réflexion par l'utilisateur. Dans cet exemple, les réflexions survenant avant un délai prédéfini 603 sont considérées comme des réflexions précoces 610, et les réflexions survenant après le délai 603 comme des réflexions tardives 620. Dans un mode de réalisation de l'invention, les réflexions précoces sont calculées à l'aide d'une source sonore virtuelle, par exemple selon le principe décrit en référence à la figure 5.Diagram 600 represents the intensity of several reflections of the sound wave, versus time. The axis 601 represents the intensity of a reflection, and the axis 602 the delay between the emission of the sound wave by the source of the sound wave, and the perception of a reflection by the user. In this example, reflections occurring before a predefined delay 603 are considered early reflections 610, and reflections occurring after delay 603 as late reflections 620. In one embodiment of the invention, early reflections are calculated at using a virtual sound source, for example according to the principle described with reference to the figure 5 .

Selon différents modes de réalisation de l'invention, les réflexions tardives sont calculées de la manière suivante: un ensemble de Nt sources sonores secondaires est calculée, par exemple selon le principe décrit en figure 5. La logique de calcul des coefficients acoustiques et des délais, et de la position de la source sonore virtuelle des réflexions est configurée pour conserver un nombre Nr de réflexions inférieur à Nt, selon différents modes de réalisation décrits ci-dessus. Dans un ensemble de modes de réalisation de l'invention, elle est de plus configurée pour construire une liste de (Nt - Nr) réflexions tardives, comprenant toutes les réflexions non conservées. Cette liste comprend uniquement, pour chaque réflexion tardive, un coefficient acoustique et un délai de la réflexion tardive, mais pas de position d'une source virtuelle.According to different embodiments of the invention, the late reflections are calculated as follows: a set of Nt secondary sound sources is calculated, for example according to the principle described in figure 5 . The logic for calculating acoustic coefficients and delays, and the position of the virtual sound source of the reflections is configured to keep a number Nr of reflections less than Nt, according to various embodiments described above. In a set of embodiments of the invention, it is further configured to build a list of (Nt - Nr) late reflections, including all non-conserved reflections. This list includes only, for each late reflection, an acoustic coefficient and a delay of the late reflection, but no position of a virtual source.

Selon un mode de réalisation de l'invention, cette liste est transmise par l'encodeur ambisonique à un décodeur ambisonique. Le décodeur ambisonique est alors configuré pour filtrer ses sorties, par exemple ses canaux stéréo de sortie, avec les coefficients acoustiques et les délais des réflexions tardives, puis à ajouter ces signaux filtrés aux signaux de sortie. Ceci permet d'améliorer la sensation d'immersion dans une salle ou un environnement d'écoute, tout en limitant encore la complexité de calcul de l'encodeur.According to one embodiment of the invention, this list is transmitted by the ambisonic encoder to an ambisonic decoder. The ambisonic decoder is then configured to filter its outputs, for example its output stereo channels, with the acoustic coefficients and the delays of the late reflections, then to add these filtered signals to the output signals. This makes it possible to improve the feeling of immersion in a room or a listening environment, while further limiting the computational complexity of the encoder.

Selon un autre mode de réalisation de l'invention, l'encodeur ambisonique est configuré pour filtrer l'onde sonore avec les coefficients acoustiques et les délais des réflexions tardives, et ajouter les signaux obtenus de manière uniforme à l'ensemble des coefficients ambisoniques. Ceci permet d'obtenir, avec une complexité de calcul limitée, un effet représentatif de multiples réflexions dans un environnement sonore. Dans ce mode de réalisation de l'invention, comme dans le précédent, les réflexions tardives ont une intensité faible et n'ont pas d'information de direction d'une source sonore. Elles seront donc perçues par un utilisateur comme un « écho » de l'onde sonore, réparti de manière homogène dans la scène sonore, et représentatif d'un environnement d'écoute.According to another embodiment of the invention, the ambisonic encoder is configured to filter the sound wave with the acoustic coefficients and the delays of the late reflections, and to add the signals obtained uniformly to the set of ambisonic coefficients. This makes it possible to obtain, with a limited computational complexity, an effect representative of multiple reflections in a sound environment. In this embodiment of the invention, as in the previous one, the late reflections have a low intensity and have no direction information from a sound source. They will therefore be perceived by a user as an “echo” of the sound wave, distributed homogeneously in the sound scene, and representative of a listening environment.

Le calcul des coefficients acoustiques et délais des réflexions tardives induit le calcul de nombreuses réflexions. Il s'agit donc d'une opération relativement coûteuse en termes de complexité de calcul. Selon un mode de réalisation de l'invention, ce calcul est effectué une seule fois, par exemple à l'initialisation de la scène sonore, et les coefficients acoustiques et les délais des réflexions tardives sont réutilisés sans modification par l'encodeur ambisonique. Ceci permet d'obtenir des réflexions tardives représentatives de l'environnement d'écoute à moindre coût. Selon d'autres modes de réalisation de l'invention, ce calcul est effectué de manière itérative. Par exemple, ces coefficients acoustiques et délais des réflexions tardives peuvent être calculés à des intervalles de temps prédéfinis, par exemple toutes les 5 secondes. Ceci permet de conserver en permanence des coefficients acoustiques et délais des réflexions tardives représentatifs de la scène sonore, et des positions relatives d'une source de l'onde sonore et de l'utilisateur, tout en limitant la complexité de calcul liée à la détermination des réflexions tardives.The calculation of the acoustic coefficients and delays of the late reflections induces the calculation of many reflections. It is therefore a relatively expensive operation in terms of computational complexity. According to one embodiment of the invention, this calculation is carried out only once, for example on initialization of the sound scene, and the acoustic coefficients and the delays of the late reflections are reused without modification by the ambisonic encoder. This makes it possible to obtain late reflections representative of the listening environment at a lower cost. According to others embodiments of the invention, this calculation is performed iteratively. For example, these acoustic coefficients and delays of the late reflections can be calculated at predefined time intervals, for example every 5 seconds. This makes it possible to permanently conserve acoustic coefficients and delays of late reflections representative of the sound scene, and the relative positions of a source of the sound wave and of the user, while limiting the complexity of calculation linked to the determination. late reflections.

Dans d'autres modes de réalisation de l'invention, les coefficients acoustiques et délais des réflexions tardives sont calculés lorsque la position d'une source de l'onde sonore ou de l'utilisateur varie de manière significative, par exemple lorsque la différence entre la position de l'utilisateur et une position précédente de l'utilisateur lors d'un calcul des coefficients acoustiques et délais des réflexions tardives représentatifs de la scène sonore est supérieure à un seuil prédéfini. Ceci permet de ne calculer les coefficients acoustiques et délais des réflexions tardives représentatifs de la scène sonore que lorsque la position d'une source de l'onde sonore ou de l'utilisateur a suffisamment varié pour modifier de manière perceptible les réflexions tardives.In other embodiments of the invention, the acoustic coefficients and delays of the late reflections are calculated when the position of a source of the sound wave or of the user varies significantly, for example when the difference between the position of the user and a previous position of the user during a calculation of the acoustic coefficients and delays of the late reflections representative of the sound scene is greater than a predefined threshold. This makes it possible to calculate the acoustic coefficients and delays of the late reflections representative of the sound scene only when the position of a source of the sound wave or of the user has varied sufficiently to perceptibly modify the late reflections.

La figure 7 représente une méthode d'encodage d'une onde sonore à une pluralité de réflexions dans un ensemble de modes de mise en œuvre de l'invention.The figure 7 represents a method of encoding a sound wave at a plurality of reflections in a set of embodiments of the invention.

La méthode 700 comprend une étape 710 de transformation fréquentielle de l'onde sonore.The method 700 comprises a step 710 of frequency transformation of the sound wave.

Elle comprend ensuite une étape 720 de calcul d'harmoniques sphériques de l'onde sonore et de la pluralité de réflexions à partir d'une position d'une source de l'onde sonore et de positions d'obstacles à la propagation d'ondes sonores.It then comprises a step 720 of calculating spherical harmonics of the sound wave and of the plurality of reflections from a position of a source of the sound wave and from positions of obstacles to the propagation of waves. sound.

Elle comprend ensuite une étape 730 de filtrage, par une pluralité de logiques de filtrage dans le domaine fréquentiel, des harmoniques sphériques de la pluralité de réflexions, chaque logique de filtrage étant paramétrée par des coefficients acoustiques et des délais des réflexions.It then comprises a step 730 of filtering, by a plurality of filtering logics in the frequency domain, the spherical harmonics of the plurality of reflections, each filtering logic being parameterized by acoustic coefficients and reflection delays.

Elle comprend ensuite une étape 740 d'addition d'harmoniques sphériques de l'onde sonore et des sorties des logiques de filtrage.It then comprises a step 740 of adding spherical harmonics of the sound wave and of the outputs of the filtering logic.

Les exemples ci-dessus démontrent la capacité d'un encodeur ambisonique selon l'invention à calculer des coefficients ambisoniques d'une onde sonore à une pluralité de réflexions. Ils ne sont cependant donnés qu'à titre d'exemple et ne limitent en aucun cas la portée de l'invention, définie dans les revendications ci-dessous.The above examples demonstrate the ability of an ambisonic encoder according to the invention to calculate ambisonic coefficients of a sound wave at a plurality of reflections. However, they are given only by way of example and in no way limit the scope of the invention, defined in the claims below.

Claims

An ambisonic encoder (400) for a sound wave (410) having a plurality of reflections, comprising:
- a logic (420) for frequency transformation of the sound wave;

- a logic (430) for calculating spherical harmonics of the sound wave and of the plurality of reflections on the basis of a position of a source of the sound wave and of positions of obstacles to propagation of the sound wave;

- a plurality (440) of filtering logics in the frequency domain receiving, as input, spherical harmonics of the plurality of reflections, each filtering logic consisting of an attenuation and a delay of a reflection, and being parameterized by an acoustic coefficient and a delay of said reflection;

- a logic (450) of adding spherical harmonics of the sound wave and outputs from the filtering logics into a set of spherical harmonics representative of both the sound wave and the plurality of reflections in the frequency domain;

- a multiplication logic of said set of spherical harmonics representative of both the sound wave and the plurality of reflections in the frequency domain by sound intensity values of the wave at the output of the frequency transformation to obtain a set of ambisonic coefficients (B ₀₀, B _1-1, B ₁₀, B ₁₁, ..., B _MM) representative of both the sound wave and the plurality of reflections.
The ambisonic encoder according to claim 1, wherein the logic for calculating spherical harmonics of the sound wave is configured to calculate the spherical harmonics of the sound wave and of the plurality of reflections on the basis of a fixed position of the source of the sound wave.
The ambisonic encoder according to claim 1, wherein the logic for calculating spherical harmonics of the sound wave is configured to iteratively calculate the spherical harmonics of the sound wave and of the plurality of reflections on the basis of successive positions of the source of the sound wave.
The ambisonic encoder according to one of claims 1 to 3, wherein each reflection is characterized by a unique acoustic coefficient.
The ambisonic encoder according to one of claims 1 to 3, wherein each reflection is characterized by an acoustic coefficient for each frequency of the frequency sampling.
The ambisonic encoder according to one of claims 1 to 5, wherein the reflections are represented by virtual sound sources.
The ambisonic encoder according to one of claims 1 to 5, further comprising a logic for calculating the acoustic coefficients, the delays and the position of the virtual sound sources of the reflections, the calculating logic being configured to calculate the acoustic coefficients and the delays of the reflections according to estimates of a difference in the distance traveled by the sound between the position of the source of the sound wave and an estimated position both of a user and of a distance traveled by the sound between the positions of the virtual sound sources of the reflections and the estimated position of the user.
The ambisonic encoder according to claim 7, wherein the logic for calculating the acoustic coefficients, the delays and the positions of the virtual sound sources of the reflections is further configured to calculate the acoustic coefficients of the reflections according to at least one acoustic coefficient of at least one obstacle to the propagation of sound waves, off which the sound is reflected.
The ambisonic encoder according to one of claims 7 to 8, wherein the logic for calculating the acoustic coefficients, the delays and the positions of the virtual sound sources of the reflections is configured to calculate positions of virtual sound sources of the reflections as inverses of the position of the source of the sound wave with respect to a plane that is tangential to an obstacle to the propagation of sound waves.
The ambisonic encoder according to one of claims 1 to 9, wherein the logic for calculating spherical harmonics of the sound wave and of the plurality of reflections is further configured to calculate spherical harmonics of the sound wave and of the plurality of reflections at each output frequency of the frequency transformation circuit, the ambisonic encoder further comprising a logic for calculating binaural coefficients of the sound wave, which logic is configured to calculate binaural coefficients of the sound wave by multiplying, at each output frequency of the circuit for transforming the frequency of the sound wave, the signal of the sound wave by the spherical harmonics of the sound wave and of the plurality of reflections at this frequency.
The ambisonic encoder according to one of claims 7 to 9, wherein the logic for calculating the acoustic coefficients, the delays and the positions of the virtual sound sources of the reflections is configured to calculate acoustic coefficients and delays of a plurality of late reflections.
A method for ambisonically encoding a sound wave having a plurality of reflections, comprising:
- performing a frequency transformation (710) of the sound wave;

- calculating (720) spherical harmonics of the sound wave and of the plurality of reflections on the basis of a position of a source of the sound wave and of positions of obstacles to a propagation of sound waves;

- filtering (730), by a plurality of logics for filtering in the frequency domain, spherical harmonics of the plurality reflections, each filtering logic consisting of an attenuation and a delay of a reflection, and being parameterized by an acoustic coefficient and a delay of a reflection;

- adding (740) spherical harmonics of the sound wave and the outputs from the filtering logics, into a set of spherical harmonics representative of both the sound wave and of the plurality of reflections in the frequency domain;

- multiplying said set of spherical harmonics representative of both the sound wave and the plurality of reflections in the frequency domain by sound intensity values of the wave at the output of the frequency transformation to obtain a set of ambisonic coefficients (B ₀₀, B _1-1, B ₁₀, B ₁₁, ..., B _MM) representative of both the sound wave and the plurality of reflections.
A computer program product, comprising computer code instructions stored on a computer readable medium for ambisonic encoding of sound waves with a plurality of reflections, said computer code instructions being configured to:
- perform a frequency transformation of the sound wave;

- calculate spherical harmonics of the sound wave and of the plurality of reflections on a basis of a position of a source of the sound wave and of positions of obstacles to a propagation of sound waves;

- parameterize a plurality of logics for filtering in the frequency domain receiving, as input, spherical harmonics of the plurality of reflections, each filtering logic consisting of an attenuation and a delay of a reflection, and being parameterized by an acoustic coefficient and a delay of said reflection;

- add spherical harmonics of the sound wave and outputs from the filtering logics, into a set of spherical harmonics representative of both the sound wave and the plurality of reflections in the frequency domain;

- multiply said set of spherical harmonics representative of both the sound wave and the plurality of reflections in the frequency domain by sound intensity values of the wave at the output of the frequency transformation to obtain a set of ambisonic coefficients (B ₀₀, B _1-1, B ₁₀, B ₁₁, ..., B _MM) representative of both the sound wave and the plurality of reflections;
when said program runs on the computer.