EP1946612A1 - HRTF individualization by finite element modeling coupled with a corrective model - Google Patents
HRTF individualization by finite element modeling coupled with a corrective model
- Publication number
- EP1946612A1 (application EP06820237A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- hrtfs
- individual
- directions
- model
- database
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
- 230000000877 morphologic effect Effects 0.000 claims abstract description 53
- 230000006870 function Effects 0.000 claims abstract description 47
- 238000000034 method Methods 0.000 claims abstract description 42
- 238000005259 measurement Methods 0.000 claims abstract description 36
- 238000012546 transfer Methods 0.000 claims abstract description 26
- 238000013528 artificial neural network Methods 0.000 claims description 34
- 238000012545 processing Methods 0.000 claims description 15
- 238000009434 installation Methods 0.000 claims description 10
- 238000004590 computer program Methods 0.000 claims description 7
- 238000012549 training Methods 0.000 abstract description 6
- 210000003128 head Anatomy 0.000 description 13
- 210000002569 neuron Anatomy 0.000 description 13
- 238000010200 validation analysis Methods 0.000 description 11
- 210000005069 ears Anatomy 0.000 description 9
- 230000015572 biosynthetic process Effects 0.000 description 7
- 238000003786 synthesis reaction Methods 0.000 description 7
- 238000012360 testing method Methods 0.000 description 7
- 238000013178 mathematical model Methods 0.000 description 6
- 230000008569 process Effects 0.000 description 5
- 230000004913 activation Effects 0.000 description 4
- 230000008901 benefit Effects 0.000 description 4
- 238000004364 calculation method Methods 0.000 description 3
- 238000012937 correction Methods 0.000 description 3
- 238000000513 principal component analysis Methods 0.000 description 3
- 230000009467 reduction Effects 0.000 description 3
- 238000001228 spectrum Methods 0.000 description 3
- 238000013179 statistical model Methods 0.000 description 3
- 230000009466 transformation Effects 0.000 description 3
- 238000013459 approach Methods 0.000 description 2
- 238000011161 development Methods 0.000 description 2
- 230000008447 perception Effects 0.000 description 2
- 238000009877 rendering Methods 0.000 description 2
- 238000005070 sampling Methods 0.000 description 2
- 230000006399 behavior Effects 0.000 description 1
- 230000000295 complement effect Effects 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 238000001514 detection method Methods 0.000 description 1
- 230000006866 deterioration Effects 0.000 description 1
- 210000000613 ear canal Anatomy 0.000 description 1
- 230000000694 effects Effects 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 238000001914 filtration Methods 0.000 description 1
- 238000012417 linear regression Methods 0.000 description 1
- 238000007620 mathematical function Methods 0.000 description 1
- 238000012067 mathematical method Methods 0.000 description 1
- 230000004044 response Effects 0.000 description 1
- 230000000717 retained effect Effects 0.000 description 1
- 238000004088 simulation Methods 0.000 description 1
- 230000005236 sound signal Effects 0.000 description 1
- 238000007619 statistical method Methods 0.000 description 1
- 238000012876 topography Methods 0.000 description 1
- 238000000844 transformation Methods 0.000 description 1
- 210000003454 tympanic membrane Anatomy 0.000 description 1
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
- H04S7/30—Control circuits for electronic adaptation of the sound field
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2420/00—Techniques used in stereophonic systems covered by H04S but not provided for in its groups
- H04S2420/01—Enhancing the perception of the sound image or of the spatial distribution using head related transfer functions [HRTF's] or equivalents thereof, e.g. interaural time difference [ITD] or interaural level difference [ILD]
Definitions
- the present invention relates to the modeling of individual transfer functions called HRTFs (for "Head Related Transfer Functions"), relating to the hearing of an individual in the three-dimensional space.
- the invention is particularly in the context of telecommunication services offering spatialized sound broadcasting (for example an audio conference between several speakers, a movie trailer).
- the most effective technique for positioning sound sources in space is then binaural synthesis.
- Binaural synthesis is based on the use of so-called "binaural" filters, which reproduce the acoustic transfer functions between the sound source and the listener's ears. These filters are used to simulate auditory localization cues, i.e. the cues that allow a listener to locate sound sources in real listening situations. They take into account all the acoustic phenomena (in particular the diffraction by the head, the reflections on the pinna of the ear and on the upper torso) that modify the acoustic wave on its path between the source and the ears of the listener. These phenomena vary greatly with the position of the sound source (mainly with its direction), and these variations allow the listener to locate the source in space.
- High-quality binaural synthesis relies on binaural filters that faithfully reproduce the acoustic coding naturally produced by the listener's body, taking into account the individual specificities of his or her morphology. When these conditions are not met, binaural rendering degrades, resulting in particular in intracranial perception of the sources and in front/back confusions: sources at the front are perceived at the back and vice versa.
- the binaural techniques described above apply to the processing of 3D sound intended for broadcasting over a headset with two earpieces, left and right. These techniques aim at reconstructing the sound field at the level of the listener's ears, so that the eardrums perceive a sound field virtually identical to that which real sources in 3D space would have induced.
- the binaural techniques are based on a pair of binaural signals that respectively feed the two earpieces of the headset. These binaural signals can be obtained in two ways:
- Binaural techniques employing binaural filters define the field of binaural synthesis, an advantageous context of the present invention. Binaural synthesis is based on binaural filters that model the propagation of the acoustic wave between the source and the two ears of the listener. These filters represent acoustic transfer functions, called HRTFs, that model the transformations generated by the torso, head and pinna of the listener on the signal coming from a sound source. With each sound source position is associated a pair of HRTFs (one HRTF for the right ear, one HRTF for the left ear). In addition, HRTFs carry the acoustic fingerprint of the morphology of the individual on whom they were measured.
- HRTFs therefore depend not only on the direction of the sound, but also on the individual. They are thus a function of the frequency f, of the position (θ, φ) of the sound source (where the angle θ represents the azimuth and the angle φ the elevation), of the ear (left or right) and of the individual.
- HRTFs are obtained by measurement.
- left and right HRTFs are measured by means of microphones inserted at the entrance of a subject's ear canal. The measurement must be performed in an anechoic chamber (or "deaf room”).
- For M measured directions, we obtain, for a given subject, a database of 2M acoustic transfer functions, one for each position of space and each ear.
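Such a database of 2M transfer functions per subject can be sketched as follows, here in Python for illustration; the class and field names (`HrtfDatabase`, `spectrum`) are hypothetical and not taken from the patent.

```python
class HrtfDatabase:
    """Stores the measured HRTFs: one transfer function per (subject, ear,
    direction) triple, i.e. 2M entries per subject for M directions."""

    def __init__(self):
        # (subject, ear, azimuth, elevation) -> spectrum coefficients
        self._data = {}

    def add(self, subject, ear, azimuth, elevation, spectrum):
        assert ear in ("left", "right")
        self._data[(subject, ear, azimuth, elevation)] = list(spectrum)

    def get(self, subject, ear, azimuth, elevation):
        return self._data[(subject, ear, azimuth, elevation)]

    def directions(self, subject):
        # the M measured directions available for a given subject
        return sorted({(az, el) for (s, e, az, el) in self._data if s == subject})
```

For M directions, `directions()` returns M entries while the store holds 2M spectra (left and right ears).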
- the spatialization effect is based on the use of HRTFs which, for optimal performance, must take into account acoustic propagation phenomena between the source and the ears, but also the individual specificities of the morphology of the listener.
- the experimental measurement of HRTFs directly on an individual is, at present, the most reliable solution for obtaining binaural filters that are of quality and truly individualized (taking into account the individual specificities of the morphology of the individual). It is recalled that this is a matter of measuring the transfer function between a source located at a given position (θ1, φ1) and the two ears of the subject, by means of microphones placed at the entrance of this person's ear canals.
- the measurement of HRTFs becomes very difficult, if not impossible, in the context of binaural synthesis applications for the general public.
- the measurement of HRTFs actually poses at least three main problems:
- the measurement of HRTF itself is difficult to implement because it requires specific equipment.
- the measurement must be performed in an anechoic chamber. It also requires a mechanical device to move and control the measurement speaker to perform measurements for a large number of directions evenly distributed in azimuth and elevation around the listener.
- the measurement procedure as a whole is tedious for the subject, because of the constraints imposed on him or her by the measuring system and because of the duration of the measurement.
- a second problem is the need to measure HRTFs in a large number of directions, in order to provide sufficient and homogeneous spatial sampling of the 3D sphere surrounding the listener. The greater the number of measured directions, the longer the duration of the measurement, which increases the discomfort of the subject.
- a third problem is the need for a measurement specific to each individual. Offering quality binaural rendering to any individual implies using his or her own HRTFs, which must have been measured beforehand, which is generally impossible.
- this document also provides for enriching the morphological data of an individual, at the input of the model, with some HRTFs measured on this individual in respective directions.
- the present invention aims at a method for modeling HRTF transfer functions specific to an individual, in which there is provided: an initial model-constitution step in which: a) a first database is constituted, including a plurality of HRTFs measured in a multiplicity of directions of space and for a plurality of individuals, b) a second database is constituted, including the own, respective morphological parameters of said plurality of individuals, c) from said morphological parameters of the second database, a finite element modeling is applied to obtain a third database comprising the own, respective modeled HRTFs of said plurality of individuals, for at least a part of said multiplicity of directions, d) by comparison and learning on the data of the first and third databases, a corrective model is constructed, capable of providing modeled and adjusted HRTFs for said multiplicity of directions; and a current step of determining HRTFs in said multiplicity of directions, for any individual, in which: e) the morphological parameters of this individual are measured, and f) the modeled and corrected HRTFs of the individual are obtained.
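The claimed steps a) to f) can be sketched as a two-stage pipeline, shown here in Python purely for illustration; all function names (`build_corrective_model`, `determine_hrtfs`, `fem_model`, `learn`) are hypothetical stand-ins, not terms from the patent.

```python
def build_corrective_model(measured_hrtfs, morphologies, fem_model, learn):
    """Initial model-constitution step (a-d)."""
    # c) apply finite element modeling to each individual's morphology
    #    to build the third database of roughly modeled HRTFs
    fem_hrtfs = {ind: fem_model(m) for ind, m in morphologies.items()}
    # d) comparison and learning on the first (measured) and third
    #    (modeled) databases yields the corrective model
    return learn(fem_hrtfs, measured_hrtfs)

def determine_hrtfs(corrective_model, fem_model, morphology):
    """Current step (e-f) for any new individual."""
    # e) measured morphological parameters -> rough finite-element HRTFs
    rough = fem_model(morphology)
    # f) the corrective model returns the modeled and corrected HRTFs
    return corrective_model(rough)
```

A toy `fem_model` and `learn` are enough to exercise the flow; in the patent the corrective model is an artificial neural network.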
- the present invention intends to take advantage of the technique described in document FR-2 851 877, according to which it is possible to model, at least roughly, the HRTFs of an individual for whom an appropriate set of morphological parameters has been measured.
- It is typically a finite element modeling, which amounts to estimating, as a function of their direction of origin, the disturbances that acoustic waves undergo when they encounter an obstacle corresponding to the bust of the individual.
- FR-2 851 877 proposes to also locate at least the position of an ear on the head of the individual, and preferably the shape of the pinna of the ear as well.
- the quality of the HRTFs thus modeled remained to be perfected, and the present invention proposes for this purpose to apply a corrective model, advantageously implementing an artificial neural network, in particular in step d) of model constitution of the above method.
- the measurement conditions of the morphological parameters are substantially reproducible at least between the stage of constitution of the model and the current stage carried out on any individual. It is also preferable that the simplified geometric model, as well as the finite element model, be reproducible.
- an installation can be provided for estimating HRTFs transfer functions specific to an individual, comprising:
- a processing unit capable of evaluating the HRTFs of the individual in a multiplicity of spatial directions by applying to the morphological parameters of the individual a finite element modeling and a corrective model based on a learning and advantageously putting in a network of artificial neurons.
- the present invention also aims at such an installation.
- the installation can be equipped with means for taking pictures, from at least two different angles (for example face and profile), of at least the bust of an individual, in order to deduce the general dimensions of his head, his torso, or others.
- the cabin may comprise, in a preferred embodiment, a measurement standard, so that the shots show the measurement standard together with the bust of the individual.
- Shape-recognition means, for example, can then be used to measure the morphological parameters involved in the modeling.
- finite element HRTF modeling is more efficient in certain particular directions, in that for these directions the finite-element HRTFs are closer to the measured HRTFs than in the other directions, regardless of the individual. Thus, at the end of the finite element modeling, only those best-modeled HRTFs, corresponding to privileged directions, can be retained, and the comparison made only on these privileged directions. On the other hand, learning is conducted on the whole multiplicity of directions of space.
- in the model-constitution step, from said morphological parameters of the second database and by comparison with the measured HRTFs of the first database, privileged directions of space are selected, for which the finite element modeling provides modeled HRTFs close to the measured HRTFs, and
- in step c), from said morphological parameters of the second database, finite element modeling is applied to obtain a third database comprising the own, respective modeled HRTFs of said plurality of individuals, along said privileged directions,
- in step d), by comparison and learning on the data of the first and third databases, a corrective model is constructed that is capable of yielding modeled and adjusted HRTFs for the multiplicity of directions.
- the present invention also relates to a computer program product, intended to be stored in a memory of a processing unit or on a removable support adapted to cooperate with a reader of said processing unit, or intended to be transmitted from a server to said processing unit.
- the program includes instructions in the form of computer code for constructing a learning-based model, advantageously implementing an artificial neural network, capable of providing the HRTF transfer functions of an individual for a multiplicity of directions, from a set of measurements, made on this individual, of morphological parameters of this individual.
- the program then implements, from a first database including a plurality of HRTFs along a plurality of spatial directions and for a plurality of individuals, and a second database containing morphological parameters of these individuals, at least one finite element modeling, followed by a comparison / learning phase.
- the present invention also relates to a second computer program product, intended to be stored in a memory of a processing unit or on a removable support adapted to cooperate with a reader of said processing unit, or intended to be transmitted from a server to said processing unit.
- the program includes instructions in computer code form for implementing a learning-based model advantageously implementing an artificial neural network, which model is capable of providing HRTFs transfer functions of an individual for a multiplicity of directions, from a set of measurements made on this individual, of morphological parameters of this individual.
- the first program described above makes it possible to build the model, while the second program consists of the computer instructions representing the model itself.
- FIG. 1 diagrammatically illustrates the main steps of the process within the meaning of the invention
- FIG. 2 diagrammatically illustrates the operating steps of a model implementing an artificial neural network, which can then correspond to a flowchart schematically showing the progress of the second computer program described above,
- FIG. 3 schematically illustrates the steps of construction of the model, which may then correspond to a flowchart schematically showing the progress of the first computer program described above,
- FIG. 4a schematically illustrates the first model constitution step in a method according to the invention
- FIG. 4b schematically illustrates the current step using the model constituted in a process in the sense of the invention
- FIG. 4c schematically illustrates an advantageous embodiment for constituting the aforementioned model
- the interest of the mathematical model lies in the use of input parameters that it is easy to acquire for any individual, bearing in mind, however, that their relationship to the transfer function is not necessarily direct or obvious.
- the mathematical model must in particular be able to extract more or less hidden information in the input parameters in order to deduce the desired transfer function.
- the method of the invention is essentially based on two points:
- the mathematical model of HRTFs is based on a function F for expressing an HRTF from a given number of input parameters.
- the input parameters are grouped into a vector X (X ∈ ℝ^m), which therefore constitutes the input vector of the function F.
- the output vector of the function is an HRTF, which is represented by a vector Y (Y ∈ ℝ^n).
- this vector Y may consist of frequency coefficients describing the spectrum modulus of the transfer function defined by the HRTF.
- Y may consist of:
- the function F is therefore a function from ℝ^m into ℝ^n.
- the input vector X of the model contains mainly information relating to:
- the direction of an HRTF, preferably in the form of an azimuth angle (θ) and an elevation angle (φ),
- - and "individual" parameters, such as HRTFs estimated from the morphological parameters of the individual by finite element modeling, in all or only some directions of space, as will be seen later.
- these individual parameters are intended to provide the model with information relating to the specificities of the individual whose HRTFs are to be calculated.
- the output vector Y of the model consists of coefficients associated with a given representation of an HRTF.
- the vector Y may correspond to the frequency coefficients describing the spectrum modulus of an HRTF, but other representations may be considered (principal component analysis, HR filter, or others).
- the model is applied here for correction and optionally interpolation purposes.
- Morphological parameters, such as the dimensions of the head Dim_H and/or of the torso Dim_T of an individual, are measured on this individual (step E10).
- estimated HRTFs, denoted HRTF_g(θi, φj), are deduced for all or part of the directions of space (step E12).
- the corrective model based on an artificial neural network is then used (step E13) to calculate corrected HRTFs HRTF_c(θi, φj) of this individual in all directions (over 360°) covering the entire 3D sphere (step E14), and this by comparison with a first database of real HRTF measurements of the same individual (denoted HRTF_m(θi, φj)) throughout the 3D sphere (step E15 of Figure 1).
- the previously estimated HRTFs are thus used as input parameters of the correction model of step E13, and the previously measured HRTFs E15 are used as comparison parameters for that same correction model.
- modeling based on an artificial neural network consists essentially of:
- the method of the invention is preferably based on statistical learning algorithms and, in a preferred embodiment, on artificial neural network type algorithms. These algorithms are briefly presented below.
- Statistical learning algorithms are tools for predicting statistical processes. They have been used successfully for the prediction of processes for which several explanatory variables can be identified. Artificial neural networks define a particular category of these algorithms. The interest of neural networks lies in their ability to capture high-level dependencies, that is, dependencies that involve several variables at once. Process prediction takes advantage of the knowledge and exploitation of such high-level dependencies. There is a wide variety of application domains for neural networks, especially in financial engineering to predict market fluctuations, in pharmaceuticals, in banking for the detection of credit card fraud, in marketing to predict consumer behavior, or others. Neural networks are often considered universal predictors, in the sense that they are capable of predicting any data from explanatory variables, provided that the number of hidden units is sufficient. In other words, they make it possible to model any mathematical function from ℝ^m into ℝ^n, if the number of hidden units is sufficient.
- a neural network consists of three layers: an input layer 10, a hidden layer 11 and an output layer 12.
- the input layer 10 corresponds to the explanatory variables, that is to say the input variables (the aforementioned vector X), from which the prediction is made, and which will be described in detail later.
- the output layer 12 defines the predicted values (the above-mentioned vector Y).
- a first step 111 consists in calculating linear combinations of the explanatory variables so as to combine the information coming potentially from several variables.
- a second step 112 may consist in applying a non-linear transformation (for example a function of the "hyperbolic tangent" type) to each of the linear combinations in order to obtain the values of the hidden units or neurons that constitute the hidden layer. This nonlinear transformation defines the activation function of the neurons.
- the hidden units are recombined linearly, in step 113, to calculate the value predicted by the neural network.
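Steps 111 to 113 above amount to a standard MLP forward pass. Here is a minimal sketch in Python (the patent describes no code); the weights, layer sizes and function name `mlp_forward` are illustrative assumptions.

```python
import math

def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of steps 111-113 on input vector x."""
    # step 111: linear combinations of the explanatory variables
    pre = [sum(w * xi for w, xi in zip(row, x)) + b
           for row, b in zip(w_hidden, b_hidden)]
    # step 112: non-linear activation ("hyperbolic tangent") -> hidden units
    hidden = [math.tanh(p) for p in pre]
    # step 113: linear recombination of the hidden units -> predicted vector
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w_out, b_out)]
```

With all weights at zero, the output reduces to the output biases, which makes the three stages easy to check in isolation.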
- a prediction error is thus evaluated on examples from a validation set, which is distinct from the training set. This error defines the validation error. Typically, it begins to decrease as the number of hidden units increases, reaches a minimum, and then increases when the number of hidden units becomes too large. The minimum therefore defines an optimal number of hidden units for the network;
- There are different categories of neural networks, distinguished by their architecture (type of interconnection between neurons, choice of activation functions, or other) and by the learning mode used.
- Neural networks are not used for prediction purposes only. They are also used for classification and / or clustering of data in a perspective of information reduction. Indeed, a network of neurons is able, in a set of data, to identify common characteristics between the elements of this set, to group them according to their resemblance. Each group thus formed is then associated with an element representative of the information contained in the group, called "representative". This representative can then be substituted for the entire group.
- the set of data can thus be described by means of a reduced number of elements, which constitutes a reduction of data.
- Kohonen maps, or self-organizing maps, are neural networks dedicated to this grouping task.
- the roughly estimated HRTFs can be determined by finite element modeling, for example by considering simple geometrical shapes for the head, torso, neck, or other parts of an individual, as described in document FR-2 851 877, without repeating this description in detail here.
- the method that seemed the most immediate was a uniform selection in which a subset of roughly estimated HRTFs directions was chosen by trying to cover the entire 3D sphere as homogeneously and evenly as possible. This method was based on a regular sampling of the 3D sphere. However, it turned out that the HRTFs did not vary in a uniform way depending on the direction. From this point of view, a uniform selection of HRTFs was not really optimal.
- this grouping technique may consist, in a first step, in identifying the redundancies between the HRTFs of neighboring directions,
- each group is associated with an HRTF which is considered to be the representative of the group.
- This "representative" HRTF is one of the HRTFs of the cluster and is selected as the HRTF minimizing a distance criterion with all the other HRTFs in the group.
- the representative HRTF contains most of the HRTFs information of the group. In the end, all the representative HRTFs thus obtained constitute a compact description of the properties of the HRTFs for the entire 3D sphere.
- the clustering procedure also provides additional information as to the directions associated with the representative HRTFs, this information making it possible to define a selection of HRTFs intended to feed the input of the HRTFs calculation model. This selection is a priori non-uniform, but more efficient, and guarantees a better "representativeness" of the entire 3D sphere.
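The "representative" selection described above can be sketched as a medoid computation: within a group, keep the member minimizing the summed distance to all the others. This Python sketch is illustrative only; the Euclidean distance is an assumption, since the patent only speaks of "a distance criterion", and the self-organizing-map clustering that forms the groups is not shown.

```python
import math

def euclid(a, b):
    """Euclidean distance between two HRTF coefficient vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def representative(group, dist=euclid):
    """Return the group member minimizing the summed distance to all others,
    i.e. the HRTF retained as the 'representative' of the cluster."""
    return min(group, key=lambda h: sum(dist(h, other) for other in group))
```

The representative is always an actual HRTF of the group, as the text requires, rather than an average that would correspond to no measured direction.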
- the invention uses "artificial neural network" type statistical learning algorithms as a modeling tool for the corrective calculation of HRTFs (for example with a "Multi Layer Perceptron" or MLP neural network).
- the input parameters of the neural network are at least the azimuth angle (θ1) and the elevation angle (φ1) specifying the direction of an HRTF to be calculated, and the HRTFs roughly estimated using the finite element model.
- the output parameters of the model are then the coefficients of the vector describing the HRTF for the direction (θ1, φ1) and for the individual whose HRTFs were estimated by finite element modeling.
- the principle of calculating HRTFs by implementing an artificial neural network (for example of the MLP type) consists of:
- the input layer 10 consisting of the input parameters, then including: o the roughly estimated HRTFs, denoted HRTF_g(θi, φi), with i between 1 and n, o the directions for which the HRTFs are to be calculated, preferably specified in the form of an elevation angle (φj^cal) and an azimuth angle (θj^cal), with j between 1 and N, N possibly being different from n and in particular greater than n,
- the output layer 12 giving the corrected HRTFs of the individual in the directions (θj^cal, φj^cal) specified at the input,
- one or more hidden layers 11, which seek, by adjusting the weights and activation functions of the neurons, to best model the relationship between the input layer and the output layer.
- the test phase 23. To carry out these three phases, a database of HRTFs roughly estimated on one or more individuals is initially available. Thus, it will be understood that a prior step of collecting measurements of the morphological parameters of several individuals, and hence their roughly estimated HRTFs in all directions of space, is implemented. This is how database 20 is built.
- This database 20 is broken down into three distinct sets:
- the learning (training) set,
- the VALID validation set,
- the TEST test set.
- an input vector X (describing the direction of the HRTF to be calculated and the individual parameters, such as the rough estimation of the HRTFs in all or some directions), and an output vector Y (corresponding to the HRTF that the neural network must estimate at best).
- the validation phase 22 is conducted in conjunction with the learning phase 21. It consists in evaluating the prediction error of the neural network on a validation set (distinct from the training set), which defines the validation error. During learning, the validation error begins to decrease, then starts to grow again when over-learning occurs. The minimum of the validation error therefore determines the end of learning.
- test error finally describes the final performance of the neural network.
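The stopping rule of the validation phase 22 can be sketched as early stopping at the minimum of the validation error. In this illustrative Python sketch, `step` and `val_error` are hypothetical callbacks standing in for one learning pass and the evaluation on the validation set; they are not names from the patent.

```python
def train_with_early_stopping(step, val_error, max_epochs, patience=3):
    """Run learning passes until the validation error stops improving
    (its minimum marks the end of learning, before over-learning)."""
    best_err, best_epoch, since_best = float("inf"), 0, 0
    for epoch in range(1, max_epochs + 1):
        step()                          # one learning pass on the training set
        err = val_error()               # prediction error on the validation set
        if err < best_err:
            best_err, best_epoch, since_best = err, epoch, 0
        else:
            since_best += 1
            if since_best >= patience:  # error grows again: over-learning
                break
    return best_epoch, best_err
```

A separate test set, untouched during this loop, then gives the test error describing the final performance of the network.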
- the method illustrated by way of example therefore comprises a step a) during which a database 20 is constituted by measuring a plurality of HRTFs in a multiplicity of directions of the space and for a plurality of individuals.
- This measurement step, referenced 40 in FIG. 4a, consists in collecting the HRTF measurements in N spatial directions, for M individuals, preferably of different morphology (or "morphotype"), in order to obtain a database that is exhaustive with respect to individual specificities. More generally, the higher the number of individuals taken into account during learning, the better the performance of the neural network, especially in terms of "universality".
- step b) consists of learning the model using this database 20 and another database 41 comprising grossly estimated HRTFs from finite element modeling 49 (or "BEM") applied to morphological parameters 48 specific to the same individuals.
- a restricted number n of representative HRTF directions i are arbitrarily selected.
- This step 41 will be described in detail below, with reference to FIG. 4c.
- the three phases of learning 21, validation 22 and test 23 are then conducted to build the model in step 44. It will be noted that the number of roughly estimated HRTFs can be adjusted to avoid the over-learning phenomenon described above.
- the database 20 must be constituted under the most conventional and standard conditions, in order to offer, at the output of the model, quality HRTFs that can be applied to rendering devices while providing satisfactory listening comfort.
- a second type of measurement 48 is carried out on the same individuals as those on which the measured-HRTFs database was built, and consists in recording the morphological parameters of these M individuals (dimensions of the head, torso and neck, position and shape of the ears, etc.).
- finite element modeling 49 is applied to obtain HRTFs estimated in at least a portion of the directions of space.
- a step 50 it is specified, at the input of the model, in which directions (0 j cal , ⁇ j cal ) the HRTFs will have to be calculated. Preferably, this will of course be the largest possible number of 3D space directions.
- a version of the model 44b, in the learning state, calculates the corrected HRTFs in these directions (θ_j^cal, φ_j^cal) from the coarsely estimated HRTFs, in a following step 46b.
- the model compares these calculated, corrected HRTFs with the HRTFs of the database 20 in the same directions (θ_j^cal, φ_j^cal). If the deviation is judged too large (arrow N), the learning model 44b is refined until this deviation is reduced to an acceptable error (arrow O): the model then becomes definitive (end step 44).
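The compare-and-refine loop of arrows N and O can be illustrated with a toy corrective model: coarse estimates standing in for the BEM output 49 are corrected, compared against "measured" values from the database in the same directions, and the correction parameters are refined until the deviation drops below an acceptable error. The affine correction, the data and the tolerance are illustrative assumptions only:

```python
import math

# Toy stand-ins: "measured" HRTF magnitudes at directions θ_j^cal, and
# coarse BEM estimates distorted by a fixed gain and offset.
directions = [j * math.pi / 6 for j in range(12)]
measured = [1.0 + 0.5 * math.cos(t) for t in directions]   # database 20
coarse = [0.8 * h + 0.1 for h in measured]                 # BEM output 49

# Corrective model (stand-in for 44b): h_corrected = a * h_coarse + b
a, b, lr, tolerance = 1.0, 0.0, 0.1, 1e-3
for step in range(20000):
    errors = [(a * c + b) - m for c, m in zip(coarse, measured)]
    if max(abs(e) for e in errors) < tolerance:
        break   # arrow O: acceptable error, the model becomes definitive
    # arrow N: deviation too large, refine the learning model
    a -= lr * sum(e * c for e, c in zip(errors, coarse)) / len(errors)
    b -= lr * sum(errors) / len(errors)
```

Here the loop recovers the inverse of the synthetic distortion (a ≈ 1.25, b ≈ -0.125); in the patent the refined object is the neural network's weights rather than two scalar parameters.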
- the individual IND is placed in a cabin CAB and preferably positions his or her bust with respect to a top mark REP1 and a front mark REP2 provided in the cabin CAB.
- This embodiment makes it possible to keep the individual IND correctly positioned with respect to two image-capture means S1 and S2 oriented at two distinct angles θ1 and θ2 and, consequently, to obtain a 3D topography of the bust, including in particular the dimensions of the individual's head, torso, neck, etc.
- the cabin comprises a measurement standard ETA, which serves as a scale for measuring these dimensions.
- the image-capture means S1 and S2 include, within their field of view, the standard ETA together with the bust of the individual IND.
- the images can then be analyzed by shape-recognition means to measure the morphological parameters of the individual.
- the image signals are collected by an interface 51 of a central processing unit UC, which converts them into digital data. These data are then processed to determine the morphological parameters 48 and, from them, the coarse HRTFs by applying the BEM model (step 49). Finally, these coarsely estimated HRTFs are processed by the model 44 based on an artificial neural network.
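A minimal sketch of this processing chain, with the BEM step 49 and the neural-network model 44 replaced by trivial stand-ins (every function name, parameter and constant below is hypothetical):

```python
# The standard ETA gives the pixels-to-centimetres scale for the images.
ETA_REAL_CM = 100.0            # assumed physical length of the standard ETA

def pixels_to_cm(length_px, eta_px):
    """Convert an image measurement to centimetres using the ETA scale."""
    return length_px * ETA_REAL_CM / eta_px

def coarse_hrtf_stub(morpho):
    """Stand-in for BEM step 49: a made-up function of head width."""
    return [1.0 + 0.01 * morpho["head_width"] * k for k in range(3)]

def corrective_model_stub(coarse):
    """Stand-in for the trained neural-network model 44."""
    return [0.9 * h for h in coarse]

# Morphological parameters 48, measured in pixels on the captured views
eta_px = 400.0
morpho = {"head_width": pixels_to_cm(62.0, eta_px),
          "neck_height": pixels_to_cm(48.0, eta_px)}
hrtfs = corrective_model_stub(coarse_hrtf_stub(morpho))
```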
- the model 44 may be stored, as a computer program product, in a memory of the central processing unit UC.
- the HRTFs that the model delivers for all directions of space can then be stored in the memory 52, recorded on a removable medium (diskette or CD-ROM), or communicated via a network such as the Internet or equivalent.
- the protocol for measuring the morphological parameters, on the one hand, and the HRTFs of the database 20, on the other hand, should preferably be defined in advance and followed in substantially the same way for all individuals.
- the neural network thus obtained is capable of calculating the HRTFs of any individual, in any direction, provided that measurements of his or her morphological parameters are available.
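As a sketch of such an inference, a small multilayer network with a tanh activation can map morphological parameters and a requested direction (θ, φ) to an HRTF value. The weights below are arbitrary illustrative numbers, not a trained model:

```python
import math

def forward(x, W1, b1, W2, b2):
    # one hidden layer of neurons with a tanh activation function
    h = [math.tanh(sum(wij * xi for wij, xi in zip(row, x)) + bj)
         for row, bj in zip(W1, b1)]
    return sum(w2 * hi for w2, hi in zip(W2, h)) + b2

# Illustrative weights; a real model would result from the training of step 44.
W1 = [[0.5, -0.2, 0.1], [0.3, 0.4, -0.1]]
b1 = [0.0, 0.1]
W2 = [0.7, -0.3]
b2 = 0.2

# Inputs: one morphological parameter plus the requested direction (θ, φ),
# normalized to comparable ranges before entering the network.
head_width_cm, theta, phi = 15.5, math.pi / 4, 0.0
x = [head_width_cm / 20.0, theta / math.pi, phi / math.pi]
hrtf_value = forward(x, W1, b1, W2, b2)
```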
Landscapes
- Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Acoustics & Sound (AREA)
- Signal Processing (AREA)
- Image Analysis (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
FR0510995 | 2005-10-27 | ||
PCT/FR2006/002345 WO2007048900A1 (en) | 2005-10-27 | 2006-10-18 | Hrtfs individualisation by a finite element modelling coupled with a revise model |
Publications (2)
Publication Number | Publication Date |
---|---|
EP1946612A1 true EP1946612A1 (en) | 2008-07-23 |
EP1946612B1 EP1946612B1 (en) | 2012-11-14 |
Family
ID=36658888
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP06820237A Active EP1946612B1 (en) | 2005-10-27 | 2006-10-18 | Hrtfs individualisation by a finite element modelling coupled with a corrective model |
Country Status (3)
Country | Link |
---|---|
US (1) | US20080306720A1 (en) |
EP (1) | EP1946612B1 (en) |
WO (1) | WO2007048900A1 (en) |
Families Citing this family (26)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
FR2880755A1 (en) * | 2005-01-10 | 2006-07-14 | France Telecom | METHOD AND DEVICE FOR INDIVIDUALIZING HRTFS BY MODELING |
US9215544B2 (en) * | 2006-03-09 | 2015-12-15 | Orange | Optimization of binaural sound spatialization based on multichannel encoding |
FR2958825B1 (en) * | 2010-04-12 | 2016-04-01 | Arkamys | METHOD OF SELECTING PERFECTLY OPTIMUM HRTF FILTERS IN A DATABASE FROM MORPHOLOGICAL PARAMETERS |
WO2012028906A1 (en) * | 2010-09-03 | 2012-03-08 | Sony Ericsson Mobile Communications Ab | Determining individualized head-related transfer functions |
AU2012394979B2 (en) | 2012-11-22 | 2016-07-14 | Razer (Asia-Pacific) Pte. Ltd. | Method for outputting a modified audio signal and graphical user interfaces produced by an application program |
FR3000637A1 (en) * | 2012-12-28 | 2014-07-04 | Digital Media Solutions | DEVICE AND METHOD FOR SPACE INTERPOLATION OF SOUNDS |
CN108806704B (en) | 2013-04-19 | 2023-06-06 | 韩国电子通信研究院 | Multi-channel audio signal processing device and method |
CN104982042B (en) | 2013-04-19 | 2018-06-08 | 韩国电子通信研究院 | Multi channel audio signal processing unit and method |
US9883312B2 (en) | 2013-05-29 | 2018-01-30 | Qualcomm Incorporated | Transformed higher order ambisonics audio data |
US9466305B2 (en) | 2013-05-29 | 2016-10-11 | Qualcomm Incorporated | Performing positional analysis to code spherical harmonic coefficients |
US20140376754A1 (en) * | 2013-06-20 | 2014-12-25 | Csr Technology Inc. | Method, apparatus, and manufacture for wireless immersive audio transmission |
US9426589B2 (en) * | 2013-07-04 | 2016-08-23 | Gn Resound A/S | Determination of individual HRTFs |
US9319819B2 (en) * | 2013-07-25 | 2016-04-19 | Etri | Binaural rendering method and apparatus for decoding multi channel audio |
CN103731796B (en) * | 2013-10-10 | 2015-09-16 | 华南理工大学 | For many sound sources automatic measuring system of far field and Head-Related Transfer Function for Nearby Sources |
US9922656B2 (en) | 2014-01-30 | 2018-03-20 | Qualcomm Incorporated | Transitioning of ambient higher-order ambisonic coefficients |
US9489955B2 (en) | 2014-01-30 | 2016-11-08 | Qualcomm Incorporated | Indicating frame parameter reusability for coding vectors |
US10770087B2 (en) | 2014-05-16 | 2020-09-08 | Qualcomm Incorporated | Selecting codebooks for coding vectors decomposed from higher-order ambisonic audio signals |
US9620137B2 (en) | 2014-05-16 | 2017-04-11 | Qualcomm Incorporated | Determining between scalar and vector quantization in higher order ambisonic coefficients |
US9852737B2 (en) | 2014-05-16 | 2017-12-26 | Qualcomm Incorporated | Coding vectors decomposed from higher-order ambisonics audio signals |
US9747910B2 (en) | 2014-09-26 | 2017-08-29 | Qualcomm Incorporated | Switching between predictive and non-predictive quantization techniques in a higher order ambisonics (HOA) framework |
CN108028998B (en) * | 2015-09-14 | 2020-11-03 | 雅马哈株式会社 | Ear shape analysis device and ear shape analysis method |
EP3416105A4 (en) * | 2016-02-12 | 2019-02-20 | Sony Corporation | Information processing method and information processing device |
US10154365B2 (en) * | 2016-09-27 | 2018-12-11 | Intel Corporation | Head-related transfer function measurement and application |
US10306396B2 (en) | 2017-04-19 | 2019-05-28 | United States Of America As Represented By The Secretary Of The Air Force | Collaborative personalization of head-related transfer function |
US11363402B2 (en) | 2019-12-30 | 2022-06-14 | Comhear Inc. | Method for providing a spatialized soundfield |
KR102289707B1 (en) * | 2020-08-26 | 2021-08-13 | 현대제철 주식회사 | Method and system for temperature distribution prediction of winded coil |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AUPQ514000A0 (en) * | 2000-01-17 | 2000-02-10 | University Of Sydney, The | The generation of customised three dimensional sound effects for individuals |
JP3521900B2 (en) * | 2002-02-04 | 2004-04-26 | ヤマハ株式会社 | Virtual speaker amplifier |
AU2003260875A1 (en) * | 2002-09-23 | 2004-04-08 | Koninklijke Philips Electronics N.V. | Sound reproduction system, program and data carrier |
US7430300B2 (en) * | 2002-11-18 | 2008-09-30 | Digisenz Llc | Sound production systems and methods for providing sound inside a headgear unit |
US20090030552A1 (en) * | 2002-12-17 | 2009-01-29 | Japan Science And Technology Agency | Robotics visual and auditory system |
FR2851877B1 (en) * | 2003-02-28 | 2005-05-13 | METHOD OF MEASURING ACOUSTIC TRANSFER FUNCTIONS ASSOCIATED WITH THE MORPHOLOGY OF AN INDIVIDUAL | |
US7664272B2 (en) * | 2003-09-08 | 2010-02-16 | Panasonic Corporation | Sound image control device and design tool therefor |
FR2880755A1 (en) * | 2005-01-10 | 2006-07-14 | France Telecom | METHOD AND DEVICE FOR INDIVIDUALIZING HRTFS BY MODELING |
-
2006
- 2006-10-18 WO PCT/FR2006/002345 patent/WO2007048900A1/en active Application Filing
- 2006-10-18 EP EP06820237A patent/EP1946612B1/en active Active
- 2006-10-18 US US12/084,249 patent/US20080306720A1/en not_active Abandoned
Non-Patent Citations (1)
Title |
---|
See references of WO2007048900A1 * |
Also Published As
Publication number | Publication date |
---|---|
WO2007048900A1 (en) | 2007-05-03 |
US20080306720A1 (en) | 2008-12-11 |
EP1946612B1 (en) | 2012-11-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP1946612B1 (en) | Hrtfs individualisation by a finite element modelling coupled with a corrective model | |
EP1836876B1 (en) | Method and device for individualizing hrtfs by modeling | |
EP2258119B1 (en) | Method and device for determining transfer functions of the hrtf type | |
EP2898707B1 (en) | Optimized calibration of a multi-loudspeaker sound restitution system | |
EP3348079B1 (en) | Method and system for developing a head-related transfer function adapted to an individual | |
EP1992198B1 (en) | Optimization of binaural sound spatialization based on multichannel encoding | |
EP1563485B1 (en) | Method for processing audio data and sound acquisition device therefor | |
EP2374124B1 (en) | Advanced encoding of multi-channel digital audio signals | |
EP1479266B1 (en) | Method and device for control of a unit for reproduction of an acoustic field | |
Yamamoto et al. | Fully perceptual-based 3D spatial sound individualization with an adaptive variational autoencoder | |
WO2007137232A2 (en) | Method of modifying audio content | |
EP1586220B1 (en) | Method and device for controlling a reproduction unit using a multi-channel signal | |
FR2839565A1 (en) | METHOD AND SYSTEM FOR REPRESENTING AN ACOUSTIC FIELD | |
EP3384688B1 (en) | Successive decompositions of audio filters | |
EP3449643B1 (en) | Method and system of broadcasting a 360° audio signal | |
Javeri et al. | A Machine Learning Approach to Predicting Personalized Head Related Transfer Functions and Headphone Equalization from Video Capture Data | |
EP3484185A1 (en) | Modelling of a set of acoustic transfer functions suitable for an individual, three-dimensional sound card and system for three-dimensional sound reproduction | |
EP3934282A1 (en) | Method for converting a first set of signals representing a sound field into a second set of signals and associated electronic device | |
JP2023122018A (en) | Signal processor, signal processing program and signal processing method | |
EP1869949A1 (en) | Method and system for spatializing an audio signal based on its intrinsic qualities |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
|
17P | Request for examination filed |
Effective date: 20080425 |
|
AK | Designated contracting states |
Kind code of ref document: A1 Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC NL PL PT RO SE SI SK TR |
|
17Q | First examination report despatched |
Effective date: 20090122 |
|
REG | Reference to a national code |
Ref country code: DE Ref legal event code: R079 Ref document number: 602006033117 Country of ref document: DE Free format text: PREVIOUS MAIN CLASS: H04S0001000000 Ipc: H04S0007000000 |
|
RIC1 | Information provided on ipc code assigned before grant |
Ipc: H04S 7/00 20060101AFI20120112BHEP |
|
RTI1 | Title (correction) |
Free format text: HRTFS INDIVIDUALISATION BY A FINITE ELEMENT MODELLING COUPLED WITH A CORRECTIVE MODEL |
|
GRAP | Despatch of communication of intention to grant a patent |
Free format text: ORIGINAL CODE: EPIDOSNIGR1 |
|
DAX | Request for extension of the european patent (deleted) | ||
GRAS | Grant fee paid |
Free format text: ORIGINAL CODE: EPIDOSNIGR3 |
|
GRAA | (expected) grant |
Free format text: ORIGINAL CODE: 0009210 |
|
RAP1 | Party data changed (applicant data changed or rights of an application transferred) |
Owner name: FRANCE TELECOM |
|
AK | Designated contracting states |
Kind code of ref document: B1 Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC NL PL PT RO SE SI SK TR |
|
REG | Reference to a national code |
Ref country code: GB Ref legal event code: FG4D Free format text: NOT ENGLISH |
|
REG | Reference to a national code |
Ref country code: CH Ref legal event code: EP Ref country code: AT Ref legal event code: REF Ref document number: 584505 Country of ref document: AT Kind code of ref document: T Effective date: 20121115 |
|
REG | Reference to a national code |
Ref country code: IE Ref legal event code: FG4D Free format text: LANGUAGE OF EP DOCUMENT: FRENCH |
|
REG | Reference to a national code |
Ref country code: DE Ref legal event code: R096 Ref document number: 602006033117 Country of ref document: DE Effective date: 20130110 |
|
REG | Reference to a national code |
Ref country code: NL Ref legal event code: VDEP Effective date: 20121114 |
|
REG | Reference to a national code |
Ref country code: AT Ref legal event code: MK05 Ref document number: 584505 Country of ref document: AT Kind code of ref document: T Effective date: 20121114 |
|
REG | Reference to a national code |
Ref country code: LT Ref legal event code: MG4D |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: LT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20121114 Ref country code: ES Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20130225 Ref country code: SE Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20121114 Ref country code: FI Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20121114 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: PL Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20121114 Ref country code: GR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20130215 Ref country code: SI Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20121114 Ref country code: CY Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20121114 Ref country code: LV Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20121114 Ref country code: PT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20130314 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: AT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20121114 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: DK Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20121114 Ref country code: CZ Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20121114 Ref country code: BG Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20130214 Ref country code: SK Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20121114 Ref country code: EE Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20121114 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: NL Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20121114 Ref country code: IT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20121114 Ref country code: RO Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20121114 |
|
REG | Reference to a national code |
Ref country code: CH Ref legal event code: PUE Owner name: ORANGE, FR Free format text: FORMER OWNER: FRANCE TELECOM, FR |
|
PLBE | No opposition filed within time limit |
Free format text: ORIGINAL CODE: 0009261 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
|
RAP2 | Party data changed (patent owner data changed or rights of a patent transferred) |
Owner name: ORANGE |
|
26N | No opposition filed |
Effective date: 20130815 |
|
REG | Reference to a national code |
Ref country code: DE Ref legal event code: R097 Ref document number: 602006033117 Country of ref document: DE Effective date: 20130815 |
|
BERE | Be: lapsed |
Owner name: FRANCE TELECOM Effective date: 20131031 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: MC Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20121114 |
|
REG | Reference to a national code |
Ref country code: CH Ref legal event code: PL |
|
REG | Reference to a national code |
Ref country code: IE Ref legal event code: MM4A |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: LI Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20131031 Ref country code: CH Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20131031 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: BE Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20131031 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: IE Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20131018 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: TR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20121114 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: HU Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; INVALID AB INITIO Effective date: 20061018 Ref country code: LU Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20131018 |
|
REG | Reference to a national code |
Ref country code: FR Ref legal event code: PLFP Year of fee payment: 10 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: IS Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20121114 |
|
REG | Reference to a national code |
Ref country code: FR Ref legal event code: PLFP Year of fee payment: 11 |
|
REG | Reference to a national code |
Ref country code: FR Ref legal event code: PLFP Year of fee payment: 12 |
|
REG | Reference to a national code |
Ref country code: FR Ref legal event code: PLFP Year of fee payment: 13 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: GB Payment date: 20230920 Year of fee payment: 18 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: FR Payment date: 20230920 Year of fee payment: 18 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: DE Payment date: 20230920 Year of fee payment: 18 |