WO2006048537A1

WO2006048537A1 - Dynamic sound system configuration

Info

Publication number: WO2006048537A1
Application number: PCT/FR2005/002699
Authority: WO
Inventors: Olivier Bernier; Olivier Perrault
Original assignee: France Telecom
Priority date: 2004-11-03
Filing date: 2005-10-27
Publication date: 2006-05-11
Also published as: FR2877534A1

Abstract

A method for configuring a sound system consisting of a plurality of loud speakers (E1, En) for reproducing a sound signal (50) in a listening space occupied by at least one person, characterised in that it includes a step of real-time location of the position of said person(s) in said listening space and a step of dynamic matching (30) of sound signal reproduction on at least one previously located speaker depending on said person's position.

Description

DYNAMIC CONFIGURATION OF A SOUND SYSTEM

The present invention relates generally to the field of electroacoustics and deals with the reproduction of sound signals through a sound system consisting of a plurality of loudspeakers.

The invention relates more particularly to a method of configuring a sound system consisting of a plurality of loudspeakers intended to reproduce a sound signal within a listening place in which there is at least one person likely to move.

In the state of the art, some home cinema systems, more commonly known as "home theater" systems, include means for automatically configuring their speakers according to certain characteristics of the room in which the system is installed. Pioneer has developed such a system, called MCACC ("Multi-Channel Acoustic Calibration System"). A description of the main features of this system is available on the Internet at the following Web address: http://www.pioneerelectronics.com/pna/article/0,,2076_4151_20157532,00.html.

The principle of the MCACC system is first to evaluate the number and type of speakers connected to the home theater system, as well as the acoustic characteristics of the room in which they are installed, and then to calibrate the speakers accordingly to obtain the best possible sound quality for the user. Essentially, the calibration consists in determining the power of the sound signal to be sent to each of the speakers and the precise moment at which to send it.

To further optimize the speaker calibration, the user, equipped with a microphone connected to the system, takes up his listening position in the room. Sound signals are then emitted by the different speakers, and the calibration process consists in determining the number and type of speakers connected, their distance from the user's listening position, and their sound pressure levels. With this information, the system is designed to make all the adjustments needed to optimize listening at the user's listening position in the room and, in particular, to ensure that all the sounds emitted by the speakers reach him at the same volume.

However, this prior-art method for automatically adjusting the parameters of the different speakers of a home theater system according to the user's listening position in the room where the system is installed is intended to optimize the sound reproduction only at that single place in the room. In other words, the configuration of the speakers is fixed for the user's initial listening position. Consequently, this method does not take into account the possible mobility of the user: the speaker settings established during the configuration phase of the home theater system are not designed to adapt to movements of the listener in the room that change the listening position. Yet for sound reproduction techniques, for example those that reproduce spatialized sound, the position of the listener relative to the speakers is a strong constraint. The mobility of the listener thus degrades his listening quality.

The present invention aims to remedy these drawbacks by proposing a method of configuring a sound system that dynamically optimizes the sound reproduction at several listening positions according to the actual position of at least one person, the movements of people being taken into account when configuring the system. With this objective in view, the invention, which moreover conforms to the generic definition given in the preamble above, is essentially characterized in that it comprises the real-time location of the position of at least said person in said listening place, and the dynamic adaptation of the reproduction of said sound signal on at least one previously located speaker, according to said position of the person.

According to a preferred embodiment, locating the position of a person in the listening place includes the continuous acquisition of images of the listening place by at least one camera and the real-time analysis of said images in order to recognize the presence of said person and to continuously locate his position in the listening place according to his position in the image.

According to this embodiment, the analysis of the images acquired consists in the application of image processing algorithms, making it possible to detect in real time a face in an image and to provide its position in the image.

According to one variant, the analysis of the images acquired comprises the detection of the orientation of the face with respect to the axis of the camera, the dynamic adaptation of the sound signal reproduction being furthermore performed according to said orientation.

Preferably, the location of the position of the speakers in the listening location comprises the prior determination of the position and orientation of each speaker with respect to the position and the axis of the camera.

In one embodiment, the dynamic adaptation of the reproduction of the sound signal on at least one loudspeaker consists in adapting the sound level reproduced by said loudspeaker and/or in adapting the frequency spectrum of the sound signal reproduced by said loudspeaker and/or in correcting the generation of a sound reproduction effect localized in space.

According to a particular embodiment, the method comprises the identification of the located person, the dynamic adaptation of the reproduction of the sound signal further taking into account preferences associated with the identified person.

According to another particular embodiment, the dynamic adaptation of the sound signal reproduction is performed by learning.

The invention also relates to a sound system consisting of a plurality of loudspeakers provided to reproduce a sound signal within a listening place for at least one person, characterized in that it comprises processing means for carrying out the method according to the invention.

Other features and advantages of the present invention will appear more clearly on reading the following description, given by way of illustrative and nonlimiting example and with reference to the single appended figure, which schematically shows a functional architecture of a sound system for the implementation of the method according to the invention.

Figure 1 thus describes a sound system according to the present invention. It comprises a plurality of loudspeakers E1 to En, intended to reproduce a sound signal 50 in a listening place where the speakers are installed, for at least one person likely to move within the listening place.

The sound system also comprises processing means 60 for dynamically configuring the system, whose various functions are described in more detail below. As already explained, an object of the invention is to allow an optimized reproduction of the sound signal by the various speakers at several positions in the listening place, according to the actual position of one or more people in the listening place. The optimized reproduction of the sound by the speakers according to the position of a person advantageously takes into account the successive positions of that person. The processing means therefore comprise means for locating in real time the position of the persons present in the listening place. The role of these locating means is to determine and then track in real time the successive positions of people in the listening place.

To do this, according to an exemplary embodiment, the locating means comprise a camera 10, whose field of view covers the listening place as fully as possible and which is intended to acquire images of the listening place continuously and in real time. The camera is preferably fixed. It could also be mobile. The use of several cameras could also be considered. The locating means then implement a function for real-time analysis of the images acquired by the camera, with a view to recognizing the presence of at least one person and thereby continuously locating that person's position in the listening place according to his position in the image. According to a preferred embodiment, image processing algorithms, known per se, that detect and track a face in an image in real time can be used for this purpose. Through the implementation of such algorithms, the image analysis function returns the number of faces in the image and their positions in the image. However, in order to fully determine the precise position of the person in the listening place, depth in the image must be taken into account. Depth with respect to the camera is handled, for example, by measuring a distance that is usually fairly constant in a face, such as the gap between the eyes.
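By way of illustration only, the depth estimate described above can be sketched with a pinhole camera model, in which the distance to the face is inversely proportional to the eye gap measured in pixels. The names `Camera`, `locate_face` and `EYE_GAP_M`, the 63 mm average eye gap and the calibration values are assumptions introduced for this sketch and do not come from the description; the face is also assumed to be roughly frontal.

```python
import math
from dataclasses import dataclass

# Assumed average adult eye gap in metres (illustrative value, not from the source).
EYE_GAP_M = 0.063


@dataclass
class Camera:
    """Hypothetical pinhole calibration of the fixed camera."""
    focal_px: float  # focal length expressed in pixels
    cx: float        # principal point, horizontal pixel coordinate
    cy: float        # principal point, vertical pixel coordinate


def locate_face(cam, left_eye, right_eye):
    """Return (x, y, z) in metres, in the camera frame, from two eye pixel positions."""
    gap_px = math.dist(left_eye, right_eye)
    if gap_px <= 0:
        raise ValueError("eye positions must be distinct")
    # Similar triangles: real gap / depth = pixel gap / focal length.
    z = cam.focal_px * EYE_GAP_M / gap_px
    # Back-project the midpoint between the eyes at the estimated depth.
    u = (left_eye[0] + right_eye[0]) / 2
    v = (left_eye[1] + right_eye[1]) / 2
    x = (u - cam.cx) * z / cam.focal_px
    y = (v - cam.cy) * z / cam.focal_px
    return (x, y, z)
```

A face turned away from the camera shortens the apparent eye gap, so a sketch of this kind would overestimate depth unless the face orientation is also taken into account.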

Taking into account the location of the cameras by the processing means may be necessary during this step of image analysis, in the case for example where there are two cameras to be calibrated.

In a simplified mode, the position information determined by the image analysis function for each person recognized by the camera may consist of an indication of the zone of the listening place in which that person is located. In this mode, the scene corresponding to the listening place filmed by the camera is divided beforehand into several zones.
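As a minimal sketch of this simplified mode, and assuming purely for illustration that the scene is divided into equal-width vertical strips of the image, a face position can be mapped to a zone index as follows. The function name `zone_of` and the strip layout are hypothetical; the description does not specify how the zones are shaped.

```python
def zone_of(face_x_px, image_width_px, n_zones):
    """Return the 0-based zone index (left to right) containing a face.

    Assumes the image is split into n_zones vertical strips of equal width.
    """
    if n_zones < 1:
        raise ValueError("n_zones must be at least 1")
    if not 0 <= face_x_px < image_width_px:
        raise ValueError("face position lies outside the image")
    return int(face_x_px * n_zones / image_width_px)
```

With a 640-pixel-wide image and four zones, for example, each strip is 160 pixels wide and a face at pixel 160 falls in the second zone (index 1).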

According to a more complex embodiment, the function for analyzing the images acquired by the camera 10 further comprises the detection of the orientation of the faces with respect to the axis of the camera. The dynamic adaptation of the reproduction of the sound signal, described in more detail below, may then be carried out taking into account, on the one hand, the actual position of the person considered in the listening place and, on the other hand, the orientation of that person's face.

The processing means 60 also implement a function 40 for locating the position of the speakers in the listening place, comprising the prior determination of the position and orientation of each speaker with respect to the position and the axis of the camera. The location of the position of the speakers in the listening place is preferably performed during a configuration phase of the processing means.

From the position data of the persons provided by the image analysis function 20, a function 30 for dynamically adapting the reproduction of the sound signal on one or more speakers can then be implemented. The adaptation of the sound signal reproduction on at least one speaker can consist in dynamically adapting the sound level reproduced by the speaker or speakers according to the position of the person considered in the listening place. It may also consist in dynamically adapting the frequency spectrum of the sound signal reproduced by the speaker or speakers according to the position of the person considered in the listening place. Depending on the type of application, the dynamic adaptation of the reproduction of the sound signal on at least one speaker may further consist in correcting the generation of a sound reproduction effect localized in space.
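As one possible illustration of the sound level adaptation, the sketch below compensates for the distance between a speaker and the located person, assuming free-field propagation in which sound pressure falls off as the inverse of distance (about 6 dB per doubling of distance). This propagation model, the function name `level_gain_db`, the reference distance and the clamping limit are assumptions made for the example; the description does not prescribe a particular law.

```python
import math


def level_gain_db(speaker_pos, listener_pos, ref_distance_m=1.0, max_gain_db=12.0):
    """Return the gain in dB to apply to a speaker for a listener position.

    The gain is 0 dB at ref_distance_m and grows by about 6 dB each time the
    distance doubles (inverse-distance law, assumed here). The result is
    clamped to +/- max_gain_db to protect the speaker and the listener.
    """
    # Floor the distance to avoid log10(0) when the listener is at the speaker.
    d = max(math.dist(speaker_pos, listener_pos), 0.1)
    gain = 20.0 * math.log10(d / ref_distance_m)
    return max(-max_gain_db, min(max_gain_db, gain))
```

In a real room, reflections make the level fall off more slowly than this model predicts, so a measured calibration would normally replace the formula.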

When several people are present in the listening place, the function 30 also has the role of distributing in real time the sound signal on the different speakers according to the actual position of people, so as to avoid interference between them.

An exemplary embodiment could be the following. After the locating means have located two people within the listening place, the function for dynamically adapting the reproduction of the sound signal selects, for each of these people and according to their respective positions, the closest pair of speakers and, for each pair of speakers, equalizes the sound levels of the two speakers according to the position of the person, so that each person always has a stereophonic listening sensation regardless of his position relative to the selected pair of speakers. The dynamic adaptation function 30 of the processing means is therefore designed to control the different speakers E1, ..., En of the sound system individually. The direction of the speakers can also be controlled by the adaptation function to avoid interference. According to a simplified embodiment based on learning, the function for dynamically adapting the reproduction of the sound signal is learned beforehand, either statistically or by sampling and testing methods.
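The pair selection and level equalization in the example above can be sketched as follows, under the assumptions that speaker and listener positions are expressed in a common 2D frame and that sound pressure falls off as the inverse of distance. The function names `nearest_pair` and `balanced_gains` are hypothetical, and the sketch handles one listener at a time without resolving conflicts between listeners who would select the same speakers.

```python
import math


def nearest_pair(speakers, listener):
    """Return the names of the two speakers closest to the listener, nearest first.

    speakers maps a speaker name to its (x, y) position in metres.
    """
    if len(speakers) < 2:
        raise ValueError("at least two speakers are required")
    ranked = sorted(speakers, key=lambda name: math.dist(speakers[name], listener))
    return ranked[0], ranked[1]


def balanced_gains(speakers, listener):
    """Return linear gains for the nearest pair so both arrive at equal level.

    The farther speaker keeps a gain of 1.0 and the nearer one is attenuated
    in proportion to its distance (inverse-distance law, assumed here).
    """
    a, b = nearest_pair(speakers, listener)
    da = math.dist(speakers[a], listener)
    db = math.dist(speakers[b], listener)
    far = max(da, db)
    if far == 0:
        return {a: 1.0, b: 1.0}
    return {a: da / far, b: db / far}
```

For a listener standing 1 m from one speaker and 3 m from the other, the nearer speaker is attenuated to one third of the level of the farther one, so that both contributions reach the listener at the same amplitude.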

The embodiment by learning could, for example, consist in a person who has a means of remote communication with the processing means of the sound system sending to the processing means, while moving around the listening place, a rating indicating whether or not he is satisfied with the listening quality at different positions in that place. A specific algorithm, for example of the neural network type, will then provide a model for optimizing the sound signals reproduced by the speakers as a function of the position of a person, so that the reproduced signal is satisfactory for that person wherever he is.
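The rating-based learning described above can be illustrated with a deliberately simple stand-in. Instead of the neural network mentioned in the description, the sketch below uses a nearest-neighbour lookup over the positions rated as satisfactory, which is a swapped-in technique chosen only for brevity. The class name `PreferenceModel`, its methods and the representation of speaker settings as a dictionary of gains in dB are all assumptions.

```python
import math


class PreferenceModel:
    """Stores settings rated satisfactory and recalls the nearest one."""

    def __init__(self):
        self._samples = []  # list of (position, settings) rated satisfactory

    def rate(self, position, settings, satisfied):
        """Record the listener's rating of the settings heard at a position."""
        if satisfied:
            self._samples.append((tuple(position), dict(settings)))

    def predict(self, position):
        """Return the settings of the closest satisfactory sample, or None."""
        if not self._samples:
            return None
        _, settings = min(self._samples, key=lambda s: math.dist(s[0], position))
        return dict(settings)
```

A model of this kind only reuses settings that were actually rated, whereas a trained regression model could interpolate between rated positions.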

An embodiment by sampling methods could typically proceed as follows: a person equipped with a microphone stands at different positions in the listening place, and sound signals are then emitted by the various speakers so that a target signal can be approximated as accurately as possible.

However, the system retains the possibility of filtering out certain people so that they do not benefit from the dynamic adaptation of the reproduction of the sound signal. In particular, such filtering could be applied according to the position of the person, for example to avoid any adaptation when the person moves within the listening place into an area poorly covered by the speakers.
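A position-based filter of this kind might look like the sketch below, which treats an area as poorly covered when fewer than a minimum number of speakers lie within a given range of the person. This coverage criterion, the function name `should_adapt` and the default thresholds are assumptions made for illustration; the description does not define how coverage is assessed.

```python
import math


def should_adapt(listener, speaker_positions, max_range_m=5.0, min_speakers=2):
    """Return True if enough speakers are within range to adapt for this listener."""
    in_range = sum(
        1 for pos in speaker_positions if math.dist(pos, listener) <= max_range_m
    )
    return in_range >= min_speakers
```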

According to a particular embodiment of the invention, the processing means implement a function for identifying the located person, so that the dynamic adaptation of the reproduction of the sound signal can also take into account preferences associated with the identified person. It is then possible, for example, to amplify or reduce a sound characteristic, such as the volume of the reproduced sound signal, for certain identified persons according to their preferences. The identification of the persons may be performed by image processing algorithms based on face tracking, implemented by the image analysis function of the processing means.
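As a minimal sketch of how preferences might be applied once a person has been identified, the example below adds a per-person volume offset to the gains already computed from the person's position. The function name `apply_preferences`, the `volume_offset_db` key and the structure of the preference table are hypothetical.

```python
def apply_preferences(base_gains_db, person_id, preferences):
    """Return speaker gains in dB shifted by the identified person's volume offset.

    base_gains_db maps speaker names to position-based gains in dB.
    preferences maps a person identifier to a dict of preference values.
    Unknown persons receive the base gains unchanged.
    """
    offset = preferences.get(person_id, {}).get("volume_offset_db", 0.0)
    return {name: gain + offset for name, gain in base_gains_db.items()}
```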

Claims

1. A method of configuring a sound system consisting of a plurality of loudspeakers (E1, ... En) provided to reproduce a sound signal (50) in a listening place for at least one person, including the real-time location (20) of the position of at least said person in said listening place, and the dynamic adaptation (30) of the reproduction of said sound signal on at least one previously located speaker, according to said position of the person, said method being characterized in that the dynamic adaptation of the reproduction of the sound signal is carried out further taking into account preferences associated with said person, determined after identification of said person.

2. Method according to claim 1, characterized in that the location of the position of a person in the listening place includes the continuous acquisition of images of the listening place by at least one camera (10) and the real-time analysis (20) of said images in order to recognize the presence of said person and to continuously locate his position in the listening place according to his position in the image.

3. Method according to claim 2, characterized in that the analysis of the acquired images consists of the application of image processing algorithms, to detect in real time a face in an image and to provide its position in the image.

4. Method according to claim 3, characterized in that the analysis of the acquired images comprises the detection of the orientation of the face with respect to the axis of the camera, the dynamic adaptation of the restitution of the sound signal being furthermore performed according to said orientation.

5. Method according to claim 2, 3 or 4, characterized in that the location (40) of the position of the speakers in the listening location comprises the prior determination of the position and orientation of each speaker with respect to the position and axis of the camera.

6. Method according to any one of the preceding claims, characterized in that the dynamic adaptation of the reproduction of the sound signal on at least one speaker consists in adapting the sound level reproduced by said speaker.

7. Method according to any one of the preceding claims, characterized in that the dynamic adaptation of the reproduction of the sound signal on at least one speaker consists in adapting the frequency spectrum of the sound signal reproduced by said speaker.

8. Method according to any one of the preceding claims, characterized in that the dynamic adaptation of the reproduction of the sound signal on at least one speaker consists in correcting the generation of a sound reproduction effect localized in space.

9. Method according to any one of the preceding claims, characterized in that the dynamic adaptation of the reproduction of the sound signal is performed by learning.

10. Sound system consisting of a plurality of loudspeakers provided to reproduce a sound signal within a listening place for at least one person, characterized in that it comprises processing means (60) for carrying out the method according to any one of claims 1 to 9.