WO2009133324A1 - Device and method for vocal reproduction with controlled multi-sensorial perception - Google Patents

Device and method for vocal reproduction with controlled multi-sensorial perception

Info

Publication number
WO2009133324A1
Authority
WO
WIPO (PCT)
Prior art keywords
signal
sound
user
signals
output
Prior art date
Application number
PCT/FR2009/000488
Other languages
French (fr)
Inventor
Jacques Feldmar
Maryvonne Zimmermann
Original Assignee
Jacques Feldmar
Maryvonne Zimmermann
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jacques Feldmar, Maryvonne Zimmermann
Priority to EP09738358A (EP2269183A1)
Publication of WO2009133324A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/06 Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B19/00 Teaching not covered by other main groups of this subclass
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B5/00 Electrically-operated educational appliances
    • G09B5/06 Electrically-operated educational appliances with both visual and audible presentation of the material to be studied

Definitions

  • the present invention relates to a device and a method of voice reproduction with controlled multi-sensory perception.
  • the present invention relates to the field of voice training for the reproduction of a reference sound.
  • This reference sound can be a note, a rhythm, a melody, a range or sound sequence to reproduce.
  • It relates more particularly to a method of voice reproduction of a reference sound by at least one user, said method comprising an input signal acquisition step, a step of processing said acquired input signals for the purpose of providing output signals comprising at least one piece of comparison information between an acquired input sound signal and the signal of said reference sound, and a step of multi-sensory perception by the user of at least one of said output signals to enable said user to reach said reference sound.
  • Such a method is intended to be used in particular for applications of learning singing, imitation and musical play, as well as for speech therapy purposes.
  • it can be implemented equally well by an amateur or by a professional singer, a speaker or an actor, without this being limiting. It allows the user to train to coordinate his vocal functions in order to reproduce a reference sound.
  • the vocal functions implemented include for example the control of the diaphragm and the muscles, the vibration of the vocal cords, the control of the articulators.
  • the articulators can be for example the lips, the jaw, the tongue, the soft palate, the uvula.
  • voice reproduction methods and devices use display means to provide a visual feedback to the user regarding the difference between the sound he has produced and the reference sound he wishes to produce.
  • In the field of karaoke, software exists with which a user can sing into a microphone to reproduce notes whose pitch and duration are displayed on a screen. At the end of each reproduction session, a similarity score is calculated and displayed on the screen.
  • US Patent 5,889,224 relates to the real-time evaluation of the vocal performance of a singer against a karaoke melody. For this, the singer's voice and the melody are detected separately. The signal of the singer's voice thus detected is sampled. The sampled data thus obtained are compared with the data of the reference sound to be produced to obtain differential data. These data are used to calculate a similarity score representing the degree of deviation of the singer's voice.
  • the disadvantage of such a method lies in the inability to exploit the solicitation of the different senses of the user to make it reach the reference sound gradually and intuitively.
  • the different senses are not exploited at the same time, whereas the brain is perfectly capable of integrating information coming from several senses at the same time.
  • a training method consists in causing the user to generate a vocal sound and to adjust the vocal sound of said user in order to achieve a targeted score by implementing sensory feedback means - or effectors.
  • sensory feedback means are selected from visual, auditory, tactile feedback means, or a combination thereof.
  • these feedback means indicate the difference between the user's vocal production and the targeted note, which allows the user to reduce this difference in real time by adjusting his sound output.
  • the multiplicity of sensory feedback means, as well as their difference in nature, makes it possible to exploit several of the user's senses at the same time. Because the brain integrates multiple pieces of information simultaneously, the user benefits from more information and can thus improve the intuitive adjustment of his vocal production.
  • this solution has the disadvantage of not making the best use of the user's different senses, the solicited senses not being stimulated optimally by the effectors.
  • the auditory and visual feedbacks do not sufficiently indicate the corrections to the voice and facial movements so that this information is integrated by the brain and that the adjustment is made more intuitive.
  • the aim of the present invention is to remedy this technical problem, by making it possible to optimally exploit the user's perception capacity as well as his state of performance.
  • the solution of the invention lies in the implementation of a system for controlling the level of perception by the user of the information provided by the output signals calculated from acquired input signals.
  • the solution's approach has been to look for ways to implement multi-sensory perception means to provide the brain with more relevant information that can be better integrated. It then became apparent that the use of a control system can make it possible to regulate the level of perception of the feedback information on the sound produced, more particularly by adjusting the comparison information between an acquired input signal related to the user's sound and the reference sound.
  • the subject of the invention is a voice reproduction method of the type mentioned above in which, in addition to the characteristics already mentioned, the level of multi-sensory perception of at least one output signal comprising at least one piece of comparison information between an acquired input sound signal and the reference sound signal is controlled by adjusting at least one parameter of a signal among said output signal, said acquired input signal and said signal of the reference sound, depending on the performance state of the user.
  • this process, consisting of the combination of a conventional method of vocal training and multi-sensory perception whose level is controlled, allows the user to optimize the adjustment of his production according to his perception.
  • the adjustment of at least one parameter of a signal among said output signal, said acquired input signal and said reference sound signal is performed automatically and dynamically.
  • the user can thus himself adjust the level of perception of the comparison between the sound that he has produced and the reference sound as a function, on the one hand, of his state of performance and, on the other hand, of his sensitivity of perception.
  • the adjustment of at least one parameter of a signal among said output signal, said acquired input signal and said reference sound signal is carried out by the user.
  • the user can thus have a real-time adjustment of the level of perception of the comparison between the sound he has produced and the reference sound according to his state of performance as well as, for example, the evolution of this state over time.
  • the use of this asymmetrical - or binaural - auditory perception is known in the field of speech therapy. It consists of exploiting the non-symmetrical role of the ears in order to spatially separate or locate the sounds. This type of hearing is suitable for blind people to transcribe the position of a cursor on a screen.
  • the left-right axis can be encoded by the relative loudness of the sound given by a pair of earphones in each ear.
  • the high-low axis can be encoded by the pitch (note, frequency) of this sound.
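As an illustration of this two-axis encoding, the following minimal sketch maps a cursor position to a pair of ear gains and a tone frequency. The function name `encode_cursor`, the normalized coordinates, the constant-power panning law and the 220 to 880 Hz range are assumptions chosen for the example, not values taken from the description.

```python
import math


def encode_cursor(x: float, y: float,
                  f_low: float = 220.0, f_high: float = 880.0):
    """Map a cursor position in [0, 1] x [0, 1] to (left_gain, right_gain, frequency_hz).

    x = 0 is the far left of the screen, x = 1 the far right.
    y = 0 is the bottom of the screen, y = 1 the top.
    """
    x = min(max(x, 0.0), 1.0)
    y = min(max(y, 0.0), 1.0)
    # Left-right axis: relative loudness in each ear (constant-power panning).
    angle = x * math.pi / 2
    left_gain = math.cos(angle)
    right_gain = math.sin(angle)
    # High-low axis: pitch, interpolated on a logarithmic (musical) scale.
    frequency_hz = f_low * (f_high / f_low) ** y
    return left_gain, right_gain, frequency_hz
```

With these assumed defaults, a cursor at the centre of the screen yields equal gains in both ears and a 440 Hz tone.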
  • Such a method combining the control of multi-sensory perception and asymmetrical auditory perception, provides the brain with information enabling the user to intuitively correct its production according to its binaural sound perception. According to a first mode of implementation of this auditory perception, it is expected that the output signals provided and audibly perceived come from the same sound source.
  • the output signals supplied and audibly perceived come from two sound sources spatially separated and arranged to provide each of the user's ear with one of the two different output signals in connection with the signal of said reference sound and an acquired input sound signal.
  • the output sound signals supplied to each ear of the user and perceived in an auditory manner consist of a combination of signals among at least one acquired input sound signal, the signal of the reference sound and an indicator of the difference between at least one acquired input sound signal and the signal of said reference sound, said indicator relating to at least one characteristic of said signals. It is thus possible to inject any kind of combination or information between at least one acquired input sound signal and the signal of the reference sound in order to provide the brain with relevant information that it can integrate in order to improve its production according to its perception.
  • the output sound signals supplied to each ear of the user and audibly perceived comprise respectively at least partly the signal of the reference sound and an acquired input sound signal.
  • the distribution of these signals between the two ears provided by the binaural sound perception then ensures that the user decreases this gap intuitively.
  • the output sound signals supplied to each ear of the user and audibly perceived are respectively the reference signal and an acquired input sound signal.
  • the output sound signals supplied to each ear of the user and perceived in an auditory manner are respectively the reference signal and the difference between an acquired input sound signal and said reference signal.
  • in a second embodiment of the invention, aimed at having the brain integrate the algebraic (signed) difference between the acquired input sound signal and the reference sound signal, provision is made for the amplitude of the output sound signals supplied to each ear of the user and audibly perceived to be adjusted as a function of the sign of the difference between an acquired input sound signal and said reference signal.
  • the output signals supplied and perceived in a visual manner are related to at least one acquired input sound signal and the signal of said reference sound.
  • This visual perception acts in addition to the binaural auditory perception, so that it is integrated by the brain in a complementary way to the sound signals provided.
  • the brain integrates the visual and auditory information simultaneously. This allows the user to adjust more optimally its production according to its perception.
  • the output signals supplied and perceived visually consist of a combination of signals from at least one acquired input sound signal, the sound signal.
  • the output signals provided and perceived in a visual manner are perceived through the display of a three-dimensional virtual correction face indicating the facial movements necessary to reproduce the reference sound.
  • This display provides the user with the algebraic (signed) difference between what has been produced and what should be produced. Since our senses expect coherent signals, the movements of a speaker's mouth should correspond to the sounds made. If a person sees lip movements that are incompatible with what they hear, they are disturbed. This incompatibility is thus used as visual information integrated by the brain so as to adjust the voice reproduction of the user.
  • the voice reproduction method operates in a closed loop.
  • This closed loop between production and sound perception dynamically adjusts the link between production and perception to arrive at the vocal reproduction result of the reference sound.
  • provision is made for a delay to be introduced at the level of the acquired input signals so as to synchronize said acquired input signals with the output signals supplied.
  • This allows combinations to be made between the synchronized input and output signals, so that the user can integrate in real time the gap between voice production and perception and adjust accordingly.
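The synchronization delay can be pictured as a simple sample delay line applied to the acquired input. The sketch below is only one possible realization; the class name `DelayLine` and the fixed delay expressed in samples are assumptions, since the description does not say how the delay is implemented.

```python
from collections import deque


class DelayLine:
    """Delay a stream of samples by a fixed number of samples (hypothetical helper)."""

    def __init__(self, delay_samples: int):
        # Pre-fill with silence so the first outputs are zeros.
        self._buffer = deque([0.0] * delay_samples)

    def process(self, sample: float) -> float:
        """Push one acquired input sample and return the delayed sample."""
        self._buffer.append(sample)
        return self._buffer.popleft()
```

Feeding the acquired input through such a line, with a delay equal to the processing latency, would keep it aligned with the output signals before the two are combined.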
  • the invention also relates to a device for voice reproduction of a reference sound by at least one user, comprising an input signal acquisition system, said input signals comprising at least one input sound signal, a processing system for said acquired input signals adapted to provide output signals comprising at least one piece of comparison information between an acquired input sound signal and the signal of said reference sound, and a multi-sensory perception system for said output signals provided, arranged to allow the user to reach said reference sound.
  • This device comprises a multisensory perception level control system of at least one output signal comprising at least one comparison information between an acquired input sound signal and the signal of said reference sound, said control system comprising means for adjusting at least one parameter of a signal among said output signal, said acquired input signal and said reference sound signal, depending on the performance state of the user.
  • This device consisting of the combination between a conventional device for vocal training and a means of multi-sensory perception and binaural sound, allows the user to optimize the adjustment of its production according to its perception.
  • it comprises means for recording and storing the acquired input signals and the output signals provided, with a view to establishing an indicator of the user's progress in the voice reproduction of the reference sound. This allows the user to follow the evolution of his vocal reproduction of the reference sound, and so to determine for himself the progress made.
  • the input signal acquisition system comprises means of perception selected at least among auditory, visual and tactile perception means, said perception means being arranged so as to provide the user with at least one output signal related to at least one acquired input sound signal and the signal of said reference sound.
  • FIG. 1 is a block diagram of a single-user voice reproduction device and method;
  • FIG. 2 is a diagram of a single-user voice reproduction device according to one embodiment of the invention;
  • FIG. 3 shows means for auditory perception of a single-user voice reproduction device according to one embodiment of the invention;
  • FIG. 4 shows means for visual perception of a single-user voice reproduction device according to one embodiment of the invention;
  • FIG. 5 is a block diagram of a multi-user voice reproduction device according to the present invention.
  • DETAILED PRESENTATION OF PARTICULAR EMBODIMENTS
  • the performance status of the user will be understood to be the level of reproduction of a reference sound reached by the user, that is to say the difference between the sound that the user has produced and the reference sound.
  • This performance state can be determined according to one or more parameters of the signals respectively of the sound that the user has produced and of the reference sound.
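As a hedged example of such a parameter, the gap between the produced sound and the reference sound can be expressed from their fundamental frequencies alone, in cents (hundredths of a semitone). Choosing pitch as the single parameter and the name `pitch_gap_cents` are assumptions made for this sketch; the description allows other parameters.

```python
import math


def pitch_gap_cents(produced_hz: float, reference_hz: float) -> float:
    """Signed pitch gap in cents: positive when the produced sound is too high."""
    return 1200.0 * math.log2(produced_hz / reference_hz)
```

A value of 0 means the reference pitch is reached, and a value of 100 means the user is one semitone sharp.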
  • Fig. 1 shows a block diagram of a voice reproduction device and method according to the present invention.
  • This device comprises an acquisition system 2, a processing system 3, a multi-sensory perception system 4 and a control system 5.
  • the acquisition system 2 allows the capture of a plurality of signals from the behavior of the user 1. It performs the acquisition of input signals, said input signals comprising at least one input sound signal. It comprises acquisition means, including sound acquisition means 21, movement means 22, breathing means 23, touch keys 24, and blowing means 25. These means consist respectively of a microphone, an accelerometer, an electrocardiograph, a keyboard and a spirometer.
  • these acquisition means comprise a plurality of microphones, a joystick, a steering wheel, a camera, a stereo-vision device, a mat, a vibration sensor, a pressure sensor, an electroencephalograph, a propeller sensor, an induction band or a telephone.
  • the multi-sensory perception system 4 makes it possible for the user 1 to feel the difference between the sound he has produced and the reference sound 6 that he wishes to reproduce, in order to help him reproduce said reference sound 6. For this purpose it receives the output signals supplied and transmits them to the user 1. It comprises sound perception means 41 and 42, visual perception means 43, tactile perception means 44 and vibrational perception means 45. These perception means consist respectively of earphones, a screen, a force-feedback steering wheel and a muscle stimulation electrode.
  • these perception means comprise a data display, a plurality of loudspeakers, a sound headset, a braille reading device, a robot or a winder.
  • the processing system 3 comprises processing means 31 of the input signals acquired so as to provide output signals.
  • the processes are operated so that these output signals comprise at least one piece of information for comparing an input sound signal acquired with the signal of the reference sound 6.
  • These processing means 31 may consist for example of a computer, a PDA, a DVD, a telephone.
  • the output signals calculated by the processing means 31 may in particular be acoustic indices, such as vocal cord tension, speech register, loudness, prosody or suprasegmental features, segmental features, vocal tone, supraglottic or glottic coordination, turbulent air movement, stochastic disturbances of vocal fold vibration, unsolicited vibrations of the ventricular folds or aryepiglottic folds, uncontrolled transitions or non-modal vibrations.
  • the processing system 3 also includes means 32 for recording and storing the acquired input signals and output signals. These means 32 make it possible to establish a progress indicator for the voice reproduction of the reference sound 6 by the user 1. This indicator can be used, for example, to provide the user with progress information as a function of time in the form of graphs, or to control the level of perception by automatic and dynamic adjustment of at least one parameter of a signal among said output signal, said acquired input signal and said signal of the reference sound 6, depending on the state of performance of the user 1.
  • the control system 5 controls the multi-sensory perception level of at least one output signal comprising at least one piece of comparison information between an acquired input sound signal and the signal of said reference sound. It comprises means for adjusting at least one parameter of a signal among said output signal, said acquired input signal and said reference sound signal 6, depending on the state of performance of the user 1.
  • these means are means for manually adjusting the level of perception, made up of elements that can be manipulated by the user 1, the latter thus being able to adjust the level of multi-sensory perception as a function of his state of performance.
  • these means are automatic and dynamic adjustment means, consisting of calculation elements able to determine the state of performance of the user and to deduce the corresponding level of perception. They can do this by integrating the information on the gap between the user's production and the sound to be reproduced over a wide time interval, which enables the user's performance state to be determined more precisely.
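One simple way to integrate gap information over a time interval is a moving average of the absolute gap. The sketch below assumes this averaging rule and a hypothetical `PerformanceTracker` class; the description does not specify the integration method or the window length.

```python
from collections import deque


class PerformanceTracker:
    """Estimate the performance state as the mean absolute gap over a sliding window."""

    def __init__(self, window: int = 50):
        # Only the most recent `window` gap measurements are kept.
        self._gaps = deque(maxlen=window)

    def update(self, gap: float) -> float:
        """Record one new gap measurement and return the current mean absolute gap."""
        self._gaps.append(abs(gap))
        return sum(self._gaps) / len(self._gaps)
```

A longer window smooths out momentary slips, at the cost of reacting more slowly when the user improves.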
  • This control system 5 thus makes it possible to adjust the level of multi-sensory perception as a function of the state of performance of the user. For example, for a novice user, the difference between the sound produced and the sound to be reproduced will be very large, and the control system 5 will then ensure a low dynamic level of perception so that the output signals carrying the difference information between the two sounds are not perceived as too disturbing. Conversely, in the case of an expert user, the difference between the sound produced and the sound to be reproduced will be very small, and the control system 5 will then provide a high dynamic so that the user can more precisely reach the sound to be reproduced.
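The novice and expert cases can be sketched as a gain applied to the perceived difference: small for large gaps, large for small gaps. The function name `perception_gain`, the gap thresholds in cents, the gain limits and the logarithmic interpolation are all assumptions made for illustration; the description only states the qualitative behavior.

```python
import math


def perception_gain(mean_gap_cents: float,
                    expert_gap: float = 10.0,
                    novice_gap: float = 200.0,
                    max_gain: float = 4.0,
                    min_gain: float = 0.25) -> float:
    """Return a gain for the difference signal, high for experts and low for novices."""
    # Clamp the measured gap to the assumed expert..novice range.
    gap = min(max(mean_gap_cents, expert_gap), novice_gap)
    # Position in the range on a log scale: 0 for an expert, 1 for a novice.
    position = math.log(gap / expert_gap) / math.log(novice_gap / expert_gap)
    # Interpolate geometrically from max_gain (expert) down to min_gain (novice).
    return max_gain * (min_gain / max_gain) ** position
```

Multiplying the difference information by this gain before it is rendered would attenuate it for a novice and exaggerate it for an expert.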
  • the manual control means may consist for example of a keyboard, a mouse, a mixer, a steering wheel or a joystick.
  • the automatic and dynamic control means may consist for example of a processor.
  • the transmission of signals between the acquisition system 2, the processing system 3, the perception system 4 and the control system 5 is provided by wire. According to other embodiments, this transmission is wireless or performed via a local or external network, for example of the Internet type.
  • the reference sound signal 6 is placed on a data storage medium in order to supply it to the processing system 3.
  • This medium may be for example a standard CD, a MIDI format file, or any other type of medium allowing the signal to be recorded.
  • the voice reproduction method operates in a closed loop.
  • the acquisition (I) of input signals is first performed by the acquisition means 21, 22, 23, 24 and 25 of the acquisition system 2.
  • the input signals comprise at least one sound signal corresponding to the sound produced by the user 1.
  • the acquired input signals are then processed (II) to provide (III) output signals comprising at least one comparison information between an acquired input sound signal and the signal of said reference sound 6.
  • the processing performed can consist of calculations or effects. These calculations are for example, without being limiting, the calculation of the fundamental frequency (pitch, note), the volume, the intensity, the rhythm, the dynamics (attack, sustain, release), the timbre, the nasality, vibrato, breath (veiled effect), articulation, averaging, history or progress indication of the user, discrimination of sound, classification of sounds, measurement of signal similarities, pose and motion analysis in images.
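Among these calculations, the fundamental frequency is commonly estimated by autocorrelation. The sketch below uses that generic technique, which the description does not name; the function name `estimate_f0` and the 80 to 1000 Hz search range are assumptions, and a practical system would need a more robust estimator.

```python
def estimate_f0(samples, sample_rate: float,
                f_min: float = 80.0, f_max: float = 1000.0):
    """Estimate the fundamental frequency in Hz of one frame, or None for silence."""
    # Candidate periods (lags in samples) that correspond to the search range.
    min_lag = max(1, int(sample_rate / f_max))
    max_lag = min(int(sample_rate / f_min), len(samples) - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Correlation of the frame with a copy of itself shifted by `lag` samples.
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None
```

The lag with the strongest self-similarity is taken as the period, so the estimate is quantized to whole-sample periods.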
  • These effects are for example pitch change, tempo change, music/speech separation, reverb, snapping to the correct note, octave shift.
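Two of these effects can be illustrated on frequency values rather than on audio. The sketch below assumes that correcting to the right note means snapping to the nearest note of the twelve-tone equal-tempered scale tuned to A4 = 440 Hz; the function names and that tuning are assumptions for the example.

```python
import math

A4_HZ = 440.0  # assumed tuning reference


def snap_to_nearest_note(frequency_hz: float) -> float:
    """Return the frequency of the nearest equal-tempered note."""
    semitones_from_a4 = round(12 * math.log2(frequency_hz / A4_HZ))
    return A4_HZ * 2 ** (semitones_from_a4 / 12)


def shift_octaves(frequency_hz: float, octaves: int) -> float:
    """Shift a frequency up (positive) or down (negative) by whole octaves."""
    return frequency_hz * 2 ** octaves
```

Applying these to an actual recording would additionally require a pitch-shifting algorithm, which is outside the scope of this sketch.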
  • At least one of the output signals is then perceived (IV) by the user 1 in a multi-sensory manner so as to enable said user 1 to reach the reference sound 6.
  • the signal of the reference sound 6 is supplied (V) to the processing system 3 before the acquisition of the input signals, so that the processing (II) takes into account both the input signals and the signal of the reference sound 6.
  • This voice reproduction method also comprises steps (VII), (VIII) and (IX) for controlling the multi-sensory perception level of at least one output signal comprising at least one piece of comparison information between an acquired input sound signal and the signal of the reference sound 6.
  • This control is achieved by adjusting at least one parameter of a signal among the output signal (VII), the acquired input signal (VIII) and the signal of the reference sound 6 (IX).
  • since the signal perceived by the user 1 is an output signal comprising comparison information between an acquired input sound signal and the signal of the reference sound 6, it is possible to adjust the output signal, the acquired input signal or the reference sound signal, or a combination of all three, so as to modify the dynamics of the difference between the sound produced and the sound to be reproduced.
  • This control is carried out manually by the user 1 according to the performance state that he determines himself and the level of perception that he wants, or automatically and dynamically depending on the state of the user's performance determined during the processing step (II).
  • the multi-sensory perception (IV) of output signals provided is performed using a combination of auditory, visual and tactile perception modes. According to other embodiments, it may be provided to use only two modes of perception among the three above.
  • these output signals are related to the signal of the reference sound 6 and an acquired input sound signal, and thus comprise comparison information between the reference sound 6 and the sound emitted by the user 1.
  • These signals are further constituted so as to provide the user 1 with two different output signals. This makes it possible to provide, by comparison of the two signals, perceptible information related to the difference between the sound produced and the sound to be reproduced.
  • these two output signals come from two spatially separated sound sources.
  • the two sources are arranged to provide each ear of the user 1 a different signal among the two signals. This can be achieved for example by using two earphones, each earphone being arranged against an ear and emitting a different signal.
  • these two output signals come from the same sound source.
  • the source then emits a single signal containing the two different signals.
  • the binaural hearing capability of the two user's ears is implemented so as to separate the two signals.
  • the output sound signals supplied to each ear of the user 1 are respectively the reference signal 6 and an acquired input sound signal. In this case, the user can directly perceive the difference between the sound produced and the sound to be reproduced.
  • the output sound signals provided to each ear of the user 1 are respectively the reference signal 6 and the difference between an acquired input sound signal and said reference signal 6.
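These two ways of feeding the ears can be sketched as the construction of two-channel frames from sample lists. The function name `binaural_frames`, the mode names and the choice of putting the reference signal on the first channel are assumptions; the description does not say which ear receives which signal in these variants.

```python
def binaural_frames(input_samples, reference_samples, mode: str = "direct"):
    """Build (first_ear, second_ear) sample pairs from the two signals.

    mode "direct":     first ear = reference, second ear = acquired input
    mode "difference": first ear = reference, second ear = input minus reference
    """
    if len(input_samples) != len(reference_samples):
        raise ValueError("signals must have the same length")
    if mode == "direct":
        second = list(input_samples)
    elif mode == "difference":
        second = [x - r for x, r in zip(input_samples, reference_samples)]
    else:
        raise ValueError("unknown mode: " + mode)
    return list(zip(reference_samples, second))
```

In "difference" mode the second channel falls silent once the user matches the reference exactly, which gives a direct audible target.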
  • one of the two output signals comprises an indicator relating to at least one characteristic of the input signals and the reference sound.
  • the auditory perception step comprises a sub-step of assigning the two signals according to the sign of the difference between the acquired input sound signal and the signal of the reference sound 6.
  • when the input sound is higher than the reference sound 6, the acquired input sound will be emitted into the left ear and the reference sound into the right ear, this assignment being inverted when the input sound is lower than the reference sound 6.
  • the perception of the sign of the difference is effected by adjusting the amplitude of the two signals according to the sign of this difference.
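Both ways of conveying the sign can be sketched as follows. The comparison is made here on fundamental frequencies, and the non-inverted assignment (input in the left ear) is assumed to apply when the input is at or above the reference, since the text states the inverted case explicitly. The function names and the gain values are hypothetical.

```python
def assign_ears(input_signal, reference_signal,
                input_f0_hz: float, reference_f0_hz: float) -> dict:
    """Swap the ear assignment according to the sign of the pitch difference."""
    if input_f0_hz >= reference_f0_hz:
        # Assumed default: input too high (or equal) -> input left, reference right.
        return {"left": input_signal, "right": reference_signal}
    # Input lower than the reference: the assignment is inverted.
    return {"left": reference_signal, "right": input_signal}


def sign_gains(input_f0_hz: float, reference_f0_hz: float,
               strong: float = 1.0, weak: float = 0.5):
    """Alternative: return (input_gain, reference_gain) chosen from the sign."""
    if input_f0_hz > reference_f0_hz:
        return strong, weak
    if input_f0_hz < reference_f0_hz:
        return weak, strong
    return strong, strong
```

Either function leaves the signals themselves untouched and only decides where, or how loudly, each one is rendered.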
  • among the output signals provided and perceived during the multi-sensory perception step (IV), at least one is perceived visually and another in a tactile manner. These output signals are also related to the signal of the reference sound 6 and an acquired input sound signal.
  • Figures 2 to 4 show diagrams of a voice reproduction device according to one embodiment of the invention.
  • the device comprises in this embodiment a microphone 50, a central unit 51, a display screen 52, a pair of earphones 53 and a steering wheel 54.
  • the microphone 50 realizes the acquisition of the sound emitted by the user 1 and the conversion of this emitted sound into an input signal.
  • the microphone 50 is connected to the central unit 51, which processes the input signals to obtain output signals and records the data (input and output signals) locally.
  • the central unit 51 is connected to the steering wheel 54.
  • This steering wheel allows the user to provide manual control of the level of perception of the difference between the sound produced (the sound acquired by the microphone 50) and the sound to be reproduced (the reference sound).
  • the user 1 turns the steering wheel 54 in one direction or the other so as to decrease or increase the dynamics of said level of multi-sensory perception. He can thus adjust his perception level himself according to his state of performance.
  • the steering wheel 54 is replaced by a control keyboard adapted to perform the same operations of adjusting the difference between the sound produced and the sound to be reproduced.
  • the central unit 51 is also connected to the multisensory perception means including the pair of headphones 53 and the display 52.
  • different output signals are transmitted to the earphones 53' and 53'' of the earphone pair 53.
  • the signal transmitted to the earphone 53' is the input signal picked up by the microphone and the signal transmitted to the earphone 53'' is the signal of the reference sound.
  • the signal transmitted to the earphone 53' is the signal of the reference sound and the signal transmitted to the earphone 53'' is the input signal picked up by the microphone. It is thus possible for the user 1 to perceive the sign of the difference between the two signals.
  • a three-dimensional virtual face of correction is displayed on the display screen 52.
  • This face indicates the movements of the face of the user 1 necessary for the reproduction of the reference sound.
  • the three-dimensional virtual face of correction indicates the type of correction to be made (labial, articulatory, etc.) and the curve indicates the difference between the sound produced and the sound to be reproduced, possibly with an indication of progress over time.
  • the display 52' relates to a labial correction, the display 52'' to an articulatory correction and the display 52''' to a vibratory correction by the breath.
  • the position of the body and breathing are indeed essential.
  • the position of the body can be acquired by a camera (or two cameras in the case of stereoscopy) combined with image processing.
  • software dedicated specifically to face analysis also provides information on the user's lips (in particular their opening, stretching and protrusion), on the opening of the jaw and on the height of the head relative to the shoulders.
  • induction bands may be used around the chest and abdomen, with a propeller sensor placed in front of the mouth.
  • the device also comprises tactile perception means in addition to the pair of headphones 53 and the display screen 52.
  • tactile perception means may for example consist of a muscle stimulation electrode.
  • the device and the method can be applied in multi-user applications.
  • FIG. 5 shows an example embodiment with two users (1, 1').
  • each user notably has means of acquisition and perception.
  • the processing means are shared via a connection to a local or external network such as the Internet.
  • the processing means may be specific to each user.
  • the reference sounds (6, 6') to be reproduced may be identical or different.

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Educational Administration (AREA)
  • Educational Technology (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Stereophonic System (AREA)

Abstract

The present invention relates to a method of vocal reproduction of a reference sound (6) by at least one user (1), said method comprising a step (I) of acquiring input signals, a step (II) of processing said acquired input signals with a view to providing (III) output signals comprising at least one cue relating to a comparison between an acquired input sound signal and the signal of said reference sound (6), and a step (IV) of multi-sensorial perception by the user (1) of at least one of said output signals with a view to allowing said user (1) to achieve said reference sound (6), characterized in that the level of multi-sensorial perception of at least one output signal comprising at least one cue relating to a comparison between an acquired input sound signal and the signal of said reference sound (6) is controlled by adjusting at least one parameter of a signal from among said output signal, said acquired input signal and said signal of the reference sound (6), as a function of the state of performance of the user (1). The present invention also relates to a device for vocal reproduction implementing such a method.

Description

VOICE REPRODUCTION DEVICE AND METHOD WITH CONTROLLED MULTI-SENSORY PERCEPTION
The present invention relates to a device and a method for voice reproduction with controlled multi-sensory perception.
TECHNICAL FIELD
The present invention relates to the field of voice training for the reproduction of a reference sound. This reference sound may be a note, a rhythm, a melody, a scale or a sound sequence to be reproduced.
It relates more particularly to a method of voice reproduction of a reference sound by at least one user, said method comprising a step of acquiring input signals, a step of processing said acquired input signals in order to provide output signals comprising at least one item of comparison information between an acquired input sound signal and the signal of said reference sound, and a step of multi-sensory perception by the user of at least one of said output signals in order to enable said user to reach said reference sound.
PRIOR ART
Such a method is intended in particular for applications in singing instruction, imitation and musical games, as well as for speech therapy. In this context, it can be used by an amateur as well as by a professional singer, a speaker or an actor, without limitation. It allows the user to practice harmonizing his or her vocal functions in order to reproduce a reference sound. The vocal functions involved include, for example, control of the diaphragm and muscles, vibration of the vocal cords and control of the articulators. The articulators may be, for example, the lips, the jaw, the tongue, the soft palate and the uvula.
The state of the art in this field includes methods and devices that draw on the physical and mental capacities of the user. Voice reproduction methods and devices accordingly use display means to give the user visual feedback on the difference between the sound produced and the reference sound the user wishes to produce. In the field of karaoke, software thus lets the user sing into a microphone to reproduce notes whose pitch and duration are displayed on a screen. At the end of each reproduction session, a similarity score is calculated and then displayed on the screen. Such a method is described in patent document US 5,889,224, which concerns the real-time evaluation of a singer's vocal performance against a karaoke-type melody without lyrics. To this end, the singer's voice and the melody are detected separately. The detected signal of the singer's voice is sampled. The sampled data thus obtained are compared with the data of the reference sound to be produced in order to obtain differential data. These data are used to calculate a similarity score representing the degree of deviation of the singer's voice.
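As a rough illustration of the comparison just described, the following Python sketch computes differential data and a similarity score from two equal-length sequences of pitch samples. The function names, the use of semitone values and the `tolerance` parameter are assumptions made for this illustration; they are not taken from US 5,889,224.

```python
from typing import List, Sequence


def differential_data(produced: Sequence[float],
                      reference: Sequence[float]) -> List[float]:
    """Sample-by-sample signed difference (produced - reference).

    Both sequences are assumed to hold pitch values in semitones,
    sampled at the same instants.
    """
    if len(produced) != len(reference):
        raise ValueError("sequences must have the same length")
    return [p - r for p, r in zip(produced, reference)]


def similarity_score(diffs: Sequence[float], tolerance: float = 1.0) -> float:
    """Score between 0 and 100 derived from the mean absolute deviation.

    100 means no deviation; 0 means the mean absolute deviation is at
    least `tolerance` semitones (a hypothetical scoring rule).
    """
    if not diffs:
        return 0.0
    mean_abs = sum(abs(d) for d in diffs) / len(diffs)
    return 100.0 * max(0.0, 1.0 - mean_abs / tolerance)
```

For instance, a singer who is one semitone sharp on one of two sampled notes would obtain a score of 50 with the default tolerance.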
The drawback of such a method lies in its inability to engage the user's different senses so as to lead the user to the reference sound progressively and intuitively. In particular, the different senses are not engaged at the same time, even though the brain is perfectly capable of integrating information coming from several senses simultaneously.
A solution that makes use of several of the user's senses is described in patent document US 2004/0194610. In that document, a training method consists in having the user generate a vocal sound and adjust that vocal sound in order to reach a target note, by means of sensory feedback means, or effectors. These sensory feedback means are chosen from visual, auditory and tactile feedback means, or a combination thereof. They indicate the difference between the vocal sound produced and the target note, which allows the user to reduce this difference in real time by adjusting his or her sound production. The multiplicity of the sensory feedback means, as well as their differing nature, makes it possible to engage several of the user's senses at once. Because the brain integrates multiple pieces of information at the same time, it thereby receives more information and can improve the intuitive adjustment of the user's vocal production.
Nevertheless, this solution has the drawback of not making the best use of the user's different senses, since the senses involved are not stimulated optimally by the effectors. In particular, it is not possible to take into account the perceptual capacities of the user's different senses so as to provide the user with feedback on his or her production that depends on his or her state of performance. To a lesser extent, the auditory and visual feedback does not indicate clearly enough the corrections to be made to the voice and to the facial movements for this information to be integrated by the brain and for the adjustment to become more intuitive.
Thus, no prior-art solution provides a method or a device for the voice reproduction of a reference sound by a user that makes optimal use of the user's physical and mental capacities, so as to allow the user to adjust intuitively the sound produced with respect to the reference sound to be produced.
OBJECT OF THE INVENTION
The aim of the present invention is to remedy this technical problem by making optimal use of the user's perceptual capacity as well as the user's state of performance. The solution of the invention lies in the implementation of a system for controlling the user's level of perception of the information provided by the output signals calculated from the acquired input signals.
The approach to the solution consisted in looking for ways of implementing the multi-sensory perception means so as to provide the brain with information that is more relevant and more readily integrated. It then became apparent that a control system can make it possible to set the level of perception of the feedback information on the sound produced, more particularly by adjusting the comparison information between an acquired input signal relating to the sound emitted by the user and the reference sound.
To this end, the subject of the invention is a voice reproduction method of the type mentioned above in which, in addition to the features already mentioned, the level of multi-sensory perception of at least one output signal comprising at least one item of comparison information between an acquired input sound signal and the signal of the reference sound is controlled by adjusting at least one parameter of a signal from among the output signal, said acquired input signal and said signal of the reference sound, as a function of the user's state of performance.
This allows the user to control the perception of the voice reproduction personally, in particular the level of difficulty of the voice reproduction, the intensity of the output signals perceived via the perception means, and the options for processing the acquired input signals.
Thus this method, which combines a conventional voice training method with a multi-sensory perception whose level is controlled, allows the user to optimize the adjustment of his or her production as a function of his or her perception.
Advantageously, the adjustment of at least one parameter of a signal from among said output signal, said acquired input signal and said signal of the reference sound is carried out automatically and dynamically. The user can thus personally set the level of perception of the comparison between the sound produced and the reference sound as a function, on the one hand, of his or her state of performance and, on the other hand, of his or her perceptual sensitivity.
Advantageously, the adjustment of at least one parameter of a signal from among said output signal, said acquired input signal and said signal of the reference sound is carried out by the user. The user can thus have a real-time adjustment of the level of perception and of the comparison between the sound produced and the reference sound as a function of his or her state of performance and, for example, of how that state evolves over time.
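The two adjustment modes described above (automatic, or set by the user) can be sketched as follows in Python. This is only one possible reading, in which the adjusted parameter is a gain applied to the signed pitch difference. The names `scaled_cue` and `automatic_gain`, the semitone scale, the clipping range and all default values are assumptions made for this illustration.

```python
import math
from typing import Sequence


def scaled_cue(produced_hz: float, reference_hz: float, gain: float) -> float:
    """Signed comparison cue clipped to [-1, 1].

    `gain` is the adjustable parameter: the user may set it directly
    (manual mode) or it may come from `automatic_gain` (automatic mode).
    """
    semitones = 12.0 * math.log2(produced_hz / reference_hz)
    return max(-1.0, min(1.0, gain * semitones))


def automatic_gain(recent_abs_errors: Sequence[float],
                   target_cue: float = 0.5,
                   min_gain: float = 0.05,
                   max_gain: float = 4.0) -> float:
    """Hypothetical rule: choose the gain so that the user's typical
    recent error (in semitones) is rendered at `target_cue`.

    Large errors (early practice) lower the gain; small errors raise it,
    so that fine deviations remain perceptible.
    """
    if not recent_abs_errors:
        return 1.0
    typical = sum(recent_abs_errors) / len(recent_abs_errors)
    if typical == 0.0:
        return max_gain
    return max(min_gain, min(max_gain, target_cue / typical))
```

In manual mode the value passed as `gain` would simply follow the position of the user's control; in automatic mode it would be recomputed from recent errors as the session progresses.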
In one embodiment, which makes use of the non-symmetrical role of the two ears as well as the brain's integration of information coming from the different senses, during the multi-sensory perception step at least some of said output signals provided are perceived aurally, said output signals provided and perceived aurally being constituted so as to provide the user with two different output signals related to the signal of said reference sound and an acquired input sound signal.
The use of this asymmetric, or binaural, auditory perception is known in the field of speech therapy. It consists in exploiting the non-symmetrical role of the ears in order to separate sounds or locate them in space. This type of listening is suitable for blind people, to convey the position of a cursor on a screen. The left-right axis can be encoded by the relative intensity of the sound delivered to each ear by a pair of earphones. The up-down axis can be encoded by the pitch (note, frequency) of this sound.
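The cursor encoding mentioned above can be sketched in Python as follows. The normalized coordinates, the linear panning law, the geometric pitch interpolation and the frequency bounds are all assumptions chosen for this illustration; the text only states that relative intensity encodes the left-right axis and pitch encodes the up-down axis.

```python
from typing import Tuple


def encode_cursor(x: float, y: float,
                  f_low: float = 220.0,
                  f_high: float = 880.0) -> Tuple[float, float, float]:
    """Map a cursor position to (left_gain, right_gain, frequency_hz).

    x: 0.0 (far left) to 1.0 (far right), encoded by relative intensity.
    y: 0.0 (bottom) to 1.0 (top), encoded by pitch.
    """
    if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
        raise ValueError("x and y must lie in [0, 1]")
    left_gain = 1.0 - x   # louder in the left ear when the cursor is left
    right_gain = x        # louder in the right ear when the cursor is right
    # Geometric interpolation so that equal steps in y sound like
    # equal musical intervals.
    frequency_hz = f_low * (f_high / f_low) ** y
    return left_gain, right_gain, frequency_hz
```

A cursor in the bottom-left corner would then be heard only in the left ear at the lowest pitch, and a cursor in the top-right corner only in the right ear at the highest pitch.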
Such a method, combining control of the multi-sensory perception with asymmetric auditory perception, provides the brain with information that allows the user to correct his or her production intuitively as a function of his or her binaural sound perception.
According to a first mode of implementation of this auditory perception, the output signals provided and perceived aurally come from one and the same sound source.
According to a second mode of implementation of this auditory perception, the output signals provided and perceived aurally come from two sound sources that are spatially separated and arranged so as to deliver to each of the user's ears one of the two different output signals related to the signal of said reference sound and an acquired input sound signal.
Preferably, the output sound signals delivered to each of the user's ears and perceived aurally consist of a combination of signals from among at least one acquired input sound signal, the signal of the reference sound and an indicator of the difference between at least one acquired input sound signal and the signal of said reference sound, said indicator relating to at least one characteristic of said signals. It is thus possible to inject any kind of combination or information relating at least one acquired input sound signal to the signal of the reference sound, in order to provide the brain with relevant information that it can integrate so as to improve the user's production as a function of his or her perception.
According to a particular embodiment, the output sound signals delivered to each of the user's ears and perceived aurally comprise respectively, at least in part, the signal of the reference sound and an acquired input sound signal. The distribution of these signals between the two ears provided by binaural sound perception then ensures that the user reduces the difference between them intuitively.
According to a particular embodiment, the output sound signals delivered to each of the user's ears and perceived aurally are respectively the reference signal and an acquired input sound signal.
According to a particular embodiment, the output sound signals delivered to each of the user's ears and perceived aurally are respectively the reference signal and the difference between an acquired input sound signal and said reference signal.
In a first mode of implementation of the invention, intended to have the brain integrate the algebraic (signed) difference between the acquired input sound signal and the reference sound signal, the assignment of the output sound signals delivered to each of the user's ears and perceived aurally is a function of the sign of the difference between an acquired input sound signal and said reference signal.
In a second mode of implementation of the invention, intended to have the brain integrate the algebraic (signed) difference between the acquired input sound signal and the reference sound signal, the amplitude of the output sound signals delivered to each of the user's ears and perceived aurally is a function of the sign of the difference between an acquired input sound signal and said reference signal.
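The two sign-dependent modes above can be sketched in Python as follows. The text does not specify which ear corresponds to which sign, nor on which characteristic the difference is taken; the pitch comparison, the left/right convention and the gain values below are assumptions made for this illustration.

```python
from typing import Tuple


def assign_ears_by_sign(produced_hz: float,
                        reference_hz: float) -> Tuple[str, str]:
    """First mode: which signal goes to the (left, right) ear.

    Assumed convention: when the produced pitch is below the reference,
    the user's own voice is sent to the left ear and the reference to
    the right ear; otherwise the assignment is swapped.
    """
    if produced_hz - reference_hz < 0:
        return ("input", "reference")
    return ("reference", "input")


def ear_gains_by_sign(produced_hz: float, reference_hz: float,
                      base: float = 0.5,
                      boost: float = 1.0) -> Tuple[float, float]:
    """Second mode: (left_gain, right_gain) chosen from the sign.

    Assumed convention: the left ear is boosted when the produced pitch
    is too low, the right ear when it is too high, and both ears stay
    at `base` when the pitches match.
    """
    diff = produced_hz - reference_hz
    if diff < 0:
        return (boost, base)
    if diff > 0:
        return (base, boost)
    return (base, base)
```

Either function would be evaluated continuously on the acquired signal, so that the side on which the user hears the change indicates whether to raise or lower the voice.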
In one embodiment intended to improve the multi-sensory perception of the user's sound production, during the multi-sensory perception step at least some of said output signals provided are perceived visually, said output signals provided and perceived visually being related to at least one acquired input sound signal and the signal of said reference sound. This visual perception acts in addition to the binaural auditory perception, so as to be integrated by the brain as a complement to the sound signals provided. The brain does in fact integrate visual and auditory information simultaneously. This therefore allows the user to adjust his or her production more effectively as a function of his or her perception.
In a preferred embodiment implementing visual perception, the output signals provided and perceived visually consist of a combination of signals from among at least one acquired input sound signal, the signal of the reference sound and an indicator of the difference between at least one acquired input sound signal and the signal of said reference sound, said indicator relating to at least one characteristic of said signals. It is thus possible to show the user any kind of combination or information relating at least one acquired input sound signal to the signal of the reference sound, in order to provide the brain with relevant information.
In another preferred embodiment implementing visual perception, the output signals provided and perceived visually are perceived through the display of a three-dimensional virtual face of correction indicating the movements of the user's face that are necessary to reproduce the reference sound. This display provides the user with the algebraic (signed) difference between what has been produced and what should be produced. Since our senses expect consistent signals, a speaker's mouth movements should match the sounds emitted. A person who sees lip movements that are incompatible with what he or she hears is disturbed. This incompatibility is thus used as visual information integrated by the brain so as to adjust the user's voice reproduction.
In an embodiment intended to improve the multi-sensory perception of the user's sound production, during the multi-sensory perception step at least some of said output signals supplied are perceived in a tactile manner, said output signals supplied and perceived in a tactile manner being related to at least one acquired input sound signal and the signal of said reference sound. This tactile perception acts in addition to the binaural sound perception, and possibly also to the visual perception, so that the brain integrates it as a complement to the sound signals provided.
Preferably, the vocal reproduction method operates in a closed loop. This closed loop between sound production and sound perception makes it possible to adjust dynamically the link between production and perception in order to achieve vocal reproduction of the reference sound.
Advantageously, a delay is introduced on the acquired input signals so as to synchronize said acquired input signals with the output signals supplied. This makes it possible to combine synchronized input and output signals, so that the user can take in, in real time, the gap between their vocal production and their perception and adjust it in real time.
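By way of illustration only, the delay described above can be sketched as a fixed delay line applied to the acquired input samples. The names `DelayLine`, `synchronize` and `latency_samples` are hypothetical and do not come from the present description, and the latency of the processing chain is assumed to be known and constant.

```python
from collections import deque


class DelayLine:
    """Delays a sample stream by a fixed number of samples (hypothetical helper)."""

    def __init__(self, delay_samples, fill=0.0):
        # Pre-fill the buffer so the first outputs are `fill` values.
        self._buffer = deque([fill] * delay_samples)

    def push(self, sample):
        # Store the newest sample and release the oldest one.
        self._buffer.append(sample)
        return self._buffer.popleft()


def synchronize(acquired, latency_samples):
    """Delay the acquired input so it lines up with output signals that
    lag by `latency_samples` samples (assumed known and constant)."""
    line = DelayLine(latency_samples)
    return [line.push(sample) for sample in acquired]
```

In this sketch the first `latency_samples` outputs are silence, after which each acquired sample emerges aligned with the output signal computed from it.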
The invention also relates to a device for vocal reproduction of a reference sound by at least one user, comprising a system for acquiring input signals, said input signals comprising at least one input sound signal, a system for processing said acquired input signals that is able to supply output signals comprising at least one piece of information comparing an acquired input sound signal with the signal of said reference sound, and a system for multi-sensory perception of said output signals supplied, arranged so as to allow the user to reach said reference sound. This device comprises a system for controlling the level of multi-sensory perception of at least one output signal comprising at least one piece of comparison information between an acquired input sound signal and the signal of said reference sound, said control system comprising means for adjusting at least one parameter of a signal from among said output signal, said acquired input signal and said signal of the reference sound, depending on the performance state of the user. This device, which combines a conventional vocal training device with a means of multi-sensory and binaural sound perception, allows the user to optimize the adjustment of their production according to their perception. According to another preferred embodiment of this device, it comprises means for recording and storing the acquired input signals and the output signals supplied, with a view to establishing an indicator of the user's progress in vocally reproducing the reference sound.
This allows the user to follow the evolution of their vocal reproduction of the reference sound, so that they can themselves determine the progress made.
According to another preferred embodiment of this device, the input signal acquisition system comprises perception means from among at least auditory, visual and tactile perception means, said perception means being arranged so as to provide the user with at least one output signal related to at least one acquired input sound signal and the signal of said reference sound.
BRIEF DESCRIPTION OF THE DRAWINGS
The invention will be better understood on reading the detailed description of a non-limiting example embodiment, accompanied by figures which show respectively:
- Figure 1, a block diagram of a single-user vocal reproduction device and method according to the present invention,
- Figure 2, a diagram of a single-user vocal reproduction device according to one embodiment of the invention,
- Figure 3, auditory perception means of a single-user vocal reproduction device according to one embodiment of the invention,
- Figure 4, visual perception means of a single-user vocal reproduction device according to one embodiment of the invention, and
- Figure 5, a block diagram of a multi-user vocal reproduction device according to the present invention.

DETAILED DESCRIPTION OF PARTICULAR EMBODIMENTS
In this patent, the performance state of the user means the level of reproduction of a reference sound reached by the user, that is to say the gap between the sound that the user has produced and the reference sound. This state can be determined from one or more parameters of the signals of, respectively, the sound that the user has produced and the reference sound.
Figure 1 shows a block diagram of a vocal reproduction device and method according to the present invention. This device comprises an acquisition system 2, a processing system 3, a multi-sensory perception system 4 and a control system 5.
The acquisition system 2 captures a plurality of signals arising from the behavior of the user 1. It acquires input signals, said input signals comprising at least one input sound signal. It comprises acquisition means including sound acquisition means 21, movement acquisition means 22, respiration acquisition means 23, tactile acquisition means 24 and breath acquisition means 25. These means consist respectively of a microphone, an accelerometer, an electrocardiograph, a keyboard and a spirometer.
In other embodiments, these acquisition means comprise a plurality of microphones, a joystick, a steering wheel, a camera, a stereo-vision device, a mat, a vibration sensor, a pressure sensor, an electroencephalograph, a propeller, an induction band or a telephone.
The multi-sensory perception system 4 makes the user 1 feel the gap between the sound they have produced and the reference sound 6 that they wish to reproduce, in order to help them reproduce said reference sound 6. To this end it receives the output signals supplied and transmits them to the user 1. It comprises sound perception means 41 and 42, visual perception means 43, tactile perception means 44 and vibrational perception means 45. These perception means consist respectively of earphones, a screen, a force-feedback steering wheel and a muscle stimulation electrode.
In other embodiments, these perception means comprise a data display, a plurality of loudspeakers, a headset, a braille reading device, a robot or a winder.
The processing system 3 comprises means 31 for processing the acquired input signals so as to supply output signals. The processing is carried out so that these output signals comprise at least one piece of information comparing an acquired input sound signal with the signal of the reference sound 6.
These processing means 31 may consist, for example, of a computer, a PDA, a DVD or a telephone.
The output signals calculated by the processing means 31 may in particular be acoustic indices, such as vocal cord tension, speech register, loudness, prosody or suprasegmental features, segmental features, vocal timbre, supraglottic or glottic coordination, turbulent air movement, stochastic perturbations of vocal fold vibration, unsolicited vibrations of the ventricular folds or aryepiglottic folds, uncontrolled transitions or non-modal vibrations.
The processing system 3 also comprises means 32 for recording and storing the acquired input signals and the output signals supplied. These means 32 make it possible to establish an indicator of the progress of the user 1 in vocally reproducing the reference sound 6. This indicator can be used, for example, to provide the user with information on progress over time in the form of graphs, or to control the level of perception through the automatic and dynamic adjustment of at least one parameter of a signal from among said output signal, said acquired input signal and said signal of the reference sound 6, depending on the performance state of the user 1.
The control system 5 controls the level of multi-sensory perception of at least one output signal comprising at least one piece of comparison information between an acquired input sound signal and the signal of said reference sound 6. To this end it comprises means for adjusting at least one parameter of a signal from among said output signal, said acquired input signal and said signal of the reference sound 6, depending on the performance state of the user 1.
According to a first embodiment, these means are means for manually adjusting the level of perception, consisting of elements that can be manipulated by the user 1, who can thus set the level of multi-sensory perception according to their performance state. According to a second embodiment, these means are automatic and dynamic adjustment means, consisting of computing elements able to determine the performance state of the user and to deduce from it the corresponding level of perception. To do so, they can integrate information on the gap between the sound produced and the sound to be reproduced over a long time interval, which allows the performance state of the user to be determined more precisely.
This control system 5 thus makes it possible to set the level of multi-sensory perception according to the performance state of the user. For example, for a novice user, the gap between the sound produced and the sound to be reproduced will be very large, and the control system 5 will then provide a low dynamic range for the level of perception, so that the output signals carrying the information on the difference between the two sounds are not perceived too adversely. Conversely, for an expert user, the gap between the sound produced and the sound to be reproduced will be very small, and the control system 5 will then provide a high dynamic range so that the user can approach the sound to be reproduced more finely. The manual control means may consist, for example, of a keyboard, a mouse, a mixing console, a steering wheel or a joystick. The automatic and dynamic control means may consist, for example, of a processor.
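As a non-limiting sketch of the automatic and dynamic adjustment described above, the performance state may be taken as the mean absolute gap over a time window and mapped to a perception gain that is low for large gaps and high for small gaps. The function names, the linear mapping, the gain bounds and the 12-unit `max_gap` (for instance, semitones) are all assumptions made for illustration; the description does not specify them.

```python
def performance_state(gaps):
    """Mean absolute gap between produced and reference sound over a window.
    A small value corresponds to an expert user, a large value to a novice."""
    if not gaps:
        raise ValueError("at least one gap measurement is required")
    return sum(abs(g) for g in gaps) / len(gaps)


def perception_gain(state, min_gain=0.2, max_gain=1.0, max_gap=12.0):
    """Map the performance state to a gain (assumed linear mapping):
    large gap -> low dynamics, small gap -> high dynamics."""
    ratio = min(max(state / max_gap, 0.0), 1.0)
    return max_gain - (max_gain - min_gain) * ratio


def perceived_difference(gap, gain):
    """Scale the difference information presented to the user."""
    return gain * gap
```

Averaging over a longer window makes the estimated performance state less sensitive to momentary errors, in line with the integration over a long time interval mentioned above.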
Signals are transmitted between the acquisition 2, processing 3, perception 4 and control 5 systems by wire. According to other embodiments, this transmission is wireless or takes place via a local or external network, for example the Internet.
The signal of the reference sound 6 is held on a data storage medium so that it can be supplied to the processing system 3. This medium may be, for example, a standard CD, a file in MIDI format, or any other type of medium on which the signal can be recorded.
Still with reference to Figure 1, the vocal reproduction method operates in a closed loop. According to this method, input signals are first acquired (I) by the acquisition means 21, 22, 23, 24 and 25 of the acquisition system 2. The input signals comprise at least one sound signal corresponding to the sound produced by the user 1.
The acquired input signals are then processed (II) in order to supply (III) output signals comprising at least one piece of comparison information between an acquired input sound signal and the signal of said reference sound 6.
The processing carried out may consist of calculations or effects. These calculations are, for example and without limitation, the calculation of the fundamental frequency (pitch, note), volume, intensity, rhythm, dynamics (attack, sustain, release), timbre, nasality, vibrato, breathiness (veiled effect) and articulation, the calculation of averages, of the history or of an indication of the user's progress, sound discrimination, sound classification, the measurement of signal similarity, and the analysis of pose and movement in images. These effects are, for example, pitch shifting, tempo change, music/speech separation, reverberation, snapping to the correct note and octave shifting.
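As one possible illustration of the fundamental-frequency calculation listed above, the following sketch estimates pitch by picking the lag that maximizes the raw autocorrelation of a short frame, then expresses the gap to a reference pitch in semitones. Autocorrelation is a technique chosen here for illustration and is not stated in the description; `estimate_f0`, `semitone_gap` and the 80 to 1000 Hz search range are assumptions.

```python
import math


def estimate_f0(samples, sample_rate, f_min=80.0, f_max=1000.0):
    """Estimate the fundamental frequency (Hz) of a frame by locating the
    lag with the highest raw autocorrelation. Returns None if no lag scores
    above zero (for example, a silent frame)."""
    n = len(samples)
    lag_min = max(1, int(sample_rate / f_max))
    lag_max = min(int(sample_rate / f_min), n - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        score = sum(samples[i] * samples[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None


def semitone_gap(f_user, f_reference):
    """Signed gap in semitones; positive when the user is above the reference."""
    return 12.0 * math.log2(f_user / f_reference)
```

For example, a 0.1 s frame of a pure 200 Hz tone sampled at 8000 Hz has a period of 40 samples, so the autocorrelation peaks at lag 40. Real voices would need a more robust estimator than this sketch.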
At least one of the output signals is then perceived (IV) by the user 1 in a multi-sensory manner so as to allow said user 1 to reach the reference sound 6.
The signal of the reference sound 6 is supplied (V) to the processing system 3 before the input signals are acquired, so that the processing (II) takes account of both the input signals and the signal of the reference sound 6.
This vocal reproduction method also comprises steps (VII), (VIII) and (IX) of controlling the level of multi-sensory perception of at least one output signal comprising at least one piece of comparison information between an acquired input sound signal and the signal of the reference sound 6. This control is achieved by adjusting at least one parameter of a signal from among the output signal (VII), the acquired input signal (VIII) and the signal of the reference sound 6 (IX).
Since the signal perceived by the user 1 is an output signal comprising comparison information between an acquired input sound signal and the signal of the reference sound 6, it is possible to adjust the output signal, the acquired input signal or the signal of the reference sound, or a combination of the three, so as to modify the dynamic range of the gap between the sound produced and the sound to be reproduced.
This control is carried out either manually by the user 1, according to the performance state that they determine themselves and the level of perception that they wish to have, or automatically and dynamically, according to the performance state of the user determined during the processing step (II).
In this preferred embodiment, the multi-sensory perception (IV) of the output signals supplied takes place through a combination of the auditory, visual and tactile modes of perception. According to other embodiments, only two of these three modes of perception may be used.
Among the output signals supplied and perceived, two are perceived in an auditory manner. These output signals are related to the signal of the reference sound 6 and to an acquired input sound signal, and thus comprise comparison information between the reference sound 6 and the sound emitted by the user 1.
These signals are furthermore formed so as to provide the user 1 with two different output signals. By comparing the two signals, the user thereby receives perceptible information related to the difference between the sound produced and the sound to be reproduced.
In the embodiment adopted here, these two output signals come from two spatially separated sound sources. The two sources are arranged to supply each ear of the user 1 with a different one of the two signals. This can be achieved, for example, by using two earphones, each earphone being placed against one ear and emitting a different signal.
According to another embodiment, these two output signals come from a single sound source. The source then emits a single signal containing the two different signals. In this case, the binaural hearing capability of the user's two ears is used to separate the two signals.
In order to provide the user with the comparison information between the sound produced and the sound to be reproduced, the output sound signals supplied to each ear of the user 1 are respectively the reference signal 6 and an acquired input sound signal. In this case, the user can directly perceive the difference between the sound produced and the sound to be reproduced.
In another embodiment, the output sound signals supplied to each ear of the user 1 are respectively the reference signal 6 and the difference between an acquired input sound signal and said reference signal 6.
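A minimal sketch of this embodiment, assuming that both signals are already synchronized, sampled at the same rate and of equal length, builds stereo frames in which one channel carries the reference and the other the sample-wise difference. The function name and the choice of which ear receives which signal are assumptions for illustration only.

```python
def reference_and_difference_frames(acquired, reference):
    """Return a list of (left, right) samples where the left channel is the
    reference signal and the right channel is acquired minus reference.
    The left/right choice is an assumption; the text only says 'each ear'."""
    if len(acquired) != len(reference):
        raise ValueError("signals must be synchronized and of equal length")
    difference = [a - r for a, r in zip(acquired, reference)]
    return list(zip(reference, difference))
```

When the user reproduces the reference exactly, the difference channel falls silent in this sketch, which gives a direct audible cue of success.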
In another embodiment, one of the two output signals comprises an indicator relating to at least one characteristic of the input signals and of the reference sound.
De sorte à faire percevoir le signe de la différence entre le son produit et le son à reproduire, l'étape de perception auditive comprend une sous-étape d'affectation des deux signaux en fonction du signe de la différence entre le signal sonore d'entrée acquis et le signal du son de référence 6. Ainsi, dans le cas où le son d'entrée est supérieur au son de référence, le son d'entrée acquis sera émis dans l'oreille gauche et le son de référence dans l'oreille droite, cette affectation étant inversée lorsque le son d'entrée est inférieur au son de référence 6.In order to perceive the sign of the difference between the sound produced and the sound to be reproduced, the auditory perception step comprises a sub-step of assigning the two signals according to the sign of the difference between the sound signal of acquired input and the signal of the reference sound 6. Thus, in the case where the input sound is greater than the reference sound, the acquired input sound will be emitted into the left ear and the reference sound into the right ear, this assignment being inverted when the input sound is lower than the reference sound 6.
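The assignment sub-step described above could be sketched as follows. This is a minimal sketch under stated assumptions: the comparison uses the root-mean-square amplitude of a block of samples (the description only says the input sound is "greater" or "lower" than the reference), the function names are hypothetical, and the case of equal amplitudes, which the description does not address, is treated here like the "lower" case.

```python
import math
from typing import Sequence, Tuple


def rms(signal: Sequence[float]) -> float:
    """Root-mean-square amplitude of a block of samples (0.0 if empty)."""
    if not signal:
        return 0.0
    return math.sqrt(sum(s * s for s in signal) / len(signal))


def assign_ears(
    acquired: Sequence[float], reference: Sequence[float]
) -> Tuple[Sequence[float], Sequence[float]]:
    """Return the (left, right) channel signals.

    Input louder than reference: acquired goes left, reference goes right.
    Otherwise the assignment is reversed (equality handled as an assumption).
    """
    if rms(acquired) > rms(reference):
        return acquired, reference
    return reference, acquired
```

With this rule, the side on which the user hears his or her own voice indicates whether the production is above or below the reference.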
In another embodiment, the sign of the difference is made perceptible by adjusting the amplitude of the two signals according to the sign of this difference.
Among the output signals provided and perceived during the multi-sensory perception step (IV), at least one is perceived visually and another by touch. These output signals are likewise related to the signal of the reference sound 6 and to an acquired input sound signal.
Throughout this vocal reproduction method, a delay is introduced on the acquired input signals so as to synchronize them with the output signals provided. This delay makes it possible to match the input and output signals exactly at the multi-sensory perception step (IV).
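One simple way to introduce such a delay on a block of acquired samples is sketched below. The zero-padding approach, the function names, and the idea of deriving the delay from a known processing latency and sample rate are assumptions of this sketch; the description does not state how the delay is determined or applied.

```python
from typing import List, Sequence


def delay_in_samples(latency_seconds: float, sample_rate_hz: int) -> int:
    """Convert a latency in seconds to a whole number of samples."""
    return round(latency_seconds * sample_rate_hz)


def delay_signal(samples: Sequence[float], delay_samples: int) -> List[float]:
    """Delay a block by prepending silence and keeping the original length."""
    if delay_samples < 0:
        raise ValueError("delay_samples must be non-negative")
    padded = [0.0] * delay_samples + list(samples)
    return padded[: len(samples)]
```

For example, a hypothetical processing latency of 10 ms at 44100 Hz corresponds to a delay of 441 samples.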
Figures 2 to 4 are diagrams of a vocal reproduction device according to one embodiment of the invention. With reference to Figure 2, the device in this embodiment comprises a microphone 50, a central unit 51, a display screen 52, a pair of earphones 53 and a wheel 54.
The microphone 50 acquires the sound emitted by the user 1 and converts it into an input signal. The microphone is connected to the central unit 51, which processes the input signals to obtain output signals and records the data (input and output signals) locally.
The central unit 51 is connected to the wheel 54. This wheel gives the user manual control over the level of perception of the difference between the sound produced (the sound acquired by the microphone 50) and the sound to be reproduced (the reference sound). The user 1 turns the wheel 54 in one direction or the other to decrease or increase the dynamics of said multi-sensory perception level, and can thus set the perception level according to his or her own state of performance. In another embodiment, the wheel 54 is replaced by a control keyboard able to perform the same adjustments of the difference between the sound produced and the sound to be reproduced.
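As a purely illustrative sketch, the manual control could be modelled as a gain applied to the difference between the acquired and reference signals, with the wheel angle mapped to that gain. Everything here is hypothetical: the angle range, the gain limits, the geometric mapping and the function names are assumptions, since the description only states that turning the wheel decreases or increases the dynamics of the perception level.

```python
from typing import List, Sequence


def wheel_to_gain(
    angle_degrees: float,
    min_gain: float = 0.25,
    max_gain: float = 4.0,
    max_angle: float = 180.0,
) -> float:
    """Map a wheel angle in [-max_angle, max_angle] to a gain.

    The mapping is geometric, so the centre position gives the geometric
    mean of the two limits (1.0 with the default values).
    """
    clamped = max(-max_angle, min(max_angle, angle_degrees))
    t = (clamped + max_angle) / (2.0 * max_angle)  # normalised to 0..1
    return min_gain * (max_gain / min_gain) ** t


def scale_difference(
    acquired: Sequence[float], reference: Sequence[float], gain: float
) -> List[float]:
    """Scale the sample-wise difference (acquired - reference) by the gain."""
    return [gain * (a - r) for a, r in zip(acquired, reference)]
```

Under these assumptions, turning the wheel one way exaggerates small errors for an advanced user, and turning it the other way attenuates them for a beginner.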
The central unit 51 is also connected to the multi-sensory perception means, which include the pair 53 of earphones and the display screen 52.
With reference to Figure 3, different output signals are transmitted to the earphones 53' and 53" of the pair 53 of earphones. For example, when the amplitude of the input signal is greater than that of the reference sound signal, the signal transmitted to the earphone 53' is the input signal picked up by the microphone and the signal transmitted to the earphone 53" is the reference sound signal. When the amplitude of the input signal is lower than that of the reference sound signal, the signal transmitted to the earphone 53' is the reference sound signal and the signal transmitted to the earphone 53" is the input signal picked up by the microphone. The user 1 can thus perceive the sign of the difference between the two signals.
With reference to Figure 4, a three-dimensional virtual correction face is displayed on the display screen 52. This face shows the facial movements the user 1 must make to reproduce the reference sound. For each display on the screen 52, the three-dimensional virtual correction face indicates the type of correction to be made (labial, articulatory, etc.) and the curve indicates the difference between the sound produced and the sound to be reproduced, optionally with an indication of progress over time. Among the possible displays on the screen 52, the display 52' concerns a labial correction, the display 52" an articulatory correction and the display 52'" a vibratory correction by the breath.
For singing, body position and breathing are indeed essential. Body position can be acquired by a camera, or by two cameras in the case of stereoscopy, combined with image processing. Software dedicated more specifically to face analysis also provides information on the user's lips (in particular their opening, stretching and protrusion), on the opening of the jaw and on the height of the head relative to the shoulders. For breathing and breath, inductive bands around the chest and abdomen and a propeller-type sensor placed in front of the mouth may be used.
In another embodiment, the device also comprises tactile perception means in addition to the pair 53 of earphones and the display screen 52. These tactile perception means may consist, for example, of a muscle stimulation electrode.
The embodiments of the present invention described above are given by way of example and are in no way limiting. It is understood that a person skilled in the art can produce different variants of the invention without departing from the scope of the patent.
In particular, the device and the method can be used in multi-user applications. With reference to Figure 5, which shows an example embodiment with two users (1, 1'), each user has, in particular, acquisition and perception means. The processing means are shared via a connection to a local network or to an external network such as the Internet. According to another embodiment, the processing means may be specific to each user. The reference sounds (6, 6') to be reproduced may be identical or different.

Claims

1 - Method for the vocal reproduction of a reference sound (6) by at least one user (1), said method comprising a step (I) of acquiring input signals, a step (II) of processing said acquired input signals in order to provide (III) output signals comprising at least one item of comparison information between an acquired input sound signal and the signal of said reference sound (6), and a step (IV) of multi-sensory perception by the user (1) of at least one of said output signals in order to enable said user (1) to attain said reference sound (6), characterized in that the level of multi-sensory perception of at least one output signal comprising at least one item of comparison information between an acquired input sound signal and the signal of said reference sound (6) is controlled by adjusting at least one parameter of a signal among said output signal, said acquired input signal and said signal of the reference sound (6), as a function of the state of performance of the user (1).
2 - Vocal reproduction method according to claim 1, wherein the adjustment of at least one parameter of a signal among said output signal, said acquired input signal and said signal of the reference sound (6) is carried out automatically and dynamically.
3 - Vocal reproduction method according to either of claims 1 and 2, wherein the adjustment of at least one parameter of a signal among said output signal, said acquired input signal and said signal of the reference sound (6) is carried out by the user (1).
4 - Vocal reproduction method according to any one of the preceding claims, wherein, during the multi-sensory perception step (IV), at least some of said output signals provided are perceived by hearing, said output signals provided and perceived by hearing being constituted so as to provide the user (1) with two different output signals related to the signal of said reference sound (6) and to an acquired input sound signal.
5 - Vocal reproduction method according to claim 4, wherein said output signals provided and perceived by hearing come from a single sound source.
6 - Vocal reproduction method according to claim 4, wherein said output signals provided and perceived by hearing come from two spatially separated sound sources arranged to provide each ear of the user (1) with one of the two different output signals related to the signal of said reference sound (6) and to an acquired input sound signal.
7 - Vocal reproduction method according to any one of claims 4 to 6, wherein the output sound signals supplied to each ear of the user (1) and perceived by hearing consist of a combination of signals among at least one acquired input sound signal, the signal of the reference sound (6) and an indicator of the difference between at least one acquired input sound signal and the signal of said reference sound (6), said indicator relating to at least one characteristic of said signals.
8 - Vocal reproduction method according to claim 7, wherein the output sound signals supplied to each ear of the user (1) and perceived by hearing respectively comprise, at least in part, the signal of the reference sound (6) and an acquired input sound signal.
9 - Vocal reproduction method according to claim 8, wherein the output sound signals supplied to each ear of the user (1) and perceived by hearing are respectively the reference signal (6) and an acquired input sound signal.
10 - Vocal reproduction method according to claim 8, wherein the output sound signals supplied to each ear of the user (1) and perceived by hearing are respectively the reference signal (6) and the difference between an acquired input sound signal and said reference signal (6).
11 - Vocal reproduction method according to any one of claims 8 to 10, wherein the assignment of the output sound signals supplied to each ear of the user (1) and perceived by hearing is a function of the sign of the difference between an acquired input sound signal and said reference signal (6).
12 - Vocal reproduction method according to any one of claims 8 to 10, wherein the amplitude of the output sound signals supplied to each ear of the user (1) and perceived by hearing is a function of the sign of the difference between an acquired input sound signal and said reference signal (6).
13 - Vocal reproduction method according to any one of the preceding claims, wherein, during the multi-sensory perception step (IV), at least some of said output signals provided are perceived visually, said output signals provided and perceived visually being related to at least one acquired input sound signal and to the signal of said reference sound (6).
14 - Vocal reproduction method according to claim 13, wherein the output signals provided and perceived visually consist of a combination of signals among at least one acquired input sound signal, the signal of the reference sound (6) and an indicator of the difference between at least one acquired input sound signal and the signal of said reference sound (6), said indicator relating to at least one characteristic of said signals.
15 - Vocal reproduction method according to either of claims 13 and 14, wherein the output signals provided and perceived visually are perceived through the display of a three-dimensional virtual correction face indicating the movements of the face of the user (1) necessary for reproducing the reference sound (6).
16 - Vocal reproduction method according to any one of the preceding claims, wherein, during the multi-sensory perception step (IV), at least some of said output signals provided are perceived by touch, said output signals provided and perceived by touch being related to at least one acquired input sound signal and to the signal of said reference sound (6).
17 - Vocal reproduction method according to any one of the preceding claims, operating in a closed loop.
18 - Vocal reproduction method according to any one of the preceding claims, wherein a delay is introduced on the acquired input signals so as to synchronize said acquired input signals with the output signals provided.
19 - Device for the vocal reproduction of a reference sound (6) by at least one user (1), comprising a system (2) for acquiring input signals, said input signals comprising at least one input sound signal, a system (3) for processing said acquired input signals, able to provide output signals comprising at least one item of comparison information between an acquired input sound signal and the signal of said reference sound (6), and a system (4) for multi-sensory perception of said output signals provided, arranged so as to enable the user (1) to attain said reference sound (6), characterized in that it comprises a system (5) for controlling the level of multi-sensory perception of at least one output signal comprising at least one item of comparison information between an acquired input sound signal and the signal of said reference sound (6), said control system comprising means for adjusting at least one parameter of a signal among said output signal, said acquired input signal and said signal of the reference sound (6), as a function of the state of performance of the user (1).
20 - Vocal reproduction device according to claim 19, comprising means (32) for recording and storing the acquired input signals and the output signals provided, in order to establish an indicator of the progress of the vocal reproduction of the reference sound (6) by the user (1).
21 - Vocal reproduction device according to either of claims 19 and 20, wherein the system (2) for acquiring the input signals comprises perception means at least among auditory (41, 42), visual (43) and tactile (44, 45) perception means, said perception means being arranged so as to provide the user (1) with at least one output signal related to at least one acquired input sound signal and to the signal of said reference sound (6).
PCT/FR2009/000488 2008-04-28 2009-04-24 Device and method for vocal reproduction with controlled multi-sensorial perception WO2009133324A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP09738358A EP2269183A1 (en) 2008-04-28 2009-04-24 Device and method for vocal reproduction with controlled multi-sensorial perception

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
FR0852830 2008-04-28
FR0852830A FR2930671B1 (en) 2008-04-28 2008-04-28 DEVICE AND METHOD FOR VOICE REPRODUCTION WITH CONTROLLED MULTI-SENSORY PERCEPTION

Publications (1)

Publication Number Publication Date
WO2009133324A1 (en) 2009-11-05

Family

ID=40042799

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/FR2009/000488 WO2009133324A1 (en) 2008-04-28 2009-04-24 Device and method for vocal reproduction with controlled multi-sensorial perception

Country Status (3)

Country Link
EP (1) EP2269183A1 (en)
FR (1) FR2930671B1 (en)
WO (1) WO2009133324A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109035968B (en) * 2018-07-12 2020-10-30 杜蘅轩 Piano learning auxiliary system and piano

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1083536A2 (en) * 1999-09-09 2001-03-14 Lucent Technologies Inc. A method and apparatus for interactive language instruction
US20040194610A1 (en) * 2003-03-21 2004-10-07 Monte Davis Vocal pitch-training device
WO2005004084A1 (en) * 2003-07-08 2005-01-13 I.P. Equities Pty Ltd Knowledge acquisition system, apparatus and processes
US20070052799A1 (en) * 2005-09-06 2007-03-08 International Business Machines Corporation System and method for assisting speech development for the hearing-challenged

Also Published As

Publication number Publication date
FR2930671B1 (en) 2010-05-07
EP2269183A1 (en) 2011-01-05
FR2930671A1 (en) 2009-10-30

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09738358

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2009738358

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE