EP1869949A1 - Method and system for spatializing an audio signal on the basis of its intrinsic qualities - Google Patents

Method and system for spatializing an audio signal on the basis of its intrinsic qualities

Info

Publication number
EP1869949A1
Authority
EP
European Patent Office
Prior art keywords
frame
sound signal
frames
index
processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP06743580A
Other languages
English (en)
French (fr)
Inventor
Jean-Philippe Thomas
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Charlet Delphine
Collet Mikael
Orange SA
Original Assignee
Charlet Delphine
Collet Mikael
France Telecom SA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Charlet Delphine, Collet Mikael and France Telecom SA
Publication of EP1869949A1
Current legal status: Withdrawn

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S 1/00 Two-channel systems
    • H04S 1/002 Non-adaptive circuits, e.g. manually adjustable or static, for enhancing the sound image or the spatial distribution
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S 3/00 Systems employing more than two channels, e.g. quadraphonic
    • H04S 3/002 Non-adaptive circuits, e.g. manually adjustable or static, for enhancing the sound image or the spatial distribution
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S 2420/00 Techniques used in stereophonic systems covered by H04S but not provided for in its groups
    • H04S 2420/11 Application of ambisonics in stereophonic audio systems

Definitions

  • the sound exchanges essentially concern the emission and reception of sound signals, directly perceptible to the human ear. These include sound signals provided by radio, telephone, messaging, television and others.
  • the current techniques consist of either making a spatialized sound recording in the studio, or performing a non-real-time and manual studio treatment, in particular for the spatialization of movie soundtracks.
  • another example of physical reconstruction of the acoustic field is known as the ambisonic technique, which uses a decomposition of the sound field on the basis of eigenfunctions called "spherical harmonics".
  • the known technique of stereophony exploits differences in propagation time or loudness to position sound sources between two or more loudspeakers, based on the interaural time and intensity differences that define the perceptual criteria of auditory localization in a substantially horizontal plane.
  • the binaural techniques aim to reconstruct the acoustic field only in the vicinity of the listener's ears, so that the listener's eardrums perceive an acoustic field substantially identical to that which would have been generated by real sources.
  • the object of the present invention is to overcome the drawbacks of the techniques of the prior art, in order to allow the widest application of spatialization techniques to any monophonic and/or stereophonic sound signal, or even to a more complex sound signal.
  • the subject of the present invention is a method for spatializing a sound signal, remarkable in that it consists at least in subdividing this sound signal into successive frames, analyzing the frames of the sound signal in blocks of frames to determine at least one spectral and/or physiological parameter of the sound signal in each frame, assigning each frame an index representative of at least this spectral and/or physiological parameter so as to generate a series of classified frames, and submitting each frame or group of classified frames assigned the same index to a spatialization processing that depends on the value of the index assigned to each frame (a minimal sketch of this processing chain is given after this list).
  • the subject of the present invention is furthermore a system for spatializing a sound signal, remarkable in that it comprises at least, in combination, a module for analyzing this sound signal in successive frames, in order to determine and assign to each successive frame an index representative of at least one spectral and/or physiological parameter of said sound signal in each frame and thus generate a sequence of indexes of classified frames, and a module for processing each frame or group of frames of the sound signal classified and assigned the same index according to a spatialization processing that depends on the value of the assigned index.
  • the invention is applicable to the electronics industry for processing sound signals, in particular stereophonic or monophonic signals, to the recording of phonograms and/or videograms, to the fixed or mobile telephony industry, and to voice communication and/or the transmission of speech signals over IP networks.
  • FIG. 1a represents, purely by way of illustration, a flowchart of the essential steps for implementing the method that is the subject of the present invention
  • FIG. 1b represents, for purely illustrative purposes, a method of classifying acoustic frames according to acoustic classes, according to a remarkable aspect of the method, object of the present invention
  • FIGS. 2a and 2b represent, by way of illustration, a nonlimiting variant of implementation of the method which is the subject of the present invention, applied in deferred mode either to the recording and restitution, or to the partially offline broadcasting, or to the real-time online communication of phonograms and/or of the sound portion of videograms;
  • FIG. 3a represents, by way of illustration, a system for spatializing a sound signal, in accordance with the subject of the present invention, operating in partially offline broadcast mode;
  • FIG. 3b represents, by way of illustration, a system for spatializing a sound signal in online communication mode, more particularly intended for devices of the telephone or other type.
  • the method which is the subject of the invention is implemented on a sound signal SS, this sound signal corresponding, for example, to a digital audio, speech or telephony signal.
  • the method which is the subject of the invention consists, first of all, in a step A of subdividing the sound signal SS into successive frames, noted TSSi below.
  • the aforementioned subdivision step can be executed in a conventional manner by subdividing the signal into successive frames with a duration of between 10 and 20 milliseconds, for example, typically 16 milliseconds. A sampling of the sound signal is thus performed to obtain a succession of contiguous frames, for example, i denoting the rank of each frame TSSi. Typical values are frames with a duration of 32 milliseconds, computed every 16 milliseconds, thus with an overlap of 16 milliseconds.
  • Step A is followed by a step B of analyzing a block of frames of the sound signal TSSi, to determine at least one spectral and/or physiological parameter of the sound signal in each frame.
  • step B of analyzing each frame is represented by a relation in which [Pspj] denotes the set of spectral and/or physiological parameters resulting from the analysis.
  • as regards the spectral and/or physiological parameter of the sound signal, it is indicated that this parameter or these parameters may correspond, for example, to a frequency parameter for a musical signal, to a formant for a speech signal, or the like, as will be described later in the description. It is understood in particular that several significant parameters can be associated with each frame, in particular when the sound signal SS includes both music and sung lyrics, for example, or in other situations.
  • the number of spectral and/or physiological parameters assigned to each frame is not limiting, as will be described later in the description.
  • Step B is then followed by a step C of assigning to each frame TSSi an index representative of at least one spectral or physiological parameter, so as to generate a sequence of classified frames.
  • step C of assigning an index is represented by a symbolic relation in which the index assigned to each frame corresponds, for example, to a main index j together with a sub-index k, the sub-index k making it possible, for example, to specify classification variants of the main classification j assigned to each frame.
  • Step C thus makes it possible to obtain a fine classification of each frame as a function of the spectral and/or physiological parameter or parameters of the sound signal SS in the frame considered, TSSi.
  • Step C is then followed by the spatialization step D itself, this step consisting in submitting each classified frame or group of frames bearing the same index to the same spatialization processing, as a function of the value of the aforementioned index assigned to the frame in question.
  • Sj,k(.) denotes the specific spatialization treatment applied to any frame of rank i to which the index j,k has been assigned, or to any corresponding group of frames. It is understood in particular that, for a frame duration typically equal to 16 milliseconds, for example, a succession of frames can of course be assigned the same index j,k.
  • step D of FIG. 1a is followed by a step E consisting in comparing the rank i of the frame considered, TSSi, with a maximum value I of the decomposition or subdivision into frames.
  • the rank i of the frame considered is incremented to the value i + 1 in step F, to return to step A and continue the process as long as a frame remains that has not been subjected to the subdivision process of step A, so as to ensure the complete processing of the sound signal SS. More specifically, it is indicated that, with reference to FIG. 1b, the method that is the subject of the invention consists in classifying each frame or group of frames according to a plurality of acoustic classes denoted Cj.
  • each acoustic class Cj, such as Music, Speech/Talker Type, Brouhaha or Silence, for example, can be associated with an index value j taking the corresponding value 0, 1, 2 or 3 for the aforementioned acoustic classes, in a non-limiting manner.
  • the value of the index j, 0, 1, 2 or 3, is obtained by analysis of the spectral and/or physiological parameters of the sound signal SS in each frame, and the value of the sub-index k associated with each index value may be an arbitrary value or one corresponding to a particular quality of the sound signal.
  • the technique used to process documents that are a priori unknown is the following: a change of speaker is detected (a break in the signal of a particular index);
  • the speaker after the change is compared with all the speakers already identified in the document and is either recognized as one of them, or considered a new speaker, thereby increasing the size of the "reference dictionary" of speakers for this document or sound signal (a sketch of this speaker-tracking step is given after this list).
  • the spatialization processing applied to each classified frame may be chosen from among a plurality of spatialization processes such as reverberation, attenuation, fundamental frequency change, coloration by harmonic filtering, delay, for example, or holophonic, stereophonic, binaural or other techniques. It will be understood that, for any index value j and sub-index value k assigned to a frame TSSi or group of frames, a specific spatialization treatment can thus be chosen as a function of these values, and in particular the treatment most appropriate to the desired effect (a sketch of such an index-to-treatment dispatch is given after this list).
  • the processing may consist in applying a so-called "fun" effect by changing the speech signal, the voice tone for example, for one or more speakers of the sound signal SS.
  • the method which is the subject of the invention thus makes it possible to automatically apply differentiated sound renderings or sound positions to any sound documents resulting in sound signals SS for which no additional information is available or over whose sound capture there is no control.
  • the latter also makes it possible to reduce the load or the bit rate of the networks, or the radio spectrum used, since the transmission of the sound signal SS can be carried out on a monophonic signal, for example, and the spatialization processing can then be executed upon reception of this signal, and therefore after transmission, which is advantageous at least from the point of view of the load or the bit rate of the transmission network.
  • the method which is the subject of the invention can advantageously be executed in off-line mode in deferred time.
  • steps A, B represent the same steps as those bearing the same references in FIG. 1a.
  • the index assignment step can then be subdivided into a step C0, corresponding to choosing the index j and sub-index k assigned to each frame TSSi, the step C0 being followed by a step C1 consisting in comparing the rank i of the frame with the final value I representative of the number of frames.
  • on the contrary, on a positive response to the test of step C1, a recording step C3 is performed of the set of frames, noted [TSSi], and of the corresponding series of indexes.
  • step C3 refers to a recording on any storage medium such as, for example, a non-volatile memory, a permanent memory, or an optical recording disk of the CD or DVD type.
  • the offline-mode, deferred-time implementation of the method which is the subject of the invention then consists, as represented in FIG. 2b, in reading, from the record obtained in step C3, the corresponding recording medium, comprising at least the index sequence assigned to the sound signal subdivided into frames and the sound signal itself, or at least the sequence of frames representative of the latter, in a step D0 shown in FIG. 2b (a sketch of this deferred mode is given after this list).
  • step D0 is then followed by a step D1 of applying to the sound signal, and to each current frame of this sound signal, a spatialization processing that is a function of the index assigned to the current frame of the sound signal, according to the symbolic relation of step D of FIG. 1a.
  • the spatialized sound signal is thus restored in accordance with the method that is the subject of the present invention.
  • the method that is the subject of the present invention can also be implemented in offline broadcast mode with a limited time offset, not exceeding the duration of a few frames, by analysis and classification of each successive frame and successive spatialization processing of each frame according to the assigned index.
  • the recording operation as described in connection with FIG. 2a, in step C3, can then be started.
  • the system adapts to the recording.
  • the method that is the subject of the present invention can be executed in online communication mode with a minimum time shift, not exceeding the duration of one frame, by analyzing, classifying and applying spatialization processing to each frame of the sound signal, delayed by at most one frame duration (a sketch of this online mode is given after this list).
  • the operations of FIGS. 2a and 2b can then be performed with storage in a memory, such as a random access memory for example, the time offset being reducible to the computation time of the frame analysis and index assignment operations for a current frame, that is to say steps B and C0 of FIG. 2a, this computation time being of course much smaller than the duration of a frame.
  • the abovementioned procedure for execution in online communication mode of the method that is the subject of the present invention can advantageously be used for spatialization processing of a telephone communication, for example on a speech signal transmitted over a fixed or mobile telephone network. It can also be implemented in online communication mode for the transmission of a speech signal over an IP network, for example.
  • a more detailed description of a system for spatializing a sound signal, in accordance with the subject of the present invention, will now be given in connection with FIGS. 3a and 3b.
  • the system which is the subject of the invention comprises at least, in combination, a module 1 for analyzing the sound signal in successive frames in order to determine and assign to each successive frame an index representative of at least one spectral and/or physiological parameter of the sound signal in each frame, and thus to generate a sequence of indexes of classified frames.
  • the index sequence of classified frames is represented by a corresponding symbolic relation.
  • the system according to the invention furthermore comprises a module 2 for processing each frame or group of frames of the sound signal SS, and in particular the frames classified and assigned the same index, according to the same spatialization processing, which is a function of the value of the assigned index and, as previously described, of the sub-index k associated with any index value j representative of an acoustic class.
  • the spatialization processing module 2 is able, from the sequence of indexes of the aforementioned classified frames and, of course, from the sequence of frames at its disposal, to apply the spatialization processing and to reproduce the spatialized sound signal on a set of loudspeakers, denoted HP in FIGS. 3a and 3b.
  • the analysis module 1 delivers to the spatialization processing module 2 a signal representative of the index sequence as indicated above, as well as either the sound signal SS or the sequence of frames [TSSi], time-shifted by a duration substantially equal to several frame durations.
  • the analysis module 1 and the processing module 2 are connected substantially in parallel, as shown in the abovementioned figure, each performing the subdivision of the sound signal into frames in parallel.
  • the analysis module 1 and the spatialization processing module 2 may advantageously each comprise a frame subdivision module, bearing the reference 10 and 20 respectively, these modules being synchronized by a synchronization signal Sy exchanged, for example, between the analysis module 1 and the spatialization processing module 2.
  • the analysis module 1 furthermore comprises a module 11 executing the operations B and C of FIG. 1a, that is to say performing the analysis of the frames and the assignment of the indexes, so as to generate the sequence of classified frames and, in particular, the index sequence previously described.
  • the spatialization processing module 2 comprises a processing module ensuring the actual spatialization operation on the series of classified frames.
  • with reference to FIG. 3b, it is indicated that the processing time of the analysis module 1 and, in particular, of its module 11 can be much shorter than a frame duration, which makes it possible to operate the system shown in FIG. 3b with a processing time much shorter than one frame period.
  • the corresponding system then appears particularly well suited to processing the sound signal SS in online communication mode under the conditions indicated above in the description.
  • the method which is the subject of the present invention and the corresponding system as described in FIGS. 3a and 3b can be implemented from corresponding software modules and in particular from a program recorded on a storage medium and executed by a computer.
  • this program includes a module making it possible to execute the subdivision into frames, as well as software modules allowing the analysis of each successive frame of the sound signal to determine at least one spectral and/or physiological parameter of the sound signal in each frame, and then the assignment to each frame of the index representative of the spectral and/or physiological parameter, so as to generate a sequence of classified frames, that is to say the signal corresponding to the index sequence previously described in the description.
  • These operations are executed by a software module 11 implemented in the analysis module 1 represented in FIG. 3a or 3b.
  • a corresponding software module of the spatialization processing module 2 makes it possible to submit each frame or group of classified frames assigned the same index to a spatialization processing according to the value of the index assigned to each frame.
  • the operating mode is that shown in FIG. 1a.
  • the sound signal SS is then successively subjected to steps A, B, C, D and E of the abovementioned figure on a terminal, from a sound signal SS transmitted via IP packets, without any prior processing performed at the server transmitting the corresponding HTML pages.
  • the operating mode corresponds to an on-line processing with a processing delay corresponding to at most one frame duration.
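
The processing chain summarized in the list above (step A: subdivision into 32 ms frames computed every 16 ms; step B: analysis of a spectral parameter; step C: assignment of an acoustic-class index j) can be illustrated by the following minimal Python sketch. The chosen features (frame energy and spectral flatness) and the thresholds are illustrative assumptions, not values prescribed by the description.

```python
import numpy as np

# Acoustic classes enumerated in the description (main index j).
MUSIC, SPEECH, BROUHAHA, SILENCE = 0, 1, 2, 3

def subdivide(signal, sr, frame_ms=32, hop_ms=16):
    """Step A: split the signal SS into overlapping frames TSS_i
    (32 ms frames computed every 16 ms, i.e. a 16 ms overlap)."""
    frame = int(sr * frame_ms / 1000)
    hop = int(sr * hop_ms / 1000)
    n = 1 + max(0, (len(signal) - frame) // hop)
    return np.stack([signal[i * hop:i * hop + frame] for i in range(n)])

def analyse(frame):
    """Step B: compute illustrative spectral parameters of one frame
    (energy and spectral flatness); the choice of features is an assumption."""
    spectrum = np.abs(np.fft.rfft(frame * np.hanning(len(frame)))) + 1e-12
    energy = float(np.mean(frame ** 2))
    flatness = float(np.exp(np.mean(np.log(spectrum))) / np.mean(spectrum))
    return energy, flatness

def classify(energy, flatness,
             silence_thr=1e-6, noise_flatness=0.5, music_flatness=0.1):
    """Step C: assign an acoustic-class index j to the frame.
    The thresholds are arbitrary illustrative values."""
    if energy < silence_thr:
        return SILENCE
    if flatness > noise_flatness:
        return BROUHAHA
    if flatness < music_flatness:
        return MUSIC
    return SPEECH

def index_frames(signal, sr):
    """Steps A to C: produce the sequence of frames [TSS_i] and the
    corresponding sequence of assigned indexes."""
    frames = subdivide(signal, sr)
    indexes = np.array([classify(*analyse(f)) for f in frames])
    return frames, indexes

if __name__ == "__main__":
    sr = 16000
    t = np.arange(sr) / sr
    test = np.concatenate([np.zeros(sr // 2), np.sin(2 * np.pi * 440 * t)])
    frames, indexes = index_frames(test, sr)
    print(indexes[:8], indexes[-8:])   # silence indexes first, then music
```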
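
The speaker-tracking step quoted above (detect a speaker change, compare the new speaker with the speakers already identified, and grow the "reference dictionary" when no match is found) can be sketched as follows. This is a hypothetical illustration using a crude per-segment spectral signature and a cosine-similarity threshold; an actual implementation would rely on proper speaker models.

```python
import numpy as np

def speaker_signature(segment_frames):
    """Crude per-segment signature: mean log-spectrum of the frames
    (an assumption; it stands in for a real speaker model)."""
    specs = [np.log(np.abs(np.fft.rfft(f)) + 1e-12) for f in segment_frames]
    return np.mean(specs, axis=0)

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def track_speakers(segments, match_threshold=0.95):
    """For each speech segment delimited by a detected speaker change,
    either recognise an already-known speaker or grow the reference
    dictionary with a new entry, as described in the text above."""
    dictionary = []   # one reference signature per speaker
    labels = []       # speaker label (usable as sub-index k) per segment
    for seg in segments:
        sig = speaker_signature(seg)
        scores = [cosine(sig, ref) for ref in dictionary]
        if scores and max(scores) >= match_threshold:
            labels.append(int(np.argmax(scores)))   # recognised speaker
        else:
            dictionary.append(sig)                  # new speaker: dictionary grows
            labels.append(len(dictionary) - 1)
    return labels, dictionary
```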
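
The selection of a specific spatialization treatment Sj,k as a function of the index assigned to each frame can be read as a dispatch from (j, k) to a processing function. The sketch below is a hypothetical example mapping the four acoustic classes to elementary stereo treatments (constant-power panning, attenuation, a short inter-channel delay); the richer treatments named in the description (reverberation, binaural or ambisonic rendering, etc.) would be plugged in the same way.

```python
import numpy as np

def pan(frame, azimuth):
    """Constant-power panning of a mono frame to stereo;
    azimuth ranges from -1 (full left) to +1 (full right)."""
    theta = (azimuth + 1.0) * np.pi / 4.0
    return np.stack([np.cos(theta) * frame, np.sin(theta) * frame])

def attenuate(frame, gain):
    return np.stack([gain * frame, gain * frame])

def delay_right(frame, samples):
    """Crude interaural delay: the right channel lags by a few samples."""
    right = np.concatenate([np.zeros(samples), frame])[:len(frame)]
    return np.stack([frame, right])

# Hypothetical mapping from (main index j, sub-index k) to a treatment S_{j,k}.
TREATMENTS = {
    (0, 0): lambda f: pan(f, -0.5),        # music: placed slightly to the left
    (1, 0): lambda f: delay_right(f, 8),   # speaker 0: small interaural delay
    (1, 1): lambda f: pan(f, +0.7),        # speaker 1: placed to the right
    (2, 0): lambda f: attenuate(f, 0.3),   # brouhaha: attenuated
    (3, 0): lambda f: attenuate(f, 0.0),   # silence: muted
}

def spatialize(frames, indexes, sub_indexes):
    """Step D: apply to each frame the treatment selected by its index."""
    passthrough = lambda f: np.stack([f, f])   # default if no rule matches
    return [TREATMENTS.get((int(j), int(k)), passthrough)(f)
            for f, j, k in zip(frames, indexes, sub_indexes)]
```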
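
The deferred (offline) mode of FIGS. 2a and 2b, in which the sequence of frames and the sequence of assigned indexes are first recorded and later read back for spatialization, can be sketched as follows. The file layout (a NumPy .npz archive) is purely an assumption for illustration.

```python
import numpy as np

def record(path, frames, indexes):
    """Step C3 (FIG. 2a): store the classified frames and the index
    sequence on a storage medium, here an .npz archive
    (path should end in '.npz')."""
    np.savez(path, frames=np.asarray(frames), indexes=np.asarray(indexes))

def replay_spatialized(path, treatments):
    """FIG. 2b: read the record, then apply to each stored frame the
    spatialization treatment selected by its stored index (step D1).
    Returns the list of spatialized (stereo) frames; overlap-add
    reconstruction of the output signal is deliberately omitted."""
    data = np.load(path)
    return [treatments[int(j)](frame)
            for frame, j in zip(data["frames"], data["indexes"])]
```

Here `treatments` would be a mapping from the main index j to a processing function, for instance derived from the TREATMENTS table of the previous sketch, keyed by j alone for brevity.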
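
Finally, the online communication mode, in which each frame is analysed, classified and spatialized with a delay of at most one frame duration, reduces to a frame-by-frame loop such as the hypothetical generator below; `analyse`, `classify` and `treatments` stand for the functions sketched earlier.

```python
def spatialize_online(frame_stream, analyse, classify, treatments):
    """Online mode: each incoming frame is analysed, classified and
    treated before the next one arrives, so the added delay is the
    per-frame computation time, kept well under one frame period."""
    for frame in frame_stream:
        j = classify(*analyse(frame))    # steps B and C on the current frame
        yield treatments[j](frame)       # step D, selected by the index
```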

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Stereophonic System (AREA)
EP06743580A 2005-03-15 2006-03-15 Verfahren und system zur räumlichen zuordnung eines audiosignals auf der basis seiner intrinsischen qualitäten Withdrawn EP1869949A1 (de)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
FR0502556 2005-03-15
PCT/FR2006/000580 WO2006097633A1 (fr) 2005-03-15 2006-03-15 Procede et systeme de spatialisation d'un signal sonore en fonction des qualites intrinseques de ce dernier

Publications (1)

Publication Number Publication Date
EP1869949A1 2007-12-26

Family

ID=35229612

Family Applications (1)

Application Number Title Priority Date Filing Date
EP06743580A Withdrawn EP1869949A1 (de) 2005-03-15 2006-03-15 Verfahren und system zur räumlichen zuordnung eines audiosignals auf der basis seiner intrinsischen qualitäten

Country Status (2)

Country Link
EP (1) EP1869949A1 (de)
WO (1) WO2006097633A1 (de)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2548491A1 (fr) * 1983-06-29 1985-01-04 Tournier Sa Editions Gerard Procede et dispositif de production d'effet spatial pour l'ecoute d'emissions ou d'enregistrements monophoniques
US5596159A (en) * 1995-11-22 1997-01-21 Invision Interactive, Inc. Software sound synthesis system
US6570991B1 (en) 1996-12-18 2003-05-27 Interval Research Corporation Multi-feature speech/music discrimination system
US7711123B2 (en) * 2001-04-13 2010-05-04 Dolby Laboratories Licensing Corporation Segmenting audio signals into auditory events

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO2006097633A1 *

Also Published As

Publication number Publication date
WO2006097633A1 (fr) 2006-09-21

Similar Documents

Publication Publication Date Title
EP1999998B1 (de) Verfahren zur binauralen synthese unter berücksichtigung eines raumeffekts
EP1992198B1 (de) Optimierung des binauralen raumklangeffektes durch mehrkanalkodierung
EP1836876B1 (de) Verfahren und vorrichtung zur individualisierung von hrtfs durch modellierung
EP2000002B1 (de) Verfahren und einrichtung zur effizienten binauralen raumklangerzeugung im transformierten bereich
EP1600042B1 (de) Verfahren zum bearbeiten komprimierter audiodaten zur räumlichen wiedergabe
EP1563485B1 (de) Verfahren zur verarbeitung von audiodateien und erfassungsvorrichtung zur anwendung davon
EP2898707B1 (de) Optimierte kalibrierung eines klangwiedergabesystems mit mehreren lautsprechern
WO2007048900A1 (fr) Individualisation de hrtfs utilisant une modelisation par elements finis couplee a un modele correctif
WO2011104463A1 (fr) Compression de flux audio multicanal
EP1886535B1 (de) Verfahren zum herstellen mehrerer zeitsignale
EP2005420A1 (de) Einrichtung und verfahren zur codierung durch hauptkomponentenanalyse eines mehrkanaligen audiosignals
EP3079074A1 (de) Datenverarbeitungsverfahren zur einschätzung der parameter für die audiosignalmischung, entsprechendes mischverfahren, entsprechende vorrichtungen und computerprogramme
EP1479266A2 (de) Verfahren und vorrichtung zur steuerung einer anordnung zur wiedergabe eines schallfeldes
EP3400599B1 (de) Verbesserter ambisonic-codierer für eine tonquelle mit mehreren reflexionen
FR2776461A1 (fr) Procede de perfectionnement de reproduction sonore tridimensionnelle
EP3025514B1 (de) Klangverräumlichung mit raumwirkung
US12014710B2 (en) Device, method and computer program for blind source separation and remixing
EP1994526B1 (de) Gemeinsame schallsynthese und -spatialisierung
EP1869949A1 (de) Verfahren und system zur räumlichen zuordnung eines audiosignals auf der basis seiner intrinsischen qualitäten
WO2023156578A1 (fr) Procédé de traitement d'un signal sonore numérique pour émulation de disques vinyle
EP3108670B1 (de) Verfahren und vorrichtung zur wiedergabe eines mehrkanalaudiosignals in einer hörzone
CN118038888A (zh) 对白清晰度的确定方法、装置、电子设备和存储介质
EP3934282A1 (de) Verfahren zur umwandlung eines ersten satzes repräsentativer signale eines schallfelds in einen zweiten satz von signalen und entsprechende elektronische vorrichtung

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20070914

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC NL PL PT RO SE SI SK TR

17Q First examination report despatched

Effective date: 20080207

DAX Request for extension of the european patent (deleted)
GRAP Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOSNIGR1

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20101201