WO2003028005A2 - Method for characterizing the timbre of a sound signal according to at least one descriptor - Google Patents
- Publication number
- WO2003028005A2 (PCT/FR2002/003291)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- signal
- harmonic
- harm
- hss
- hsd
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H7/00—Instruments in which the tones are synthesised from a data store, e.g. computer organs
- G10H7/08—Instruments in which the tones are synthesised from a data store, e.g. computer organs by calculating functions or polynomial approximations to evaluate amplitudes at successive sample points of a tone waveform
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H3/00—Instruments in which the tones are generated by electromechanical means
- G10H3/12—Instruments in which the tones are generated by electromechanical means using mechanical resonant generators, e.g. strings or percussive instruments, the tones of which are picked up by electromechanical transducers, the electrical signals being further manipulated or amplified and subsequently converted to sound by a loudspeaker or equivalent instrument
- G10H3/125—Extracting or recognising the pitch or fundamental frequency of the picked up signal
Definitions
- the invention relates to a method for characterizing the timbre of a sound signal, according to at least one descriptor.
- the field of the invention is that of characterizing the timbre of a sound signal varying as a function of time.
- the timbre of a sound signal is intuitively characterized by all the perceptual properties excluding the pitch, the perceived intensity and the subjective duration of the sound signal.
- harmonic sound signals such as those produced by a violin, a flute, etc.
- percussive sound signals such as those produced by a drum, etc.
- timbre measurements were carried out: each of these sets of measurements constitutes a timbre space, respectively harmonic or percussive.
- we seek to model the timbre of a sound signal s(t), more precisely its characteristics, also called descriptors, in order to be able, for example, to recognize or locate the timbre of an unknown signal among those known to a sound database.
- the models of these characteristics are generally expressed as a function of the spectral and temporal envelopes of the sound signal s (t) and their variation.
- the sound signal s(t) and the time envelope ET(t) are illustrated in FIG. 1; the spectral envelope ES(f) is illustrated in FIG. 3: it is generally obtained in two steps, first analyzing the signal through a sliding time window, an example of which is shown in FIG. 2, then calculating the fast Fourier transform of the windowed signal.
- the logarithmic attack time (lat or LT), defined as the logarithm of the difference between the instant t0 at which the signal starts and the instant t1 at which the signal stabilizes, as in the case of harmonic sound signals, or reaches its maximum, as in the case of percussive sound signals: $lat = \log_{10}(t_1 - t_0)$; these instants t0 and t1 are shown in FIG.
- t0 is the instant when the signal amplitude reaches 2% of the maximum amplitude
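The definition of the logarithmic attack time above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `log_attack_time`, the list-based envelope and the `start_fraction` parameter are hypothetical, and t1 is taken as the instant of the envelope maximum (the percussive case); detecting where a harmonic signal "stabilizes" is not attempted here.

```python
import math


def log_attack_time(envelope, sample_rate, start_fraction=0.02):
    """Return log10(t1 - t0) for an amplitude envelope sampled at sample_rate Hz.

    t0: first instant at which the envelope reaches start_fraction of its maximum.
    t1: instant of the maximum (assumed here; this is the percussive-case definition).
    """
    peak = max(envelope)
    if peak <= 0:
        raise ValueError("envelope must contain a positive value")
    threshold = start_fraction * peak
    i0 = next(i for i, a in enumerate(envelope) if a >= threshold)
    i1 = list(envelope).index(peak)
    if i1 <= i0:
        raise ValueError("attack has zero duration at this sample rate")
    return math.log10((i1 - i0) / sample_rate)


# Hypothetical envelope: linear ramp reaching its maximum after 100 ms at 1 kHz.
envelope = [n / 100 for n in range(101)] + [1.0] * 50
lat = log_attack_time(envelope, sample_rate=1000)
```

Because the attack here lasts a little under 0.1 s, `lat` comes out slightly below -1 on the decimal logarithmic scale.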
- the harmonic spectral centroid (hsc or SC), defined as the average, over the duration of the signal, of the instantaneous spectral centroid, that is to say the centroid considered in a sliding analysis window
- the instantaneous spectral centroid is itself defined by the weighted average of the harmonic peaks of the spectrum of the signal represented in FIG. 3 and corresponds in a way to the equilibrium point of all the harmonic peaks.
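As a rough illustration of the weighted average described above, the sketch below computes an instantaneous centroid from the frequencies and amplitudes of the harmonic peaks of one window, then averages it over windows. Weighting by linear amplitude is an assumption (the text only says "weighted average"), and the function names and sample values are hypothetical.

```python
def instantaneous_centroid(freqs, amps):
    """Amplitude-weighted mean frequency of the harmonic peaks of one window."""
    total = sum(amps)
    if total <= 0:
        raise ValueError("harmonic amplitudes must sum to a positive value")
    return sum(f * a for f, a in zip(freqs, amps)) / total


def harmonic_spectral_centroid(windows):
    """Average of the instantaneous centroid over all analysis windows.

    `windows` is a list of (freqs, amps) pairs, one pair per window.
    """
    values = [instantaneous_centroid(f, a) for f, a in windows]
    return sum(values) / len(values)


# Two hypothetical windows of a tone with harmonics at 440, 880 and 1320 Hz.
windows = [
    ([440.0, 880.0, 1320.0], [1.0, 0.5, 0.25]),
    ([440.0, 880.0, 1320.0], [1.0, 0.0, 0.0]),
]
hsc = harmonic_spectral_centroid(windows)
```

In the second window only the fundamental carries energy, so its centroid falls back to 440 Hz, which pulls the averaged hsc down.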
- a simple method consists first in extracting the fundamental frequency f0 from the sound signal s(t), then in detecting the harmonic peaks, located around multiples of the fundamental frequency f0, as illustrated in FIG. 3.
- the local fundamental frequency is for example obtained by calculating the normalized autocorrelation function of the local signal s(t); the local fundamental frequency f0 then corresponds to the inverse of the time T0 of the first maximum of this function
- the harmonic spectral deviation or hsd, representative of the spectral irregularity, defined as the average, over the duration of the signal, of the instantaneous harmonic spectral deviation considered in a sliding analysis window; the instantaneous harmonic spectral deviation is itself defined as the spectral deviation of the amplitude peaks (on a logarithmic scale) of the spectrum with respect to the spectral envelope.
- An example of instantaneous harmonic spectral deviation “ihsd” corresponding to the sound signal of a clarinet is illustrated in FIG.
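The autocorrelation-based estimate of the local fundamental frequency mentioned earlier can be sketched as follows. Note the substitution: instead of locating the first maximum of the normalized autocorrelation, this sketch picks the highest value within a bounded lag range, which is a common simplification and not necessarily what the text intends. The function name and the `f_min`/`f_max` search bounds are hypothetical.

```python
import math


def estimate_f0(frame, sample_rate, f_min=50.0, f_max=1000.0):
    """Estimate f0 as sample_rate / lag, where lag maximizes the autocorrelation
    (normalized by the frame energy) over lags matching [f_min, f_max]."""
    n = len(frame)
    energy = sum(x * x for x in frame)
    if energy == 0:
        raise ValueError("frame is silent")
    lag_min = max(1, int(sample_rate / f_max))
    lag_max = min(n - 1, int(sample_rate / f_min))
    best_lag, best_value = None, -math.inf
    for lag in range(lag_min, lag_max + 1):
        r = sum(frame[i] * frame[i + lag] for i in range(n - lag)) / energy
        if r > best_value:
            best_lag, best_value = lag, r
    return sample_rate / best_lag


# Hypothetical test tone: 200 Hz fundamental plus a weaker second harmonic.
sample_rate = 8000
frame = [
    math.sin(2 * math.pi * 200 * i / sample_rate)
    + 0.5 * math.sin(2 * math.pi * 400 * i / sample_rate)
    for i in range(800)
]
f0 = estimate_f0(frame, sample_rate)
```

Once f0 is known, the harmonic peaks would be searched around its integer multiples in the spectrum of the same frame.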
- the harmonic spectral variation or hsv representative of the spectral flux defined as the average over the duration of the signal of the instantaneous harmonic spectral variation considered in an analysis window
- the instantaneous harmonic spectral variation is itself defined as the complement to 1 of the normalized correlation between the amplitude of the harmonics of two adjacent windows.
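A minimal sketch of the harmonic spectral variation, assuming that "normalized correlation" means the uncentered correlation (dot product divided by the product of Euclidean norms) of the harmonic amplitude vectors of two adjacent windows. That reading, the function names and the sample amplitudes are assumptions.

```python
import math


def instantaneous_hsv(prev_amps, curr_amps):
    """1 minus the normalized correlation of two adjacent windows' harmonic amplitudes."""
    dot = sum(a * b for a, b in zip(prev_amps, curr_amps))
    norm = math.sqrt(sum(a * a for a in prev_amps)) * math.sqrt(
        sum(b * b for b in curr_amps)
    )
    if norm == 0:
        raise ValueError("a window has no harmonic energy")
    return 1.0 - dot / norm


def harmonic_spectral_variation(amps_per_window):
    """Average of the instantaneous variation over all pairs of adjacent windows."""
    if len(amps_per_window) < 2:
        raise ValueError("at least two windows are needed")
    pairs = zip(amps_per_window, amps_per_window[1:])
    values = [instantaneous_hsv(p, c) for p, c in pairs]
    return sum(values) / len(values)


# Hypothetical amplitudes of two harmonics over three windows.
hsv = harmonic_spectral_variation([[1.0, 0.5], [1.0, 0.5], [0.5, 1.0]])
```

A spectrum that keeps the same shape from one window to the next contributes zero, regardless of overall level; only changes in the balance between harmonics raise the value.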
- the aim of the present invention is therefore to define new characteristics or descriptors so that, when combined with already known descriptors, they best apply to different timbre spaces and make it possible to best calculate the distance between two sound signals of the same timbre space.
- the subject of the invention is a method for characterizing the timbre of a sound signal s (t) varying as a function of time, for a duration D, according to at least one descriptor, mainly characterized in that it consists in defining said descriptor by the "hss" harmonic spectral range of the signal.
- the calculation of the harmonic spectral range of the signal comprises the following steps:
- a) memorize the signal s(t),
- b) extract its fundamental frequency f0,
- c) calculate and store the harmonics of the signal s(t), truncated according to a time window h(t) of duration less than or equal to D, as a function of frequency by means of a fast Fourier transform device, sliding said time window h(t) over the duration D of the signal s(t),
- d) for each time window h(t), calculate the harmonic spectral range hss(s(t).h(t)) of the truncated signal according to the following formula:
- A(s.h, harm) being the amplitude of the peak of harmonic number harm of the spectrum of the truncated signal s.h,
- nbh being the number of harmonics of the spectrum of the truncated signal s.h,
- hsc(s.h) being the harmonic spectral centroid of the truncated signal s.h; memorize each hss(s.h),
- e) calculate the harmonic spectral range hss(s) of the signal according to the following formula:
- nbf being the number of windows obtained by sliding the window h(t) over the duration D of the signal s(t).
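The formulas referred to in steps d) and e) do not appear in this text. One reconstruction that is consistent with the symbols defined around them (A, nbh, hsc, nbf, and the harmonic frequency f(s.h, harm) defined further on) and with the description of hss as a spreading coefficient around the spectral centroid is sketched below. The squared-amplitude weighting and the normalization by hsc are assumptions, not something this text confirms.

```latex
\mathrm{hss}(s.h) = \frac{1}{\mathrm{hsc}(s.h)}
  \sqrt{\frac{\sum_{harm=1}^{nbh} A^{2}(s.h, harm)\,\bigl[f(s.h, harm) - \mathrm{hsc}(s.h)\bigr]^{2}}
             {\sum_{harm=1}^{nbh} A^{2}(s.h, harm)}}

\mathrm{hss}(s) = \frac{1}{nbf} \sum \mathrm{hss}(s.h)
  \quad \text{(sum over the } nbf \text{ windows)}
```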
- step d) also consists in calculating the harmonic spectral deviation hsd(s(t).h(t)) of the truncated signal according to the following formula:
- step e) then also consists in calculating the harmonic spectral deviation hsd(s) of the signal:
- $hsd(s) = \dfrac{\sum hsd(s.h)}{nbf}$
- the duration of the window h (t) is equal to or almost equal to D and the number of windows nbf is equal to 1.
- the sound signal is a harmonic signal.
- the invention also relates to a method for measuring the "dist" distance between two harmonic sound signals, characterized in that it consists in using the characterization of the signals as described above.
- the characterization of the sound signals being based on the following descriptors: the logarithmic attack time (lat), the harmonic spectral centroid (hsc), the harmonic spectral deviation (hsd) and the harmonic spectral variation (hsv), the distance "dist" is of the form
- $x_1, x_2, x_3, x_4, x_5$ being predetermined coefficients.
- the logarithmic attack time (lat) is calculated on a decimal logarithmic scale and $5 \le x_1 \le 11$, $10^{-5} \le x_2 \le 5 \cdot 10^{-5}$, $10^{-4} \le x_3 \le 5 \cdot 10^{-4}$, $5 \le x_4 \le 15$ and $-90 \le x_5 \le -30$.
- FIG. 1 schematically represents a sound signal s(t) and its time envelope ET(t) as a function of time t
- FIG. 2 schematically represents a time window for sliding analysis h (t)
- FIG. 3 schematically represents harmonic peaks and a spectral envelope ES (f) as a function of the frequency f
- FIG. 4 schematically illustrates the instantaneous harmonic spectral deviation of a clarinet.
- the sound signal s (t) varying as a function of time t and of a duration D, represented in FIG. 1 is analyzed according to a sliding time window h (t) represented in FIG. 2, which can for example be a Hamming window.
- the duration D of the signal is generally of the order of a few seconds, in the case for example of sound samples to be located among those of a database; but it can be much longer.
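The sliding-window analysis described above can be illustrated with a small standard-library sketch: a Hamming window is slid over the signal and a magnitude spectrum is computed for each position. A naive discrete Fourier transform is used here instead of a fast Fourier transform purely to keep the example self-contained; the function names, frame length and hop size are hypothetical.

```python
import cmath
import math


def hamming(n):
    """Symmetric Hamming window of length n."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]


def magnitude_spectrum(frame):
    """Magnitudes of DFT bins 0..n/2 (naive O(n^2) transform, stand-in for an FFT)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def sliding_magnitude_spectra(signal, frame_len, hop):
    """Slide a Hamming window over the signal and return one spectrum per position."""
    window = hamming(frame_len)
    spectra = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + k] * window[k] for k in range(frame_len)]
        spectra.append(magnitude_spectrum(frame))
    return spectra


# Hypothetical signal: a sinusoid falling exactly on bin 8 of a 64-sample frame.
signal = [math.sin(2 * math.pi * 8 * i / 64) for i in range(256)]
spectra = sliding_magnitude_spectra(signal, frame_len=64, hop=32)
peak_bins = [max(range(len(s)), key=s.__getitem__) for s in spectra]
```

Setting `frame_len` to the full signal length yields a single window, which corresponds to the variant in which the window duration equals D.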
- a new descriptor, representative of the harmonic spectral range, is used to contribute to the description of the timbre of a sound signal, preferably a harmonic one, and to make it possible to calculate more precisely the distance between two sound signals of the same harmonic timbre space.
- the harmonic spectral range corresponds to a frequency spreading coefficient of the energy of the harmonic part of the signal, around the spectral centroid.
- f(s.h, harm) being the frequency of harmonic number harm of the spectrum of the truncated signal s.h,
- nbh being the number of harmonics of the spectrum of the truncated signal s.h,
- hsc(s.h) being the harmonic spectral centroid of the truncated signal s.h, calculated according to a method of the prior art of which an example is given below; the hss(s(t).h(t)) values thus obtained are memorized
- the harmonic spectral range of the signal s (t) is calculated as follows:
- nbf being the number of windows obtained by sliding the window h (t) over the duration D of the signal s (t).
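As a hedged illustration of the per-window calculation and the averaging over nbf windows, the sketch below assumes that hss is the squared-amplitude-weighted standard deviation of the harmonic frequencies around the centroid, divided by the centroid. That form is an assumption inferred from the symbols defined above and from the description of hss as a spreading coefficient; the centroid is computed with linear amplitude weights, also an assumption, and all names are hypothetical.

```python
import math


def instantaneous_hss(freqs, amps):
    """Assumed per-window spread: weighted std of harmonic frequencies / centroid."""
    total = sum(amps)
    energy = sum(a * a for a in amps)
    if total <= 0 or energy <= 0:
        raise ValueError("window has no harmonic energy")
    hsc = sum(f * a for f, a in zip(freqs, amps)) / total  # centroid of this window
    variance = sum(a * a * (f - hsc) ** 2 for f, a in zip(freqs, amps)) / energy
    return math.sqrt(variance) / hsc


def harmonic_spectral_spread(windows):
    """Average of the per-window value over the nbf windows.

    `windows` is a list of (freqs, amps) pairs, one pair per window.
    """
    values = [instantaneous_hss(f, a) for f, a in windows]
    return sum(values) / len(values)


# Hypothetical windows: one with two equal harmonics, one with a single harmonic.
windows = [([100.0, 300.0], [1.0, 1.0]), ([100.0], [1.0])]
hss = harmonic_spectral_spread(windows)
```

Dividing by the centroid makes the value dimensionless, so the same spectral shape transposed to another pitch gives the same result under this assumed form.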
- the harmonic spectral range of the signal s (t) is directly calculated over the duration D of the signal. This amounts to saying that the duration of the analysis window h (t) is equal to or almost equal to the duration D of the signal and that the number of windows is then equal to 1.
- this new descriptor can advantageously be combined with the other descriptors lat, hsc, hsd and hsv of the prior art, for example to calculate the distance "dist" between two sound signals of the same harmonic timbre space according to the following formula:
- Δ is the difference between the values of the same descriptor for the two sound signals considered, and x1, x2, x3, x4 and x5 are predetermined coefficients.
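The distance formula itself is missing from this text. The sketch below assumes a weighted Euclidean form in which the lat, hsc and hsd differences are squared separately while the hss and hsv differences are combined linearly before squaring, that is sqrt(x1·Δlat² + x2·Δhsc² + x3·Δhsd² + (x4·Δhss + x5·Δhsv)²). Both this form and the default coefficients (midpoints of the ranges quoted in the text) are assumptions chosen for illustration; the `Timbre` type and function name are hypothetical.

```python
import math
from typing import NamedTuple


class Timbre(NamedTuple):
    """Descriptor values of one sound signal."""
    lat: float
    hsc: float
    hsd: float
    hss: float
    hsv: float


def timbre_distance(a, b, coeffs=(8.0, 3e-5, 3e-4, 10.0, -60.0)):
    """Assumed distance between two signals of the same harmonic timbre space."""
    x1, x2, x3, x4, x5 = coeffs
    d = Timbre(*(p - q for p, q in zip(a, b)))  # per-descriptor differences
    return math.sqrt(
        x1 * d.lat ** 2
        + x2 * d.hsc ** 2
        + x3 * d.hsd ** 2
        + (x4 * d.hss + x5 * d.hsv) ** 2
    )


# Two hypothetical signals that differ only in their logarithmic attack time.
first = Timbre(lat=-1.0, hsc=700.0, hsd=0.1, hss=0.5, hsv=0.02)
second = Timbre(lat=-2.0, hsc=700.0, hsd=0.1, hss=0.5, hsv=0.02)
dist = timbre_distance(first, second)
```

The very small x2 compensates for hsc being expressed in hertz, whereas the other descriptors are of order one.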
- step d) of the calculation of hss will advantageously be completed by the following calculation:
- SE(s.h, harm) being the local spectral envelope of the truncated signal (with an amplitude on a logarithmic scale) around the peak of harmonic number harm, obtained according to a method known to those skilled in the art.
- $hsd(s) = \dfrac{\sum hsd(s.h)}{nbf}$
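A possible sketch of the harmonic spectral deviation follows. Two points are assumptions, since the per-window formula is not present in this text: the local spectral envelope is estimated as the mean of each harmonic amplitude and its immediate neighbours, and the per-window value is the mean absolute difference between the log10 amplitudes and the log10 envelope. Only the final averaging over the nbf windows follows the formula above. All names are hypothetical.

```python
import math


def local_envelope(amps):
    """Assumed envelope: mean of each harmonic amplitude and its direct neighbours."""
    envelope = []
    for i in range(len(amps)):
        neighbours = amps[max(0, i - 1): i + 2]
        envelope.append(sum(neighbours) / len(neighbours))
    return envelope


def instantaneous_hsd(amps):
    """Assumed per-window deviation of log amplitudes from the log envelope."""
    envelope = local_envelope(amps)
    deviations = [
        abs(math.log10(a) - math.log10(e)) for a, e in zip(amps, envelope)
    ]
    return sum(deviations) / len(amps)


def harmonic_spectral_deviation(amps_per_window):
    """hsd(s): sum of the per-window values divided by the number of windows nbf."""
    values = [instantaneous_hsd(a) for a in amps_per_window]
    return sum(values) / len(values)


# Hypothetical windows: a flat spectrum and one with weak even harmonics.
flat = [1.0, 1.0, 1.0, 1.0]
irregular = [1.0, 0.1, 1.0, 0.1]
hsd = harmonic_spectral_deviation([flat, irregular])
```

Amplitudes must be strictly positive because of the logarithm; a real implementation would need a floor for missing harmonics.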
- step d) of the calculation of hss by the following calculation known to those skilled in the art:
- the distance was notably measured by calculating the descriptors according to the aforementioned formulas, the logarithmic attack time, lat, being calculated on a logarithmic decimal scale, and by taking the coefficients in the following ranges:
Landscapes
- Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Algebra (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Mathematical Physics (AREA)
- Pure & Applied Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Electrophonic Musical Instruments (AREA)
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2003531457A JP4242281B2 (ja) | 2001-09-26 | 2002-09-26 | 少なくとも1つの記述子に基づいて音響信号の音色を特徴付けるための方法 |
US10/490,607 US7406356B2 (en) | 2001-09-26 | 2002-09-26 | Method for characterizing the timbre of a sound signal in accordance with at least a descriptor |
EP02799430A EP1438707A2 (fr) | 2001-09-26 | 2002-09-26 | Procede de caracterisation du timbre d'un signal sonore selon au moins un descripteur |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
FR01/12384 | 2001-09-26 | ||
FR0112384A FR2830118B1 (fr) | 2001-09-26 | 2001-09-26 | Procede de caracterisation du timbre d'un signal sonore selon au moins un descripteur |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2003028005A2 true WO2003028005A2 (fr) | 2003-04-03 |
WO2003028005A3 WO2003028005A3 (fr) | 2003-09-25 |
Family
ID=8867628
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/FR2002/003291 WO2003028005A2 (fr) | 2001-09-26 | 2002-09-26 | Procede de caracterisation du timbre d'un signal sonore selon au moins un descripteur |
Country Status (5)
Country | Link |
---|---|
US (1) | US7406356B2 (fr) |
EP (1) | EP1438707A2 (fr) |
JP (1) | JP4242281B2 (fr) |
FR (1) | FR2830118B1 (fr) |
WO (1) | WO2003028005A2 (fr) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090048828A1 (en) * | 2007-08-15 | 2009-02-19 | University Of Washington | Gap interpolation in acoustic signals using coherent demodulation |
US8126578B2 (en) * | 2007-09-26 | 2012-02-28 | University Of Washington | Clipped-waveform repair in acoustic signals using generalized linear prediction |
US8247677B2 (en) * | 2010-06-17 | 2012-08-21 | Ludwig Lester F | Multi-channel data sonification system with partitioned timbre spaces and modulation techniques |
US10186247B1 (en) | 2018-03-13 | 2019-01-22 | The Nielsen Company (Us), Llc | Methods and apparatus to extract a pitch-independent timbre attribute from a media signal |
US11158297B2 (en) | 2020-01-13 | 2021-10-26 | International Business Machines Corporation | Timbre creation system |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4384335A (en) * | 1978-12-14 | 1983-05-17 | U.S. Philips Corporation | Method of and system for determining the pitch in human speech |
FR2639459A1 (fr) * | 1988-11-19 | 1990-05-25 | Sony Corp | Procede de traitement du signal et appareil de formation de donnees issues d'une source sonore |
US5327518A (en) * | 1991-08-22 | 1994-07-05 | Georgia Tech Research Corporation | Audio analysis/synthesis system |
US5479564A (en) * | 1991-08-09 | 1995-12-26 | U.S. Philips Corporation | Method and apparatus for manipulating pitch and/or duration of a signal |
US6182042B1 (en) * | 1998-07-07 | 2001-01-30 | Creative Technology Ltd. | Sound modification employing spectral warping techniques |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE19505435C1 (de) * | 1995-02-17 | 1995-12-07 | Fraunhofer Ges Forschung | Verfahren und Vorrichtung zum Bestimmen der Tonalität eines Audiosignals |
-
2001
- 2001-09-26 FR FR0112384A patent/FR2830118B1/fr not_active Expired - Fee Related
-
2002
- 2002-09-26 EP EP02799430A patent/EP1438707A2/fr not_active Withdrawn
- 2002-09-26 US US10/490,607 patent/US7406356B2/en not_active Expired - Fee Related
- 2002-09-26 JP JP2003531457A patent/JP4242281B2/ja not_active Expired - Fee Related
- 2002-09-26 WO PCT/FR2002/003291 patent/WO2003028005A2/fr active Application Filing
Also Published As
Publication number | Publication date |
---|---|
FR2830118B1 (fr) | 2004-07-30 |
FR2830118A1 (fr) | 2003-03-28 |
WO2003028005A3 (fr) | 2003-09-25 |
JP2005504347A (ja) | 2005-02-10 |
EP1438707A2 (fr) | 2004-07-21 |
US7406356B2 (en) | 2008-07-29 |
US20040220799A1 (en) | 2004-11-04 |
JP4242281B2 (ja) | 2009-03-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8140331B2 (en) | Feature extraction for identification and classification of audio signals | |
EP2659481B1 (fr) | Détection d'un changement de scène autour d'un ensemble de points de départ dans des données multimédia | |
US8440900B2 (en) | Intervalgram representation of audio for melody recognition | |
US20140330556A1 (en) | Low complexity repetition detection in media data | |
WO2004095315A1 (fr) | Analyse de caracteristiques temporelles parametrees | |
US20110067555A1 (en) | Tempo detecting device and tempo detecting program | |
CN112394224B (zh) | 音频文件产生时间溯源动态匹配方法及系统 | |
AU2019335404B2 (en) | Methods and apparatus to fingerprint an audio signal via normalization | |
WO2003028005A2 (fr) | Procede de caracterisation du timbre d'un signal sonore selon au moins un descripteur | |
US20230350943A1 (en) | Methods and apparatus to identify media that has been pitch shifted, time shifted, and/or resampled | |
WO2006032751A1 (fr) | Procede et dispositif d'evaluation de l'efficacite d'une fonction de reduction de bruit destinee a etre appliquee a des signaux audio | |
Rauhala et al. | F0 estimation of inharmonic piano tones using partial frequencies deviation method | |
CN117714960A (zh) | 麦克风模组的检测方法、检测装置、车辆及存储介质 | |
EP3155609A1 (fr) | Analyse frequentielle par demolation de phase d'un signal acoustique | |
WO2011128582A2 (fr) | Procédé et système d'analyse et de codage de signaux audiophoniques |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| AK | Designated states | Kind code of ref document: A2; Designated state(s): JP |
| AL | Designated countries for regional patents | Kind code of ref document: A2; Designated state(s): AT BE BG CH CY CZ DE DK EE ES FR GB GR IE IT LU MC NL PT SE SK TR |
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
| DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | |
| WWE | Wipo information: entry into national phase | Ref document number: 10490607; Country of ref document: US |
| WWE | Wipo information: entry into national phase | Ref document number: 2003531457; Country of ref document: JP |
| WWE | Wipo information: entry into national phase | Ref document number: 2002799430; Country of ref document: EP |
| WWP | Wipo information: published in national office | Ref document number: 2002799430; Country of ref document: EP |