WO2001004870A1 - Method of automatic recognition of musical compositions and sound signals - Google Patents
Method of automatic recognition of musical compositions and sound signals
- Publication number
- WO2001004870A1 — application PCT/GR2000/000024
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- signal
- vectors
- model
- unknown
- group
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H1/00—Details of electrophonic musical instruments
- G10H1/02—Means for controlling the tone frequencies, e.g. attack or decay; Means for producing special musical effects, e.g. vibratos or glissandos
- G10H1/06—Circuits for establishing the harmonic content of tones, or other arrangements for changing the tone colour
- G10H1/12—Circuits for establishing the harmonic content of tones, or other arrangements for changing the tone colour by filtering complex waveforms
- G10H1/125—Circuits for establishing the harmonic content of tones, or other arrangements for changing the tone colour by filtering complex waveforms using a digital filter
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H2240/00—Data organisation or data communication aspects, specifically adapted for electrophonic musical tools or instruments
- G10H2240/121—Musical libraries, i.e. musical databases indexed by musical parameters, wavetables, indexing schemes using musical parameters, musical rule bases or knowledge bases, e.g. for automatic composing methods
- G10H2240/131—Library retrieval, i.e. searching a database or selecting a specific musical piece, segment, pattern, rule or parameter set
- G10H2240/141—Library retrieval matching, i.e. any of the steps of matching an inputted segment or phrase with musical database contents, e.g. query by humming, singing or playing; the steps may include, e.g. musical analysis of the input, musical feature extraction, query formulation, or details of the retrieval process
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H2250/00—Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
- G10H2250/131—Mathematical functions for musical analysis, processing, synthesis or composition
- G10H2250/215—Transforms, i.e. mathematical transforms into domains appropriate for musical signal processing, coding or compression
- G10H2250/235—Fourier transform; Discrete Fourier Transform [DFT]; Fast Fourier Transform [FFT]
Abstract
The invention refers to a method of automatic recognition of musical compositions and sound signals, which is used for the identification of musical compositions and sound signals played by radio or TV, or performed in public places. According to this invention, a desirably large number of musical compositions and sound signals which we want to identify is selected. To every one of these signals an original procedure is applied, leading to the extraction of a set of characteristics which will finally represent a model signal. Subsequently, for the implementation of the recognition, the unknown musical composition or sound signal is received and digitised. To its digitised version the same procedure of extracting a set of characteristics is applied. These are compared with the corresponding sets of the model signals and, by means of original criteria, it is decided whether there is a model signal that corresponds to the unknown signal under consideration and, if so, which model signal exactly corresponds to the unknown one.
Description
Method of Automatic Recognition of Musical Compositions and
Sound Signals
This invention refers to a method of automatic recognition of musical compositions and sound signals, and it is used to identify musical compositions and sound signals transmitted by radio or TV and/or performed in public places.
In the past, efforts to develop methods for the automatic recognition of musical compositions and sound signals have been made, leading to the creation of systems performing this task. However, these methods and the related systems exhibit a low percentage of successful recognition, both for musical compositions and for the sound signals of interest. The introduced method offers a much better percentage of fully automatic recognition, greater than or equal to ninety-eight percent (98%). According to this invention, a desirably large number of musical compositions and sound signals which we want to identify is selected. For easy reference we will refer to these compositions and signals with the term "model signals". To every one of these signals an original procedure is applied, leading to the extraction of a set of characteristics which will finally represent each model signal. Subsequently, for the implementation of the recognition, the unknown musical composition or sound signal is received, and the same procedure of extracting a corresponding set of characteristics is applied to it. These characteristics are compared with the corresponding sets of characteristics of the model signals and, by means of a number of original criteria, it is decided whether one (and which one exactly) of the model signals corresponds to the unknown signal under consideration. This procedure is depicted in figure 1.
It is stressed that, to the best of our knowledge, no similar method or related system has been reported in the international bibliography. In the world market there are very few similar systems, and they offer a percentage of successful recognition lower than sixty percent (60%).
The invention is described more thoroughly below:
First, the whole frequency band from 0 to 11025 Hz is divided into sub-bands that are almost exponentially distributed. An implementation of such a division is presented in Table 1. According to this implementation, the whole frequency band from 0 to 11025 Hz is divided into 60 sub-bands.
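Table 1 itself is not reproduced in this text, so, as an illustration only, the sketch below generates an almost exponential division of 0 to 11025 Hz into 60 sub-bands; the exact edges of Table 1 are an assumption here (geometric spacing above a low-frequency floor), not the patented division.

```python
import numpy as np

def subband_edges(f_max=11025.0, n_bands=60, f_low=50.0):
    # First edge at f_low, then geometric growth up to f_max;
    # prepend 0 Hz so that n_bands sections cover the whole band.
    ratio = (f_max / f_low) ** (1.0 / (n_bands - 1))
    edges = f_low * ratio ** np.arange(n_bands)
    return np.concatenate(([0.0], edges))  # n_bands + 1 boundaries

edges = subband_edges()
print(len(edges) - 1, "sub-bands; last three edges:", edges[-3:])
```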
Subsequently, each model signal is digitised with an arbitrary sampling frequency F_s, preferably greater than or equal to 11025 Hz, and a window of 8192, 16384 or 32768 samples length slides on the obtained digitised signal. In every such window, an adaptive Fast Fourier Transform is applied and the absolute value of the Discrete Fourier Transform is obtained. Next, the frequency-domain window is divided into sections according to the aforementioned choice of frequency sub-bands (see Table 1) and then, in every such section, all the peaks of the absolute value of the Fourier transform are spotted and the greatest one is selected. The value of this peak is called the "section representative". Then the L "representatives" with the greatest values are spotted, where the value of L may vary from 13 to 30, the most frequently used value being L = 20. The indicators of the sections corresponding to these representatives, sorted in increasing order, form a vector, which constitutes the "representative-vector" of the window. The above procedure is repeated while the window slides along the whole digitised model signal, thus creating all the representative vectors for the specific model signal. Notice that, while the window slides on the model signal, the generated representative vectors often remain unchanged in two successive windows, successive in the sense that their starting positions differ by one sample. For this reason, to every representative vector we assign a number indicating the number of subsequent windows in which the specific vector remained unchanged; for that number we will use the name "number of repetitions" of the representative vector. For the set of the generated representative vectors of each model signal we will use the name "the model signal set of representatives". The aforementioned procedure is depicted in figure 2.
For the identification of the unknown sound signal, which from now on will be called the "unknown signal", the following procedure is used: a part of the unknown signal, of length varying from eight (8) to sixteen (16) seconds, is received, digitised and registered, at least temporarily. At the beginning of that unknown signal part a window of length W = 8192 or W = 16384 or W = 32768 samples is obtained; notice that in any case this window will be of the same length as the sliding window which was used for the model signals. In this window a Fast Fourier Transform is applied and its absolute value is obtained. Afterwards, all the peaks of the absolute value of the Fourier transform are spotted and S copies of these peaks are created. For the creation of every copy of the peaks, the positions of the peaks are multiplied by a different coefficient f_i, i = 0, 1, ..., S, which is called the "window shift coefficient". Thus, S+1 different groups of peaks are created. For every one of these groups the following procedure is realised: the section to which each peak corresponds, according to the aforementioned frequency sub-bands division, is spotted (see Table 1). For every section to which at least one peak corresponds, the greatest peak is kept. The value of this peak is called the "representative of the section of the unknown signal corresponding to the shift coefficient f_i".
Next, the L representatives with the greatest values are spotted, where the value of L is the same as the one used for the model signals. The indicators of the sections corresponding to these representatives, sorted in increasing order, form a vector, which constitutes the "first representative vector of the unknown signal corresponding to the shift coefficient f_i".
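The distinctive ingredient for the unknown signal is the multiplication of the peak positions by a shift coefficient before the section lookup; a hedged sketch (the helper name and the simple neighbour-comparison peak detection are assumptions):

```python
import numpy as np

def shifted_section_representatives(spectrum, freqs, edges, f_shift):
    """Assign spectral peaks to sub-band sections after multiplying their
    positions by the window shift coefficient; keep the greatest peak
    per section."""
    # local maxima of the magnitude spectrum
    idx = np.where((spectrum[1:-1] > spectrum[:-2]) &
                   (spectrum[1:-1] > spectrum[2:]))[0] + 1
    reps = {}
    for i in idx:
        f = freqs[i] * f_shift                            # shifted peak position
        k = int(np.searchsorted(edges, f, side="right")) - 1
        if 0 <= k < len(edges) - 1:                       # inside some section
            reps[k + 1] = max(reps.get(k + 1, 0.0), float(spectrum[i]))
    return reps  # {section indicator: representative value}
```

Applying this once per shift coefficient f_i yields the S+1 groups of peaks; taking the top-L indicators of each group, sorted in increasing order, gives the corresponding representative vectors.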
Afterwards, the window slides by ℓ_1 samples, where the value of ℓ_1 may vary from 0.55 * F_s to 1.9 * F_s samples, the most frequently used value being ℓ_1 = 1.4 * F_s. For the new window position and for every shift coefficient f_i, i = 0, 1, ..., S, (S+1) vectors are computed in the way described above; each such vector will be called the "second representative vector of the unknown signal corresponding to the shift coefficient f_i". The above procedure is repeated for M-2 more windows, where each window starts at a sample having a distance of ℓ_i samples from the start of the previous one, i = 2, 3, ..., M-1, and where the value of M may fluctuate between 7 and 13 windows, the most usual value being M = 9. In this way S+1 groups of M representative vectors are obtained; for each such group we will employ the name "group of unknown signal representative vectors corresponding to the shift coefficient f_i".
It must be stressed that, for a specific application, the ℓ_i values, i = 1, 2, ..., M-1, are not necessarily equal, but they must be kept fixed throughout the whole procedure. The exact number (S+1) of the shift coefficients f_i varies from 1 to 15, while their values are given by the formula:

f_0 = 1, and, for i = 1, 2, ..., S:
f_i = 1 + ((i+1)/2) * STEP, if i is odd,
f_i = 1 - (i/2) * STEP, if i is even,

where STEP is a parameter expressing the shift step, which usually belongs to the interval [0.005, 0.01], the most frequently used value being 0.0075. The identification procedure described so far is depicted in figure 3.
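A direct transcription of this formula (a sketch; the printed values follow from the default STEP = 0.0075):

```python
def shift_coefficients(S, STEP=0.0075):
    """f_0 = 1; odd i stretch the peak positions, even i compress them."""
    f = [1.0]
    for i in range(1, S + 1):
        if i % 2 == 1:                      # i odd
            f.append(1.0 + ((i + 1) // 2) * STEP)
        else:                               # i even
            f.append(1.0 - (i // 2) * STEP)
    return f

print(shift_coefficients(4))   # [1.0, 1.0075, 0.9925, 1.015, 0.985]
```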
For the realisation of the unknown signal recognition, each group of unknown signal representatives is compared with elements of the set of representatives of each model signal separately. To fix ideas, each of the S+1 groups of M unknown signal representatives is compared with groups of M model signal representatives by means of the method consisting of the following steps:
E1) If the first representative vector of one group of the unknown signal is called V1 and the first representative vector of the model signal is called U1, then initially the number of the common elements between these two vectors is calculated. For example, if L = 20 and

V1 = [60 55 52 49 47 43 39 34 33 30 29 22 20 17 14 11 9 5 2 1]
U1 = [60 58 55 49 47 41 39 37 33 30 28 25 20 17 14 11 9 6 4 2]

then the number of the common elements is thirteen (13). Subsequently, it is checked whether the number of the common elements between the vectors V1 and U1 is greater than or equal to the number 0.51*L, which is called the "requisite similarity threshold". If, indeed, it is greater than or equal to 0.51*L, we proceed to step E2 below. If it is smaller than 0.51*L, then we consider that the set of the tests performed so far did not result in a successful recognition, so, after taking the next representative-vector of the model signal as U1, we start the comparison procedure again, beginning from the comparison of the vector V1 with the new U1.
E2) If the second representative vector of the unknown signal, corresponding to the same shift coefficient f_i as V1, is called V2, and the representative vector of the model signal corresponding to the sample (ℓ_1 * f_i) is called U2, then we calculate the number of the common elements between these two vectors. Afterwards, we check whether the number of the common elements between the vectors V2 and U2 is greater than or equal to the "requisite similarity threshold". If it is greater or equal, we proceed to step E3 below. If it is smaller, then we consider that the set of tests performed so far did not result in a successful recognition, so, after taking the next representative-vector of the model signal as U1, the comparison procedure starts again, beginning from the comparison of the vector V1 with the new U1.
(The intermediate steps E3 through E(M-2) are analogous.)
E(M-1)) If the (M-1)-th representative vector of the unknown signal corresponding to the same shift coefficient f_i as V1 is called V(M-1), and the representative vector of the model signal corresponding to the sample ((ℓ_1 + ... + ℓ_(M-2)) * f_i) is called U(M-1), then we calculate the number of the common elements between these two vectors. Next, we check whether the number of the common elements between the vectors V(M-1) and U(M-1) is greater than or equal to the "requisite similarity threshold". If it is greater or equal, we proceed to step EM below. If it is smaller, then we consider that the set of tests performed so far did not result in a successful recognition, so, after taking the next representative-vector of the model signal as U1, the comparison procedure starts again, beginning from the comparison of the vector V1 with the new U1.
EM) If the M-th representative vector of the unknown signal corresponding to the same shift coefficient f_i as V1 is called VM, and the representative vector of the model signal corresponding to the sample ((ℓ_1 + ... + ℓ_(M-1)) * f_i) is called UM, then we calculate the number of the common elements between these two vectors VM and UM and we check whether it is greater than or equal to the "requisite similarity threshold". If it is greater or equal, we proceed to step E(M+1) below. If it is smaller, then we consider that the set of tests performed so far did not result in a successful recognition, so, after taking the next representative-vector of the model signal as U1, the checking procedure starts again, beginning from the comparison of the vector V1 with the new U1.
E(M+1)) First we check how many of the pairs (V1,U1), (V2,U2), ..., (VM,UM) have, according to the previous comparisons, a number of common elements in the interval [0.51*L, 0.71*L]. If the number of these pairs is greater than 0.34*M, then we consider that the set of tests performed so far did not result in a successful recognition, so, after taking the next representative-vector of the model signal as U1, the comparison procedure starts again, beginning from the comparison of the vector V1 with the new U1. If the number of these pairs is smaller than or equal to 0.34*M, then the following check is realised: for the pairs of vectors (V1,U1), (V2,U2), ..., (VM,UM), which have already been compared, we calculate the mean value of the number of the common elements. If this mean value is greater than or equal to 0.71*L, then we consider that the comparison between the group of the M representatives of the model signal that we checked and the group of representatives of the unknown signal corresponding to the shift coefficient f_i is successful. If the mean value is smaller than 0.71*L, then we consider that the set of tests performed so far did not result in a successful recognition, so, after taking the next representative-vector of the model signal as U1, the comparison procedure starts again, beginning from the comparison of the vector V1 with the new U1. If all possible vectors of the model signal are unsuccessfully compared with one group of representatives of the unknown signal corresponding to the specific shift coefficient f_i, then we repeat the comparison procedure using the group of representatives of the unknown signal corresponding to the next shift coefficient f_(i+1). If the comparison of a specific set of model vectors with all (S+1) groups of representatives of the unknown signal is unsuccessful, then we proceed to the comparison of the unknown signal with another set of model vectors.
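Condensed into code, the decision of step E(M+1) might read as follows (a sketch; `common` is assumed to hold the per-pair common-element counts obtained in steps E1 through EM):

```python
def first_criterion_passes(common, L, M):
    """E(M+1): reject if too many pairs are only moderately similar,
    otherwise demand a high mean number of common elements."""
    moderate = sum(1 for c in common if 0.51 * L <= c <= 0.71 * L)
    if moderate > 0.34 * M:
        return False                      # too many borderline pairs
    return sum(common) / M >= 0.71 * L    # mean common elements must be high

# Example: M = 9 pairs, L = 20
print(first_criterion_passes([15, 16, 14, 15, 17, 15, 16, 14, 16], L=20, M=9))
```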
If the result of the above comparison is successful for a group of the unknown signal corresponding to a specific shift coefficient, let us say f_ε, we proceed to the application of the irrevocable comparison criterion, which is described below. As already mentioned, the successful application of the first criterion results in the determination of a group of M representatives of the model signal U1, U2, ..., UM which "fit" the group of representatives of the unknown signal V1, V2, ..., VM corresponding to the specific shift coefficient f_ε. Since the positions of these vectors in their corresponding signals are now known, it is possible to realise a sequence of comparisons between vectors of the unknown signal, corresponding to the specific shift coefficient f_ε, and the vectors of the model signal formed at the specific positions where the first criterion was satisfied.
In this way, in the digitised unknown signal of duration from eight (8) to sixteen (16) seconds, a window of length W is obtained, beginning at the unknown signal starting point. In this window a Fast Fourier Transform is applied again and its absolute value is obtained. Subsequently, the peaks of the Fourier transform are spotted and their positions are multiplied by the shift coefficient f_ε, which has previously been verified to satisfy the first criterion. Then, in each section, the peaks are sorted according to their value, and in each section to which at least one peak has been ascribed, the greatest peak is kept as the representative of the section of the unknown signal. Next, the L representatives with the greatest values are spotted, where the value of L is the same as the one used in the first criterion. The indicators of the sections corresponding to these representatives, sorted in increasing order, form a vector that constitutes the "first irrevocable representative-vector of the unknown signal".
Then the window slides by k_1 samples, where the value of k_1 is equal to ℓ_1 * (M-1)/(D-1), and the value of D fluctuates between 30 and 50. For the new window position a new vector is calculated, in the same way as described before, called the "second irrevocable representative-vector of the unknown signal". The above procedure is repeated for D-2 more windows, each one starting at a distance of k_i samples from the start of its previous window, where k_i = ℓ_i * (M-1)/(D-1), i = 2, 3, ..., D-1.
In this way, finally, a group consisting of D representative-vectors is created. We will refer to this group with the name "irrevocable group of representatives of the unknown signal".
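Under the reconstruction above, and assuming a fixed first-criterion step ℓ_1 (so that all k_i coincide), the irrevocable window step can be computed as in this sketch:

```python
def irrevocable_step(l1, M, D):
    """k = l1 * (M - 1) / (D - 1): the D irrevocable windows span roughly
    the same stretch of the unknown signal as the M first-criterion windows."""
    return int(l1 * (M - 1) / (D - 1))

# Example with the most frequently used values: l1 = 1.4 * F_s, M = 9, D = 33
Fs = 11025
print(irrevocable_step(int(1.4 * Fs), M=9, D=33))   # 3858 samples per slide
```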
In order to reach the final decision whether the unknown signal corresponds to the model signal at hand, the irrevocable group of representatives of the unknown signal is compared to elements of the set of representatives of the model signal, by means of a method similar to the first criterion, consisting of the steps briefly described below:
T1) If the first irrevocable representative-vector of the unknown signal is called V1, and U1 is the representative-vector of the model signal corresponding to the position, let us say Λ1, where the first criterion was satisfied, then initially we calculate the number of the common elements between these two vectors.
T2) If the second irrevocable representative-vector of the unknown signal is called V2, then this vector is compared with the vector U2, which is the representative vector of the model signal corresponding to the position Λ1 + k_1 * f_ε, where f_ε is the shift coefficient that has been determined by the first criterion. (The intermediate steps T3 through T(D-2) are analogous.)
T(D-1)) If the (D-1)-th irrevocable representative-vector of the unknown signal is called V(D-1), and the representative-vector of the model signal corresponding to the sample Λ1 + (k_1 + ... + k_(D-2)) * f_ε is called U(D-1), then we calculate the number of the common elements between these two vectors.
Finally, having calculated the number of the common elements for these D pairs of vectors, in order to decide on the identification we check whether the two conditions stated below are satisfied:
[Condition 1] At least 0.825 * D of the pairs of vectors have a number of common elements greater than 0.71 * L.
[Condition 2] The total number of the common elements of the vectors, namely the sum of the common elements of the pairs (V1,U1), (V2,U2), ..., (V(D-1),U(D-1)), is greater than 0.6875 * D * L.
If these two conditions are satisfied, then we have successfully recognised that the specific musical composition corresponds to the model signal at hand. The whole identification procedure is depicted in Figures 3, 4 and 5.
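The two final conditions also translate directly into code (a sketch; `common` here holds the per-pair common-element counts for the D irrevocable pairs):

```python
def irrevocable_criterion_passes(common, L, D):
    """Condition 1: at least 0.825*D pairs share more than 0.71*L elements.
    Condition 2: the total number of shared elements exceeds 0.6875*D*L."""
    strong_pairs = sum(1 for c in common if c > 0.71 * L)
    return strong_pairs >= 0.825 * D and sum(common) > 0.6875 * D * L

# Example: D = 33, L = 20 -> need >= 27.225 strong pairs and total > 453.75
common = [16] * 30 + [12] * 3            # 30 strong pairs, total 516
print(irrevocable_criterion_passes(common, L=20, D=33))   # True
```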
Table 1
Claims
The method for the automatic recognition of musical compositions and sound signals, which is used for the identification of musical compositions and sound signals played by radio or TV or performed in public places, is based on the existence of a procedure which is applied to the model signals and results in the extraction of a set of characteristics that will finally represent each model signal. Besides, it is based on a similar procedure, which is applied to the unknown musical composition or sound signal for the extraction of similar characteristics, and, finally, it is based on a procedure of comparison performed between the representative sets of characteristics of the model and the unknown signal. This method is characterised by the model sets of characteristics corresponding to a division of the frequency domain into bands. It is also characterised by two original criteria for the identification decision, according to which two musical compositions or sound signals are identified only when: a) A group of M representative vectors of the model signal U1, U2, ..., UM, where two successive vectors are calculated at samples having distance ℓ_i, i = 1, 2, ..., M-1, "matches" a group of representatives of the unknown signal V1, V2, ..., VM which corresponds to a specific shift coefficient f_ε. Notice that the values of ℓ_i, i = 1, 2, ..., M-1, are not necessarily equal but, in any case, are kept fixed throughout the application. The matching between U1, U2, ..., UM and V1, V2, ..., VM is realised by means of the following criterion:
All comparisons between the vectors of the pairs (V1,U1), (V2,U2), ..., (VM,UM) are made and the number of pairs with common elements in the interval [0.51 * L, 0.71 * L] is computed. If it is greater than 0.34 * M, then we consider that the set of comparisons performed so far did not result in a successful recognition. If this number is smaller than or equal to 0.34 * M, then it is checked whether the mean value of the number of common elements of the vectors of the above pairs (V1,U1), (V2,U2), ..., (VM,UM) is greater than or equal to 0.71 * L. If it is, then we consider that the comparison between the group of the M representatives of the model signal at hand, corresponding to the shift coefficient f_ε, and the group of representatives of the unknown signal is successful. b) A second group of D irrevocable representative-vectors of the model signal
U1, U2, ..., UD, each calculated at a distance k_j from its previous one, where k_j = ℓ_j * (M-1)/(D-1), j = 1, 2, ..., D-1, which are not necessarily equal but are, in any case, kept fixed throughout the application, "matches" a group of representatives of the unknown signal V1, V2, ..., VD which corresponds to the specific shift coefficient f_ε, according to the following criterion:
- At least 0.825 * D of the pairs of vectors (V1,U1), (V2,U2), ..., (V(D-1),U(D-1)) have a number of common elements greater than 0.71 * L.
- The total number of the common elements of the vectors (namely the one that results from the summation of the common elements of the pairs (V1,U1), (V2,U2), ..., (V(D-1),U(D-1))) is greater than 0.6875 * D * L.
If both these criteria (a) and (b) are satisfied, then we have successfully recognised the specific musical composition.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP00940675A EP1147511A1 (en) | 1999-07-08 | 2000-07-07 | Method of automatic recognition of musical compositions and sound signals |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GR99100235 | 1999-07-08 | ||
GR990100235 | 1999-07-08 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2001004870A1 (en) | 2001-01-18 |
Family
ID=10943871
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/GR2000/000024 WO2001004870A1 (en) | 1999-07-08 | 2000-07-07 | Method of automatic recognition of musical compositions and sound signals |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP1147511A1 (en) |
GR (1) | GR1003625B (en) |
WO (1) | WO2001004870A1 (en) |
1999
- 1999-07-08 GR GR990100235A patent/GR1003625B/en not_active IP Right Cessation

2000
- 2000-07-07 EP EP00940675A patent/EP1147511A1/en not_active Withdrawn
- 2000-07-07 WO PCT/GR2000/000024 patent/WO2001004870A1/en not_active Application Discontinuation
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5210820A (en) * | 1990-05-02 | 1993-05-11 | Broadcast Data Systems Limited Partnership | Signal recognition system and method |
US5874686A (en) * | 1995-10-31 | 1999-02-23 | Ghias; Asif U. | Apparatus and method for searching a melody |
US5778335A (en) * | 1996-02-26 | 1998-07-07 | The Regents Of The University Of California | Method and apparatus for efficient multiband celp wideband speech and music coding and decoding |
Cited By (47)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8190435B2 (en) | 2000-07-31 | 2012-05-29 | Shazam Investments Limited | System and methods for recognizing sound and music signals in high noise and distortion |
US6990453B2 (en) | 2000-07-31 | 2006-01-24 | Landmark Digital Services Llc | System and methods for recognizing sound and music signals in high noise and distortion |
US8725829B2 (en) | 2000-07-31 | 2014-05-13 | Shazam Investments Limited | Method and system for identifying sound signals |
US7346512B2 (en) | 2000-07-31 | 2008-03-18 | Landmark Digital Services, Llc | Methods for recognizing unknown media samples using characteristics of known media samples |
US7865368B2 (en) | 2000-07-31 | 2011-01-04 | Landmark Digital Services, Llc | System and methods for recognizing sound and music signals in high noise and distortion |
US8700407B2 (en) | 2000-07-31 | 2014-04-15 | Shazam Investments Limited | Systems and methods for recognizing sound and music signals in high noise and distortion |
WO2002011123A3 (en) * | 2000-07-31 | 2002-05-30 | Shazam Entertainment Ltd | Method for search in an audio database |
US10497378B2 (en) | 2000-07-31 | 2019-12-03 | Apple Inc. | Systems and methods for recognizing sound and music signals in high noise and distortion |
US8386258B2 (en) | 2000-07-31 | 2013-02-26 | Shazam Investments Limited | Systems and methods for recognizing sound and music signals in high noise and distortion |
US9899030B2 (en) | 2000-07-31 | 2018-02-20 | Shazam Investments Limited | Systems and methods for recognizing sound and music signals in high noise and distortion |
JP2004505328A (en) * | 2000-07-31 | 2004-02-19 | シャザム エンターテインメント リミテッド | System and method for recognizing sound / musical signal under high noise / distortion environment |
WO2002011123A2 (en) * | 2000-07-31 | 2002-02-07 | Shazam Entertainment Limited | Method for search in an audio database |
US9401154B2 (en) | 2000-07-31 | 2016-07-26 | Shazam Investments Limited | Systems and methods for recognizing sound and music signals in high noise and distortion |
WO2002073593A1 (en) * | 2001-03-14 | 2002-09-19 | International Business Machines Corporation | A method and system for the automatic detection of similar or identical segments in audio recordings |
DE10117870B4 (en) * | 2001-04-10 | 2005-06-09 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Method and apparatus for transferring a music signal into a score-based description and method and apparatus for referencing a music signal in a database |
US7064262B2 (en) | 2001-04-10 | 2006-06-20 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Method for converting a music signal into a note-based description and for referencing a music signal in a data bank |
DE10117870A1 (en) * | 2001-04-10 | 2002-10-31 | Fraunhofer Ges Forschung | Method and device for converting a music signal into a note-based description and method and device for referencing a music signal in a database |
US7478045B2 (en) | 2001-07-16 | 2009-01-13 | M2Any Gmbh | Method and device for characterizing a signal and method and device for producing an indexed signal |
WO2003009273A1 (en) * | 2001-07-16 | 2003-01-30 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e. V. | Method and device for characterising a signal and for producing an indexed signal |
US7881931B2 (en) | 2001-07-20 | 2011-02-01 | Gracenote, Inc. | Automatic identification of sound recordings |
US7214870B2 (en) | 2001-11-23 | 2007-05-08 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Method and device for generating an identifier for an audio signal, method and device for building an instrument database and method and device for determining the type of an instrument |
US6995309B2 (en) | 2001-12-06 | 2006-02-07 | Hewlett-Packard Development Company, L.P. | System and method for music identification |
WO2003054852A3 (en) * | 2001-12-06 | 2003-12-04 | Hewlett Packard Co | System and method for music inditification |
WO2003054852A2 (en) * | 2001-12-06 | 2003-07-03 | Hewlett-Packard Company | System and method for music inditification |
EP1504445A4 (en) * | 2002-04-25 | 2005-08-17 | Shazam Entertainment Ltd | Robust and invariant audio pattern matching |
EP1504445A1 (en) * | 2002-04-25 | 2005-02-09 | Shazam Entertainment Limited | Robust and invariant audio pattern matching |
US7627477B2 (en) | 2002-04-25 | 2009-12-01 | Landmark Digital Services, Llc | Robust and invariant audio pattern matching |
DE10232916B4 (en) * | 2002-07-19 | 2008-08-07 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Apparatus and method for characterizing an information signal |
EP1387514A2 (en) * | 2002-07-31 | 2004-02-04 | British Broadcasting Corporation | Signal comparison method and apparatus |
EP1387514A3 (en) * | 2002-07-31 | 2008-12-10 | British Broadcasting Corporation | Signal comparison method and apparatus |
US8811885B2 (en) | 2004-02-19 | 2014-08-19 | Shazam Investments Limited | Method and apparatus for identification of broadcast source |
US8290423B2 (en) | 2004-02-19 | 2012-10-16 | Shazam Investments Limited | Method and apparatus for identification of broadcast source |
US7986913B2 (en) | 2004-02-19 | 2011-07-26 | Landmark Digital Services, Llc | Method and apparatus for identificaton of broadcast source |
US9225444B2 (en) | 2004-02-19 | 2015-12-29 | Shazam Investments Limited | Method and apparatus for identification of broadcast source |
US9071371B2 (en) | 2004-02-19 | 2015-06-30 | Shazam Investments Limited | Method and apparatus for identification of broadcast source |
DE102004023436A1 (en) * | 2004-05-10 | 2005-12-08 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Apparatus and method for analyzing an information signal |
US8065260B2 (en) | 2004-05-10 | 2011-11-22 | Juergen Herre | Device and method for analyzing an information signal |
DE102004023436B4 (en) * | 2004-05-10 | 2006-06-14 | M2Any Gmbh | Apparatus and method for analyzing an information signal |
US8017855B2 (en) | 2004-06-14 | 2011-09-13 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Apparatus and method for converting an information signal to a spectral representation with variable resolution |
DE102004028694B3 (en) * | 2004-06-14 | 2005-12-22 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Apparatus and method for converting an information signal into a variable resolution spectral representation |
US7653534B2 (en) | 2004-06-14 | 2010-01-26 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Apparatus and method for determining a type of chord underlying a test signal |
US7739062B2 (en) | 2004-06-24 | 2010-06-15 | Landmark Digital Services Llc | Method of characterizing the overlap of two media segments |
US9092518B2 (en) | 2005-02-08 | 2015-07-28 | Shazam Investments Limited | Automatic identification of repeated material in audio signals |
US8090579B2 (en) | 2005-02-08 | 2012-01-03 | Landmark Digital Services | Automatic identification of repeated material in audio signals |
US8453170B2 (en) | 2007-02-27 | 2013-05-28 | Landmark Digital Services Llc | System and method for monitoring and recognizing broadcast data |
JP2016512610A (en) * | 2013-02-04 | 2016-04-28 | テンセント・テクノロジー・(シェンジェン)・カンパニー・リミテッド | Method and device for audio recognition |
US10354307B2 (en) | 2014-05-29 | 2019-07-16 | Tencent Technology (Shenzhen) Company Limited | Method, device, and system for obtaining information based on audio input |
Also Published As
Publication number | Publication date |
---|---|
EP1147511A1 (en) | 2001-10-24 |
GR990100235A (en) | 2001-03-30 |
GR1003625B (en) | 2001-08-31 |
Legal Events
Date | Code | Title | Description
---|---|---|---
| AK | Designated states | Kind code of ref document: A1; Designated state(s): US
| AL | Designated countries for regional patents | Kind code of ref document: A1; Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE
| WWE | Wipo information: entry into national phase | Ref document number: 2000940675; Country of ref document: EP
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application |
| WWP | Wipo information: published in national office | Ref document number: 2000940675; Country of ref document: EP
| WWW | Wipo information: withdrawn in national office | Ref document number: 2000940675; Country of ref document: EP