CN102841932A - Content-based voice frequency semantic feature similarity comparative method - Google Patents


Publication number
CN102841932A
Authority
CN
China
Prior art keywords
probability
music
semantic
song
audio
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2012102772962A
Other languages
Chinese (zh)
Inventor
严勤
张二芬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hohai University HHU
Original Assignee
Hohai University HHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hohai University HHU filed Critical Hohai University HHU
Priority to CN2012102772962A priority Critical patent/CN102841932A/en
Publication of CN102841932A publication Critical patent/CN102841932A/en
Pending legal-status Critical Current

Abstract

The invention relates to a content-based audio semantic feature similarity comparison method. The method comprises: framing the music with a frame length of 5 seconds and a frame shift of 0.5 seconds and extracting characteristic parameters from the music; assembling the characteristic parameters into feature vectors; constructing a lexicon of 174 descriptive keywords, then building a hidden Markov model for each keyword, with the feature vectors as training samples; having the hidden Markov models output probabilities to obtain a keyword-based probability distribution (a semantic multinomial); and comparing similarity with respect to given keywords according to the KL formula. The method provides 174 categories for each song, so that every piece of music receives a detailed high-level semantic description. Moreover, by modeling the semantic keywords with hidden Markov models, the substantive features that characterize the music are connected with the high-level semantic description, bridging the semantic gap between the low level and the high level.

Description

A content-based audio semantic feature similarity comparison method
Technical field
The present invention relates to audio processing and pattern recognition technology, and in particular to a similarity comparison method based on hidden Markov models.
Background technology
Research on content-based audio semantic feature similarity comparison is an important branch of content-based music retrieval and music recommendation. It refers to assigning different semantics to different audio data through audio feature analysis, so that audio items sharing the same semantics remain acoustically similar. Because music is closely tied to human auditory perception, it conveys emotion and mood that are hard to quantify; this characteristic means that extrinsic information such as song title and singer is unsuitable for music analysis and systematic audio retrieval. It is therefore urgent both to find features that can characterize music and to determine how the high-level semantic information of music should be described.
Extracting the low-level features of music (pitch, melody, rhythm, etc.) so that unordered audio becomes ordered is the key to realizing content-based audio retrieval applications. Existing research is all based on some particular audio feature: for example, cepstral coefficients (MFCCs) extracted on the Mel frequency scale; or schemes that first apply a series of engineering simulations of human auditory perception (equal-loudness pre-emphasis, intensity, loudness) and then use an all-pole model for linear prediction analysis to obtain the corresponding LPC coefficients; still other work uses the dynamic features of MFCC or LPC, i.e. the first- and second-order differences of the basic features, to capture the time-varying characteristics of the audio signal. For music content, however, low-level acoustic features alone are not enough; how to describe the high-level semantic concepts of music is also a key problem. As living conditions improve, people pay more and more attention to cultivating their tastes and demand different music on different occasions, placing more explicit and finer-grained requirements on the uses of music, requirements that traditional research cannot satisfy.
Summary of the invention
Object of the invention: the object of the present invention is to provide a content-based audio semantic feature similarity comparison method that extracts the characteristic parameters of a music signal, uses the extracted parameters to build hidden Markov models (HMMs) based on semantic keywords, and then compares the similarity of the semantic features of music according to the probability models.
Technical solution: embodiments of the invention are realized through the following technical scheme:
A content-based audio semantic feature similarity comparison method comprises the following steps:
1) frame the music with a frame length of 5 s and a frame shift of 0.5 s, then extract the characteristic parameters of each frame;
2) assemble the above characteristic parameters into feature vectors;
3) construct a lexicon of 174 descriptive keywords, then build one hidden Markov model per keyword, with the feature vectors as training samples;
4) have each hidden Markov model output its probability to obtain a keyword-based probability distribution;
5) compare similarity with respect to the given keywords according to the KL formula.
The hidden Markov model construction comprises the following steps:
1) obtain the probability b of a state emitting an observation value according to the formula
b_j(o_t) = Π_{s=1..S} [ Σ_{m=1..M_s} c_{jsm} N(o_{st}; μ_{jsm}, Σ_{jsm}) ]^{r_s}
where N is a Gaussian probability density function, O is the observation sequence formed by the characteristic coefficients of the music, μ, Σ and c are respectively the mean, covariance and weight coefficient, and M_s is the number of Gaussian mixture components per state;
2) set the number of iterations; compute the probability P(O|λ) of all training audio observation sequences output by the HMM with the Viterbi algorithm and accumulate the result into Σ_1; re-estimate the model parameters with the Baum–Welch algorithm to obtain λ_1; compute the probability P(O|λ_1) of all training audio observation sequences output by the HMM with the Viterbi algorithm again and accumulate the result into Σ_2;
3) compare Σ_1 with Σ_2; if the difference is below a predetermined threshold, no further re-estimation is needed and λ_1 is output as the result; otherwise a new round of computation is performed with λ_1.
The initial probabilities are taken as [1.0 1.0 1.0], and the state transition matrix is taken as

    0.0 1.0 0.0
    0.0 0.6 0.4
    0.0 0.0 0.0
The method of obtaining the probability distribution is as follows:
According to the formula
P(i|x) = (Π_{t=1..T} p(x_t|i))^{1/T} / Σ_{v=1..|V|} (Π_{t=1..T} p(x_t|v))^{1/T}
calculate the probability that each keyword occurs in a song, then obtain the probability vector of all keywords for that song, where i = 1, …, |V|, p(i) denotes the prior probability that a given keyword appears in a given song, p(i) = 1/|V|, x = {x_1, …, x_T}, and T is the number of frames from which features are extracted for each song.
The similarity comparison comprises the following steps:
1) select the particular keywords of the given query song to obtain the semantic multinomial q of the query song;
2) compute the KL distance
KL(q||p) = Σ_{i=1..|V|} q_i log(q_i / p_i)
between the semantic multinomial q and each semantic multinomial p in the database, where V is the chosen dictionary.
The characteristic parameters of the audio signal are spectral parameters comprising: tempo, pulse clarity, mode, key, key clarity, tonal centroid and key strength.
During characteristic parameter extraction, the music files are converted to mono WAV audio; the bit rate of each piece of music is 256 kbps, the sample size is 16 bits, and the sampling frequency is 16 kHz.
Beneficial effects: the present invention provides 174 categories for each song, giving every song a detailed high-level semantic description. Moreover, by modeling the semantic keywords with hidden Markov models, the essential features that characterize the music are connected with the high-level semantic description, filling the semantic gap between the low level and the high level.
Description of drawings
Fig. 1 is a structural diagram of the embodiment provided by the invention;
Fig. 2 is a flow chart of characteristic parameter extraction in the embodiment provided by the invention;
Fig. 3 is a flow chart of HMM training in the embodiment provided by the invention.
Embodiment
The present invention is further detailed below in conjunction with the accompanying drawings.
The songs and semantic keywords used by the present invention come from the Computer Audition Lab 500 (CAL500) database. The 500 popular songs in this music database come from 500 artists of different countries over the last 50 years. These songs are popular in different countries around the world and are familiar to people of many nations, so CAL500 is globally representative and overcomes the differences brought by musical regions. CAL500 describes the semantic features of music from many aspects, including for example audio perception, expressed emotion and form of performance. The database has been widely used and can serve as a common test set for future work on the semantic annotation and retrieval of music.
The embodiment provided by the invention is a content-based audio semantic feature similarity comparison method. Its structure is shown in Fig. 1 and comprises: a characteristic parameter extraction module, a high-level semantic information description module, an HMM building module, an annotation module and a similarity comparison module.
The signal flow between the modules is as follows:
The input signal frames enter the characteristic parameter extraction module, in which features are extracted from the input digital audio signal sequence.
In the high-level semantic information description module, each song in the library is auditioned by at least 3 listeners, who score it with keyword labels. 174 specific music-related words form the semantic labels of the final database; that is, each song in the library carries all 174 labels, each label has a score, and the scores are distributed between 0 and 1.
In the HMM building module, training samples and test samples are first established from the extracted characteristic parameters. Of the 500 songs, the present invention randomly selects 425 as the training sample set, and the remaining 75 songs serve as the test sample set. For each keyword, the songs whose annotation score is greater than 0 are selected as its training samples.
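The sampling scheme above (a random 425/75 split, then a per-keyword training set drawn from the training songs whose annotation score is positive) can be sketched as follows. The function name and data layout are our own illustration, not part of the patent:

```python
import random

def split_and_select(scores, n_train=425, seed=0):
    """Sketch of the sampling scheme described above (hypothetical helper):
    randomly split the songs into n_train training songs and the rest as
    test songs, then pick, for each keyword, the training songs whose
    annotation score is greater than 0.

    scores: dict mapping song id -> list of per-keyword scores in [0, 1].
    Returns (train_ids, test_ids, per_keyword_train)."""
    ids = sorted(scores)
    rng = random.Random(seed)
    train_ids = set(rng.sample(ids, n_train))
    test_ids = [i for i in ids if i not in train_ids]
    n_keywords = len(next(iter(scores.values())))
    per_keyword_train = {
        k: [i for i in train_ids if scores[i][k] > 0]
        for k in range(n_keywords)
    }
    return sorted(train_ids), test_ids, per_keyword_train
```

With 500 songs this reproduces the 425/25 split ratio described above and guarantees that each keyword's HMM only sees songs annotated as at least weakly relevant to it.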
In the annotation module, probability calculation is performed with the keyword models trained by the HMMs. For a song, in particular a new song outside the library, the probability vector of the song under all trained models is obtained; this probability vector is called the semantic multinomial. From the semantic multinomial we obtain the degree of correlation between the song and each semantic keyword.
In the similarity comparison module, we use the KL distance to compare the semantic multinomial of a specified song with the semantic multinomials of the songs in the library.
The concrete processing of each module is described below.
1. Characteristic parameter extraction module
The working principle of the characteristic parameter extraction module is shown in Fig. 2. Its main function is to extract the characteristic parameters, chiefly spectral parameters, of the input audio signal. The spectral parameters mainly comprise: tempo, pulse clarity, mode, key, key clarity, tonal centroid and key strength.
During characteristic parameter extraction, we convert the music files to mono WAV audio; the bit rate of each piece is 256 kbps, the sample size is 16 bits, the sampling frequency is 16 kHz, and the audio format is PCM. With reference to the MIRtoolbox toolkit, extraction uses a frame length of 5 s and a frame shift of 0.5 s. Extracting the characteristic parameters above yields 1-dimensional tempo, 1-dimensional pulse clarity, 1-dimensional mode, 1-dimensional key, 1-dimensional key clarity, a 6-dimensional tonal centroid and 24-dimensional key strength, finally forming a 35-dimensional feature vector per frame. This step is carried out in the MATLAB environment. For each song, the frame-wise feature vectors are stored in a txt document.
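The framing described above (5 s frames with a 0.5 s shift over 16 kHz mono audio) can be sketched as follows. The helper name is ours, and the sketch only computes sample indices, standing in for the MIRtoolbox feature extraction itself:

```python
def frame_indices(n_samples, sr=16000, frame_len_s=5.0, hop_s=0.5):
    """Return (start, end) sample index pairs for the 5 s frames with a
    0.5 s shift used above (illustrative helper, not from the patent).
    Only complete frames are kept."""
    frame_len = int(frame_len_s * sr)   # 80000 samples per frame
    hop = int(hop_s * sr)               # 8000 samples per shift
    frames = []
    start = 0
    while start + frame_len <= n_samples:
        frames.append((start, start + frame_len))
        start += hop
    return frames
```

For example, a 30 s clip at 16 kHz (480 000 samples) yields 51 frames; each frame would then be reduced to the 35-dimensional vector (1+1+1+1+1+6+24) listed above.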
2. High-level semantic information description module
The purpose of the high-level semantic information description is to collect training data from ordinary users. The specific practice is to have users annotate the music with keywords while listening to it, giving the semantic labels a clearly defined basis. These semantic words comprise 18 annotations expressing emotion, such as emotion-happy and not-emotion-happy; 36 annotations expressing genre, such as genre-Pop and genre-Rock; 29 annotations of musical instruments, such as instrument-bass and instrument-piano; and so on. The data set is meant to reflect the degree of association between the semantic words and the songs, so for each song, alongside this series of keyword labels, we also give each label a corresponding score. Each song is thus represented by a numeric vector whose scores lie between 0 and 1, where 0 means the song is unrelated to the keyword and 1 means it is extremely related.
3. HMM building module
The working principle of the HMM building module is shown in Fig. 3:
An HMM (Hidden Markov Model) is a kind of Markov chain whose states cannot be observed directly; only a sequence of observation vectors can be observed. Each observation vector expresses the underlying states through probability density distributions, and each observation vector is produced by a state sequence with the corresponding probability density distribution. An HMM is a parametric probability model describing the statistical characteristics of a stochastic process; it is a doubly stochastic process composed of two parts: a Markov chain and a general stochastic process. The Markov chain describes the transitions between states and is characterized by transition probabilities; the general stochastic process describes the relation between the states and the observation sequence and is characterized by observation probabilities.
Because the state transition process of an HMM cannot be observed, it is called a "hidden" Markov model.
HMM is defined as follows:
(1) X denotes the set of states, X = {S_1, S_2, …, S_N}, where N is the number of states and q_t denotes the state at time t. Although the states are hidden, in many applications the states or sets of states carry some physical meaning. The states are interconnected in the sense that one state can be reached from another;
(2) O denotes the set of observable symbols, O = {V_1, V_2, …, V_M}, where M is the number of distinct observation values that each state may output;
(3) A = {a_ij} is the state transition probability distribution, where a_ij = P{q_{t+1} = S_j | q_t = S_i}, 1 ≤ i, j ≤ N. In the special case where every state can reach every other state in one step, a_ij > 0 for all (i, j); other HMMs have a_ij = 0 for one or more pairs (i, j);
(4) B = {b_j(k)} is the observation probability distribution of state j, giving the probability that state j outputs the corresponding observation value, where b_j(k) = P{O_t = V_k | q_t = S_j}, 1 ≤ j ≤ N, 1 ≤ k ≤ M;
(5) π = {π_i} is the initial state distribution, π_i = P{q_1 = S_i}, 1 ≤ i ≤ N.
From the above, an HMM can be defined as a five-tuple λ:
λ = (X, O, π, A, B)
or abbreviated as
λ = (π, A, B)
The three elements of the abbreviated HMM can in fact be divided into two parts: one is the Markov chain, described by π and A; the other is a stochastic process, described by B.
HMM training, i.e. the parameter estimation problem: given an observation sequence O, determine a model λ by some method such that P(O|λ) is maximized.
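For a discrete HMM, the quantity P(O|λ) that training maximizes can be evaluated with the standard forward algorithm. The following minimal sketch is our own illustration and is not taken from the patent:

```python
def forward_likelihood(pi, A, B, obs):
    """Forward algorithm: P(O | lambda) for a discrete HMM
    lambda = (pi, A, B). pi[i] is the initial probability of state i,
    A[i][j] the transition probability, B[i][k] the probability of
    emitting symbol k from state i, and obs a list of symbol indices."""
    n = len(pi)
    # initialisation: alpha_1(i) = pi_i * b_i(o_1)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    # induction: alpha_{t+1}(j) = [sum_i alpha_t(i) * a_ij] * b_j(o_{t+1})
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
                 for j in range(n)]
    # termination: P(O | lambda) = sum_i alpha_T(i)
    return sum(alpha)
```

The result agrees with brute-force summation over all state sequences, but at O(N²T) cost instead of O(N^T).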
The embodiment of the invention takes the initial probabilities as [1.0 1.0 1.0] and the state transition matrix as

    0.0 1.0 0.0
    0.0 0.6 0.4
    0.0 0.0 0.0
According to the mixed Gaussian function
b_j(o_t) = Π_{s=1..S} [ Σ_{m=1..M_s} c_{jsm} N(o_{st}; μ_{jsm}, Σ_{jsm}) ]^{r_s}
the parameter b, the probability of a state emitting an observation value, is obtained. Here N is a Gaussian probability density function, O is the observation sequence formed by the characteristic coefficients of the music, μ, Σ and c are respectively the mean, covariance and weight coefficient, and M_s is the number of Gaussian mixture components per state.
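In the single-stream case (S = 1, r_s = 1) the formula above reduces to an ordinary Gaussian mixture density. The following sketch assumes diagonal covariances, a common simplification that the patent does not specify:

```python
import math

def gmm_emission(x, weights, means, variances):
    """b_j(o_t) for one state in the single-stream case (S = 1, r = 1):
    a diagonal-covariance Gaussian mixture density, an illustrative
    simplification of the formula above.

    x: feature vector; weights[m]: mixture weight c_m;
    means[m], variances[m]: per-component lists of per-dimension
    mean and variance."""
    total = 0.0
    for c, mu, var in zip(weights, means, variances):
        comp = 1.0
        for xi, mi, vi in zip(x, mu, var):
            # product of 1-D Gaussian densities = diagonal-covariance N
            comp *= math.exp(-(xi - mi) ** 2 / (2 * vi)) / math.sqrt(2 * math.pi * vi)
        total += c * comp
    return total
```

With weights summing to 1 the mixture density is the weighted average of the component densities, as the sanity checks below confirm.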
After the initial model parameters are set, the number of iterations is set; the probability P(O|λ) of all training audio observation sequences output by the HMM is computed with the Viterbi algorithm and accumulated into Σ_1; the model parameters are then re-estimated with the Baum–Welch algorithm to obtain λ_1; the probability P(O|λ_1) of all training audio observation sequences is computed with the Viterbi algorithm again and accumulated into Σ_2. Σ_1 is compared with Σ_2: if the difference is below a predetermined threshold, no further re-estimation is needed and λ_1 is output as the result; otherwise a new round of computation is performed with λ_1.
4. Annotation module
The function of the annotation module is, for a specified song, to give the keywords that best describe it. Our method uses Bayes' rule to calculate the posterior probability of each keyword in the lexicon.
According to the Bayes formula
P(i|x) = p(x|i) p(i) / p(x)
where i = 1, …, |V| and p(i) denotes the prior probability that a given keyword appears in a given song, we can define a uniform standard, p(i) = 1/|V|. Let x = {x_1, …, x_T}, where T is the number of frames from which features are extracted for each song. The frames of a song can be regarded as short-time independent, so p(x|i) is taken as the geometric mean of the per-frame likelihoods, (Π_{t=1..T} p(x_t|i))^{1/T}. By the total probability formula, p(x) = Σ_{v=1..|V|} p(x|v) p(v). Thus
P(i|x) = (Π_{t=1..T} p(x_t|i))^{1/T} / Σ_{v=1..|V|} (Π_{t=1..T} p(x_t|v))^{1/T}    (1)
With formula (1) we can calculate the probability that each keyword occurs in a song. For a song, the probability vector under all keyword models is obtained; this probability vector is called the semantic multinomial. From it the relevant semantic keywords can be ranked by degree of correlation, and the keywords that best describe the song can be given.
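Formula (1) normalizes the geometric mean of the per-frame keyword likelihoods. The following sketch works in the log domain; the log-domain evaluation is our own numerical-stability choice, not stated in the patent:

```python
import math

def semantic_multinomial(log_frame_likelihoods):
    """Formula (1) in the log domain: log_frame_likelihoods[i][t] is
    log p(x_t | keyword i). Returns P(i | x) for every keyword, i.e.
    the semantic multinomial of the song."""
    T = len(log_frame_likelihoods[0])
    # log of the geometric mean of frame likelihoods per keyword
    log_geo = [sum(row) / T for row in log_frame_likelihoods]
    m = max(log_geo)                      # log-sum-exp trick
    exps = [math.exp(g - m) for g in log_geo]
    z = sum(exps)
    return [e / z for e in exps]
```

The output always sums to 1, and when all keywords explain the frames equally well it collapses to the uniform prior 1/|V|.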
5. Similarity comparison module
The function of the similarity comparison module is as follows. Given a song to query and the particular keywords to query, we first obtain the semantic multinomial of the query song, called the query multinomial for short. We then calculate the KL distance between the query multinomial q and each semantic multinomial p in our database; through this distance we can compare the degree of similarity between songs with respect to certain keywords.
The KL distance is calculated as follows:
KL(q||p) = Σ_{i=1..|V|} q_i log(q_i / p_i)
V is the dictionary we adopt, comprising 174 independent semantic keywords. With this method we can also make recommendations based on particular keywords, satisfying users' different demands.
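The KL distance above can be sketched as follows; the small floor on zero entries is our own guard, as the patent does not specify any smoothing:

```python
import math

def kl_distance(q, p, eps=1e-12):
    """KL(q || p) = sum_i q_i * log(q_i / p_i) over the |V| keyword
    dimensions of two semantic multinomials. eps floors zero entries
    (our own guard; the patent does not specify smoothing)."""
    return sum(
        qi * math.log(max(qi, eps) / max(pi, eps))
        for qi, pi in zip(q, p)
        if qi > 0
    )
```

KL(q||q) is 0, the distance is non-negative for distributions, and it is asymmetric in general, so the query multinomial must be the first argument as in the formula above.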

Claims (7)

1. A content-based audio semantic feature similarity comparison method, characterized by comprising the steps of:
1) framing the music with a frame length of 5 s and a frame shift of 0.5 s, then extracting the characteristic parameters of each frame;
2) assembling the above characteristic parameters into feature vectors;
3) constructing a lexicon of 174 descriptive keywords, then building one hidden Markov model per keyword, with the feature vectors as training samples;
4) having each hidden Markov model output its probability to obtain a keyword-based probability distribution;
5) comparing similarity with respect to the given keywords according to the KL formula.
2. The content-based audio semantic feature similarity comparison method according to claim 1, characterized in that the hidden Markov model construction comprises the steps of:
1) obtaining the probability b of a state emitting an observation value according to the formula
b_j(o_t) = Π_{s=1..S} [ Σ_{m=1..M_s} c_{jsm} N(o_{st}; μ_{jsm}, Σ_{jsm}) ]^{r_s}
wherein N is a Gaussian probability density function, O is the observation sequence formed by the characteristic coefficients of the music, μ, Σ and c are respectively the mean, covariance and weight coefficient, and M_s is the number of Gaussian mixture components per state;
2) setting the number of iterations; computing the probability P(O|λ) of all training audio observation sequences output by the HMM with the Viterbi algorithm and accumulating the result into Σ_1; re-estimating the model parameters with the Baum–Welch algorithm to obtain λ_1; computing the probability P(O|λ_1) of all training audio observation sequences output by the HMM with the Viterbi algorithm again and accumulating the result into Σ_2;
3) comparing Σ_1 with Σ_2; if the difference is below a predetermined threshold, no further re-estimation is needed and λ_1 is output as the result; otherwise a new round of computation is performed with λ_1.
3. The content-based audio semantic feature similarity comparison method according to claim 2, characterized in that the initial probabilities are taken as [1.0 1.0 1.0] and the state transition matrix is taken as

    0.0 1.0 0.0
    0.0 0.6 0.4
    0.0 0.0 0.0
4. The content-based audio semantic feature similarity comparison method according to claim 1, characterized in that the method of obtaining the probability distribution is as follows:
according to the formula
P(i|x) = (Π_{t=1..T} p(x_t|i))^{1/T} / Σ_{v=1..|V|} (Π_{t=1..T} p(x_t|v))^{1/T}
calculate the probability that each keyword occurs in a song, then obtain the probability vector of all keywords for that song, wherein i = 1, …, |V|, p(i) denotes the prior probability that a given keyword appears in a given song, p(i) = 1/|V|, x = {x_1, …, x_T}, and T is the number of frames from which features are extracted for each song.
5. The content-based audio semantic feature similarity comparison method according to claim 1, characterized in that the similarity comparison comprises the steps of:
1) selecting the particular keywords of the given query song to obtain the semantic multinomial q of the query song;
2) computing the KL distance between the semantic multinomial q and each semantic multinomial p in the database, wherein V is the chosen dictionary.
6. The content-based audio semantic feature similarity comparison method according to claim 1, characterized in that the characteristic parameters of the audio signal are spectral parameters comprising: tempo, pulse clarity, mode, key, key clarity, tonal centroid and key strength.
7. The content-based audio semantic feature similarity comparison method according to claim 1, characterized in that during characteristic parameter extraction the music files are converted to mono WAV audio, the bit rate of each piece of music is 256 kbps, the sample size is 16 bits, and the sampling frequency is 16 kHz.
CN2012102772962A 2012-08-06 2012-08-06 Content-based voice frequency semantic feature similarity comparative method Pending CN102841932A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2012102772962A CN102841932A (en) 2012-08-06 2012-08-06 Content-based voice frequency semantic feature similarity comparative method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2012102772962A CN102841932A (en) 2012-08-06 2012-08-06 Content-based voice frequency semantic feature similarity comparative method

Publications (1)

Publication Number Publication Date
CN102841932A true CN102841932A (en) 2012-12-26

Family

ID=47369295

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2012102772962A Pending CN102841932A (en) 2012-08-06 2012-08-06 Content-based voice frequency semantic feature similarity comparative method

Country Status (1)

Country Link
CN (1) CN102841932A (en)


Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101021854A (en) * 2006-10-11 2007-08-22 鲍东山 Audio analysis system based on content
CN101154379A (en) * 2006-09-27 2008-04-02 夏普株式会社 Method and device for locating keywords in voice and voice recognition system
CN101415259A (en) * 2007-10-18 2009-04-22 三星电子株式会社 System and method for searching information of embedded equipment based on double-language voice enquiry
CN101553799A (en) * 2006-07-03 2009-10-07 英特尔公司 Method and apparatus for fast audio search
CN101593519A (en) * 2008-05-29 2009-12-02 夏普株式会社 Detect method and apparatus and the search method and the system of voice keyword
US20110307253A1 (en) * 2010-06-14 2011-12-15 Google Inc. Speech and Noise Models for Speech Recognition
CN102456077A (en) * 2006-07-03 2012-05-16 英特尔公司 Method and device for rapidly searching audio frequency

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103324691A (en) * 2013-06-03 2013-09-25 河海大学 Voice frequency searching method based on M-tree
CN104978962A (en) * 2014-04-14 2015-10-14 安徽科大讯飞信息科技股份有限公司 Query by humming method and system
CN104978962B (en) * 2014-04-14 2019-01-18 科大讯飞股份有限公司 Singing search method and system
CN105095279A (en) * 2014-05-13 2015-11-25 深圳市腾讯计算机系统有限公司 File recommendation method and apparatus
CN107704631A (en) * 2017-10-30 2018-02-16 西华大学 A kind of construction method of the music mark atom based on mass-rent
CN107704631B (en) * 2017-10-30 2020-12-01 西华大学 Crowdsourcing-based music annotation atom library construction method
CN111506762A (en) * 2020-04-09 2020-08-07 南通理工学院 Music recommendation method, device, equipment and storage medium
CN111506762B (en) * 2020-04-09 2023-07-11 南通理工学院 Music recommendation method, device, equipment and storage medium
CN115910042A (en) * 2023-01-09 2023-04-04 百融至信(北京)科技有限公司 Method and apparatus for identifying information type of formatted audio file
CN115910042B (en) * 2023-01-09 2023-05-05 百融至信(北京)科技有限公司 Method and device for identifying information type of formatted audio file

Similar Documents

Publication Publication Date Title
CN102521281B (en) Humming computer music searching method based on longest matching subsequence algorithm
CN103823867B (en) Humming type music retrieval method and system based on note modeling
CN101599271B (en) Recognition method of digital music emotion
Shao et al. Unsupervised classification of music genre using hidden markov model
Cheng et al. Automatic chord recognition for music classification and retrieval
US7488886B2 (en) Music information retrieval using a 3D search algorithm
CN103177722B (en) A kind of song retrieval method based on tone color similarity
Gulati et al. Automatic tonic identification in Indian art music: approaches and evaluation
US20090306797A1 (en) Music analysis
CN102841932A (en) Content-based voice frequency semantic feature similarity comparative method
CN105575393A (en) Personalized song recommendation method based on voice timbre
CN112185321A (en) Song generation
CN110134823B (en) MIDI music genre classification method based on normalized note display Markov model
CN101488128B (en) Music search method and system based on rhythm mark
Li et al. Construction and analysis of hidden Markov model for piano notes recognition algorithm
CN115359785A (en) Audio recognition method and device, computer equipment and computer-readable storage medium
Wang Mandarin spoken document retrieval based on syllable lattice matching
Waghmare et al. Raga identification techniques for classifying indian classical music: A survey
Kroher The flamenco cante: Automatic characterization of flamenco singing by analyzing audio recordings
Ching et al. Instrument role classification: Auto-tagging for loop based music
Wang et al. Music information retrieval system using lyrics and melody information
Wang et al. Research on CRFs in music chord recognition algorithm
Cazau et al. An automatic music transcription system dedicated to the repertoires of the marovany zither
Schuller et al. Applications in intelligent music analysis
Feng et al. Vocal Segment Classification in Popular Music.

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C05 Deemed withdrawal (patent law before 1993)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20121226