CN102521281A - Humming computer music searching method based on longest matching subsequence algorithm - Google Patents
- Publication number
- CN102521281A CN2011103821590A CN201110382159A
- Authority
- CN
- China
- Prior art keywords
- subsequence
- melody
- sequence
- delta
- algorithm
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Auxiliary Devices For Music (AREA)
Abstract
The invention discloses a humming-based computer music retrieval method built on the longest matched subsequence algorithm. The method comprises four steps: (1) fundamental frequency extraction, (2) construction of a music feature database, (3) feature representation, and (4) retrieval and matching. The method increases the overall speed of similarity calculation and improves the retrieval efficiency of a search engine. It provides an accurate music retrieval platform for karaoke systems, content-based web search engines, and multifunctional smart mobile terminals, and can be widely used in areas such as plug-ins for web search engines. The methods provided for music feature extraction, music feature representation, and accurate similarity calculation support accurate computation in a query-by-humming system, make music retrieval accurate, effortless, and enjoyable, and have strong practical value.
Description
Technical field
The present invention relates to a humming-based computer music retrieval method based on the longest matched subsequence algorithm, and belongs to the field of computer application technology for content-based music information retrieval.
Background technology
In recent years, with the development of the Internet, the volume of audio data has grown geometrically. Traditional retrieval methods based on text labels cannot meet the retrieval needs of massive multimedia data, so content-based music information retrieval (Music Information Retrieval, MIR) has become one of the active technologies in fields such as signal processing, pattern recognition, and data mining. Research on content-based multimedia retrieval has concentrated mainly on images and video; at present, technology applied to audio retrieval remains rare both in China and abroad. As users' interest in classifying and retrieving online content grows, establishing a retrieval mechanism for audio data on the web becomes essential. The key technical problems limiting the development of content-based music retrieval are how to extract audio features that characterize and describe the music content, and which method to use for feature matching. Extraction and representation of melody features are the basic link in content-based music retrieval: whether the melody features extracted from a music fragment express the semantic information of the music objectively and accurately determines whether the musical features are conveyed correctly, and directly affects whether subsequent matching and retrieval are effective. Whether the similarity algorithm and the corresponding matching mechanism for music fragments agree with ordinary auditory and psychological perception is the key factor that determines the accuracy of retrieval results. Therefore, the extraction and representation of melody features and the calculation of similarity are the most important links affecting the performance of a query-by-humming or content-based music retrieval system.
For an acoustic signal, the perceived pitch is determined by its fundamental frequency (Fundamental Frequency) sequence. The purpose of pitch extraction is to convert the acoustic signal input by the user into a fundamental frequency sequence. Common feature extraction algorithms at present include the autocorrelation function algorithm (Autocorrelation), cepstral analysis (Cepstral Analysis), the cross-correlation function algorithm (CCF), the average magnitude difference function algorithm (AMDF), the normalized cross-correlation function algorithm (NCCF), and the integrated pitch tracker (Integrated Pitch Tracker). However, as the technology has developed, in many application scenarios the results of these algorithms no longer meet application requirements, and they easily cause the feature representation to deviate from, or blur, the real musical semantic content.
The methods commonly used for feature representation at present, and their shortcomings, are as follows:
1. The pitch contour representation cannot quantify pitch changes, which easily makes the feature representation blur the real musical semantic content. As the song sample set expands, cases where the pitch contours are identical but the actual melodies differ greatly become very likely.
2. The MIDI note approximation rounds the natural pitch hummed by the user to discrete integer MIDI note values, which makes the melody representation inaccurate. Fig. 1 shows the same melody in C major and in A major: the MIDI pitch values of all corresponding notes in the two fragments are completely different, yet the auditory impression and musical perception they give the listener are almost identical. A reasonable feature representation should treat these two fragments as having the same melody features; on this point the MIDI note approximation is neither appropriate nor comprehensive enough.
3. The absolute pitch representation solves the representation errors caused by approximation, but when it is combined with string-comparison algorithms based on dynamic programming, the overall vertical pitch shift (Pitch Shiftiness) brings serious matching errors, so this representation is not suitable for general similarity calculation mechanisms.
4. The in-key scale degree representation avoids the effect of overall pitch shift when humming in different keys. However, it requires the tonic and the mode as additional information, and in humming applications these usually cannot be obtained directly. When the hummed fragment is short and carries little information, this method may deviate greatly. Fig. 2 shows a melody fragment in C major that also fits G major, because the G major scale differs from the C major scale by only one altered note, F#. When neither this altered note nor its natural form appears in the fragment, the fragment formed by the other notes fits both C major and G major. This defeats the approach of labelling each note by its scale degree relative to the tonic (Degrees of the Scale from the Tonic). In addition, many musical styles frequently use compositional techniques that break a single key, such as modulation and out-of-key notes; in these cases, query by humming with this method produces large errors.
5. In the traditional triplet melody representation, the interval attribute expresses the frequency change between adjacent notes, in hertz. The pitch unit used in music is the semitone. Semitones and hertz are positively correlated, but the relation is logarithmic rather than linear, so in different pitch regions the frequency difference between two notes the same number of semitones apart is different. If the frequency difference is used as the interval measure between adjacent notes, the same melody produces different interval sequences in different pitch regions, which seriously distorts the musical features. For example, in Fig. 4, melody 1 and melody 2 contain the same melody features but are hummed in different keys; the frequency differences between their notes are entirely different, so the triplet melody representation cannot express the melody features objectively.
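The logarithmic relation between semitones and hertz described in item 5 can be checked numerically. The following is a minimal sketch, assuming twelve-tone equal temperament and the conventional A3 = 220 Hz, A4 = 440 Hz; the function name `semitones` is illustrative and does not come from the patent.

```python
import math

def semitones(f_low: float, f_high: float) -> float:
    """Interval between two frequencies in semitones (12-tone equal temperament)."""
    return 12.0 * math.log2(f_high / f_low)

# The same whole-tone step (A up to B) sung in two different octaves.
a3, b3 = 220.0, 220.0 * 2 ** (2 / 12)
a4, b4 = 440.0, 440.0 * 2 ** (2 / 12)

# Measured in hertz, the step doubles when sung one octave higher.
hz_low, hz_high = b3 - a3, b4 - a4

# Measured in semitones, the step is the same in both octaves.
st_low, st_high = semitones(a3, b3), semitones(a4, b4)
```

The hertz differences disagree by a factor of two while the semitone intervals agree, which is why a hertz-based interval attribute gives different sequences for the same melody.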
The existing methods for similarity calculation are as follows:
1. Edit distance algorithm
The edit distance is the minimum operation cost of transforming a string A into another string B. The simple edit distance algorithm (Levenshtein Distance) applies only to strings and cannot be used directly to compute the similarity of musical melodies. An extended edit distance algorithm can compute the distance between real-number sequences; its advantage is that it quantifies the conversion cost between the two sequences being compared, and so measures the similarity between the two melodies they represent. However, this extended algorithm is better suited to global comparison: when the two input melody sequences do not correspond as wholes, its performance drops markedly. For example, when a phrase hummed by the user is matched against the complete information of a piece of music, the algorithm counts the extra cost of a large number of inserted or deleted elements, which greatly lowers the melody similarity and causes the algorithm to fail. As the dashed region in Fig. 5 shows, melody B is highly similar to part of melody A and can be regarded as a matching fragment. But because melody B is mechanically compared against the whole of melody A, the similarity is greatly reduced. This is the defect of the edit distance algorithm.
2. Longest common subsequence algorithm
The advantage of the longest common subsequence algorithm is that it can find mutually matching subsequences in two strings A and B, and can therefore be used to obtain matching fragments from two melodies. However, it does not consider the cost of inserting and deleting elements. When a phrase hummed by the user is matched against the complete melody of a piece of music, the short input sequence can be stretched without limit, and two entirely unrelated melodies can be matched by forcibly stretching one of them. Such matching greatly distorts the melody features; even though the two melodies are matched after stretching, the method has in effect failed.
3. Dynamic time warping algorithm
The voice signal hummed by a user is highly variable: different pronunciation habits and different environments cause the duration of sounds to vary. The dynamic time warping algorithm lengthens or shortens the voice signal until its length agrees with that of the reference pattern; in the process the time axis of the unknown input is warped so that its features correspond to the reference pattern. Because the algorithm can stretch a sequence along the time axis so that similar contours align with each other, it is widely used in content-based music retrieval, signal processing, speech recognition, and other fields. The algorithm also has shortcomings. First, its time complexity is high. Second, when whole phrases of arbitrary length are matched and the notes of the phrases differ little, the matching results tend to have low discrimination.
4. Hidden Markov model
The hidden Markov model (Hidden Markov Model, HMM) is a statistical analysis model that can be used for speaker-independent speech recognition. In query by humming, the hummed melody input by the user is itself a voice signal and can serve as the observation vector of the HMM, while the pitch feature sequences in the music feature database have probabilistic statistical properties and can serve as the hidden states of the model. In an implementation, the melody features of different songs are modelled to form the search space, and the models are trained accordingly; during retrieval, the system returns the probability that the hummed voice signal matches each song model in the search space. A query-by-humming system based on hidden Markov models can return results with good precision for users of different singing ability. It also has an unavoidable shortcoming: a separate model must be trained for every record in the music feature database, so the training workload becomes enormous as the feature database grows, and the practicality of the hidden Markov model is therefore poor.
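To make the cost of the dynamic time warping approach (method 3 above) concrete, here is a minimal sketch of the textbook algorithm. It is not taken from the patent; the absolute difference used as the local cost and the name `dtw_distance` are assumptions for illustration.

```python
def dtw_distance(x, y):
    """Textbook dynamic time warping cost between two numeric sequences."""
    inf = float("inf")
    m, n = len(x), len(y)
    # d[i][j] = minimal accumulated cost of aligning x[:i] with y[:j].
    d = [[inf] * (n + 1) for _ in range(m + 1)]
    d[0][0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = abs(x[i - 1] - y[j - 1])  # assumed local cost
            # An element may be repeated (stretched) on either axis.
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[m][n]
```

Every cell of the (m+1) by (n+1) table is filled, so the running time grows with the product of the two sequence lengths. Because stretching is free, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is zero even though the sequences differ in length.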
Summary of the invention
The object of the present invention is to provide a humming-based computer music retrieval method, built on the longest matched subsequence algorithm, that overcomes the above technical problems and enables a computer to actively recognize changes in musical pitch. The basic technical idea of the invention is as follows. On the basis of an analysis of current musical feature extraction and representation, the feature sequence is defined as the sequence of semitone intervals between adjacent notes. The RAPT algorithm is used to extract the fundamental frequency of the music, which avoids the feature extraction deviation caused by humming in different keys and lays the foundation for accurate extraction of melody features. For melody feature representation, twelve-tone equal temperament is used as the tuning system: the pitch contour sequence is converted by a logarithmic transform into an interval sequence in semitones, which removes the effect on the melody features of different users humming in different keys and normalizes the result with the features extracted from MIDI. Macroscopic melody contour modelling is implemented with the MOMEL algorithm, and feature extraction and representation are implemented with the logarithmic transform based on twelve-tone equal temperament. As a result, a fundamental frequency sequence whose original length is on the order of $10^{3}$ is reduced, without losing melody features, to a pitch contour sequence whose length is on the order of 10, and the influence of lyrics and intonation fluctuations on the macroscopic melody fundamental frequency signal is removed. This provides important support for further improving the matching speed of the whole system. For similarity calculation, a similarity mechanism based on the longest matched subsequence (Longest Matched Subsequence, LMS) is combined with traditional string matching methods, which effectively avoids the limitations of other related algorithms in application.
The main steps of the present invention are:
(1) Fundamental frequency extraction. Audio processing converts the voice signal hummed by the user into a fundamental frequency contour sequence through the following steps: the RAPT algorithm extracts the fundamental frequency; a low-pass filter and a high-pass filter regularize the fundamental frequency sequence; median filtering and linear smoothing smooth the sequence; and the MOMEL algorithm performs melody modelling.
(2) Construction of the music feature database. The MIDI files of all songs in the database are preprocessed, and the MIDI pitch sequence of each is extracted and stored in a separate field of the music feature database. In the subsequent retrieval stage, the step of processing MIDI files is skipped and the pitch sequences are read directly from the feature database.
(3) Feature representation. The fundamental frequency contour sequence obtained from the audio processing module and the MIDI pitch sequence extracted from the music feature database are converted into a unified melodic interval sequence, representing the melody features of the user's humming and of the database record respectively.
(4) Retrieval and matching. The melody feature sequence extracted from the user's hummed audio is compared for similarity with every music feature sequence in the search space, and the results of each match are ranked by similarity according to the longest matched subsequence (LMS) algorithm.
The advantages of the present invention are that it increases the overall speed of similarity calculation and improves the retrieval efficiency of a search engine, and that it provides an accurate music retrieval platform for karaoke systems, content-based web search engines, and multifunctional smart mobile terminals. It can be widely used in areas such as plug-ins for web search engines. The methods provided by the invention for musical feature extraction, musical feature representation, and accurate similarity calculation support accurate computation in a query-by-humming system, make music retrieval accurate, effortless, and enjoyable, and have strong practical value.
Description of drawings
Fig. 1 is a schematic diagram of the same melody expressed in C major and in G major;
Fig. 2 is a schematic diagram of a melody that fits both C major and G major;
Fig. 3 is a schematic diagram of the numerical relation between semitones and hertz;
Fig. 4 is a schematic diagram of the frequency variation curves of the same melody in different keys;
Fig. 5 is a schematic diagram of matching a local melody against a whole melody;
Fig. 6 is a schematic diagram of the overall audio feature extraction flow of the present invention;
Fig. 7 shows the interval curves of the same melody in different pitch regions;
Fig. 8 is a schematic diagram of melody modelling based on the MOMEL algorithm in the present invention;
Fig. 9 is a flow chart of similarity calculation based on the LMS algorithm in the present invention.
Embodiment
The present invention is described below with reference to the accompanying drawings and embodiments. The main steps of the invention are:
(1) Extraction and processing of the fundamental frequency sequence
In content-based music information retrieval, the accuracy of feature extraction from the audio input is crucial to the overall performance of the retrieval system. Ideal audio feature extraction should express, objectively and accurately, the musical melody in the audio query entered by the user. To improve retrieval accuracy and efficiency, the present invention proposes a melody feature extraction flow that combines several steps, including fundamental frequency extraction, frequency-domain filtering, median filtering, and melody modelling. The overall flow of fundamental frequency sequence extraction and processing is shown in Fig. 6:
1) The RAPT algorithm is applied to the input WAV file to extract the fundamental frequency, yielding the fundamental frequency sequence;
2) The raw fundamental frequency sequence is passed through a high-pass filter and a low-pass filter to remove spikes and noise points and to smooth the fundamental frequency curve. The human vocal range generally lies between E2 (82 Hz) and C6 (1047 Hz). Based on this natural range, the high-pass threshold is set to 80 Hz and the low-pass threshold to 1100 Hz, so that fundamental frequency values outside these thresholds are removed;
3) Linear smoothing is applied to the fundamental frequency sequence as a linear filter, removing noise points and further smoothing the contour of the sequence. In an embodiment of the present invention, the filter window is set to 50 milliseconds;
4) Median filtering is applied to the resulting fundamental frequency sequence. It effectively removes the remaining noise points while preserving the step changes between continuous curve segments. In an embodiment of the present invention, the sampling rate of the fundamental frequency sequence after extraction is 100 points per second, and the median filter window is set to 77 milliseconds.
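Steps 2) to 4) above can be sketched as follows. This is an illustrative reading, not the patent's implementation: the high-pass and low-pass step is interpreted as dropping values outside 80 to 1100 Hz, linear smoothing is taken to be a moving average, window lengths are rounded down to an odd number of frames, and all function names are hypothetical.

```python
from statistics import median

FRAME_RATE = 100                       # F0 points per second (from the embodiment)
F0_MIN_HZ, F0_MAX_HZ = 80.0, 1100.0    # thresholds from step 2)

def gate_range(f0):
    """Drop F0 values outside the vocal range (assumed reading of step 2)."""
    return [f for f in f0 if F0_MIN_HZ <= f <= F0_MAX_HZ]

def window_points(window_ms):
    """Convert a window in milliseconds to an odd number of frames."""
    n = max(1, window_ms * FRAME_RATE // 1000)
    return n if n % 2 == 1 else n + 1

def sliding(f0, window_ms, reduce_fn):
    """Apply reduce_fn over a centred window, shrinking it at the edges."""
    half = window_points(window_ms) // 2
    return [reduce_fn(f0[max(0, i - half): i + half + 1]) for i in range(len(f0))]

def mean(values):
    return sum(values) / len(values)

def smooth_f0(f0):
    """Range gating, then 50 ms moving average, then 77 ms median filter."""
    gated = gate_range(f0)
    return sliding(sliding(gated, 50, mean), 77, median)
```

At 100 points per second the 50 ms window covers 5 frames and the 77 ms window is rounded down to 7 frames under these assumptions.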
(2) Musical feature representation
1) Feature representation of the fundamental frequency curve
The semitone is taken as the unit, and the sequence of intervals between adjacent notes is taken as the melody feature. A melody fragment containing n natural notes can be expressed as an interval sequence of n-1 real numbers. This representation has several advantages: it expresses melody features quantitatively, so the features of different melodies are distinguishable and subsequent similarity calculation yields useful results; it is insensitive to overall pitch shift, so the user may hum in any key and the same melody features are still extracted; and it is stable, remaining valid even when the melodic information is limited. For audio input by the user through humming, the interval is defined as shown in formula (1):
According to the above definition, the pitch frequency sequence $F_x = (freq_1, freq_2, freq_3, \ldots, freq_n)$ can be mapped to the interval sequence $P_i = (pitch\_interval_1, pitch\_interval_2, pitch\_interval_3, \ldots, pitch\_interval_{n-1})$.
For the MIDI files stored in the music feature database, the same melody feature representation must be used, so that the melody features extracted from user input and from the database have the same form. For MIDI files, the interval is defined as shown in formula (2), where $MIDI\_note_{n+1}$ and $MIDI\_note_{n}$ denote pitch values in the MIDI file:

$$Pitch\ Interval_{n} = MIDI\_note_{n+1} - MIDI\_note_{n} \qquad (2)$$
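The two interval definitions can be sketched side by side. Formula (2) is implemented directly. For hummed input the sketch assumes the standard twelve-tone equal temperament conversion, 12 times the base-2 logarithm of the frequency ratio, which is consistent with the logarithmic transform described in the text but is an assumption about the exact form of formula (1). The helper `midi_to_hz` uses the common convention that MIDI note 69 is 440 Hz and is included only to build test data.

```python
import math

def intervals_from_freqs(freqs):
    """Semitone intervals between adjacent frequencies (assumed form of formula 1)."""
    return [12.0 * math.log2(b / a) for a, b in zip(freqs, freqs[1:])]

def intervals_from_midi(notes):
    """Formula (2): difference of adjacent MIDI note numbers."""
    return [b - a for a, b in zip(notes, notes[1:])]

def midi_to_hz(note):
    """Conventional mapping with MIDI note 69 = A4 = 440 Hz."""
    return 440.0 * 2 ** ((note - 69) / 12)

c_major_fragment = [60, 62, 64, 65]   # C D E F
g_major_fragment = [67, 69, 71, 72]   # the same fragment transposed to G

hummed = [midi_to_hz(n) for n in g_major_fragment]
query_intervals = intervals_from_freqs(hummed)
record_intervals = intervals_from_midi(c_major_fragment)
```

Both fragments give the interval sequence 2, 2, 1, so a query hummed in G can be compared directly with a record stored in C.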
With this conversion, the same melody features in different keys are normalized, and the effect of different humming keys on melody feature extraction is eliminated. As Fig. 7 shows, the corresponding points of the two curves coincide completely: the features of the same melody in different keys are extracted successfully in normalized form. On the basis of this common feature representation, a similarity evaluation mechanism can be designed to match the query information against the feature information in the database. The processed fundamental frequency sequence undergoes melody modelling, which yields a melody skeleton made up of discrete points. The melody skeleton is log-transformed and the intervals between adjacent notes of the input are extracted; this sequence serves as the feature of the input audio. The melody features finally extracted are sent to the matching module, where their similarity with the information in the music feature database is calculated to obtain the matching result.
2) Feature representation of the melody
The fundamental frequency contour curve obtained by fundamental frequency extraction and filtering can be decomposed into a combination of two mutually independent melody components: the macroscopic melody component and the microscopic melody component. They are defined as follows:
Macroscopic melody component: reflects the tonal pattern of the speech information and is closely related to the overall pitch variation of the fundamental frequency.
Microscopic melody component: reflects the phonemic content of the speech information and affects local variations of the fundamental frequency curve.
Likewise, hummed input is a kind of speech information and can be regarded as a combination of these two components. For a hummed melody, pitch change relates only to the macroscopic melody component of the fundamental frequency curve, whereas phonemic information such as the syllables and lyrics being hummed is determined by the microscopic component. The macroscopic melody of the fundamental frequency curve is approximated by interpolation with a quadratic spline function. The resulting macroscopic melody takes the form of a sequence of discrete target points and represents the pitch melody features of the fundamental frequency sequence; melody features based on the pitch sequence are independent of syllable and phoneme information. Therefore, applying the MOMEL algorithm to the filtered fundamental frequency contour yields the macroscopic melody sequence of the contour, which serves as the basis of the subsequent melody feature representation.
Fig. 8 shows an example of processing by the MOMEL algorithm. Through melody modelling, the macroscopic melody (bottom) of the fundamental frequency contour curve (top) is extracted successfully. However, the direct output of the MOMEL algorithm still has obvious shortcomings for the subsequent melody feature representation. For example, in the last two segments of the contour in Fig. 8, a contour segment representing a single pitch is marked with two target points of very similar value. To solve this kind of problem, the present invention sets a parameterized threshold to control the interval between adjacent notes. When the interval between two notes is below the threshold, the interval is, depending on the specific case, either deleted or merged with an adjacent interval.
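The thresholding of near-duplicate target points can be sketched as below. The patent does not give the threshold value or the exact merge rule, so the 0.5 semitone default, the choice to keep the first of two close targets, and the name `merge_close_targets` are all assumptions made for illustration.

```python
import math

def merge_close_targets(targets_hz, threshold_semitones=0.5):
    """Collapse consecutive target points that lie within the threshold.

    Assumed rule: a target closer than the threshold to the last kept target
    is treated as the same note and dropped.
    """
    kept = []
    for f in targets_hz:
        if kept and abs(12.0 * math.log2(f / kept[-1])) < threshold_semitones:
            continue  # same pitch as the previous target, skip it
        kept.append(f)
    return kept

# Two of the five target points duplicate their neighbours.
targets = [220.0, 222.0, 247.0, 246.0, 262.0]
notes = merge_close_targets(targets)
```

After merging, each remaining target stands for one hummed note, so the interval sequence computed from `notes` no longer contains spurious near-zero intervals.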
(3) Similarity calculation: retrieval and matching algorithm
(a) Longest matched subsequence algorithm (Longest Matched Subsequence, LMS)
The melody features obtained by the feature extraction method of the present invention form a sequence of real numbers, whereas the melody features stored in the music feature database are integer sequences. If the longest common subsequence algorithm is applied mechanically to compute the similarity of two such sequences, many elements that ought to match may be missed.
The longest matched subsequence algorithm solves this problem of the longest common subsequence (LCS) algorithm. It is an improvement on the LCS algorithm whose output is two separate subsequences A' and B', which are subsequences of the input sequences A and B respectively.
The longest matched subsequence is defined as follows:
Given input sequences $A = (a_1, a_2, a_3, \ldots, a_n)$ and $B = (b_1, b_2, b_3, \ldots, b_m)$,
the algorithm produces subsequences $A' = (a'_1, a'_2, a'_3, \ldots, a'_l)$ and $B' = (b'_1, b'_2, b'_3, \ldots, b'_l)$.
The subsequences A' and B' satisfy the following conditions:
1) Each element of A' and B' has a matching element in the other subsequence, subject to the following:
In subsequences A' and B':
element $a'_k$ of A' corresponds to element $a_i$ of A;
element $b'_k$ of B' corresponds to element $b_j$ of B;
and they satisfy $LD(a_i, b_j) \le \delta$, where $\delta$ is the given maximum value of the local similarity.
2) Subsequences A' and B' are each relatively continuous in their original sequences, that is:
In subsequences A' and B':
element $a'_k$ of A' corresponds to element $a_i$ of A, and element $a'_{k+1}$ of A' corresponds to element $a_s$ of A;
element $b'_l$ of B' corresponds to element $b_j$ of B, and element $b'_{l+1}$ of B' corresponds to element $b_t$ of B;
and they satisfy $s - i \le L$ and $t - j \le L$, where L is the maximum number of inserted elements allowed in the subsequence.
3) Subsequences A' and B' have the same length, and they are the longest among all subsequences of A and B that satisfy conditions 1) and 2), that is, $|A'| = |B'| = \max\{|A_k|, |B_l|\}$.
In the longest matched subsequence algorithm, the notion of element equality is replaced by the notion of matching. Unlike in the longest common subsequence algorithm, A' and B' are not mechanically identical; they are the longest pair with the highest similarity among all subsequences of A and B.
(b) Local similarity calculation
As the basis of the longest matched subsequence algorithm, the calculation of the local similarity is introduced below.
First, define the edit distance algorithm for real-number sequences:
Given input real-number sequences $X = (x_1, x_2, x_3, \ldots, x_m)$ and $Y = (y_1, y_2, y_3, \ldots, y_n)$.
The weight of converting between elements of the real-number sequences is defined by formula 3, where $\delta$ is the threshold for judging equality:
Initialize the edit distance matrix $D_{m,n}$ under the following conditions:
$d_{0,0} = 0$;
$d_{i,0} = d_{i-1,0} + w(x_i, 0)$, where $1 \le i \le m$;
$d_{0,j} = d_{0,j-1} + w(0, y_j)$, where $1 \le j \le n$.
For the matrix cells with $1 \le i \le m$ and $1 \le j \le n$, compute the edit distance matrix $D_{m,n}$. The recurrence is shown in formula 4, where $W_{del}$, $W_{sub}$ and $W_{ins}$ are the weights of the deletion, substitution, and insertion operations respectively:
Finally, the edit distance $ED(X, Y)$ between the input real-number sequences X and Y is obtained from the lower-right element $d_{m,n}$ of the matrix $D_{m,n}$, as shown in formula 5:

$$ED(X, Y) = d_{m,n} \qquad (5)$$
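The edit distance on real-number sequences can be sketched as follows. The initialization follows the conditions stated above. The weight function and the recurrence are assumptions standing in for formulas 3 and 4: a pair of values costs nothing when they differ by at most delta and costs their absolute difference otherwise, and deletion, substitution and insertion are charged as w(x, 0), w(x, y) and w(0, y). The names `weight` and `edit_distance` are illustrative.

```python
def weight(x, y, delta=0.5):
    """Assumed conversion cost: free within delta, else the absolute difference."""
    diff = abs(x - y)
    return 0.0 if diff <= delta else diff

def edit_distance(xs, ys, delta=0.5):
    """Edit distance ED(X, Y) between two real-number sequences."""
    m, n = len(xs), len(ys)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    # Initialization as stated: first column deletes, first row inserts.
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + weight(xs[i - 1], 0.0, delta)
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + weight(0.0, ys[j - 1], delta)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(
                d[i - 1][j] + weight(xs[i - 1], 0.0, delta),            # delete
                d[i - 1][j - 1] + weight(xs[i - 1], ys[j - 1], delta),  # substitute
                d[i][j - 1] + weight(0.0, ys[j - 1], delta),            # insert
            )
    return d[m][n]   # formula 5: the lower-right cell
```

With these assumed weights, a slightly off-pitch hummed sequence such as `[2.0, 2.1, -1.0]` has distance zero to the integer record `[2, 2, -1]`.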
Next, the local similarity of elements is defined:
Given input melody feature sequences $A = (a_1, a_2, a_3, \ldots, a_n)$ and $B = (b_1, b_2, b_3, \ldots, b_m)$.
Take one element from each of A and B to form a pair $(a_i, b_j)$. For each pair, the local similarity is defined as follows, where k is the local radius:
Define the local subsequences $X = (a_{i-k}, \ldots, a_i, \ldots, a_{i+k})$ and $Y = (b_{j-k}, \ldots, b_j, \ldots, b_{j+k})$.
The local similarity $LD(a_i, b_j)$ around the pair $(a_i, b_j)$ is then given by the edit distance $ED(X, Y)$ between the local subsequences X and Y, as shown in formula 6:

$$LD(a_i, b_j) = ED(X, Y) \qquad (6)$$
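The local similarity of formula 6 can be sketched by taking a window of radius k around each element and comparing the two windows. The edit distance below uses the same assumed weights as a stand-in for formulas 3 and 4 (zero cost within delta, absolute difference otherwise). Clipping the windows at the ends of the sequences, the 0-based indices, and the function names are also assumptions; the patent does not say how boundaries are handled.

```python
def edit_distance(xs, ys, delta):
    """Real-valued edit distance with assumed weights (see lead-in)."""
    def w(x, y):
        diff = abs(x - y)
        return 0.0 if diff <= delta else diff
    d = [[0.0] * (len(ys) + 1) for _ in range(len(xs) + 1)]
    for i, x in enumerate(xs, 1):
        d[i][0] = d[i - 1][0] + w(x, 0.0)
    for j, y in enumerate(ys, 1):
        d[0][j] = d[0][j - 1] + w(0.0, y)
    for i, x in enumerate(xs, 1):
        for j, y in enumerate(ys, 1):
            d[i][j] = min(d[i - 1][j] + w(x, 0.0),
                          d[i - 1][j - 1] + w(x, y),
                          d[i][j - 1] + w(0.0, y))
    return d[-1][-1]

def local_window(seq, center, k):
    """Elements center-k .. center+k, clipped to the sequence bounds."""
    return seq[max(0, center - k): center + k + 1]

def local_similarity(a, b, i, j, k=1, delta=0.5):
    """LD(a_i, b_j): edit distance between the two local windows (0-based i, j)."""
    return edit_distance(local_window(a, i, k), local_window(b, j, k), delta)

hummed = [2.0, 2.1, -1.0, 3.0]
record = [2, 2, -1, -4]
```

A pair is treated as matched when `local_similarity` does not exceed delta, so a match depends on the neighbourhood of the two elements and not only on the elements themselves.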
(c) Computing the longest matching subsequence by dynamic programming
With the rule for computing the local similarity around elements a_i and b_j of the original sequences A and B established, dynamic programming can be used to compute the longest matching subsequences A' and B' that satisfy the definition.
First, review how dynamic programming computes the longest common subsequence (LCS):
Given the input sequences A = (a_1, a_2, a_3, ..., a_m) and B = (b_1, b_2, b_3, ..., b_n).
Construct the LCS matrix C_{m,n} and initialize it as follows:
c_{i,0} = 0 and c_{0,j} = 0, where 0 ≤ i ≤ m and 0 ≤ j ≤ n;
Compute the matrix with the recurrence in formula 7, where 1 ≤ i ≤ m and 1 ≤ j ≤ n:
Finally, the longest common subsequence is obtained from the lower-right element c_{m,n} of C_{m,n}.
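For reference, the textbook LCS recurrence can be written as below. Formula 7 is not reproduced in this text, so this is the standard form, which formula 7 presumably follows; the function name `lcs_length` is illustrative.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of a and b (standard recurrence)."""
    m, n = len(a), len(b)
    # c[i][j] is the LCS length of a[:i] and b[:j]; row 0 and column 0 stay 0.
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    return c[m][n]
```

Because the test is strict equality, a hummed real-valued sequence such as (2.1, -3.9) shares no common element with the stored integer sequence (2, -4), which is the limitation the matching-based variant is meant to address.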
The longest matching subsequence (LMS) algorithm is computed as follows:
Given the input sequences A = (a_1, a_2, a_3, ..., a_m) and B = (b_1, b_2, b_3, ..., b_n).
Define the matrices C_{m,n}, R_{m,n} and S_{m,n}, which are used to compute the longest matching subsequence, as follows:
Integer matrix C_{m,n}: its element c_{i,j} stores the length of the longest matching subsequence between the subsequences (a_1, a_2, a_3, ..., a_i) and (b_1, b_2, b_3, ..., b_j);
Integer matrix R_{m,n}: its element r_{i,j} stores the number of discontinuous elements in these two subsequences;
Character matrix S_{m,n}: its element s_{i,j} stores the computation path through C_{m,n} and R_{m,n}, typically recording the direction in which the longest matching subsequence grows at each computation.
Initialize the three matrices as follows:
c_{i,0} = 0, c_{0,j} = 0, r_{0,j} = 0, r_{i,0} = 0, s_{0,j} = '_', s_{i,0} = '_', where 0 ≤ i ≤ m and 0 ≤ j ≤ n;
Compute the matrices C_{m,n}, R_{m,n} and S_{m,n} with the recurrences in formulas 8-10, where 1 ≤ i ≤ m and 1 ≤ j ≤ n:
Finally, the output of the longest matching subsequence algorithm of the present invention is computed from the path recorded in the character matrix S_{m,n} and the values stored in C_{m,n}. The length of the longest matching subsequence is given directly by c_{m,n}.
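Formulas 8 to 10 are not reproduced in this text, so the following is not the patent's recurrence. It is a deliberately simplified sketch of the idea only: the equality test of LCS is replaced by a caller-supplied match predicate, and a direction matrix is used to trace back the two subsequences. The matrix R_{m,n} and the bound L on inserted elements are omitted, and the direction codes 'D', 'U' and 'L' and the function name are hypothetical.

```python
def longest_matching_subsequence(a, b, matches):
    """Return (A', B') using an LCS-style table with a match predicate.

    Simplification: no gap bound and no R matrix; `matches(x, y)` stands in
    for the condition LD(a_i, b_j) <= delta.
    """
    m, n = len(a), len(b)
    c = [[0] * (n + 1) for _ in range(m + 1)]    # matched length so far
    s = [['_'] * (n + 1) for _ in range(m + 1)]  # direction of growth

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if matches(a[i - 1], b[j - 1]):
                c[i][j] = c[i - 1][j - 1] + 1
                s[i][j] = 'D'   # diagonal: a_i and b_j are paired
            elif c[i - 1][j] >= c[i][j - 1]:
                c[i][j] = c[i - 1][j]
                s[i][j] = 'U'   # up: skip a_i
            else:
                c[i][j] = c[i][j - 1]
                s[i][j] = 'L'   # left: skip b_j

    # Follow the recorded directions back from the lower-right corner.
    a_sub, b_sub = [], []
    i, j = m, n
    while i > 0 and j > 0:
        if s[i][j] == 'D':
            a_sub.append(a[i - 1])
            b_sub.append(b[j - 1])
            i -= 1
            j -= 1
        elif s[i][j] == 'U':
            i -= 1
        else:
            j -= 1
    a_sub.reverse()
    b_sub.reverse()
    return a_sub, b_sub
```

For example, with `matches = lambda x, y: abs(x - y) <= 0.5`, the hummed sequence (2.1, 1.9, -4.2, 1.0) is matched in full against the first four elements of (2, 2, -4, 1, 3).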
The detailed flow of the longest matching subsequence algorithm for melody similarity of the present invention is shown in Fig. 9:
1) Given the input melody feature sequences A and B, use the LMS algorithm to obtain the subsequences A' and B' of A and B that are most similar to each other, i.e. the longest matching subsequences;
2) From the lengths of the original feature sequences A and B and of the matching subsequences A' and B', compute the proportion of A and B covered by the matched part;
3) Compute the edit distance between A' and B' with the real-number edit distance algorithm;
4) For each group of feature sequences taking part in matching during retrieval, sort in descending order by the matched proportion between A and B as the first key and in ascending order by the edit distance between A' and B' as the second key, producing a list ordered by descending similarity.
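The two-key ordering in step 4) can be sketched as a single sort. The `Candidate` record and its field names are hypothetical; the source does not say how the matched proportion is computed from the four lengths, so it is treated here as an already computed number.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    title: str            # song identifier in the database
    matched_ratio: float  # proportion of A, B covered by A', B' (step 2)
    edit_distance: float  # ED(A', B') from step 3


def rank(candidates):
    """Sort by matched ratio (descending), then by edit distance (ascending)."""
    return sorted(candidates, key=lambda c: (-c.matched_ratio, c.edit_distance))
```

Negating the first key lets one ascending sort express both orderings, so two songs with the same matched proportion are separated by their edit distance.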
To verify the embodiment, six test subjects were selected at random, and they supplied the queries to the system by humming. To ensure that each hummed query was valid and that the results were not biased by the subjects' own judgment, a query counted as valid only when more than half of the other five subjects agreed that the hummed song was indeed the target song; otherwise it was not recorded as valid experimental data. The experiment produced 87 valid audio queries, and for most of them the target song appeared at an acceptable rank in the search results. For 58.62% of the queries the target song ranked first, the largest share of any rank. For 88.51% of the queries the target song was within the top five results, and for 95.40% it was within the top ten. Each individual query took between 150 ms and 550 ms to execute, 289.47 ms on average. Given the hardware used in the experiment and a music feature database of more than one hundred songs, the method achieved good accuracy with an acceptable running time. Overall, the results show that the feature extraction, feature representation and melody similarity calculation methods proposed by the present invention are effective.
The above is merely a specific embodiment of the present invention, but the scope of protection of the present invention is not limited to it. Any variation or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the claims of the present invention.
Claims (2)
1. A humming computer music searching method based on the longest matching subsequence algorithm, characterized by comprising the following steps:
(1) Extracting and processing the fundamental frequency sequence;
1) Apply the RAPT algorithm to the input WAV waveform file to extract the fundamental frequency, obtaining the fundamental frequency sequence;
2) Pass the raw fundamental frequency sequence through a high-pass filter and a low-pass filter to remove spikes and noise points and smooth the fundamental frequency curve. The human vocal range generally lies between E2 (82 Hz) and C6 (1047 Hz); accordingly, the high-pass threshold is set to 80 Hz and the low-pass threshold to 1100 Hz, so that fundamental frequency values outside these thresholds are removed;
3) Apply linear smoothing to the fundamental frequency sequence to remove noise points and further smooth the contour of the sequence;
4) Apply median filtering to the resulting fundamental frequency sequence, which effectively removes noise points while preserving the step changes between continuous segments of the curve;
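Steps 2) and 4) can be illustrated with a small sketch. Two points are assumptions: out-of-range values are simply dropped (the source says only that they are removed), and the median window radius of 1 is an arbitrary illustrative choice. RAPT extraction and the linear smoothing of step 3) are not shown.

```python
from statistics import median

F0_MIN_HZ = 80.0    # high-pass threshold from step 2)
F0_MAX_HZ = 1100.0  # low-pass threshold from step 2)


def gate_f0(f0):
    """Drop fundamental frequency values outside the 80-1100 Hz range."""
    return [f for f in f0 if F0_MIN_HZ <= f <= F0_MAX_HZ]


def median_filter(values, radius=1):
    """Replace each value by the median of its neighbourhood.

    The window shrinks at the sequence ends instead of padding (assumption).
    """
    out = []
    for i in range(len(values)):
        lo = max(0, i - radius)
        hi = min(len(values), i + radius + 1)
        out.append(median(values[lo:hi]))
    return out
```

On the sequence (220, 220, 880, 220, 220, 330, 330, 330), the median filter removes the isolated 880 Hz spike while leaving the step from 220 Hz to 330 Hz in place, which is the behaviour step 4) describes.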
(2) Music feature representation;
1) Feature representation of the fundamental frequency curve;
The melody feature is the sequence of intervals between adjacent notes, measured in semitones. A melody fragment containing n natural notes is expressed as an interval sequence of n-1 real numbers. This representation expresses the melody feature numerically; the features of different melodies are distinguishable, which gives the subsequent similarity calculation a sound basis; it is insensitive to a shift of the overall pitch, so the user may hum in any key and the same melody feature is still extracted; and it is stable. For audio input by the user through humming, the interval is defined as shown in formula (1):
According to the above definition, the fundamental frequency sequence F_x = (freq_1, freq_2, freq_3, ..., freq_n) can be mapped to the interval sequence P_i = (pitch_interval_1, pitch_interval_2, pitch_interval_3, ..., pitch_interval_{n-1});
The MIDI files stored in the music feature database must use the same melody feature representation, so that the melody features extracted from the user's input and from the MIDI files on the database side have the same form. The interval is defined as shown in formula (2), where MIDI_note_{n+1} and MIDI_note_n are pitch values in the MIDI file:
Pitch_Interval_n = MIDI_note_{n+1} - MIDI_note_n (2)
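The two interval mappings can be sketched side by side. Formula (1) is not reproduced in this text; the sketch assumes the usual equal-tempered definition, 12 · log2(freq_{n+1} / freq_n) semitones, which is consistent with the semitone unit and the logarithmic conversion mentioned in the claim. The function names are illustrative.

```python
import math


def hummed_intervals(freqs_hz):
    """Semitone intervals between adjacent fundamental frequencies (assumed formula)."""
    return [12.0 * math.log2(nxt / cur)
            for cur, nxt in zip(freqs_hz, freqs_hz[1:])]


def midi_intervals(notes):
    """Formula (2): difference between adjacent MIDI pitch values."""
    return [nxt - cur for cur, nxt in zip(notes, notes[1:])]
```

Both mappings discard absolute pitch: the MIDI melodies (60, 62, 64, 60) and (65, 67, 69, 65) give the same interval sequence (2, 2, -4), and an octave leap hummed from 220 Hz or from 330 Hz gives 12 semitones in either case.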
Through the above conversion, the same melody in different keys is normalized, and the influence of different humming styles on melody feature extraction is removed, so the features of the same melody in different keys are successfully extracted in normalized form. Based on this common feature representation, a similarity evaluation mechanism can be designed to match the query against the feature information in the database. The processed fundamental frequency sequence undergoes melody modeling, yielding a melody skeleton made up of discrete points. The melody skeleton is converted logarithmically, the intervals between adjacent notes of the input are extracted, and the resulting sequence is used as the feature sequence of the input audio. The extracted melody feature is finally sent to the matching module, where its similarity to the information in the music feature database is computed to obtain the matching result;
2) Feature representation of the melody;
The fundamental frequency contour obtained by fundamental frequency extraction and filtering can be decomposed into two independent melody components, a macroscopic melody component and a microscopic melody component, defined as follows:
Macroscopic melody component: reflects the intonation pattern of the speech and is closely tied to the overall pitch movement of the fundamental frequency;
Microscopic melody component: reflects the phonemic content of the speech and affects local variations of the fundamental frequency curve;
Likewise, hummed input, being a kind of speech, can also be regarded as a combination of the two melody components. For a hummed melody, the pitch movement depends only on the macroscopic melody component of the fundamental frequency curve, whereas phonemic information such as the syllables and lyrics being hummed is determined by the microscopic melody component. The macroscopic melody of the fundamental frequency curve is approximated by interpolation with a quadratic spline function; it takes the form of a sequence of discrete target points and represents the pitch melody feature of the fundamental frequency sequence. Because the hummed melody feature is based on the pitch sequence, it is independent of the syllables and phonemes hummed. Therefore, applying the MOMEL algorithm to the filtered fundamental frequency contour yields its macroscopic melody sequence, which serves as the basis for the subsequent melody feature representation;
The raw output of the MOMEL algorithm still has clear shortcomings for the subsequent melody feature representation. A configurable threshold is therefore introduced to control the interval between adjacent notes: when the interval between two notes falls below the threshold, the interval is deleted or merged with a neighbouring interval, depending on the case;
(3) Similarity calculation: retrieval matching algorithm
(a) Longest matching subsequence (LMS) algorithm;
The melody feature obtained by the feature extraction method is a sequence of real numbers, whereas the melody features stored in the music feature database are integer sequences. If the similarity of the two sequences were computed mechanically with the longest common subsequence algorithm, many elements that ought to match would be missed;
The longest matching subsequence algorithm addresses this problem of the longest common subsequence (LCS) algorithm in practical use. It is an improvement on the LCS algorithm, and its output is two separate subsequences A' and B', which are subsequences of the input sequences A and B respectively;
The longest matching subsequence is defined as follows:
Given the input sequences A = (a_1, a_2, a_3, ..., a_n) and B = (b_1, b_2, b_3, ..., b_m),
the algorithm produces the subsequences A' = (a'_1, a'_2, a'_3, ..., a'_l) and B' = (b'_1, b'_2, b'_3, ..., b'_l);
The subsequences A' and B' satisfy the following conditions:
First, every element of A' and B' has a matching element in the other subsequence, subject to the following condition:
In the subsequences A' and B':
element a'_k of A' corresponds to element a_i of A;
element b'_k of B' corresponds to element b_j of B;
and LD(a_i, b_j) ≤ δ, where δ is the given maximum local similarity value;
Second, A' and B' are each relatively continuous within their original sequences, i.e. they satisfy the following condition:
In the subsequences A' and B':
element a'_k of A' corresponds to element a_i of A, and element a'_{k+1} of A' corresponds to element a_s of A;
element b'_l of B' corresponds to element b_j of B, and element b'_{l+1} of B' corresponds to element b_t of B;
and s - i ≤ L and t - j ≤ L, where L is the maximum number of inserted elements allowed in the subsequence;
Third, A' and B' have the same length, and they are the longest among all subsequences of A and B that satisfy conditions 1) and 2), i.e. |A'| = |B'| = max{|A_k|, |B_l|};
In the longest matching subsequence algorithm, the notion of element equality is replaced by the notion of matching. Unlike in the longest common subsequence algorithm, A' and B' are not required to be exactly equal element by element; instead, among all subsequences of A and B, they are the longest pair with the highest similarity;
(b) Local similarity calculation;
The local similarity calculation, which underlies the longest matching subsequence algorithm, is as follows;
First, define the edit distance algorithm for real-number sequences:
Given the input real-number sequences X = (x_1, x_2, x_3, ..., x_m) and Y = (y_1, y_2, y_3, ..., y_n);
The weights for converting one element of a real-number sequence into another are defined by formula (3), where δ is the threshold for judging two elements equal:
Initialize the edit distance matrix D_{m,n} with the following conditions:
d_{0,0} = 0;
d_{i,0} = d_{i-1,0} + w(x_i, 0), where 1 ≤ i ≤ m;
d_{0,j} = d_{0,j-1} + w(0, y_j), where 1 ≤ j ≤ n;
For the matrix elements with 1 ≤ i ≤ m and 1 ≤ j ≤ n, compute the edit distance matrix D_{m,n} with the recurrence shown in formula (4), where W_del, W_sub and W_ins are the weights of the deletion, substitution and insertion operations, respectively:
Finally, the edit distance ED(X, Y) between the input real-number sequences X and Y is read from the lower-right element d_{m,n} of the matrix D_{m,n}, as shown in formula (5):
ED(X, Y) = d_{m,n}. (5)
Next, the local similarity of a pair of elements is defined as follows:
Given the input melody feature sequences A = (a_1, a_2, a_3, ..., a_n) and B = (b_1, b_2, b_3, ..., b_m);
Take one element from each of A and B to form a pair (a_i, b_j). For each such pair, the local similarity is defined as follows, where k is the local radius:
Define the local subsequences X = (a_{i-k}, ..., a_i, ..., a_{i+k}) and Y = (b_{j-k}, ..., b_j, ..., b_{j+k});
The local similarity LD(a_i, b_j) around the pair (a_i, b_j) is then given by the edit distance ED(X, Y) between the local subsequences X and Y, as shown in formula (6):
LD(a_i, b_j) = ED(X, Y); (6)
(c) Computing the longest matching subsequence by dynamic programming;
With the rule for computing the local similarity around elements a_i and b_j of the original sequences A and B established, dynamic programming can be used to compute the longest matching subsequences A' and B' that satisfy the definition;
First, review how dynamic programming computes the longest common subsequence (LCS):
Given the input sequences A = (a_1, a_2, a_3, ..., a_m) and B = (b_1, b_2, b_3, ..., b_n);
Construct the LCS matrix C_{m,n} and initialize it as follows:
c_{i,0} = 0 and c_{0,j} = 0, where 0 ≤ i ≤ m and 0 ≤ j ≤ n;
Compute the matrix with the recurrence in formula (7), where 1 ≤ i ≤ m and 1 ≤ j ≤ n:
Finally, the longest common subsequence is obtained from the lower-right element c_{m,n} of C_{m,n};
The longest matching subsequence (LMS) algorithm is computed as follows:
Given the input sequences A = (a_1, a_2, a_3, ..., a_m) and B = (b_1, b_2, b_3, ..., b_n);
Define the matrices C_{m,n}, R_{m,n} and S_{m,n}, which are used to compute the longest matching subsequence, as follows:
Integer matrix C_{m,n}: its element c_{i,j} stores the length of the longest matching subsequence between the subsequences (a_1, a_2, a_3, ..., a_i) and (b_1, b_2, b_3, ..., b_j);
Integer matrix R_{m,n}: its element r_{i,j} stores the number of discontinuous elements in these two subsequences;
Character matrix S_{m,n}: its element s_{i,j} stores the computation path through C_{m,n} and R_{m,n}, typically recording the direction in which the longest matching subsequence grows at each computation;
Initialize the three matrices as follows:
c_{i,0} = 0, c_{0,j} = 0, r_{0,j} = 0, r_{i,0} = 0, s_{0,j} = '_', s_{i,0} = '_', where 0 ≤ i ≤ m and 0 ≤ j ≤ n;
Compute the matrices C_{m,n}, R_{m,n} and S_{m,n} with the recurrence in formula (8), where 1 ≤ i ≤ m and 1 ≤ j ≤ n:
Finally, the output of the longest matching subsequence algorithm is computed from the path recorded in the character matrix S_{m,n} and the values stored in C_{m,n}, and the length of the longest matching subsequence is given directly by c_{m,n}.
2. The humming computer music searching method based on the longest matching subsequence algorithm according to claim 1, characterized in that the specific steps of the longest matching subsequence algorithm are as follows:
(1) Given the input melody feature sequences A and B, use the LMS algorithm to obtain the subsequences A' and B' of A and B that are most similar to each other, i.e. the longest matching subsequences;
(2) From the lengths of the original feature sequences A and B and of the matching subsequences A' and B', compute the proportion of A and B covered by the matched part;
(3) Compute the edit distance between A' and B' with the real-number edit distance algorithm;
(4) For each group of feature sequences taking part in matching during retrieval, sort in descending order by the matched proportion between A and B as the first key and in ascending order by the edit distance between A' and B' as the second key, producing a list ordered by descending similarity.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 201110382159 CN102521281B (en) | 2011-11-25 | 2011-11-25 | Humming computer music searching method based on longest matching subsequence algorithm |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102521281A (en) | 2012-06-27 |
CN102521281B (en) | 2013-10-23 |
Family
ID=46292202
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN 201110382159 Active CN102521281B (en) | 2011-11-25 | 2011-11-25 | Humming computer music searching method based on longest matching subsequence algorithm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102521281B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040030691A1 (en) * | 2000-01-06 | 2004-02-12 | Mark Woo | Music search engine |
CN101398827A (en) * | 2007-09-28 | 2009-04-01 | 三星电子株式会社 | Method and device for singing search |
CN101916250A (en) * | 2010-04-12 | 2010-12-15 | 电子科技大学 | Humming-based music retrieving method |
Non-Patent Citations (3)
Title |
---|
张晶等: "音乐哼唱检索技术在web上的应用", 《计算机应用与软件》, vol. 25, no. 12, 31 December 2008 (2008-12-31) * |
李扬等: "一种新的近似旋律匹配方法及其在哼唱检索系统中的应用", 《计算机研究与发展》, vol. 40, no. 11, 30 November 2003 (2003-11-30), pages 1554 - 1560 * |
秦静等: "基于动态分割和加权综合匹配的音乐检索算法", 《计算机工程》, vol. 33, no. 13, 31 July 2007 (2007-07-31) * |
Cited By (60)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103021404A (en) * | 2012-11-23 | 2013-04-03 | 黄伟 | Advertisement identification method based on audio |
CN103247286A (en) * | 2013-03-28 | 2013-08-14 | 北京航空航天大学 | Method for extracting melody of counterpoint based on GPU |
CN103247286B (en) * | 2013-03-28 | 2015-09-23 | 北京航空航天大学 | A kind of melody of counterpoint extracting method based on GPU |
CN104091598A (en) * | 2013-04-18 | 2014-10-08 | 腾讯科技(深圳)有限公司 | Audio file similarity calculation method and device |
WO2014169682A1 (en) * | 2013-04-18 | 2014-10-23 | Tencent Technology (Shenzhen) Company Limited | System and method for calculating similarity of audio files |
US9466315B2 (en) | 2013-04-18 | 2016-10-11 | Tencent Technology (Shenzhen) Company Limited | System and method for calculating similarity of audio file |
CN104143339B (en) * | 2013-05-09 | 2019-10-11 | 索尼公司 | Acoustic musical signals processing device and method |
CN104077336A (en) * | 2013-05-09 | 2014-10-01 | 腾讯科技(深圳)有限公司 | Method and device for dragging audio file to retrieve audio file information |
CN104077336B (en) * | 2013-05-09 | 2016-08-03 | 腾讯科技(深圳)有限公司 | A kind of pull the method and apparatus that audio file carries out audio file information retrieval |
CN104143339A (en) * | 2013-05-09 | 2014-11-12 | 索尼公司 | Music signal processing apparatus and method, and program |
CN103412886A (en) * | 2013-07-18 | 2013-11-27 | 北京航空航天大学 | Music melody matching method based on pitch sequence |
CN104091594A (en) * | 2013-08-16 | 2014-10-08 | 腾讯科技(深圳)有限公司 | Audio classifying method and device |
CN104091594B (en) * | 2013-08-16 | 2016-10-19 | 腾讯科技(深圳)有限公司 | A kind of audio frequency classification method and device |
CN103440873A (en) * | 2013-08-27 | 2013-12-11 | 大连理工大学 | Music recommendation method based on similarities |
CN103440873B (en) * | 2013-08-27 | 2015-10-28 | 大连理工大学 | A kind of music recommend method based on similarity |
CN103559309B (en) * | 2013-11-19 | 2016-05-25 | 北京航空航天大学 | A kind of music retrieval and commending system accelerating based on GPU |
CN103559309A (en) * | 2013-11-19 | 2014-02-05 | 北京航空航天大学 | Audio information retrieval and recommendation system based on GPU (graphics processing unit) acceleration |
CN104133851A (en) * | 2014-07-07 | 2014-11-05 | 小米科技有限责任公司 | Audio similarity detecting method, audio similarity detecting device and electronic equipment |
CN104133851B (en) * | 2014-07-07 | 2018-09-04 | 小米科技有限责任公司 | The detection method and detection device of audio similarity, electronic equipment |
CN104217722A (en) * | 2014-08-22 | 2014-12-17 | 哈尔滨工程大学 | Dolphin whistle signal spectrum contour extraction method |
CN104217722B (en) * | 2014-08-22 | 2017-07-11 | 哈尔滨工程大学 | A kind of dolphin whistle signal time-frequency spectrum contour extraction method |
CN105718486B (en) * | 2014-12-05 | 2021-07-06 | 科大讯飞股份有限公司 | Online humming retrieval method and system |
CN105718486A (en) * | 2014-12-05 | 2016-06-29 | 科大讯飞股份有限公司 | Online query by humming method and system |
WO2017028116A1 (en) * | 2015-08-16 | 2017-02-23 | 胡丹丽 | Intelligent desktop speaker and method for controlling intelligent desktop speaker |
CN106547797A (en) * | 2015-09-23 | 2017-03-29 | 腾讯科技(深圳)有限公司 | Audio frequency generation method and device |
WO2017050059A1 (en) * | 2015-09-23 | 2017-03-30 | 腾讯科技(深圳)有限公司 | Audio generation method, server, and storage medium |
CN106547797B (en) * | 2015-09-23 | 2019-07-05 | 腾讯科技(深圳)有限公司 | Audio generation method and device |
US10261965B2 (en) | 2015-09-23 | 2019-04-16 | Tencent Technology (Shenzhen) Company Limited | Audio generation method, server, and storage medium |
CN106776664A (en) * | 2015-11-25 | 2017-05-31 | 北京搜狗科技发展有限公司 | A kind of fundamental frequency series processing method and device |
CN105845115B (en) * | 2016-03-16 | 2021-05-07 | 腾讯科技(深圳)有限公司 | Song mode determining method and song mode determining device |
CN105845115A (en) * | 2016-03-16 | 2016-08-10 | 腾讯科技(深圳)有限公司 | Song mode determining method and song mode determining device |
CN107229629B (en) * | 2016-03-24 | 2021-03-19 | 腾讯科技(深圳)有限公司 | Audio recognition method and device |
CN107229629A (en) * | 2016-03-24 | 2017-10-03 | 腾讯科技(深圳)有限公司 | Audio identification methods and device |
US10949462B2 (en) | 2016-03-24 | 2021-03-16 | Tencent Technology (Shenzhen) Company Limited | Audio identification method and apparatus, and computer storage medium |
CN106126498A (en) * | 2016-06-22 | 2016-11-16 | 上海者信息科技有限公司 | A kind of batch bilingual terminology recognition methods based on dynamic programming |
CN106126498B (en) * | 2016-06-22 | 2019-06-14 | 上海一者信息科技有限公司 | A kind of batch bilingual terminology recognition methods based on Dynamic Programming |
CN106205571A (en) * | 2016-06-24 | 2016-12-07 | 腾讯科技(深圳)有限公司 | A kind for the treatment of method and apparatus of singing voice |
WO2018032760A1 (en) * | 2016-08-15 | 2018-02-22 | 中兴通讯股份有限公司 | Voice information processing method and apparatus |
CN106448630A (en) * | 2016-09-09 | 2017-02-22 | 腾讯科技(深圳)有限公司 | Method and device for generating digital music file of song |
US10923089B2 (en) | 2016-09-09 | 2021-02-16 | Tencent Technology (Shenzhen) Company Limited | Method and apparatus for generating digital score file of song, and storage medium |
CN106448630B (en) * | 2016-09-09 | 2020-08-04 | 腾讯科技(深圳)有限公司 | Method and device for generating digital music score file of song |
CN108074588A (en) * | 2016-11-15 | 2018-05-25 | 北京唱吧科技股份有限公司 | A kind of pitch computational methods and device |
CN108074588B (en) * | 2016-11-15 | 2020-12-01 | 北京唱吧科技股份有限公司 | Pitch calculation method and pitch calculation device |
CN109493853B (en) * | 2018-09-30 | 2022-03-22 | 福建星网视易信息系统有限公司 | Method for determining audio similarity and terminal |
CN109493853A (en) * | 2018-09-30 | 2019-03-19 | 福建星网视易信息系统有限公司 | A kind of the determination method and terminal of audio similarity |
CN109087669A (en) * | 2018-10-23 | 2018-12-25 | 腾讯科技(深圳)有限公司 | Audio similarity detection method, device, storage medium and computer equipment |
CN109492127A (en) * | 2018-11-12 | 2019-03-19 | 网易传媒科技(北京)有限公司 | Data processing method, device, medium and computing device |
CN110310621A (en) * | 2019-05-16 | 2019-10-08 | 平安科技(深圳)有限公司 | Singing synthesis method, device, equipment and computer-readable storage medium |
CN110675845A (en) * | 2019-09-25 | 2020-01-10 | 杨岱锦 | Human voice humming accurate recognition algorithm and digital notation method |
US11948542B2 (en) * | 2020-01-31 | 2024-04-02 | Obeebo Labs Ltd. | Systems, devices, and methods for computer-generated musical note sequences |
US20210241734A1 (en) * | 2020-01-31 | 2021-08-05 | Obeebo Labs Ltd. | Systems, devices, and methods for computer-generated musical note sequences |
CN112331170B (en) * | 2020-10-28 | 2023-09-15 | 平安科技(深圳)有限公司 | Method, device, equipment and storage medium for analyzing Buddha music melody similarity |
WO2021203713A1 (en) * | 2020-10-28 | 2021-10-14 | 平安科技(深圳)有限公司 | Buddhist music melody similarity analysis method, apparatus and device, and storage medium |
CN112331170A (en) * | 2020-10-28 | 2021-02-05 | 平安科技(深圳)有限公司 | Method, device and equipment for analyzing similarity of Buddha music melody and storage medium |
CN113096619B (en) * | 2021-03-24 | 2024-01-19 | 平安科技(深圳)有限公司 | Music similarity calculation method, device, equipment and storage medium |
CN113096619A (en) * | 2021-03-24 | 2021-07-09 | 平安科技(深圳)有限公司 | Music similarity calculation method, device, equipment and storage medium |
CN113643720A (en) * | 2021-08-06 | 2021-11-12 | 腾讯音乐娱乐科技(深圳)有限公司 | Song feature extraction model training method, song identification method and related equipment |
CN113724739A (en) * | 2021-09-01 | 2021-11-30 | 腾讯音乐娱乐科技(深圳)有限公司 | Method, terminal and storage medium for retrieving audio and training acoustic model |
CN113724739B (en) * | 2021-09-01 | 2024-06-11 | 腾讯音乐娱乐科技(深圳)有限公司 | Method, terminal and storage medium for retrieving audio and training acoustic model |
CN114758560A (en) * | 2022-03-30 | 2022-07-15 | 厦门大学 | Humming intonation evaluation method based on dynamic time warping |
Also Published As
Publication number | Publication date |
---|---|
CN102521281B (en) | 2013-10-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102521281B (en) | Humming computer music searching method based on longest matching subsequence algorithm | |
Durrieu et al. | Source/filter model for unsupervised main melody extraction from polyphonic audio signals | |
Lee et al. | Acoustic chord transcription and key extraction from audio using key-dependent HMMs trained on synthesized audio | |
CN102664016B (en) | Singing evaluation method and system | |
CN103177722B (en) | Song retrieval method based on timbre similarity | |
CN107680582A (en) | Acoustic model training method, speech recognition method, device, equipment and medium | |
CN107680602A (en) | Voice fraud recognition method, device, terminal device and storage medium | |
Kong et al. | Audio flamingo: A novel audio language model with few-shot learning and dialogue abilities | |
CN112185321A (en) | Song generation | |
CN110164460A (en) | Singing synthesis method and device | |
Mor et al. | A systematic literature review on computational musicology | |
CN102841932A (en) | Content-based audio semantic feature similarity comparison method | |
Ramirez et al. | Automatic performer identification in commercial monophonic jazz performances | |
Gajjar et al. | Computational musicology for raga analysis in Indian classical music: a critical review | |
Huang et al. | Piano music teaching under the background of artificial intelligence | |
Nagavi et al. | Overview of automatic Indian music information recognition, classification and retrieval systems | |
CN117198251A (en) | Music melody generation method | |
Gulati | Computational approaches for melodic description in Indian art music corpora | |
CN117012230A (en) | Evaluation model for singing pronunciation and character biting | |
Liumei et al. | K-means clustering analysis of Chinese traditional folk music based on MIDI music textualization | |
CN117037796A (en) | AIGC voice fraud risk control method, medium and equipment based on multiple characteristics | |
Waghmare et al. | Raga identification techniques for classifying Indian classical music: A survey | |
Mase et al. | HMM-based singing voice synthesis system using pitch-shifted pseudo training data. | |
Guerrero-Turrubiates et al. | Guitar chords classification using uncertainty measurements of frequency bins | |
Kher | Music Composer Recognition from MIDI Representation using Deep Learning and N-gram Based Methods | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |